Staff AI Engineer, Inference & Optimization

Sonatus • Sunnyvale, California, United States

Posted 30 days ago

Job type: Full-time

Job description:

Sonatus is a well-funded, fast-paced, and rapidly growing company whose software products and solutions help automakers build dynamic software-defined vehicles. With over four million vehicles already on the road with top global OEM brands, our vehicle and cloud software solutions are at the forefront of automotive digital transformation. The Sonatus team is a talented and diverse collection of technology and automotive specialists hailing from many of the most prominent companies in their respective industries.

The Opportunity:

We're looking for a highly skilled and experienced Staff AI Engineer with domain expertise in optimizing AI models for production edge environments. You'll own the full lifecycle of model inference and hardware acceleration, from initial optimization to large-scale deployment. In this role, you will be a key contributor to our team, ensuring our AI solutions are not just functional but also fast, efficient, and reliable across a range of inference hardware platforms.

Role and Responsibilities:

Design, build, and maintain robust pipelines and runtime environments for deploying and serving machine learning models at the edge. Ensure high availability, low latency, and efficient resource utilization for inference at scale.
Collaborate with researchers and hardware engineers to optimize models for performance, latency, and power consumption on specific hardware, including GPUs, TPUs, NPUs, and FPGAs. This includes a strong focus on inference optimization techniques like quantization, pruning, and knowledge distillation.
Use AI compilers and specialized software stacks (e.g., TensorRT, OpenVINO, TVM) to accelerate model execution, ensuring models are compiled and optimized for peak performance on target hardware.
Design, build, and maintain MLOps pipelines for deploying models to various edge devices (e.g., highly integrated vehicle compute), with a specific focus on performance and efficiency constraints.
Implement and maintain monitoring and alerting systems to track model performance, data drift, and overall model health in production.
Work with cloud platforms and on-device environments to provision and manage the necessary infrastructure for scalable and reliable model serving.
Proactively identify and resolve issues related to model performance, deployment failures, and data discrepancies, with a specific focus on inference bottlenecks.
Work closely with Machine Learning Engineers, Software Engineers, and Product Managers to bring models from design to high-performance production systems.

Qualifications:

Minimum 7 years of work experience in MLOps or a similar role with a strong focus on high-performance machine learning systems.

Proven experience with inference optimization techniques such as quantization (INT8, FP16), pruning, and model distillation.

Deep hands-on experience with hardware acceleration for machine learning, including familiarity with GPUs, TPUs, NPUs, and their related software ecosystems.

Strong experience with AI compilers and runtime environments like TensorRT, OpenVINO, and TVM.

Proven experience deploying and managing ML models on edge devices (e.g., NVIDIA Jetson, Raspberry Pi, NXP, Renesas).

Strong experience in designing and building distributed systems. Proficiency with inter-process communication protocols like gRPC, message queuing systems like MQTT, and efficient data handling techniques such as buffering and callbacks.

Hands-on experience with popular ML frameworks such as PyTorch, TensorFlow, TFLite, and ONNX.

Proficiency in programming languages, including Python and C++.

Solid understanding of machine learning concepts, the ML development lifecycle, and the challenges of deploying models at scale.

Proficiency with containerization technologies (Docker, Kubernetes) and cloud platforms (AWS, Azure).

Expertise in CI/CD principles and tools applied to machine learning workflows.

Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related quantitative field.

Benefits:

Sonatus is a tight-knit team aligned around a unified vision. You can expect a strong engineering-oriented culture that focuses on building the best products and solutions for our customers. We embrace equality and diversity in all regards because respect is ingrained in our every fiber. Other benefits Sonatus offers include:

Stock option plan

Health care plan (Medical, Dental & Vision)

Retirement plan (401k, IRA)

Life Insurance (Basic, Voluntary & AD&D)

Unlimited paid time off (Vacation, Sick & Public Holidays)

Family leave (Maternity, Paternity)

Flexible work arrangements

Free food & snacks in office

The posted salary range is a general guideline and represents a good faith estimate of what Sonatus ("Company") could reasonably expect to pay for a base salary for this position. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, geographic location and external market pay for comparable jobs. The Company reserves the right to modify this range in the future, as needed, as market conditions change.

Pay range for this role

$197,500 - $260,000 USD

Sonatus is a fast-paced and innovative company and is seeking team members who are passionate about making a difference. If you are ready to take your career to the next level, we highly encourage you to apply.

To all recruitment agencies: Sonatus, Inc. ("Sonatus") does not accept unsolicited agency resumes. Please do not forward resumes to our careers alias or to other Sonatus employees. Sonatus is not responsible for any fees associated with unsolicited activities.
