Talent.com
Staff ML Infrastructure Engineer
Cubiq Recruitment • Hayward, CA, US
Job type: Full-time

Staff / Lead ML Infrastructure Engineer

San Francisco, CA — Onsite

Salary: above market average, plus equity

We are building one of the world's leading generative video and multimodal AI platforms, and we're looking for a senior infrastructure engineer to own the backbone that makes it possible. This role is ideal for an engineer from a top-tier tech company who has built cloud-scale systems, high-performance compute platforms, and battle-tested CI/CD pipelines that support complex ML workloads.

What You'll Own

  • Core ML Platform Architecture: Design and evolve the infrastructure that supports large-scale generative video and multimodal model training, evaluation, and deployment.
  • High-Throughput Compute Systems: Build and optimize GPU/TPU clusters, distributed training systems, and orchestration layers tailored for video-heavy pipelines.
  • Production Reliability for Generative Models: Create the tooling and services needed to safely push frequent model updates while handling massive compute loads and long-running jobs.
  • End-to-End CI/CD for ML: Lead the development of automated pipelines for model training, validation, artifact management, and production rollout.
  • Multimodal Data Infrastructure: Build systems to ingest, version, transform, and serve large-scale video, audio, and text datasets with high reliability.
  • Internal Developer Experience: Partner with research, product, and applied ML teams to build intuitive internal tooling for experiment tracking, model lineage, and resource scheduling.
  • Technical Leadership: Mentor engineers, set platform standards, and influence long-term architectural direction.

What You've Done

  • Experience architecting and operating large-scale infrastructure at a cloud provider, hyperscaler, or leading AI company.
  • Experience building or owning mission-critical CI/CD systems, high-capacity compute platforms, or data infrastructure supporting ML teams.
  • Deep experience with distributed compute across GPUs and other accelerators, Kubernetes, and cloud infrastructure (AWS, GCP, or Azure).
  • Strong engineering fundamentals in Python, Go, or equivalent languages.
  • Previous exposure to ML training pipelines, especially systems that handle heavy video, multimodal, or high-dimensional data.
  • Demonstrated ability to lead complex cross-org initiatives and drive technical strategy.

Nice to Have

  • Experience with video processing systems, large-scale media pipelines, or streaming architectures.
  • Familiarity with modern multimodal or video-generation frameworks (PyTorch, JAX, diffusers, custom accelerators).
  • Experience with Ray, Triton, CUDA optimization, or specialized scheduling for ML workloads.
  • Background working in high-growth AI startups or research-focused environments.
  • Familiarity with security and compliance considerations for models that generate or process user content.

Why Join

  • Shape the underlying platform powering one of the most advanced generative video systems in the world.
  • Influence the future of multimodal AI by building infrastructure that directly accelerates research and product breakthroughs.
  • Work closely with experienced founding engineers, researchers, and platform builders from leading tech companies.
  • Highly competitive compensation, meaningful equity, and a strong in-person engineering culture in San Francisco.