Talent.com
AIML - Senior ML Infrastructure Engineer, ML Platform & Technologies - ML Compute

Apple Inc. • San Francisco, CA, United States
Job type: Full-time

San Francisco Bay Area, California, United States • Machine Learning and AI

Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or Apple Store experience we deliver is the result of us making each other’s ideas stronger. That happens because every one of us shares a belief that we can make something wonderful and share it with the world, changing lives for the better. It’s the diversity of our people and their thinking that inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you’ll do more than join something — you’ll add something!

Description

  • Drive large‑scale training initiatives to support our most complex models.
  • Operationalize large-scale ML workloads on Kubernetes.
  • Enhance distributed cloud training techniques for foundation models.
  • Design and integrate end-to-end lifecycles for distributed ML systems.
  • Develop tools and services to optimize ML systems beyond model selection.
  • Architect a robust MLOps platform to support seamless ML operations.
  • Collaborate with cross‑functional engineers to solve large-scale ML training challenges.
  • Research and implement new patterns and technologies to improve system performance, maintainability, and design.
  • Lead complex technical projects, defining requirements and tracking progress with team members.
  • Mentor engineers in areas of your expertise, fostering skill growth and knowledge sharing.
  • Cultivate a team centered on collaboration, technical excellence, and innovation.

Minimum Qualifications

  • Bachelor's degree in Computer Science, engineering, or a related field.
  • 4+ years of hands‑on experience building scalable backend systems for training and evaluation of machine learning models.
  • Proficient in relevant programming languages, such as Python or Go.
  • Strong expertise in distributed systems, reliability and scalability, containerization, and cloud platforms.
  • Proficient in cloud computing infrastructure and tools, such as Kubernetes, Ray, and PySpark.
  • Ability to clearly and concisely communicate technical and architectural problems, while working with partners to iteratively find solutions.

Preferred Qualifications

  • Advanced degrees in Computer Science, engineering, or a related field.
  • Proficiency in working with and debugging accelerators, such as GPUs, TPUs, and AWS Trainium.
  • Proficiency in ML training and deployment frameworks, such as JAX, TensorFlow, PyTorch, TensorRT, vLLM.

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Apple accepts applications to this posting on an ongoing basis.

