Talent.com
ML Infrastructure Engineer
This job is no longer accepting applications.
OpenAI • San Francisco, California, United States
Posted 30 days ago
Job type
  • Full-time
Job description

About the Team

The Runtime team builds the low-level framework components that power our ML training systems. We build robust, scalable, high-performance components to support our distributed training workloads. Our priorities are to maximize the productivity of our researchers and our hardware, with the goal of accelerating progress towards AGI.

About the Role

As an ML Infrastructure Engineer, you will work on improving the training throughput of our internal training framework while enabling researchers to experiment with new ideas. This requires good engineering (for example, designing, implementing, and optimizing state-of-the-art AI models), writing bug-free machine learning code (surprisingly difficult!), and acquiring deep knowledge of the performance of supercomputers. In all the projects this role pursues, the ultimate goal is to push the field forward.

We’re looking for people who love optimizing performance and understanding distributed systems, and who cannot stand having bugs in their code. Since our training framework is used for large runs with massive numbers of GPUs, performance improvements here will have a large impact.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Apply the latest techniques in our internal training framework to achieve impressive hardware efficiency for our training runs
  • Profile and optimize our training framework
  • Work with researchers to enable them to develop the next generation of models

You might thrive in this role if you:

  • Have run small-scale ML experiments
  • Love figuring out how systems work and continuously come up with ideas for how to make them faster while minimizing complexity and maintenance burden
  • Have strong software engineering skills and are proficient in Python

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.


Similar jobs
Machine Learning Infrastructure Engineer
VirtualVocations • San Francisco, California, United States
Full-time
A company is looking for a Member of Technical Staff, Inference. Key Responsibilities: Productionize model checkpoints end-to-end from research to deployment. Build and optimize inference systems f...
Promoted • New
ML Infrastructure Engineer, Safeguards
The Rundown AI, Inc. • San Francisco, CA, United States
Full-time
We are seeking a Machine Learning Infrastructure Engineer to join our Safeguards organization, where you'll build and scale the critical infrastructure that powers our AI safety systems. You'll work...
Promoted
ML Model Serving Infrastructure Engineer
Anyscale • San Francisco, CA, United States
Full-time
A technology company in San Francisco is seeking an experienced engineer to develop highly available ML model serving systems. The role requires proficiency in algorithms, system design, and modern ...
Promoted
Founding Engineer, ML Infrastructure
Reactor • San Francisco, CA, United States
Full-time
Founding Infrastructure Engineer. This is a highly technical, high-impact role focused on designing and evolving the foundation that powers our AI platform. You'll work across the entire infrastructu...
Promoted
Senior ML Infrastructure Engineer
Parametric (YC F25) • San Francisco, CA, United States
Full-time
Senior ML Infrastructure Engineer. This range is provided by Parametric (YC F25). Your actual pay will be based on your skills and experience; talk with your recruiter to learn more. Parametric is bu...
Promoted
ML Infrastructure Engineer - Scalable Training for GenAI
Hedra, Inc • San Francisco, CA, United States
Full-time
A pioneering generative media company is seeking an ML Engineer in San Francisco. The ideal candidate will have 3+ years of experience in high-performance computing and manage infrastructure for mac...
Promoted
ML Infrastructure Engineer, Safeguards
GlueGROUPS Inc. • San Francisco, CA, United States
Full-time
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group ...
Promoted
Infrastructure Engineer, ML Systems
Applied Compute • San Francisco, CA, United States
Full-time
Applied Compute builds Specific Intelligence for enterprises, unlocking the knowledge inside a company to train custom models and deploy an in-house agent workforce. Today’s state-of-the-art AI isn’...
Promoted
Senior ML Infrastructure Engineer - Scale & Inference
Snap Inc. • San Francisco, CA, United States
Full-time
A leading technology company in San Francisco is seeking a Software Engineer for ML Infrastructure. The role involves designing and optimizing systems for machine learning workloads, developing high...
Promoted
ML Infrastructure Engineer – Large-Scale Training (Relocation)
G2M Talent • San Francisco, CA, United States
Full-time
A tech-focused research team is searching for a Machine Learning Engineer to develop the infrastructure for large-scale training and experimentation of neural networks. The ideal candidate will desi...
Promoted
Staff, ML Infrastructure Engineer
Tubi Tv • San Francisco, CA, United States
Full-time
Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV sh...
Promoted • New
Privacy-First ML Infrastructure Engineer
Workshop Labs • San Francisco, CA, United States
Full-time
A pioneering AI startup in San Francisco is looking for an experienced individual to build infrastructure for deploying personalized AI models. The role demands a strong understanding of machine lea...
Promoted
ML Infrastructure Engineer (Staff / Principal)
Genesis Molecular AI • Burlingame, CA, United States
Full-time
We’re a tight‑knit team of proven drug hunters, deep learning researchers, and software engineers united by a common mission: to drive A...
Promoted
AIML - Senior ML Infrastructure Engineer, ML Platform & Technologies - ML Compute
Apple Inc. • San Francisco, CA, United States
Full-time
San Francisco Bay Area, California, United States. Machine Learning and AI. Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product ...
Promoted
Staff ML Infrastructure Engineer
Cubiq Recruitment • San Francisco, CA, United States
Full-time
Staff / Lead ML Infrastructure Engineer. Salary: over market average + equity. We are building one of the world’s leading generative video and multimodal AI platforms, and we’re looking for a senior...
Promoted
Cloud Infrastructure Staff Engineer
PayJoy • San Francisco, CA, US
Full-time
PayJoy is a mission-first credit provider dedicated to helping under-served customers in emerging markets to achieve financial stability and success. Our patented technology for secured credit provi...
Promoted
ML Infrastructure Engineer, Safeguards
Anthropic • San Francisco, CA, United States
Full-time
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group ...
Promoted
Staff ML Infrastructure Engineer - Remote
Darwin Recruitment • San Francisco, CA, United States
Remote
Full-time
A fast-growing AI company is seeking a Staff / Principal ML Infrastructure Engineer to lead the design and deployment of large language model infrastructure. Responsibilities include optimizing mode...
Promoted