Software Engineer, Model Inference

OpenAI • San Francisco, California, United States

Job type: Full-time

Job description

Our team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to do before. We focus on performant and efficient model inference, as well as accelerating research progress via model inference.

About the Role

We are looking for an engineer who wants to take the world's largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production environment.

In this role, you will:

Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.

Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our deployed models.

Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.

Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.

You might thrive in this role if you:

Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.

Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.

Have at least 3 years of professional software engineering experience.

Have, or can quickly gain, familiarity with PyTorch, NVIDIA GPUs, and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand and MPI.

Have experience architecting, observing, and debugging production distributed systems.

Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.

Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.

Are self-directed and enjoy figuring out the most important problem to work on.

Have a good intuition for when off-the-shelf solutions will work, and can quickly build tools to accelerate your own workflow when they won’t.

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement

For US-based candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
