Talent.com
Software Engineer, Model Inference
OpenAI • San Francisco
Posted 30 days ago
Job type: Full-time

About the Team

Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to do before. We focus on performant and efficient model inference, as well as accelerating research progress via model inference.

About the Role

We are looking for an engineer who wants to take the world's largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.

In this role, you will:

  • Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.

  • Work alongside researchers to enable advanced research through awesome engineering.

  • Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.

  • Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.

  • Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.

You might thrive in this role if you:

  • Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.

  • Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done.

  • Have at least 5 years of professional software engineering experience.

  • Have or can quickly gain familiarity with PyTorch, NVIDIA GPUs, and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, and NVLink.

  • Have experience architecting, building, observing, and debugging production distributed systems. Bonus points if you have worked on performance-critical distributed systems.

  • Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.

  • Are self-directed and enjoy figuring out the most important problem to work on.

  • Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.
