Talent.com
Software Engineer, Fleet Management
OpenAI • San Francisco
Posted 30 days ago
Job type: Full-time
Job description

The Fleet team at OpenAI supports the computing environment that powers our cutting-edge research and product development. We oversee large-scale systems that span data centers, GPUs, networking, and more, ensuring high availability, performance, and efficiency. Our work enables OpenAI’s models to operate seamlessly at scale, supporting both internal research and external products like ChatGPT. We prioritize safety, reliability, and responsible AI deployment over unchecked growth.

About the Role

The Software Engineer, Operating Systems & Orchestration will focus on building systems to manage hardware, configurations, vendors, and the people interacting with our infrastructure. You will design and develop solutions that integrate individual nodes and servers into unified clusters, directly contributing to advancing AI research by streamlining the overall research user experience. This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.

In this role, you will:

  • Design and build systems to manage both cloud and bare-metal fleets at scale.

  • Develop tools that integrate low-level hardware metrics with high-level job scheduling and cluster management algorithms.

  • Leverage LLMs to coordinate vendor operations and optimize infrastructure workflows.

  • Automate infrastructure processes, reducing repetitive toil and improving system reliability.

  • Collaborate with hardware, infrastructure, and research teams to ensure seamless integration across the stack.

  • Continuously improve tools, automation, processes, and documentation to enhance operational efficiency.

You might thrive in this role if you:

  • Have strong software engineering skills with experience in large-scale infrastructure environments.

  • Possess broad knowledge of cluster-level systems (e.g., Kubernetes, CI/CD pipelines, Terraform, cloud providers).

  • Have deep expertise in server-level systems (e.g., systemd, containerization, Chef, Linux kernels, firmware management, host routing).

  • Are passionate about optimizing the performance and reliability of large compute fleets.

  • Thrive in dynamic environments and are eager to solve complex infrastructure challenges.

  • Value automation, efficiency, and continuous improvement in everything you build.
