Talent.com
Senior Site Reliability Engineer
The Recruiting Guy • Los Angeles, CA, US

Job Description

If this role is still posted, we are still recruiting and accepting applications.

Job Title: Senior Cloud Infrastructure Engineer

Location: San Francisco, CA. Remote work is not available.

Modality: On-site only. Must live within commuting distance of San Francisco or be willing to relocate.

Relocation Assistance: No

Employment Type: Salaried W2, full-time

Salary Range: $175,000 - $250,000

About the company

We represent a pioneering open-source technology company in San Francisco that is transforming the way creators interact with generative AI. They are the team behind a powerful, node-based visual interface that gives artists, developers, and innovators the ability to design, control, and customize AI workflows with complete flexibility. Their platform allows users to connect modular components, build complex pipelines, and run everything locally with impressive speed and precision.

Their mission is to make generative AI open, transparent, and accessible to everyone. Built around community collaboration and creative empowerment, their tools help users experiment freely and bring their ideas to life. Whether it is visual storytelling, image generation, or advanced machine learning, their technology gives creators the freedom to explore without limitations.

About the Role

In this role, you will take the lead on designing, deploying, and maintaining large-scale distributed systems that power AI workloads. The ideal candidate is deeply technical, self-sufficient, and motivated by solving complex infrastructure challenges. You will work closely with core engineers to shape the company’s long-term infrastructure vision while ensuring scalability, performance, and reliability across environments.

What You’ll Do

Design, build, and maintain the core infrastructure that powers AI workloads at scale

Manage and automate GPU compute clusters using tools such as Python, Kubernetes, Terraform, and Ansible

Architect and operate systems for orchestration, observability, distributed storage, and networking

Ensure reliability, scalability, and performance across production environments

Collaborate closely with core engineers to design infrastructure for new features and systems

Contribute to technical strategy and long-term infrastructure vision

Drive best practices for infrastructure automation, deployment, and monitoring

Requirements

5+ years of experience as an Infrastructure Engineer or Site Reliability Engineer building and operating large-scale distributed systems

Skilled in Python and comfortable working with infrastructure-as-code tools such as Terraform and Ansible

Familiar with container orchestration systems such as Kubernetes and related tooling like FluxCD, Prometheus, and Grafana

Capable of managing high-performance GPU environments across cloud and bare metal setups

Highly adaptable, resourceful, and motivated by building things from the ground up

Excited to work in a small, fast-growing team where autonomy and accountability are key

Comfortable working on-site in a startup setting where collaboration and speed matter most

Bonus Points

Experience contributing to or maintaining open-source projects

Background working with AI infrastructure, ML pipelines, or GPU orchestration

Strong computer science fundamentals and ability to work across different programming languages or frameworks
