Talent.com
Senior Site Reliability Engineer

Northwoodspace • Torrance, California, United States
Job type: Full-time

Role:

Northwood is looking for a Senior Site Reliability Engineer to architect and lead the monitoring and reliability systems that keep satellites connected to Earth. As we rapidly scale our ground station network across multiple continents, you'll design and build the observability infrastructure that ensures our space communications systems operate 24/7 for customers ranging from commercial satellite operators to national security missions.

This is a high-impact leadership role where you'll architect global-scale reliability platforms while mentoring junior engineers and establishing SRE practices across the organization. You'll work directly with our founding engineering team and department heads to define the monitoring, alerting, and deployment strategies that will scale with us from startup to enterprise. If you're excited about space technology and want to architect infrastructure that directly supports mission-critical satellite operations while building and leading technical teams, this role offers that opportunity.

Responsibilities:

Architect and maintain the enterprise observability stack (Grafana, Prometheus, Loki, Vector, VictoriaMetrics) that monitors ground stations, satellite communications, and multi-region AWS infrastructure

Design SRE practices, error budgets, and SLO/SLI frameworks for mission-critical satellite systems with 99.9%+ uptime requirements

Build advanced AWS infrastructure with Terraform, implementing multi-region reliability, automated scaling, and disaster recovery for ground station operations

Lead CI/CD pipeline architecture using GitLab and ArgoCD with advanced deployment strategies for mission-critical software releases

Mentor junior engineers and establish reliability standards across the growing engineering organization

Design comprehensive Kubernetes deployments with Helm, focusing on high availability and zero-downtime operations

Lead incident response, conduct post-mortems, and drive systematic reliability improvements

Basic Qualifications

5-8 years of production infrastructure and SRE experience with demonstrated leadership in reliability improvements and team mentorship

Expert-level experience with Kubernetes, Docker, and container orchestration in large-scale production environments

Strong background in infrastructure as code (Terraform) and advanced CI/CD practices with experience mentoring others on these technologies

Advanced AWS experience including multi-region architectures, networking, security, and cost optimization, with demonstrated ability to architect complex cloud solutions

Proven track record of leading technical projects from conception to production in fast-moving, high-growth environments

Deep understanding of SRE principles, error budgets, SLOs/SLIs, and experience implementing reliability frameworks across engineering organizations

Preferred Qualifications

Production experience architecting and scaling observability tools (Vector, Loki, Grafana, Prometheus, VictoriaMetrics) in high-throughput environments

Advanced experience with HashiCorp Vault, Okta, and enterprise identity/secrets management systems including policy design and implementation

Previous experience scaling infrastructure and leading technical teams at high-growth companies (startup to 500+ employees)

AWS Professional certification or equivalent demonstrated expertise with advanced cloud networking, security, and compliance frameworks

Strong Linux system administration and networking expertise with experience troubleshooting complex distributed systems

Background in aerospace, telecommunications, defense contracting, or other mission-critical, highly regulated industries

Experience with ITAR, NIST 800-171, or other defense/aerospace compliance requirements
