Talent.com
Network Engineer, Operations & Reliability
Fluidstack • San Francisco, CA, United States
Posted 30 days ago
Job type: Full-time

About the Role

Fluidstack is seeking a Network Operations Engineer to serve as a Regional Site Lead for one of our datacenter campuses. This is a hybrid role that combines hands‑on Tier 2/3 network operations with site leadership responsibilities. You'll be the boots‑on‑the‑ground expert for your assigned datacenter/campus, ensuring network reliability through incident response, break‑fix coordination, and operational excellence. You'll work remotely when workload allows but be onsite as needed for deployments, complex troubleshooting, and critical incidents.

This role is ideal for experienced network operators who want ownership of a datacenter campus while being part of a broader operations organization. You'll partner closely with the Operations & Reliability pillar lead, the centralized NOC for Tier 1 escalations, and cross‑functional teams including Deployment, Hardware, and DC Operations. Success means maintaining high availability for your region, building strong relationships with onsite teams, and growing into regional operations leadership as the team scales.

Focus

Regional Operations Ownership: Serve as the primary network operations contact for your assigned datacenter campus. Own network health, respond to incidents escalated from NOC, and ensure fabrics run reliably. Build deep knowledge of your region's network topology, common failure modes, and operational characteristics.

Tier 2+ Incident Response: Handle network incidents escalated from Tier 1 NOC during your coverage window. Troubleshoot complex issues across physical and logical layers, coordinate with other engineers for follow‑the‑sun coverage, and drive incidents to resolution. Lead incident response when you're the subject matter expert on the ground.

Break‑Fix Coordination: Coordinate hardware break‑fix activities with onsite DC Operations technicians. Manage linecard swaps, optic replacements, device troubleshooting, and RMA processes. Ensure physical infrastructure issues are resolved quickly and don't impact production workloads.

Deployment Support: Provide operational support during new datacenter deployments and expansions in your region. Partner with Deployment teams on turn‑up activities, validate production readiness, and ensure smooth handovers from deployment to operations. Be the person who ensures new pods integrate seamlessly into operational workflows.

Runbook Execution & Improvement: Execute operational runbooks for common failure scenarios and maintenance procedures. Identify gaps in runbooks, document lessons learned, and provide feedback to the Operations pillar lead on runbook improvements. Build the operational knowledge base for your region.

Cross‑Team Collaboration: Build strong relationships with onsite DC Operations teams, structured cabling vendors, and hardware logistics partners. Serve as the network engineering liaison for your datacenter campus. Communicate clearly about network status, planned maintenance, and operational issues.

Regional Mentorship: As the regional team scales, mentor junior operations engineers assigned to your datacenter. Share operational knowledge, provide guidance during incidents, and help build regional operations capacity.

About You

Strong Operations Background: 5-8 years in network engineering with significant hands‑on operational experience. You've run production networks, responded to incidents at all hours, and debugged complex failures under pressure. You understand the difference between "working" and "production‑ready."

Datacenter Fabric Expertise: Deep experience operating modern datacenter networks including EVPN/VXLAN, BGP, Clos topologies, and high‑radix switching. You're comfortable troubleshooting Layer 2/3 issues, BGP routing problems, fabric misconfigurations, and physical layer failures.

Incident Response Excellence: Proven ability to lead incident response, perform systematic troubleshooting, and drive issues to resolution. You remain calm during outages, communicate clearly with stakeholders, and know when to escalate versus dig deeper. You've been the person others call when things break.

Site Leadership Capability: You've been the go‑to network person for a site, datacenter, or region before. You understand how to build relationships with onsite teams, coordinate physical infrastructure work, and represent network engineering in a field environment. You know how to get things done in operational settings.

Operational Pragmatism: You balance perfection with progress. You can troubleshoot with imperfect information, make pragmatic decisions under time pressure, and prioritize based on business impact. You document as you go and continuously improve operational processes.

Hybrid Work Comfort: You're productive working remotely but understand that datacenter operations sometimes require hands‑on presence. You're comfortable with flexible schedules that adapt to operational needs: sometimes remote, sometimes onsite for days or weeks during critical periods.

Nice to Haves

AI/HPC Fabric Operations: Experience operating AI/ML or HPC fabrics with RDMA (RoCEv2), lossless Ethernet (PFC, ECN), or high‑performance networking. You understand the operational precision required when network performance directly impacts workload completion.

Regional/Campus Operations Leadership: You've been a site lead, campus engineer, or regional operations lead before. You know how to coordinate across teams in a specific geographic location while reporting into a centralized organization.

Hardware Break‑Fix Experience: Hands‑on experience coordinating hardware repairs, RMAs, and physical infrastructure work. You understand datacenter logistics, vendor escalation processes, and how to work effectively with onsite technicians.

Observability & Monitoring: Familiarity with network monitoring platforms, alerting systems, and telemetry collection. You've used monitoring tools to diagnose issues proactively and tune alerting to reduce noise.

Automation Exposure: Basic scripting or automation experience (Python, Ansible) for operational tasks. You may not be writing complex automation but you understand how to leverage tools to improve operational efficiency.

Follow‑the‑Sun Experience: Experience working in distributed operations teams with follow‑the‑sun coverage models. You understand how to hand off incidents cleanly, communicate operational status across time zones, and coordinate with global teams.

Salary & Benefits

Competitive total compensation package (salary + equity).

Retirement or pension plan, in line with local norms.

Health, dental, and vision insurance.

Generous PTO policy, in line with local norms.

The base salary range for this position is $150,000 - $250,000 per year, depending on experience, skills, qualifications, and location. This range represents our good faith estimate of the compensation for this role at the time of posting. Total compensation may also include equity in the form of stock options.

We are committed to pay equity and transparency.

Fluidstack is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Fluidstack will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
