Talent.com
Infrastructure Engineer - US Government
xAI • Palo Alto, CA, US
Job type: Full-time

Job Description

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Role

We are seeking a highly skilled Senior Infrastructure Engineer to join our US Government Team, focused on designing, building, and operating secure, scalable infrastructure for critical government projects. In this role, you will develop and manage training and inference clusters, as well as highly reliable applications, across bare metal, classified cloud, and hybrid cloud architectures. You will leverage your expertise in Kubernetes and GPU hardware to deliver robust, secure systems that support large-scale AI workloads while meeting stringent federal compliance requirements. This role demands a passion for automation, observability, and ensuring system integrity in a fast-paced, high-security environment.

Responsibilities

  • Develop and optimize software to provision and manage xAI's infrastructure across on-premises, virtual machine, and classified cloud environments, enabling efficient scaling for US government initiatives.
  • Enhance the reliability, performance, and cost-effectiveness of infrastructure to support large-scale AI and application workloads in secure, classified settings.
  • Collaborate with xAI engineers to understand workload requirements and design tailored solutions that meet government-specific needs and compliance standards.
  • Implement robust observability, monitoring, and security practices to ensure the integrity, availability, and confidentiality of critical systems, adhering to federal protocols.
  • Manage storage infrastructure using Infrastructure-as-Code (IaC) tools such as Pulumi, Terraform, or Ansible, with a focus on secure data handling.
  • Drive system reliability through incident management, postmortems, and the definition of clear SLAs and SLOs, while maintaining security and compliance.

This is an in-person role based in Palo Alto, CA or Washington, DC, with up to 50% travel required.

Required Qualifications

  • Active Top Secret (TS) security clearance.
  • 5+ years of experience as an Infrastructure Engineer, Site Reliability Engineer, or similar role, with a focus on building and maintaining reliable, scalable systems, preferably in secure or government environments.
  • Proficiency in managing storage infrastructure with IaC tools such as Pulumi, Terraform, or Ansible.
  • Deep understanding of the Kubernetes stack, including CNI, CRI, CSI, and related components.
  • Demonstrated ability to improve system reliability through incident management, postmortems, and defining SLAs/SLOs.
  • Excellent communication and documentation skills, with the ability to handle sensitive information concisely and accurately.

Preferred Qualifications

  • Deep familiarity with installing and using GPU hardware, including setting up drivers, debugging issues, and ensuring reliability.
  • Experience with high-traffic web or mobile application workloads, including optimizing Kubernetes for large-scale deployments in classified or federal settings.
  • Familiarity with chaos engineering, capacity planning, or similar practices for ensuring system resilience in government projects.
  • Proficiency with tools such as Kyverno, ArgoCD, or Go programming for infrastructure automation.
  • Strong sense of ownership, curiosity, and enthusiasm for tackling complex technical challenges in secure environments.
  • Passion for problem-solving and a proactive drive to deliver impactful results while adhering to security protocols.
  • Certifications in security-related fields (e.g., CISSP) or experience in secure federal environments.

Interview Process

After submitting your application, our team will review your CV and statement of exceptional work. If your application advances, you will be invited to a 15-minute phone interview to discuss basic qualifications. Successful candidates will proceed to the main process, which includes:

  • Technical deep-dive: Discussing your infrastructure and secure systems experience.
  • A hands-on challenge focused on designing or troubleshooting infrastructure for secure environments.
  • A meet-and-greet with the wider team.

Our goal is to complete the main interview process within one week.

Annual Salary Range

$180,000 - $440,000 USD

Benefits

Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer.
