Talent.com
Site Reliability Engineer (SRE) - grok.com & API
xAI • Palo Alto, California, United States
Job type: Full-time

About xAI

xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.

Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity.

We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important.

All engineers and researchers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the team

You will work on the team that is responsible for the backend services that power grok.com and our API. Our team is currently based primarily in London with a small but growing number of engineers located in Palo Alto. We focus on writing highly scalable and reliable services that can efficiently process tens of thousands of queries per second. The services are hosted on a number of Kubernetes clusters (on-prem & cloud).

About the role

An ideal candidate meets at least the following requirements:

  • Expert knowledge of Kubernetes,
  • Expert knowledge of continuous deployment systems such as Buildkite and ArgoCD,
  • Expert knowledge of monitoring technologies such as Prometheus, Grafana, and PagerDuty,
  • Expert knowledge of infrastructure as code technologies such as Pulumi or Terraform.

Location

We hire engineers in London and in Palo Alto. We usually work from the office 5 days a week but allow for work-from-home days when required. Candidates joining the London team must be willing to attend late meetings at least once a week to coordinate with the rest of our team.

Interview process

After you submit your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 15-minute interview (“phone interview”) during which a member of our team will ask some basic technical questions. If you clear the initial phone interview, you will enter the main process, which consists of two technical interviews.

All interviews will be conducted via Google Meet.

Benefits

  • Competitive cash-based compensation
  • xAI equity
  • Private health and dental insurance

Annual Salary Range

$180,000 - $440,000 USD

xAI is an equal opportunity employer and does not unlawfully discriminate based on race, color, religion, ethnicity, ancestry, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, age, disability, medical conditions, genetic information, marital status, military or veteran status, or any other applicable legally protected characteristics.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all applicable federal, state, and local laws, including the San Francisco Fair Chance Ordinance, Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act.

For Los Angeles County (unincorporated) Candidates:

xAI reasonably believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of a conditional offer of employment:

  • Access to information technology systems and confidential information, including proprietary and trade secret information, and/or user data;
  • Interacting with internal and/or external clients and colleagues; and
  • Exercising sound judgment.

California Consumer Privacy Act (CCPA) Notice
