Talent.com

Reliability • San Mateo, CA

  • Senior Site Reliability Engineer Cloud Platform
    Zilliz | Redwood City, CA, US
    Full-time
    Zilliz is a fast-growing startup developing the industry’s leading vector database for enterprise-grade AI. Founded by the engineers behind Milvus, the world’s most pop...
  • Senior Site Reliability Engineer
    IXL Learning | San Mateo, CA, US
    Full-time
    IXL Learning, developer of personalized learning products used by millions of people globally, is seeking a Senior Site Reliability Engineer to join our team, and h...
  • Software Engineer, Site Reliability
    Roblox | San Mateo, California, United States
    Full-time
    You’ll collaborate with cross-functional teams to build robust infrastructure that supports our growth. If you have a track record of solving complex technical challenges, we want to hear from you. J...
  • Lead Software Engineer - Middleware Reliability Engineering
    Visa | Foster, California, USA
    Full-time
    Transform global payment systems through automation and innovation. Join Visa's Middleware Reliability Engineering team to revolutionize how we deliver and maintain Middleware products supporting cri...
  • Staff Site Reliability Engineer
    SolarWinds | San Mateo, California
    Full-time
    Work collaboratively with software engineering on infrastructure and deployment requirements. Contribute actively and assist in our automation and observability initiatives. Build and maintain opera...
  • Reliability Technician
    Mentor Technical Group | Millbrae, CA, US
    Full-time
    Mentor Technical Group (MTG) provides a comprehensive portfolio of technical support and solutions for the FDA-regulated industry. As a world leader in life science engineering and technical solutio...
  • Reliability Engineer
    Periodic Labs | Menlo Park, CA, United States
    Full-time
    We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve prob...
  • Sr Reliability Engineer
    Hippocratic AI | San Mateo, CA, US
    Full-time
    Hippocratic AI is the leading generative AI company in healthcare. We have the only system that can have safe, autonomous, clinical conversations with patients. We have trained our own LLMs as part o...
  • Site Reliability Engineer Intern
    Oracle | Redwood City, California, USA
    Full-time
    You will be joining the OCSC (Oracle Cloud Service Centre) as an SRD (site reliability developer). Your job role will be helping Oracle ensure the availability of cloud services will leverage excel...
  • AI Evaluation & Reliability Engineer
    Saxon Global | Redwood City, CA, United States
    Full-time
    ML / LLM systems or large-scale distributed systems. Proven ability to translate product requirements into measurable metrics and test plans. Hands-on experience running A/B tests, canaries, or experime...
  • Senior SRE Engineer - Reliability & Scale
    Roblox Corporation | San Mateo, CA, United States
    Full-time
    A leading gaming platform is seeking a Senior Software Engineer - Site Reliability to ensure system performance, reliability, and efficiency. Responsibilities include creating resilient software, de...
  • Site Reliability Engineer
    Zoox | Foster City, CA, US
    Full-time
    Zoox is seeking a Site Reliability Engineer to help ensure the availability, performance, and resilience of the services that power the development and operation of our autonomous vehicles. In this ...
  • Senior Reliability Engineer, South Campus - Site Services
    Genentech | South San Francisco, CA, United States
    Full-time
    Are you passionate about driving operational excellence and ensuring the reliability of critical facilities and equipment? This exciting opportunity calls for a seasoned Reliability Engineer eager ...
  • Reliability Engineer
    Robust.ai | San Carlos, CA, US
    Full-time
    Robust AI is a fast-growing, early-stage startup founded in 2019 by an unsurpassed team of veterans in robotics, AI and business. We are a collaborative group with a wide range of backgrounds and pe...
  • Senior Hardware Reliability Engineer - Design for Reliability
    Zipline | South San Francisco, CA, US
    Full-time
    Engineering | Hardware Reliability. Do you want to change the world? Zipline is on a mission to transform the way goods move. Our aim is ...
  • Staff SRE (Reliability Engineering)
    PowerToFly | Redwood City, CA, United States
    Full-time
    We're Celonis, the global leader in Process Intelligence technology and one of the world's fastest-growing SaaS firms. We believe there is a massive opportunity to unlock productivity by placing AI,...
  • Reliability Engineer
    Periodic Labs | Menlo Park, CA, United States
    Full-time
    We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve prob...
  • Senior Site Reliability Engineer
    CaptivateIQ | Menlo Park, CA, US
    Full-time
    CaptivateIQ is transforming the way companies plan, manage, and optimize sales performance. We started by revolutionizing incentive compensation management, and now we're expanding our platform ...
  • Senior Site Reliability Engineer
    IXL | San Mateo, CA
    Full-time
    IXL Learning, developer of personalized learning products used by millions of people globally, is seeking a Senior Site Reliability Engineer to join our team, and help maintain the reliability and ...
Senior Site Reliability Engineer Cloud Platform
Zilliz | Redwood City, CA, US
Posted 30 days ago
Job type: Full-time

Job Description

Zilliz is a fast-growing startup developing the industry’s leading vector database for enterprise-grade AI. Founded by the engineers behind Milvus, the world’s most popular open-source vector database, the company builds next-generation database technologies to help organizations quickly create AI applications. On a mission to democratize AI, Zilliz is committed to simplifying data management for AI applications and making vector databases accessible to every organization.

What you will do:

  • Work at the intersection of development and site reliability, creating SRE tools and systems and supporting existing infrastructure and platforms.
  • Ensure the reliability, availability, and performance of Zilliz’s distributed database systems.
  • Develop and implement strategies for monitoring, incident management, and disaster recovery.
  • Automate system operations and maintenance tasks to improve efficiency and reduce manual intervention.
  • Design and build tools to manage and monitor infrastructure, ensuring scalability and robustness.
  • Collaborate with software engineers to enhance system reliability, scalability, and performance.
  • Maintain and improve the CI/CD pipeline to ensure smooth and rapid deployment of changes.
  • Actively contribute to the Milvus Vector Database open-source community, focusing on improving reliability and operational efficiency.

What we are looking for:

  • 4+ years of experience in site reliability engineering or similar roles with a focus on cloud-native systems.
  • Proficiency in programming or scripting languages such as Python, Go, or Java.
  • Strong knowledge of container and orchestration technologies such as Docker and Kubernetes.
  • Expertise with cloud platforms such as AWS, GCP, or Azure, and their respective monitoring and management tools.
  • Experience with infrastructure as code tools such as Terraform or Ansible.
  • Familiarity with CI/CD tools such as Jenkins, GitLab CI, or Argo.
  • Proven ability to troubleshoot complex distributed systems and resolve issues promptly.
  • Bachelor’s degree or above in computer science, software engineering, or other relevant disciplines.
  • Ability to thrive in a fast-paced, startup environment and handle multiple projects simultaneously.
  • Experience with the open-source Milvus vector database is a plus.

Zilliz is an Equal Opportunity Employer and welcomes people from all backgrounds, experiences, abilities, and perspectives. All qualified applicants will receive consideration for employment regardless of race, color, national origin, religion, sexual orientation, gender, gender identity, age, physical disability, or length of time spent unemployed.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.