Site Reliability Engineer

Fractal • San Francisco, California, United States

This range is provided by Fractal. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

$110,000.00 / yr - $160,000.00 / yr

Fractal Analytics is a strategic AI partner to Fortune 500 companies with a vision to power every human decision in the enterprise. Fractal is building a world where individual choices, freedom, and diversity are the greatest assets. An ecosystem where human imagination is at the heart of every decision. Where no possibility is written off, only challenged to get better. We believe that a true Fractalite empowers imagination with intelligence. And that it will be such Fractalites that will continue to build the company for the next 100 years.

Please note: This role is located in the San Francisco Bay Area. You will need to work onsite Monday through Friday. We offer paid relocation.

Role Overview

As a Site Reliability Engineer with Fractal, you will be dedicated to ensuring the highest levels of system availability and performance. This role involves comprehensive monitoring, addressing complex technical issues, automating solutions to recurring problems, and contributing to the development of resilient system architectures and deployment strategies. You will work closely with our Services and Engineering teams, playing a crucial role in optimizing our platforms and infrastructure.

Responsibilities

Ensure maximum uptime and system availability to meet or exceed functional and performance SLAs.

Implement thorough end-to-end monitoring and alerting on all critical components to ensure quick detection and response.

Tackle complex challenges affecting critical services, focusing on automating problem resolution to prevent future occurrences.

Drive the development of innovative designs, architectures, standards, and methodologies to support and enhance our platform.

Lead scripting and automation efforts to refine system update and upgrade processes.

Design and configure essential infrastructure, tools, and frameworks to enhance the deployment lifecycle.

Collaborate effectively with cross-functional teams within Services and Engineering.

Qualifications

Interest in and ability to become certified on the end client's AI platform (we will provide all necessary training and support).

Bachelor’s or master’s degree in computer science, a related field, or equivalent professional experience.

Minimum of 10 years of relevant experience.

Proven experience in deploying, managing, and optimizing scalable, fault-tolerant Linux / Kubernetes / JVM infrastructure across various cloud platforms like AWS, GCP, and Azure.

Deep expertise in Linux operating systems, networking principles, and database management.

Practical experience with Cassandra or similar NoSQL technologies.

Proficiency with major cloud service providers, notably AWS, Azure, and GCP.

Familiarity with configuration management and infrastructure-as-code tools such as Ansible or Terraform.

Proficiency in programming languages like Ruby or Python, particularly for system automation and monitoring.

Strong problem-solving abilities, critical thinking skills, and effective communication capabilities.

Prior experience in a DevOps or system administration role, ideally supporting commercial SaaS solutions.

Pay:

The wage range for this role takes into account the wide range of factors that are considered in making compensation decisions, including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. A reasonable estimate of the current range is: $110,000 - $160,000. In addition, you may be eligible for a discretionary bonus for the current performance period.

As a full-time employee of the company or as an hourly employee working more than 30 hours per week, you will be eligible to participate in the health, dental, vision, life insurance, and disability plans in accordance with the plan documents, which may be amended from time to time. You will be eligible for benefits on the first day of employment with the Company. In addition, you are eligible to participate in the Company 401(k) Plan after 30 days of employment, in accordance with the applicable plan terms. The Company provides for 11 paid holidays and 12 weeks of Parental Leave. We also follow a “free time” PTO policy, allowing you the flexibility to take the time needed for either sick time or vacation.

Fractal provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Information Technology, Consulting, and Engineering

Industries

Technology, Information and Media, IT Services and IT Consulting, and Business Consulting and Services
