Site Reliability Engineer
Plenful • San Francisco, CA, United States
Job type: Full-time

Plenful is on a mission to transform healthcare operations from the inside out. Fresh off our recent funding round and backed by Notable Capital, Bessemer Venture Partners, TQ Ventures, Susa / Kivu Ventures, and other leading investors, we're building the category-defining AI workflow automation platform that healthcare teams rely on to operate smarter, faster, and more efficiently. We automate manual tasks across disparate systems to improve compliance posture, streamline manual work, and unlock critical revenue, so teams can deliver better patient care.

Built by healthcare operators for healthcare operators, Plenful is driven by a deep understanding of the challenges facing today's care teams. We're passionate about equipping healthcare teams with world-class tools that deliver real, measurable impact, and we're proud to serve 70+ leading health systems across the country. If you're excited to help shape the future of healthcare, we'd love to meet you.

About the role

We're hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems that power our product. You'll work across our distributed workflow engine, serverless pipelines, containerized services and Postgres-based data layer. This role reports into engineering leadership and will influence how we build, scale and operate our platform as we continue to grow.

You'll bring strong technical judgment, calm problem solving during incidents and a practical approach to improving reliability. You'll collaborate closely with backend, ML and DevOps engineers and help shape a culture where operational excellence is clear, repeatable and shared across the team.

What you'll do

Reliability, Observability and Performance:

  • Maintain and evolve alerting so engineers receive clear, actionable signals for anomalies, latency regressions and reliability risks.
  • Define observability standards across metrics, logs and tracing with a focus on reliability, performance and customer impact instead of vanity data.
  • Investigate performance bottlenecks across our distributed systems including serverless task execution, containerized services, workflow orchestration and Postgres.
  • Lead incident response, coordinate root cause analysis and ensure reliability improvements are fully implemented and measured.

Infrastructure and Platform Operations:

  • Improve the reliability of our distributed task processing, including autoscaling behavior, execution patterns, retry logic, rate limiting and failure isolation.
  • Support the stability of our serverless pipelines that process high-volume workloads across multiple execution layers.
  • Partner with backend and ML teams on designing resilient mechanisms for scheduling, queueing and workflow execution.
  • Maintain efficient and predictable resource usage across compute, networking and storage.

Security, Compliance and Operational Excellence:

  • Support security and compliance work including patching, audit readiness and vulnerability management.
  • Participate in the on‑call rotation and respond to production incidents quickly and calmly with a focus on restoring stable service and clear communication.
  • Contribute to blameless post‑mortems, drive follow through on fixes and ensure learnings are documented for future engineers.

What we're looking for

  • 5+ years of professional engineering experience at a B2B SaaS company.
  • Strong experience operating production systems in cloud environments, ideally AWS.
  • Hands‑on experience with serverless compute patterns, containerized services, distributed workflows and Postgres.
  • Solid understanding of observability tooling, performance debugging and system behavior under load.
  • A high ownership mindset, empathy for teammates, straightforward communication and a one‑team attitude.
  • Comfortable working in a fast‑paced startup environment with a bias for action and thoughtful engineering judgment.

Benefits

  • Comprehensive Benefits Package: Enjoy unlimited PTO, fully covered health insurance (medical, dental, and vision), meal stipend, health & wellness stipend, 401(k) matching, and stock options.
  • Mission‑Driven, World‑Class Team: Join an exceptional group of professionals aligned around a meaningful mission and committed to making an impact.
  • Opportunities for Growth: Strengthen your partnership expertise through collaboration with experienced, high‑performing leaders across the organization.
  • Flexible Work Environment: Employees based in the Bay Area enjoy two days per week in a brand‑new downtown San Francisco office. Employees based in other cities enjoy a fully remote work environment with the ability to travel for collaboration.