Data Engineer, Knowledge Graphs

Mithrl • San Francisco, CA, United States

ABOUT MITHRL

We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought.

Mithrl is building the world’s first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports.

No coding. No waiting. No bioinformatics bottlenecks.

We are one of the fastest-growing tech bio companies in the Bay Area, with 12x year-over-year revenue growth. Our platform is used across three continents by leading biotechs and big pharma companies. We power breakthroughs from early target discovery to mechanism-of-action. And we are just getting started.

ABOUT THE ROLE

We are hiring a Data Engineer, Knowledge Graphs to build the infrastructure that powers Mithrl’s biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs to take curated knowledge sources and transform them into scalable, reliable, production-ready systems that serve the entire platform.

Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph-structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently. You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases.

This role is the bridge between biological knowledge ingestion and the high-performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have a deep impact on the intelligence layer of Mithrl.

WHAT YOU WILL DO

  • Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources
  • Design, implement, and evolve schemas and storage models for graph-structured biological data
  • Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics
  • Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings
  • Build data models that support multi-tenant access, versioning, and reproducibility across releases
  • Implement scalable storage and indexing strategies for high-volume graph data
  • Maintain data quality, validate data integrity, and build monitoring around ingestion and usage
  • Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications
  • Support data warehousing, documentation, and API reliability
  • Ensure performance, reliability, and uptime for knowledge graph services

WHAT YOU BRING

Required Qualifications

  • Strong experience as a data engineer or backend engineer working with data-intensive systems
  • Experience building ETL or ELT pipelines for large structured or semi-structured datasets
  • Strong understanding of database design, schema modeling, and data architecture
  • Experience with graph data models or willingness to learn graph storage concepts
  • Proficiency in Python or similar languages for data engineering
  • Experience designing and maintaining APIs for data access
  • Understanding of versioning, provenance, validation, and reproducibility in data systems
  • Experience with cloud infrastructure and modern data stack tools
  • Strong communication skills and ability to work closely with scientific and engineering teams

Nice to Have

  • Experience with graph databases or graph query languages
  • Experience with biological or chemical data sources
  • Familiarity with ontologies, controlled vocabularies, and metadata standards
  • Experience with data warehousing and analytical storage formats
  • Previous work in a tech bio company or scientific platform environment

WHAT YOU WILL LOVE AT MITHRL

  • You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable
  • Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders
  • Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution
  • Speed: We ship fast (2x/week) and improve continuously based on real user feedback
  • Location: Beautiful SF office with a high-energy, in-person culture
  • Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans