Talent.com
Data Engineer, Knowledge Graphs
Mithrl • San Francisco, CA, US

Job Description

ABOUT MITHRL

We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought.

Mithrl is building the world’s first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports.

Our traction speaks for itself:

12X year-over-year revenue growth

Trusted by leading biotechs and big pharma across three continents

Driving real breakthroughs from target discovery to patient outcomes

ABOUT THE ROLE

We are hiring a Data Engineer, Knowledge Graphs, to build the infrastructure that powers Mithrl’s biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs, to take curated knowledge sources and transform them into scalable, reliable, production-ready systems that serve the entire platform.

Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph-structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently. You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases.

This role is the bridge between biological knowledge ingestion and the high-performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have a deep impact on the intelligence layer of Mithrl.

WHAT YOU WILL DO

Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources

Design, implement, and evolve schemas and storage models for graph-structured biological data

Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics

Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings

Build data models that support multi-tenant access, versioning, and reproducibility across releases

Implement scalable storage and indexing strategies for high-volume graph data

Maintain data quality, validate data integrity, and build monitoring around ingestion and usage

Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications

Support data warehousing, documentation, and API reliability

Ensure performance, reliability, and uptime for knowledge graph services

WHAT YOU BRING

Required Qualifications

Strong experience as a data engineer or backend engineer working with data-intensive systems

Experience building ETL or ELT pipelines for large structured or semi-structured datasets

Strong understanding of database design, schema modeling, and data architecture

Experience with graph data models or willingness to learn graph storage concepts

Proficiency in Python or similar languages for data engineering

Experience designing and maintaining APIs for data access

Understanding of versioning, provenance, validation, and reproducibility in data systems

Experience with cloud infrastructure and modern data stack tools

Strong communication skills and ability to work closely with scientific and engineering teams

Nice to Have

Experience with graph databases or graph query languages

Experience with biological or chemical data sources

Familiarity with ontologies, controlled vocabularies, and metadata standards

Experience with data warehousing and analytical storage formats

Previous work in a tech bio company or scientific platform environment

WHAT YOU WILL LOVE AT MITHRL

You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable

Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders

Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution

Speed: We ship fast (2x/week) and improve continuously based on real user feedback

Location: Beautiful SF office with a high-energy, in-person culture

Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation Range: $150K - $200K
