Machine Learning Evaluation Engineer

Bedrock Robotics • San Francisco, California, United States

The Role

Machine Learning Evaluation Engineer :

Bedrock is bringing autonomy to the construction industry! We’re a group of veterans from the autonomous vehicle industry who are passionate about bringing the benefits of automation to areas in the construction industry currently underserved by the market.

We’re looking for a highly motivated engineer with experience evaluating complex ML systems deployed in the real world. Your Mission : Translate the infinite nuance of the built world into actionable, AI-native evaluations that accelerate Bedrock Operator adoption.

The ideal candidate has hands‑on experience building evaluation systems and designing and executing statistical tests to gauge performance deltas between system iterations. More importantly, you’ve iterated on complex ML systems running in production environments, and you understand the complexities that come with them.

What you’ll do :

Design and maintain eval systems :

Build pipelines for measuring system performance – across open-loop and closed-loop simulation, hardware-in-the-loop systems, and field data from Bedrock Operator-equipped machinery. Excite other teams to gain insights earlier in the development cycle through streamlined workflows.

Develop metrics :

Connect product goals and system behavior - by bridging real-world specifications to measurable indicators from logged data. Empower confident decision-making, from parameter tuning to program planning, by slicing through the noise and delivering objective insights.

Classify data sources for training and testing :

Implement infrastructure and classifiers - to self-annotate data and allow creation of datasets for a variety of training and evaluation use cases. Leverage models to source rich annotations for massive datasets to accelerate model iteration.

Predict system performance :

Model metrics and interpret results – from various sources ranging from raw sensor data to key leading indicators. Determine whether new construction sites pose hidden challenges and drive business decisions about deployment readiness.

What we’re looking for :

Engineers who are currently Senior or Staff level with 5+ years of professional software engineering, data science, or research experience

2+ years of professional experience analyzing modern ML or robotics system performance on real-world problems

Proficiency in Python and a data warehouse query language, and comfort developing on infrastructure within parallelized cloud-based frameworks

Strong statistical analysis skills (e.g. classification, model fit bias determination, hypothesis testing, and uncertainty quantification)

Experience working with large datasets

Bonus points : We’re especially interested in engineers who have applied statistical backgrounds to ML research or real-world robotics applications.

Our roles are often flexible. If you don't fit all the criteria, or are in another location (especially one where we have an office, like SF or NY), please apply anyway! We'd love to consider you.

Join the team bringing advanced autonomy to the built world

At Bedrock, we've assembled one of the most experienced autonomous technology teams in the industry, with deep expertise scaling breakthroughs across transportation, infrastructure, and enterprise software. Our leaders helped put the first self-driving cars on public roads at Waymo, scaled systems for Segment's $3.2B acquisition, and grew Uber Freight to $5B in revenue.

While others debate the future of AI, we're deploying it in the real world. Our systems are already installed on heavy machines across the country, learning on real construction sites and working to reshape the earth with survey-grade precision and exceptional safety. This isn't a simulation—it's autonomous intelligence working on billion-dollar infrastructure projects.

In just over a year, we've raised $80M, put our equipment into the field, and established partnerships with forward-thinking contractors who are integrating our technology into their operations. We're working quickly to close the gap between America's surging demand for housing, data centers, manufacturing hubs, and the construction industry's growing labor shortage.

Here, algorithms meet steel-toed boots. You'll collaborate with both construction veterans and experienced engineers, tackling problems where your work directly impacts how the physical world gets built. If you're interested in applying cutting‑edge technology to solve meaningful problems alongside a talented team, we'd love to have you join us.
