Talent.com

Program evaluation • Sunnyvale, CA

LLM Evaluation Engineer

The Fountain Group • Mountain View, CA
Full-time
This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, de...
Member of Technical Staff, Evaluation

Boson AI • Santa Clara, CA, US
Full-time
Boson AI is an early-stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li), and a team of Deep Learning, Optimization, NLP, AutoML and Statistics scientist...
Program Coordinator (Adult Day Program)

Friends of Children with Special Needs • San Jose, CA, US
Temporary
Salary: $28 - $36 / hourly (depending on experience). Friends of Children with Special Needs (FCSN) is a Bay Area non-profit organization founded in 1996 and focused on helping individuals with speci...
Senior Software Engineer, ML Systems Evaluation

A • Sunnyvale, California, United States
Full-time
Our Wayfinder team is building scalable, certifiable autonomy systems to power the next generation of commercial aircraft. Our team of experts is driving the maturation of machine learning and other...
Member of Technical Staff, Model Evaluation

xAI • Palo Alto, CA, US
Full-time
xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering exc...
Program Manager

CBRE Group • Menlo Park, CA, US
Full-time
As a CBRE Program Manager, you will manage a team responsible for facilitating small to medium cross-functional projects and programs. This job is part of the Program Management function. They are re...
Evaluation Consultant

Paradise Architectural PANELS & STEEL • San Jose, California, USA
Full-time
Paradise Architectural PANELS & STEEL is a leading manufacturer of high-quality architectural panels and steel products. We are committed to providing our clients with innovative and sustainable sol...
Applied Science Manager, GenAI Evaluation Media (GEM)

Amazon • Sunnyvale, California, USA
Full-time
Passionate about creating visual customer experiences that push the boundaries at the forefront of GenAI. The North America Stores GenAI Evaluation Media (GEM) team is seeking an experienced Applied...
Senior Project Manager, Post-Market Safety Evaluation

Abbott • Santa Clara, CA, US
Full-time
Senior Project Manager, Post-Market Safety Evaluation. Abbott is a global healthcare leader that helps people live more fully at all stages of life. Our portfolio of life-changing technologies spans ...
Responsible AI ML Engineer – Safety & Evaluation

Apple Inc. • Cupertino, CA, United States
Full-time
A leading technology company in Cupertino seeks a Machine Learning Engineer focused on Responsible AI. You'll work on developing evaluations for safety and fairness in generative AI applications, co...
Human Evaluation & Content Quality Vendor Operations Manager

US Tech Solutions • Mountain View, CA, US
Temporary
Human Evaluation & Content Quality Vendor Operations Manager. Location: Mountain View, CA (Hybrid). Duration: 5-month contract. Job Description: As a Human Evaluation & Content Quality Vendor Operati...
System Safety Engineer Autonomous Driving - AV Risk Evaluation

Applied Intuition • Sunnyvale, CA, United States
Full-time
Applied Intuition is the vehicle intelligence company that accelerates the global adoption of safe, AI-driven machines. Founded in 2017 and now valued at $15 billion following its recent Series F fu...
Director, Simulation Evaluation

Waymo • Mountain View, CA, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...
Senior Manager - Search, Evaluation Program Management - Apple Maps

Apple • Cupertino, CA, United States
Full-time
Apple Maps is seeking a top-tier leader for our human evaluation team to ensure the quality and relevance of Maps features! We need someone to lead a team that provides data and insights to improve...
Product Manager, Evaluation & Data Generation

Hippocratic AI • Palo Alto, CA, US
Full-time
Hippocratic AI is seeking a PM to lead the development of our model evaluation and data generation platform. In this role, you'll drive the creation of high-quality training and test datasets that i...
Program Management - Program Manager V

eTeam • Menlo Park, CA, US
Full-time
Location: Remote (EST & CST Preferred). Duration: 6 months (Potential for extension). Lead end-to-end risk assessments for projects and initiatives involving people data, maintaining accountability t...
TLM, Autonomy Evaluation

Nuro • Mountain View, California
Full-time
Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automoti...
Wireless Technologies Evaluation Engineer

Tata Consultancy Services • Cupertino, CA
Full-time
In this role, you will be part of the Product RF Definition team and support the evaluation and characterization of various technologies from an RF perspective. You will work independently under Product RF...
LLM Evaluation Engineer

The Fountain Group • Mountain View, CA
30 days ago
Job type
  • Full-time
Job description

About the Role:

We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.

Key Responsibilities:

  • Design and implement evaluation strategies for voice and language models, including automated testing approaches.
  • Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
  • Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
  • Develop visual reports to communicate findings and drive continuous improvement.
  • Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
  • Use prompt engineering techniques to refine LLM outputs and articulate system health.

Ideal Candidate:

  • 3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
  • Intermediate to advanced Python scripting, including log parsing and API testing.
  • Familiarity with GenAI and LLMs, including automated workflows and API integrations.
  • Strong analytical mindset, capable of working independently and identifying innovative solutions.
  • Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.