Talent.com

Program evaluation • Sunnyvale, CA

LLM Evaluation Engineer
The Fountain Group • Mountain View, CA
Full-time
This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, de...

Member of Technical Staff, Evaluation
Boson AI • Santa Clara, CA, US
Full-time
Boson AI is an early-stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li), and a team of Deep Learning, Optimization, NLP, AutoML and Statistics scientist...

Senior Software Engineer, Data & Evaluation
Jobr • Mountain View, CA, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...

Member of Technical Staff, Model Evaluation
xAI • Palo Alto, CA, US
Full-time
xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering exc...

Program Manager (program management, analyze)
Argyle Infotech • Sunnyvale, CA, US
Full-time
Coordinates projects and ensures company resources are utilized appropriately.

Technical Program Manager, Human Evaluation Operations
Microsoft Corporation • Mountain View, CA, United States
Full-time
Microsoft AI (MAI) is building the world's most advanced AI systems, and rigorous, scalable human evaluation is foundational to ensuring our models are safe, aligned, and high-quality. The Human Eval...

AIML - Sr Engineering Program Manager, Evaluation
Apple • Cupertino, CA, US
Full-time
Apple is where individual imaginations gather together, committing to the values that lead to great work. Every...

Senior Autonomy Evaluation Platform Engineer
General Motors • Mountain View, CA, United States
Full-time
A leading automotive company in California is looking for a Senior Software Engineer to shape the evaluation strategy for autonomous vehicle programs. The role involves designing scalable data pipel...

Program Manager (Adult Day Program)
Friends of Children with Special Needs • San Jose, California, USA
Full-time
Program Manager for Adult Day Program. Pay: $65,500 - $90,000 / year depending on experience. Friends of Children with Special Needs (FCSN) is a Bay Area non-profit organization founded in 1996 and focu...

Senior Technical Program Manager, Autonomous Driving Performance Evaluation
Waymo • Mountain View, CA, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...

Responsible AI ML Engineer – Safety & Evaluation
Apple Inc. • Cupertino, CA, United States
Full-time
A leading technology company in Cupertino seeks a Machine Learning Engineer focused on Responsible AI. You'll work on developing evaluations for safety and fairness in generative AI applications, co...

Evaluation Consultant
Paradise Architectural Panels and Steel • San Jose, CA, United States
Full-time
Paradise Architectural Panels & Steel is a leading manufacturer of high-quality architectural panels and steel products. We are committed to providing our clients...

Senior AI Data and Evaluation Engineer
Stryker • Menlo Park, California, USA
Full-time
We are looking for an experienced and highly skilled Senior AI Data and Validation Engineer. A successful candidate will be responsible for both dry and wet lab experiments for AI functionality acqu...

Lead APP Pre Anesthesia Evaluation
Stanford Health Care • Palo Alto, CA, US
Full-time
Lead Advanced Practice Professional. If you're ready to be part of our legacy of hope and innovation, we encourage you to take the first step and explore our current job openings. Your best is waitin...

Evaluation Engineer
VirtualVocations • Sunnyvale, California, United States
Full-time
A company is looking for an Evaluation Engineer to own the technical foundation of their auto-evaluation systems. Key Responsibilities: Build and optimize a fast, user-friendly core auto-eval platf...

TLM, Autonomy Evaluation
Nuro • Mountain View, California
Full-time
Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automoti...

Wireless Technologies Evaluation Engineer
Tata Consultancy Services • Cupertino, CA
Full-time
In this role, you will be part of the Product RF Definition team and support the evaluation and characterization of various technologies from an RF perspective. You will work independently under Product RF...

Program Manager | (Program management) | Hybrid
Samprasoft • Sunnyvale, CA, US
Full-time
Coordinates projects and ensures company resources are utilized appropriately.

LLM Evaluation Engineer
The Fountain Group • Mountain View, CA
Posted 30 days ago
Job type: Full-time
Job description

About the Role:

We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.

Key Responsibilities:

  • Design and implement evaluation strategies for voice and language models, including automated testing approaches.
  • Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
  • Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
  • Develop visual reports to communicate findings and drive continuous improvement.
  • Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
  • Use prompt engineering techniques to refine LLM outputs and articulate system health.

Ideal Candidate:

  • 3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
  • Intermediate to advanced Python scripting, including log parsing and API testing.
  • Familiarity with GenAI and LLMs, including automated workflows and API integrations.
  • Strong analytical mindset, capable of working independently and identifying innovative solutions.
  • Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.