Talent.com

Program evaluation • Sunnyvale, CA

LLM Evaluation Engineer

The Fountain Group
Mountain View, CA
Full-time
This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, de...

Member of Technical Staff, Evaluation

Boson AI
Santa Clara, CA, US
Full-time
Boson AI is an early-stage startup building large language tools for everyone to use. Our founders (Alex Smola, Mu Li), and a team of Deep Learning, Optimization, NLP, AutoML and Statistics scientist...

Program Coordinator (Adult Day Program)

Friends of Children with Special Needs
San Jose, CA, US
Temporary
Salary: $28 - $36 / hourly (depending on experience). Friends of Children with Special Needs (FCSN) is a Bay Area non-profit organization founded in 1996 and focused on helping individuals with speci...

Program Manager

GEICO
Palo Alto, CA, US
Full-time
GEICO is looking for a Product Program Manager that operates autonomously to deliver key initiatives, which drive strategic outcomes for the GEICO product organization. This is a critical leadership...

Senior Software Engineer, ML Systems Evaluation

A
Sunnyvale, California, United States
Full-time
Our Wayfinder team is building scalable, certifiable autonomy systems to power the next generation of commercial aircraft. Our team of experts is driving the maturation of machine learning and other...

Member of Technical Staff, Model Evaluation

xAI
Palo Alto, CA, US
Full-time
xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering exc...

Program Manager (program management, analyze)

Argyle Infotech
Sunnyvale, CA, US
Full-time
Coordinates projects and ensures company resources are utilized appropriately.

Program Manager

Supermicro
San Jose, CA, United States
Full-time
Supermicro is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop / Big Data, Hyperscale, HPC and IoT / Embedded customers...

Sr Engineering Program Manager, Evaluation - Special Projects

Apple
Cupertino, CA, United States
Full-time
Apple's Special Projects team is seeking a Senior Engineering Program Manager (EPM) to lead our AI evaluation framework at the forefront of next-generation AI experiences. This is a highly visible r...

Evaluation Consultant

Paradise Architectural PANELS & STEEL
San Jose, California, USA
Full-time
Paradise Architectural PANELS & STEEL is a leading manufacturer of high-quality architectural panels and steel products. We are committed to providing our clients with innovative and sustainable sol...

Applied Science Manager, GenAI Evaluation Media (GEM)

Amazon
Sunnyvale, California, USA
Full-time
Passionate about creating visual customer experiences that push the boundaries at the forefront of GenAI. The North America Stores GenAI Evaluation Media (GEM) team is seeking an experienced Applied...

Senior Project Manager, Post-Market Safety Evaluation

Abbott
Santa Clara, CA, US
Full-time
Senior Project Manager, Post-Market Safety Evaluation. Abbott is a global healthcare leader that helps people live more fully at all stages of life. Our portfolio of life-changing technologies spans ...

Responsible AI ML Engineer – Safety & Evaluation

Apple Inc.
Cupertino, CA, United States
Full-time
A leading technology company in Cupertino seeks a Machine Learning Engineer focused on Responsible AI. You'll work on developing evaluations for safety and fairness in generative AI applications, co...

Director, Simulation Evaluation

Waymo
Mountain View, CA, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...

Program Manager

Cyngn
Mountain View, CA, US
Full-time
Cyngn is a publicly-traded autonomous vehicle company based in Menlo Park, CA. Our self-driving technology can be deployed across a variety of commercial domains, vehicle form-factors. To build this ...

TLM, Autonomy Evaluation

Nuro
Mountain View, California
Full-time
Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automoti...

Wireless Technologies Evaluation Engineer

Tata Consultancy Services
Cupertino, CA
Full-time
In this role, you will be part of Product RF Definition team and support the Evaluation and Characterization of various technologies from RF perspective. You will work independently under Product RF...

Program Manager | (Program management) | Hybrid |

Samprasoft
Sunnyvale, CA, US
Full-time
Coordinates projects and ensures company resources are utilized appropriately.

LLM Evaluation Engineer

The Fountain Group
Mountain View, CA
Job type: Full-time

About the Role:

We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.

Key Responsibilities:

  • Design and implement evaluation strategies for voice and language models, including automated testing approaches.
  • Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
  • Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
  • Develop visual reports to communicate findings and drive continuous improvement.
  • Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
  • Use prompt engineering techniques to refine LLM outputs and articulate system health.

Ideal Candidate:

  • 3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
  • Intermediate to advanced Python scripting, including log parsing and API testing.
  • Familiarity with GenAI and LLMs, including automated workflows and API integrations.
  • Strong analytical mindset, capable of working independently and identifying innovative solutions.
  • Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.