AIML - Sr. Software Development Engineer, Evaluation
Apple, Cupertino, CA, United States
Full-time
At Apple, we create world-class innovative products that seamlessly combine cutting-edge hardware with intelligent software experiences, powered by advanced machine learning technologies. The Evalua...
Program Coordinator
InsideHigherEd, Stanford, California, United States
Full-time
Dean of Research, Stanford, California, United States. Administration. Post Date: Mar 27, 2026. Requisition #: 108594. The Ginzton Laboratory houses the research operations of 20 Principal Invest...
Human Evaluation & Content Quality Vendor Operations Manager
US Tech Solutions, Mountain View, CA, United States
Temporary
Location: Mountain View, CA (Hybrid). As a Human Evaluation & Content Quality Vendor Operations Manager, you will play a key role in scaling and optimizing our global human evaluation ecosystem - th...
Program Manager
Foxconn Industrial Internet, San Jose, CA, US
Full-time
Program Manager, San Jose, CA. About the job: Full-time/permanent. Department: Program Management. Position: Program Manager. Job function: The Program Manager will be mainly responsible for managing th...
Program Manager
Advanced Micro Devices, Inc, San Jose, California, United States
Full-time
What you do at AMD changes everything. At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded syst...
Research Scientist (Model Evaluation)
Sanas, Palo Alto, California, United States, 94301
Full-time
Sanas is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world's first real-time spee...
Program Manager
Codvo.ai, Santa Clara, California, United States
Full-time
At Codvo, we build scalable, future-ready digital platforms that drive real business impact. We foster a culture of innovation, collaboration, and ownership, where designers work closely with product...
Program Associate
Stanford, Stanford, CA, United States
Full-time
The Hoover Institution at Stanford University is seeking qualified candidates for the full-time position of Program Associate. To ensure your application information is captured in our official file...
Program Manager
AMD, San Jose, CA, US
Full-time
What you do at AMD changes everything. At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming and embedded syst...
Software Engineer, Metrics, GenAI Model Evaluation
Tesla, Palo Alto, CA, United States
Full-time
The AI Evaluation team is the main line of defense in ensuring customer safety. Working alongside our AI team, you will design metrics that utilize fleet data and run on large inference clusters to ...
Program Manager
ACL Digital, San Jose, CA, United States
Permanent
The Program Manager is responsible for customer product delivery schedules, planning, managing, and tracking. The Program Manager reports to the Chief Executive Officer. Leads all phases of assigned ...
Program Analyst
Qualcomm, Santa Clara, CA, United States
Full-time
The Program Analyst will support the Oryon CPU Engineering team developing and delivering key Qualcomm SoCs. This is the team behind Snapdragon X Elite laptop an...
Program Administrator*
Sanmina, San Jose, CA, United States
Full-time
Sanmina Corporation (Nasdaq: SANM) is a leading integrated manufacturing solutions provider serving the fastest-growing segments of the global Electronics Manufacturing Services (EMS) market. Recogn...
Director, Simulation and Evaluation - Autonomous Driving
Bosch Group, Sunnyvale, California, United States
Full-time
As the Director for Simulation and Evaluation, you will sit at the center of the Global AI Backbone, architecting the multi-level simulation ecosystems required to train, evaluate and validate next...
Full-stack Engineer, Data Platform - Experimentation & Evaluation
Tik Tok, San Jose, CA, United States
Full-time
Team Introduction: Our mission on the experimentation and evaluation team is to build the next-gen A/B testing platform that empowers the company to make data-driven decisions for its products. The suppo...
Program Manager
Sanmina-SCI, San Jose, CA, United States
Full-time
Sanmina Corporation (Nasdaq: SANM) is a leading integrated manufacturing solutions provider serving the fastest-growing segments of the global Electronics Manufacturing Services (EMS) market. Recogn...
Program Manager
Hays Recruitment, Santa Clara, CA, United States
Permanent
Disclaimer: This is an evergreen job posting designed to connect with top talent for future opportunities. While this role is not actively hiring at the moment, we welcome applications to be conside...
Software Engineer, Perception Evaluation
Waymo, Mountain View, CA, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...
About the Role:
We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.
Key Responsibilities:
Design and implement evaluation strategies for voice and language models, including automated testing approaches.
Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
Develop visual reports to communicate findings and drive continuous improvement.
Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
Use prompt engineering techniques to refine LLM outputs and to report on system health.
Ideal Candidate:
3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
Intermediate to advanced Python scripting, including log parsing and API testing.
Familiarity with GenAI and LLMs, including automated workflows and API integrations.
Strong analytical mindset, capable of working independently and identifying innovative solutions.
Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.