We are seeking an LLM Evaluation Engineer to join a forward-thinking team responsible for developing a sophisticated voice assistant platform. This isn’t your typical QA role – it’s a unique blend of technical engineering, machine learning evaluation, and data analysis. You’ll work closely with cutting-edge conversational AI technology, designing evaluation frameworks, building custom scripts, and creating data visualizations to assess platform performance.
Key Responsibilities:
Design and implement evaluation strategies for voice and language models, including automated testing approaches.
Analyze unstructured data from log store systems to identify performance gaps and optimize user experiences.
Build and maintain custom Python scripts to streamline data processing and generate actionable insights.
Develop visual reports to communicate findings and drive continuous improvement.
Collaborate with cross-functional teams globally to identify and address pain points in conversational AI performance.
Use prompt engineering techniques to refine LLM outputs and report clearly on system health.
Ideal Candidate:
3+ years of experience in machine learning evaluation, data analysis, or related technical roles.
Intermediate to advanced Python scripting, including log parsing and API testing.
Familiarity with GenAI and LLMs, including automated workflows and API integrations.
Strong analytical mindset, capable of working independently and identifying innovative solutions.
Excellent communication skills, able to present complex findings clearly to both technical and non-technical stakeholders.