Talent.com
AI Agent Evaluation Analyst

VirtualVocations • Oakland, California, United States
Job type: Full-time

Job Description

A company is looking for an AI Agent Evaluation Analyst (Freelance).

Key Responsibilities

Review evaluation tasks and scenarios for logic, completeness, and realism

Identify inconsistencies, missing assumptions, or unclear decision points

Help define clear expected behaviors (gold standards) for AI agents

Required Qualifications

Excellent analytical thinking regarding complex systems and logical implications

Familiarity with structured data formats (ability to read JSON / YAML)

Experience with policy evaluation, logic puzzles, or structured scenario design

Background in consulting, academia, or research

Exposure to LLMs, prompt engineering, or AI-generated content

Similar Jobs
AI Research Engineer, Enterprise Evaluations

Scale AI • San Francisco, CA, United States
Full-time
AI Research Engineer, Enterprise Evaluations. Scale AI is seeking a technically rigorous and driven. This high-impact role is critical to our mission of delivering the industry's leading. You will be ...

External Transfer Application Evaluator & Reader (4511U) - CDSS

InsideHigherEd • Berkeley, California, United States
Full-time +1
External Transfer Application Evaluator & Reader (4511U) - CDSS. Posted by the FREE value-added rec...

Research Engineer, Frontier AI Evaluations in Finance

OpenAI • San Francisco, CA, United States
Full-time
A leading AI research firm in San Francisco seeks a Research Engineer to evaluate model capabilities in finance. The candidate should possess strong analytical skills, be detail-oriented, and thrive...

Science Editor

Bio-Rad Laboratories • Hercules, CA, United States
Full-time
Bio-Rad is seeking a Science Editor to join its global Brand and Marketing Communications Department. The Technical Editor will play a key role in ensuring consistency and quality of materials (prim...

Student Associate, Center for Health Sciences (Lab data collection)

SRI International • Menlo Park, CA, United States
Part-time +1
Student Associate, Center for Health Sciences (Lab data collection). SRI's Center for Health Sciences is looking for a Student Associate to join our team! The student will help with data collection...

AI Engineer, Agents & Evaluation

Guild.ai • San Francisco, CA, United States
Full-time
AI Engineer, Agents & Evaluation. We're looking for our first AI Engineer focused on agents and evaluation, a foundational hire who will shape how we build, measure, and scale intelligent systems. Hel...

Applied Research Engineer - AI & LLM Evaluation

Mercor • San Francisco, CA, United States
Full-time
An innovative AI company in San Francisco is seeking a Research Engineer to contribute to the advancement of AI models. The role involves working on post-training and evaluation tasks, designing exp...

AI Research Engineer, Enterprise Evaluations

Scale AI, Inc. • San Francisco, CA, United States
Full-time
Scale AI is seeking a technically rigorous and driven. This high-impact role is critical to our mission of delivering the industry's leading. You will be a hands-on contributor to the core systems th...

AI Automation Analyst — Agentic AI & Workflow Innovator

Visa Inc. • Foster City, CA, United States
Full-time
A leading global payments technology company is looking for an individual to join their AI Products & Analytics team in California. This role involves designing AI workflows, integrating models into...

Applied Research Engineer, Agents

Labelbox • San Francisco, CA, United States
Full-time
At Labelbox, we're building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises. Since 2018, we've been pioneering data-centric approaches that ar...

Associate Scientist

Public Health Institute • Emeryville, CA, United States
Part-time
If you are a current and active PHI employee, do not use this site to apply for positions. The Public Health Institute (PHI) is an independent, nonprofit organization dedicated to promoting health, ...

Senior Analytics Engineer

Form Energy • Berkeley, CA, United States
Software Engineering • Berkeley, CA; Somerville, MA; Weirton, WV • Full time
Full-time
Are you ready to build America's energy future? Form Energy is an American manufacturing and energy technology company. We're revolutionizing energy storage with cost-effective, multi-day technology...

GenAI Evaluations Engineer : Build Trusted AI Platforms

Apple Inc. • San Francisco, CA, United States
Full-time
A leading technology company in California is seeking a Software Engineer for the Generative AI Evaluations team. You will design robust evaluation frameworks, analyze AI applications, and collabora...

Young Investigator, FlexOlmo

The Allen Institute for Artificial Intelligence • Berkeley, CA, United States
Full-time
Persons in these roles are welcome to work remotely from Berkeley, CA. Ai2 is seeking talented and motivated Postdoctoral Young Investigator to join the. Postdoctoral Young Investigators will be base...

Stock Analysis SME - AI Model Evaluator (Remote)

Braintrust • San Francisco, CA, United States
Remote
Part-time
A technology consulting company is seeking Stock Analysis Subject Matter Experts to evaluate AI-generated responses and provide structured feedback. This part-time role involves collaborating with A...

Remote M&A Associate - AI Trainer ($50-$60 / hour)

Data Annotation • Richmond, California
Remote
Full-time +1
We are looking for a finance professional to join our team to train AI models. You will measure the progress of these AI chatbots, evaluate their logic, and solve problems to improve the quality of ...

Machine Learning Engineer

Meltwater • Redwood City, CA, United States
Full-time
Meltwater, a pioneer of media intelligence and now Outside Insight, gives businesses the information advantage they need to stay ahead. More than 30,000 companies have used Meltwater's media intelli...

AI Agent Monitoring & Evaluation Scientist

Datagrid • San Francisco, CA, United States
Full-time
A growing AI technology company in San Francisco is seeking a Data Scientist focused on AI Agent Monitoring & Evaluation. You will build evaluation frameworks and monitoring tools for AI agents, ens...