Talent.com
AI Evaluation Engineer

Apex Systems • San Francisco, CA

Job# : 3018441

Job Description :

Remote – Working PST schedule

Contract : 6 months + extension opportunity

We are looking for engineers to join our Engineering Team on a 6-month contract (with the possibility of extension). The work is split between porting external benchmarks to run on internal infrastructure and developing novel model evaluations. You should be comfortable executing quickly, learning at high velocity, writing clear documentation, and debugging sharply.

Responsibilities

  • Porting new external benchmarks to the team's internal infrastructure so they can be run as part of their evaluation stack for new model releases.
  • Keeping up to date with new evals and benchmarks, and pitching the team on porting newly released evals.
  • Performing rigorous quality control for new and existing evals.
  • Implementing novel evaluations to measure dangerous capabilities and safety of frontier models.

Requirements

  • Strong Python coding experience and the ability to write clean code fast.
  • Experience working in a small team on a large, shared codebase.
  • Experience designing and building model evaluations.
  • Detail-oriented, with tenacity to dig through transcripts to identify and resolve issues.
  • Ability to quickly and independently learn new skills and frameworks.
  • Team player with strong communication skills.
In addition, it would be advantageous if you have

  • Demonstrated research experience in the evals space.
  • Experience with agentic evaluations and working with Docker.

    Apex Benefits Overview : Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program which allows you to contribute typically within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program and other discounts. In terms of professional development, Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses / books / seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can access a full list of our benefits, programs, support teams and resources within our ‘Welcome Packet’ as well, which an Apex team member can provide.

    EEO Employer

    Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at  or 844-463-6178.

    Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing® in Talent Satisfaction in the United States and Great Place to Work® in the United Kingdom and Mexico.

    Apex uses a virtual recruiter as part of the application process.
