Talent.com

Figure model • Pomona, CA
Model Efficiency Engineer
VirtualVocations • Ontario, California, United States
Full-time • New
A company is looking for a Member of Technical Staff, Model Efficiency. Key responsibilities: improve core performance metrics of ML systems by analyzing model execution and identifying bottlenecks...
Primary Care Physician Southern California $350K+ Total Compensation Value-Based Model
Optigy Group • Upland, CA, US
Full-time • Promoted
We're seeking a mission-driven Primary Care Physician (MD / DO) to join a growing, value-based care organization i...
Model Efficiency Engineer
VirtualVocations • Ontario, California, United States
Full-time

Job Description

A company is looking for a Member of Technical Staff, Model Efficiency.

Key Responsibilities

Improve core performance metrics of ML systems by analyzing model execution and identifying bottlenecks

Collaborate with modeling and systems teams to experiment, measure, and implement optimizations that enhance inference efficiency

Develop advanced performance techniques, including GPU / CUDA optimizations and model execution strategies for large-scale architectures

Required Qualifications

5+ years of experience in writing high-performance, production-quality code

Strong programming skills in C++ or Python (Rust / Go also welcome)

Experience with large language models and the LLM inference ecosystem

Ability to diagnose and resolve performance bottlenecks across the model execution stack

A strong bias for action with a focus on shipping quickly, measuring impact, and iterating