Talent.com
LLM Inference Deployment Engineer

VirtualVocations • Fremont, California, United States
Job type: Full-time

Job Description

A company is looking for an LLM Inference Deployment Engineer to optimize and deploy large language models for high-performance inference.

Key Responsibilities

Deploy and optimize post-training LLMs sourced from libraries such as Hugging Face

Utilize inference runtimes for efficient execution

Develop and maintain high-performance inference pipelines using Docker and Kubernetes

Required Qualifications

Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field

Experience in LLM inference deployment and model optimization

Expertise in LLM inference frameworks such as PyTorch and ONNX Runtime

In-depth knowledge of Python for model integration and performance tuning

Experience with containerized AI deployments and LLM memory optimization strategies

Similar jobs
Sr. Software Engineer - AI / LLM Applications (26456)
Supermicro • San Jose, CA, United States
Full-time
Supermicro is a Top Tier provider of advanced server, storage, and networking solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop / Big Data, Hyperscale, HPC and IoT / Embedded customers...

Staff Software Development Engineer (LLM)
Fortinet • Sunnyvale, CA, United States
Full-time
Architect and implement functions to monitor and filter LLM requests / responses in real time, preventing prompt injection attacks and unauthorized data leakage. Build a highly scalable pipeline capab...

Sr. Staff ML Platform Engineer (TLM)
Earnin • Mountain View, California, United States
Full-time
As one of the first pioneers of earned wage access, our passion at EarnIn is building products that deliver real-time financial flexibility for those with the unique needs of living paycheck to pay...

ML Infrastructure Engineer with GCP
iSoftTek Solutions Inc • Mountain View, CA, US
Full-time
Job Title: ML Infrastructure Engineer with GCP. Location: Mountain View, CA [Needs to be onsite for 1 week once in a quarter on your own expenses]. Note: Only PST and MST candidates are required. Expe...

Senior DevOps / MLOps Engineer
GenBio AI • Palo Alto, CA, US
Full-time
Headquartered in Silicon Valley, GenBio AI is a newly established startup where a collective of visionary scientists, engineers, and entrepreneurs are dedicated to transforming the landscape of bio...

Senior ML Ops Engineer
Axiado • San Jose, CA, US
Full-time
Axiado is an AI-enhanced security processor company redefining the control and management of every digital system. The company was founded in 2017, and currently has 150+ employees. At Axiado, develo...

Senior Staff ML Engineer, (TLM) Driver Understanding and Evaluation
Waymo • Mountain View, California, United States
Full-time
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on buildin...

Distinguished Software Engineer (DLP Platform Architecture, Scale, and Advanced DLP Detection)
Palo Alto Networks • Santa Clara, CA, US
Full-time
At Palo Alto Networks® everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer a...

MLOps / Data Engineer
TEKsystems • Cupertino, CA, United States
Full-time
Expected skills: Python, Golang / Rust (nice to have). Data Engineering tools: pyiceberg, daft to name a few. The candidate should be familiar with data engineering supporting and building systems at P...

PLM Integration Engineer
INTERACTION 24 LLC • Fremont, CA, US
Full-time
Teamcenter Developer. Must be able to work W2. Hybrid Fremont, CA or Tualatin, OR (must support West Coast hours). Top 3 Skill Qualification Questions. Teamcenter / SAP Integration (T4ST / CN4T). PLMSI ...

Research Engineer (LLMs and Generative Models)
GenBio AI • Palo Alto, CA, US
Full-time
Headquartered in Silicon Valley, we are a newly established start-up, where a collective of visionary scientists, engineers, and entrepreneurs are dedicated to transforming the landscape of biology...

ML Engineer
Catalyst Labs • Palo Alto, CA, US
Full-time
Is a rapidly growing Tier 1 VC backed startup based in New York with $60 million in funding revolutionizing how outside sales and service teams work. Their AI technology captures and analyzes real-w...

Lead Engineer, Detection & Response - 100% REMOTE
Jobot • Mountain View, CA, US
Remote
Full-time
This Jobot Job is hosted by: Katherine Krull. Are you a fit? Easy Apply now by clicking the "Apply Now" button and sending us your resume. Salary: $180,000 - $200,000 per year. Come join a growing com...

Machine Learning Infrastructure Engineer
Institute of Foundation Models • Sunnyvale, CA, US
Full-time
About the Institute of Foundation Models. We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next...

Machine Learning Engineer, NLP and multimodal
Newsbreak • Mountain View, California, United States
Full-time
NewsBreak is redefining the way users interact with local news and their communities. By bridging local users, local content creators, and local businesses, our mission is to foster safer, more vibr...

Principal Engineer MLOps (DLP Detection)
Palo Alto Networks • Santa Clara, CA, US
Full-time
At Palo Alto Networks® everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer a...

Software Engineer L4, Machine Learning Platform (Metaflow)
Netflix • Los Gatos, California, United States
Full-time
Netflix is one of the world's leading entertainment services, with 283 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and lan...

Senior Machine Learning Operations (MLOps) Engineer
Bonfy-ai • Mountain View, California, United States
Full-time
AI is building the trust layer for generative AI. Our Adaptive Content Security platform detects and mitigates subtle risks embedded in large language model (LLM) outputs before they reach users. Fro...