Talent.com
Staff Machine Learning Engineer / Principal ML Engineer (San Jose)

SRS Consulting Inc • San Jose, CA, US

Role : Staff Machine Learning Engineer

Location : San Jose, CA (Onsite) Locals

Duration : Long-term

Mode of Interview : Virtual & Final In-person

Why this role exists

We're building privacy-preserving LLM capabilities that help hardware design teams reason over Verilog / SystemVerilog and RTL artifacts: code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. We're looking for a Staff-level engineer to technically lead a small, high-leverage team that fine-tunes and productizes LLMs for these workflows in a strict enterprise data-privacy environment.

You don't need to be a Verilog / RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL / EDA fluency is a strong plus.

What you'll do (Responsibilities)

Own the technical roadmap for Verilog / RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.

Lead a hands-on team of applied scientists / engineers : set direction, unblock technically, review designs / code, and raise the bar on experimentation velocity and reliability.

Fine-tune and customize models using state-of-the-art techniques (LoRA / QLoRA, PEFT, instruction tuning, preference optimization / RLAIF) with robust HDL-specific evals :

o Compile / lint / simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and does-it-synthesize checks.
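As an illustration of the pass@k metric named above, here is a minimal sketch of the commonly used unbiased estimator: generate n samples per problem, count the c that pass, and estimate the chance that at least one of k randomly drawn samples passes. The function name and the meaning of "passes" (for example, compiles and passes simulation) are assumptions for illustration, not something this posting specifies.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples generated for the problem
    c: samples that passed the check (assumed here: compile + simulate)
    k: number of samples the user is allowed to draw
    """
    if n <= 0 or not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require n > 0, 0 <= c <= n, and 1 <= k <= n")
    # If fewer than k samples failed, every draw of k contains a passing one.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems given (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimate reduces to the plain pass fraction c / n; larger k rewards models that produce at least one correct design among several attempts.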

Design privacy-first ML pipelines on AWS :

o Training / customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe / Triton / DJL) for bespoke training needs.

o Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least-privilege, CloudTrail auditing, and Secrets Manager for credentials.

o Enforce encryption in transit / at rest, data minimization, no public egress for customer / RTL corpora.

Stand up dependable model serving : Bedrock model invocation where it fits, and / or low-latency self-hosted inference (vLLM / TensorRT-LLM), autoscaling, and canary / blue-green rollouts.

Build an evaluation culture : automatic regression suites that run HDL compilers / simulators, measure behavioral fidelity, and detect hallucinations / constraint violations; model cards and experiment tracking (MLflow / Weights & Biases).

Partner deeply with hardware design, CAD / EDA, Security, and Legal to source / prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.

Drive productization : integrate LLMs with internal developer tools (IDEs / plugins, code review bots, CI), retrieval (RAG) over internal HDL repos / specs, and safe tool-use / function-calling.

Mentor & uplevel : coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.

What you'll bring (Minimum qualifications)

10+ years total engineering experience with 5+ years in ML / AI or large-scale distributed systems; 3+ years working directly with transformers / LLMs.

Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.

Deep hands-on skill with PyTorch, Hugging Face Transformers / PEFT / TRL, distributed training (DeepSpeed / FSDP), quantization-aware fine-tuning (LoRA / QLoRA), and constrained / grammar-guided decoding.

AWS expertise to design and defend secure enterprise deployments, including :

o Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)

o SageMaker (Training, Inference, Pipelines), S3, EC2 / EKS / ECR, VPC / Subnets / Security Groups, IAM, KMS, PrivateLink, CloudWatch / CloudTrail, Step Functions, Batch, Secrets Manager.

Strong software engineering fundamentals : testing, CI / CD, observability, performance tuning; Python a must (bonus for Go / Java / C++).

Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.

Nice to have (Preferred qualifications)

Familiarity with Verilog / SystemVerilog / RTL workflows : lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys / Cadence / Mentor).

Experience integrating static analysis / AST-aware tokenization for code models or grammar-constrained decoding.

RAG at scale over code / specs (vector stores, chunking strategies), tool-use / function-calling for code transformation.

Inference optimization : TensorRT-LLM, KV-cache optimization, speculative decoding; throughput / latency tradeoffs at batch and token levels.

Model governance / safety in the enterprise : model cards, red-teaming, secure eval data handling; exposure to SOC 2 / ISO 27001 / NIST frameworks.

Data anonymization, DLP scanning, and code de-identification to protect IP.

What success looks like

90 days

Baseline an HDL-aware eval harness that compiles / simulates; establish secure AWS training & serving environments (VPC-only, KMS-backed, no public egress).

Ship an initial fine-tuned / customized model with measurable gains vs. base (e.g., +X% compile-pass rate, Y% fewer lint findings per KLOC generated).
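To make the "gains vs. base" comparison concrete, the two example metrics could be computed roughly as follows. This is a sketch under stated assumptions: the EvalRun record, its field names, and the numbers in the usage note are hypothetical and do not come from the posting.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    """Aggregate counts from one evaluation run (hypothetical schema)."""
    samples: int        # generated HDL snippets evaluated
    compiled: int       # snippets that compiled cleanly
    lint_findings: int  # total lint findings across all snippets
    generated_loc: int  # total lines of generated code

    @property
    def compile_pass_rate(self) -> float:
        return self.compiled / self.samples

    @property
    def lint_per_kloc(self) -> float:
        return self.lint_findings / (self.generated_loc / 1000)


def compare(base: EvalRun, tuned: EvalRun) -> dict[str, float]:
    """Report the tuned model's change relative to the base model."""
    return {
        # Absolute change in compile-pass rate, in percentage points.
        "compile_pass_delta_pts": 100 * (tuned.compile_pass_rate - base.compile_pass_rate),
        # Relative change in lint density; negative means fewer findings.
        "lint_per_kloc_change_pct": 100 * (tuned.lint_per_kloc - base.lint_per_kloc) / base.lint_per_kloc,
    }
```

With made-up inputs, compare(EvalRun(200, 100, 400, 20000), EvalRun(200, 130, 300, 20000)) reports roughly a 15-point gain in compile-pass rate and a 25% drop in lint findings per KLOC.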

180 days

Expand customization / training coverage (Bedrock for managed FMs including Anthropic; SageMaker / EKS for bespoke / open models).

Add constrained decoding + retrieval over internal design specs; productionize inference with SLOs (p95 latency, availability) and audited rollout to pilot hardware teams.

12 months

Demonstrably reduce review / iteration cycles for RTL tasks with clear metrics (defect reduction, time-to-lint-clean, % auto-fix suggestions accepted), and a stable MLOps path for continuous improvement.

How we work (Security & privacy by design)

Customer and internal design data remain within private AWS VPCs; access is granted via IAM roles and audited by CloudTrail; all artifacts are encrypted with KMS.

No public internet calls for sensitive workloads; Bedrock access via VPC interface endpoints / PrivateLink with endpoint policies; SageMaker and / or EKS run in private subnets.

Data pipelines enforce minimization, tagging, retention windows, and reproducibility; DLP scanning and redaction are first-class steps.

We produce model cards, data lineage, and evaluation artifacts for every release.

Tech you'll touch

Modeling : PyTorch, HF Transformers / PEFT / TRL, DeepSpeed / FSDP, vLLM, TensorRT-LLM

AWS & MLOps : Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training / Inference / Pipelines), MLflow / W&B, ECR, EKS / KServe / Triton, Step Functions

Platform / Security : S3 + KMS, IAM, VPC / PrivateLink (incl. Bedrock), CloudWatch / CloudTrail, Secrets Manager

Tooling (nice to have) : HDL toolchains for compile / simulate / lint, vector stores (pgvector / OpenSearch), GitHub / GitLab CI
