Role : Staff Machine Learning Engineer
Location : San Jose, CA (Onsite; local candidates)
Duration : Long-term
Mode of Interview : Virtual & Final In-person
Why this role exists
We're building privacy-preserving LLM capabilities that help hardware design teams reason over Verilog / SystemVerilog and RTL artifacts : code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. We're looking for a Staff-level engineer to technically lead a small, high-leverage team that fine-tunes and productizes LLMs for these workflows in a strict enterprise data-privacy environment.
You don't need to be a Verilog / RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL / EDA fluency is a strong plus.
What you'll do (Responsibilities)
Own the technical roadmap for Verilog / RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
Lead a hands-on team of applied scientists / engineers : set direction, unblock technically, review designs / code, and raise the bar on experimentation velocity and reliability.
Fine-tune and customize models using state-of-the-art techniques (LoRA / QLoRA, PEFT, instruction tuning, preference optimization / RLAIF) with robust HDL-specific evals :
o Compile / lint / simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and does-it-synthesize checks.
Design privacy-first ML pipelines on AWS :
o Training / customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe / Triton / DJL) for bespoke training needs.
o Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least-privilege, CloudTrail auditing, and Secrets Manager for credentials.
o Enforce encryption in transit / at rest, data minimization, and no public egress for customer / RTL corpora.
Stand up dependable model serving : Bedrock model invocation where it fits, and / or low-latency self-hosted inference (vLLM / TensorRT-LLM), autoscaling, and canary / blue-green rollouts.
Build an evaluation culture : automatic regression suites that run HDL compilers / simulators, measure behavioral fidelity, and detect hallucinations / constraint violations; model cards and experiment tracking (MLflow / Weights & Biases).
Partner deeply with hardware design, CAD / EDA, Security, and Legal to source / prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
Drive productization : integrate LLMs with internal developer tools (IDEs / plugins, code review bots, CI), retrieval (RAG) over internal HDL repos / specs, and safe tool-use / function-calling.
Mentor & uplevel : coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.
What you'll bring (Minimum qualifications)
10+ years total engineering experience with 5+ years in ML / AI or large-scale distributed systems; 3+ years working directly with transformers / LLMs.
Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
Deep hands-on skill with PyTorch, Hugging Face Transformers / PEFT / TRL, distributed training (DeepSpeed / FSDP), quantization-aware fine-tuning (LoRA / QLoRA), and constrained / grammar-guided decoding.
AWS expertise to design and defend secure enterprise deployments, including :
o Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
o SageMaker (Training, Inference, Pipelines), S3, EC2 / EKS / ECR, VPC / Subnets / Security Groups, IAM, KMS, PrivateLink, CloudWatch / CloudTrail, Step Functions, Batch, Secrets Manager.
Strong software engineering fundamentals : testing, CI / CD, observability, performance tuning; Python a must (bonus for Go / Java / C++).
Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.
Nice to have (Preferred qualifications)
Familiarity with Verilog / SystemVerilog / RTL workflows : lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys / Cadence / Mentor).
Experience integrating static analysis / AST-aware tokenization for code models or grammar-constrained decoding.
RAG at scale over code / specs (vector stores, chunking strategies), tool-use / function-calling for code transformation.
Inference optimization : TensorRT-LLM, KV-cache optimization, speculative decoding; throughput / latency tradeoffs at batch and token levels.
Model governance / safety in the enterprise : model cards, red-teaming, secure eval data handling; exposure to SOC 2 / ISO 27001 / NIST frameworks.
Data anonymization, DLP scanning, and code de-identification to protect IP.
What success looks like
90 days
Baseline an HDL-aware eval harness that compiles / simulates; establish secure AWS training & serving environments (VPC-only, KMS-backed, no public egress).
Ship an initial fine-tuned / customized model with measurable gains vs. base (e.g., +X% compile-pass rate, -Y% lint findings per K LOC generated).
180 days
Expand customization / training coverage (Bedrock for managed FMs including Anthropic; SageMaker / EKS for bespoke / open models).
Add constrained decoding + retrieval over internal design specs; productionize inference with SLOs (p95 latency, availability) and audited rollout to pilot hardware teams.
12 months
Demonstrably reduce review / iteration cycles for RTL tasks with clear metrics (defect reduction, time-to-lint-clean, % auto-fix suggestions accepted), and a stable MLOps path for continuous improvement.
How we work (Security & privacy by design)
Customer and internal design data remain within private AWS VPCs; access is granted via IAM roles and audited by CloudTrail; all artifacts are encrypted with KMS.
No public internet calls for sensitive workloads; Bedrock access via VPC interface endpoints / PrivateLink with endpoint policies; SageMaker and / or EKS run in private subnets.
Data pipelines enforce minimization, tagging, retention windows, and reproducibility; DLP scanning and redaction are first-class steps.
We produce model cards, data lineage, and evaluation artifacts for every release.
Tech you'll touch
Modeling : PyTorch, HF Transformers / PEFT / TRL, DeepSpeed / FSDP, vLLM, TensorRT-LLM
AWS & MLOps : Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training / Inference / Pipelines), MLflow / W&B, ECR, EKS / KServe / Triton, Step Functions
Platform / Security : S3 + KMS, IAM, VPC / PrivateLink (incl. Bedrock), CloudWatch / CloudTrail, Secrets Manager
Tooling (nice to have) : HDL toolchains for compile / simulate / lint, vector stores (pgvector / OpenSearch), GitHub / GitLab CI