We’re looking for an entrepreneurial Senior Machine Learning Engineer with experience taking voice-centric AI systems (TTS, STT, LLM-driven dialog) from prototype to large-scale production. You’ll own the full ML lifecycle—research, data pipelines, training, evaluation, deployment, and ongoing optimization—powering sub-second, natural voice conversations at scale.
This role is ideal for someone passionate about pushing the limits of conversational AI: creating highly optimized, domain-specific models that are faster, leaner, and more cost-efficient than general-purpose solutions. You’ll collaborate closely with product, infrastructure, and compliance teams, while setting the technical bar for model excellence and ML best practices.
Required Skills & Experience
- Experience: 7+ years building production ML systems, including 2+ years in speech or conversational AI. Proven track record deploying large-scale voice AI or LLM products.
- Fine-tuning & compression (LoRA, QLoRA, quantization, pruning, distillation).
- Speech (ASR: Whisper, NeMo, Kaldi; TTS: Tacotron, FastSpeech, VITS).
- LLMs & dialogue (GPT-class, RAG, LangGraph, LangChain, MCP).
- Strong in Python; bonus for TypeScript / Node / Java.
- Infra & Ops (Kubernetes, Helm, Terraform, MLflow / SageMaker).
- Data systems (Kafka, Redis, Postgres, Snowflake).
- Streaming protocols (gRPC, WebSockets, HTTP/2, WebRTC).
- Security & compliance (HIPAA, SOC2, HITRUST).
- Product-oriented, entrepreneurial, strong problem solver, effective communicator, and technical leader.
Desired Skills & Experience
- Optimize & Fine-Tune Models: Apply LoRA, QLoRA, RLHF, and other parameter-efficient techniques. Use quantization, pruning, and distillation to shrink models while preserving quality.
- Build End-to-End Pipelines: Design STT, TTS, and LLM systems achieving
- Scale Inference: Optimize serving on Kubernetes / EKS with dynamic batching, speculative decoding, and streaming protocols.
- Advance Dialogue Management: Extend LangGraph / LangChain flows and MCP schemas for complex multi-turn conversations.
- Data & Evaluation: Develop pipelines for conversational logs (Kafka → Snowflake / S3) and create frameworks to measure ASR accuracy, TTS quality, and task completion.
- Lead & Mentor: Define ML best practices, champion model CI/CD and monitoring, and mentor teammates on ML Ops, speech processing, and prompt engineering.
- Innovate & Research: Run POCs with cutting-edge models (e.g., Whisper-v3, Bark) and stay ahead of the latest in speech + LLM research.
- Ensure Reliability & Compliance: Implement HIPAA-grade security, PHI safeguards, and robust fallback strategies.
The Offer
- Bonus eligible
You will receive the following benefits:
- Medical, Dental, and Vision Insurance
- Vacation Time
- Stock Options
Applicants must be currently authorized to work in the US on a full-time basis now and in the future.