A company is looking for an AI Model Serving Specialist to operationalize AI workloads by deploying and optimizing model-serving platforms.
Key Responsibilities
Package and deploy ML/LLM models on platforms such as Triton, vLLM, or KServe within Kubernetes clusters
Integrate models with the Unified Inference API and ensure secure GPU resource allocation
Assist solution architects in onboarding customers and provide troubleshooting support
Required Qualifications
Hands-on experience with NVIDIA Triton, vLLM, or similar serving stacks
Strong knowledge of Kubernetes and GPU scheduling
Familiarity with VMware Cloud Foundation (VCF) 9 and NSX-T networking
Proficiency in Python and containerization (Docker)
Understanding of observability stacks and FinOps principles
AI Specialist • Coral Gables, Florida, United States