Job Description
Role Overview
We are seeking a Senior DevOps Engineer to design, build, and operate secure, scalable CI/CD and infrastructure platforms supporting production AI and ML workloads. This role enables Machine Learning Engineers, MLOps Engineers, and Data Engineers by ensuring reliable deployment, monitoring, and operation of mission-critical systems.
This is a hands-on infrastructure ownership role for an engineer comfortable operating production platforms, troubleshooting complex systems, and working in security-conscious, compliance-driven environments.
Responsibilities
DevOps & Platform Engineering
- Design, implement, and maintain automated CI/CD pipelines supporting application, data, and ML deployments.
- Build, operate, and scale Kubernetes-based platforms for containerized workloads.
- Manage and optimize Linux-based production systems to ensure performance, reliability, and scalability.
- Implement infrastructure as code using tools such as Terraform or equivalent frameworks.
- Support infrastructure for data and ML platforms handling TB- to PB-scale datasets.
Reliability, Monitoring & Security
- Monitor system health, performance, and availability using tools such as Prometheus, Grafana, and centralized logging solutions.
- Ensure infrastructure and deployment pipelines meet availability, reliability, and performance targets, including 99.9% uptime.
- Partner with security and compliance teams to align platforms with standards such as NIST 800-53 and FedRAMP.
- Troubleshoot and resolve infrastructure, deployment, and performance issues in production environments.
Collaboration & Enablement
- Work closely with ML, MLOps, and Data Engineering teams to enable reliable model training, deployment, and scaling.
- Provide platform tooling, documentation, and operational guidance to development teams.
- Contribute to operational runbooks, system documentation, and continuous improvement initiatives.
Required Qualifications
- U.S. Citizen with an active DoD, Intelligence Community, or DHS clearance, or eligibility to obtain and maintain one.
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent professional experience.
- 5+ years of DevOps engineering experience supporting production environments.
- 5+ years of Linux system administration experience, including performance tuning and troubleshooting.
- Hands-on experience with Kubernetes and Docker in production environments.
- Experience deploying and managing infrastructure in Azure, AWS, or GCP, with Azure experience strongly preferred.
- Proficiency with scripting languages such as Bash and/or Python.
Preferred Qualifications
- Experience supporting AI or ML workloads in production environments.
- Experience operating in federal, defense, healthcare, or other regulated environments.
- Familiarity with monitoring and logging stacks such as Prometheus, Grafana, and ELK.
- Experience with infrastructure-as-code tools such as Terraform.
- Hands-on experience supporting hybrid or bare-metal infrastructure environments.
- Relevant certifications, including:
  - Certified Kubernetes Administrator (CKA)
  - AWS Certified DevOps Engineer – Professional
  - Microsoft Certified: Azure DevOps Engineer Expert
Benefits & Professional Growth
- Competitive salary and comprehensive health benefits.
- 401(k) with company matching.
- Clearance sponsorship for eligible candidates.
- Training and certification support for DevOps, Kubernetes, and cloud platforms.
- Clear growth path into Lead DevOps Engineer or Platform Engineering leadership roles.
Equal Employment Opportunity
PTF Consulting, LLC is an Equal Employment Opportunity employer committed to building a diverse and inclusive workforce. We provide equal opportunity to all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or veteran status, in compliance with applicable federal and state employment laws.