Role: Site Reliability Engineer (SRE)
Location: Los Angeles, CA
Remote position
Full-time position
Job Description
- Site Reliability Engineer
- Experience in Cloud platforms (AWS, Azure, Google Cloud) and hybrid environments.
- Proficiency in container technologies (Docker, containerd, Podman).
- Strong knowledge of Linux administration and networking concepts.
- Experience with Infrastructure as Code (IaC) tools like Terraform, Ansible, Helm, or Pulumi.
- Monitoring and logging expertise using Prometheus, Grafana, ELK, Datadog, or Splunk.
- Hands-on experience with CI/CD pipelines and DevOps tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD).
- Proficiency in scripting/programming (Python, Bash, Go) for automation.
- Strong troubleshooting and incident management skills.
- We are seeking a highly skilled Site Reliability Engineer (SRE) to manage, optimize, and ensure the reliability of infrastructure.
- The ideal candidate will have deep expertise in ELK, Dynatrace, PagerDuty, PowerShell, container orchestration, cloud infrastructure, and automation, along with a strong focus on reliability, scalability, and performance.
- Good to have: LogicMonitor and Python knowledge.
- Reliability & Performance: Implement best practices to ensure high availability, scalability, and performance of containerized applications.
- Monitoring & Incident Response: Set up monitoring (Prometheus, Grafana, ELK, Dynatrace, PagerDuty, PowerShell, etc.), troubleshoot issues, and lead incident resolution.
- Automation & Infrastructure as Code (IaC): Develop and maintain Terraform, Helm charts, and Kubernetes manifests for automation.
- CI/CD & DevOps Integration: Work with DevOps teams to optimize CI/CD pipelines for Kubernetes deployments (Jenkins, ArgoCD, FluxCD, etc.).
- Security & Compliance: Implement security best practices for containerized workloads, RBAC, network policies, and vulnerability scanning.
- Capacity Planning & Optimization: Analyze resource usage and optimize infrastructure costs and performance.
- Disaster Recovery & Backup: Implement backup and disaster recovery strategies for Kubernetes workloads.
Thanks, and have a nice day
Manikanth
Sarian Solutions, Inc. | Ph: 732-790-2266 x 201 | Fax: 732-696-4242 | manikanth.d@sariansolutions.com
www.sariansolutions.com | Certified Minority Business Enterprise (WMBE)
Follow us: @sariansol | Check our current openings at https://www.sarianinc.com/work-with-us