Job Description :
Pay Range : $50/hr - $55/hr
- The Java Site Reliability Engineer (SRE) is responsible for ensuring the availability, performance, scalability, and security of Java-based applications and microservices.
- This role combines strong Java engineering expertise with DevOps and automation capabilities to improve system reliability, optimize deployments, and enhance observability.
- The ideal candidate will collaborate closely with development and infrastructure teams to drive operational excellence and continuous improvement.
Responsibilities :
Reliability And Performance :
- Ensure high availability and reliability of Java-based applications and microservices.
- Monitor system performance, analyze logs, and troubleshoot production issues.
- Support SLO, SLI, and SLA definitions, tracking, and optimization.
- Perform capacity planning, load testing, and performance tuning.
Automation And DevOps Enablement :
- Automate operational processes using scripting and tools such as Python, Bash, Ansible, and Jenkins.
- Develop and maintain CI/CD pipelines to improve delivery speed and stability.
- Implement Infrastructure as Code using Terraform, Helm, or CloudFormation.
- Enhance deployment workflows and reduce manual operational efforts.
Java Engineering And Platform Support :
- Collaborate with development teams to improve service architecture and reliability.
- Review and optimize Java services (Spring Boot and Microservices) for performance and resiliency.
- Participate in on-call rotation and lead incident response and root cause analysis.
Monitoring, Logging, And Observability :
- Implement and optimize observability tools such as Prometheus, Grafana, ELK, Client, or Datadog.
- Build dashboards, alerts, and metrics to detect anomalies and prevent outages.
- Develop automated self-healing and remediation solutions.
Security And Compliance :
- Enforce secure coding and deployment practices.
- Ensure compliance with organizational security standards.
- Patch vulnerabilities and support regular audits.
Requirement / Must Have :
- Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- 8+ years of experience as an SRE, DevOps Engineer, or Java Engineer.
- Strong experience with Java 8+, Spring Boot, Microservices, and JVM tuning.
- Hands-on experience with Linux system administration and performance tuning.
- Experience with CI/CD tools such as Jenkins, GitLab CI, GitHub Actions, or Azure DevOps.
- Experience with containers and orchestration tools such as Docker and Kubernetes (GKE, EKS, AKS, or OpenShift).
- Experience with Infrastructure as Code tools like Terraform, Helm, or Ansible.
- Experience with monitoring and logging tools such as Prometheus, Grafana, ELK, or Client.
- Strong debugging and incident management skills.
- Proficiency in scripting languages such as Python, Bash, or Shell.
Should Have :
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Knowledge of distributed systems, service mesh, and API gateway technologies.
- Experience with message queues such as Kafka or RabbitMQ.
- Familiarity with chaos engineering and resilience testing.
- Experience with blue-green, canary, or progressive deployment strategies.
Soft Skills :
- Strong analytical thinking and ability to troubleshoot complex systems.
- Excellent communication and cross-team collaboration skills.
- Ownership mindset with a focus on continuous improvement.
- Ability to work in fast-paced Agile environments.
Qualification And Education :
Bachelor’s degree in Computer Science, Engineering, or related field required.