A company is looking for a Senior Site Reliability Engineer.
Key Responsibilities
Improve availability, scalability, and performance of distributed systems across device, edge, and cloud environments
Participate in incident response, triage, and post-incident reviews while developing automation and self-healing systems
Design and enhance observability systems and dashboards to monitor system health and AI service behavior
Required Qualifications
10+ years of experience in Site Reliability Engineering, Production Engineering, DevOps, or large-scale distributed systems operations
Bachelor's degree in Computer Science, Engineering, or a related technical discipline
Strong experience running production distributed systems at scale
Proficiency in at least one modern programming language (e.g., Python, Go, Java, C++)
Hands-on experience with cloud environments (Azure, AWS, or GCP)
Senior Site Reliability Engineer • Hartford, Connecticut, United States