U.S. citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to provide sponsorship at this time. No corp-to-corp arrangements.
As a Site Reliability Engineer, you will work with product development teams to design infrastructure and to analyze and resolve infrastructure issues. You will also design, deploy, and manage automation tools that increase predictability, shorten time to market, and reduce cost.
Required Skills:
• Expertise with cloud-based, continuous-deployment software development lifecycles (e.g. CI/CD)
• Mastery of infrastructure automation technologies (like Terraform, CodeDeploy, Puppet, Ansible, Chef)
• Expertise in container/container-fleet-orchestration technologies (like Docker, Vagrant, Mesosphere, etcd, ZooKeeper)
• Cloud and container native Linux administration/build/management skills (e.g. AWS AMIs, Packer, etc.)
• Cloud database operations and deployment experience (e.g. RDS MySQL/Postgres/Aurora) and caching operations and deployment experience (e.g. Memcached, Redis)
• Expertise with Lean/Agile deployment processes (Blue/Green, ZDT, canary, load balancers/DNS strategies)
• Familiarity with site and infrastructure monitoring systems (like AWS CloudWatch, Datadog, New Relic, Sumo Logic)
• Strong problem solving, root cause analysis, and systems engineering skills
• Excellent presentation and communication skills
• Experience with programming in languages like JavaScript, Python, PHP, Go, or Ruby
• Strong skills in reading, understanding, and writing code in those languages
• Ability to design and manage escalation response plans driven by monitoring, and to react, respond, remediate, and run retrospectives in culturally aligned (proactive, customer-focused, collaborative, data-driven) ways
• Demonstrated expertise building and managing highly scaled production infrastructure in the cloud (AWS required; GCP, Azure, OpenStack a plus)
• Expertise with SDLC branching, SCM, and code deployment systems (e.g. git/gitflow, Jenkins, CircleCI, TravisCI, etc.)
• BS degree in Computer Science (or a related technical field and/or equivalent industry experience)
Senior Site Reliability Engineer (SRE) • Needham, MA, US