Site Reliability Engineer (SRE)
San Jose CA 95117
Description
- Experience maintaining production systems on AWS and/or GCP.
- Experience maintaining Kubernetes clusters and managing and debugging containerized applications (Golang, Java, Python).
- Understanding of Kafka, Spark, Storm, Cassandra, Elasticsearch, PostgreSQL, Redis (ElastiCache), ZooKeeper, Nginx, and AWS S3 / GCP GS.
- Understanding of infrastructure-as-code software (e.g. Terraform, AWS CloudFormation, and Google Cloud Deployment Manager).
- Experience with continuous integration practices and tools (Jenkins, Travis CI, CircleCI, etc.).
- Experience with monitoring solutions such as CloudWatch, Stackdriver, Prometheus, Thanos, Graphite, Grafana, ELK, Alert Logic, and Datadog.
- Experience with logging service solutions.
Key Skills
Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting
Employment Type : Full Time
Experience : years
Vacancy : 1