DevOps Engineer with Datadog Experience
Location : Richmond, VA (NO RELOCATION)
Visa : GC / USC
Interview : Virtual Round
Duration : 6 months
Note :
Prior Capital One experience is preferred; strong experience with financial clients in recent years is also acceptable.
Key Responsibilities :
- Implement and manage full-stack observability using Datadog, ensuring seamless monitoring across infrastructure, applications, and services.
- Instrument agents for on-premise, cloud, and hybrid environments to enable comprehensive monitoring.
- Design and deploy key service monitoring, including dashboards, monitor creation, SLA / SLO definitions, and anomaly detection with alert notifications.
- Configure and integrate Datadog with third-party services such as ServiceNow, SSO enablement, and other ITSM tools.
Core Responsibilities
- Design & Implement Solutions : Build and maintain comprehensive observability platforms that provide deep insights into complex systems, incorporating logs, metrics, and traces.
- System Instrumentation : Instrument applications, infrastructure, and services to collect telemetry data using frameworks like OpenTelemetry.
- Data Analysis & Visualization : Develop dashboards, reports, and alerts using tools like Prometheus, Grafana, and Splunk to visualize system performance and detect issues.
- Collaboration : Work with development, SRE, and DevOps teams to integrate observability best practices and align monitoring with business and operational goals.
- Automation : Develop scripts and use Infrastructure as Code (IaC) tools like Ansible and Terraform to automate monitoring configurations and telemetry collection.
Key Skills & Tools
- Observability Tools : Proficiency in monitoring, logging, and tracing tools, including Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, Datadog, New Relic, and cloud-native solutions like AWS CloudWatch.
- Programming Languages : Expertise in languages such as Python and Go for scripting and automation.
- Infrastructure & Cloud Platforms : Experience with cloud platforms (AWS, GCP, Azure) and container orchestration systems like Kubernetes.
- Infrastructure as Code (IaC) : Familiarity with Terraform and Ansible for managing infrastructure and configurations.
- CI / CD & Automation : Experience with CI / CD pipelines and automation tools like Jenkins.
- System & Software Engineering : A strong background in both system operations and software development.
- Ability to optimize cloud agent instrumentation; cloud certifications are a plus.
- Datadog Fundamentals, APM and Distributed Tracing Fundamentals, and Datadog Demo Certification (Mandatory).
- Strong understanding of observability concepts (logs, metrics, tracing).
- Expertise in security and vulnerability management in observability.
- 2 years of experience in cloud-based observability solutions, specializing in monitoring, logging, and tracing across AWS, Azure, and GCP environments.
Key Skills
Computer Science, User Experience, User Interface, SME, CSS, Interaction Design, Windows, Android, Usability Studies, Visual Design, HTML, User Research, JavaScript, Web Services, Wireframes
Employment Type : Full Time
Experience : years
Vacancy : 1