Talent.com
Site Reliability Engineering
TechniPros • Atlanta, Georgia, USA

Job Title : Site Reliability Engineering (SRE) Architect

Location : Atlanta, Georgia (Hybrid)

Long Term Contract

Looking for W2 candidates. No C2C.

Job Description :

Role Summary :

As an SRE Architect, you will be a pivotal technical leader responsible for designing, building, and evolving the foundational systems and practices that ensure the reliability, scalability, performance, and efficiency of our critical services. Moving beyond day-to-day operations, you will focus on the strategic architectural direction of the SRE function, defining standards, blueprints, and frameworks that enable development teams and the SRE operations team to build and operate highly resilient systems. You will leverage deep expertise in software engineering, distributed systems, cloud infrastructure, and SRE principles to influence technology choices, establish best practices, and foster a proactive culture of reliability across the organization, extending well beyond the observability pillar.

Key Responsibilities :

Reliability Strategy & Design :

  • Architect and design highly available, scalable, secure, and cost-effective infrastructure and application patterns on AWS
  • Define and evangelize SRE best practices, standards, and blueprints for service design, deployment, monitoring, and operational readiness across the engineering organization
  • Review the current observability implementation to identify gaps, and define the steps needed to reach the next level of maturity so that the observability setup provides deep insight into system health and behaviour
  • As overall maturity improves, lead the definition and implementation strategy for Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets for critical services

Platform Architecture & Automation :

  • Design solutions to systematically reduce operational toil through automation and improved system design
  • Evaluate current SRE tools and automation frameworks (e.g. CI / CD pipelines, Infrastructure as Code modules, automated incident remediation, chaos engineering platforms) and suggest enhancements that improve overall capability
  • Evaluate, prototype, and recommend new technologies, tools, and methodologies to enhance system reliability, developer productivity, and operational efficiency

Technical Leadership & Consultation :

  • Act as a senior technical advisor and subject matter expert on reliability, scalability, and performance for development and platform teams
  • Provide architectural guidance during the design phase of new services and features to ensure reliability principles are embedded early (shift-left)
  • Mentor and coach other SREs and engineers, fostering technical excellence and adherence to SRE principles
  • Lead architectural reviews and production readiness assessments for critical systems

Resilience :

  • Lead blameless postmortems for significant incidents, ensuring root causes are identified and systemic architectural improvements are prioritized and implemented
  • Architect and advocate for resilience patterns (e.g. circuit breaking, rate limiting, graceful degradation, chaos engineering) within applications and infrastructure

Required Qualifications :

  • Proven experience in an architectural role designing solutions for reliability, scalability, and performance
  • Deep understanding and practical application of SRE principles (SLIs / SLOs, error budgets, toil reduction, automation, incident management, postmortems)
  • Expertise in cloud computing platforms (e.g. AWS), including infrastructure, networking, and security services
  • Strong experience with containerization and orchestration technologies (Kubernetes, Docker, serverless computing)
  • Solid experience designing and implementing observability solutions (e.g. Dynatrace, Prometheus, Grafana, ELK / EFK Stack, Jaeger, OpenTelemetry)
  • Strong programming / scripting skills (e.g. Python, Go, Bash) for automation and tool development
  • Excellent analytical, problem-solving, and strategic thinking skills
  • Strong communication, collaboration, and leadership skills, with the ability to influence technical direction across teams

Preferred Qualifications :

  • Experience designing and implementing chaos engineering practices and platforms

Best Regards : Jahnavi G

Phone : 1-

Email :

Key Skills :

Kubernetes, FMEA, Continuous Improvement, Elasticsearch, Go, Root Cause Analysis, Maximo, CMMS, Maintenance, Mechanical Engineering, Manufacturing, Troubleshooting

Employment Type : Full Time

Experience : years

Vacancy : 1
