Principal Kafka Support & Reliability Engineer
Purple Drive, Canton, Massachusetts, USA
Role: Principal Kafka Support & Reliability Engineer
Location: Canton, MA
Role Descriptions:

1. Tier 3 Incident Management and Escalation Support
- Act as the highest technical escalation point for Kafka production incidents (Sev 1 and Sev 2).
- Lead deep troubleshooting across:
  - Broker instability, controller elections, ISR shrinkage
  - Under-replicated partitions and leader imbalance
  - Producer/consumer failures, lag spikes, and rebalance storms
  - Disk, network, JVM, and request handler saturation
- Provide hands-on remediation for complex issues, including:
  - Partition reassignment and leader rebalance
  - Broker configuration tuning
  - Throttle/quota strategies for noisy producers or consumers
- Coordinate with vendor support during service incidents, providing logs, metrics, and forensic details.
- Guide Tier 2 teams during major incidents and validate restoration actions.

2. Kafka Performance Engineering and Optimization
- Analyze Kafka workloads for performance and scalability risks:
  - Partition skew and hot partitions
  - Inefficient producer batching/compression
  - Consumer lag root cause analysis
  - Thread pool, I/O, and network bottlenecks
- Recommend and validate:
  - Topic design (partition count, replication factor, retention, compaction)
  - Producer and consumer configuration best practices
  - Quotas, quota enforcement, and multi-tenant controls
- Support onboarding of high-throughput or latency-sensitive workloads, ensuring Kafka is correctly sized and tuned.

3. Platform Stability, Reliability and Resilience
- Diagnose and resolve systemic Kafka stability issues:
  - Repeated broker failures or flapping
  - Metadata/controller instability (ZooKeeper or KRaft)
  - Recovery issues following failovers or maintenance events
- Support resilience initiatives:
  - Multi-AZ cluster health validation
  - Replication and DR strategies (MirrorMaker 2, Replicator, or app-level DR patterns)
  - Failover testing and validation
- Define and improve Kafka SLOs for availability, durability, and latency.

4. Change, Upgrade and Configuration Leadership
- Lead medium- to high-risk Kafka changes, including:
  - Broker and cluster configuration changes
  - Partition expansion or large-scale reassignment
  - Topic policy changes impacting durability or performance
- Support and plan Kafka version upgrades:
  - MSK/Confluent upgrade cycles
  - Client compatibility and rollout strategies
- Participate in CAB reviews, assess risk, and design rollback and validation plans.

5. Root Cause Analysis and Continuous Improvement
- Own RCA documentation for major incidents with clear corrective and preventive actions (CAPA).
- Identify recurring failure patterns and architectural gaps.
- Recommend platform-level improvements:
  - Automation opportunities
  - Guardrails and standards
  - Monitoring and alerting enhancements
- Contribute to continuous improvement of runbooks, knowledge base articles, and operational playbooks.
Essential Skills:

Role Overview: The Kafka Tier 3 Support Engineer is a senior technical role responsible for expert-level support, advanced troubleshooting, performance engineering, and platform stabilization of enterprise Apache Kafka environments. This role functions as the final technical escalation point for Kafka-related production incidents and is accountable for root cause analysis (RCA), complex remediation, and long-term prevention. The engineer works closely with Tier 2 operations, Platform Engineering, SRE teams, application teams, and vendor support (AWS MSK/Confluent/Cloud providers) to ensure Kafka remains a highly reliable, scalable, and secure streaming backbone.
Desirable Skills:
Keyword:
Skills: Digital: Kafka; Digital: Amazon Connect; Digital: Kubernetes
Experience Required: 10 & Above