Job Title: CockroachDB DBA
Job Location: Austin, TX / Sunnyvale, CA (Onsite)
Job Type: Full-time
Job Description: Key Responsibilities
- Design, deploy, operate, and scale multi-region CockroachDB clusters in production environments
- Ensure high availability, fault tolerance, and data consistency for globally distributed clusters
- Monitor cluster health, latency, replication status, and resource utilization using observability tools
- Perform capacity planning and proactive scaling for future growth
- Troubleshoot complex database and infrastructure issues, including:
  - Node failures
  - Network partitions
  - Leaseholder and range imbalance
  - Replication lag
  - Hot spots
  - High latency and throughput bottlenecks
- Design disaster recovery strategies (multi-region, backup / restore, failover / failback)
- Implement and test backup, restore, and point-in-time recovery processes
- Automate provisioning, scaling, patching, and upgrades of CRDB clusters
- Perform rolling upgrades with zero or near-zero downtime
- Optimize SQL query performance and database schema efficiency
- Create operational runbooks, SOPs, and on-call playbooks for CRDB
- Participate in on-call rotations and incident response for production clusters