Distributed Systems Engineer - San Francisco, CA
A company building frontier-scale AI models that automate software engineering and AI research, combining ultra-long context, domain-specific RL, and massive compute infrastructure, is looking for a Distributed Systems Engineer to join its team.
What Will I Be Doing :
- Design and build distributed data and coordination systems that enable ultra-long-context model training and inference
- Develop high-performance storage and caching systems to support large-scale GPU workloads
- Work deep in the internals of modern deep learning frameworks in highly distributed environments
- Build automation for fault detection, recovery and high availability across GPU clusters
- Troubleshoot complex, cross-stack issues spanning GPUs, networking, storage, operating systems and cloud infrastructure
What We’re Looking For :
- Deep expertise in distributed systems design and public cloud platforms
- Proven experience designing and operating highly available, high-throughput data systems
- Strong knowledge of distributed databases, batch or stream processing systems, and/or distributed file systems
- Exceptional problem-solving ability across the full systems stack
- A hands-on mindset with the curiosity and grit to learn fast in a frontier technical environment
What’s In It for Me :
- Salary of $225K–$550K dependent on experience + significant equity
- Great benefits inc. 401(k) with 6% company match, comprehensive health, unlimited PTO
- Visa sponsorship and SF relocation stipend available
- Well-funded ($465M+) with backing from top investors
Apply now for immediate consideration!