Machine Learning Performance Engineer

Jane Street • New York, New York, US
30+ days ago
Job type
  • Full-time
Job description

We are looking for an engineer with experience in low-level systems programming and optimization to join our growing ML team.

Machine learning is a critical pillar of Jane Street's global business. Our ever-evolving trading environment serves as a unique, rapid-feedback platform for ML experimentation, allowing us to incorporate new ideas with relatively little friction.

Your part here is optimizing the performance of our models – both training and inference. We care about efficient large-scale training, low-latency inference in real-time systems, and high-throughput inference in research. Part of this is improving straightforward CUDA, but the interesting part needs a whole-systems approach, including storage systems, networking, and host- and GPU-level considerations. Zooming in, we also want to ensure our platform makes sense even at the lowest level – is all that throughput actually goodput? Does loading that vector from the L2 cache really take that long?

If you’ve never thought about a career in finance, you’re in good company. Many of us were in the same position before working here. If you have a curious mind and a passion for solving interesting problems, we have a feeling you’ll fit right in.

There’s no fixed set of skills, but here are some of the things we’re looking for:

  • An understanding of modern ML techniques and toolsets
  • The experience and systems knowledge required to debug a training run’s performance end to end
  • Low-level GPU knowledge of PTX, SASS, warps, cooperative groups, Tensor Cores, and the memory hierarchy
  • Debugging and optimization experience using tools like CUDA-GDB, Nsight Systems, and Nsight Compute
  • Library knowledge of Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS
  • Intuition about the latency and throughput characteristics of CUDA graph launch, tensor core arithmetic, warp-level synchronization, and asynchronous memory loads
  • Background in InfiniBand, RoCE, GPUDirect, PXN, rail optimization, and NVLink, and how to use these networking technologies to link up GPU clusters
  • An understanding of the collective algorithms supporting distributed GPU training in NCCL or MPI
  • An inventive approach and the willingness to ask hard questions about whether we're taking the right approaches and using the right tools

If you're a recruiting agency and want to partner with us, please reach out to .

