Talent.com
Software Engineer - ML/LLM Inference

Alldus • San Francisco, CA, United States

Principal Recruitment Consultant | AI & Machine Learning | Co-organizer of the AI in Action Podcast

My client is searching for a talented engineer to work on ML / LLM inference and serving. They specialize in developing next-gen LLM fine-tuning and inference engines.

We are seeking a talented and motivated Software Engineer specializing in Machine Learning (ML) and Large Language Model (LLM) inference to join our dynamic ML Inference team. In this role, you will bridge the gap between AI / ML research and systems programming to build and enhance our next-generation LLM Inference Engine. You will play a crucial role in optimizing the performance, scalability, and efficiency of our LLM serving systems.

Key Responsibilities:

Develop and Enhance Inference Engine:

  • Design, implement, and optimize the next-generation LLM Inference Engine.
  • Integrate the latest LLM inference techniques from research to enhance latency and throughput.

Performance Optimization:

  • Conduct deep performance optimizations across multiple layers of the technology stack, including PyTorch, C++, and CUDA.
  • Analyze and improve system performance to meet the demands of various use cases.
  • Work closely with customers to understand specific performance requirements and optimize solutions accordingly.
  • Provide technical expertise and support to ensure successful deployment and operation of inference systems.

Technical Leadership:

  • Define the roadmap and technical vision for the inference stack.
  • Lead initiatives to drive innovation and maintain the competitive edge of our inference technologies.

Infrastructure Development:

  • Collaborate with partner teams to build and maintain scalable, multi-replica serving infrastructure.
  • Ensure the reliability and scalability of LLM serving systems to handle increasing workloads.

Qualifications:

Technical Skills:

  • Proficiency in systems programming languages such as C++.
  • Strong experience with machine learning frameworks, particularly PyTorch.
  • Expertise in GPU programming and CUDA for performance optimization.
  • Solid understanding of AI / ML concepts, especially related to large language models.

Experience:

  • Proven experience in developing and optimizing ML / LLM inference systems.
  • Demonstrated ability to integrate research advancements into production systems.
  • Experience with performance tuning and profiling across various technology stacks.
  • Experience with vLLM.

Seniority level: Mid-Senior level

Employment type: Full-time

Industries: Staffing and Recruiting and Software Development
