Talent.com
Machine Learning, Platform Engineer
Together AI • San Francisco, CA, United States
Posted 30 days ago
Job type
  • Full-time
Job description

This role focuses on enabling custom models and dedicated inference on Together. We are responsible for optimizing autoscaling, minimizing cold starts, achieving the best end-to-end model performance, and providing a best-in-class developer experience with great tooling.

Required Qualifications

  • 5+ years of demonstrated experience building large-scale, fault-tolerant distributed systems and API microservices
  • Experience with serverless inference platforms, short-notice model bring-up, on-call rotations, or general cloud providers is a very big plus
  • Good taste and ability to thoughtfully discuss how what you’ve built has failed over time
  • Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
  • Excellent understanding of low-level operating system concepts, including concurrency, networking, storage, performance, and scale
  • Expert-level programmer in one or more of Golang, Rust, Python, C++, or Haskell
  • Proficiency in writing and maintaining Infrastructure as Code (IaC) using tools like Terraform
  • Experience with Kubernetes or other container orchestration systems
  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
  • Experience in writing-heavy roles or companies is a plus

Key Responsibilities

  • New hires may work on multi-cluster orchestration, portfolio optimization, predictive autoscaling, control planes, model bring-up, light model optimization, APIs for managing deployments, inference worker SDKs, and CLI tools.
  • Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
  • Partner with product teams to understand functional requirements and deliver solutions that meet business needs
  • Write clear, well-tested, and maintainable software and IaC for both new and existing systems
  • Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $250,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy
