Talent.com
Software Engineer - Model APIs

BaseTen Labs, Inc. • San Francisco, CA, United States
Job type: Full-time
About Baseten

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to ship AI products.

The Role

Baseten's Model Performance (MP) team is responsible for ensuring the models running on our platform are fast, reliable, and cost-efficient. As part of this team, you'll focus on Model APIs, the infrastructure powering our hosted API endpoints for the latest open-source models. This work spans distributed systems, model serving, and developer experience. You'll join a small, high-impact team operating at the intersection of product, model performance, and infra, helping to define how developers interact with AI models at scale.

Responsibilities

  • Design, build, and operate the Model APIs surface with a focus on advanced inference capabilities: structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving
  • Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, tune memory allocation patterns for maximum throughput, and optimize communication patterns across multi-GPU setups
  • Productionize performance improvements across runtimes with a deep understanding of their internals: speculative decoding implementations, guided generation for structured outputs, and custom scheduling and routing algorithms for high-performance serving
  • Build comprehensive benchmarking frameworks that measure real-world performance across different model architectures, batch sizes, sequence lengths, and hardware configurations
  • Productionize performance improvements across runtimes (e.g., TensorRT, TensorRT-LLM): speculative decoding, quantization, batching, and KV-cache reuse
  • Instrument deep observability (metrics, traces, logs) and build repeatable benchmarks to measure speed, reliability, and quality
  • Implement platform fundamentals: API versioning, validation, usage metering, quotas, and authentication
  • Collaborate closely with other teams to deliver robust, developer-friendly model serving experiences

Requirements
  • 3+ years of experience building and operating distributed systems or large-scale APIs
  • Proven track record of owning low-latency, reliable backend services (rate limiting, auth, quotas, metering, migrations)
  • Infra instincts with performance sensibilities: profiling, tracing, capacity planning, and SLO management
  • Comfortable debugging complex systems, from runtime internals to GPU execution traces
  • Strong written communication; able to produce clear design docs and collaborate across functions

Nice To Have
  • Experience with LLM runtimes (vLLM, SGLang, TensorRT-LLM) or contributions to open-source inference engines (vLLM, TensorRT-LLM, SGLang, TGI)
  • Knowledge of Kubernetes, service meshes, API gateways, or distributed scheduling
  • Background in developer-facing infrastructure or open-source APIs
  • We value infra-leaning generalists who bring strong engineering fundamentals and curiosity. ML experience is a plus, but not required

Benefits
  • Competitive compensation, including meaningful equity
  • 100% coverage of medical, dental, and vision insurance for employees and dependents
  • Generous PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!)
  • Paid parental leave
  • Company-facilitated 401(k)
  • Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities


Apply now to embark on a rewarding journey in shaping the future of AI! If you are a motivated individual with a passion for machine learning and a desire to be part of a collaborative and forward-thinking team, we would love to hear from you.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.
