Lead Software Engineer, Model Serving Platform

Sciforium • San Francisco, CA, United States

Sciforium is an AI infrastructure company developing next‑generation multimodal AI models and a proprietary, high‑efficiency serving platform. Backed by multi‑million‑dollar funding and direct sponsorship from AMD, with hands‑on support from AMD engineers, the team is scaling rapidly to build the full stack powering frontier AI models and real‑time applications.

We offer a fast‑moving, collaborative environment where engineers have meaningful impact, learn quickly, and tackle deep technical challenges across the AI systems stack.

Role Overview

This is a rare chance to help architect and lead the development of Sciforium’s next‑generation model serving platform—the high‑performance engine that will bring a multimodal, highly efficient foundation model to market. As a senior technical leader, you’ll not only build core components yourself but also guide and mentor other engineers, influencing engineering direction, standards, and execution quality.

You will learn and shape the full AI stack: from GPU kernels and quantized execution paths to distributed serving, scheduling, and the APIs that power real‑time AI applications. If you enjoy deep systems work, thrive on ownership, and want to lead engineers in building foundational AI infrastructure, this role puts you at the center of Sciforium’s mission and growth.

Key Responsibilities

Lead the technical direction of the model serving platform, owning architecture decisions and guiding engineering execution.
Build core serving components including execution runtimes, batching, scheduling, and distributed inference systems.
Develop high‑performance C++ and CUDA / HIP modules, including custom GPU kernels and memory‑optimized runtimes.
Collaborate with ML researchers to productionize new multimodal models and ensure low‑latency, scalable inference.
Build Python APIs and services that expose model capabilities to downstream applications.
Mentor and support other engineers through code reviews, design discussions, and hands‑on technical guidance.
Drive performance profiling, benchmarking, and observability across the inference stack.
Ensure high reliability and maintainability through testing, monitoring, and engineering best practices.
Troubleshoot and resolve complex issues across GPU, runtime, and service layers.

Must‑Haves

Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.

5+ years of experience designing and building scalable, reliable backend systems or distributed infrastructure.

Strong understanding of LLM inference mechanics (prefill vs decode, batching, KV cache).

Experience with Kubernetes or Ray, and with containerization.

Strong proficiency in C++ and Python.

Strong debugging, profiling, and performance optimization skills at the system level.

Ability to collaborate closely with ML researchers and translate model or runtime requirements into production‑grade systems.

Effective communication skills and the ability to lead technical discussions, mentor engineers, and drive engineering quality.

Comfortable working from the office and contributing to a fast‑moving, high‑ownership team culture.

Nice to Have

Experience with ML systems engineering, distributed GPU scheduling, or open‑source inference engines such as vLLM, SGLang, or TRT‑LLM.

Experience in building large‑scale ML / MLOps infrastructure.

Proficiency in CUDA or ROCm and experience with GPU profiling tools.

Experience at an AI / ML startup, research lab, or Big Tech infrastructure / ML team.

Familiarity with multimodal model architectures, raw‑byte models, or efficient inference techniques.

Contributions to open‑source ML or HPC infrastructure.

Why Join Us

Opportunity to build frontier‑scale AI infrastructure powering next‑generation LLMs and multimodal models.

Work with top‑tier engineers and researchers across systems, GPUs, and ML frameworks.

Tackle high‑impact performance and scalability challenges in training and inference.

Access state‑of‑the‑art GPU clusters, datasets, and tooling.

Opportunity to publish, patent, and push the boundaries of modern AI.

Join a culture of innovation, ownership, and fast execution in a rapidly scaling AI organization.

Benefits Include

Medical, dental, and vision insurance.

401(k) plan.

Daily lunch, snacks, and beverages.

Flexible time off.

Competitive salary and equity.

Equal Opportunity

Sciforium is an equal opportunity employer. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.

Seniority level

Mid‑Senior level

Employment type

Full‑time

Job function

Engineering and Information Technology

Industries

Technology, Information and Internet
