Talent.com
Inference Engineer, Video AI

Cantina • San Francisco, CA, United States
Posted 30 days ago
Job type: Full-time

Job description

A bit about Cantina:

Cantina, founded by Sean Parker, is a new social platform with the most advanced AI character creator. Build, share, and interact with AI bots and your friends directly in the Cantina or across the internet.

Cantina bots are lifelike, social creatures, capable of interacting wherever humans go on the internet. Recreate yourself using powerful AI, imagine someone new, or choose from thousands of existing characters. Bots are a new media type that offers creators a way to share infinitely scalable, personalized content experiences combined with seamless group chat across voice, video, and text.

If you're excited about the potential AI has to shape human creativity and social interactions, join us in building the future!

A bit about the role:

We're looking for an Inference Engineer who specializes in productionizing and hosting video AI models at scale. You'll be responsible for taking cutting-edge neural networks from research to production, building robust inference infrastructure, and optimizing model performance for real-time applications. This role focuses on the deployment and serving of large video models.

As an Inference Engineer, you will:

  • Deploy video AI models to production - Take research models and build production-ready inference endpoints with APIs, ensuring efficient operation across cloud infrastructure.
  • Maintain and optimize inference systems - Debug complex model serving issues, reduce latency, monitor system health, and ensure 99.9% uptime for AI-powered features.
  • Implement model optimizations - Work with neural network architectures including diffusion networks, VAEs, and transformers. Apply streaming optimizations and understand video model architectures to implement effective performance improvements.
  • Manage inference infrastructure - Leverage containerization with Docker, cloud storage solutions like S3, and cluster computing to build scalable model serving infrastructure.
  • Collaborate with research teams - Work closely with AI researchers to understand model requirements, architectural constraints, and optimization opportunities for new video generation models.

A bit about you:

  • 2+ years of ML engineering experience with focus on model inference and deployment
  • Strong understanding of neural network architectures, particularly diffusion networks, VAEs, and transformer models
  • Experience with video and image models - Understanding of how video/image generation models work, their architectures, and optimization strategies specific to video processing
  • Multi-GPU inference expertise - Experience running model components across multiple GPUs, implementing parallel processing strategies for large models
  • Production model hosting experience - Track record of deploying and maintaining ML models in production environments, including streaming and real-time inference
  • Experience with containerization (Docker), AWS, and cluster computing environments
  • Familiarity with machine learning frameworks (PyTorch, TensorFlow)
  • Experience with inference platforms and model serving solutions

Technical Stack You'll Work With:

  • Cloud: AWS (S3, DynamoDB), Kubernetes clusters
  • ML Infrastructure: Model serving platforms, Docker
  • Languages: Python
  • Frameworks: PyTorch, TensorFlow
  • Models: Video generation models, diffusion networks, VAEs, transformers
  • Optimization: Multi-GPU inference, real-time processing techniques

Pay Equity:

In compliance with Pay Transparency Laws, the base salary range for this role is $175,000 to $225,000 for those located in the San Francisco Bay Area, New York City, and Seattle, WA. Compensation is determined by a number of factors, including skills, experience, job scope, location, and competitive compensation market data.

Benefits:

  • Health Care - Cantina pays 99% of premiums for medical, vision, and dental coverage, plus a One Medical membership.
  • Monthly Wellness Stipend - $500/month to use on whatever you'd like!
  • Rest and Recharge - 15 PTO days per year, 10 sick days, all federal holidays, and 2 floating holidays.
  • 401(k) - Eligible to participate on day one of employment.
  • Parental Leave & Fertility Support
  • Competitive Salary & Equity
  • Lunch and snacks provided for in-office employees.
  • WFH equipment provided for full-time hybrid/remote employees.