Talent.com
AI Inference Engineer
Quadric, Inc. • Burlingame, CA, US
Job type: Full-time

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can accelerate only a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Role:

The AI Inference Engineer at Quadric is the key bridge between the world of AI/LLM models and Quadric's unique platforms. In this role you will (1) port AI models to the Quadric platform; (2) optimize model deployment for efficient inference; and (3) profile and benchmark model performance. This senior technical role demands deep knowledge of AI model algorithms, system architecture, and AI toolchains and frameworks.

Responsibilities:

  • Quantize, prune, and convert models for deployment
  • Port models to the Quadric platform using the Quadric toolchain
  • Optimize inference deployment for latency and speed
  • Benchmark and profile model performance and accuracy
  • Develop tools to scale and speed up deployment
  • Improve the SDK and runtime
  • Provide technical support and documentation to customers and the developer community

Requirements:

  • Bachelor’s or Master’s degree in Computer Science and/or Electrical Engineering
  • 5+ years of experience with AI/LLM model inference and deployment frameworks and tools
  • Experience with model quantization (PTQ, QAT) and related tools
  • Experience with model accuracy metrics
  • Experience with model inference performance profiling
  • Experience with at least one of the following frameworks: onnxruntime, PyTorch, vLLM, Hugging Face Transformers, neural-compressor, llama.cpp
  • Proficiency in C/C++ and Python
  • Strong problem-solving, debugging, and communication skills

Benefits:

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k, IRA)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Family Leave (Maternity, Paternity)
  • Short Term & Long Term Disability
  • Training & Development
  • Work From Home
  • Free Food & Snacks
  • Stock Option Plan