Talent.com
Senior Solution Architect, HPC and AI - NVIS

NVIDIA Corporation • Santa Clara, CA, United States
Job type: Full-time

NVIDIA is the world leader in computer graphics, artificial intelligence, and accelerated computing. For over 25 years, we have been at the forefront of research and engineering around the greatest advances in technology. Our history of innovation drives us to solve the world’s hardest problems.

NVIDIA is looking for a Senior HPC / AI Solutions Architect to join its NVIDIA Infrastructure Specialists Team. Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers. Join the team building many of the largest and fastest AI / HPC systems in the world! We are looking for someone who can work on a dynamic, customer-focused team that requires excellent interpersonal skills. This role interacts with customers, partners, and internal teams to analyze, define, and implement large-scale AI / HPC projects. The scope of these efforts includes a combination of Networking, System Design, and Automation, as well as being the face to the customer!

What You’ll Be Doing:
  • Primary responsibilities will include building robust AI / HPC infrastructure for new and existing customers.
  • Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, training stability, real-time monitoring, logging, and alerting.
  • Engage in and improve the whole lifecycle of services from inception and design through deployment, operation, and refinement.
  • Your primary focus will be understanding the AI workload and how it interacts with other parts of the system, such as networking, storage, deep learning frameworks, and data cleaning tools.
  • Help maintain services once they are live by measuring and monitoring the progress of AI jobs and helping engineering teams design solutions for more robust training at scale.
  • Provide feedback to internal teams such as opening bugs, documenting workarounds, and suggesting improvements.

What We Need to See:
  • BS, MS, or PhD (or equivalent experience) in Computer Science, Data Science, Electrical / Computer Engineering, Physics, Mathematics, or other Engineering fields, with at least 8 years of work or research experience in Python, C++, or other software development.
  • Track record of medium- to large-scale AI training and an understanding of key libraries used for NLP / LLM / VLA training (NeMo Framework, DeepSpeed, etc.).
  • Experience with integration and deployment of software products in production enterprise environments, and microservices software architecture.
  • You are excited to work with multiple levels and teams across organisations (Engineering, Product, Sales, and Marketing teams). Capable of working in a constantly evolving environment without losing focus. Ability to multitask in a fast-paced environment.
  • Driven with strong analytical and problem-solving skills. Strong time-management and organization skills for coordinating multiple initiatives, priorities and implementations of new technology and products into very sophisticated projects.
  • You are a self-starter with a demeanour for growth and a passion for continuous learning and sharing findings across the team.
  • Technical leadership and strong understanding of NVIDIA technologies, and success in working with customers.
  • Excellent verbal and written communication and technical presentation skills in English.

Ways to Stand Out from The Crowd:
  • Experience working with large transformer-based architectures for NLP, CV, ASR, or other domains. Experience running large-scale distributed DL training.
  • Understanding of HPC systems: design and / or management experience related to data center design, high-speed InfiniBand interconnects, cluster storage, and scheduling.
  • Proven experience with one or more Tier-1 Clouds (AWS, Azure, GCP or OCI) and cloud-native architectures and software.
  • Expertise with parallel filesystems (e.g. Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects (InfiniBand, Omni-Path, and Gig-E).
  • Strong coding and debugging skills, and demonstrated expertise in one or more of the following areas: Machine Learning, Deep Learning, Slurm, Docker, Kubernetes, Singularity, MPI, MLOps, LLMOps, Ansible, Terraform, and other high-performance AI cluster solutions.
  • Technical leadership and strong understanding of NVIDIA technologies, including DGX Cloud, NVIDIA AI Enterprise software, Base Command Manager, NeMo, and NVIDIA Inference Microservices. Success in working with customers using NVIDIA technologies.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking individuals in the world working for us. If you're creative and autonomous, we want to hear from you.

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

