Talent.com
AI/ML Computing Cluster Engineer
SK hynix America • San Jose, California, United States

Job Title : AI / ML Computing Cluster Engineer

Office Location : San Jose, CA

Work Model : Onsite

About SK hynix America

At SK hynix America, we're at the forefront of semiconductor innovation, developing advanced memory solutions that power everything from smartphones to data centers. As a global leader in DRAM and NAND flash technologies, we advance mobile technology, empower cloud computing, and pioneer future technologies. Our cutting-edge memory technologies are essential in today's most advanced electronic devices and IT infrastructure, enabling enhanced performance and user experiences across the digital landscape.

We're looking for innovative minds to join our mission of shaping the future of technology. At SK hynix America, you'll be part of a team that's pioneering breakthrough memory solutions while maintaining a strong commitment to sustainability. We're not just adapting to technological change – we're driving it, with significant investments in artificial intelligence, machine learning, and eco-friendly solutions and operational practices. As we continue to expand our market presence and push the boundaries of what's possible in semiconductor technology, we invite you to join us in creating the next generation of memory solutions that will define the future of computing.

Job Overview :

As the AI / ML Computing Cluster Engineer, you will work on the development and operation of high-performance computing clusters supporting AI / ML workloads. You will be responsible for the development, implementation, operation, and optimization of AI data center IT environments to ensure scalability, performance, reliability, and cost-effectiveness. This role requires collaboration with cross-functional teams to align computing infrastructure with the organization's strategic direction.

Responsibilities :

Computing Cluster Infrastructure Development

  • Design and implement distributed computing cluster infrastructure to support large-scale AI / ML model training and inference jobs with a focus on transformer-based AI models.
  • Build and maintain distributed systems to ensure scalability, efficient resource allocation, and high throughput.
  • Optimize cluster performance through hardware selection, equipment configuration, network engineering, and performance analysis.
  • Deploy and operate data center networking infrastructure using software systems for automation, design validation, deployment, and operational support.
  • Implement tools and processes to maintain high uptime and ensure infrastructure reliability during both model training and inference phases.
  • Identify and resolve performance bottlenecks, improving overall system throughput and response times.

Team Leadership & Collaboration

  • Collaborate with cross-functional teams, including research, security, and benchmark test engineering teams, to integrate infrastructure with AI workflows, ensuring seamless deployment and operation.
  • Engage with technology vendors and partners to evaluate new solutions that drive innovation in AI computing infrastructure.
Qualifications :

  • Master’s degree or above in Computer Science, Electrical Engineering, or related fields.
  • 2+ years of experience in AI cluster engineering, MLOps, and benchmark testing, including GPU performance analysis, memory usage, and energy / power monitoring tools.
  • Strong familiarity with AI computing architecture, AI / ML infrastructure requirements, memory architecture and usage in AI / ML, and AI algorithm trends and best practices.
  • Expertise in optimizing resource utilization, improving system throughput, and reducing latency in both training and inference.
Equal Employment Opportunity :

SKHYA is an Equal Employment Opportunity Employer. We provide equal employment opportunities to all qualified applicants and employees and prohibit discrimination and harassment of any type without regard to race, sex, pregnancy, sexual orientation, religion, age, gender identity, national origin, color, protected veteran or disability status, genetic information or any other status protected under federal, state, or local applicable laws.

Compensation :

Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. Pay within the provided range varies by work location and may also depend on job-related skills and experience. Your Recruiter can share more about the specific salary range for the job location during the hiring process.

Pay Range

$100,000 - $150,000 USD
