AI/ML Solution Architect

March 6, 2026
Application ends: June 5, 2026

Job Description


Team & Responsibilities

Work alongside senior AI and infrastructure engineers building large-scale GPU platforms. As part of the customer solutions team, you will:

  • Design and validate production-grade architectures for distributed training (the primary focus) and large-scale inference on GPU clusters, typically tens to thousands of GPUs
  • Work hands-on with customers to debug, optimize, and scale ML workloads across multi-node GPU environments
  • Act as a technical authority on GPU performance, networking, and schedulers, making trade-offs at scale and translating customer needs into concrete platform requirements
  • Collaborate closely with engineering, product, and R&D to influence roadmap decisions based on real-world ML workloads

This is a hands-on technical role: you are expected to work directly in customer environments, not only advise at a high level.

Required Skills

  • Hands-on experience designing and operating production-grade, multi-node GPU workloads for training or inference
  • Strong background in distributed deep learning (PyTorch Distributed, DeepSpeed) on GPU clusters
  • Deep understanding of GPU architecture and interconnects (H100/A100 class, NVLink, InfiniBand)
  • Experience with Kubernetes or Slurm, and with performance tuning using GPU profiling and monitoring tools

