AI/ML Engineer – Infrastructure & Optimization

June 2, 2026
Application ends: September 1, 2026

Job Description

Description: You will be at the forefront of Byteridge’s AI infrastructure capabilities, helping customers unlock the full potential of foundation models through expert-level deployment on GPU infrastructure.
This highly technical role requires deep expertise in machine learning infrastructure, GPU optimization, and production ML systems, combined with the ability to translate complex technical concepts into customer success.

What You’ll Do:

Model Deployment & Optimization:

– Lead end-to-end deployments of large language models on AWS infrastructure for strategic customers

– Design and implement training, fine-tuning, and inference pipelines using Amazon SageMaker AI

– Optimize model performance through GPU-level tuning, kernel optimization, and infrastructure configuration

– Deploy models on diverse GPU architectures including NVIDIA and AWS custom silicon (Trainium, Inferentia)

Infrastructure Architecture & Performance:

– Architect scalable ML infrastructure using SageMaker AI Inference, HyperPod, and distributed training frameworks

– Implement CUDA-level optimizations and custom kernels for improved model performance

– Design storage and networking architectures optimized for high-throughput ML workloads

– Troubleshoot and resolve complex performance bottlenecks at the GPU driver and kernel level

Customer Engagement & Technical Leadership:

– Partner with AWS AI Specialist Solution Architects and customer ML teams to understand model requirements and deployment constraints

– Provide technical guidance on model selection, fine-tuning strategies, and production best practices

– Conduct performance benchmarking and cost optimization analysis for ML workloads

– Share field insights with AWS product teams to influence infrastructure and service roadmaps

What We’re Looking For:

Core Qualifications:

– Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience (Master’s or PhD preferred)

– 5+ years of experience in machine learning infrastructure, model deployment, or GPU computing

– Strong programming skills in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX)

– Deep understanding of LLM architectures, training methodologies, and inference optimization

