Job Description
Roles & Responsibilities :
– ML Pipelines : Design and build CI/CD pipelines specifically for ML workflows (training triggers, model versioning, testing) using tools like Jenkins, Bitbucket Pipelines, or GitHub Actions.
– Orchestration : Deploy, configure, and optimize Kubernetes clusters to support containerized deep learning applications (managing GPU resources, node scaling).
– Model Serving : Work with Data Scientists to containerize and deploy PyTorch models using Docker and serving frameworks (KServe, NVIDIA Triton Inference Server).
– Infrastructure : Manage cloud infrastructure (AWS) for data processing and model storage (S3, ECR, IAM).
– GitOps : Implement GitOps practices to manage the lifecycle of both infrastructure and ML configurations.
– Monitoring : Implement monitoring for both system health (CPU/Memory) and Model Drift/Performance using tools like Prometheus, Grafana, or ELK.
– Automation : Automate repetitive tasks related to dataset management and environment setup using Python.
Qualification :
– 1–3 years of relevant experience in ML Engineer, MLOps, or Platform Engineering roles.
– Mandatory : Functional understanding of Machine Learning/Deep Learning concepts and the PyTorch framework.
– Mandatory : Prior experience working with Kubernetes and CI/CD in a production environment.
– Bachelor's degree in Computer Science, IT, or a related field; non-IT degrees with relevant experience are also acceptable.
Must-have Skills :
– Core MLOps : Practical experience deploying ML/DL models in production systems. You understand the difference between deploying a web app and deploying a deep learning model.
– Kubernetes : Strong hands-on experience with K8s (deployments, services, ingress) and preferably experience scheduling GPU workloads.
– CI/CD & GitOps : Proficiency in building pipelines (Jenkins/Bitbucket Pipelines) and understanding GitOps workflows (ArgoCD/Flux).
– ML Fundamentals : Working knowledge of PyTorch and Python. You should be able to read model code, understand training/inference loops, optimize PyTorch models, and debug environment issues (CUDA, dependencies).
– Containerization : Expert-level Docker skills (multi-stage builds, reducing image sizes for large ML dependencies).
– Cloud : Experience with AWS services (EC2, S3, ECR).
– Linux/Scripting : Strong command of Linux internals and shell scripting.
Good-to-have :
– Experience with model serving and ML workflow tools like KServe, Triton Inference Server, and MLflow.
– Experience profiling and optimizing PyTorch models for production inference on accelerator platforms such as NVIDIA GPUs, TPUs, and AWS Inferentia.
– Background in processing Geospatial or Remote Sensing data.
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!
Apply Now