Job Description
About the Role
We’re seeking a DevOps Engineer to lead the automation and infrastructure strategy for our cloud-native microservices platform, which supports real-time analytics at scale. You will define CI/CD standards, infrastructure provisioning practices, and monitoring systems for high-throughput applications with stringent availability requirements.
Responsibilities
- Architect and maintain CI/CD pipelines using GitHub Actions and ArgoCD for Kubernetes-based deployments across staging, QA, and production environments.
- Design infrastructure as code (IaC) using Terraform for multi-region AWS environments, including VPCs, EKS clusters, and RDS configurations.
- Implement autoscaling, load balancing, and service mesh configurations for latency-sensitive services.
- Set up and fine-tune Prometheus, Grafana, and Loki for full-stack observability, and manage on-call dashboards and alerts.
- Lead the integration of secrets management (Vault or AWS Secrets Manager) into GitOps workflows.
- Define policies for container image scanning and infrastructure drift detection.
- Collaborate with application teams to optimize resource usage and deployment rollback strategies.
- Conduct root cause analysis and post-mortems for system outages and performance degradations.
Required Qualifications
- 4+ years of hands-on DevOps experience with production Kubernetes environments (EKS preferred).
- Strong proficiency in Terraform, with demonstrated experience in modular design and environment promotion workflows.
- Proven expertise in CI/CD orchestration, ideally with GitHub Actions and ArgoCD; familiarity with Helm and Kustomize is a plus.
- Deep understanding of Linux systems, networking fundamentals, and cloud-native tooling.
- Proficiency with observability stacks: Prometheus, Grafana, Loki, Tempo, and Alertmanager.
- Comfortable scripting in Bash and at least one other language, such as Python or Go, for custom tooling.
- Familiarity with security best practices for IAM policies, resource isolation, and container hardening.
- Experience supporting 24/7 production systems with incident management processes.
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!