Job Description
We’re looking for a DevOps Engineer who thrives in complex, fast-moving environments where infrastructure, deployment, and security challenges evolve daily. You’ll be responsible for re-architecting our CI/CD pipelines, improving infrastructure reliability for our microservices architecture, and embedding observability at every layer of our deployment lifecycle. This role is for someone who gets excited about optimizing Kubernetes clusters at scale, believes IaC should be treated like product code, and can translate a conversation about system load into a concrete Grafana dashboard.
Key Responsibilities
- Refactor and optimize existing CI/CD pipelines (GitHub Actions + ArgoCD) to improve build speed, security checks, and deployment reliability
- Maintain and scale multi-region Kubernetes infrastructure supporting over 100 microservices
- Design and manage robust observability systems using Prometheus, Grafana, Loki, and Alertmanager
- Implement zero-downtime deployment strategies and support feature flag rollout mechanisms
- Drive infrastructure as code adoption using Terraform and Helm with modular, versioned patterns
- Collaborate with backend and SRE teams on performance tuning, cost monitoring, and capacity planning
- Own incident response workflows and on-call rotations, and participate in postmortems and root cause analyses
- Define and enforce guardrails for secrets management, image scanning, and runtime policy enforcement (e.g., OPA/Gatekeeper)
Qualifications
- 4+ years of hands-on DevOps or Site Reliability Engineering experience in production environments
- Strong proficiency with Kubernetes, including workload tuning, ingress configuration, and custom controllers
- Expert-level knowledge of CI/CD tooling (ArgoCD preferred, CircleCI/GitHub Actions also acceptable)
- Deep experience with infrastructure as code (Terraform) and Kubernetes application packaging with Helm
- Solid background in metrics, logs, and tracing systems (Prometheus, Grafana, Loki, Jaeger, etc.)
- Demonstrated ability to write reusable tooling/scripts in Bash, Python, or Go
- Experience managing VPC networking, security groups, IAM policies, and cost optimization in AWS
- Comfort with security concepts such as least privilege, SBOMs, container hardening, and runtime scanning
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!