Job Description
We’re looking for a DevOps Engineer who thrives in environments with complex microservices, high CI/CD throughput, and observability challenges. This role isn’t about babysitting Jenkins or clicking around cloud dashboards. You’ll design resilient pipelines, manage container-based deployments at scale, and work closely with backend engineers to eliminate friction between code and production.
Key Responsibilities
- Pipeline Engineering: Refactor and maintain our GitHub Actions + ArgoCD pipelines for a monorepo setup serving over 40 microservices.
- IaC Ownership: Own and expand our Terraform modules for provisioning VPCs, EKS clusters, ALBs, and secrets in AWS across multiple environments.
- Container Orchestration: Manage Helm-based deployments on EKS; create reusable Helm charts with dynamic templating logic for service-specific needs.
- Incident Readiness: Implement proactive observability using Prometheus, Loki, Grafana, and custom metrics dashboards. Participate in the on-call rotation (1 week every 6 weeks).
- Secrets and Configuration Management: Enforce GitOps-based secrets handling with SOPS and Sealed Secrets; maintain secure integration patterns across dev, staging, and prod.
- Cross-Team Collaboration: Work with SREs and Backend Engineers to implement blue/green and canary deployments with traffic shaping in Istio.
- Security Integration: Integrate SAST/DAST tools into the CI pipelines and automate policy checks for infrastructure changes via OPA/Gatekeeper.
Requirements
- 3–5 years of DevOps experience, preferably supporting SaaS products at scale
- Hands-on experience with Kubernetes, including debugging pod/container/network issues
- Proficiency in Terraform, Helm, and managing AWS infrastructure (EC2, EKS, RDS, S3, IAM)
- Familiarity with GitHub Actions, ArgoCD, and Docker build optimizations
- Strong understanding of Linux internals and how system-level decisions affect cloud deployments
- Comfortable writing scripts and tools in Python or Go for deployment automation
- Experience with monitoring stacks: Prometheus, Grafana, Loki, Alertmanager
- Practical knowledge of zero-downtime deployment strategies and rollback automation
Nice-to-Haves
- Prior experience with service mesh tools like Istio or Linkerd
- Experience setting up OPA policies and enforcing governance in Kubernetes
- Familiarity with cost optimization strategies in AWS
- Exposure to machine-level debugging for performance or memory issues
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!