Job Description
We’re looking for a DevOps Engineer with hands-on experience in automating, scaling, and optimizing cloud infrastructure in a high-availability SaaS environment. You’ll work on a fast-moving platform serving millions of requests per day, driving improvements in our CI/CD pipelines, infrastructure as code (IaC), observability tooling, and deployment strategies.
This role isn’t just about managing Kubernetes clusters. It’s about proactively identifying bottlenecks, ensuring deployment consistency across environments, and owning uptime and performance metrics. You’ll collaborate daily with the backend, QA, and SRE teams to embed resilience, speed, and security into our workflows.
Responsibilities
- Own and manage infrastructure in AWS using Terraform and Helm, ensuring environment parity across dev, staging, and prod.
- Design, implement, and monitor CI/CD pipelines (we use GitHub Actions and ArgoCD) to support microservices and event-driven architectures.
- Lead the containerization strategy for legacy services transitioning to Kubernetes.
- Develop internal tooling for automated rollbacks, canary deploys, and dynamic scaling informed by load-test results.
- Implement robust logging, tracing, and metrics collection using Prometheus, Loki, Grafana, and OpenTelemetry.
- Monitor and troubleshoot performance issues, outages, and deployments in real time using PagerDuty, Sentry, and custom alerting systems.
- Conduct postmortems with engineering teams and implement automated remediations.
- Participate in quarterly disaster recovery (DR) exercises and ensure infrastructure compliance with SOC 2 and ISO 27001 standards.
- Regularly audit IAM policies, secrets management (Vault), and service permissions.
Required Qualifications
- 4+ years of experience in a DevOps, SRE, or Platform Engineering role in a cloud-native environment (AWS required).
- Deep understanding of Kubernetes, including CRDs, operators, RBAC, and Helm chart templating.
- Proficiency in infrastructure-as-code tools (Terraform preferred; Pulumi is a plus).
- Strong scripting experience (Python, Bash, or Go) for automation and tooling.
- Familiarity with GitOps principles and tools such as ArgoCD or FluxCD.
- Experience with distributed tracing, log aggregation, and metric visualization tools.
- Solid knowledge of container security, image scanning, and secret management.
- Experience managing DNS, CDN (Cloudflare preferred), and TLS certificate automation.
- Experience supporting engineering teams with custom CI/CD needs and environment setups.
Nice to Have
- Experience with service mesh technologies like Istio or Linkerd.
- Familiarity with Kafka or other event streaming platforms.
- Previous work in a regulated environment (HIPAA, SOC 2, etc.).
- Contributions to open source DevOps tooling or public IaC modules.
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!