Job Description
We are seeking a Cloud Engineer who thrives on designing resilient infrastructure, not just spinning up instances. This role is deeply technical, meant for someone who can diagnose memory leaks at 2 a.m. but also knows when it’s time to deprecate a brittle architecture. You’ll be responsible for building and maintaining secure, scalable, and cost-effective cloud infrastructure (primarily AWS, with some GCP workloads) across production and development environments.
You won’t just be handed Terraform scripts—you’ll be expected to ask why we use them the way we do. This is not a DevOps babysitting role. You’ll collaborate closely with our security, backend, and data teams to shape infrastructure that supports high-volume, low-latency systems used by thousands daily.
Responsibilities
- Architect and manage cloud-native infrastructure in AWS using Infrastructure as Code (primarily Terraform, some Pulumi).
- Own and optimize CI/CD pipelines (we use GitHub Actions, ArgoCD, and Spinnaker).
- Implement observability best practices using tools like Datadog, Prometheus, and OpenTelemetry.
- Design automated disaster recovery solutions and chaos-testing processes.
- Work with the security team to enforce least-privilege IAM, VPC segmentation, and threat detection mechanisms.
- Provide feedback on platform bottlenecks, help shape cost governance strategies, and lead cloud spend reviews.
- Support container orchestration via Kubernetes (EKS) and manage Helm-based deployments.
- Document system design decisions and maintain an evolving runbook for SRE rotation.
Qualifications
Must-Have:
- 4+ years of experience designing and operating cloud infrastructure (AWS is a must; GCP a plus).
- Proficiency with Infrastructure as Code (Terraform preferred; CDK or Pulumi acceptable).
- Deep understanding of VPC design, IAM boundaries, load balancing, and service mesh concepts.
- Experience with Kubernetes in production environments (EKS, GKE, or self-managed).
- Demonstrated ability to troubleshoot distributed systems (e.g., timeouts, packet loss, race conditions).
- Strong scripting skills in Python, Bash, or Go.
- Familiarity with container security, secrets management (e.g., Vault, AWS Secrets Manager), and zero-trust networking.
Nice-to-Have:
- Experience with service discovery, custom Kubernetes operators, or eBPF observability tools.
- Past involvement in SOC 2, HIPAA, or ISO 27001 compliance contexts.
- Contributions to open-source infrastructure projects or community cloud modules.
- Exposure to serverless architectures (e.g., Lambda, Cloud Run) and event-driven systems.
Are you interested in this position?
Apply by clicking on the “Apply Now” button below!