Senior DevOps Engineer

DevOps Engineer

Remote

March 6, 2026

Full Time

Application ends: June 5, 2026


Job Description

Role overview

We’re looking for a Senior DevOps Engineer to take strong ownership of the infrastructure behind our global SaaS messaging platform. This role is for someone who wants to shape how infrastructure is built, operated, and improved. You will be responsible for reliability, scalability, automation, and production stability across all environments.

You will work closely with engineering leadership and developers to improve system architecture, deployment processes, security, and operational standards. This is a high-impact role with real influence on technical decisions and how our infrastructure evolves as we grow.

Key responsibilities

CI/CD automation: Design and own CI/CD pipelines (GitLab), improving build speed, deployment safety, and rollback processes across environments
Infrastructure ownership: Own and evolve our cloud and bare-metal infrastructure (OVH, Cloudflare, AWS, OpenStack), ensuring high availability, performance, and stability under load
Infrastructure as code: Lead infrastructure as code practices using Terraform and Ansible, enforcing version control, peer review, and consistency standards
Observability and monitoring: Improve system observability using monitoring, logging, tracing, and alerting tools (Grafana, Prometheus, Loki), and drive proactive reliability improvements
Infrastructure security: Strengthen infrastructure security, including DDoS mitigation, traffic filtering, and access control management
Incident management: Lead root cause analysis of production incidents and implement long-term reliability improvements
Automation: Design automation to reduce manual operational work and improve deployment and recovery processes
Database reliability: Ensure high availability and performance of production databases (PostgreSQL, MongoDB), including backup, recovery, and scaling strategies
Environment management: Ensure consistency and reliability across development, staging, and production environments

Expected qualifications

Linux expertise: Strong Linux system administration experience in high-availability production environments
Kubernetes production experience: Hands-on experience running Kubernetes in production, including scaling, upgrades, and troubleshooting
Systems architecture understanding: Solid understanding of containerization, virtualization, and infrastructure design trade-offs
Networking knowledge: Strong understanding of networking concepts (L2, L4, L7), debugging tools (tcpdump, ngrep), and traffic analysis
Production lifecycle experience: Experience operating and troubleshooting applications in high-availability production environments
CI/CD systems design: Experience designing and maintaining CI/CD systems and deployment workflows
Database operations: Strong experience managing PostgreSQL and MongoDB in production, including performance tuning and reliability
Infrastructure as code: Practical experience with Terraform and configuration management tools (Ansible or similar), following best practices
Monitoring and logging: Experience working with monitoring and log aggregation systems (Grafana, Prometheus, Loki, or similar)
Security awareness: Practical understanding of infrastructure security principles and production hardening
Communication skills: Fluent written English and fluent spoken Russian required

Nice to have

Messaging/telecom background: Experience with telecom or messaging systems (SMPP, Asterisk, Kamailio)
PostgreSQL high availability: Experience with PostgreSQL replication/clustering, backups, and failover (PITR, Patroni/repmgr or similar)
Kubernetes operations: Experience operating Kubernetes clusters in production (upgrades, autoscaling, networking, troubleshooting)
Scripting: Scripting skills in Bash, Python, or Go for automation and internal tooling
Security and traffic protection: Experience mitigating malicious traffic and managing DDoS protection (Cloudflare WAF/rate limiting, fail2ban)
Email deliverability basics: Familiarity with SPF, DKIM, DMARC, and how they affect sending reliability
SRE practices: Experience with SLOs/SLIs, alert quality, and incident postmortems


