Job description
Join Aurora Cloud Technologies as a Senior DevOps Engineer and help our product teams ship reliable software at scale. This role blends software engineering, SRE practices, and cloud automation to enable rapid, secure, and observable deployments across multiple environments.
What you'll do
- Design, implement, and maintain scalable CI/CD pipelines across development, staging, and production environments using modern tooling (GitHub Actions, Jenkins, CircleCI).
- Architect and operate cloud infrastructure across AWS and GCP with Infrastructure as Code (Terraform, Ansible, CloudFormation) and configuration management tooling.
- Champion reliability, security, and incident response; implement monitoring and alerting (Prometheus, Grafana, Loki) and maintain runbooks to reduce mean time to recovery (MTTR).
- Build security and compliance into the software delivery process; own secrets management, access controls, and vulnerability management.
- Automate deployment, testing, and release processes; enable blue/green and canary strategies to minimize risk.
- Build self-service platform capabilities to empower engineering teams and reduce toil through automation and standardized tooling.
- Collaborate with developers and SREs to improve deployment strategies, architecture, and incident postmortems; mentor junior engineers.
Qualifications
- Bachelor’s degree in computer science or engineering, or equivalent experience.
- 5+ years in DevOps, SRE, or cloud-native platform engineering.
- Strong experience with Kubernetes and containerized workloads; service mesh experience is a plus.
- Deep hands-on experience with at least one major cloud provider (AWS, GCP, or Azure); multi-cloud experience is a plus.
- Proficiency with Infrastructure as Code tools (Terraform, Ansible, CloudFormation, or Pulumi).
- Experience with monitoring and observability stacks (Prometheus, Grafana, ELK/EFK); strong incident response skills.
- Scripting and automation skills (Python, Bash, or similar); excellent problem-solving and communication skills.
- Passion for reliability, security, and developer experience.