Job description
NovaGrid Technologies is seeking a Lead Infrastructure Development Engineer to architect and implement scalable, secure, and highly available infrastructure across cloud and on-prem environments. You will drive modernization initiatives, automate provisioning, and partner with software engineering, security, and product teams to deliver reliable platforms at scale.
We value hands-on leaders who are collaborative, detail-oriented, and passionate about building resilient systems that empower product teams to move fast with confidence.
Responsibilities
- Lead the design and implementation of scalable infrastructure platforms across cloud providers (AWS, Azure, GCP) and on-prem environments where applicable.
- Develop and maintain infrastructure as code (Terraform, CloudFormation, Ansible) and automated CI/CD pipelines to enable rapid, reliable deployments.
- Drive site reliability engineering (SRE) practices, incident response, post-incident reviews, and SLO/SLI-based reliability improvements.
- Collaborate with software engineering teams to translate requirements into scalable, cost-efficient infrastructure solutions.
- Implement security best practices, policy as code, identity management, and compliance controls across all environments.
- Optimize cost, performance, and scalability with continuous monitoring, observability, and capacity planning.
- Mentor engineers, conduct code reviews, and build internal tooling to improve platform efficiency and developer experience.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 8+ years of infrastructure engineering experience with cloud and on-prem environments.
- Strong expertise with IaC tools (Terraform, Ansible, CloudFormation) and configuration management.
- Deep knowledge of networking, security, and identity and access management (IAM) best practices.
- Experience with containerization (Docker, Kubernetes) and cloud-native architectures.
- Proficiency in CI/CD, monitoring, logging, and incident management; experience with observability platforms.
- Excellent communication, collaboration, and mentoring abilities; ability to influence across teams and stakeholders.
- Authorized to work in the United States without sponsorship.