Lead Site Reliability Engineer - Azure - Engineering
Posted 3 hours 44 minutes ago by Mentmore Recruitment
£120,000 Annual
Permanent
London, United Kingdom
Job Description
Lead Site Reliability Engineer - Azure/AWS - Terraform - Engineering - London
My financial services client is looking for a Lead Site Reliability Engineer who will be responsible for ensuring the reliability and scalability of their infrastructure and services. This is a senior role requiring technical expertise, leadership, and a commitment to continuous improvement. You must have team lead/mentoring experience and be able to balance technical delivery, team productivity, performance measurement, and collaboration across teams and stakeholders.
Duties & Responsibilities:
- Hands-On Engineering & Technical Leadership
- Design, develop, and maintain cloud infrastructure (Azure/AWS) using Terraform and automation.
- Lead troubleshooting, performance optimisation, and incident resolution to enhance reliability.
- Ensure best practices in CI/CD pipelines, observability, and infrastructure deployment.
- Promote Transparency, Inspection, and Adaptation by making both system and team health data accessible and actionable.
- Work with engineering leads, business stakeholders, and the Head of Platform Operations to define and enforce SLAs, SLOs, and engineering standards that support scalability, reliability, and operational efficiency.
- Design solutions with a systems-thinking approach, ensuring infrastructure, observability, and automation strategies support sustainable growth.
- Improve deployment pipelines, automation, and operational workflows across squads, fostering consistency and best practices.
- Support capacity planning, scalability, and security best practices, proactively identifying risks and opportunities to enhance platform resilience.
- Team Productivity, Performance & Agile Ways of Working
Experience Required:
- Proven leadership experience in technical teams, with a focus on mentoring, professional development, and fostering a culture of innovation, reliability, and engineering excellence.
- Proven experience in Site Reliability Engineering, DevOps, or Systems Engineering, with hands-on experience in both Azure and AWS environments.
- Demonstrable expertise in high-performance, scalable, and highly available systems, with experience in optimising reliability, capacity planning, and system performance.
- Deep expertise in DevOps principles, including automation, infrastructure as code (Terraform, Ansible, or Chef), GitOps workflows, CI/CD best practices (GitHub Actions, GitLab CI/CD, Azure DevOps), and collaborative ways of working.
- Strong background in containerisation (Docker) and orchestration (Kubernetes), with a focus on scalability and resilience.
- Hands-on experience with monitoring, observability, and incident management tools (Prometheus, Grafana, ELK, Azure Monitor, Application Insights, Kusto) and a data-driven approach to improving system reliability.
- Strategic mindset, able to align technical initiatives with business goals, drive scalability and performance improvements, and proactively tackle complex challenges.
- Strong understanding of regulatory and security requirements, such as ISO 27001, PCI DSS, CE+ and SOX, with experience implementing compliance-driven engineering practices.
- Advocate for modern DevOps and SRE best practices, championing collaboration, transparency, automation, continuous learning, and continuous improvement across teams.
- Excellent communication skills, able to engage stakeholders, collaborate cross-functionally, and drive alignment on reliability and operational priorities.