
Site Reliability Engineer (SRE) Practice Leader
Job Title: Site Reliability Engineer (SRE) Practice Leader
Location: Remote (anywhere US)
About the Role:
We're seeking a dynamic Practice Leader for our Site Reliability Engineering (SRE) team. This role is pivotal in driving sales and revenue growth by leading enterprise customers through transformative projects. The ideal candidate will have a proven track record in IT strategy, consulting, and hybrid cloud operations, all with a strong emphasis on sales and business development.
This role brings an exciting opportunity to run and grow our SRE Practice. The Practice Leader runs a global operations team of SREs at many different experience levels. We are looking for someone who has not only worked with offshore operations but can also speak to technologies in the AWS, Azure, and OCI clouds. You will be working with multiple cloud partners such as AWS, Azure, Google Cloud, and Oracle Cloud.
This position is a great career opportunity where you get to run your practice, work with other technology leaders and grow your base business.
Key Responsibilities:
Leadership & Strategy:
- Develop and implement an SRE practice aligned with business and customer engineering goals.
- Lead and mentor a team of SREs, fostering a culture of reliability, automation, and continuous improvement.
- Define and drive best practices for site reliability, observability, and incident management.
- Partner with product and engineering teams to ensure scalable and highly available systems.
- Collaborate with sales to develop offerings that resonate with the market.
Customer Reliability & Performance Optimization:
- Develop incident response strategies, including on-call rotations and post-mortem reviews.
- Improve system observability through logging, monitoring, and alerting strategies.
- Lead efforts in performance tuning, capacity planning, and load balancing.
Automation & DevOps Enablement:
- Drive automation efforts for deployments, scaling, and self-healing infrastructure.
- Implement CI/CD pipelines to ensure smooth and reliable software releases.
- Work with DevOps teams to optimize cloud and on-prem infrastructure for reliability.
- Ensure security and compliance standards are met across infrastructure and operations.
Qualifications & Experience:
Education: Bachelor's or Master's in Computer Science, Engineering, or a related field.
Experience: 7+ years in SRE, DevOps, or infrastructure roles, with at least 3 years in a leadership or managerial capacity.
Technical Expertise:
- Strong knowledge of cloud platforms (AWS, Azure, Google Cloud Platform).
- Experience with Kubernetes, Docker, and microservices architecture.
- Proficiency in scripting languages (Python, Bash, Go, etc.).
- Deep understanding of networking, security, and performance optimization.
- Expert knowledge of Terraform, AWS CloudFormation, or Pulumi.
- Experience with CI/CD pipelines and incident management.
Soft Skills: Strong leadership, problem-solving, and cross-functional collaboration abilities.
Why Join Us?
Opportunity to build and shape an SRE practice.
Work with cutting-edge cloud and automation technologies.