Site Reliability Engineering Manager (REMOTE)

  • Full Time, remote
  • BD Diagnostics - TriPath
  • Remote On Site, United States of America
Salary undisclosed

Job Description Summary

A Site Reliability Engineering (SRE) Manager is responsible for ensuring that systems and services run smoothly, reliably, and efficiently at scale. They manage a team of SREs to maintain infrastructure, handle incident response, and improve the system's reliability and performance.
The SRE Manager will lead a team of Site Reliability Engineers responsible for the availability, scalability, and performance of critical systems. This role involves working with cross-functional teams to improve system reliability and empower engineers through automation, monitoring, and incident management processes. As an SRE Manager, you will ensure your team delivers high-quality services by focusing on system resilience, performance optimization, and operational excellence.

Job Description

We are the makers of possible

BD is one of the largest global medical technology companies in the world. Advancing the world of health is our Purpose, and it's no small feat. It takes the imagination and passion of all of us, from design and engineering to the manufacturing and marketing of our billions of MedTech products per year, to look at the impossible and find transformative solutions that turn dreams into possibilities.

We believe that the human element, across our global teams, is what allows us to continually evolve. Join us and discover an environment in which you'll be supported to learn, grow and become your best self. Become a maker of possible with us.

Key Responsibilities:

Leadership & Strategy:
  • Lead and mentor a team of SREs, fostering a culture of ownership, reliability, and accountability.
  • Collaborate with development, operations, and product teams to define and drive reliability strategies and initiatives.
  • Establish and monitor key performance indicators (KPIs) for system reliability and performance.
  • Provide guidance and prioritize efforts to prevent and mitigate incidents, improve incident response times, and enhance post-incident processes.


Operational Excellence:
  • Ensure 24/7 availability of critical systems by overseeing and improving the incident management process, including on-call rotations.
  • Develop and implement disaster recovery and business continuity strategies.
  • Drive automation efforts to reduce manual work, improve operational efficiency, and enhance system performance.
  • Lead postmortem analysis after major incidents and drive continuous improvements through root cause analysis.


System Reliability & Performance:
  • Define Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to monitor and maintain the system's reliability.
  • Build and maintain automated monitoring, alerting, and self-healing systems.
  • Work closely with software engineering teams to design and implement infrastructure that is scalable and resilient.
  • Ensure systems can handle capacity and performance requirements through load testing and scaling solutions.


Collaboration & Communication:
  • Act as a point of escalation for incidents and outages, ensuring timely resolution and communication.
  • Promote a culture of proactive collaboration with development and operations teams to improve systems before failures occur.
  • Communicate effectively with stakeholders regarding the status of ongoing projects and incidents.
  • Foster a culture of continuous improvement and learning within the team.


Qualifications:

Technical Expertise:
  • 5+ years of experience in Site Reliability Engineering or DevOps, with at least 2 years in a servant leadership role.
  • Ability to manage personnel to reach their maximum potential for both service and personal growth.
  • Strong understanding of cloud environments (AWS, Azure, Google Cloud) and container and orchestration tools (Docker, Kubernetes).
  • Proficient in infrastructure-as-code tools (Terraform, AWS CDK, and CloudFormation) and scripting or programming languages (TypeScript, PowerShell, or Go).
  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, etc.).
  • Strong knowledge of CI/CD pipelines, version control systems (Git), and configuration management tools.
  • Experience with Agile methodologies and a good understanding of container services and microservices.


Leadership & Management:
  • Proven experience in managing on-call rotations, incident management processes, and operational readiness.
  • Experience in mentoring and growing engineering teams.
  • Strong problem-solving and decision-making skills, with the ability to navigate complex system challenges.
  • Ability to work under pressure during high-stress incidents while maintaining composure.


Soft Skills:
  • Excellent communication and collaboration skills.
  • Ability to balance short-term fixes with long-term improvements.
  • Strong organizational and time-management skills.


Preferred Qualifications:
  • Experience with multi-cloud environments.
  • Familiarity with compliance requirements (e.g., SOC 2, ePHI, ISO 27001).
  • Experience working in a fast-paced startup environment or with large-scale enterprise systems.


Education Qualifications & Previous Experience:
  • Bachelor's degree in a related field or a minimum of 10 years of relevant IT experience


Desired/Additional Skills & Knowledge:
  • Knowledge of Microsoft Azure virtual network appliances
  • Knowledge of network protocols such as DNS, SMTP, SNMP, SSH, and SFTP
  • Knowledge of networking and TCP/IP routing and subnetting
  • Working knowledge of VPN connectivity
  • Knowledge of backup and disaster recovery processes
  • Knowledge of DevOps, Agile, and infrastructure as code strongly desired
  • Knowledge of Azure SaaS, PaaS, and IaaS offerings and services in Azure commercial and DoD regions


Certifications
  • Microsoft or other cloud provider certifications preferred
  • Management or other employee management training or certification desired.


Any Additional Information
  • Experience working in a servant leadership environment
  • Able to build strong partnerships with business partners and project teams
  • Strong analytical and decision-making abilities
  • Takes responsibility for delivering superior value and client service
  • Works well with people who have diverse abilities, experiences, and perspectives
  • Influences others without direct authority
  • Approaches opportunities and issues with an optimistic, action-oriented, and solution-based mindset
  • Good writing skills to document plans and processes


For certain roles at BD, employment is contingent upon the Company's receipt of sufficient proof that you are fully vaccinated against COVID-19. In some locations, testing for COVID-19 may be available and/or required. Consistent with BD's Workplace Accommodations Policy, requests for accommodation will be considered pursuant to applicable law.

Why Join Us?

A career at BD means being part of a team that values your opinions and contributions and that encourages you to bring your authentic self to work. It's also a place where we help each other be great, we do what's right, we hold each other accountable, and learn and improve every day.

To find purpose in the possibilities, we need people who can see the bigger picture, who understand the human story that underpins everything we do. We welcome people with the imagination and drive to help us reinvent the future of health. At BD, you'll discover a culture in which you can learn, grow, and thrive. And find satisfaction in doing your part to make the world a better place.

To learn more about BD visit

Becton, Dickinson and Company is an Equal Opportunity/Affirmative Action Employer. We do not unlawfully discriminate on the basis of race, color, religion, age, sex, creed, national origin, ancestry, citizenship status, marital or domestic or civil union status, familial status, affectional or sexual orientation, gender identity or expression, genetics, disability, military eligibility or veteran status, or any other protected status.

Primary Work Location

USA CA - San Diego Bldg A&B

At BD, we are strongly committed to investing in our associates, including their well-being and development, and to providing rewards and recognition opportunities that promote a performance-based culture. We demonstrate this commitment by offering a valuable, competitive package of compensation and benefits programs, which you can learn more about on our Careers Site.

Salary or hourly rate ranges have been implemented to reward associates fairly and competitively, as well as to support recognition of associates' progress, ranging from entry level to experts in their field, and talent mobility. There are many factors, such as location, that contribute to the range displayed. The salary or hourly rate offered to a successful candidate is based on experience, education, skills, and any step rate pay system of the actual work location, as applicable to the role or position. Salary or hourly pay ranges may vary for Field-based and Remote roles.

Salary Range Information
$124,100.00 - $204,800.00 USD Annual