Cloud Infrastructure Lead at Rockville, MD - Onsite from Day 1
Salary undisclosed
Job Title: Cloud Infrastructure Lead
Location: Rockville, MD - Onsite from Day 1
Duration: 6 months - Contract to hire
Responsibilities
Oversee the management and maintenance of cloud infrastructure, ensuring high availability and reliability. Act as the primary point of contact for all cloud infrastructure-related issues and escalations.
Ensure cloud resources are optimally configured and managed to meet performance and cost objectives.
Implement and maintain monitoring solutions to track the health and performance of cloud infrastructure.
Drive major and potential incidents end to end, providing periodic updates to client stakeholders for approvals and recommendations.
Ensure due diligence and impact analysis for all changes implemented on the cloud platforms.
Lead and mentor a team of cloud engineers and administrators, fostering a collaborative and high-performing work environment.
Provide guidance and support to team members, facilitating their professional development and growth.
Coordinate and manage the team's daily activities, ensuring alignment with organizational goals and priorities.
Lead the response to cloud-related incidents, ensuring timely resolution and minimal impact on business operations.
Develop and implement incident management processes and procedures.
Perform root cause analysis and implement preventive measures to avoid recurrence of issues.
Identify opportunities to automate repetitive tasks and processes to improve efficiency and reduce operational overhead.
Develop and implement automation scripts and tools, leveraging Infrastructure as Code (IaC) practices.
Continuously evaluate and improve cloud operations processes and procedures.
Ensure cloud infrastructure adheres to security policies, standards, and best practices.
Implement and maintain security controls to protect cloud resources and data.
Ensure compliance with regulatory requirements and industry standards (e.g., GDPR, HIPAA).
Monitor and analyze cloud resource usage, ensuring efficient utilization and avoiding over-provisioning.
Conduct capacity planning to support future growth and demand.
Implement cost management strategies to optimize cloud spending.
Develop and implement disaster recovery and business continuity plans for cloud infrastructure.
Ensure regular testing and validation of disaster recovery procedures.
Ensure cloud infrastructure is resilient and can recover quickly from failures or disruptions.
Work closely with other IT teams, business units, and stakeholders to understand requirements and deliver cloud solutions that meet their needs.
Collaborate with vendors and service providers to evaluate and integrate new cloud technologies and services.
Communicate effectively with stakeholders, providing regular updates on cloud operations and performance.
Maintain comprehensive documentation of cloud infrastructure, configurations, processes, and procedures.
Generate regular reports on cloud performance, incidents, and operational metrics.
Ensure documentation is up-to-date and accessible to relevant stakeholders.
Preferred certifications and experience:
Cloud certifications such as AWS Certified Solutions Architect (Associate or Professional) or a Microsoft Certified Azure architect certification.
Experience with DevOps practices and tools (CI/CD, Jenkins, Git).
Familiarity with ITIL or other IT service management frameworks.
Excellent communication and collaboration skills, with the ability to effectively interact with technical and non-technical stakeholders at all levels of the organization.
Strong analytical and problem-solving skills, with the ability to identify root causes of issues and implement effective solutions in a timely manner.
Proven ability to work independently as well as part of a team, with a proactive and self-motivated attitude towards achieving project goals.