
Azure Site Reliability Engineer (SRE)

  • Full Time, onsite
  • PSR Associates, Inc.
  • Remote Hybrid, United States of America
Salary undisclosed

PSR Associates is a consulting and talent solutions firm that connects qualified IT professionals with great opportunities. Whether you're looking for a contract or permanent position, we can help you find the right fit for your skills and experience. We have a team of experienced recruiters who know the IT industry inside and out, and we work with you every step of the way to ensure a smooth and successful transition. PSR: Connecting Talent, Crafting Success.

Site Reliability Engineer (Azure SRE)
Remote

Long-term, multi-year engagement

Description:
This position is for an Azure Site Reliability Engineer. We are looking for a skilled engineer who is willing to learn new technologies quickly and often, with expertise in setting up logging, monitoring, and alerts, and experience writing IaC (Infrastructure as Code) and scripts. Candidates must be willing to work as part of an on-call schedule and assist with troubleshooting and debugging VMs.

This role will be part of a Hybrid Cloud Services team providing operational support for the Azure cloud platform. The role will implement hybrid cloud solutions using automated tools.

Important Note:

Candidates MUST either possess an active US Public Trust clearance certification or be able to attain one within 60 days of starting this assignment.

Requirements:
This Azure SRE position requires 7+ years of experience with the following:


1. **Programming & Scripting:**
- Proficiency in languages like Python, Go, Java, or Ruby.
- Scripting skills with Bash or PowerShell.

2. **Systems Administration:**
- Expertise in Linux/Unix systems.
- Knowledge of Windows Server environments.

3. **Cloud Platforms:**
- Experience with cloud services such as IBM Cloud, AWS, Azure, or Google Cloud Platform.
- Understanding of cloud-native concepts and services (e.g., Kubernetes, Docker).

4. **Infrastructure Management:**
- Skills in configuring and managing servers, databases, and networking components.
- Experience with Infrastructure as Code (IaC) tools like Terraform or Ansible.

5. **Monitoring & Observability:**
- Familiarity with monitoring tools such as Prometheus, Grafana, or IBM Cloud Monitoring.
- Experience with log management and analysis tools like ELK Stack or Splunk.

6. **Performance Tuning:**
- Expertise in optimizing system performance and reliability.
- Knowledge of performance testing and tuning for applications and infrastructure.

7. **Networking:**
- Understanding of networking concepts (e.g., TCP/IP, DNS, HTTP/HTTPS).
- Experience with load balancing, firewalls, and VPNs.

8. **Security:**
- Knowledge of security best practices and compliance standards.
- Experience with vulnerability management and threat detection.

Additional Position Requirements:
  • Bachelor's degree or higher
  • Minimum 7 years of work experience.
  • Public Trust security clearance, or the ability to obtain one.
  • Off-shift work that includes evenings, weekends, and on-call support.
  • Proactively monitoring an Incident and Task Ticketing Queue.
  • Ability to update tickets with appropriate technical detail and communicate that detail effectively.

Preferred Skills:
  • Experience solutioning, implementing, and providing support for VMs in an Azure hybrid cloud environment.
  • Experience with cloud native components such as Networking, Firewalls, Peering, Security Groups, Availability Zones, Storage, Serverless, Load Balancers, Containerization, System Administration, Backups, patching, etc.
  • An understanding of how automation is integrated with a multi-cloud, native tool environment
  • Experience implementing, managing, and monitoring identity, governance, storage, compute and virtual networks within cloud platforms.
  • Experience configuring Azure, AWS, and Google Cloud native monitoring tools and their integration with client application environments.
  • Experience performing troubleshooting and root cause analysis to expedite incident resolution and acting as an escalation point for Tier 1 and Tier 2 support members
  • Familiarity with the implementation of highly available cloud services through Availability Zones and Auto Scaling
  • Experience with tools such as Azure CLI, PowerShell, AWS CLI, etc.
  • Familiarity with DevOps Tools such as Puppet, Ansible, Terraform and/or familiarity with a programming language such as Python
  • Experience with VMware Virtualization and ServiceNow
Nice to Haves:
  • Experience maintaining Gold Images using Azure Image Builder and patching using Azure's native patching solution
  • Experience with Azure resources and services such as networking, Virtual Machines, security groups, peering, availability zones, storage, load balancers, backups, Azure DNS, etc.

Preferred Certifications:
  • CompTIA Linux+
  • RHCSA or RHCE
  • Microsoft Azure Administrator


