SRE Engineer with OpenTelemetry: Plano, TX (Onsite from day 1)

Salary undisclosed

Position: SRE Engineer with OpenTelemetry

Location: Plano, TX (Onsite from day 1)

Experience Required: 13+ years

Full-Time Role

Responsibilities:

1. Infrastructure Management:

  • Design, deploy, and maintain AWS infrastructure to ensure high availability, performance, and security.
  • Implement infrastructure as code using tools such as Terraform or CloudFormation.

2. Monitoring and Observability:

  • Develop and maintain monitoring, logging, and alerting solutions using OpenTelemetry, New Relic, Splunk, and other related tools.
  • Ensure comprehensive observability for all services and infrastructure components.

3. Incident Management:

  • Respond to and resolve incidents, ensuring minimal downtime and impact on end-users.
  • Conduct root cause analysis for incidents and implement preventive measures.

4. Performance Optimization:

  • Analyze system performance metrics and optimize resource utilization.
  • Collaborate with development teams to improve application performance and reliability.

5. Automation and Tooling:

  • Automate repetitive tasks and processes to improve efficiency and reduce human error.
  • Develop and maintain CI/CD pipelines to ensure smooth deployment processes.

6. Security and Compliance:

  • Implement security best practices and ensure compliance with industry standards.
  • Conduct regular security assessments and audits.

7. Collaboration and Communication:

  • Work closely with development, QA, and operations teams to ensure seamless integration and deployment of services.
  • Communicate effectively with stakeholders regarding system status, incidents, and project updates.

Skill Set:

Technical Skills:

  • Strong experience with AWS services (EC2, S3, RDS, Lambda, etc.).
  • Proficiency in infrastructure as code tools like Terraform or CloudFormation.
  • Deep understanding of monitoring and observability tools, especially OpenTelemetry, New Relic, and Splunk.
  • Solid scripting skills in languages such as Python, Bash, or similar.
  • Familiarity with containerization and orchestration tools like Docker and Kubernetes.
  • Experience with CI/CD tools such as Jenkins, GitLab CI, or AWS CodePipeline.

Problem-Solving Skills:

  • Ability to troubleshoot and resolve complex system issues.
  • Analytical mindset for performance tuning and optimization.

Communication Skills:

  • Excellent verbal and written communication skills.
  • Ability to collaborate effectively with cross-functional teams.

Security and Compliance Knowledge:

  • Understanding of security best practices and compliance requirements.
  • Experience conducting security assessments and audits.

Soft Skills:

  • Strong organizational and multitasking abilities.
  • Proactive attitude with a focus on continuous improvement.

Preferred Qualifications:

  • Relevant certifications (e.g., AWS Certified Solutions Architect, Certified Kubernetes Administrator).
  • Experience with additional monitoring tools and platforms.