Monitoring and Observability Engineer

Salary undisclosed


Title: Monitoring and Observability Engineer

Location: Charlotte, NC (hybrid, 3 days a week)

Duration: 1 year

Job Description:

  • Experience monitoring cloud applications and onboarding them into Dynatrace
  • Implement monitoring solutions that give critical business applications observability, using Dynatrace with OpenTelemetry to collect, process, and analyze data including logs, performance metrics, system events, distributed traces, and output from logging frameworks
  • Perform the maintenance needed to keep .NET applications available, using Dynatrace alerting to improve system scalability, reliability, and performance; this includes optimizing infrastructure, implementing redundancy and failover mechanisms, and conducting load testing to ensure systems can handle expected traffic volumes
  • Expertise with AWS components such as EC2, RDS, ELB, Route 53, Lambda, ECS, WAF, and Kafka, and in onboarding logging and services for these components into Dynatrace
  • Create custom dashboards that help troubleshoot issues related to networking, disk I/O, and service degradation
  • Work with the application team to deliver monitoring solutions based on their requirements, and create custom alerts, dashboards, rehydration processes, and reporting
  • Identify key metrics and create monitors and custom alerts, using tags, for various applications
  • Troubleshoot issues related to .NET Core on AWS across production and project environments
  • Provide 24x7 support for inter-application groups (Development, QA, System Administration, Support, DBAs)
  • Experience with incident management processes: responding to incidents, conducting post-incident reviews (PIRs), and contributing to improved system reliability and resilience
  • Analyze and troubleshoot application performance issues to pinpoint the root cause of slowness and avoid downtime.
  • Proficiency in scripting languages such as Python, Bash, or PowerShell for automating routine tasks, troubleshooting issues, and building tools to improve operational efficiency.
  • Deploy .NET applications in clustered environments using CI/CD tools such as Jenkins, Puppet, and Chef.