Incident Manager
We have a contract role for an Incident Manager with our client in Detroit, MI. Please let me know if you or any of your friends would be interested in this position.
Position Details:
Incident Manager - Detroit, MI
Location: Detroit, MI 48226 (Hybrid)
Project Duration: 12+ month contract
This is a potential CONTRACT TO HIRE role.
Must be local to Detroit, MI. Will need to be onsite at DTE 3 days per week.
RESPONSIBILITIES:
Incident Management/Major Incident Management (Primary responsibility)
- Identify that an incident has occurred or is occurring and respond accordingly
- Utilize KB articles, incident logs, etc. to quickly determine next steps
- Engage the appropriate resources needed to contain the incident
- Establish a communication bridge for resources to collaborate
- Send communications to the appropriate audience at regular intervals during the incident
- Log all activities that occur during the incident
- Moderate the communication bridge to keep the team focused on containment, minimizing mean time to restore (MTTR)
- Escalate to management as needed
Event Management (secondary responsibility)
- Use EM tools to improve detection and response times to incidents
- Reduce downtime by proactively detecting performance anomalies before they become a widespread system-down incident
- Recognize the need for additional alarms or modified alarm thresholds based on past incidents
Problem Management (secondary responsibility)
- Detect that a problem exists (i.e., repeat incidents of the same type)
- Log the problem and assemble a team to work the problem
- Log a known error and any workaround
- Facilitate the technical team as they resolve the problem
- Document the problem resolution and ensure any KB articles are updated
Metrics
- Maintain key performance metrics:
  - MTTR (mean time to restore)
  - MTTK (mean time to know)
  - Incidents with defective or non-existent alarms
  - Mean time to detect/know
  - Unplanned outage count and duration
  - Planned outage count and duration
  - Incident counts by portfolio