REMOTE - API Production Support specialist

Salary undisclosed

Location: REMOTE

Position Type: Multiyear Contract

Requirements

MUST be available to work in a 24x7, Level 2 API support and incident response service team

ON CALL Required

Expertise in MuleSoft API troubleshooting and support

Experience using monitoring tools for API management like Azure Monitor, Splunk and Dynatrace

Familiarity with ServiceNow tools for incident tracking and documentation

Ability to use enterprise runbooks and wiki documentation for issue resolution

Ability to collaborate with multiple internal and external stakeholders, including the Tier 3 team and Support Lead

Preferably a Java background, to read stack traces and logs in order to pinpoint root causes

Experience with SOAP/REST APIs with Spring Boot and Java microservices

Experience with MuleSoft Anypoint Platform, including Exchange and monitoring

Use Azure, Splunk and Dynatrace-based dashboards for monitoring and resolution

Conduct root cause analysis, escalate issues to internal Tier 3 team as necessary, and engage multiple vendors for resolution when required

Use enterprise runbooks and wiki documentation, and collaborate with the Tier 3 team or Support Lead

Provide 24x7 on-call support as a primary or secondary contact (rotation basis)

Serve as API support on at least one major incident call per day, averaging 2 hours

Handle API-related incidents raised through ServiceNow and based on Moogsoft tickets

Troubleshoot and resolve issues within L2 incident criteria

Ensure timely response and resolution of API-related incidents per agreed SLAs

Perform initial triage, log analysis, and impact assessment

Ensure monitoring and alerts are accurate, current, and functional

Utilize enterprise runbooks and wiki documentation for troubleshooting and resolution

Participate in Problem and Knowledge Management process as requested

Provide observability support for incident management to proactively identify, diagnose, and resolve issues

Conduct detailed RCA (Root Cause Analysis) for recurring or high-impact incidents

Provide RCA reports with contributing factors, corrective actions, and long-term recommendations

Work with internal teams to implement preventative measures

Collaborate with the Tier 3 team or support lead when necessary to resolve complex issues

Maintain documentation of escalations, including logs, timestamps and resolution progress

After RCA, determine and contact relevant vendors required for issue resolution

Provide necessary logs, issue descriptions, and troubleshooting details to vendors

Track vendor resolution progress, coordinate efforts, and update stakeholders

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.