Site Reliability Engineer
Experience:
Working knowledge of monitoring tools: Splunk, AppDynamics, ThousandEyes, ExtraHop
Knowledge of networking including DNS, DHCP, firewalls, load balancers and IP routing
Familiarity with one or more databases: Oracle, SQL Server, MongoDB
Preferred experience with C#, .NET, Java and scripting
Extensive experience in enterprise-level infrastructure orchestration with Salt, Kubernetes, IaaS
Experience in High Availability and distributed systems, Linux and Windows administration, troubleshooting and support
Experience transitioning platforms to the cloud and a good understanding of cloud technologies: Google Cloud Platform, Azure, PCF
Experience with Atlassian tools (Jira, Confluence, Bamboo, Bitbucket) and Harness
Excellent debugging skills across a variety of integrated platforms
Role:
- You are a problem solver and enjoy engineering solutions, often collaborating to achieve the best outcome.
- You are a problem preventer, iteratively improving observability, alerts, metrics, feedback loops, and system design.
- You work well with others. Everyone on our team has the back of every other member. It's that simple.
- You possess excellent communication skills whether documenting, messaging, or speaking/listening.
- You are effective in responding to alerts, escalations, and system recovery events. You are passionate about eliminating noisy alerts.
- You demonstrate courage in protecting our applications and doing what is best for our clients.
- You partner well with developers and architects.
Additional Job Details:
The candidate will fill a critical role as a Site Reliability Engineer (SRE) and needs a strong technical background to efficiently meet engineering and support needs.