Site Reliability Engineer
- Full Time, onsite
- SGS Consulting
- Hybrid, 2-3 days onsite per week (relocation required), United States of America
Title: Tech Lead/Site Reliability Engineer
Location: Phoenix, AZ, 85027 (Hybrid)
Job ID: 1ACIJP00003207
Job Duration: 6 months (Contract, Contract-to-Hire)
Job Description:
- Hybrid Onsite: Worker is required to work onsite 3 days per week in Phoenix, AZ as they will be working cross-functionally with 3 different teams.
- Is this a new position added to the team -or- a backfill? Backfill
- Contract Duration: 6 months to start.
- Is there a possibility for extension? Possibly.
- Is there a possibility for this role to be converted to FTE? Yes. This is a Contract-to-Hire position.
- Are you open to accepting H1B and/or Corp-to-Corp Candidates? Yes.
- Is this role associated with a project? Yes - Enterprise Observability
- Does this role require the contractor to interface with PHI or Personal Data? Yes
Main Responsibilities:
- Experience in leading Observability initiatives as Lead Engineer.
- Development and implementation of build release pipelines with accountability for managing deployment schedules, issues, risks, and impediments.
- Agile development experience with team member accountability for commitment and delivery each sprint.
- Ensure that all implementations of observability meet the requirements prescribed by IT Services through the effective implementation or use of approved processes, methodologies, and deliverables.
- Provide expertise and design solutions for observability applications as well as system integration with internal systems and external vendors.
- Provide technical leadership in design, development, and testing of solutions.
- Track infrastructure delivery and dependencies to implementation.
- Prepare and present potential technical solutions and advise the teams on approach and tradeoffs.
- Define the structure of systems, their interfaces, and the principles that guide their organization, software design, and implementation.
- Define and support reusable application components from a business and technology perspective.
- Provide coding and technical direction to less experienced staff, or develop highly complex original code.
Qualifications:
- Experience with gathering and organizing large volumes of data for instrumentation into an Enterprise Observability solution.
- Experience with recommending baseline monitoring thresholds, and performance monitoring KPIs and SLAs.
- Experience with installing agents, forwarders, APIs, performance monitoring alerts, dashboards, and data trend analysis.
Experience:
- Programming experience: Java (required); Python and Go (desired).
- Experience with databases such as Azure SQL, PostgreSQL, MySQL, MongoDB, a TSDB, or similar.
- Experience building ingestion patterns and custom exporters to ingest data from applications, products, and cloud components.
- Experience with at least one cloud platform (Microsoft Azure, Google Cloud Platform, or AWS) is required.
- Experience with Docker and Kubernetes, including creating Kubernetes manifests, is required.
- Experience working with open-source platforms (e.g., Grafana) and OpenTelemetry libraries is preferred.
- Experience with Agile/Scrum methodologies is required.
Requirements:
- 4-year degree (Computer Science, Information Systems, or a related functional field) and/or an equivalent combination of education and work experience.
- 5+ years of tech lead experience implementing Observability/SRE solutions using OSS tools at enterprise scale is required.
- 8+ years of development experience is required. Google Cloud experience is an added advantage.
- Hands-on experience with the tools and technologies listed above is preferred.
What are the top 3-5 skills of an ideal candidate?
1. Grafana OSS stack for observability (Mimir, Loki, Tempo, Grafana Agent)
2. Hands-on experience with Azure, Google Cloud Platform, or AWS, including pulling observability data from managed services
3. Coding in Java, Go, or Python, or OpenTelemetry SDK implementation at enterprise scale
What are the top 3-5 responsibilities expected of this worker?
- Lead the Observability Ingestion team.
- Provide technical solutions day to day.
- Responsible for the technical delivery of the team.
- Resolve any technical blockers.
- Work with the Architects to provide solution options, and run proofs of concept (POCs) and research on any new technologies required for the team.