Data Reliability Engineer
Title: Sr. Data Reliability Engineer
Visa: EAD
Location: San Antonio, TX (Hybrid, 3 days in the office)
We are currently seeking a highly skilled Data Reliability Engineer to join our team. This role requires a blend of expertise in data pipeline technologies and Site Reliability Engineering (SRE) principles to ensure the highest standards of data reliability and system performance.
Responsibilities:
* Design, build, and maintain the infrastructure and data pipelines to support data transformation, data structures, metadata, dependency, and workload management.
* Develop and maintain scalable, reliable, and cost-effective data solutions using AWS technologies and big data tools like Databricks, Airflow, and Dremio.
* Implement robust monitoring and alerting systems using tools such as Datadog to ensure proactive management of the production environments.
* Work closely with data scientists and analytics teams to engineer and optimize data models using dbt (data build tool) and ensure seamless data flow across all segments.
* Strengthen data validation and integrate data quality metrics into data pipelines to ensure the accuracy and reliability of data.
* Automate manual processes, optimize data delivery, and re-design infrastructure for greater scalability.
* Handle the deployment of additional AWS services such as Lambda functions and manage data storage solutions.
* Collaborate with IT and DevOps teams to enhance system performance and reliability through AWS services such as SNS, SQS, and Elastic Load Balancing.
* Engage in continuous improvement efforts to enhance performance and provide increased functionality across data platforms.
* Provide support and mentorship to offshore teams, ensuring best practices in coding, testing, and deployment are followed.
* Troubleshoot complex issues across multiple databases and work with various stakeholders to ensure robust architecture and operational s