Big Data Engineer

Salary undisclosed

Job Title: Sr. Big Data Engineer

Job Description:
We are seeking a Senior Big Data Engineer to provide expert-level guidance on system architecture and infrastructure optimization within our enterprise. This role focuses on the design, development, and optimization of Big Data solutions on AWS, supporting critical initiatives related to the HFC Network and Neo4j implementations. With billions of data points in the graph, we need a seasoned professional to enhance deployment pipelines, ensure data pipeline consistency, and lay the foundation for advanced data science applications.

Responsibilities:

  • Data Pipeline Development: Design, implement, and maintain scalable, high-performance data pipelines to ingest, process, and transform large datasets.
  • Pipeline Optimization: Improve data processing speed, reduce latency, and standardize deployment pipelines to ensure seamless, efficient operations.
  • AWS Cloud Expertise: Utilize AWS services (Amazon EMR, S3, EC2, Lambda, etc.) to build, optimize, and manage cloud-based data solutions.
  • Cloud Architecture Design: Architect and maintain secure, cost-effective, and high-availability data infrastructure on AWS. Implement best practices for disaster recovery and data security.
  • Neo4j Integration: Support and expand infrastructure for Neo4j implementations, efficiently handling billions of data points.
  • Monitoring & Troubleshooting: Develop monitoring and alerting mechanisms to diagnose and resolve data pipeline issues proactively.
  • Optimization & Automation: Identify areas for improvement and implement automated solutions to enhance data processing efficiency and reduce operational costs.

Required Qualifications:

  • 5+ years of experience in Big Data Engineering, Data Pipeline Development, or Cloud Data Architecture.
  • Strong expertise in AWS services (Amazon EMR, S3, EC2, Lambda, RDS, etc.).
  • Hands-on experience with Neo4j or other graph databases, particularly in handling large-scale datasets.
  • Proficiency in Python, Scala, or Java for data processing and pipeline automation.
  • Experience with ETL processes, distributed computing, and batch/stream processing frameworks.
  • Strong understanding of DevOps & CI/CD practices, including infrastructure-as-code (Terraform, CloudFormation).
  • Experience optimizing and standardizing data pipelines across cloud environments.
  • Excellent problem-solving and analytical skills with a proactive approach to identifying inefficiencies.

Preferred Qualifications:

  • Experience working with HFC Network systems or related telecom data infrastructure.
  • Familiarity with Kubernetes for containerized deployments.
  • Background in data science applications, machine learning model deployment, or AI-driven analytics.
  • Knowledge of graph processing algorithms and experience working with billions of data points in large-scale Neo4j implementations.

Interview Process: Initial screening, followed by a technical interview with a hands-on coding assessment.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy, and Dice believes it correctly reflects the job opportunity.