Senior Big Data & Databricks Engineer - PySpark, Hadoop & Real-Time Analytics
Salary undisclosed
Job Description
Who We Are
Artmac Soft is a technology consulting and IT services company dedicated to providing innovative technology solutions and services to its customers.
Job Description
Job Title: Senior Big Data & Databricks Engineer - PySpark, Hadoop & Real-Time Analytics
Job Type: C2C
Experience: 10-12 years
Location: Atlanta, New York
Requirements:
- Proven experience as a Big Data Engineer with expertise in the Hadoop ecosystem and real-time analytics.
- Strong proficiency in Spark (PySpark/Scala), Hive, and related big data technologies.
- Experience in building and deploying data pipelines and stream processing applications.
- Familiarity with job scheduling and optimization techniques within the Hadoop ecosystem.
- Solid understanding of Unix/Linux environments, including Shell scripting.
- Experience with cloud technologies, specifically Azure, is preferred.
- Excellent problem-solving skills and the ability to work independently and collaboratively in a fast-paced environment.
- Experience with MongoDB or similar NoSQL databases.
- Knowledge of data warehousing concepts and methodologies.
- Familiarity with machine learning frameworks and applications.
Responsibilities:
- Collaborate with business stakeholders to gather and understand requirements, translating them into technical specifications for data solutions.
- Design, develop, and implement complex data pipelines utilizing technologies such as PySpark, Scala Spark, Hive, Hadoop CLI, MapReduce, Storm, Kafka, and Lambda Architecture.
- Create and submit Spark jobs while ensuring high-performance tuning and scalability of data processes.
- Work with real-time stream processing technologies, including Spark Structured Streaming and Kafka, to deliver timely insights (see the illustrative sketch after this list).
- Leverage expertise in Python/Spark and their related libraries and frameworks to optimize data processing tasks.
- Manage job scheduling challenges in Hadoop, ensuring reliability and efficiency in data workflows.
- Develop and execute unit and integration testing for data pipelines, handling large data volumes to derive actionable insights.
- Optimize code for efficiency to meet stipulated SLAs, focusing on performance and resource management.
- Demonstrate strong Unix/Linux expertise, comfortable with the Linux operating system and Shell Scripting.
- Utilize Azure Cache to enhance the performance and speed of data processing tasks.
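For context, the sketch below shows the kind of Spark Structured Streaming job the role describes: reading events from a Kafka topic, parsing them, and writing them out with checkpointing. The broker address, topic name, event schema, and file paths are hypothetical placeholders, and the spark-sql-kafka connector package is assumed to be available on the cluster.

```python
# Illustrative sketch only: broker, topic, schema, and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = (SparkSession.builder
         .appName("streaming-pipeline-sketch")
         .getOrCreate())

# Assumed schema of the incoming JSON events.
event_schema = StructType([
    StructField("user_id", StringType()),
    StructField("event_type", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read a Kafka topic as a streaming DataFrame.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
       .option("subscribe", "events")                      # placeholder topic
       .load())

# Parse the Kafka message value from JSON into typed columns.
events = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

# Write the parsed stream to Parquet with checkpointing for fault tolerance.
query = (events.writeStream
         .format("parquet")
         .option("path", "/data/events")                   # placeholder output path
         .option("checkpointLocation", "/checkpoints/events")
         .outputMode("append")
         .start())

query.awaitTermination()
```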
Qualification:
- Bachelor's degree in Computer Science, Information Security, or a related field.