Data Engineer Architect (PySpark)

Full Time, onsite
Darwin Resources LLCs
On Site, United States of America

Salary undisclosed

Job Title: Data Engineer Architect (PySpark)
Location: Dallas, TX
Job Type: Full-Time
Department: Data Engineering

We are seeking a talented Data Engineer Architect with expertise in PySpark.
Job Summary:
As a Data Engineer Architect, you will play a pivotal role in designing and implementing scalable data pipelines and architecture that facilitate data ingestion, processing, and analysis. Your expertise in PySpark will be essential in building efficient data solutions that support our analytics and machine learning initiatives.
Key Responsibilities:
- Design and implement robust data architectures using PySpark that support ETL processes, data warehousing, and analytics platforms.
- Build, optimize, and maintain data pipelines for large-scale data processing, ensuring data quality, reliability, and performance.
- Work closely with data scientists, analysts, and other stakeholders to understand data requirements and translate them into technical specifications.
- Identify and implement best practices for data processing and storage, including performance tuning and resource optimization.
- Evaluate and recommend data engineering tools and technologies, keeping abreast of industry trends and advancements.
- Create and maintain comprehensive documentation for data architecture, pipelines, and processes.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience in data engineering or related roles, with a strong focus on big data technologies.
- Proven expertise in PySpark and experience with Apache Spark frameworks.
- Proficiency in data modeling, ETL processes, and data warehousing concepts.
- Experience with cloud platforms (AWS, Azure, Google Cloud Platform) and associated data services.
- Strong programming skills in Python; familiarity with other languages such as Scala or Java is a plus.
- Knowledge of SQL and experience with relational and NoSQL databases.
- Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment.
- Strong communication skills, with the ability to articulate technical concepts to non-technical stakeholders.
Preferred Skills:
- Experience with containerization technologies (Docker, Kubernetes).
- Familiarity with machine learning frameworks and libraries.
- Understanding of data governance, security, and compliance best practices.

If you are passionate about data engineering and are looking to make a significant impact within a forward-thinking organization, we would love to hear from you!

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
