Databricks Engineer
Databricks/ADF Position
Start: October/November 2024
Type: Full Time Hire
Location: Remote
Description:
We are looking for someone who is experienced with Databricks within an Azure environment. We are currently set to onboard various types and forms of data from across FDIC to the Data Orchestration Platform (DOP). As such, this candidate will help drive the architecture and integration of Databricks solutions within an Azure environment. This candidate will also implement Databricks solutions to onboard these datasets and will need hands-on experience. The candidate will work with technologies such as Azure Data Factory, Change Data Capture, and various development languages. Day-to-day tasks include building pipelines using ADF, building notebooks, and transforming data using Databricks. Since the DOP is a fluid environment, the candidate must adapt readily to changes in customer requests. Below is a list of requirements:
Architecture and integration of Databricks in an Azure environment
Must have:
- Databricks architecture and design experience in Azure using the Unity Catalog (currently being implemented in the PROD environment)
- Integration experience with Databricks and Azure Data Lake (will help identify and define additional uses of Databricks in the environment)
Desired:
- Integration experience using Databricks with Azure Data Factory (ADF) (integration currently underway in the Azure environment)
Hands-on experience with Databricks in an Azure environment
Must have:
- Developed Databricks notebooks to provide various functionality (e.g., ETL, zip/unzip, and other data transformations)
- Experience implementing Databricks views using DataFrames or other methods
- Experience using various programming languages (Python, PySpark, SQL, etc.) to develop solutions in Databricks notebooks and jobs
- Experience with Azure Data Factory (ADF) pipeline creation to move data from various sources (DB, BLOB, file server, website, etc.) to the Azure Data Lake
Desired:
- Familiarity with Change Data Capture (CDC) patterns such as Full, Type 2, IUD, etc. (currently using Databricks to develop CDC pattern solution code between the Data Lake Bronze and Silver environments)
Additional Desired:
- Strong willingness to learn; self-learner
- Strong initiative
- Works well in small teams