Geospatial Big Data
Salary undisclosed
Key Responsibilities
- Design and implement scalable big data architectures using Hadoop, Spark, or other distributed processing frameworks.
- Ingest, store, and manage large volumes of structured and unstructured geospatial data from various sources (e.g., satellite, drone, IoT, APIs).
- Optimize geospatial data storage using formats such as GeoParquet and GeoJSON, as well as PostGIS-enabled databases.
- Develop and maintain ETL/ELT pipelines with a focus on spatial data enrichment, transformation, and aggregation (a minimal sketch follows this list).
- Work with GIS tools and libraries such as GDAL, GeoPandas, Shapely, or QGIS.
- Collaborate with data scientists and analysts to support spatial modeling, analytics, and visualization.
- Implement data quality checks, lineage tracking, and metadata management for geospatial datasets.
- Ensure the performance, scalability, and reliability of geospatial data pipelines in production environments.
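As a concrete illustration of the storage and pipeline bullets above, here is a minimal sketch of one ETL step: ingest a GeoJSON layer, normalize its coordinate reference system, enrich it with one derived attribute, and persist it as GeoParquet. It assumes GeoPandas with the pyarrow engine installed; the file paths and the area_m2 column are hypothetical placeholders, not details from the posting.

```python
import geopandas as gpd

# Ingest: read any OGR-readable source (GeoJSON, shapefile, GeoPackage, ...).
gdf = gpd.read_file("raw/parcels.geojson")

# Transform: normalize to WGS 84 lon/lat so downstream consumers agree on a CRS.
gdf = gdf.to_crs(epsg=4326)

# Enrich: compute polygon area in an equal-area projection (EPSG:6933), in m^2.
gdf["area_m2"] = gdf.to_crs(epsg=6933).area

# Quality check: fail fast if upstream delivered invalid geometries.
assert gdf.geometry.is_valid.all(), "invalid geometries in source layer"

# Store: GeoParquet keeps geometry columnar and compresses well for analytics.
gdf.to_parquet("curated/parcels.parquet")
```

In production these same steps would typically run inside a distributed job (for example Spark with Apache Sedona) rather than on one machine, but the format and CRS choices carry over unchanged.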
Requirements
Technical Skills
- Proficiency in distributed computing frameworks (e.g., Apache Spark, Hadoop, Kafka).
- Strong programming skills in Python, Scala, or Java.
- Experience with spatial databases and engines such as PostGIS, GeoMesa, or GeoSpark (now Apache Sedona).
- Familiarity with GIS tools and spatial data formats (e.g., GeoTIFF, KML, shapefiles).
- Cloud platform experience (AWS, Azure, or Google Cloud Platform), especially services like S3, EMR, or BigQuery with geospatial capabilities.
- Understanding of core geospatial concepts: coordinate systems, map projections, and spatial indexing (R-tree, quadtree); a short sketch of two of these follows below.
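Two of these concepts are easy to show in a few lines: reprojecting coordinates between systems and filtering geometries through an R-tree index. The sketch below assumes Shapely 2.x (where STRtree.query returns integer indices) and pyproj; every coordinate in it is a made-up sample value.

```python
from pyproj import Transformer
from shapely.geometry import Point, box
from shapely.strtree import STRtree

# Map projections: convert WGS 84 lon/lat to Web Mercator meters.
to_mercator = Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
x, y = to_mercator.transform(-73.9857, 40.7484)  # sample lon/lat

# Spatial indexing: bulk-load an STR-packed R-tree over sample points.
points = [Point(x + dx, y + dy)
          for dx in range(-10_000, 10_001, 500)
          for dy in range(-10_000, 10_001, 500)]
tree = STRtree(points)

# Window query: returns indices of points whose bounds intersect the box.
window = box(x - 5_000, y - 5_000, x + 5_000, y + 5_000)
hits = tree.query(window)
print(f"{len(hits)} candidate points inside the 10 km x 10 km window")
```

A real pipeline would build the index once over millions of features and reuse it across queries; the R-tree turns each window lookup from a linear scan into a logarithmic tree traversal.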