Data Engineer, Amazon AGI, AGI Data Services

Full Time, onsite
Amazon
On Site, United States of America

Salary undisclosed

AI is the most transformational technology of our time, capable of tackling some of humanity's most challenging problems. Amazon is investing in generative AI and the responsible development and deployment of large language models (LLMs) across all of our businesses. Come build the future of human-technology interaction with us.

We are looking for candidates who don't just think outside the box, but make the box they are in bigger. The future is now. Do you want to be a part of it? Then read on!

We're looking for a Data Engineer on Amazon's AGI team to build world-class data platforms and deploy scalable data ingestion tools, with a commitment to fostering the safe, responsible, and effective development of AI technologies. The ideal candidate is an expert in petabyte-scale data ingestion, data processing, data modeling, ETL/ELT design, and business intelligence tools, and passionately partners with the business to identify strategic opportunities where improvements in data infrastructure create outsized business impact. They are a self-starter, comfortable with ambiguity, able to think big (while paying careful attention to detail), and enjoy working in a fast-paced team. The ideal candidate needs exceptional technical expertise with large-scale lakehouses, distributed computing at a scale of thousands of hosts across multiple clusters, Spark, BI systems, and AWS services.

Core Responsibilities

Design, implement, and support a platform providing ad hoc access to large datasets

Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using Spark or other state-of-the-art systems

Implement data structures using best practices for lakehouses

Model data and metadata for ad hoc and pre-built reporting, targeting read-, write-, and summary-optimized storage

Interface with business customers, gathering requirements and delivering complete reporting solutions

Build robust and scalable data integration (ETL) pipelines using Kotlin, Python, TypeScript, and Spark

Build and deliver high quality datasets to support business analyst and customer reporting needs

Continually improve, automate, and simplify self-service data ingestion at scale for customers

Participate in strategic & tactical planning discussions

BASIC QUALIFICATIONS

- 3+ years of data engineering experience

- Experience with data modeling, warehousing and building ETL pipelines

- Knowledge of batch and streaming data architectures like Kafka, Kinesis, Flink, Storm, Beam

- Knowledge of distributed systems as they pertain to data storage and computing

- Experience with SQL

PREFERRED QUALIFICATIONS

- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions

- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit

About Amazon

Size

More than 5000

Industry

Broadline Retail

Location

King County, United States

Founded

5 July 1994
