Senior Data Engineer
Job Description
About Lextegrity
Lextegrity is a global leader in the digital transformation of corporate risk management. Our integrated platform combines continuous spend monitoring of data from systems like SAP, Oracle, and Concur with workflow automation tools to help global organizations prevent and detect fraud, corruption, conflicts of interest, and other economic crimes.
Lextegrity was founded in 2017 by veteran legal and audit professionals with decades of experience in addressing corporate fraud. Organizations lose an estimated 5% of their annual revenues to fraud, yet most companies struggle to embed compliance policies into real-time business practices. Our software solutions address this key problem.
POSITION SUMMARY
We are seeking a talented Senior Data Engineer to join the data platform group for our Compliance Monitoring product. The role focuses on making data efficiently available to our risk analytics engine, which works to identify fraud, bribery, and corruption. The ideal candidate will have a deep understanding of Python and SQL and experience with data analytics. The candidate will work closely with the implementation teams, data scientists, and product stakeholders in evolving this unique product.
What you will work on here
- Build scalable batch data processing frameworks that run data through analytical transformations.
- Create foundational datasets to support machine learning models, and implement those models in production.
- Monitor and improve the performance of data pipelines and database queries.
- Monitor and debug data quality issues.
Qualifications
- 5+ years of experience working heavily (i.e., daily) with:
  - Python
  - SQL
  - Apache Airflow
  - Snowflake
  - AWS services (RDS, EMR, S3)
  - Git / GitHub
  - Spark (PySpark)
- Extensive experience building data processing systems & analytical data products.
- Knowledge of machine learning algorithms and techniques for data analysis.
What you've done (and skills you will use on a daily basis):
- Strong Python skillset
  - Advanced proficiency in Python for data engineering, with over 5 years of experience in developing optimized ETL pipelines for data transformation and analysis. Expert in utilizing Pandas and NumPy for efficient data manipulation, alongside SQLAlchemy for database interactions.
  - Demonstrated capability in leveraging Python to design and execute complex data models and ETL processes in Snowflake, ensuring high performance and scalability. Experience with Python-based data processing frameworks to support analytics and machine learning implementations.
  - Knowledge of integrating Python applications with cloud services and data orchestration tools, including AWS and Apache Airflow, to enhance data pipeline efficiency.
- Built multiple batch data processing pipelines (or pipeline frameworks) to extract, transform, and load data (ETL or ELT).
- Designed and wrote data models and ETL using Snowflake as the data warehouse
  - Demonstrated ability to design and implement robust data models in Snowflake, optimizing for performance, scalability, and cost-efficiency.
  - Advanced understanding of Snowflake's architecture and features, including data sharing and warehousing capabilities to support efficient data analysis.
  - Proven track record of deploying and managing Snowflake's security and access control features, ensuring data integrity and compliance with privacy regulations.
  - Experience with Snowpark and Snowpipe is a plus, enhancing capabilities in building scalable data applications and automating data loading.
- Worked extensively with AWS services (RDS, EMR, S3)
  - Spent time provisioning, using, and terminating these services (via Boto3, CDK, CLI, or Apache Airflow operators).
- Experience with Spark (with PySpark) is a plus
- Git / GitHub with PyCharm (or your IDE of choice).
- Worked with Apache Airflow building DAGs for ETL / ELT.
  - Built single-use-case DAGs.
  - Built frameworks / reusable components.
- SQL proficiency
  - Can utilize an explain plan to optimize query performance.
  - Proficient in SQL from the basics of join optimization / indexing to window functions and beyond.
BENEFITS
- Flexible working conditions and ability to work remotely
- Flexible PTO (we trust you)
- Work for one of the fastest-growing companies with some of the most talented people in the industry
- A transparent and collaborative team environment
- Medical, Dental & Vision Coverage
- FSA, HSA, Life insurance, Short-Term and Long-Term Disability insurance
- Stock options
- 401(k) Matching Program
Lextegrity is an equal opportunity employer and prohibits discrimination on the basis of race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws. Must be authorized to work in the United States on a full-time basis. No phone calls or recruiters please.