
Senior Data Engineer

Open, US · Computer/Software

Job Description

The Data Engineer is responsible for the maintenance, synchronization, cleaning, and migration of transactional data in a hybrid environment spanning both on-prem systems and a modern cloud-based microservices environment. The Data Engineer works with the product teams to understand, analyze, document, and efficiently deliver streaming and batch-oriented data pipelines that synchronize legacy and modern data stores while ensuring data integrity.

The Data Engineer supports application database design to help eliminate data duplication and to enable selective event-based and schedule-based data transfer to endpoints within the cloud and legacy environments as required. The Data Engineer drives programmatic pipeline generation and orchestration to enhance repeatability and rapid deployment, applying out-of-the-box thinking, AWS-native capabilities, and CI/CD tools within established design patterns and methods.

  • The successful candidate will be able to rapidly develop technical solutions, working closely with the integrated product teams and developers with minimal direction from senior or lead resources
  • Understand data needs and construct data pipelines that automate event-driven, bi-directional, selective data replication, along with micro-batch and batch data pipelines
  • Standardize data processing modules to deliver modularity and enhance reusability
  • Utilize identified tools and services such as AWS Glue, Python, StreamSets, Step Functions, Lambda, Kinesis Data Streams, and Kinesis Data Firehose
  • Create and maintain standards and best practices for data and pipeline design

The candidate must have a successful track record in ETL job design, development, and automation with minimal supervision. The candidate will be expected to support a variety of structured, semi-structured, and unstructured data in streaming and batch frameworks, and must possess working knowledge of AWS with proven skills in AWS data tools and services such as AWS Glue, Step Functions, Lambda, and DMS. The candidate will troubleshoot, monitor, and coordinate defect resolution related to data processing and preparation, and is responsible for the creation and support of all data pipeline processes across the data assets within the current scope of the system.

 

Required Skills:

  • 3+ years of hands-on experience with ETL tools such as SAP DS and Pentaho PDI
  • 4+ years of hands-on experience with Python and various Python toolkits and libraries for data processing and pipelines
  • 4+ years of experience creating complex SQL queries and functions
  • 2+ years of hands-on experience with AWS Glue, Python, Step Functions, Kinesis Data Streams, and Kinesis Data Firehose
  • 1+ years of experience in Java and Linux scripting (essential)
  • 1+ years of experience working with CI/CD tools, including Git as a repository for ETL jobs and scripts

 

Desired Skills:

  • Experience working in AWS environments is advantageous, as are AWS certifications
  • Experience working with SQL Server and tools like SSRS is advantageous
  • Experience with big data tools such as EMR/Spark, Databricks/PySpark is an advantage
  • Experience working with database versioning tools such as Flyway is an advantage
  • Experience in Ansible and Jenkins scripting is an advantage

 

Qualifications:

  • Bachelor's degree in Computer Science or related discipline
  • The ability to successfully obtain and maintain a U.S. Suitability/Public Trust Background Clearance

 

Benefits:
Company pays 100% of employee-only Medical, Dental, Life Insurance, Short-Term Disability, and Long-Term Disability coverage.
We offer Vision coverage, 401(k) with immediate vesting and a competitive PTO policy.
 
VANTA Partners
www.vantapartners.io
