Anant Sharma

Data Engineering Consultant | AWS, Azure & Snowflake Certified | Python Developer

Witness My Data Magic at the Snowflake Summit 2024!

Watch my tech talk from San Francisco where I discuss scaling data engineering for 2.5 PB migrations!

Professional Experience


Principal Data Engineer @ ProCogia (Jan 2021 - Present)

  • Invited to present at the Snowflake Data Summit 2024 in San Francisco on a data migration project.
  • Migrated 2.5 PB of data to Snowflake, saving the client ~$1M annually.
  • Built a data lake architecture from scratch on AWS and Snowflake, ingesting data from Kafka into a landing zone and a processed data lake.
  • Refactored and deployed a machine learning model on AWS EMR, rewriting scikit-learn-based Python code as PySpark code using Sedona and Spark ML; the rewrite made the model 4000 times faster and saved thousands of dollars in cloud costs.
  • Implemented an anonymization process to structure and standardize the data lake using AWS EMR and Snowpark, scheduled as a batch job.
  • Orchestrated ETL workflows using Airflow and Azure Data Factory (ADF).

Lead Consultant @ HCL Technologies (Jan 2019 - Dec 2020)

  • Created dimensional data models for Redshift and conducted fraud data modeling.
  • Automated data validation processes using Python, reducing time from days to hours.
  • Performed data modeling for fraud data and a group risk model for one of the leading banks in Malaysia.
  • Built the data model for the group risk model to automate regulatory reporting for Bank Negara.

Senior Consultant @ PwC (Apr 2016 - Jan 2019)

  • Implemented data lakes on AWS using S3, Glue, and Athena for ad-hoc reporting.
  • Built data pipelines for real-time data ingestion using Kinesis Firehose and Google Firebase.
  • Implemented a PwC data model in ERwin Data Modeler for the BFSI domain.
  • Helped build the data lake flow (Ingestion → Cleansing → Transforming → Loading) using Data pipeline and Python scripts.
  • Implemented a NoSQL data model for MongoDB using Hackolade.
  • Worked on-site assisting PwC South Africa on the CTR – Automation & Assurance project.

Project Engineer @ Wipro (Apr 2014 - Apr 2016)

  • Served as ETL lead for a large data warehousing project involving 39 source systems with 20+ years of data for one of the leading and oldest government banks in India.
  • Made extensive use of ERwin Data Modeler, Base SAS Programming, SAS Enterprise Guide, SAS DI Studio, SAS Web Report Studio, Teradata, and Teradata FSLDM.
  • Provided reusable components to load data from various databases such as Oracle, DB2, and SQL Server, earning a SPOT Award.
  • Designed a script that was selected as a BVM process across all EDW projects in the organization.

Skills & Tools


Data Engineering

Snowflake, Redshift, AWS S3, Athena, Glue

Programming

Python, SQL, JavaScript, PySpark

Cloud

AWS, Azure, Google Cloud

Analytics & BI

Power BI, Tableau, QuickSight

Certifications


AWS Certified

Solutions Architect, Data Analytics

Snowflake Certified

SnowPro Core

Azure Certified

Data Fundamentals

IBM Certified

Python Core & Analytics

Education

Bachelor of Technology in Information Technology, Dr. A.P.J. Abdul Kalam Technical University