My portfolio of Data Engineering and Data Quality projects: AWS, Azure, Databricks, streaming analytics, predictive modeling, and governance.

Archana-Lal/data-portfolio

Archana Lal – Data Engineering & Data Quality Portfolio

👋 Hello! I'm Archana Lal, a Data Quality Lead and Cloud Data Engineer with over six years of experience architecting, governing, and optimizing data solutions on AWS and Azure. My passion lies in transforming complex business requirements into reliable, high-value data assets.

This portfolio contains five detailed case studies from my work in the global logistics sector, showcasing my journey from hands-on engineering to strategic data leadership.


Portfolio of Case Studies

Each case study below links to a detailed write-up covering the business challenge, my role, the high-level technical solution, and a workflow diagram.

| Case Study | Focus Area | Key Technologies | Key Outcome |
| --- | --- | --- | --- |
| 1. Real-Time Streaming Analytics | Streaming Data Pipelines | AWS Kinesis, EMR, Elasticsearch | ✅ Enabled real-time operational dashboards |
| 2. Predictive ML Data Quality | ML Pipeline Enablement | AWS, PySpark, Feature Engineering | 📈 Increased model accuracy from 89% to 96% |
| 3. Cloud Migration & Enablement | Multi-Cloud Migration | AWS, Azure, Terraform, Redshift | 👥 Successfully up-skilled a 20-member team |
| 4. Process Stabilization & Leadership | Hybrid-Cloud Support & Process | Azure Databricks, Synapse, Queues | 📉 Reduced critical escalations by over 40% |
| 5. Strategic Data Governance | Data Quality & Observability | Databricks, SODA, Data Contracts | 🎯 Projected to cover >60% of data gaps |

Core Skills

  • Cloud: AWS (S3, Glue, EMR, Redshift, Kinesis, DMS), Microsoft Azure (ADLS, Synapse, ADF, Event Hubs)
  • Data Engineering & ETL: Apache Spark (PySpark, Scala), Databricks, AWS Glue, Azure Data Factory, Talend, SSIS, Hadoop, Hive
  • Databases: Oracle, MySQL, Teradata, Redshift, DynamoDB, Elasticsearch, S3, Azure Blob Storage
  • Data Quality & Governance: Soda, Data Profiling, Data Cleansing, Data Validation, Data Contracts, Alation
  • BI & Visualization: Power BI, ThoughtSpot, MSBI, SAP BODS, SSRS, Kibana, Grafana
  • Languages & Scripting: SQL, Python, Shell Scripting
  • DevOps & Tools: Git (GitHub, Bitbucket), Terraform, Airflow, Jira, Confluence, Alation, Lucid

🏆 Achievements & Certifications

  • Infosys Platinum Member (Awarded to the Top 1% of employees, FY-2023)
  • Microsoft Certified: Azure Data Engineer Associate (DP-203)
  • AWS Certified Solutions Architect - Associate
  • Microsoft Certified: Power BI Data Analyst Associate (PL-300)

📫 Connect with Me


⚑I am actively seeking new Data Engineering or Data Quality roles in the United Kingdom and require visa sponsorship. If my experience in building and governing enterprise-scale data solutions aligns with your needs, I would be delighted to connect.


⚠️ Disclaimer

The case studies in this repository are inspired by real-world projects I have contributed to during my professional experience. However, they do not include any proprietary code, internal documentation, or confidential information. All client names have been anonymized, and the content has been adapted solely to illustrate the nature of the work and the skills involved. These case studies are presented as representative examples, and may not reflect the exact implementations or internal processes used by the original organizations. This repository is intended purely for demonstrating my experience and capabilities as a data strategist.
