Data Engineer @ Infosys | BFSI Domain
Azure & GCP | PySpark | SQL | Kafka | Snowflake | Oracle (ORMB) | Data Quality & Automation (SDET) | AI-900 | AZ-900 | 2x ISTQB
- 3+ years of hands-on experience in Data Engineering across BFSI and enterprise-scale applications.
- Proficient in SQL, Python, and UNIX for ETL data validation, transformation, and backend testing.
- Designed and optimized ETL pipelines for high-volume financial datasets, ensuring accuracy and integrity.
- Worked with Hadoop, Hive, and Spark for large-scale data processing and analytics.
- Developed and maintained data workflows using Apache Airflow, ensuring timely execution of batch processes.
- Implemented data ingestion and streaming pipelines using Confluent Kafka, reducing latency in data delivery.
- Experienced with NoSQL databases (MongoDB, Cassandra) for semi-structured and unstructured data handling.
- Conducted backend-to-frontend data validation, ensuring a consistent user experience across applications.
- Collaborated with cross-functional teams (BAs, developers, testers, and product owners) to align data solutions with business requirements.
- Published reusable Python/SQL data validation scripts and automation utilities, improving team efficiency by ~15%.
- Implemented 15+ end-to-end data engineering projects, published on GitHub, covering real-time and batch pipelines across the GCP, Snowflake, and Azure ecosystems with PySpark, Airflow, Kafka, and Delta Lake. Examples include Flight Booking ETL pipelines with Airflow and CI/CD, E-commerce event-driven pipelines on Databricks, a Travel Booking SCD2 data warehouse, UPI Transactions CDC streaming, and a Healthcare Medallion architecture.

