:octocat:
->🍩[eat]->💤[sleep]->💻[code]->🧘[yoga]->🔁[repeat]


Hi, I'm Anjul Gupta

LinkedIn | Mail | Resume
[Banner: Anjul Gupta - Data Engineer]

Data Engineer @ Infosys | BFSI Domain
Azure & GCP | PySpark | SQL | Kafka | Snowflake | Oracle (ORMB) | Data Quality & Automation (SDET) | AI-900 | AZ-900 | 2 x ISTQB

  • 3+ years of hands-on experience in Data Engineering across BFSI and enterprise-scale applications.
  • Proficient in SQL, Python, and UNIX for ETL data validation, transformation, and backend testing.
  • Designed and optimized ETL pipelines for high-volume financial datasets, ensuring accuracy and integrity.
  • Worked with Hadoop, Hive, and Spark for large-scale data processing and analytics.
  • Developed and maintained data workflows using Apache Airflow, ensuring timely execution of batch processes (a minimal DAG sketch follows this list).
  • Implemented data ingestion and streaming pipelines using Confluent Kafka, reducing latency in data delivery.
  • Experienced with NoSQL databases (MongoDB, Cassandra) for semi-structured and unstructured data handling.
  • Conducted backend-to-frontend data validation, ensuring consistent user experience across applications.
  • Collaborated with cross-functional teams (BAs, developers, testers, and product owners) to align data solutions with business requirements.
  • Published reusable Python/SQL data validation scripts and automation utilities, improving team efficiency by ~15% (a validation sketch follows this list).
  • Implemented 15+ end-to-end Data Engineering projects, including Flight Booking ETL pipelines with Airflow & CI/CD, e-commerce event-driven pipelines on Databricks, a Travel Booking SCD2 data warehouse, UPI Transactions CDC streaming, a Healthcare Medallion architecture, and multiple real-time/batch pipelines across GCP, Snowflake, and Azure ecosystems using PySpark, Airflow, Kafka, and Delta Lake, with all work published on GitHub.
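
To ground the Airflow bullet above, here is a minimal sketch of a daily batch DAG in the spirit of the Flight Booking ETL project. The DAG id, schedule, and the `extract`/`transform`/`load` callables are hypothetical placeholders, not code from the actual repositories.

```python
# Minimal illustrative Airflow 2.x DAG (hypothetical tasks, not from the real projects).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw booking records from a source system.
    return [{"booking_id": 1, "amount": 120.0}]


def transform(**context):
    # Placeholder: clean and filter the extracted records.
    rows = context["ti"].xcom_pull(task_ids="extract")
    return [r for r in rows if r["amount"] > 0]


def load(**context):
    # Placeholder: write the transformed records to the target store.
    rows = context["ti"].xcom_pull(task_ids="transform")
    print(f"Loading {len(rows)} rows")


with DAG(
    dag_id="flight_booking_etl_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```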
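
Likewise, a hedged sketch of the kind of reusable PySpark validation utility mentioned above: it compares source/target row counts and per-column null counts after a load. The table names (`staging.transactions`, `curated.transactions`) and the columns are hypothetical.

```python
# Illustrative PySpark data-quality helpers (hypothetical table and column names).
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def compare_row_counts(source: DataFrame, target: DataFrame) -> dict:
    """Compare row counts between a source and a target DataFrame."""
    src_count, tgt_count = source.count(), target.count()
    return {"source": src_count, "target": tgt_count, "match": src_count == tgt_count}


def null_counts(df: DataFrame, columns: list) -> dict:
    """Count NULLs per column, a common post-load data-quality check."""
    row = df.select(
        [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in columns]
    ).collect()[0]
    return row.asDict()


if __name__ == "__main__":
    spark = SparkSession.builder.appName("dq_checks_example").getOrCreate()
    source_df = spark.table("staging.transactions")   # hypothetical staging table
    target_df = spark.table("curated.transactions")   # hypothetical curated table
    print(compare_row_counts(source_df, target_df))
    print(null_counts(target_df, ["txn_id", "amount", "txn_date"]))
```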

Technical Skills:

Languages: C, C++, Java, Python
Data & Cloud: MySQL, Oracle SQL Developer, Cassandra, MongoDB, Trino, Azure, GCP, AWS, PySpark, Hive, Snowflake, BigQuery, Hadoop, Iceberg, Databricks, Kafka, Airflow, Hudi
Collaboration & Version Control: Confluence, Jira, Zephyr, Git, GitHub, Bitbucket
Testing & IDEs: Selenium, BDD Cucumber, TestNG, JUnit, Maven, Eclipse, Visual Studio Code

Pinned Repositories

  1. Edureka_Automation (HTML)

    POM (Page Object Model) of the Edureka website using Eclipse, Java, Selenium, JUnit, Apache POI, and TestNG. This repository carries all necessary library and JAR files to automate the Edureka website.

  2. E-Banking-Automation (HTML)

    POM (Page Object Model) of an E-Banking application using Eclipse, Java, Selenium, JUnit, Apache POI, and TestNG. This repository carries all necessary library and JAR files to automate the E-Banking application.

  3. SpiceJet_TestNG_Automation (Java)

    TestNG automation project in Java and Selenium. The implemented automation framework is 100% open source, and its components are Eclipse, Java, Selenium, JUnit, Apache POI, and TestNG.

  4. CoinexBodhi_Developer (HTML)

    Web app to track live data such as price, volume, and change for several cryptocurrencies.

  5. MachineLearning_ClassificationProject (Jupyter Notebook)

    Explored and visualized the Iris dataset to gain insights. Built Logistic Regression, Decision Tree, and Random Forest models for classification of species. Used accuracy score as the evaluation metric…