shasank-periwal/de-interview-prep

🚀 Data Engineering Interview Preparation

This repository collects notes from tutorials I watched, documentation I read, and questions I was asked during my own interview process. Treat it as an all-round guide, and feel free to fork it and add whatever you need.


🏁 Getting Started

You can navigate through the folders based on your weak areas or follow the structure for a holistic review.

  1. Clone the repo to your local machine to run code snippets (especially for Python/Spark).
  2. Star the repo to save it for future reference.
  3. Explore the folders linked below to dive into specific technologies.

📂 Repository Structure

🛠 Technical Topics

Core technologies and frameworks essential for Data Engineering.

| Topic | Description | Link |
| --- | --- | --- |
| SQL | Practice questions, query optimization, and patterns. | View SQL |
| Python | Python scripting, DSA for Data Engineers, and interview questions. | View Python |
| Spark | Apache Spark architecture, optimization, and Q&A. | View Spark |
| Spark Streaming | Real-time data processing concepts and Docker setups. | View Streaming |
| Airflow | DAGs, orchestration patterns, and workflow management. | View Airflow |
| Kafka | Event streaming architecture and producer/consumer concepts. | View Kafka |
| Hive | Data warehousing, HQL, and storage formats. | View Hive |
| Docker | Containerization basics for data pipelines. | View Docker |
| AWS | Cloud architecture, Kinesis, and data engineering on AWS. | View AWS |

📐 System Design

| Topic | Description | Link |
| --- | --- | --- |
| Data Modelling | Star vs Snowflake schemas, normalization, and dimensional modeling. | View Modelling |
| HLD / LLD | High-Level and Low-Level Design for Data Platforms. | View Design |

🧠 Behavioral & Process

| Topic | Description | Link |
| --- | --- | --- |
| Hiring Manager | Questions to expect (and ask) in HM rounds. | View HM Notes |
| HR Round | Standard HR questions and negotiation tips. | View HR Notes |
| Prompts | Useful prompts for AI tools to help you prepare. | View Prompts |
| Extras | Miscellaneous tips and resources. | View Extras |
| todo | A place to add your short-term goals. | View todo |

🤝 Contributing

This project is a living document. If you have a new interview question, a better solution, or a new topic to add, follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/AmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/AmazingFeature).
  5. Open a Pull Request.

📬 Connect with me

If this repository helped you, please consider starring it! It helps others find these resources.


Happy Learning & Good Luck!

About

A comprehensive guide to ace Data Engineering interviews. Covers SQL, Python, Spark, AWS, System Design, Airflow, and behavioral questions.