Welcome! This repo contains everything you need for Week 1 of the Data Engineering program: environment setup, Bash/VS Code/Jupyter intros, Git + GitHub workflows, Python practice (basics → intermediate → advanced), and a Pandas primer.
Tip: Keep commits small (e.g., one per exercise). When a notebook's tests pass, use Restart Kernel → Run All to confirm the notebook still runs cleanly from top to bottom.
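The Restart Kernel → Run All check can also be approximated from a terminal. The sketch below only builds the command; it assumes `nbconvert` is available (it is normally installed alongside `notebook` and `jupyterlab`), and the notebook path is a hypothetical placeholder, not a file from this repo.

```python
import shlex


def clean_run_command(notebook_path: str) -> list[str]:
    """Build a command that executes a notebook top to bottom in a fresh kernel."""
    # `--to notebook --execute` runs every cell in order. By default nbconvert
    # writes the executed copy next to the original instead of overwriting it.
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", notebook_path]


if __name__ == "__main__":
    # Hypothetical path, used for illustration only.
    cmd = clean_run_command("path/to/exercise.ipynb")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

A non-zero exit status from the command means at least one cell raised an error, which includes a failing `assert`.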
- Clone the repo

      git clone <your-fork-or-classroom-url>.git
      cd WEEK1-PYTHON-FOR-DATA-ENGINEERING

- Create a project virtual environment (recommended: Python 3.11.3)
  - Linux / macOS

        python -m venv .venv
        source .venv/bin/activate

  - Windows (PowerShell)

        python -m venv .venv
        .\.venv\Scripts\Activate.ps1

- (Optional) Jupyter & data stack for local runs

      pip install notebook jupyterlab pandas numpy matplotlib seaborn

- Launch notebooks

      jupyter lab
      # or
      jupyter notebook

In VS Code: Open the folder → Command Palette → "Python: Select Interpreter" → pick .venv.
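To confirm that the setup steps worked, a short check like the following can be run in a terminal or a notebook cell. It is a minimal sketch: the helper names are illustrative, and the 3.11 target simply mirrors the recommended version above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def version_matches(major: int = 3, minor: int = 11) -> bool:
    """Return True when the running interpreter is the given major.minor version."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
    print("Python 3.11.x:", version_matches())
```

If the first check prints `False`, the environment is probably not activated, or VS Code or Jupyter is using a different interpreter than `.venv`.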
- Tooling: Bash, VS Code, Jupyter, Google Colab
- Version control: Git + GitHub (clone, branch, commit, PR)
- Python: Core syntax & flow, data structures, functions, error handling, iterators/generators, performance tips
- Pandas: Series/DataFrame basics, transforms, visualization
- 01_welcome/
  - welcome.md: course kickoff and expectations.
- 02_installation_setup/ [Time Allocation - 2nd half of Day 1]
  - setup_for_linux/: Linux install guides (Bash, Docker, Git, Python/pyenv, VS Code, PostgreSQL/pgAdmin, Jupyter).
  - setup_for_mac/: macOS install guides (Homebrew, Bash, Docker, Git, Python/pyenv, VS Code, PostgreSQL/pgAdmin, Jupyter).
  - setup_for_windows/: Windows install guides (Git Bash/WSL notes, Docker Desktop, Git, Python/pyenv-win, VS Code, PostgreSQL/pgAdmin, Jupyter).
  - vscode_venv/: how to use virtual environments with VS Code (Windows/macOS).
  - Use these if your machine isn't set up yet.
- 03_bash_jupyter_vscode_colab_intro/
  - bash.md: Bash intro + practice game (Bandit).
  - vscode.md: VS Code essentials for this course.
  - jupyter.md: Jupyter Notebook/Lab walkthrough.
  - colab.md: Using Google Colab.
- 04_git_github/
  - git_github_intro.md: class workflow (fork/clone, feature branches, commits, PRs, resolving simple conflicts).
- 05_python_practice/
  - python_basics/: notebooks + exercise folder.
    - Topics:
      - Numeric variable types, Strings, If/Elif/Else, Loops
      - Lists, Sets, Mutability, Dictionaries, Comprehensions
      - Functions (intro/definitions/calling/challenge)
    - Each student notebook has TODOs and assert tests.
  - python_intermediate/: notebooks + exercise folder.
    - Error handling, Iterators & Generators, Lambda/Map/Filter/Reduce, Performance.
  - python_advanced/: notebooks + exercise folder.
    - OOP introduction, Concurrency & Parallelism.
- 06_pandas_intro/
  - 01_pandas.ipynb: foundations (Series/DataFrame, indexing, I/O).
  - 02_pandas_practice_1.ipynb, 04_pandas_practice_2.ipynb, 05_pandas_practice_3.ipynb: progressively harder practice.
  - 03_pandas_visualization.ipynb: quick plotting.
  - data/: sample CSVs/parquet used by the notebooks. Don't move/rename.
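The student notebooks are described as pairing TODOs with assert tests. A hypothetical exercise cell in that style might look like the sketch below; the function name and test cases are made up for illustration and are not taken from the course notebooks.

```python
def count_vowels(text: str) -> int:
    """Return the number of vowels (a, e, i, o, u) in text, ignoring case."""
    # TODO (in a real exercise this body would be left for the student).
    return sum(1 for ch in text.lower() if ch in "aeiou")


# Assert-style tests: each line raises AssertionError if the result is wrong,
# so a cell that finishes silently means every check passed.
assert count_vowels("") == 0
assert count_vowels("rhythm") == 0
assert count_vowels("Data Engineering") == 7
print("All tests passed")
```

Because a failing `assert` stops the cell with an error, a full Restart Kernel → Run All pass is a quick way to confirm that every exercise in a notebook still passes.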