This project implements and compares two Reinforcement Learning techniques on the classic FrozenLake-v1 environment from Gymnasium:
- Q-Learning (Tabular Method)
  - Uses a Q-Table to store state-action values.
  - Includes training, validation, and visualization.
- Deep Q-Learning (DQN, Neural Network)
  - Uses a neural network to approximate Q-values.
  - Includes experience replay and a target network.
  - Optimized for performance and stability.
The goal is to train an agent to navigate the frozen lake safely, avoiding holes and reaching the goal.
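For context, here is a minimal sketch of how the environment can be created and stepped with Gymnasium. The 8x8 map and slippery setting are assumptions based on the saved model name; the project scripts may configure the environment differently.

```python
# Minimal sketch: creating and stepping the FrozenLake-v1 environment.
import gymnasium as gym

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
state, _ = env.reset()

# One random episode to illustrate the interaction loop.
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # 0=left, 1=down, 2=right, 3=up
    state, reward, terminated, truncated, _ = env.step(action)
env.close()
```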
FrozenLakeAlec-main/
├── frozen_lake_enhanced.py # Q-Learning baseline (sourced from johnnycode8/gym_solutions)
├── frozen_lake_q.py # Q-Learning (inspired but significantly modified)
├── frozen_lake_dql.py # Deep Q-Learning (inspired but significantly modified)
│
├── Model/
│ ├── frozen_lake8x8.pkl # Saved Q-Table (Pickle)
│ └── frozen_lake_dql_optimized.pt # Trained DQN (PyTorch)
│
├── docs/
│ ├── GraphiqueQTable/ # Graphs for Q-Learning
│ │ ├── precision_evolution.png
│ │ ├── exploration_vs_exploitation.png
│ │ ├── cumulative_rewards.png
│ │ └── q_table_final.png
│ ├── GraphiqueDQL/ # Graphs for DQN
│ │ └── frozen_lake_optimized.png
│ └── img/ # Images (board, sprites, environment)
│ ├── environment.jpeg
│ ├── elf_up.png / elf_down.png / elf_left.png / elf_right.png
│ ├── hole.png / cracked_hole.png
│ ├── ice.png / stool.png
│ └── goal.png
│
├── requirements.txt
├── setup_and_run.sh
├── .gitignore
├── License
└── README.md
- Q-Table initialized with small values.
- Rewards are shaped to accelerate learning:
- +100 for reaching the goal.
- -100 for falling into a hole.
- -1 penalty per step, +10 for good intermediate states.
- Tracks metrics:
- Accuracy per episode
- Exploration vs exploitation
- Cumulative rewards
- Model saved as Model/frozen_lake8x8.pkl
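The tabular update and the reward shaping described above can be sketched as follows. The learning rate, epsilon schedule, episode count, and the shaping helper are assumptions for illustration; the actual logic lives in frozen_lake_q.py.

```python
# Sketch of tabular Q-Learning with the shaped rewards from the README.
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n

q_table = np.full((n_states, n_actions), 0.01)   # small initial values
alpha, gamma, epsilon = 0.1, 0.95, 1.0           # assumed Q-Learning settings

def shaped_reward(reward, terminated, new_state):
    """Reward shaping from the README: +100 goal, -100 hole, -1 per step.
    The +10 bonus for 'good intermediate states' would use new_state and is
    left as a placeholder here."""
    if terminated and reward == 1.0:   # reached the goal
        return 100.0
    if terminated:                     # fell into a hole
        return -100.0
    return -1.0

for episode in range(15000):           # episode count is an assumption
    state, _ = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        if np.random.rand() < epsilon:                 # explore
            action = env.action_space.sample()
        else:                                          # exploit
            action = int(np.argmax(q_table[state]))
        new_state, reward, terminated, truncated, _ = env.step(action)
        r = shaped_reward(reward, terminated, new_state)
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        q_table[state, action] += alpha * (
            r + gamma * np.max(q_table[new_state]) - q_table[state, action]
        )
        state = new_state
    epsilon = max(epsilon - 1 / 15000, 0.05)           # linear decay (assumed)
env.close()
```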
- Neural network architecture:
- Input = one-hot encoded state
- 2 hidden layers (128 nodes each, ReLU)
- Output = Q-values per action
- Features:
- Experience Replay (ReplayMemory)
- Target Network updated every N steps
- Epsilon-greedy policy with decay
- Hyperparameters:
  - Learning rate: 0.001
  - Discount factor γ: 0.95
  - Replay memory: 10000
  - Mini-batch size: 32
- Model saved as Model/frozen_lake_dql_optimized.pt
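A sketch of the DQN components listed above: one-hot state encoding, two 128-unit hidden layers with ReLU, experience replay, and a target network. Class and function names, and the choice of Adam as optimizer, are assumptions; the project's implementation is in frozen_lake_dql.py.

```python
# Sketch of the DQN building blocks with the README's hyperparameters.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

class DQN(nn.Module):
    def __init__(self, n_states, n_actions, hidden=128):
        super().__init__()
        self.fc1 = nn.Linear(n_states, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, n_actions)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.out(x)              # one Q-value per action

def one_hot(state, n_states):
    """Encode a discrete FrozenLake state as a one-hot input vector."""
    x = torch.zeros(n_states)
    x[state] = 1.0
    return x

class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity=10000):
        self.memory = deque(maxlen=capacity)

    def push(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size=32):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)

# 64 states (8x8 map), 4 actions; target network is synced every N steps.
policy_net = DQN(n_states=64, n_actions=4)
target_net = DQN(n_states=64, n_actions=4)
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=0.001)
gamma = 0.95
```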
- precision_evolution.png → Accuracy evolution
- exploration_vs_exploitation.png → Exploration vs exploitation
- cumulative_rewards.png → Reward progression
- q_table_final.png → Final Q-Table (as a matrix)
- frozen_lake_optimized.png → Average reward and epsilon decay
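As an illustration, a graph like cumulative_rewards.png could be produced with matplotlib along these lines; the exact plotting code in the project scripts may differ.

```python
# Hypothetical helper for saving a reward-progression plot.
import matplotlib.pyplot as plt

def save_reward_curve(rewards_per_episode,
                      path="docs/GraphiqueQTable/cumulative_rewards.png"):
    plt.figure()
    plt.plot(rewards_per_episode)
    plt.xlabel("Episode")
    plt.ylabel("Cumulative reward")
    plt.title("Reward progression")
    plt.savefig(path)
    plt.close()
```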
bash setup_and_run.sh

This script will:
- Create a virtual environment (.venv)
- Install all dependencies (requirements.txt)
- Install PyTorch properly (Mac/Linux CPU support included)
- Run a validation script
- Q-Learning (Enhanced baseline): python frozen_lake_enhanced.py
- Q-Learning (Modified): python frozen_lake_q.py
- Deep Q-Learning: python frozen_lake_dql.py
Validation is built into both frozen_lake_q.py and frozen_lake_dql.py (training + testing phases).
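A greedy validation pass over the trained Q-Table could look like the sketch below. The episode count, the success criterion, and the assumption that the pickle holds the raw Q-Table array are illustrative; both project scripts embed their own testing phase.

```python
# Sketch: evaluate the saved Q-Table with a purely greedy policy.
import pickle
import numpy as np
import gymnasium as gym

with open("Model/frozen_lake8x8.pkl", "rb") as f:
    q_table = pickle.load(f)

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
episodes, successes = 100, 0
for _ in range(episodes):
    state, _ = env.reset()
    terminated = truncated = False
    while not (terminated or truncated):
        action = int(np.argmax(q_table[state]))   # exploit only, no exploration
        state, reward, terminated, truncated, _ = env.step(action)
    successes += int(reward == 1.0)               # reached the goal
env.close()
print(f"Success rate: {successes / episodes:.0%}")
```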
- Direct comparison of tabular Q-Learning vs neural DQN.
- Clear visualization of the exploration-exploitation trade-off.
- Importance of reward shaping for guiding the agent.
- Demonstrates transition from classical RL to Deep RL.
Alec Waumans
Industrial Computer Science Student
This project is licensed under the MIT License.
- frozen_lake_enhanced.py was sourced from johnnycode8/gym_solutions.
- frozen_lake_q.py and frozen_lake_dql.py were inspired by the same project but heavily modified to extend functionality, add reward shaping, improve validation, and generate detailed graphs.
- All additional project structure, documentation, and improvements by Alec Waumans.
