This project explores Reinforcement Learning on FrozenLake using two approaches: Tabular Q-Learning and Deep Q-Learning with neural networks. It compares classical vs deep RL, includes training, validation, and visualizations, and highlights reward shaping, exploration-exploitation trade-offs, and model performance.

❄️ FrozenLake Reinforcement Learning Project

[Environment image: docs/img/environment.jpeg]

Python · PyTorch · License: MIT


Overview

This project implements and compares two Reinforcement Learning techniques on the classic FrozenLake-v1 environment from Gymnasium:

  1. Q-Learning (Tabular Method)

    • Uses a Q-Table to store state-action values.
    • Includes training, validation, and visualization.
  2. Deep Q-Learning (DQN, Neural Network)

    • Uses a neural network to approximate Q-values.
    • Includes experience replay and a target network.
    • Optimized for performance and stability.

The aim is to train an agent that navigates the frozen lake safely, avoiding the holes and reaching the goal tile.


Project Structure

FrozenLakeAlec-main/
├── frozen_lake_enhanced.py   # Q-Learning baseline (sourced from johnnycode8/gym_solutions)
├── frozen_lake_q.py          # Q-Learning (inspired but significantly modified)
├── frozen_lake_dql.py        # Deep Q-Learning (inspired but significantly modified)
│
├── Model/
│   ├── frozen_lake8x8.pkl        # Saved Q-Table (Pickle)
│   └── frozen_lake_dql_optimized.pt  # Trained DQN (PyTorch)
│
├── docs/
│   ├── GraphiqueQTable/          # Graphs for Q-Learning
│   │   ├── precision_evolution.png
│   │   ├── exploration_vs_exploitation.png
│   │   ├── cumulative_rewards.png
│   │   └── q_table_final.png
│   ├── GraphiqueDQL/             # Graphs for DQN
│   │   └── frozen_lake_optimized.png
│   └── img/                      # Images (board, sprites, environment)
│       ├── environment.jpeg
│       ├── elf_up.png / elf_down.png / elf_left.png / elf_right.png
│       ├── hole.png / cracked_hole.png
│       ├── ice.png / stool.png
│       └── goal.png
│
├── requirements.txt
├── setup_and_run.sh
├── .gitignore
├── License
└── README.md

Techniques Used

1. Q-Learning (Tabular)

  • Q-Table initialized with small values.
  • Rewards are shaped to accelerate learning (see the sketch after this list):
    • +100 for reaching the goal.
    • -100 for falling into a hole.
    • -1 penalty per step, +10 for good intermediate states.
  • Tracks metrics:
    • Accuracy per episode
    • Exploration vs exploitation
    • Cumulative rewards
  • Model saved as: Model/frozen_lake8x8.pkl
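
For reference, here is a minimal, self-contained sketch of the tabular update with the shaped rewards listed above. The learning rate, episode count, and decay schedule are illustrative assumptions rather than the project's exact values, and the +10 intermediate-state bonus is omitted for brevity:

import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n
q = np.random.uniform(0.0, 0.01, size=(n_states, n_actions))  # small initial values
alpha, gamma, epsilon = 0.1, 0.95, 1.0  # alpha and the schedule below are illustrative

for episode in range(15000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy: explore with probability epsilon, otherwise exploit
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # shaped reward: +100 goal, -100 hole, -1 per step
        # (the project also gives +10 for good intermediate states, omitted here)
        if terminated and reward == 1.0:
            shaped = 100.0
        elif terminated:
            shaped = -100.0
        else:
            shaped = -1.0
        # tabular Q-learning update
        q[state, action] += alpha * (shaped + gamma * np.max(q[next_state]) - q[state, action])
        state = next_state
    epsilon = max(epsilon * 0.9995, 0.05)  # decaying exploration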

2. Deep Q-Learning (DQN)

  • Neural network architecture:
    • Input = one-hot encoded state
    • 2 hidden layers (128 nodes each, ReLU)
    • Output = Q-values per action
  • Features (see the PyTorch sketch after this list):
    • Experience Replay (ReplayMemory)
    • Target Network updated every N steps
    • Epsilon-greedy policy with decay
  • Hyperparameters:
    • Learning rate: 0.001
    • Discount factor γ: 0.95
    • Replay memory: 10000
    • Mini-batch size: 32
  • Model saved as: Model/frozen_lake_dql_optimized.pt
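
For reference, a minimal PyTorch sketch matching the architecture and hyperparameters listed above. The class and helper names (DQN, one_hot, optimize) are illustrative, not the project's, and the replay buffer stores pre-built tensors to keep the example short:

import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

class DQN(nn.Module):
    # one-hot state in -> one Q-value per action out
    def __init__(self, n_states=64, n_actions=4, hidden=128):
        super().__init__()
        self.fc1 = nn.Linear(n_states, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, n_actions)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.out(x)

def one_hot(state, n_states=64):
    # encode a discrete state index as the network's input vector
    v = torch.zeros(n_states)
    v[state] = 1.0
    return v

policy_net = DQN()                                    # 8x8 map: 64 states, 4 actions
target_net = DQN()
target_net.load_state_dict(policy_net.state_dict())  # start with target in sync
optimizer = torch.optim.Adam(policy_net.parameters(), lr=0.001)
memory = deque(maxlen=10000)                          # experience replay buffer
gamma = 0.95

def optimize(batch_size=32):
    # one gradient step on a random mini-batch of (s, a, r, s2, done) tensors
    if len(memory) < batch_size:
        return
    s, a, r, s2, done = map(torch.stack, zip(*random.sample(memory, batch_size)))
    q = policy_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():  # frozen target network stabilizes the TD targets
        target = r + gamma * target_net(s2).max(1).values * (1 - done)
    loss = F.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()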

Generated Results

Q-Learning Graphs (docs/GraphiqueQTable/)

  • precision_evolution.png → Accuracy evolution
  • exploration_vs_exploitation.png → Exploration vs Exploitation
  • cumulative_rewards.png → Reward progression
  • q_table_final.png → Final Q-Table (as matrix)

DQN Graphs (docs/GraphiqueDQL/)

  • frozen_lake_optimized.png → Average reward and epsilon decay

Usage

1. Setup environment

bash setup_and_run.sh

This script will:

  • Create a virtual environment .venv
  • Install all dependencies (requirements.txt)
  • Install a suitable PyTorch build (CPU support for macOS/Linux included)
  • Run a validation script
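
If you prefer a manual setup, the steps below are a rough equivalent (an assumption based on the description above; the actual script may do more, such as selecting a platform-specific PyTorch build):

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt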

2. Training

  • Q-Learning (Enhanced baseline):
    python frozen_lake_enhanced.py
  • Q-Learning (Modified):
    python frozen_lake_q.py
  • Deep Q-Learning:
    python frozen_lake_dql.py

3. Validation

Validation is built into both frozen_lake_q.py and frozen_lake_dql.py: each script runs a training phase followed by a testing phase.
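
To illustrate what such a testing phase looks like, here is a sketch of a greedy evaluation loop for the tabular model. It assumes the pickle file stores the Q-table as a NumPy array; the actual scripts may differ:

import pickle
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", map_name="8x8", is_slippery=True)
with open("Model/frozen_lake8x8.pkl", "rb") as f:
    q = pickle.load(f)  # assumes the pickle holds the Q-table array

wins = 0
for _ in range(100):
    state, _ = env.reset()
    done = False
    while not done:
        action = int(np.argmax(q[state]))  # greedy policy: no exploration at test time
        state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
    wins += int(reward == 1.0)  # base env reward is 1.0 only on the goal tile
print(f"Success rate over 100 episodes: {wins}%")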


Key Learnings

  • Direct comparison of tabular Q-Learning vs neural DQN.
  • Clear visualization of the exploration-exploitation trade-off.
  • Importance of reward shaping for guiding the agent.
  • A hands-on demonstration of the transition from classical RL to deep RL.

Author

Alec Waumans
Industrial Computer Science Student


License

This project is licensed under the MIT License.


Credits & Inspirations

  • frozen_lake_enhanced.py sourced from johnnycode8/gym_solutions.
  • frozen_lake_q.py and frozen_lake_dql.py were inspired by the same project but heavily modified to extend functionality, add reward shaping, improve validation, and generate detailed graphs.
  • All additional project structure, documentation, and improvements by Alec Waumans.
