🤖 MMHD-project - Multimodality Human Detection

MMHD (Multimodality Human Detection) is a research-oriented project developed for the Robotics and Machine Learning course at the University of Coimbra.
The goal is to explore and compare the performance of YOLOv8 models for human detection using RGB, thermal, and depth image modalities, both individually and in a fusion-based setup.

🎓🏆 Approved with the Highest Honors!


🎯 Objectives

  • ✅ Train YOLOv8 models for RGB, thermal, and depth images
  • ✅ Evaluate and compare performances across modalities
  • ✅ Perform inference and visualize predictions
  • ✅ Implement and assess late fusion strategies (e.g., NMS, Voting)
  • ✅ Prepare for possible sensor fusion in real-world applications
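The two late-fusion strategies named above can be sketched in plain Python. This is a minimal illustration, assuming each modality yields `(x1, y1, x2, y2, score)` boxes; the box format, IoU threshold, and vote count are assumptions, not the project's actual implementation:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def nms_fusion(detections, iou_thr=0.5):
    """NMS fusion: pool detections from all modalities, keep the
    highest-score boxes, suppress overlapping lower-score ones."""
    pooled = sorted((d for dets in detections for d in dets),
                    key=lambda d: d[4], reverse=True)
    kept = []
    for d in pooled:
        if all(iou(d[:4], k[:4]) < iou_thr for k in kept):
            kept.append(d)
    return kept

def voting_fusion(detections, iou_thr=0.5, min_votes=2):
    """Voting fusion: keep a fused box only if at least `min_votes`
    modalities produced an overlapping detection."""
    out = []
    for d in nms_fusion(detections, iou_thr):
        votes = sum(any(iou(d[:4], o[:4]) >= iou_thr for o in dets)
                    for dets in detections)
        if votes >= min_votes:
            out.append(d)
    return out
```

With three per-modality lists (e.g. `voting_fusion([rgb_boxes, thermal_boxes, depth_boxes])`), voting discards detections seen by only one sensor, while NMS keeps every non-overlapping detection from any sensor.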

🛠️ Model Details

| Modality | Model Used | Input Size | Dataset               |
|----------|------------|------------|-----------------------|
| RGB      | YOLOv8s    | 640×512    | MID-3K RGB subset     |
| Thermal  | YOLOv8s    | 640×512    | MID-3K Thermal subset |
| Depth    | YOLOv8s    | 640×512    | MID-3K Depth subset   |

📁 Project Structure

MMHD-project/
│
├── README.md                                   # 📌 This file!
│
├── AAU-VAP-trimodal-dataset/                   # 🗂️ Trimodal dataset with aligned RGB, thermal, and depth images
├── MID-3K/                                     # 🗂️ Base dataset
│   └── dataset/
│       ├── rgb/                                # 🌈 RGB images and labels (split & full)
│       ├── thermal/                            # 🌡️ Thermal images and labels (split & full)
│       └── depth/                              # 🕳️ Depth images and labels (split & full)
│
├── Project/                                    # 💻 Scripts and project-related files
│   ├── RML-2025-report-FelipeTassariAveiro.pdf # 📝 Final report!
│   ├── dataset-separator.py                    # ✂️ Python script to split the dataset
│   ├── RML-project-MMHD.py                     # 🏋️ Python script to train the models
│   ├── label-identifier.py                     # 🏷️ Python script to organize the AAU VAP Trimodal dataset labels
│   ├── Depth.png                               # 📊 Normalized confusion matrices for the depth modality
│   ├── RGB.png                                 # 📊 Normalized confusion matrices for the RGB modality
│   ├── Thermal.png                             # 📊 Normalized confusion matrices for the thermal modality
│   └── README-MultimodalISRDataset.md          # 📘 README describing the MID-3K dataset
│
├── RML-project-MMHD/                           # 📁 Training outputs for each modality
│   ├── rgb/                                    # 🌈 YOLO training results for the RGB modality (metrics, predictions, weights)
│   ├── thermal/                                # 🌡️ YOLO training results for the thermal modality (metrics, predictions, weights)
│   ├── depth/                                  # 🕳️ YOLO training results for the depth modality (metrics, predictions, weights)
│   ├── test_eval_rgb/                          # 🧪 Test-set metrics and plots for the RGB-trained model
│   ├── test_eval_AAU_rgb/                      # 🧪 Test-set metrics and plots for the RGB-trained model (AAU VAP Trimodal dataset)
│   ├── test_eval_thermal/                      # 🧪 Test-set metrics and plots for the thermal-trained model
│   ├── test_eval_AAU_thermal/                  # 🧪 Test-set metrics and plots for the thermal-trained model (AAU VAP Trimodal dataset)
│   ├── test_eval_depth/                        # 🧪 Test-set metrics and plots for the depth-trained model
│   └── test_eval_AAU_depth/                    # 🧪 Test-set metrics and plots for the depth-trained model (AAU VAP Trimodal dataset)
│
├── predictions/                                # 🧠 Inference results
│   └── late_fusion/                            # 🤖 Late-fusion inference results
│       ├── comparison/                         # 🧩 Prediction comparison between Voting and NMS
│       ├── eval_summary/                       # 📈 Computed metrics
│       ├── nms/                                # 🧮 NMS predictions
│       └── voting/                             # ⚖️ Voting predictions
│
├── rgb.yaml                                    # ⚙️ Configuration file for training with RGB images
├── thermal.yaml                                # ⚙️ Configuration file for training with thermal images
├── depth.yaml                                  # ⚙️ Configuration file for training with depth images
│
├── yolo11n.pt                                  # 🤖 Pretrained YOLO11n model weights
├── yolov8s.pt                                  # 🤖 Pretrained YOLOv8s model weights
│
└── MMHD_project.ipynb                          # 👨‍💻 Colab notebook
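The per-modality configuration files listed above (`rgb.yaml`, `thermal.yaml`, `depth.yaml`) follow the standard Ultralytics data-YAML layout. A minimal sketch for the RGB modality — the paths and class list are assumptions for illustration, not copied from the repository:

```yaml
# Hypothetical sketch of rgb.yaml (paths and split names are assumptions)
path: MID-3K/dataset/rgb   # dataset root
train: images/train        # training images, relative to `path`
val: images/val            # validation images
test: images/test          # test images
names:
  0: person                # single-class human detection
```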

🧪 Setup & Requirements

  • ✅ Google Colab environment
  • ✅ Ultralytics YOLOv8
  • ✅ Dataset preprocessing and splits handled automatically
  • ✅ Custom YAML configurations for each modality
  • ✅ Compatible with fusion-based inference
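The automatic dataset split mentioned above can be sketched as follows. The 70/20/10 ratios and fixed seed are assumptions for illustration; the repository's `dataset-separator.py` is the actual implementation:

```python
import random

def split_dataset(image_paths, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle image paths reproducibly and split them into
    train/val/test subsets according to `ratios`."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # fixed seed -> reproducible split
    n_train = int(len(paths) * ratios[0])
    n_val = int(len(paths) * ratios[1])
    return {
        "train": paths[:n_train],
        "val": paths[n_train:n_train + n_val],
        "test": paths[n_train + n_val:],
    }
```

Reusing the same seed for every modality keeps the RGB, thermal, and depth splits aligned image-for-image, which matters when the fused predictions are compared later.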

👨‍💻 Felipe Tassari Aveiro

Master's in Mechanical Engineering, Robotics and Machine Learning (2025)

🎓🏆 Approved with the Highest Honors

University of Coimbra


