Computer Vision project made with Python, YOLO and OpenCV to recognize dice, count them, and sum their values in real time using a webcam or video as input.
python3 -m venv .venv
source .venv/bin/activate # On Windows use: .venv\Scripts\activate
Note: Before installing the dependencies, if you want to use CUDA for better performance, first install the CUDA builds of torch and torchvision that match your system:
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126
The example above uses cu126 (CUDA 12.6). However, you must ensure that:
- Your system has a compatible NVIDIA GPU.
- You have an NVIDIA driver installed that supports the chosen CUDA version.
If you're using an older GPU or have a lower CUDA version installed (e.g., CUDA 11.8), use the matching packages:
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu118
For more options and compatibility information, check the official PyTorch installation guide.
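After installing torch, it can be useful to confirm that the CUDA build was actually picked up. The following is a minimal sketch, not part of this repository: the function name `describe_torch_device` is hypothetical, and it only relies on the standard `torch.cuda.is_available()`, `torch.cuda.get_device_name()` and `torch.version.cuda` calls.

```python
import importlib.util


def describe_torch_device() -> str:
    """Return a short message about which device PyTorch would use.

    Hypothetical helper for illustration; it is not part of main.py.
    """
    # Avoid an ImportError if torch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the installed wheel was built for.
        name = torch.cuda.get_device_name(0)
        return f"CUDA {torch.version.cuda} is available on {name}"
    return "torch is installed, but CUDA is not available (the CPU will be used)"


if __name__ == "__main__":
    print(describe_torch_device())
```

If the message reports that CUDA is not available even though a CUDA wheel was installed, the driver or the chosen index URL (cu126, cu118, ...) is the first thing to check.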
pip3 install -r requirements.txt
The pretrained model is already included in this repository at:
runs/detect/train/weights/best.pt
You can run the application directly:
python3 main.py
The dataset is already included in this repository at:
datasets/dices
To train the model you can use the YOLO CLI:
yolo task=detect mode=train model=yolov8n.pt data=datasets/dices/data.yaml epochs=50 plots=True
Note: You can customize the training process by modifying options such as:
- model: YOLO model to use (e.g., yolov8n.pt, yolov8s.pt, etc.).
- epochs: Maximum number of training epochs.
- patience: Stop early if no improvement after this many epochs.
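To show how these options fit into the command, here is a small sketch that assembles the YOLO CLI arguments in Python. The helper `build_train_command` is hypothetical, and the values used in the example (yolov8s.pt, 100 epochs, patience of 10) are illustrative assumptions, not recommendations from this project.

```python
import shlex
from typing import List, Optional


def build_train_command(
    model: str = "yolov8n.pt",
    data: str = "datasets/dices/data.yaml",
    epochs: int = 50,
    patience: Optional[int] = None,
    plots: bool = True,
) -> List[str]:
    """Build the argument list for the YOLO CLI training command.

    Hypothetical helper; the defaults mirror the command shown above.
    """
    args = [
        "yolo",
        "task=detect",
        "mode=train",
        f"model={model}",
        f"data={data}",
        f"epochs={epochs}",
    ]
    # Only pass patience when the caller sets it, so the CLI default applies otherwise.
    if patience is not None:
        args.append(f"patience={patience}")
    args.append(f"plots={plots}")
    return args


if __name__ == "__main__":
    cmd = build_train_command(model="yolov8s.pt", epochs=100, patience=10)
    # Print the command; it could be launched with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Printing the command before launching it makes it easy to copy into a terminal or to compare runs with different options.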
Once training is complete, the best-trained model will be stored at:
runs/detect/train2/weights/best.pt
You can use this model by modifying main.py:
model = YOLO("runs/detect/train2/weights/best.pt")
Note: If you train multiple times, a new training folder is created for each run (e.g., train2, train3, etc.), so you can choose the best model from any of them by changing the path in main.py.
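When several training folders exist, locating the newest weights by hand gets tedious. The sketch below assumes the `runs/detect/train*/weights/best.pt` layout described above; the helper `latest_best_weights` is hypothetical and picks the most recently modified file, which is not necessarily the best-performing model.

```python
from pathlib import Path
from typing import Optional


def latest_best_weights(runs_dir: str = "runs/detect") -> Optional[Path]:
    """Return the most recently modified best.pt under runs_dir, or None.

    Hypothetical helper; "latest" means newest on disk, not highest accuracy.
    """
    candidates = list(Path(runs_dir).glob("train*/weights/best.pt"))
    if not candidates:
        return None
    # Compare by modification time so train10 is not confused with train2.
    return max(candidates, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    weights = latest_best_weights()
    print(weights if weights is not None else "No trained weights found")
```

The returned path could then be passed to `YOLO(...)` in main.py instead of a hard-coded string.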
Simply run:
python3 main.py
The script will detect dice, count them, and display the total sum in real time.
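The counting and summing step can be illustrated in isolation. This is a sketch under a stated assumption: that each detected class label is the face value of a die written as a string ("1" to "6"). The actual class names in datasets/dices/data.yaml and the logic in main.py may differ, and `count_and_sum` is a hypothetical name.

```python
from typing import Iterable, Tuple


def count_and_sum(labels: Iterable[str]) -> Tuple[int, int]:
    """Return (number of dice, sum of face values) for the detected labels.

    Assumes each label is a face value such as "1" ... "6" (an assumption,
    not confirmed by the dataset configuration).
    """
    values = [int(label) for label in labels]
    return len(values), sum(values)


if __name__ == "__main__":
    # Three detected dice showing 3, 5 and 6.
    print(count_and_sum(["3", "5", "6"]))  # (3, 14)
```

With Ultralytics, the labels for one frame could be collected from a result object with something like `[result.names[int(c)] for c in result.boxes.cls]` and then passed to this function.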
.
├── datasets/                  # Contains the datasets for training
│   └── dices/
│       ├── data.yaml          # Dataset configuration file
│       ├── test/              # Test dataset
│       ├── train/             # Training dataset
│       └── valid/             # Validation dataset
├── main.py                    # Runs the dice recognition model
├── requirements.txt           # Project dependencies
├── runs/                      # Training outputs
│   └── detect/
│       └── train/             # First training session
│           └── weights/
│               └── best.pt    # Best-trained model
└── yolov8n.pt                 # Base YOLO model for training

This project is licensed under the MIT License.