Towards Smarter Player Scouting: Learning Football Player Embeddings with Variational Autoencoders (VAEs)
This repository contains all the code necessary to replicate the experiments from the paper: Towards Smarter Player Scouting: Learning Football Player Embeddings with Variational Autoencoders (VAEs), by Giulio Fantuzzi, Leonardo Egidi and Nicola Torelli, accepted at IES 2025 – Statistical Methods for Evaluation and Quality, the 12th Scientific Meeting of the Statistics for the Evaluation and Quality of Services Group of the Italian Statistical Society (SVQS - SIS).
The paper is available both in this repository and in the Book of the Conference.
To get started, first clone the repository (you may also want to fork it):
git clone https://github.com/giuliofantuzzi/ies2025.git

I recommend creating a virtual environment to avoid conflicts with your system's Python packages:
python -m virtualenv path/to/env
source path/to/env/bin/activate
pip install --upgrade pip

Once the environment is activated, install the required dependencies specified in pyproject.toml:
cd ies2025/
pip install .

This repository is organized as follows:
data/ - contains the datasets used for training and evaluation, along with a data card detailing sources, preprocessing steps, and variable descriptions.
models/ - contains the Python implementation of the VAE model and the VAE loss.
scraping/ - contains the code used to retrieve the data from the web.
data_processing.ipynb - a Jupyter notebook detailing all the data preprocessing steps.
training.py - a Python script to train the VAE model. To train the model with the paper configuration, run:
python training.py --DataPath path/to/data --CheckpointsPath path/to/weights.pt
Different configurations can be specified directly from the command line. To see all the available options, run:
python training.py --help
checkpoints/ - contains the weights of the trained model (stored as a vae.pth file).
experiments.ipynb - a Python notebook to reproduce the experimental results from Section 4 of the paper.
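
For readers unfamiliar with the "VAE loss" mentioned under models/, the following is a minimal, framework-free sketch of what such a loss typically computes. It is not the code from this repository. It assumes a Gaussian encoder with diagonal covariance, a standard normal prior, and a squared-error reconstruction term. The function names gaussian_kl and vae_loss and the beta weight are hypothetical and chosen only for illustration; the actual implementation in models/ may differ.

```python
import math


def gaussian_kl(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one latent vector.

    mu and logvar are equal-length sequences of floats.
    """
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


def vae_loss(x, x_hat, mu, logvar, beta=1.0):
    """Illustrative VAE loss: squared-error reconstruction plus beta-weighted KL.

    x      -- original feature vector (e.g. one player's statistics)
    x_hat  -- reconstruction produced by the decoder
    mu     -- latent mean produced by the encoder
    logvar -- latent log-variance produced by the encoder
    beta   -- weight on the KL term (assumed; beta=1.0 gives the standard VAE objective)
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * gaussian_kl(mu, logvar)
```

The KL term is zero when the encoder outputs exactly the prior (mu = 0, logvar = 0) and grows as the latent code moves away from it, which encourages a smooth embedding space.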