This project, Modeling Football Player Value, aims to statistically estimate the value of football players from their on-field performance. The goal is to build a reproducible data science pipeline that gathers player data from open data sources, web scraping, and APIs, then compares the estimated values with reference values from Transfermarkt.
This project was developed as part of the programming course for second-year students at ENSAE Paris.
- Data Collection and Enrichment:
  - Use of open datasets, scraping, and APIs to gather detailed player performance metrics.
  - Cleaning, transforming, and enriching data for analysis.
- Data Visualization:
  - Creation of insightful plots and dashboards to explore trends and relationships.
- Player Value Estimation Models:
  - Implementation of various machine learning algorithms to predict player market values.
  - Comparison of model outputs with reference values (Transfermarkt, CIES).
- Reproducible Research:
  - Complete pipeline, from data acquisition to analysis, documented and shared on GitHub.
  - Unit tests to ensure the reliability of core functions.
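As an illustration of the comparison step listed above, here is a minimal sketch of how predicted values could be scored against reference values. It is not the project's actual evaluation code: the function names `log_errors` and `mean_absolute_log_error` are hypothetical, the numbers are made up, and the choice of a log-scale error is an assumption (market values tend to be heavily right-skewed, so a log ratio keeps a few star players from dominating the score).

```python
import math


def log_errors(predicted, reference):
    """Return the per-player log-ratio errors log(predicted / reference).

    Hypothetical helper for illustration; both inputs are sequences of
    strictly positive market values in the same unit.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one player is required")
    return [math.log(p / r) for p, r in zip(predicted, reference)]


def mean_absolute_log_error(predicted, reference):
    """Average absolute log-ratio error across players (0 means a perfect match)."""
    errors = log_errors(predicted, reference)
    return sum(abs(e) for e in errors) / len(errors)


# Made-up values in millions of euros, for illustration only.
predicted = [10.0, 40.0, 5.0]   # hypothetical model outputs
reference = [10.0, 20.0, 10.0]  # hypothetical reference values

score = mean_absolute_log_error(predicted, reference)
print(f"Mean absolute log error: {score:.3f}")
```

With this metric, overestimating a player by a factor of two and underestimating another by a factor of two are penalized equally, regardless of how expensive each player is.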
Here is the structure of this repository:
ModelingFootballValue/
│
├── Api/ # Scripts for API interactions
├── scraping_data/ # Scripts and data related to web scraping
├── using_data/ # Notebooks and scripts for data analysis and modeling
├── .gitignore # Git ignore file
├── README.md # Project documentation
├── rapport.ipynb # Jupyter Notebook containing the project report
└── stadiums_map.html # HTML file visualizing stadium locations
To get started with this project, follow these steps:
- Clone this repository
  Clone the repository to your local machine using the following command:
  git clone https://github.com/romandb21/ModelingFootballValue.git
- Install the required packages
  Navigate to the project directory and install the necessary Python packages.
- Run the project
  Open the Jupyter notebook and enjoy!
This project was inspired by the intersection of data science and sports analytics. Special thanks to the platforms and datasets used (e.g., FBref, Transfermarkt, Football-Data.org) for providing open data to support this work.