A network science analysis of UFC fighter interactions, rivalries, and performance metrics using graph theory and machine learning.
This project studies UFC fight data through the lens of network science, constructing and analyzing fighter networks to understand:
- Fighter rivalries and competition patterns (undirected rivalry graph)
- Win-loss dominance hierarchies (directed dominance graph)
- Network metrics and their relationship to fighter performance
- Community structure within the fighter ecosystem
- Temporal evolution of the fighter network
- Comparison with theoretical null models: Erdős-Rényi (ER), Barabási-Albert (BA), and Watts-Strogatz (WS)
NetworkScienceProject/
├── data/
│ ├── data_raw/ # Raw UFC dataset (ufc-master.csv)
│ └── data_processed/ # Processed data, graphs, and metrics
├── src/ufcnet/ # Main Python package
│ ├── config.py # Project paths and configuration
│ ├── utils.py # Logging and utility functions
│ ├── data_loading.py # Data loading and parsing
│ ├── cleaning.py # Data cleaning and preprocessing
│ ├── build_graphs.py # Network construction
│ ├── metrics_static.py # Network metrics and centralities
│ ├── performance_models.py # Fighter performance analysis
│ ├── null_models.py # Temporal and null model analysis
│ └── scripts/ # CLI entry points
│ ├── preprocess_data.py
│ ├── build_networks.py
│ ├── compute_metrics.py
│ ├── analyze_performance.py
│ └── run_null_models.py
├── tests/ # Unit tests
├── figures/ # Generated plots and visualizations
├── reports/ # Analysis reports and model results
├── slides/ # Presentation materials
└── requirements.txt # Python dependencies
- Python 3.8+
- pip or conda for package management
- Clone the repository:
git clone <repository-url>
cd NetworkScienceProject
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Download the UFC dataset: The dataset will be downloaded from Kaggle when you run the preprocessing script.
The project is organized into 5 main phases, each with a corresponding script:
Loads raw UFC data, cleans columns, creates fighter IDs, and generates structured datasets.
python src/ufcnet/scripts/preprocess_data.py \
--input data_raw/ufc-master.csv \
--output-dir data_processed/ \
--seed 42
Outputs:
- data_processed/ufc_clean.csv - Cleaned full dataset
- data_processed/fights.csv - Fight-level table
- data_processed/fighters_lookup.csv - Fighter ID mappings
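As a rough illustration of the fighter-ID step, the sketch below assigns integer IDs to normalized fighter names and derives a fight-level table. It is not the project's actual code: the column names (`RedFighter`, `BlueFighter`, `Winner`, `Date`), the helper names, and the sample rows are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical raw rows; the real column names in ufc-master.csv may differ.
raw_rows = [
    {"RedFighter": "Alice Ames ", "BlueFighter": "Bea Boone", "Winner": "Red", "Date": "2020-01-18"},
    {"RedFighter": "bea boone", "BlueFighter": "Cara Cole", "Winner": "Blue", "Date": "2021-03-06"},
    {"RedFighter": "Alice Ames", "BlueFighter": "Cara  Cole", "Winner": "Red", "Date": "2022-07-30"},
]


def normalize(name: str) -> str:
    """Collapse whitespace and case so spelling variants share one key."""
    return " ".join(name.split()).lower()


def build_tables(rows: List[dict]):
    """Return (lookup, fights): name -> ID mapping and a fight-level table."""
    lookup: Dict[str, int] = {}
    fights: List[dict] = []
    for row in rows:
        # setdefault assigns the next free integer ID on first sight of a name.
        red = lookup.setdefault(normalize(row["RedFighter"]), len(lookup))
        blue = lookup.setdefault(normalize(row["BlueFighter"]), len(lookup))
        if row["Winner"] not in ("Red", "Blue"):
            continue  # skip draws / no-contests in this simplified sketch
        winner, loser = (red, blue) if row["Winner"] == "Red" else (blue, red)
        fights.append({"date": row["Date"], "winner_id": winner, "loser_id": loser})
    return lookup, fights


lookup, fights = build_tables(raw_rows)
```

Normalizing names before assigning IDs keeps one fighter from being split across several nodes later on; a real pipeline would likely need more careful handling of name collisions.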
Builds rivalry (undirected) and dominance (directed) graphs from fight data.
python src/ufcnet/scripts/build_networks.py \
--input data_processed/fights.csv \
--output-dir data_processed/ \
--seed 42
Outputs:
- data_processed/rivalry.gexf / rivalry.gpickle - Rivalry graph
- data_processed/dominance.gexf / dominance.gpickle - Dominance graph
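The two graph types can be sketched with NetworkX as follows. This is a minimal example under assumptions, not the contents of `build_graphs.py`: the `(winner_id, loser_id)` input format, the `weight` edge attribute, and the `build_graphs` helper are hypothetical.

```python
import networkx as nx

# Hypothetical fight-level records: (winner_id, loser_id).
fights = [(0, 1), (2, 1), (0, 2), (0, 1)]


def build_graphs(fights):
    """Build an undirected rivalry graph and a directed dominance graph."""
    rivalry = nx.Graph()      # edge = the two fighters have met
    dominance = nx.DiGraph()  # edge winner -> loser
    for winner, loser in fights:
        for graph in (rivalry, dominance):
            if graph.has_edge(winner, loser):
                # Repeat meetings (or repeat wins) accumulate as edge weight.
                graph[winner][loser]["weight"] += 1
            else:
                graph.add_edge(winner, loser, weight=1)
    return rivalry, dominance


rivalry, dominance = build_graphs(fights)
```

In the rivalry graph, direction is ignored, so a rematch only increases the weight of the existing edge. In the dominance graph, `has_edge(winner, loser)` is direction-sensitive, so wins in each direction are counted separately. Note that `nx.write_gpickle` was removed in NetworkX 3.0; on recent versions the standard `pickle` module can serialize graph objects directly, while `nx.write_gexf` remains available.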
Computes global statistics, node centralities, and community structure.
python src/ufcnet/scripts/compute_metrics.py \
--rivalry-graph data_processed/rivalry.gpickle \
--dominance-graph data_processed/dominance.gpickle \
--output-dir data_processed/ \
--seed 42
Outputs:
- data_processed/global_stats.json - Network-level statistics
- data_processed/centralities.csv - Node centrality measures
- data_processed/communities.csv - Community assignments
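To make the three kinds of output concrete, here is a small sketch on a toy graph. The specific measures (density, average clustering, degree and betweenness centrality) and the greedy modularity algorithm are illustrative choices; the README does not say which ones `metrics_static.py` actually uses.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy rivalry graph: two triangles joined by a single bridge edge (2-3).
G = nx.Graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])

# Network-level statistics (one value per graph).
global_stats = {
    "n_nodes": G.number_of_nodes(),
    "n_edges": G.number_of_edges(),
    "density": nx.density(G),
    "avg_clustering": nx.average_clustering(G),
}

# Node-level centralities (one value per fighter).
centralities = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
}

# Community detection; each community is a frozenset of node IDs.
communities = greedy_modularity_communities(G)
community_of = {
    node: cid for cid, members in enumerate(communities) for node in members
}
```

On this toy graph the bridge endpoints (nodes 2 and 3) carry the highest betweenness, and each triangle forms its own community, which is the kind of structure weight classes might plausibly produce in the real fighter network.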
Analyzes fighter performance and trains predictive models using network features.
python src/ufcnet/scripts/analyze_performance.py \
--fights data_processed/fights.csv \
--centralities data_processed/centralities.csv \
--communities data_processed/communities.csv \
--output-dir data_processed/ \
--reports-dir reports/ \
--seed 42
Outputs:
- data_processed/fighters_performance.csv - Fighter performance metrics
- data_processed/fighters_merged.csv - Combined dataset for modeling
- reports/model_results.json - Regression model results
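The idea of relating network features to performance can be sketched with a single-feature least-squares fit. Everything here is an assumption for illustration: win rate as the performance metric, one centrality value as the only feature, the toy numbers, and the hand-written `fit_line` helper. The actual models in `performance_models.py` may be quite different.

```python
from collections import defaultdict

# Hypothetical inputs: (winner_id, loser_id) fights and one centrality score
# per fighter (made-up values, not computed from these fights).
fights = [(0, 1), (2, 1), (0, 2), (0, 3), (2, 3)]
centrality = {0: 0.9, 1: 0.2, 2: 0.7, 3: 0.2}

# Per-fighter performance metric: wins / total fights.
wins, total = defaultdict(int), defaultdict(int)
for winner, loser in fights:
    wins[winner] += 1
    total[winner] += 1
    total[loser] += 1
win_rate = {fighter: wins[fighter] / total[fighter] for fighter in total}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Merge step: align feature and target on fighter ID before fitting.
fighters = sorted(win_rate)
slope, intercept = fit_line(
    [centrality[f] for f in fighters],
    [win_rate[f] for f in fighters],
)
```

A positive slope would suggest that more central fighters tend to win more often, though with real data such a fit says nothing about causation and would need proper validation.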
Builds temporal snapshots and compares against theoretical null models.
python src/ufcnet/scripts/run_null_models.py \
--fights data_processed/fights.csv \
--rivalry-graph data_processed/rivalry.gpickle \
--output-dir data_processed/ \
--seed 42
Outputs:
- data_processed/temporal_stats.csv - Year-over-year network statistics
- data_processed/null_vs_real_stats.csv - Null model comparisons
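A null-model comparison along these lines might look like the sketch below, which generates ER, BA, and WS graphs of roughly matching size and compares a few statistics. The parameter-matching heuristics, the rewiring probability `p=0.1`, the chosen statistics, and the use of the karate club graph as a stand-in for the rivalry graph are all assumptions, not the project's documented procedure.

```python
import networkx as nx


def summarize(G):
    """Small set of statistics to compare across graphs."""
    return {
        "n": G.number_of_nodes(),
        "m": G.number_of_edges(),
        "avg_clustering": nx.average_clustering(G),
        "transitivity": nx.transitivity(G),
    }


def null_model_comparison(real, seed=42):
    """Compare a real graph against size-matched ER, BA, and WS graphs."""
    n = real.number_of_nodes()
    m = real.number_of_edges()
    avg_degree = 2 * m / n
    # BA adds ba_m edges per new node; WS needs an even neighbor count ws_k.
    ba_m = max(1, round(avg_degree / 2))
    ws_k = max(2, 2 * round(avg_degree / 2))
    graphs = {
        "real": real,
        "ER": nx.gnm_random_graph(n, m, seed=seed),          # exact n and m
        "BA": nx.barabasi_albert_graph(n, ba_m, seed=seed),  # approximate m
        "WS": nx.watts_strogatz_graph(n, ws_k, p=0.1, seed=seed),
    }
    return {name: summarize(g) for name, g in graphs.items()}


# Stand-in graph; in the project this would be the loaded rivalry graph.
real = nx.karate_club_graph()
stats = null_model_comparison(real)
```

Only the ER model reproduces the edge count exactly; BA and WS match it approximately through their per-node parameters. In practice one would typically average over many seeds rather than rely on a single realization.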
- Phase 0: Base setup (repo structure, config, script skeletons) ✓
- Phase 1: Data preprocessing (loading, cleaning, fighter IDs)
- Phase 2: Network construction (rivalry & dominance graphs)
- Phase 3: Network metrics & communities
- Phase 4: Performance modeling
- Phase 5: Temporal analysis & null models
This is a network science course project. See individual phase documentation for task assignments.
MIT License (or specify your license)