Markov Chain Aggregation of strategy evolution in Google Research Football RL agents.
This project applies Markov Chain Aggregation (MCA) to formally model how strategies emerge and evolve in reinforcement learning agents trained in the Google Research Football environment.
Inspired by Scott, Fujii & Onishi (2022), who showed that competitive RL agents spontaneously develop football strategies similar to those of real players, we go beyond their descriptive analysis by modelling strategy evolution as a dynamical system. Using the MCA framework of Banisch (2015), we aim to identify discrete strategic regimes and characterise the transitions between them.
- Extract behavioural observables (pass completion rate, possession, shot frequency, social network metrics) from GRF agents at different training checkpoints (first sketch below)
- Discretise the behavioural space into micro-states and estimate empirical transition matrices (second sketch)
- Apply MCA to coarse-grain these micro-states into interpretable macro-states such as chaotic, transitional, and structured play (third sketch)
- Characterise strategic transitions: are they sharp or gradual? Are there metastable regimes? (fourth sketch)
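As an illustration of the first step, the sketch below maps one episode to two of the listed observables (possession and shot frequency). It assumes rollouts recorded as lists of GRF raw-representation observation dicts plus the matching actions; the `ball_owned_team` key and the shot action index follow GRF's documented raw observations and default action set.

```python
import numpy as np

# `trajectory` is a list of per-step raw-representation observation dicts
# for one episode; `actions` is the matching list of discrete actions.
# Both are assumed to come from a rollout of a checkpointed policy.

SHOT_ACTION = 12  # "shot" in GRF's default 19-action set


def behavioural_observables(trajectory, actions):
    """Map one episode to a small vector of behavioural observables."""
    # Possession: fraction of steps on which the left (controlled) team
    # owns the ball (ball_owned_team is 0 for left, 1 for right, -1 for none).
    possession = np.mean([obs["ball_owned_team"] == 0 for obs in trajectory])
    # Shot frequency: shot actions per 100 environment steps.
    shot_freq = 100.0 * np.mean(np.asarray(actions) == SHOT_ACTION)
    return np.array([possession, shot_freq])
```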
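The discretisation step is deliberately left open; one simple choice, sketched here under the assumption that per-checkpoint observable vectors are stacked in training order, is k-means micro-states followed by a count-based estimate of the transition matrix.

```python
import numpy as np
from sklearn.cluster import KMeans


def micro_states_and_transitions(observables, n_micro=20, seed=0):
    """Cluster observable vectors into micro-states and estimate the
    row-stochastic empirical transition matrix between consecutive states.

    observables: (T, d) array of per-checkpoint observable vectors,
    ordered by training time.
    """
    labels = KMeans(n_clusters=n_micro, n_init=10, random_state=seed).fit_predict(observables)
    counts = np.zeros((n_micro, n_micro))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1.0
    # Row-normalise; unvisited rows get a uniform distribution so the
    # matrix stays stochastic.
    rows = counts.sum(axis=1, keepdims=True)
    trans = np.where(rows > 0, counts / np.maximum(rows, 1.0), 1.0 / n_micro)
    return labels, trans
```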
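Exact MCA requires (approximate) lumpability of the micro-chain. As a practical stand-in rather than Banisch's exact construction, the sketch below groups micro-states by their coordinates in the leading right eigenvectors of the estimated transition matrix, a PCCA-style heuristic: states that look alike there behave alike on slow timescales.

```python
import numpy as np
from sklearn.cluster import KMeans


def coarse_grain(trans, n_macro=3, seed=0):
    """Assign each micro-state to a macro-state by clustering micro-states
    in the space spanned by the leading right eigenvectors of the
    transition matrix (a PCCA-style approximate-lumpability heuristic)."""
    eigvals, eigvecs = np.linalg.eig(trans)
    order = np.argsort(-np.abs(eigvals))
    slow = np.real(eigvecs[:, order[:n_macro]])  # shape (n_micro, n_macro)
    return KMeans(n_clusters=n_macro, n_init=10, random_state=seed).fit_predict(slow)
```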
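The sharp-versus-gradual question can be probed through the spectrum of the same matrix: eigenvalues near 1 correspond to slowly decaying (metastable) modes, with implied timescales t_i = -τ / ln|λ_i|. A minimal sketch:

```python
import numpy as np


def implied_timescales(trans, lag=1):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-unit
    eigenvalues; large values indicate metastable regimes, and a wide
    spectral gap after the k-th value suggests roughly k macro-states."""
    mags = np.sort(np.abs(np.linalg.eigvals(trans)))[::-1]
    nontrivial = np.clip(mags[1:], 1e-12, 1.0 - 1e-12)  # drop lambda_1 = 1
    return -lag / np.log(nontrivial)
```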
All experiments run inside the Google Research Football environment using Football Academy scenarios as a starting point. A Dockerfile is provided to reproduce the environment.
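For orientation, a minimal rollout in one Academy scenario looks like the sketch below; the scenario name is just an example, and a trained checkpoint would replace the random action.

```python
import gfootball.env as football_env

# Any "academy_*" scenario name works here; "raw" observations expose the
# dict fields (ball ownership, player positions, ...) used in the sketches above.
env = football_env.create_environment(
    env_name="academy_3_vs_1_with_keeper",
    representation="raw",
)

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # stand-in for a trained policy
    obs, reward, done, info = env.step(action)
env.close()
```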
- Banisch, S. Markov Chain Aggregation for Agent-Based Models. Springer, 2015.
- Kurach, K. et al. Google Research Football: A Novel Reinforcement Learning Environment. arXiv:1907.11180, 2019.
- Scott, A., Fujii, K., and Onishi, M. How does AI play football? An analysis of RL and real-world football strategies. ICAART, 2022.