A collection of chess engines that play like humans, from ELO 1100 to 1900.
This repo contains our 9 final Maia models saved as Leela Chess neural networks, along with the code to create more and reproduce our results.
Our website has information about the project and team.
You can also play against three of our models on Lichess:
- maia1 is targeting ELO 1100
- maia5 is targeting ELO 1500
- maia9 is targeting ELO 1900
- MaiaMystery is for testing new versions of Maia
We also have a Lichess team, maia-bots, that we will add more bots to.
The Maias are not full chess engines; they are just brains (weights) and require a body to work. So you need to load them with lc0 and follow the instructions here. Then, unlike with most other engines, you want to disable searching; a nodes limit of 1 is what we use. This looks like `go nodes 1` in UCI. Note also that the models are stronger than the rating they are trained on, since they make the average move of a player at that rating.
The links to download the models directly are:
| Targeted Rating | lichess name | link |
|---|---|---|
| 1100 | maia1 | maia-1100.pb.gz |
| 1500 | maia5 | maia-1500.pb.gz |
| 1900 | maia9 | maia-1900.pb.gz |
The bots on Lichess use opening books that are still in development, since the models play the same move every time.
| Targeted Rating | link |
|---|---|
| 1200 | maia-1200.pb.gz |
| 1300 | maia-1300.pb.gz |
| 1400 | maia-1400.pb.gz |
| 1600 | maia-1600.pb.gz |
| 1700 | maia-1700.pb.gz |
| 1800 | maia-1800.pb.gz |
We also have all the models in the maia_weights folder of the repo.
When running the models on the command line it should look like this:
:~/maia-chess$ lc0 --weights=model_files/maia-1100.pb.gz
_
| _ | |
|_ |_ |_| v0.26.3 built Dec 18 2020
go nodes 1
Loading weights file from: model_files/maia-1100.pb.gz
Creating backend [cudnn-auto]...
Switching to [cudnn]...
...
info depth 1 seldepth 1 time 831 nodes 1 score cp 6 tbhits 0 pv e2e4
bestmove e2e4
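
The session above can also be driven from a script. The following is a minimal sketch using only the Python standard library, not the repo's `LeelaEngine` class. It assumes an `lc0` binary is on your PATH; the helper names `parse_bestmove` and `maia_move` are hypothetical and exist only for this illustration.

```python
import subprocess


def parse_bestmove(output_lines):
    """Return the move from the first 'bestmove' line, or None if absent."""
    for line in output_lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "bestmove":
            return parts[1]
    return None


def maia_move(weights_path, moves=(), lc0_path="lc0"):
    """Ask lc0 for a move after the given UCI moves from the start position.

    Hypothetical helper: assumes `lc0_path` points at a working lc0 binary.
    """
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    engine = subprocess.Popen(
        [lc0_path, f"--weights={weights_path}"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # 'go nodes 1' is the search limit recommended in the text above.
        engine.stdin.write(f"{position}\ngo nodes 1\n")
        engine.stdin.flush()
        seen = []
        for line in engine.stdout:
            seen.append(line)
            if line.startswith("bestmove"):
                break
        return parse_bestmove(seen)
    finally:
        # Stop the engine whether or not a move was found.
        engine.terminate()
        engine.wait()
```

With a weights file downloaded, `maia_move("model_files/maia-1100.pb.gz", moves=["e2e4"])` would request a reply to 1. e4.
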
move_prediction/maia_chess_backend also has the LeelaEngine class that uses the config files move_prediction/model_files/*/config.yaml to wrap python-chess and allow the models to be used in Python.
As part of our analysis, all the games on Lichess with Stockfish analysis were processed into CSV files. These can be found here.
To create your own maia from a set of chess games in the PGN format:
- Set up your environment
  - (optional) Install the `conda` environment, `maia_env.yml`
  - Make sure all the required packages are installed from `requirements.txt`
- Convert the PGN into the training format
  - Add the `pgn-extract` tool to your path
  - Add the `trainingdata-tool` to your path
  - Run `move_prediction/pgn_to_trainingdata.sh PGN_FILE_PATH OUTPUT_PATH`
  - Wait a bit, as the processing is both IO and CPU intensive
  - The script will create a training and a validation set. If you wish to train on the whole set, copy the files from `OUTPUT_PATH/validation` to `OUTPUT_PATH/training`
- Edit `move_prediction/maia_config.yml`
  - Add `OUTPUT_PATH/training/*/*` to `input_train`
  - Add `OUTPUT_PATH/validation/*/*` to `input_test`
  - (optional) If you have multiple GPUs, change the `gpu` field to the one you are using
  - (optional) You can also change all the other training parameters here, like the number of layers
- Run the training script `move_prediction/train_maia.py PATH_TO_CONFIG`
- (optional) You can use tensorboard to watch the training progress; the logs are in `runs/CONFIG_BASENAME/`
- Once complete, the final model will be in the `models/CONFIG_BASENAME/` directory. It will be the one with the largest number
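
Before starting a training run, it can help to confirm that the glob patterns from the config step actually match files. The following is a minimal sketch, not part of the repo. The helper names `training_globs` and `count_matches` are hypothetical, and the only layout it assumes is the `OUTPUT_PATH/training/*/*` and `OUTPUT_PATH/validation/*/*` patterns given in the steps above.

```python
import glob
import os


def training_globs(output_path):
    """Build the two glob patterns that the config step asks for."""
    base = os.path.abspath(output_path)
    return {
        "input_train": os.path.join(base, "training", "*", "*"),
        "input_test": os.path.join(base, "validation", "*", "*"),
    }


def count_matches(output_path):
    """Count how many files each pattern currently matches."""
    patterns = training_globs(output_path)
    return {key: len(glob.glob(pattern)) for key, pattern in patterns.items()}
```

A count of zero for either key suggests the conversion step did not write to the expected place, or that `OUTPUT_PATH` is wrong.
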
To train the models we present in the paper you need to download the raw files from Lichess then cut them into the training sets and process them into the training data format. This is a similar format to the general training instructions just with our specified data, so you will need to have trainingdata-tool and pgn-extract on your PATH.
Also note that move_prediction/replication-move_training_set.py is where the main shuffling and game selection logic is. The scripts originally had no flow control logic, so running them manually line by line was sometimes necessary; flow control logic is now implemented.
- Set up your environment
  - Install the `conda` environment, `maia_env.yml`. You will need to work in it the entire time, so don't forget to activate it
  - Make sure all the required packages are installed from `requirements.txt`
- Download the games from Lichess between January 2017 and November 2019 to `data/lichess_raw`
- (optional) For simplicity, give the entire directory execute permissions with `chmod -R +x directory_name`. Otherwise give them only to the scripts you run (note that some scripts run others, so you will need to inspect the code to see which ones need execute, read or write permissions).
- The games downloaded from Lichess are in .pgn.zst format. The code has not been updated in years and still relies on the old .bz2 format, so you will need to run `move_prediction/convert-zst.sh`. Make sure you have the `bzip2` and `zstd` packages installed. You will be left with both .pgn.zst and .bz2 files; you can store the .pgn.zst files in another directory for future use, but DO NOT keep them in `data/lichess_raw`
- Run `move_prediction/create_trainingFolders.sh`.
- Run `move_prediction/replication-generate_pgns-all.sh <elo-start> <elo-end> <elo-step>`. For example, to run from elo 1100 to 1900 in steps of 100: `move_prediction/replication-generate_pgns-all.sh 1100 1900 100`.
- Run `move_prediction/replication-move_training_set.py`.
  - (optional) If you are running on different data than the data we tested on, you will need to modify this file. It currently takes September 2019 to November 2019 and constructs the test files from them. If you wish to do this manually, there is another option in the optional sub-step under `replication-make-leela-files-all.sh`.
- (optional) Run `move_prediction/generate-cut_games.sh <elo-start> <elo-end> <elo-step>`. This code discards games that have fewer than 20 moves or didn't end in checkmate. It also cuts games to the last 20 moves if they didn't have 2 castling moves, or, if they did, until the last castling move.
- Run `move_prediction/replication-make-leela-files-all.sh <elo-start> <elo-end> <elo-step>`. For example, to run from elo 1100 to 1900 in steps of 100: `move_prediction/replication-make-leela-files-all.sh 1100 1900 100`.
  - (optional) If you haven't downloaded September 2019 to November 2019 and didn't modify `replication-move_training_set.py`, you can get the test files manually. In the end you will have all the train files for a specific elo in `data/elo_ranges/{elo}/train/{train directories}/supervised-0/`. To train Maia for a specific elo you need both train files and test files, so manually copy at least 10 of the train files to `data/elo_ranges/{elo}/test/{1-3.pgn}/supervised-0/`. Note that this option is not recommended because it mixes train and test files, but if you just wish to test the Maia training process it is the best option.
- Edit `move_prediction/maia_config.yml` and add the elo you want to train:
  - input_test: `../data/elo_ranges/${elo}/test/*/*`
  - input_train: `../data/elo_ranges/${elo}/train/*/*`
  - Make sure that you write the full path and not the relative path, because relative paths cause problems depending on where you run the Python script from. For example: `/home/achiya/repos/maia-chess/data/elo_ranges/1100/train/*/*`
- Run the training script `move_prediction/train_maia.py PATH_TO_CONFIG`
- (optional) You can use tensorboard to watch the training progress; the logs are in `runs/CONFIG_BASENAME/`. Example: `tensorboard --logdir=runs`

We also include some other (but not all) config files that we tested, although we still recommend using the final config move_prediction/maia_config.yml.
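
The rating range arguments and the absolute-path advice in the replication steps can be sketched in Python as follows. This is an illustration only: the helper names `elo_targets` and `config_paths` are hypothetical, the inclusive treatment of `<elo-end>` is an assumption based on the 1100 to 1900 example, and the keys follow the `input_train` and `input_test` names used in the general training instructions.

```python
import os


def elo_targets(start, end, step):
    """Ratings covered by <elo-start> <elo-end> <elo-step>.

    Assumption: the end rating is inclusive, as the 1100 to 1900 example suggests.
    """
    return list(range(start, end + 1, step))


def config_paths(repo_root, elo):
    """Absolute glob patterns for one rating, avoiding relative-path problems."""
    base = os.path.join(os.path.abspath(repo_root), "data", "elo_ranges", str(elo))
    return {
        "input_train": os.path.join(base, "train", "*", "*"),
        "input_test": os.path.join(base, "test", "*", "*"),
    }
```

Calling `config_paths(".", 1100)` from the repo root gives patterns that remain valid no matter which directory the training script is later run from.
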
If you wish to generate the testing set we used, you can download the December 2019 data and run move_prediction/replication-make_testing_pgns.sh. The data is also available for download as a CSV here. The script for running models on the dataset is replication-run_model_on_csv.py and requires the lc0 binary on the path.
After you've run the code, you can use move_prediction/plot_graph.py to plot a graph based on the CSV it produced.
To train the blunder prediction models follow these instructions:
- Set up your environment
  - (optional) Install the `conda` environment, `maia_env.yml`
- Make sure all the required packages are installed from `requirements.txt`
- Run `blunder_prediction/make_csvs.sh`
  - You will probably need to update the paths, and may want to change the targets or use a for loop
- Run `blunder_prediction/mmap_csv.py` on all the csv files
- Select a config from `blunder_prediction/configs` and update the paths
- Run `blunder_prediction/train_model.py CONFIG_PATH`
@inproceedings{mcilroyyoung2020maia,
title={Aligning Superhuman AI with Human Behavior: Chess as a Model System},
author={McIlroy-Young, Reid and Sen, Siddhartha and Kleinberg, Jon and Anderson, Ashton},
year={2020},
booktitle={Proceedings of the 25th ACM SIGKDD international conference on Knowledge discovery and data mining}
}
The software is available under the GPL License.
Please open an issue or email Reid McIlroy-Young to get in touch
