This repository contains a reference implementation of SAMI (Self-Supervised Alignment with Mutual Information) using the TL;DR dataset.
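At a high level, SAMI finetunes a model to increase the mutual information between sampled constitutions (sets of principles) and the responses written under them. The sketch below is one InfoNCE-style reading of that objective and is **not** the repo's `SAMITrainer`: given a small matrix of log-probabilities `log p(response_j | principle_i)`, it applies a symmetric cross-entropy that rewards the diagonal (each response being most likely under its own principle). The function name and the 2x2 setup are illustrative assumptions.

```python
import math

def sami_style_loss(logp: list[list[float]]) -> float:
    """Symmetric cross-entropy over a matrix of log p(response_j | principle_i).

    Conceptual sketch only: rows are principles, columns are responses, and the
    "correct" pairing is the diagonal. Lower is better (0 means each response
    is only probable under its own principle).
    """
    def xent_diag(rows: list[list[float]]) -> float:
        total = 0.0
        for i, row in enumerate(rows):
            # log-normalizer over the row (softmax denominator)
            log_z = math.log(sum(math.exp(x) for x in row))
            # cross-entropy with the diagonal entry as the target class
            total += -(row[i] - log_z)
        return total

    cols = [list(c) for c in zip(*logp)]        # transpose: responses -> principles
    return (xent_diag(logp) + xent_diag(cols)) / 2
```

With a strongly diagonal matrix (each response far more likely under its own principle) the loss approaches 0; with a uniform matrix it sits at the entropy of a random pairing.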
- Set up a conda environment (we used `python==3.10.0`) and install the required dependencies by running `pip install -e .`.
- Adjust the `experiments/tldr/config/generate.yaml` config file to match your directories and desired configurations. Example constitutions using principles written by `mistral-7b` and `claude-opus` are provided in `constitutions_mistral` and `constitutions_opus`.
- Navigate to `experiments/tldr` (`cd experiments/tldr`) and run `python generate.py` to generate your own data. By default, the generated data will be stored in `experiments/tldr/data/base`. Note that this directory is already populated with the data used in the paper, in case you prefer to finetune a model directly.
- Select a model configuration (e.g., `mistral-7b`) from the `experiments/tldr/conf/model` directory and update the `cache_dir` accordingly (e.g., `/scr/YOUR_USERNAME/sami/checkpoints`).
- Adjust the `experiments/tldr/conf/train_sami.yaml` config as needed, including optional wandb logging. If you set `log: true`, make sure you have a wandb account and are logged in.
- Navigate to `experiments/tldr` (`cd experiments/tldr`) and run training as an interactive job using the command below, or adapt the example slurm script to your computing needs and submit it with `sbatch` (or modify it into a standard bash script and run it from, e.g., a `tmux` window).
```shell
python train.py \
    training.beta=0.0 \
    wandb.name="$YOUR_WANDB_NAME" \
    training.checkpoint_dir="$YOUR_CHECKPOINT_DIR" \
    training.lr=5e-7 \
    data_path="data/base" \
    data_file="base_mistral_from_mistral_principles.json" \
    n_examples=2000
```

- Adjust the `experiments/tldr/config/evaluate.yaml` configuration, navigate to `experiments/tldr` (`cd experiments/tldr`), and run `python evaluate.py`. This will write the generated responses into `experiments/tldr/results/responses`.
- Compute win rates by adjusting the `experiments/tldr/config/win_rates.yaml` configuration and running `python win_rates.py` from the same directory. Note that this script currently uses Azure, so if you don't have access to GPT-4 via Azure, you may have to copy `/scr/models/openai_models/azure.py` and create your own `AsyncOpenAI` class. FYI: we used the `gpt-4-0613` snapshot for all evaluations.
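For intuition, the quantity the win-rate step reports boils down to the fraction of pairwise comparisons in which the judge model preferred the finetuned model's output. The sketch below is illustrative and not the repo's `win_rates.py`; `verdicts` is a hypothetical list of per-comparison judge decisions (`True` if the finetuned model won).

```python
def win_rate(verdicts: list[bool]) -> float:
    """Fraction of pairwise comparisons won by the finetuned model.

    Illustrative helper (assumed name, not from the repo): `verdicts` holds one
    boolean per comparison, True if the judge preferred the finetuned model.
    """
    return sum(verdicts) / len(verdicts) if verdicts else 0.0

# Example: 3 wins out of 4 comparisons.
print(win_rate([True, True, False, True]))  # → 0.75
```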
If you don't have access to GPUs, you can attempt to run training using `experiments/tldr/conf/model/mistral_tiny_base`, which we tested locally on an Apple M2 Pro (2023 MacBook Pro with 16 GB of memory).
The `SAMITrainer` and `train.py` use FSDP (`FullyShardedDataParallel`). To learn more about FSDP, you may find the FSDP tutorial series and the DDP tutorial series helpful.
If you found this work useful, please cite:
```bibtex
@article{fränken2024selfsupervised,
  title={Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels},
  author={Jan-Philipp Fränken and Eric Zelikman and Rafael Rafailov and Kanishk Gandhi and Tobias Gerstenberg and Noah D. Goodman},
  year={2024},
  eprint={2404.14313},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```