GitHub - SapanaChaudhary/RA-RLHF

RA-RLHF: Risk Averse Finetuning of Large Language Models

This repository contains the code for our NeurIPS 2024 paper titled 'Risk Averse fine tuning of LLMs'. The code is based on the Hugging Face trl GitHub repository. Trained model checkpoints will be added shortly.

For setup, follow the installation instructions in the trl repository.
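As a rough illustration of what that setup might look like, here is a minimal sketch. It assumes this fork installs from source the same way a typical pip-installable package does; the clone URL is inferred from the repository name, and the helper name `setup_ra_rlhf` is hypothetical. The trl instructions remain the authoritative reference.

```shell
# Assumption: clone URL inferred from the repository name, not stated in this README.
REPO_URL="https://github.com/SapanaChaudhary/RA-RLHF.git"
REPO_DIR="RA-RLHF"

# Hypothetical helper: clone the repository, enter it, and install it
# in editable mode so the example scripts use the local trl package.
setup_ra_rlhf() {
    git clone "$REPO_URL" "$REPO_DIR" &&
    cd "$REPO_DIR" &&
    pip install -e .
}
```

Calling `setup_ra_rlhf` from a fresh virtual environment runs the three steps in order and stops at the first one that fails.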

To run the experiments, execute the following commands.

IMDB training:

git checkout auth1/main

cd examples/IMDB/training

sh ppo_run_single_script.sh #RLHF
sh sr_ppo_run_single_script.sh #RA-RLHF

Jigsaw training:

git checkout auth2/main

cd examples/Jigsaw/training

sh ppo_run_single_script.sh #RLHF
sh sr_ppo_run_single_script.sh #RA-RLHF

GPT-J 6B IMDB training:

git checkout auth2/main

cd examples/IMDB/training

python ppo_big_imdb.py #RLHF
python sr_ppo_big_imdb.py #RA-RLHF

