Reproduction code for Proximal Policy Optimization (PPO) with Chainer
This repo is a reproduction of PPO written with Chainer. See the original paper for details.
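For readers who want a quick reminder of what PPO optimizes, here is a minimal sketch of the clipped surrogate loss from the PPO paper. It is written in plain Python for illustration only and is not taken from this repo's Chainer code; the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negative clipped surrogate objective, averaged over samples.

    new_log_probs / old_log_probs: log pi(a|s) under the current and the
    data-collecting policy. clip_eps=0.2 is an illustrative default, not
    necessarily the value used in this repo.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped_ratio * adv)
    # Negate because optimizers minimize, while PPO maximizes the objective
    return -total / len(advantages)
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is the core idea the paper describes.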
Choose the parameters and run the command below. The default parameters are set for running in the Atari environment.
Example:

    python3 main.py --env-type='atari'

For details of the parameters, check the code or type:

    python3 main.py --help

Test run with the trained small Breakout model:

    python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/small/final_model --atari-model-size='small'

| result | score |
|---|---|
| ![]() | ![]() |

Test run with the trained large Breakout model:

    python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/breakout/large/final_model --atari-model-size='large'

| result | score |
|---|---|
| ![]() | ![]() |

Test run with the trained large Zaxxon model:

    python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/zaxxon/large/final_model --atari-model-size='large' --env='ZaxxonNoFrameskip-v4'

| result | score |
|---|---|
| ![]() | ![]() |

Test run with the trained large Space Invaders model:

    python3 main.py --env-type='atari' --test-run --model-params=trained_results/atari/space_invaders/large/final_model --atari-model-size='large' --env='SpaceInvadersNoFrameskip-v4'

| result | score |
|---|---|
| ![]() | ![]() |

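
The test-run commands above all follow the same pattern, so they can be assembled programmatically. The sketch below is a hypothetical helper (`build_test_command` is not part of this repo) that uses only the flags shown in this README and assumes the `trained_results/atari/<game>/<size>/final_model` layout seen in those commands.

```python
import shlex


def build_test_command(game, model_size, env=None):
    """Assemble the argument list for a test run of a trained Atari model.

    The model path layout is inferred from the commands in this README;
    this helper is illustrative and does not exist in the repo.
    """
    args = [
        "python3",
        "main.py",
        "--env-type=atari",
        "--test-run",
        f"--model-params=trained_results/atari/{game}/{model_size}/final_model",
        f"--atari-model-size={model_size}",
    ]
    # --env is only passed when a non-default environment is wanted
    if env is not None:
        args.append(f"--env={env}")
    return args


if __name__ == "__main__":
    # Print a copy-pasteable shell command for the Zaxxon model
    print(shlex.join(build_test_command("zaxxon", "large", env="ZaxxonNoFrameskip-v4")))
```

The returned list can also be passed directly to `subprocess.run` to launch the test run without going through a shell.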
Sorry, in progress...







