Code for Learning Game-Playing Agents with Generative Code Optimization (ICML 2025 PRAL Workshop). We use Trace LLM optimizers (OptoPrime) to optimize Python policies that play Atari games via object-centric representations (OC_Atari). This repo provides a framework that lets LLMs play Atari games through annotated text interfaces (no image or video input required).
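To give a flavor of the annotated text interface, here is a minimal sketch of turning an object-centric game state into text an LLM policy could read instead of pixels. Function and object names are illustrative assumptions, not the repo's actual API:

```python
# Hypothetical sketch: render object-centric state (as OC_Atari-style
# (name, x, y) tuples) into one annotated text line per object.
def annotate_state(objects):
    """Format a list of (name, x, y) game objects as text, one per line."""
    return "\n".join(f"{name}: x={x}, y={y}" for name, x, y in objects)

state = [("Player", 140, 180), ("Ball", 80, 120), ("Enemy", 16, 60)]
print(annotate_state(state))
# → Player: x=140, y=180
#   Ball: x=80, y=120
#   Enemy: x=16, y=60
```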
Paper that includes 3 initial games (Pong, Breakout, Space Invaders): https://openreview.net/forum?id=ZM65X3NoTd
Paper that includes 8 games: http://arxiv.org/abs/2603.23994
Asterix, Breakout, Enduro, Freeway, Pong, Q*bert, Seaquest, Space Invaders
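To illustrate the kind of Python policy the LLM optimizer rewrites, here is a minimal hand-written Pong-style policy over object positions. The action encoding and parameter names are hypothetical, not taken from the repo:

```python
# Hypothetical sketch: a simple Pong-style policy over object-centric state.
# Action encoding (0 = NOOP, 2 = UP, 3 = DOWN) is illustrative only.
def pong_policy(player_y, ball_y, deadzone=4):
    """Move the paddle toward the ball, with a deadzone to avoid jitter."""
    if ball_y < player_y - deadzone:
        return 2  # UP
    if ball_y > player_y + deadzone:
        return 3  # DOWN
    return 0      # NOOP

print(pong_policy(player_y=100, ball_y=50))  # ball above paddle → 2
```

In the paper's setup, the optimizer mutates the body of such a policy based on textual feedback (e.g., episode returns) rather than gradients.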
We compare LLM-optimized policies against deep RL baselines that also use object-centric representations. Our CleanRL fork includes:
We share the training logs on Wandb:
```bash
bash install.sh
```

This will:
- Install `uv` if not already present
- Clone the OC_Atari library into `external/OC_Atari/`
- Install all Python dependencies via `uv sync`
Follow Trace's LLM API setup to use OptoPrime, the supported optimizer.
Each game has a corresponding training script. Run with:
```bash
uv run python <game>_training.py
```

For example:

```bash
uv run python asterix_training.py
uv run python breakout_training.py
uv run python pong_training.py
```

```
├── *_training.py          # Training scripts (one per game)
├── trace_envs/            # Traced environment wrappers (one per game)
├── training_utils.py      # Shared training utilities
├── logging_util.py        # Logging configuration
├── plotting_game_perf.py  # Performance visualization
├── install.sh             # Setup script
├── pyproject.toml         # Dependencies (managed by uv)
├── external/OC_Atari/     # Object-centric Atari library
├── logs/                  # Training logs
└── trace_ckpt/            # Optimizer checkpoints
```