A framework for Reinforcement Learning research.
- 💡 Well suited for understanding and prototyping algorithms:
    - One algorithm = one directory, so there is no backtracking through parent classes
    - Algorithms can be easily copied out of RL-X
- ⚒️ Well-known DL libraries: Implementations in PyTorch and, mainly, JAX
- ⚡ Maximum speed: Just-In-Time (JIT) compilation and parallel environments
- 🧪 Mix and match and extend: Generic interfaces between algorithms and environments
- ⛰️ Custom environments: Examples for MuJoCo, Isaac Lab, ManiSkill or pure socket communication
- 🚀 GPU environments: MJX, Isaac Lab and ManiSkill can run thousands of parallel environments
- 🤖 Robot learning: Training and deployment for the Unitree Go2 (quadruped) and G1 (humanoid) robots
- 📈 Experiments: Checkpoints, Evaluation, Console log, Tensorboard, Weights & Biases, SLURM, Docker
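The "generic interfaces between algorithms and environments" point in the feature list can be illustrated with a small sketch. The `Env` and `Algorithm` protocols, the toy `CountdownEnv`, and the two policies below are hypothetical names invented for this example; they do not reflect RL-X's actual classes or signatures. The point is only that any algorithm can drive any environment once both agree on a narrow interface.

```python
import random
from typing import Protocol, Tuple


class Env(Protocol):
    """Hypothetical minimal environment interface (not RL-X's actual API)."""

    def reset(self) -> float: ...

    def step(self, action: int) -> Tuple[float, float, bool]: ...


class Algorithm(Protocol):
    """Hypothetical minimal algorithm interface (not RL-X's actual API)."""

    def act(self, observation: float) -> int: ...


class CountdownEnv:
    """Toy environment: the state starts at `start`; action 1 decrements it."""

    def __init__(self, start: int = 3) -> None:
        self.start = start
        self.state = start

    def reset(self) -> float:
        self.state = self.start
        return float(self.state)

    def step(self, action: int) -> Tuple[float, float, bool]:
        if action == 1:
            self.state -= 1
        done = self.state <= 0
        reward = 1.0 if done else 0.0  # reward only when the countdown ends
        return float(self.state), reward, done


class AlwaysDecrement:
    """Deterministic policy that always picks action 1."""

    def act(self, observation: float) -> int:
        return 1


class RandomPolicy:
    """Seeded policy that picks action 0 or 1 uniformly at random."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)

    def act(self, observation: float) -> int:
        return self.rng.randint(0, 1)


def run_episode(env: Env, algorithm: Algorithm, max_steps: int = 100) -> Tuple[float, int]:
    """Roll out one episode and return (total reward, number of steps taken)."""
    observation = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        observation, reward, done = env.step(algorithm.act(observation))
        total_reward += reward
        if done:
            return total_reward, step
    return total_reward, max_steps
```

Because `run_episode` depends only on the two protocols, `run_episode(CountdownEnv(), AlwaysDecrement())` and `run_episode(CountdownEnv(), RandomPolicy())` both work without either side knowing the other's concrete type.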
Implemented algorithms:
- Proximal Policy Optimization (PPO) in PyTorch, Flax
- Proximal Policy Optimization + Differentiable Trust Region Layers (PPO+DTRL) in Flax
- Proximal Policy Optimization + Gated Recurrent Unit (PPO+GRU) in Flax
- Proximal Policy Optimization + Transformer (PPO+Transformer) in Flax
- Proximal Policy Optimization + History Window (PPO+HistoryWindow) in Flax
- Proximal Policy Optimization + Memory Actions (PPO+MemoryActions) in Flax
- Early Stopping Policy Optimization (ESPO) in PyTorch, Flax
- Deep Deterministic Policy Gradient (DDPG) in Flax
- Twin Delayed Deep Deterministic Policy Gradient (TD3) in Flax
- Fast Twin Delayed Deep Deterministic Policy Gradient (FastTD3) in PyTorch, Flax
- Soft Actor-Critic (SAC) in PyTorch, Flax
- Fast Soft Actor-Critic (FastSAC) in PyTorch, Flax
- Randomized Ensembled Double Q-Learning (REDQ) in Flax
- Dropout Q-Functions (DroQ) in Flax
- CrossQ in Flax
- Truncated Quantile Critics (TQC) in Flax
- Aggressive Q-Learning with Ensembles (AQE) in Flax
- Maximum a Posteriori Policy Optimization (MPO) in PyTorch, Flax
- Fast Maximum a Posteriori Policy Optimization (FastMPO) in Flax
- Deep Q-Network (DQN) in Flax
- Deep Q-Network with Histogram Loss using Gaussians (DQN HL-Gauss) in Flax
- Double Deep Q-Network (DDQN) in Flax
- Categorical Deep Q-Network (C51) in Flax
- Parallelized Q-Network (PQN) in Flax
Available environments:
- Gymnasium
    - MuJoCo
    - Atari
    - Classic control
    - DeepMind Control Suite
- EnvPool
    - MuJoCo
    - Atari
    - Classic control
    - DeepMind Control Suite
- MuJoCo Playground
    - Locomotion
- Custom MuJoCo
    - Example of a custom MuJoCo environment
    - Example of a custom MuJoCo XLA (MJX) environment
- Custom Robot Learning
    - Example of custom MuJoCo and MJX environments for quadruped and humanoid locomotion learning and real robot deployment
- Custom Isaac Lab
    - Example of a custom Isaac Lab environment
- Custom ManiSkill
    - Example of a custom ManiSkill environment
- Custom Interface
    - Prototype of a custom environment interface with socket communication
All listed environments are directly embedded in RL-X and can be used out-of-the-box.
For more information on the environments and algorithms, and on how to add your own, see their respective README files.
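To give a feel for what an environment interface over socket communication can look like, here is a minimal sketch. It is not the prototype shipped with RL-X: the newline-delimited JSON protocol, the message fields (`command`, `action`, `observation`, `reward`, `done`), and the names `serve_environment` and `SocketEnvClient` are all assumptions made for illustration. A local `socket.socketpair()` stands in for a real network connection.

```python
import json
import socket
import threading


def serve_environment(conn: socket.socket) -> None:
    """Hypothetical environment side: a 1-D state that each action shifts."""
    state = 0.0
    reader = conn.makefile("r", encoding="utf-8")
    writer = conn.makefile("w", encoding="utf-8")
    with conn, reader, writer:
        for line in reader:  # one JSON message per line
            message = json.loads(line)
            if message["command"] == "close":
                break
            if message["command"] == "reset":
                state = 0.0
                reply = {"observation": state}
            else:  # "step"
                state += message["action"]
                reply = {
                    "observation": state,
                    "reward": -abs(state),
                    "done": abs(state) >= 2.0,
                }
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


class SocketEnvClient:
    """Hypothetical algorithm-side wrapper that forwards reset/step over a socket."""

    def __init__(self, conn: socket.socket) -> None:
        self.conn = conn
        self.reader = conn.makefile("r", encoding="utf-8")
        self.writer = conn.makefile("w", encoding="utf-8")

    def _send(self, message: dict) -> None:
        self.writer.write(json.dumps(message) + "\n")
        self.writer.flush()

    def reset(self) -> float:
        self._send({"command": "reset"})
        return json.loads(self.reader.readline())["observation"]

    def step(self, action: float):
        self._send({"command": "step", "action": action})
        reply = json.loads(self.reader.readline())
        return reply["observation"], reply["reward"], reply["done"]

    def close(self) -> None:
        self._send({"command": "close"})
        self.reader.close()
        self.writer.close()
        self.conn.close()


def run_demo() -> list:
    """Run one short episode through the socket and return the observations seen."""
    env_side, algo_side = socket.socketpair()
    server = threading.Thread(target=serve_environment, args=(env_side,), daemon=True)
    server.start()
    client = SocketEnvClient(algo_side)
    observations = [client.reset()]
    done = False
    while not done:
        observation, _reward, done = client.step(1.0)
        observations.append(observation)
    client.close()
    server.join(timeout=5)
    return observations
```

The client exposes the same `reset`/`step` shape as an in-process environment, so the learning code does not need to know that the simulator runs elsewhere. A real setup would replace `socketpair()` with a TCP connection and add error handling.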
Default installation for a Linux system with an NVIDIA GPU:
conda create -n rlx python=3.11.4
conda activate rlx
git clone git@github.com:nico-bohlinger/RL-X.git
cd RL-X
pip install -e .[all] --config-settings editable_mode=compat
pip uninstall $(pip freeze | grep -i '\-cu12' | cut -d '=' -f 1) -y
pip install "torch>=2.7.0" --index-url https://download.pytorch.org/whl/cu118 --upgrade
pip install "jax[cuda12]"
For other configurations, see the detailed installation guide in the documentation. Isaac Lab must be installed separately; instructions for that are also in the documentation. ManiSkill may likewise need additional steps, such as downgrading numpy.
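After installing, it can be useful to confirm which versions of the key packages ended up in the environment, for example to check whether numpy needs to be downgraded for ManiSkill. The helper below is a small sketch that uses only the Python standard library; `installed_versions` is a hypothetical name for this example and not part of RL-X.

```python
from importlib import metadata, util


def installed_versions(packages=("torch", "jax", "numpy")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        if util.find_spec(name) is None:
            versions[name] = None  # not importable in this environment
            continue
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata under this exact name.
            versions[name] = "unknown"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated `rlx` conda environment. A `None` entry means the package could not be found at all.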
To run an experiment:
cd experiments
python experiment.py
Detailed instructions for running experiments can be found in the README file in the experiments directory or in the documentation.
If you use RL-X in your research, please cite the following paper:
@incollection{bohlinger2023rlx,
title={RL-X: A Deep Reinforcement Learning Library (not only) for RoboCup},
author={Nico Bohlinger and Klaus Dorer},
booktitle={Robot World Cup},
pages={228--239},
year={2023},
publisher={Springer}
}