Official implementation of paper: "Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation". Prior dance generation methods often operate on sparse skeletons and overlook geometric constraints of the human body, leading to artifacts such as penetration and foot sliding. We propose a reward-guided fine-tuning framework that aligns generated motions with body geometry, improving physical plausibility.
[February 2, 2026] Training code for imitation policy released.
[January 30, 2026] Training and Evaluation code for EDGE released.
- Support training on GPUs newer than A100.
- Installation guidance.
- Release imitation policy training code.
- Release training code.
- Release evaluation code.
- Release rendering code.
- Release guidance for custom models and rewards.
We identify and address a critical yet often-overlooked gap between skeleton-based motion generation and mesh-level body visualization. Specifically, we leverage a physics-based humanoid controller to evaluate the physical plausibility of generated motions and convert its feedback into a reward that penalizes violations of physical laws, especially constraints arising from human body geometry. Combined with complementary reward signals, this reward design enables us to fine-tune a generative model with reinforcement learning to produce motions that remain physically plausible when visualized on a human body mesh.
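As a concrete illustration of one such geometric artifact, a common proxy for foot sliding is the horizontal displacement of a foot joint accumulated while the foot is in ground contact. The sketch below is a generic version of this proxy, not the paper's exact metric; the contact-height threshold, z-up convention, and NumPy dependency are assumptions:

```python
import numpy as np

def foot_slide(foot_pos, contact_height=0.05):
    """Horizontal foot displacement accumulated over frames where the foot
    is near the ground -- a simple proxy for the foot-sliding artifact.

    foot_pos: (T, 3) array of one foot joint's positions, z-up, in meters.
    contact_height: height (m) below which the foot counts as in contact.
    """
    on_ground = foot_pos[:-1, 2] < contact_height            # contact at frame t
    step = np.linalg.norm(np.diff(foot_pos[:, :2], axis=0), axis=1)
    return float(step[on_ground].sum())
```

A motion whose feet are stationary during contact yields a slide of zero; planted feet that drift horizontally accumulate a positive value.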
All evaluation is done using the mean SMPL body shape.
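For reference, the mean SMPL body shape corresponds to all-zero shape coefficients (betas), since SMPL's shape blendshapes form a PCA space centered on the mean template mesh; a minimal sketch (the 10-dimensional betas are the standard SMPL setting):

```python
# The mean SMPL body shape is obtained by setting all shape coefficients
# (betas) to zero; SMPL's shape space is a PCA around the mean template mesh.
NUM_BETAS = 10  # standard SMPL shape-space dimensionality
mean_betas = [0.0] * NUM_BETAS
```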
To set up the environment, follow these steps:

- Clone the project:

  ```
  git clone https://github.com/jjd1123/Skeleton2Stage.git
  ```

- Create a new conda environment and install PyTorch and the other dependencies:

  ```
  conda create -n isaac python=3.8
  conda activate isaac
  pip install -r requirement.txt
  ```

- Download and set up Isaac Gym.
- Download MuJoCo version 2.1 for Linux.
- Install torch-mesh-isect for the body penetration rate evaluation.
- Configure your paths in `environment.sh`.
- For a cleaner project layout, you can place Isaac Gym, MuJoCo, and torch-mesh-isect under the `environment/` directory.
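After installation, a quick way to confirm the environment is complete is to check that each dependency is importable. This is a generic helper, not part of the repository; the module names in the comment are illustrative:

```python
import importlib.util

def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Illustrative check; adjust the names to match your actual setup, e.g.:
# missing_modules(["torch", "isaacgym", "mujoco_py"])
```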
This repository additionally depends on the following libraries, which may require special installation procedures:

- jukemirlib
- pytorch3d
- accelerate
  - Note: after installation, don't forget to run `accelerate config`. We use fp16.
- Place the SMPL files under `body_models/` like the following:

```
body_models/
├── README.md                # This guide file
│
├── smpl/
│   ├── J_regressor_extra.npy
│   ├── kintree_table.pkl
│   ├── smplfaces.npy
│   ├── SMPL_FEMALE.pkl
│   ├── SMPL_MALE.pkl
│   └── SMPL_NEUTRAL.pkl
│
├── smplh/
│   ├── female/
│   │   └── model.npz
│   ├── male/
│   │   └── model.npz
│   └── neutral/
│       └── model.npz
│
└── smplx/
    ├── female/
    │   └── model.npz
    ├── male/
    │   └── model.npz
    └── neutral/
        └── model.npz
```
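To catch misplaced model files early, the layout above can be verified programmatically. A small sketch using only the standard library; the file list is a representative subset of the tree, not an exhaustive manifest:

```python
from pathlib import Path

# Representative files expected under body_models/ (subset of the tree above).
EXPECTED = [
    "smpl/SMPL_FEMALE.pkl",
    "smpl/SMPL_MALE.pkl",
    "smpl/SMPL_NEUTRAL.pkl",
    "smplh/neutral/model.npz",
    "smplx/neutral/model.npz",
]

def missing_body_models(root="body_models"):
    """Return the expected model files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```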
Before evaluation, make sure you have:
(1) the correct settings in the metric computation scripts, and
(2) the correct model set at Line 56 of `EDGE.py`.
```
cd code/rl_finetune
bash eval.sh exp_name epoch_num motion_save_root ckpt_root cached_music_features
```

Coming soon!

To fine-tune the generative model, you need the following:
- A base checkpoint of the generative model;
- Conditioning data for training-time sampling;
- A checkpoint of the trained imitation policy.
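The fine-tuning itself is reinforcement learning against the reward: motions are sampled from the generative model, scored by the reward, and the model is updated to increase expected reward. Below is a heavily simplified, self-contained sketch of that loop, using a toy Gaussian "generator" and a stand-in reward rather than the actual diffusion model or physics-based reward; every name and number here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(sample):
    # Stand-in for the physics-based reward: higher when the sample is
    # closer to a fixed "physically plausible" target.
    target = np.array([1.0, -1.0])
    return -np.linalg.norm(sample - target)

# Toy "generator": a Gaussian policy whose mean is the learnable parameter.
mu = np.zeros(2)
lr, sigma = 0.05, 0.5

for step in range(300):
    samples = mu + sigma * rng.standard_normal((32, 2))      # sample motions
    rewards = np.array([reward(s) for s in samples])          # score them
    adv = rewards - rewards.mean()                            # baseline subtraction
    # REINFORCE gradient estimate for the mean of a Gaussian policy
    grad = (adv[:, None] * (samples - mu) / sigma**2).mean(axis=0)
    mu += lr * grad                                           # ascend expected reward
```

After training, the policy mean drifts toward the high-reward region; the actual framework replaces the toy pieces with the EDGE model and the imitation-policy-derived reward.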
In this section, we provide the data (preprocessed data and the pretrained imitation policy) for a minimal example: fine-tuning EDGE on AIST++. You can directly run the following scripts:
```
cd code/rl_finetune
# Download the EDGE checkpoint.
bash download_mode.sh
# Download the preprocessed AIST++ data and the pretrained imitation policy.
bash download_data.sh
```

- Note: We use DVC for dataset version control. Please follow the README in the downloaded data directory for a quick setup.
We will also explain how to prepare your own datasets and pre-trained models below.
To train your own imitation policy, follow the instructions in Training Imitation Policy.

Coming soon!

- Prepare the expert dataset: prepare the expert dataset for imitation policy training by following the guide in the Vid2player3d README.
- Run the training script: once the dataset is ready, start training by executing the `train.sh` script:

  ```
  cd code/pretrain/vid2player3d
  bash train.sh PATH_TO_VID2PLAYER3D
  ```

- Customization: you can modify the training strategy by changing the configuration file and the execution order within `train.sh`.
(1) Change the weights of the different rewards in `reward.yaml`.
(2) Set the correct model for fine-tuning at Line 56 of `EDGE.py`.
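For intuition, the per-reward weights in `reward.yaml` determine a weighted sum of the individual reward terms. The sketch below pictures that combination; the key names and values are hypothetical, not the repository's actual schema:

```python
# Hypothetical weights, standing in for the entries of reward.yaml.
WEIGHTS = {"imitation": 0.5, "penetration": 0.3, "foot_slide": 0.2}

def combine_rewards(terms, weights=WEIGHTS):
    """Weighted sum of per-term reward values; keys must match the weights."""
    return sum(weights[name] * value for name, value in terms.items())

# Example: good imitation, small penalties for penetration and foot sliding.
total = combine_rewards({"imitation": 0.9, "penetration": -0.2, "foot_slide": -0.1})
```

Raising a weight in the config makes its term dominate the total reward, steering fine-tuning toward that objective.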
```
cd code/rl_finetune
bash run.sh exp_name gpu_parallel_num epoch_num batch_size
```

Coming soon!

Coming soon!

Coming soon!
If you find this work useful for your research, please cite our paper:
@misc{jia2026skeleton2stagerewardguidedfinetuningphysically,
title={Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation},
author={Jidong Jia and Youjian Zhang and Huan Fu and Dacheng Tao},
year={2026},
eprint={2602.13778},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2602.13778},
}
This repository is built on top of the following amazing repositories:
- The main code framework is from EDGE.
- The imitation policy is from vid2player3d.
- The SMPL models and layer are from the SMPL-X model.
- The README template is from PHC.

Please follow the licenses of the above repositories for usage.

