Official implementation of paper: "Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation". Prior dance generation methods often operate on sparse skeletons and overlook geometric constraints of the human body, leading to artifacts such as penetration and foot sliding. We propose a reward-guided fine-tuning framework that aligns generated motions with body geometry, improving physical plausibility.
[February 2, 2026] Training code for imitation policy released.
[January 30, 2026] Training and Evaluation code for EDGE released.
- Support training on GPUs newer than A100.
- Installation guidance.
- Release imitation policy training code.
- Release training code.
- Release evaluation code.
- Release rendering code.
- Release guidance for custom models and rewards.
We identify and address a critical yet often-overlooked gap between skeleton-based motion generation and mesh-level body visualization. Specifically, we leverage a physics-based humanoid controller to evaluate the physical plausibility of generated motions and convert its feedback into a reward that penalizes violations of physical laws, especially constraints arising from human body geometry. Combined with complementary reward signals, this reward design enables us to fine-tune a generative model with reinforcement learning to produce motions that remain physically plausible when visualized on a human body mesh.
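As a concrete illustration of one such geometric artifact, a common proxy for foot sliding is the horizontal displacement of a foot joint accumulated while the foot is in ground contact. The sketch below is a generic version of this proxy, not the paper's exact metric; the contact-height threshold, z-up convention, and NumPy dependency are assumptions:

```python
import numpy as np

def foot_slide(foot_pos, contact_height=0.05):
    """Horizontal foot displacement accumulated over frames where the foot
    is near the ground -- a simple proxy for the foot-sliding artifact.

    foot_pos: (T, 3) array of one foot joint's positions, z-up, in meters.
    contact_height: height (m) below which the foot counts as in contact.
    """
    on_ground = foot_pos[:-1, 2] < contact_height            # contact at frame t
    step = np.linalg.norm(np.diff(foot_pos[:, :2], axis=0), axis=1)
    return float(step[on_ground].sum())
```

A motion whose feet are stationary during contact yields a slide of zero; planted feet that drift horizontally accumulate a positive value.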
All evaluation is done using the mean SMPL body shape.
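For reference, the mean SMPL body shape corresponds to all-zero shape coefficients (betas), since SMPL's shape blendshapes form a PCA space centered on the mean template mesh; a minimal sketch (the 10-dimensional betas are the standard SMPL setting):

```python
# The mean SMPL body shape is obtained by setting all shape coefficients
# (betas) to zero; SMPL's shape space is a PCA around the mean template mesh.
NUM_BETAS = 10  # standard SMPL shape-space dimensionality
mean_betas = [0.0] * NUM_BETAS
```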
To set up the environment, follow these steps:

- Clone the project:

  ```
  git clone https://github.com/jjd1123/Skeleton2Stage.git
  ```

- Create a new conda environment and install PyTorch and the other dependencies:

  ```
  conda create -n isaac python=3.8
  conda activate isaac
  pip install -r requirement.txt
  ```

- Download and set up Isaac Gym.
- Download MuJoCo version 2.1 for Linux.
- Install torch-mesh-isect for the body penetration rate evaluation.
- Configure your paths in `environment.sh`.
- For a cleaner project layout, you can place Isaac Gym, MuJoCo, and torch-mesh-isect under the `environment/` directory.
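After installation, a quick way to confirm the environment is complete is to check that each dependency is importable. This is a generic helper, not part of the repository; the module names in the comment are illustrative:

```python
import importlib.util

def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Illustrative check; adjust the names to match your actual setup, e.g.:
# missing_modules(["torch", "isaacgym", "mujoco_py"])
```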
This repository additionally depends on the following libraries, which may require special installation procedures:

- jukemirlib
- pytorch3d
- accelerate
  - Note: after installation, don't forget to run `accelerate config`. We use fp16.
- Place the SMPL files under `body_models/` like the following:

```
body_models/
├── README.md                # This guide file
│
├── smpl/
│   ├── J_regressor_extra.npy
│   ├── kintree_table.pkl
│   ├── smplfaces.npy
│   ├── SMPL_FEMALE.pkl
│   ├── SMPL_MALE.pkl
│   └── SMPL_NEUTRAL.pkl
│
├── smplh/
│   ├── female/
│   │   └── model.npz
│   ├── male/
│   │   └── model.npz
│   └── neutral/
│       └── model.npz
│
└── smplx/
    ├── female/
    │   └── model.npz
    ├── male/
    │   └── model.npz
    └── neutral/
        └── model.npz
```
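To catch misplaced model files early, the layout above can be verified programmatically. A small sketch using only the standard library; the file list is a representative subset of the tree, not an exhaustive manifest:

```python
from pathlib import Path

# Representative files expected under body_models/ (subset of the tree above).
EXPECTED = [
    "smpl/SMPL_FEMALE.pkl",
    "smpl/SMPL_MALE.pkl",
    "smpl/SMPL_NEUTRAL.pkl",
    "smplh/neutral/model.npz",
    "smplx/neutral/model.npz",
]

def missing_body_models(root="body_models"):
    """Return the expected model files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```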
Before evaluation, make sure you have:
(1) the correct settings in the metric computation scripts, and
(2) the correct model set at Line 56 of `EDGE.py`.
```
cd code/rl_finetune
bash eval.sh exp_name epoch_num motion_save_root ckpt_root cached_music_features
```

Coming soon!

To fine-tune the generative model, you need the following:
- A base checkpoint of the generative model;
- Conditioning data for training-time sampling;
- A checkpoint of the trained imitation policy.
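The fine-tuning itself is reinforcement learning against the reward: motions are sampled from the generative model, scored by the reward, and the model is updated to increase expected reward. Below is a heavily simplified, self-contained sketch of that loop, using a toy Gaussian "generator" and a stand-in reward rather than the actual diffusion model or physics-based reward; every name and number here is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(sample):
    # Stand-in for the physics-based reward: higher when the sample is
    # closer to a fixed "physically plausible" target.
    target = np.array([1.0, -1.0])
    return -np.linalg.norm(sample - target)

# Toy "generator": a Gaussian policy whose mean is the learnable parameter.
mu = np.zeros(2)
lr, sigma = 0.05, 0.5

for step in range(300):
    samples = mu + sigma * rng.standard_normal((32, 2))      # sample motions
    rewards = np.array([reward(s) for s in samples])          # score them
    adv = rewards - rewards.mean()                            # baseline subtraction
    # REINFORCE gradient estimate for the mean of a Gaussian policy
    grad = (adv[:, None] * (samples - mu) / sigma**2).mean(axis=0)
    mu += lr * grad                                           # ascend expected reward
```

After training, the policy mean drifts toward the high-reward region; the actual framework replaces the toy pieces with the EDGE model and the imitation-policy-derived reward.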
In this section, we provide the data (preprocessed data and the pretrained imitation policy) for a minimal example: fine-tuning EDGE on AIST++. You can directly run the following scripts:
```
cd code/rl_finetune
# Download the EDGE checkpoint.
bash download_mode.sh
# Download the preprocessed AIST++ data and the pretrained imitation policy.
bash download_data.sh
```

- Note: We use DVC for dataset version control. Please follow the README in the downloaded data directory for a quick setup.
We will also explain how to prepare your own datasets and pre-trained models below.
To train your own imitation policy, follow the instructions in Training Imitation Policy.

Coming soon!

- Prepare the expert dataset: prepare the expert dataset for imitation policy training by following the guide in the Vid2player3d README.
- Run the training script: once the dataset is ready, start training by executing the `train.sh` script:

  ```
  cd code/pretrain/vid2player3d
  bash train.sh PATH_TO_VID2PLAYER3D
  ```

- Customization: you can modify the training strategy by changing the configuration file and the execution order within `train.sh`.
(1) Change the weights of the different rewards in `reward.yaml`.
(2) Set the correct model for fine-tuning at Line 56 of `EDGE.py`.
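For intuition, the per-reward weights in `reward.yaml` determine a weighted sum of the individual reward terms. The sketch below pictures that combination; the key names and values are hypothetical, not the repository's actual schema:

```python
# Hypothetical weights, standing in for the entries of reward.yaml.
WEIGHTS = {"imitation": 0.5, "penetration": 0.3, "foot_slide": 0.2}

def combine_rewards(terms, weights=WEIGHTS):
    """Weighted sum of per-term reward values; keys must match the weights."""
    return sum(weights[name] * value for name, value in terms.items())

# Example: good imitation, small penalties for penetration and foot sliding.
total = combine_rewards({"imitation": 0.9, "penetration": -0.2, "foot_slide": -0.1})
```

Raising a weight in the config makes its term dominate the total reward, steering fine-tuning toward that objective.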
```
cd code/rl_finetune
bash run.sh exp_name gpu_parallel_num epoch_num batch_size
```

Coming soon!

Coming soon!

Coming soon!
If you find this work useful for your research, please cite our paper:
@misc{jia2026skeleton2stagerewardguidedfinetuningphysically,
title={Skeleton2Stage: Reward-Guided Fine-Tuning for Physically Plausible Dance Generation},
author={Jidong Jia and Youjian Zhang and Huan Fu and Dacheng Tao},
year={2026},
eprint={2602.13778},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2602.13778},
}
This repository is built on top of the following amazing repositories:
- The main code framework is from EDGE.
- The imitation policy is from vid2player3d.
- The SMPL models and layer are from the SMPL-X model.
- The README template is from PHC.

Please follow the licenses of the above repositories for usage.

