Relit-LiVE: Relight Video by Jointly Learning Environment Video

Weiqing Xiao¹,* Hong Li²,³,* Xiuyu Yang⁴,* Houyuan Chen⁵ Wenyi Li⁶ Tianqi Liu⁷ Shaocong Xu² Chongjie Ye⁸ Hao Zhao⁴,²,† Beibei Wang¹,†

¹Nanjing University ²BAAI ³Beihang University ⁴Tsinghua University ⁵HKUST ⁶UCAS ⁷HUST ⁸CUHK-Shenzhen
*Equal contribution. †Corresponding authors.

This repo contains the official code of our paper: Relit-LiVE: Relight Video by Jointly Learning Environment Video.

📊 Overview

We present Relit-LiVE, a novel video relighting framework that produces physically consistent and temporally stable results without needing prior knowledge of camera pose. This is achieved by jointly generating relighting videos and environment videos. Additionally, by integrating real-world lighting effects with intrinsic constraints, the relighting videos demonstrate remarkable physical plausibility, showcasing realistic reflections and shadows.

✨ News

May 8, 2026: Released the project page and inference pipeline.

📝 Check list

Release the arXiv paper and project page.
Release inference code and model checkpoints.
Release gradio code and full inference pipeline (inverse-forward).
Release training code and data pipeline.
Release training dataset.

🛠️ Installation

Minimum requirements

Python 3.10
NVIDIA GPU, with at least 24 GB VRAM recommended
CUDA 12.4 or a compatible version
Model weights prepared under checkpoints/ and models/Wan-AI/Wan2.1-T2V-1.3B/

Recommended environment:

Ubuntu 20.04 or newer
Single-GPU CUDA inference setup

Conda environment

conda create -n diffsynth python=3.10
conda activate diffsynth
pip install -e .
pip install -U deepspeed
pip install transformers==4.50.0
pip install gradio==6.14.0

Optional for full inference pipeline

The cosmos-transfer1-diffusion-renderer repository is required only for the full inference pipeline. Clone it into third_party/ and create the conda environment named cosmos-predict1 by following the instructions in its README.md.

cd third_party
git clone https://github.com/nv-tlabs/cosmos-transfer1-diffusion-renderer.git
...

📦 Checkpoints

Download the Relit-LiVE checkpoints from HuggingFace and place them under checkpoints/.

Checkpoint	Resolution	Frames	Download
`model_frame25_480_832.ckpt`	480 × 832	8n+1, n∈{0,1,2,3} → 1/9/17/25	🤗 Download
`model_frame57_480_832.ckpt`	480 × 832	8n+1, n∈{0,…,7} → 1/9/…/57	🤗 Download
`model_frame1_1024_1472.ckpt`	1024 × 1472	1 (image)	🤗 Download

In addition, inference loads the Wan2.1 base model from models/Wan-AI/Wan2.1-T2V-1.3B/. Make sure all weights are in place before running inference.
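
A quick pre-flight check can catch a missing file before a long model load. The sketch below is not part of the repository: `missing_weights` is a hypothetical helper that only tests the two locations named above, and it assumes it is given the repository root.

```python
from pathlib import Path


def missing_weights(repo_root: str, ckpt_name: str) -> list[Path]:
    """Return the expected weight locations that do not exist yet.

    Hypothetical helper: it checks only the checkpoint file under
    checkpoints/ and the Wan2.1 base-model directory named in this README.
    """
    root = Path(repo_root)
    required = [
        root / "checkpoints" / ckpt_name,
        root / "models" / "Wan-AI" / "Wan2.1-T2V-1.3B",
    ]
    # Keep only the paths that are absent on disk.
    return [path for path in required if not path.exists()]
```

An empty list means both locations are present. It does not verify that the files inside them are complete.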

To reproduce the MIT metrics reported in the paper, load model_frame57_480_832.ckpt and run single-frame inference (--num_frames 1) directly on the test set.
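
The Frames column in the checkpoint table follows the rule 8n+1, which is why the 57-frame checkpoint also accepts a single frame (n = 0). A minimal sketch of that rule follows. The function names are hypothetical and do not exist in the repository.

```python
def supported_frame_counts(max_frames: int) -> list[int]:
    """List every frame count of the form 8n + 1 up to max_frames."""
    if max_frames < 1 or (max_frames - 1) % 8 != 0:
        raise ValueError("max_frames must itself be of the form 8n + 1")
    return [8 * n + 1 for n in range((max_frames - 1) // 8 + 1)]


def is_supported(num_frames: int, max_frames: int) -> bool:
    """Check a requested --num_frames value against a checkpoint's limit."""
    return num_frames in supported_frame_counts(max_frames)
```

For the 25-frame checkpoint this gives 1, 9, 17, and 25, matching the table.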

(Optional, for the full inference pipeline only) Download the cosmos-transfer1-diffusion-renderer checkpoints from HuggingFace and place them under third_party/cosmos-transfer1-diffusion-renderer/checkpoints/, following the instructions in its README.md.

🚀 Inference

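The examples below write results to inference_output/ because they pass --output_dir inference_output. Without that flag, the script falls back to its default output directory, ./results.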

Basic 25-frame relighting

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 480 \
    --width 832 \
    --num_frames 25 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --quality 10

25-frame rotating-light relighting

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 480 \
    --width 832 \
    --num_frames 25 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --use_rotate_light \
    --quality 10

Fixed-frame relighting with width-axis light rotation

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 480 \
    --width 832 \
    --num_frames 25 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --use_fixed_frame_and_w_rotate_light \
    --quality 10
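
To build intuition for what rotating light along the width axis can mean, the sketch below circularly shifts the columns of a small row-major image. It assumes the environment map is a latitude-longitude panorama, where a horizontal rotation is a wrap-around column shift. Whether relit_inference.py works this way, and in which direction it shifts, is an assumption; `rotate_columns` is a hypothetical name.

```python
def rotate_columns(env_map: list[list[float]], shift_px: int) -> list[list[float]]:
    """Circularly shift every row of a row-major image to the right.

    Assumption: the map wraps around horizontally, so pixels pushed past
    the right edge re-enter on the left.
    """
    if not env_map:
        return []
    width = len(env_map[0])
    k = shift_px % width  # normalise negative and oversized shifts
    if k == 0:
        return [list(row) for row in env_map]
    return [row[-k:] + row[:-k] for row in env_map]
```

Shifting by the full map width returns the original image, which is the property that makes a full light rotation loop seamlessly.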

Fixed-frame relighting with height-axis light rotation

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 480 \
    --width 832 \
    --num_frames 25 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --use_fixed_frame_and_h_rotate_light \
    --quality 10

57-frame video relighting

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame57_480_832.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 480 \
    --width 832 \
    --num_frames 57 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --quality 10

Single-frame high-resolution relighting

python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame1_1024_1472.ckpt \
    --output_dir inference_output \
    --cfg_scale 1.0 \
    --height 1024 \
    --width 1472 \
    --num_frames 1 \
    --padding_resolution \
    --use_ref_image \
    --env_map_path datasets/envs/Pink_Sunrise \
    --frame_interval 1 \
    --num_inference_steps 50 \
    --quality 10
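
The six commands above differ only in checkpoint, resolution, frame count, and an optional rotation flag. A small builder can therefore generate them from one place. This is a convenience sketch, not part of the repository: the preset names (`video25`, `video57`, `image`) and the function names are hypothetical, while every flag and path is copied from the examples above.

```python
import shlex
from typing import Optional

# Hypothetical preset names mapping to (checkpoint, height, width, frames).
PRESETS = {
    "video25": ("checkpoints/model_frame25_480_832.ckpt", 480, 832, 25),
    "video57": ("checkpoints/model_frame57_480_832.ckpt", 480, 832, 57),
    "image": ("checkpoints/model_frame1_1024_1472.ckpt", 1024, 1472, 1),
}

ROTATION_FLAGS = (
    "--use_rotate_light",
    "--use_fixed_frame_and_w_rotate_light",
    "--use_fixed_frame_and_h_rotate_light",
)


def build_command(preset: str, rotation_flag: Optional[str] = None,
                  env_map_path: str = "datasets/envs/Pink_Sunrise") -> list[str]:
    """Assemble the argv list for one of the example invocations."""
    ckpt, height, width, num_frames = PRESETS[preset]
    argv = [
        "python", "relit_inference.py",
        "--dataset_path", "datasets/demos",
        "--ckpt_path", ckpt,
        "--output_dir", "inference_output",
        "--cfg_scale", "1.0",
        "--height", str(height),
        "--width", str(width),
        "--num_frames", str(num_frames),
        "--padding_resolution",
        "--use_ref_image",
        "--env_map_path", env_map_path,
        "--frame_interval", "1",
        "--num_inference_steps", "50",
    ]
    if rotation_flag is not None:
        if rotation_flag not in ROTATION_FLAGS:
            raise ValueError(f"unknown rotation flag: {rotation_flag}")
        argv.append(rotation_flag)
    argv += ["--quality", "10"]
    return argv


def format_command(argv: list[str]) -> str:
    """Render an argv list as a single shell-ready string."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run(..., check=True)` from the repository root would launch the run. Printing `format_command(build_command("video57"))` gives a line that can be pasted into a shell.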

📋 Argument reference

The following arguments are defined in parse_args() inside relit_inference.py.

Argument	Type	Default	Description
`--dataset_path`	str	`./example_test_data`	Input dataset directory. The examples above use `datasets/demos`.
`--env_map_path`	str	`None`	External environment map directory. If not provided, the script reads lighting data from each sample.
`--use_ref_image`	flag	`False`	Enable the reference-image branch.
`--use_muti_ref_image`	flag	`False`	Enable multi-reference-image mode. The argument name follows the current code spelling.
`--ref_image_path_with_idddx`	str	`None`	Template path for external reference images. The script replaces `idddx` with the sample index.
`--full_resolution`	flag	`False`	Use the full-resolution input pipeline.
`--padding_resolution`	flag	`False`	Use a padding-based resize strategy to reduce aggressive cropping.
`--dataset_type`	str	`relit-live`	Dataset format. The default matches the Relit-LiVE directory structure in this repository.
`--drop_mr`	flag	`False`	Ignore metallic and roughness conditioning.
`--use_rotate_light`	flag	`False`	Enable dynamic light rotation mode.
`--use_fixed_frame_and_w_rotate_light`	flag	`False`	Keep the first frame fixed and rotate lighting along the environment-map width axis.
`--use_fixed_frame_and_h_rotate_light`	flag	`False`	Keep the first frame fixed and rotate lighting along the environment-map height axis.
`--h_rotate_light`	int	`0`	Apply vertical environment-map rotation to each frame, in degrees.
`--w_rotate_light`	int	`0`	Apply horizontal environment-map rotation to each frame, in pixels.
`--num_frames`	int	`81`	Number of output frames. When set to `1`, the script saves a png; otherwise it saves an mp4.
`--num_inference_steps`	int	`50`	Number of denoising inference steps.
`--frame_interval`	int	`1`	Sampling interval when reading the input video or image sequence.
`--height`	int	`480`	Output height.
`--width`	int	`832`	Output width.
`--ckpt_path`	str	`None`	Path to the checkpoint to load.
`--output_dir`	str	`./results`	Output directory for generated results. The examples above use `inference_output`.
`--output_path`	str	`None`	Explicit output file path. Only `.mp4` and `.png` are supported.
`--dataloader_num_workers`	int	`1`	Number of DataLoader workers.
`--cfg_scale`	float	`5.0`	Classifier-free guidance scale.
`--wo_ref_weight`	float	`0.0`	Weight for the branch without reference-image conditioning.
`--quality`	int	`5`	Video quality value passed to `imageio` when saving mp4 files.
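
Several defaults in the table differ from the values used in the example commands (`--cfg_scale` 5.0 versus 1.0, `--quality` 5 versus 10, `--num_frames` 81 versus 25 or 57), so omitting a flag does not reproduce the examples. The sketch below mirrors a subset of the table with `argparse` to make that visible. It is an illustration written from the table, not the repository's actual `parse_args()`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative subset of the documented arguments and defaults."""
    parser = argparse.ArgumentParser(
        description="Subset of relit_inference.py arguments (sketch)")
    parser.add_argument("--dataset_path", type=str, default="./example_test_data")
    parser.add_argument("--env_map_path", type=str, default=None)
    # Flags are False unless they appear on the command line.
    parser.add_argument("--use_ref_image", action="store_true")
    parser.add_argument("--padding_resolution", action="store_true")
    parser.add_argument("--num_frames", type=int, default=81)
    parser.add_argument("--height", type=int, default=480)
    parser.add_argument("--width", type=int, default=832)
    parser.add_argument("--output_dir", type=str, default="./results")
    parser.add_argument("--cfg_scale", type=float, default=5.0)
    parser.add_argument("--quality", type=int, default=5)
    return parser
```

Calling `build_parser().parse_args([])` shows the values a run would use when no flags are given.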

Notes

Output filenames automatically include parts of the checkpoint name, sequence name, resolution, reference-image mode, environment lighting information, inference steps, frame count, and cfg_scale.
When --num_frames 1 is used, the script writes a png. When --num_frames > 1, it writes an mp4.
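
The png/mp4 rule can be checked before a run when an explicit `--output_path` is passed. The sketch below uses hypothetical helper names. Rejecting a suffix that does not match the frame count is an extra precaution of this sketch, not documented behavior of the script.

```python
from pathlib import Path


def output_suffix(num_frames: int) -> str:
    """Return the file type the README describes for a given frame count."""
    return ".png" if num_frames == 1 else ".mp4"


def check_output_path(output_path: str, num_frames: int) -> Path:
    """Validate an explicit output path against the documented rules."""
    path = Path(output_path)
    suffix = path.suffix.lower()
    if suffix not in (".mp4", ".png"):
        raise ValueError("only .mp4 and .png output paths are supported")
    # Precaution added by this sketch: the suffix should match the frame count.
    if suffix != output_suffix(num_frames):
        raise ValueError(
            f"{num_frames} frame(s) would be saved as {output_suffix(num_frames)}")
    return path
```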

🚀 Full inference pipeline (gradio)

Please make sure you have the following items ready:

conda environment named diffsynth.
conda environment named cosmos-predict1.
./checkpoints/*.ckpt.
./third_party/cosmos-transfer1-diffusion-renderer.
./third_party/cosmos-transfer1-diffusion-renderer/checkpoints/Cosmos-Tokenize1-CV8x8x8-720p and ./third_party/cosmos-transfer1-diffusion-renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B.

Then launch the Gradio interface:

conda activate diffsynth
python run_full_inference_gradio.py

📌 Future plans

This project will be actively maintained. Feel free to try it out and send feedback to 15770575681@163.com.

The current plan includes a model version specifically designed for portraits and another that is better suited for handling motion (including camera and scene dynamics).

🤝 Citation

If you find this repository helpful, please consider citing our paper:

@article{xiao2026relit,
  title={Relit-LiVE: Relight Video by Jointly Learning Environment Video},
  author={Xiao, Weiqing and Li, Hong and Yang, Xiuyu and Chen, Houyuan and Li, Wenyi and Liu, Tianqi and Xu, Shaocong and Ye, Chongjie and Zhao, Hao and Wang, Beibei},
  journal={arXiv preprint arXiv:2605.06658},
  year={2026}
}

📝 Acknowledgements

The code is built on DiffSynth-Studio and diffusion-renderer. Thanks to all the authors for their excellent contributions!
