PersonaLive!: Expressive Portrait Image Animation for Live Streaming

Zhiyuan Li¹,²,³ · Chi-Man Pun¹,📪 · Chen Fang² · Jue Wang² · Xiaodong Cun³,📪

¹ University of Macau ² Dzine.ai ³ GVC Lab, Great Bay University

📋 TODO

If you find PersonaLive useful or interesting, please give us a Star🌟! Your support drives us to keep improving.
Fix bugs (If you encounter any issues, please feel free to open an issue or contact me! 🙏)
Enhance WebUI (Support reference image replacement).
[2025.12.22] 🔥 Added a streaming strategy to offline inference for generating long videos on 12GB VRAM!
[2025.12.15] 🔥 Released the paper!
[2025.12.17] 🔥 ComfyUI-PersonaLive is now supported! (Thanks to @okdalto)
[2025.12.12] 🔥 Released inference code, config, and pretrained weights!

⚙️ Framework

We present PersonaLive, a real-time and streamable diffusion framework capable of generating infinite-length portrait animations on a single 12GB GPU.

🚀 Getting Started

🛠 Installation

# clone this repo
git clone https://github.com/GVCLab/PersonaLive
cd PersonaLive

# Create conda environment
conda create -n personalive python=3.10
conda activate personalive

# Install packages with pip
pip install -r requirements_base.txt

⏬ Download weights

Option 1: Download pre-trained weights of base models and other components (sd-image-variations-diffusers and sd-vae-ft-mse). You can run the following command to download weights automatically:

python tools/download_weights.py

Option 2: Download pre-trained weights into the ./pretrained_weights folder from one of the below URLs:

Finally, these weights should be organized as follows:

pretrained_weights
├── onnx
│   ├── unet_opt
│   │   ├── unet_opt.onnx
│   │   └── unet_opt.onnx.data
│   └── unet
├── personalive
│   ├── denoising_unet.pth
│   ├── motion_encoder.pth
│   ├── motion_extractor.pth
│   ├── pose_guider.pth
│   ├── reference_unet.pth
│   └── temporal_module.pth
├── sd-vae-ft-mse
│   ├── diffusion_pytorch_model.bin
│   └── config.json
├── sd-image-variations-diffusers
│   ├── image_encoder
│   │   ├── pytorch_model.bin
│   │   └── config.json
│   ├── unet
│   │   ├── diffusion_pytorch_model.bin
│   │   └── config.json
│   └── model_index.json
└── tensorrt
    └── unet_work.engine
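
To confirm the layout before running inference, a small check like the following can help. It is a minimal sketch and not part of the repository: the function name `missing_weight_files` is hypothetical, and treating every file in the tree above as required is an assumption (for example, the TensorRT engine may only matter if you use TensorRT acceleration).

```python
from pathlib import Path

# Relative paths copied from the directory tree above. Treating all of them
# as required is an assumption; trim the list to match your setup.
EXPECTED_FILES = [
    "onnx/unet_opt/unet_opt.onnx",
    "onnx/unet_opt/unet_opt.onnx.data",
    "personalive/denoising_unet.pth",
    "personalive/motion_encoder.pth",
    "personalive/motion_extractor.pth",
    "personalive/pose_guider.pth",
    "personalive/reference_unet.pth",
    "personalive/temporal_module.pth",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "sd-image-variations-diffusers/image_encoder/pytorch_model.bin",
    "sd-image-variations-diffusers/image_encoder/config.json",
    "sd-image-variations-diffusers/unet/diffusion_pytorch_model.bin",
    "sd-image-variations-diffusers/unet/config.json",
    "sd-image-variations-diffusers/model_index.json",
    "tensorrt/unet_work.engine",
]


def missing_weight_files(root="pretrained_weights"):
    """Return the expected files that are not present under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weight_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected weight files are present.")
```

Run it from the repository root so that the default `pretrained_weights` path resolves correctly.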

🎞️ Offline Inference

Run offline inference with the default configuration:

python inference_offline.py

-L: Max number of frames to generate. (Default: 100)
--use_xformers: Enable xFormers memory efficient attention. (Default: True)
--stream_gen: Enable streaming generation strategy. (Default: True)
--reference_image: Path to a specific reference image. Overrides settings in config.
--driving_video: Path to a specific driving video. Overrides settings in config.

⚠️ Note for RTX 50-Series (Blackwell) Users: xformers is not yet fully compatible with the new architecture. To avoid crashes, please disable it by running:

python inference_offline.py --use_xformers False
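
If you launch offline inference from another script, the options above can be assembled programmatically. The sketch below is illustrative only: `build_offline_command` is a hypothetical helper, the flag names are taken from the list above, and passing `-L` with a space-separated integer is an assumption based on common command-line conventions.

```python
import sys


def build_offline_command(max_frames=None, use_xformers=None,
                          reference_image=None, driving_video=None):
    """Build an argv list for inference_offline.py from the documented flags.

    Options left as None are omitted, so the script's defaults apply.
    """
    cmd = [sys.executable, "inference_offline.py"]
    if max_frames is not None:
        # Assumed syntax: "-L <int>"
        cmd += ["-L", str(max_frames)]
    if use_xformers is not None:
        # Matches the documented form "--use_xformers False"
        cmd += ["--use_xformers", str(use_xformers)]
    if reference_image is not None:
        cmd += ["--reference_image", str(reference_image)]
    if driving_video is not None:
        cmd += ["--driving_video", str(driving_video)]
    return cmd
```

For example, `build_offline_command(max_frames=200, use_xformers=False)` returns the current interpreter path followed by `inference_offline.py -L 200 --use_xformers False`. Pass the list to `subprocess.run(cmd, check=True)` from the repository root to start the run.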

📸 Online Inference

📦 Setup Web UI

# install Node.js 18+
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash
nvm install 18

cd webcam
source start.sh

🏎️ Acceleration (Optional)

Converting the model to TensorRT can significantly speed up inference (~ 2x ⚡️). Building the engine may take about 20 minutes depending on your device. Note that TensorRT optimizations may lead to slight variations or a small drop in output quality.

pip install -r requirements_trt.txt

python torch2trt.py

The provided TensorRT model was built on an H100. We recommend that all users (including H100 users) re-run python torch2trt.py locally to ensure the best compatibility.

▶️ Start Streaming

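# Run one of the following, depending on the acceleration mode you want:
python inference_online.py --acceleration none      # for RTX 50-Series
python inference_online.py --acceleration xformers
python inference_online.py --acceleration tensorrt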

Then open http://0.0.0.0:7860 in your browser. (If http://0.0.0.0:7860 does not work, try http://localhost:7860.)
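
If the page does not load, it can help to confirm that the server is actually listening before you troubleshoot the browser. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, not part of PersonaLive, and the port 7860 comes from the URL above.

```python
import socket
import time


def wait_for_port(host="localhost", port=7860, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough to
            # show that something is accepting connections on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` after starting `inference_online.py` returns `True` as soon as the port accepts connections. It only checks reachability and says nothing about whether the WebUI itself has finished loading.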

How to use: Upload Image ➡️ Fuse Reference ➡️ Start Animation ➡️ Enjoy! 🎉

Regarding Latency: Latency varies depending on your device's computing power. You can try the following methods to optimize it:

Lower the "Driving FPS" setting in the WebUI to reduce the computational workload.
Increase the multiplier in the line below (e.g., set it to num_frames_needed * 4 or higher) to better match your device's inference speed.

webcam/util.py, line 73 at commit 6953d1a:

read_size = min(queue.qsize(), num_frames_needed * 3)
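
To see what changing the multiplier does to this expression, the sketch below reproduces it with the multiplier as a parameter. The function `read_size` and the numbers used are hypothetical and chosen only for illustration; what util.py does with the frames after reading them is not covered here.

```python
def read_size(queued_frames, num_frames_needed, multiplier=3):
    """Same expression as the quoted line, with the multiplier as a parameter."""
    return min(queued_frames, num_frames_needed * multiplier)


if __name__ == "__main__":
    # Hypothetical numbers: 4 frames needed per step, 20 frames in the queue.
    for multiplier in (3, 4, 6):
        print(multiplier, read_size(20, 4, multiplier))
    # prints:
    # 3 12
    # 4 16
    # 6 20
```

With 20 frames queued, a larger multiplier lets each read take more frames, up to the point where the queue size itself becomes the limit (here at a multiplier of 6, since 4 * 6 = 24 exceeds 20).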

📚 Community Contribution

Special thanks to the community for providing helpful setups! 🥂

Windows + RTX 50-Series Guide: Thanks to @dknos for providing a detailed guide on running this project on Windows with Blackwell GPUs.
TensorRT on Windows: If you are trying to convert TensorRT models on Windows, this discussion might be helpful. Special thanks to @MaraScott and @Jeremy8776 for their insights.
ComfyUI: Thanks to @okdalto for helping implement the ComfyUI-PersonaLive support.

🎬 More Results

👀 Visualization results

[Demo videos: demo_1.mp4, demo_2.mp4, demo_3.mp4, demo_4.mp4, demo_5.mp4, demo_6.mp4, demo_7.mp4, demo_8.mp4, demo_9.mp4, demo_0.mp4]

🤺 Comparisons

[Comparison videos: same_id.mp4, cross_id_1.mp4, cross_id_2.mp4]

⭐ Citation

If you find PersonaLive useful for your research, please cite our work using the following BibTeX:

@article{li2025personalive,
  title={PersonaLive! Expressive Portrait Image Animation for Live Streaming},
  author={Li, Zhiyuan and Pun, Chi-Man and Fang, Chen and Wang, Jue and Cun, Xiaodong},
  journal={arXiv preprint arXiv:2512.11253},
  year={2025}
}

❤️ Acknowledgement

This code is mainly built upon Moore-AnimateAnyone, X-NeMo, StreamDiffusion, RAIN, and LivePortrait. We thank the authors for their invaluable contributions.
