SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation
Project Page | Paper (arXiv)
📢 Official repository of SyncAnimation. The paper has been accepted to IJCAI 2025.
Most existing audio-driven talking head synthesis methods focus only on the facial region and paste other parts, such as the torso, from the original image, which leads to inconsistency between the audio and the facial movements, lips, and body motion. SyncAnimation addresses this issue by ensuring:
- Audio-Body Consistency
- Audio-Face Consistency
- Audio-Lips Consistency
The environment setup of this project follows the installation process of SyncTalk. Below is the recommended installation process on Ubuntu (tested on Ubuntu 20.04 with PyTorch 1.12.1 + CUDA 11.3):
git clone https://github.com/syncanimation/syncanimation.git
cd syncanimation
# It is recommended to use a conda environment
conda create -n syncanimation python==3.8.8
conda activate syncanimation
# Install PyTorch and torchvision (choose versions according to your CUDA version)
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
sudo apt-get install portaudio19-dev
pip install -r requirements.txt
# Install required modules (freqencoder / gridencoder / shencoder / raymarching)
pip install ./freqencoder
pip install ./shencoder
pip install ./gridencoder
pip install ./raymarching
# Install PyTorch3D (if issues occur, use the fallback script)
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1121/download.html
# Or:
python ./scripts/install_pytorch3d.py
# Install TensorFlow GPU version
pip install tensorflow-gpu==2.8.1
Note: You may encounter compatibility issues when installing PyTorch3D. If so, use the scripts/install_pytorch3d.py script as a fallback.
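After running the steps above, a quick sanity check can help confirm that the main packages are visible to the interpreter and that the PyTorch3D wheel directory matches your versions. The following is a minimal sketch and not part of the repository: the helper names `check_modules` and `pytorch3d_wheel_tag` are hypothetical, and the wheel tag format is an assumption inferred from the `py38_cu113_pyt1121` directory used in the install command.

```python
import importlib.util
from typing import Dict, Iterable

# Import names of the packages installed by the steps above.
EXPECTED_MODULES = ("torch", "torchvision", "torchaudio", "pytorch3d", "tensorflow")


def check_modules(modules: Iterable[str] = EXPECTED_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located.

    Nothing is imported, so this is fast and does not initialise CUDA.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def pytorch3d_wheel_tag(python_version: str, torch_version: str, cuda_version: str) -> str:
    """Build a wheel directory tag such as 'py38_cu113_pyt1121'.

    Assumption: the tag is 'py<major><minor>_cu<cuda digits>_pyt<torch digits>',
    which matches the directory name in the pip command above.
    """
    major, minor = python_version.split(".")[:2]
    torch_digits = torch_version.split("+")[0].replace(".", "")  # drop '+cu113' suffix
    cuda_digits = cuda_version.replace(".", "")
    return f"py{major}{minor}_cu{cuda_digits}_pyt{torch_digits}"


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:12s} {'found' if found else 'MISSING'}")
    # Tag for the tested configuration (Python 3.8.8, PyTorch 1.12.1, CUDA 11.3).
    print(pytorch3d_wheel_tag("3.8.8", "1.12.1+cu113", "11.3"))
```

If the tag computed for your own versions differs from `py38_cu113_pyt1121`, the prebuilt wheel URL above probably does not apply to your setup, and the fallback script is the safer route.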
Please cite the following paper if you use this method, model, or conduct derivative research based on this project:
@inproceedings{ijcai2025p185,
title = {SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation},
author = {Liu, Yujian and Xu, Shidang and Guo, Jing and Wang, Dingbin and Wang, Zairan and Tan, Xianfeng and Liu, Xiaoli},
booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on
Artificial Intelligence, {IJCAI-25}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
editor = {James Kwok},
pages = {1657--1665},
year = {2025},
month = {8},
note = {Main Track},
doi = {10.24963/ijcai.2025/185},
url = {https://doi.org/10.24963/ijcai.2025/185},
}
This project is built upon or inspired by the following open-source projects:
- SyncTalk
- ER-NeRF
- GeneFace
- AD-NeRF
- Deep3DFaceRecon_pytorch
We sincerely thank the authors of these projects for their contributions to the open-source community.
By using this project, you agree to comply with all applicable laws and regulations. You must not use it to generate or disseminate harmful content. The developers assume no responsibility for any direct, indirect, or consequential damages arising from the use or misuse of this software.

