SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation

Project Page | Paper (arXiv)
📢 Official repository of SyncAnimation. The paper has been accepted to IJCAI 2025.

“Generating talking avatar driven by audio remains a significant challenge. Existing methods typically require high computational costs and often lack sufficient facial detail and realism, making them unsuitable for applications that demand high real-time performance and visual quality. Additionally, while some methods can synchronize lip movement, they still face issues with consistency between facial expressions and upper body movement, particularly during silent periods. In this paper, we introduce SyncAnimation, the first NeRF-based method that achieves audio-driven, stable, and real-time generation of speaking avatar by combining generalized audio-to-pose matching and audio-to-expression synchronization. By integrating AudioPose Syncer and AudioEmotion Syncer, SyncAnimation achieves high-precision poses and expression generation, progressively producing audio-synchronized upper body, head, and lip shapes. Furthermore, the High-Synchronization Human Renderer ensures seamless integration of the head and upper body, and achieves audio-sync lip.”

🧠 Introduction

Most existing audio-driven talking head synthesis methods animate only the facial region and paste the remaining parts, such as the torso, from the original image. As a result, facial movements, lips, and body motion fall out of sync with the audio. SyncAnimation addresses this issue by ensuring:

Audio-Body Consistency
Audio-Face Consistency
Audio-Lips Consistency

🛠 Installation & Dependencies

Linux / Ubuntu

The environment setup follows the installation process of SyncTalk. The steps below are the recommended procedure on Ubuntu (tested on Ubuntu 20.04 with PyTorch 1.12.1 + CUDA 11.3):

git clone https://github.com/syncanimation/syncanimation.git
cd syncanimation

# It is recommended to use a conda environment
conda create -n syncanimation python==3.8.8
conda activate syncanimation

# Install PyTorch and torchvision (choose versions according to your CUDA version)
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

sudo apt-get install portaudio19-dev
pip install -r requirements.txt

# Install required modules (freqencoder / gridencoder / shencoder / raymarching)
pip install ./freqencoder
pip install ./shencoder
pip install ./gridencoder
pip install ./raymarching

# Install PyTorch3D (if issues occur, use the fallback script)
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1121/download.html
# Or:
python ./scripts/install_pytorch3d.py

# Install TensorFlow GPU version 
pip install tensorflow-gpu==2.8.1

Note: You may encounter compatibility issues when installing PyTorch3D. It is recommended to use the scripts/install_pytorch3d.py script as a fallback.
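After the steps above, a quick check that the main packages are visible to the interpreter can save debugging time later. The following is a minimal sketch, not part of the repository: the helper name `check_environment` and the `EXPECTED` list are illustrative, and the import names are assumed from the install commands above. It only looks modules up and does not import them, so it does not confirm that CUDA or the compiled extensions work.

```python
import importlib.metadata
import importlib.util

# Import names assumed from the install commands above; adjust if they differ.
EXPECTED = ["torch", "torchvision", "torchaudio", "pytorch3d", "tensorflow"]


def check_environment(modules):
    """Map each top-level module name to a version string, "unknown", or None.

    None      -> the module cannot be found by the import system.
    "unknown" -> the module is found, but no distribution with the same name
                 is registered (e.g. a package installed under another name).
    """
    report = {}
    for name in modules:
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            report[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            report[name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_environment(EXPECTED).items():
        status = "MISSING" if version is None else version
        print(f"{name:12s} {status}")
```

A package installed under a different distribution name (for example `tensorflow-gpu`, which provides the `tensorflow` module) would be reported as "unknown" rather than with a version number.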

📝 Citation

Please cite the following paper if you use this method or model, or conduct derivative research based on this project:

@inproceedings{ijcai2025p185,
  title     = {SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation},
  author    = {Liu, Yujian and Xu, Shidang and Guo, Jing and Wang, Dingbin and Wang, Zairan and Tan, Xianfeng and Liu, Xiaoli},
  booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on
               Artificial Intelligence, {IJCAI-25}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {James Kwok},
  pages     = {1657--1665},
  year      = {2025},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2025/185},
  url       = {https://doi.org/10.24963/ijcai.2025/185},
}

🙏 Acknowledgements

This project is built upon or inspired by the following open-source projects:

SyncTalk
ER-NeRF
GeneFace
AD-NeRF
Deep3DFaceRecon_pytorch

We sincerely thank the authors of these projects for their contributions to the open-source community.

⚠️ Disclaimer

By using this project, you agree to comply with all applicable laws and regulations. You must not use it to generate or disseminate harmful content. The developers assume no responsibility for any direct, indirect, or consequential damages arising from the use or misuse of this software.
