SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation

Project Page | Paper (arXiv)
📢 Official repository of SyncAnimation. The paper has been accepted to IJCAI 2025.

SyncAnimation Demo

“Generating talking avatar driven by audio remains a significant challenge. Existing methods typically require high computational costs and often lack sufficient facial detail and realism, making them unsuitable for applications that demand high real-time performance and visual quality. Additionally, while some methods can synchronize lip movement, they still face issues with consistency between facial expressions and upper body movement, particularly during silent periods. In this paper, we introduce SyncAnimation, the first NeRF-based method that achieves audio-driven, stable, and real-time generation of speaking avatar by combining generalized audio-to-pose matching and audio-to-expression synchronization. By integrating AudioPose Syncer and AudioEmotion Syncer, SyncAnimation achieves high-precision poses and expression generation, progressively producing audio-synchronized upper body, head, and lip shapes. Furthermore, the High-Synchronization Human Renderer ensures seamless integration of the head and upper body, and achieves audio-sync lip.”

🧠 Introduction

Most existing audio-driven talking head synthesis methods animate only the facial region and paste the remaining parts, such as the torso, from the original image. As a result, facial movements, lips, and body motion fall out of sync with the audio. SyncAnimation addresses this issue by ensuring:

  • Audio-Body Consistency
  • Audio-Face Consistency
  • Audio-Lips Consistency


🛠 Installation & Dependencies

Linux / Ubuntu

The environment setup of this project follows the installation process of SyncTalk. Below is the recommended installation process on Ubuntu (tested on Ubuntu 20.04 with PyTorch 1.12.1 + CUDA 11.3):

git clone https://github.com/syncanimation/syncanimation.git
cd syncanimation

# It is recommended to use a conda environment
conda create -n syncanimation python==3.8.8
conda activate syncanimation

# Install PyTorch and torchvision (choose versions according to your CUDA version)
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

sudo apt-get install portaudio19-dev
pip install -r requirements.txt

# Install required modules (freqencoder / gridencoder / shencoder / raymarching)
pip install ./freqencoder
pip install ./shencoder
pip install ./gridencoder
pip install ./raymarching

# Install PyTorch3D (if issues occur, use the fallback script)
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1121/download.html
# Or:
python ./scripts/install_pytorch3d.py

# Install TensorFlow GPU version 
pip install tensorflow-gpu==2.8.1
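
After running the commands above, it can help to confirm that the pinned packages actually ended up in the active environment. The following is a minimal sketch, not part of this repository: the `check_environment` helper and the `EXPECTED` table are hypothetical, and the assumption that each pip package exposes a module of the same name is ours.

```python
import sys
from importlib import metadata
from importlib.util import find_spec

# Version pins copied from the install commands above.
# Assumption: each distribution exposes an importable module of the same name.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "torchaudio": "0.12.1",
    "pytorch3d": None,  # no version is pinned in the commands above
}


def check_environment(expected):
    """Return {name: (status, installed_version)} for each expected package.

    status is "missing", "unknown", "mismatch", or "ok".
    """
    report = {}
    for name, wanted in expected.items():
        # find_spec locates the module without importing it.
        if find_spec(name) is None:
            report[name] = ("missing", None)
            continue
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop the local build tag so "1.12.1+cu113" compares equal to "1.12.1".
        public = installed.split("+")[0] if installed else None
        if wanted is None:
            status = "ok"
        elif public is None:
            status = "unknown"
        elif public == wanted:
            status = "ok"
        else:
            status = "mismatch"
        report[name] = (status, installed)
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, (status, version) in check_environment(EXPECTED).items():
        print(f"{name:12s} {status:9s} {version or '-'}")
```

The check only inspects installed metadata, so it does not verify that CUDA is usable at runtime or that the compiled extensions (freqencoder, shencoder, gridencoder, raymarching) load correctly.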

Note: You may encounter compatibility issues when installing PyTorch3D. If so, use the scripts/install_pytorch3d.py script as a fallback.
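
One common source of PyTorch3D trouble is a wheel index that does not match the local Python, CUDA, and PyTorch versions. The sketch below shows how the directory name in the wheel URL above (`py38_cu113_pyt1121`) appears to be composed. The `pytorch3d_wheel_index` function is a hypothetical helper, the naming pattern is inferred from that single URL, and a prebuilt wheel is not guaranteed to exist for every combination.

```python
def pytorch3d_wheel_index(python_version, cuda_version, torch_version):
    """Build the wheel index URL for a Python / CUDA / PyTorch combination.

    Assumption: the directory name follows the pattern seen in the install
    command above, i.e. py<major><minor>_cu<cuda digits>_pyt<torch digits>.
    """
    # "3.8.8" -> "py38" (only major and minor are used)
    py_tag = "py" + "".join(python_version.split(".")[:2])
    # "11.3" -> "cu113"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    # "1.12.1+cu113" -> "pyt1121" (the local build tag is dropped first)
    torch_tag = "pyt" + torch_version.split("+")[0].replace(".", "")
    return (
        "https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/"
        f"{py_tag}_{cuda_tag}_{torch_tag}/download.html"
    )


if __name__ == "__main__":
    # Reproduces the URL used in the install command for the tested setup.
    print(pytorch3d_wheel_index("3.8.8", "11.3", "1.12.1+cu113"))
```

If the generated URL does not resolve for your setup, the scripts/install_pytorch3d.py fallback mentioned in the note is the safer route.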


📝 Citation

Please cite the following paper if you use this method or model, or conduct derivative research based on this project:

@inproceedings{ijcai2025p185,
  title     = {SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation},
  author    = {Liu, Yujian and Xu, Shidang and Guo, Jing and Wang, Dingbin and Wang, Zairan and Tan, Xianfeng and Liu, Xiaoli},
  booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on
               Artificial Intelligence, {IJCAI-25}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {James Kwok},
  pages     = {1657--1665},
  year      = {2025},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2025/185},
  url       = {https://doi.org/10.24963/ijcai.2025/185},
}

🙏 Acknowledgements

This project is built upon or inspired by the following open-source projects:

  • SyncTalk
  • ER-NeRF
  • GeneFace
  • AD-NeRF
  • Deep3DFaceRecon_pytorch

We sincerely thank the authors of these projects for their contributions to the open-source community.


⚠️ Disclaimer

By using this project, you agree to comply with all applicable laws and regulations. You must not use it to generate or disseminate harmful content. The developers assume no responsibility for any direct, indirect, or consequential damages arising from the use or misuse of this software.
