Identity-as-Presence

Official implementation of "Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation"

Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation
Yingjie Chen, Shilun Lin, Cai Xing, Qixin Yan, Wenjing Wang, Dingming Liu, Hao Liu, Chen Li, Jing LYU

💡 Abstract

Recent advances have demonstrated compelling capabilities in synthesizing real individuals into generated videos, reflecting the growing demand for identity-aware content creation. Nevertheless, an openly accessible framework enabling fine-grained control over facial appearance and voice timbre across multiple identities remains unavailable. In this work, we present a unified and scalable framework for identity-aware joint audio-video generation, enabling high-fidelity and consistent personalization. Specifically, we introduce a data curation pipeline that automatically extracts identity-bearing information with paired annotations across audio and visual modalities, covering diverse scenarios from single-subject to multi-subject interactions. We further propose a flexible and scalable identity injection mechanism for single- and multi-subject scenarios, in which both facial appearance and vocal timbre act as identity-bearing control signals. Moreover, in light of modality disparity, we design a multi-stage training strategy to accelerate convergence and enforce cross-modal coherence. Experiments demonstrate the superiority of the proposed framework.

🔥 Updates

(2026-03-18) The project page, demo video and technical report are released.

📑 TODO List

Release inference code and model weights for single-subject scenarios
Release inference code and model weights for multi-subject scenarios

Usage

Environment

$ pip install -r requirements.txt
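After installation, a quick check can confirm that the packages named in requirements.txt are visible to the current interpreter. The sketch below is not part of this repository: `missing_requirements` is a hypothetical helper, and it only checks that each package is installed, not that the installed version satisfies the pin.

```python
import re
from importlib import metadata
from pathlib import Path


def missing_requirements(requirements_file="requirements.txt"):
    """Return the package names from a requirements file that are not installed.

    Hypothetical helper for illustration; version specifiers are ignored.
    """
    missing = []
    for raw in Path(requirements_file).read_text().splitlines():
        # Drop inline comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blanks, pip options such as "-r other.txt", and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        # The distribution name is the leading run of name characters.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    if Path("requirements.txt").is_file():
        print("Not installed:", missing_requirements() or "none")
    else:
        print("requirements.txt not found in the current directory.")
```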

Pretrained Weights

Please download the following pretrained models and place them in the ckpts directory: MMAudio, Wan2.2-TI2V-5B, Identity-as-Presence

After downloading, ensure all model files are placed in the ckpts directory and properly configured.
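Before running inference, it can help to confirm that the three models are actually present under ckpts. The sketch below assumes each model sits in a sub-directory named after it; the README only lists the model names, so these folder names are an assumption and should be adjusted to whatever the files in configs and infer.sh expect. `missing_checkpoints` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# Assumed sub-directory names under ckpts/; adjust to match the repository's configs.
EXPECTED_MODELS = ("MMAudio", "Wan2.2-TI2V-5B", "Identity-as-Presence")


def missing_checkpoints(ckpt_root="ckpts", expected=EXPECTED_MODELS):
    """Return the expected entries that are absent or are empty directories."""
    root = Path(ckpt_root)
    missing = []
    for name in expected:
        path = root / name
        if not path.exists():
            missing.append(name)
        elif path.is_dir() and not any(path.iterdir()):
            # The directory was created but nothing was downloaded into it.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing or empty under ckpts:", ", ".join(missing))
    else:
        print("All expected checkpoint entries were found.")
```

This only verifies that the entries exist and are non-empty; it does not validate file integrity or the configuration itself.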

Inference

$ bash infer.sh

The results will be saved in the results directory.
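To inspect what a run produced, the outputs can be listed from Python. This is a minimal sketch that assumes the generated files are .mp4 videos, as in the demos; the README does not state the output format or file naming, and `list_generated_videos` is a hypothetical helper.

```python
from pathlib import Path


def list_generated_videos(results_dir="results", suffixes=(".mp4",)):
    """Return generated video files under results_dir, newest first.

    The .mp4 suffix is an assumption; pass other suffixes if the outputs differ.
    """
    root = Path(results_dir)
    if not root.is_dir():
        return []
    files = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in suffixes
    ]
    # Sort by modification time so the most recent run appears first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for video in list_generated_videos():
        print(video)
```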

🎥 Demo

Single-subject Personalized Generation

1.mp4	2.mp4	3.mp4	4.mp4
1.mp4	2.mp4	3.mp4	4.mp4

Multi-subject Personalized Generation

1.mp4	1-1.mp4	1-2.mp4
2.mp4	2-1.mp4	2-2.mp4
3.mp4	3-1.mp4	3-2.mp4
4.mp4	4-1.mp4	4-2.mp4

For more details, please refer to our project page.

🔗 Citation

If you find this code useful for your research, please cite it using the following BibTeX entry.

@article{chen2026identity,
  title={Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation},
  author={Chen, Yingjie and Lin, Shilun and Xing, Cai and Binxin, Yang and Long, Zhou and Yan, Qixin and Wang, Wenjing and Liu, Dingming and Liu, Hao and Li, Chen and LYU, Jing},
  journal={arXiv preprint arXiv:2603.17889},
  url={https://chen-yingjie.github.io/projects/Identity-as-Presence/index.html},
  year={2026}
}

Acknowledgements

We would like to thank the contributors to various open-source projects for their research and exploration.
