
langzizhixin/IP_LAP_256



IP_LAP_256 is our LangXin_V2 commercial code, which was mainly used in 2024. Because we now have better models, we have open-sourced it to support everyone's learning and research.

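A commercial, generalizable digital human project that reproduces a person's own teeth and mouth shape with very high fidelity.

The video used for inference must show the person speaking with their mouth open. They can count 1, 2, 3, 4, ... or say anything else.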

This is a talking-face project: a commercial digital human project that faithfully reproduces a person's face, mouth, and teeth. We train on 256x256 face images. Because the face is cropped from the forehead down, a 256x256 face here is equivalent to the face size used by wav2lip384, so the model can generate 720p, 1080p, 2K, and 4K digital human videos. A Transformer is what you need: the model uses an attention mechanism that references the mouth shape of the face in the preceding and following frames to generate new mouth shapes, which is how it restores a person's own teeth and mouth shape. IP_LAP uses a 128x128 network structure; IP_LAP_256 uses a 256x256 network structure. We have done the following work:

  1. Added video-cutting code.
  2. Optimized the network structure and increased the clarity of face segmentation.
  3. Trained on a dataset of 1,000 people and 50 hours, with over 50,000 pieces of data. The landmarks model's eval_L1_loss needs to be reduced to around 0.004, which takes 12-24 hours of training on a 3080. The renderer model's FID needs to be reduced to around 15, which takes 24-48 hours of training on a 4090.
  4. We have not released the best landmarks checkpoint, so you need to load the pre-trained weights and continue training from them. We have released the best renderer checkpoint, which you can use directly.
  5. You can also fine-tune the landmarks model on a 1-minute video to achieve better commercial results.
  6. For better inference results, refer to my demo videos when shooting your own footage.
  7. Requirements: Python==3.7.11, torch==1.13.1, CUDA==11.3. You can also choose other versions as long as they are compatible with each other and run.
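
Before training or inference, it can help to confirm that the local environment is close to the versions listed in the requirements item. The following is a minimal sketch, not part of this repository: the helper names (`parse_version`, `same_major_minor`, `collect_versions`, `report`) are hypothetical, and the choice to compare only major and minor version numbers is an assumption based on the note that other compatible versions may also work.

```python
import sys

# Versions stated in the requirements item of this README.
EXPECTED = {"python": "3.7.11", "torch": "1.13.1", "cuda": "11.3"}


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into a tuple such as (1, 13, 1)."""
    core = text.split("+")[0]  # drop a local build suffix like '+cu117'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def same_major_minor(found, expected):
    """Assumption: matching major.minor is treated as close enough."""
    return parse_version(found)[:2] == parse_version(expected)[:2]


def collect_versions():
    """Read the installed versions; torch entries are None if unavailable."""
    found = {"python": "%d.%d.%d" % sys.version_info[:3], "torch": None, "cuda": None}
    try:
        import torch
    except ImportError:
        return found
    found["torch"] = torch.__version__
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


def report(found, expected=EXPECTED):
    """Build one human-readable status line per requirement."""
    lines = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            status = "missing"
        elif same_major_minor(have, want):
            status = "ok"
        else:
            status = "differs"
        lines.append("%-6s expected %-7s found %-14s %s" % (name, want, have, status))
    return lines


if __name__ == "__main__":
    print("\n".join(report(collect_versions())))
```

A "differs" line is not necessarily a problem. It only flags a combination that is different from the one listed above and is worth checking if training or inference fails.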

🏗️ IP_LAP_256 Project status

Video | Project Page | Code

Checkpoints for LangXin_V2 (IP_LAP_256): https://pan.baidu.com/s/1lzqgqO6vkFxa2-0AiS4a1A?pwd=lzzx

📊 Samples of processed images.

📊 The following pictures are comparison images from training the generator for 200,000 steps. The second-to-last image is the generated digital human image.

📊 The following pictures are comparison images from training the generator for 300,000 steps. The second-to-last image is the generated digital human image.

🎬 Demo

Original video | Lip-synced video
555555.mp4 | output_555555.mp4
666666.mp4 | output_666666.mp4
777777.mp4 | output_777777.mp4
888888.mp4 | output_888888.mp4

📑 Open-source Plan

For digital human projects, we will continue to train and release higher-definition weights in the future. The plan is as follows. Pre-trained checkpoints for wav2lip_288x288 (LangXin_V0) will be released in January 2025. Pre-trained checkpoints for wav2lip_384x384 (LangXin_V1) will be released in February 2025. Pre-trained checkpoints for IP_LAP_256 (LangXin_V2) will be released after June 2025. Pre-trained checkpoints for LangXin_V3 will be released after June 2026.

  • landmark_checkpoints
  • renderer_checkpoints
  • Dataset processing pipeline
  • Training method
  • Inference
  • Real-time inference
  • Higher-definition commercial checkpoints

🙏 Citing

Thank you to the other three authors for their wonderful work: https://github.com/Weizhi-Zhong/IP_LAP

📖 Disclaimers

This repository was made by langzizhixin of Langzizhixin Technology company on 2025.7.20 in Chengdu, China. The code and weights above may be used only for personal, research, and non-commercial purposes. In particular, for the people who appear as models in the digital human videos in this repository, commercial use requires contacting those models themselves for authorization. If you need a higher-definition model, please contact us by email at 277504483@qq.com or ajian.justdoit@gmail.com, or add our WeChat: langzizhixinkeji
