2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
download/
output_logfile.txt
25 changes: 19 additions & 6 deletions README.md
@@ -1,14 +1,20 @@


# HDTF
Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset
<a href="https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_Flow-Guided_One-Shot_Talking_Face_Generation_With_a_High-Resolution_Audio-Visual_Dataset_CVPR_2021_paper.pdf" target="_blank">paper</a> <a href="https://github.com/MRzzm/HDTF/blob/main/Supplementary%20Materials.pdf" target="_blank">supplementary</a> [demo video](https://www.youtube.com/watch?v=uJdBgWYBTww)

## Details of HDTF dataset
**./HDTF_dataset** consists of the *YouTube video URL*, the *video resolution* (as used in our method, which may not be the best available resolution), the *time stamps of the talking face*, the *facial region* (as used in our method), and the *zoom scale* of the cropped window.

**xx_video_url.txt:**

```
format: video name | video youtube url
@@ -31,18 +37,17 @@ format: video name+clip index | min_width | width | min_height | height (in
format: video name+clip index | window zoom scale
```
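
To make the annotation formats above concrete, here is a minimal Python sketch for reading the two-field files (for example **xx_video_url.txt** and **xx_crop_ratio.txt**). The helper names `parse_annotation_line` and `load_pairs` are hypothetical and not part of the repository. The README does not say whether fields are divided by a literal `|` or by whitespace, so the sketch accepts both; treat that as an assumption and check it against the actual files.

```python
from typing import Dict, Iterable, List


def parse_annotation_line(line: str) -> List[str]:
    """Split one annotation line into its fields.

    Assumption: fields are divided either by '|' (as written in the
    format description) or by whitespace; the README does not say which.
    """
    line = line.strip()
    if "|" in line:
        return [field.strip() for field in line.split("|")]
    return line.split()


def load_pairs(lines: Iterable[str]) -> Dict[str, str]:
    """Build a {key: value} map from a two-field annotation file."""
    pairs: Dict[str, str] = {}
    for line in lines:
        fields = parse_annotation_line(line)
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        key, value = fields
        pairs[key] = value
    return pairs
```

With a real file, something like `load_pairs(open(path, encoding="utf-8"))` would give a mapping from video name to URL, or from clip name to zoom scale (as a string that still needs converting with `float`).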


## Processing of HDTF dataset
When using the HDTF dataset:

- We provide the video names and URLs in **xx_video_url.txt** (the highest available definition of the videos is 1080P or 720P). Convert each video to **.mp4** format, and convert interlaced video to progressive video as well.

- We split each long original video into talking-head clips using the time stamps in **xx_annotion_time.txt**. Name each clip **video name_clip index.mp4**. For example, split the video *Radio11.mp4 00:30-01:00 01:30-02:30* into *Radio11_0.mp4* and *Radio11_1.mp4*.

- Our work did not always download videos at the best resolution, so we provide two cropping methods. Thanks to @universome and @Feii Yin for pointing out this problem!

1. Download the video at the reference resolution in **xx_resolution.txt** and crop the facial region with the fixed window size in **xx_crop_wh.txt**. (This is the same method we used, but the downloaded video may not be at the best resolution.)
2. First, download the video at the best resolution. Then, detect the facial landmarks in the split talking-head clips and compute the square window of the face: compute the facial region in each frame and merge all regions into one square range. Next, enlarge the window size by the ratio in **xx_crop_ratio.txt**. Finally, crop the facial region.

- We resize all cropped videos into **512 x 512** resolution.
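
The clip naming, the second cropping method, and the final resize can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, per-frame facial regions are assumed to arrive as `(x_min, y_min, x_max, y_max)` pixel boxes from whatever landmark detector you use, the zoom scale is assumed to multiply the side of the merged square about its center, and ffmpeg is assumed as the cropping tool (the README does not name one). The sketch also does not clamp the window to the frame boundaries.

```python
from typing import Iterable, List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def clip_names(video_name: str, num_ranges: int) -> List[str]:
    """Name clips as 'video name_clip index.mp4', one per time range."""
    return [f"{video_name}_{i}.mp4" for i in range(num_ranges)]


def merged_square_window(boxes: Iterable[Box], zoom: float) -> Box:
    """Merge per-frame facial regions into one square, then enlarge it.

    Assumption: the zoom scale multiplies the side length of the merged
    square while keeping its center fixed.
    """
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one facial region is required")
    # Union of all per-frame regions.
    x0 = min(b[0] for b in boxes)
    y0 = min(b[1] for b in boxes)
    x1 = max(b[2] for b in boxes)
    y1 = max(b[3] for b in boxes)
    # Make the union square (longer side wins) and apply the zoom.
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = max(x1 - x0, y1 - y0) * zoom / 2
    return (cx - half, cy - half, cx + half, cy + half)


def ffmpeg_crop_resize_filter(window: Box, size: int = 512) -> str:
    """Build an ffmpeg -vf string that crops the window and resizes it."""
    x0, y0, x1, y1 = (int(round(v)) for v in window)
    return f"crop={x1 - x0}:{y1 - y0}:{x0}:{y0},scale={size}:{size}"
```

For example, `clip_names("Radio11", 2)` gives `["Radio11_0.mp4", "Radio11_1.mp4"]`, matching the naming example in the list above. The string returned by `ffmpeg_crop_resize_filter` could be passed to ffmpeg as `-vf`, after checking that the window lies inside the frame.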

@@ -62,6 +67,14 @@ The code is in **./code_animation2video**, pls visit [here](https://github.com/M
#### code of reproducing other works
coming soon......

## Downloading
For convenience, we added the `download.py` script which downloads, crops and resizes the dataset. You can use it via the following command:
```
python download.py --output_dir /path/to/output/dir --num_workers 8
```

Note: some videos may become unavailable if their authors remove them or make them private.

## Reference
If you use HDTF, please reference
