Dataset contains invalid JPG files with a size of 0 bytes #3

@lodeous

First off, this dataset is amazing! Thanks for putting it together.

I was trying to use this dataset to train a GAN to generate anime faces, but I ran into a problem with my dataloader. I downloaded 'data.tgz' from your Google Drive link and extracted the files. As you mention in the README, there are some bad images. Specifically, my dataloader failed as soon as it encountered corrupted JPG files with a file size of 0 bytes.

I used the code below to find the bad files and delete them from the dataset. Could you remove them from 'data.tgz' to make the dataset easier for other people to use?


import os

# Some 0-byte JPG files cause issues.
# I extracted 'data.tgz' into a folder called anime_faces,
# but you can substitute whatever folder your data is in.
# This walks through all subfolders recursively.
for root, dirs, files in os.walk("anime_faces"):
    for file in files:
        path = os.path.join(root, file)
        if os.stat(path).st_size == 0:
            print("Remove 0B size file:", path)
            os.remove(path)

Here is the output, listing the corrupted files that should be removed:


Remove 0B size file: anime_faces/cropped/35611_2011.jpg
Remove 0B size file: anime_faces/cropped/23163_2008.jpg
Remove 0B size file: anime_faces/cropped/62546_2019.jpg
Remove 0B size file: anime_faces/cropped/8331_2004.jpg
Remove 0B size file: anime_faces/cropped/32128_2010.jpg
Remove 0B size file: anime_faces/cropped/38922_2012.jpg
Remove 0B size file: anime_faces/cropped/24884_2009.jpg
Remove 0B size file: anime_faces/cropped/23762_2008.jpg
Remove 0B size file: anime_faces/cropped/13131_2005.jpg
Remove 0B size file: anime_faces/cropped/54405_2016.jpg
Remove 0B size file: anime_faces/cropped/3781_2002.jpg
Remove 0B size file: anime_faces/cropped/61050_2018.jpg
Remove 0B size file: anime_faces/cropped/6955_2003.jpg
Remove 0B size file: anime_faces/cropped/1147_2001.jpg
Remove 0B size file: anime_faces/cropped/32445_2011.jpg
Remove 0B size file: anime_faces/cropped/55266_2016.jpg
Remove 0B size file: anime_faces/cropped/46998_2014.jpg
Remove 0B size file: anime_faces/cropped/27828_2009.jpg
Remove 0B size file: anime_faces/cropped/10877_2005.jpg
Remove 0B size file: anime_faces/cropped/23057_2008.jpg
Remove 0B size file: anime_faces/cropped/46240_2014.jpg
Remove 0B size file: anime_faces/cropped/28321_2010.jpg
Remove 0B size file: anime_faces/cropped/54885_2016.jpg
Remove 0B size file: anime_faces/cropped/8565_2004.jpg
Remove 0B size file: anime_faces/cropped/58070_2017.jpg
Remove 0B size file: anime_faces/cropped/40505_2012.jpg
Remove 0B size file: anime_faces/cropped/43570_2013.jpg
Remove 0B size file: anime_faces/cropped/6339_2003.jpg
Remove 0B size file: anime_faces/cropped/36800_2012.jpg
Remove 0B size file: anime_faces/cropped/5964_2003.jpg
Remove 0B size file: anime_faces/cropped/35412_2011.jpg
Remove 0B size file: anime_faces/cropped/45303_2014.jpg
Remove 0B size file: anime_faces/cropped/21135_2008.jpg
Remove 0B size file: anime_faces/cropped/44478_2013.jpg
Remove 0B size file: anime_faces/cropped/35751_2011.jpg
Remove 0B size file: anime_faces/cropped/62823_2019.jpg
Remove 0B size file: anime_faces/cropped/6188_2003.jpg
Remove 0B size file: anime_faces/cropped/24453_2009.jpg
Remove 0B size file: anime_faces/cropped/6268_2003.jpg
Remove 0B size file: anime_faces/cropped/4221_2002.jpg
Remove 0B size file: anime_faces/cropped/55695_2016.jpg
Remove 0B size file: anime_faces/cropped/26439_2009.jpg
Remove 0B size file: anime_faces/cropped/26648_2009.jpg
Remove 0B size file: anime_faces/cropped/55501_2016.jpg
Remove 0B size file: anime_faces/cropped/25213_2009.jpg
Remove 0B size file: anime_faces/cropped/32898_2011.jpg
Remove 0B size file: anime_faces/cropped/3651_2002.jpg
Remove 0B size file: anime_faces/cropped/3901_2002.jpg
Remove 0B size file: anime_faces/cropped/2125_2001.jpg
Remove 0B size file: anime_faces/cropped/46234_2014.jpg
Remove 0B size file: anime_faces/cropped/7058_2003.jpg
Remove 0B size file: anime_faces/cropped/20399_2007.jpg
Remove 0B size file: anime_faces/cropped/55382_2016.jpg
Remove 0B size file: anime_faces/cropped/154_2000.jpg
Remove 0B size file: anime_faces/cropped/46529_2014.jpg
Remove 0B size file: anime_faces/cropped/20330_2007.jpg
Remove 0B size file: anime_faces/cropped/28555_2010.jpg
Remove 0B size file: anime_faces/cropped/55062_2016.jpg
Remove 0B size file: anime_faces/cropped/23647_2008.jpg
Remove 0B size file: anime_faces/cropped/26406_2009.jpg
Remove 0B size file: anime_faces/cropped/48378_2014.jpg
Remove 0B size file: anime_faces/cropped/20561_2008.jpg
Remove 0B size file: anime_faces/cropped/4118_2002.jpg
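
In case it helps with the request above, here is a minimal sketch of how the archive itself could be repacked without the empty files, so nobody has to extract and clean it by hand. It uses only Python's standard `tarfile` module. The function name `repack_without_empty_files` and the output name `data_clean.tgz` are my own placeholders, and I am assuming 'data.tgz' is an ordinary gzip-compressed tar archive; I have not run this against the real archive.

```python
import tarfile


def repack_without_empty_files(src_path, dst_path):
    """Copy a gzip-compressed tar archive, skipping 0-byte regular files.

    Returns the names of the members that were left out.
    (Hypothetical helper, not part of the dataset's tooling.)
    """
    skipped = []
    with tarfile.open(src_path, "r:gz") as src, \
            tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            # A regular file with size 0 is one of the broken images.
            if member.isfile() and member.size == 0:
                skipped.append(member.name)
                continue
            # Only regular files carry data; directories and links are
            # copied as header-only entries.
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
    return skipped
```

Calling `repack_without_empty_files("data.tgz", "data_clean.tgz")` would write the cleaned archive next to the original and return the skipped names, which should correspond to the files listed above. The original archive is left untouched, so the result can be checked before replacing anything.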
