I wonder whether there is a mistake in the FID calculation for all the baseline works as well as yours. To my understanding, your generated images look great, so the FID should be a small number (for reference, in the general computer vision domain, SOTA models achieve an FID of 1.5 - 10 on the ImageNet-256 dataset).
I suspect the mistake is in how FrechetInceptionDistance is used. If we pass images as uint8 in [0, 255], we should set normalize=False; if we pass images as floats in [0, 1], we should set normalize=True.
from torchmetrics.image.fid import FrechetInceptionDistance
# Use normalize=False since we'll provide uint8 images in [0, 255] range
fid = FrechetInceptionDistance(normalize=False).to(device)
See fid.py from torchmetrics.
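To make the rule above harder to get wrong, here is a minimal sketch of a check that picks the flag from the tensor itself. The helper names `infer_fid_normalize` and `float_to_uint8` are my own, not part of torchmetrics, and the range check is only my assumption about what a safe guard looks like.

```python
import torch


def infer_fid_normalize(images: torch.Tensor) -> bool:
    """Pick the `normalize` value that matches the image tensor (hypothetical helper).

    uint8 images in [0, 255] -> normalize=False
    float images in [0, 1]   -> normalize=True
    """
    if images.dtype == torch.uint8:
        return False
    if images.is_floating_point():
        lo, hi = images.min().item(), images.max().item()
        if lo < 0.0 or hi > 1.0:
            # e.g. images still in [-1, 1] or in [0, 255] stored as float
            raise ValueError(
                f"float images must be in [0, 1], got range [{lo}, {hi}]"
            )
        return True
    raise TypeError(f"unsupported image dtype: {images.dtype}")


def float_to_uint8(images: torch.Tensor) -> torch.Tensor:
    """Convert float images in [0, 1] to uint8 in [0, 255] (hypothetical helper)."""
    return (images.clamp(0.0, 1.0) * 255.0).round().to(torch.uint8)
```

With this, the metric could be built as `FrechetInceptionDistance(normalize=infer_fid_normalize(imgs))`, or the images could be converted once with `float_to_uint8` and used with `normalize=False` as in the snippet above.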