I synthesised waveforms with your official checkpoint on the test set of VCTK-Corpus-0.92, which contains the audio clips of the last 8 speakers.
I then calculated the LSD and SNR scores between the generated and reference test-set audio, but the results are not as good as those reported in your paper.
Additionally, the LSD calculation in util.util.compute_metrics seems strange: n_fft should be 2048, but your default setting is 1024.
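
To make the comparison concrete, here is a minimal sketch of how LSD and SNR are commonly defined, with n_fft exposed as a parameter. It is not the code in util.util.compute_metrics. The function names (stft_power, lsd, snr), the Hann window, the hop length of 512, and the log-power formulation are all assumptions for illustration, and the paper's exact settings may differ.

```python
import numpy as np


def stft_power(x, n_fft=2048, hop=512):
    """Power spectrogram with a Hann window; shape (frames, n_fft // 2 + 1)."""
    x = np.asarray(x, dtype=np.float64)
    if len(x) < n_fft:
        # Zero-pad short signals so that at least one frame exists.
        x = np.pad(x, (0, n_fft - len(x)))
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def lsd(ref, gen, n_fft=2048, hop=512, eps=1e-10):
    """Log-spectral distance: RMS over frequency of the log-power
    difference, averaged over frames (assumed definition)."""
    n = min(len(ref), len(gen))
    log_ref = np.log10(stft_power(ref[:n], n_fft, hop) + eps)
    log_gen = np.log10(stft_power(gen[:n], n_fft, hop) + eps)
    per_frame = np.sqrt(np.mean((log_ref - log_gen) ** 2, axis=1))
    return float(np.mean(per_frame))


def snr(ref, gen, eps=1e-10):
    """Signal-to-noise ratio in dB, treating (ref - gen) as the noise."""
    n = min(len(ref), len(gen))
    ref = np.asarray(ref[:n], dtype=np.float64)
    gen = np.asarray(gen[:n], dtype=np.float64)
    noise = ref - gen
    return float(10.0 * np.log10((np.sum(ref**2) + eps) / (np.sum(noise**2) + eps)))
```

Calling lsd(ref, gen, n_fft=1024) and lsd(ref, gen, n_fft=2048) on the same pair of clips generally gives different values, so the n_fft setting needs to match the paper's before the scores can be compared.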