Running the command provided in the readme to reproduce the results on MuST-C of the paper "Adapting Transformer to End-to-End Spoken Language Translation" results in the following error:
| distributed init (rank 1): tcp://localhost:18735
| distributed init (rank 0): tcp://localhost:18735
| distributed init (rank 2): tcp://localhost:18735
| distributed init (rank 3): tcp://localhost:18735
Namespace(adam_betas='(0.9, 0.999)', adam_eps=1e-08, arch='speechconvtransformer_big', attention_dropout=0.1, attn_2d=True, audio_input=True, bucket_cap_mb=150, clip_norm=20.0, criterion='label_smoothed_cross_entropy', data=['bin/'], ddp_backend='no_c10d', decoder_attention_heads=8, decoder_embed_dim=512, decoder_ffn_embed_dim=1024, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=True, decoder_out_embed_dim=512, decoder_output_dim=512, device_id=0, distance_penalty='gauss', distributed_backend='nccl', distributed_init_host='localhost', distributed_init_method='tcp://localhost:18735', distributed_port=18736, distributed_rank=0, distributed_world_size=4, dropout=0.1, encoder_attention_heads=8, encoder_convolutions='[(64, 3, 3)] * 2', encoder_embed_dim=512, encoder_ffn_embed_dim=1024, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=True, fix_batches_to_gpus=False, fp16=False, fp16_init_scale=128, fp16_scale_window=None, freeze_encoder=False, init_variance=1.0, keep_interval_updates=-1, label_smoothing=0.1, left_pad_source='True', left_pad_target='False', log_format=None, log_interval=1000, lr=[0.005], lr_scheduler='inverse_sqrt', lr_shrink=0.1, max_epoch=100, max_sentences=8, max_sentences_valid=8, max_source_positions=1400, max_target_positions=300, max_tokens=12000, max_update=0, min_loss_scale=0.0001, min_lr=1e-08, momentum=0.99, no_attn_2d=False, no_cache_source=False, no_epoch_checkpoints=False, no_progress_bar=False, no_save=False, normalization_constant=1.0, optimizer='adam', optimizer_overrides='{}', raw_text=False, relu_dropout=0.1, reset_lr_scheduler=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='models', save_interval=1, save_interval_updates=0, seed=1, sentence_avg=True, skip_invalid_size_inputs_valid_test=True, source_lang=None, target_lang=None, task='translation', train_subset='train', update_freq=[16], upsample_primary=1, valid_subset='valid', validate_interval=1, 
warmup_init_lr=0.0003, warmup_updates=4000, weight_decay=0.0)
| [h5] dictionary: 4 types
| [de] dictionary: 192 types
| bin/ train 229703 examples
| bin/ valid 1423 examples
Exception ignored in: <function IndexedDataset.__del__ at 0x7f0de0de5790>
Traceback (most recent call last):
File "/home/amit/amit/pruning/FBK-Fairseq-ST/fairseq/data/indexed_dataset.py", line 85, in __del__
Traceback (most recent call last):
File "../../train.py", line 365, in <module>
Exception ignored in: <function IndexedDataset.__del__ at 0x7f9f0b8f3790>
Traceback (most recent call last):
File "/home/amit/amit/pruning/FBK-Fairseq-ST/fairseq/data/indexed_dataset.py", line 85, in __del__
def __del__(self):
KeyboardInterrupt:
multiprocessing_main(args)
def __del__(self):
File "/home/amit/amit/pruning/FBK-Fairseq-ST/multiprocessing_train.py", line 42, in main
KeyboardInterrupt:
p.join()
File "/home/amit/.pyenv/versions/3.8.2/lib/python3.8/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
File "/home/amit/.pyenv/versions/3.8.2/lib/python3.8/multiprocessing/popen_fork.py", line 47, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/home/amit/.pyenv/versions/3.8.2/lib/python3.8/multiprocessing/popen_fork.py", line 27, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/home/amit/amit/pruning/FBK-Fairseq-ST/multiprocessing_train.py", line 84, in signal_handler
raise Exception(msg)
Exception:
-- Tracebacks above this line can probably be ignored --
Traceback (most recent call last):
File "/home/amit/amit/pruning/FBK-Fairseq-ST/multiprocessing_train.py", line 48, in run
single_process_main(args)
File "/home/amit/amit/pruning/FBK-Fairseq-ST/train.py", line 53, in main
dummy_batch = task.dataset('train').get_dummy_batch(args.max_tokens, max_positions)
File "/home/amit/amit/pruning/FBK-Fairseq-ST/fairseq/data/language_pair_dataset.py", line 221, in get_dummy_batch
return self.collater([
File "/home/amit/amit/pruning/FBK-Fairseq-ST/fairseq/data/language_pair_dataset.py", line 224, in <listcomp>
'source': self.src_dict.dummy_sentence(src_len) if self.src_dict is not None else None,
File "/home/amit/amit/pruning/FBK-Fairseq-ST/fairseq/data/dictionary.py", line 302, in dummy_sentence
t = torch.Tensor(length).new_empty((length, self.audio_features)).uniform_(self.nspecial + 1, len(self))
RuntimeError: Expected a_in <= b_in to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
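I've tried running the command with Python 3.5 and Python 3.8 and I get the same error both times.
I believe the error is caused by the bounds passed to uniform_ in dictionary.py: the lower bound, self.nspecial + 1, appears to be greater than the upper bound, len(self), which the log reports as 4 types for the [h5] dictionary.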
I tried fixing the error myself by changing self.nspecial + 1 to self.nspecial in the following line:
FBK-Fairseq-ST/fairseq/data/dictionary.py, line 302 in 2d15240
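To make the bounds concrete, here is a minimal sketch of the check that seems to be failing. It does not call PyTorch; the helper uniform_bounds_ok is hypothetical and only mirrors the lower <= upper precondition named in the RuntimeError. The vocabulary size of 4 comes from the "[h5] dictionary: 4 types" log line, while nspecial = 4 is my assumption (that the audio dictionary holds only special symbols).

```python
# Hypothetical helper, not part of FBK-Fairseq-ST: it mirrors only the
# "lower bound <= upper bound" precondition reported in the RuntimeError.
def uniform_bounds_ok(low, high):
    return low <= high


# From the log: "[h5] dictionary: 4 types", so len(self) would be 4.
vocab_size = 4
# Assumption: the dictionary contains only special symbols, so nspecial is 4.
nspecial = 4

original = (nspecial + 1, vocab_size)  # bounds used on line 302 as shipped
patched = (nspecial, vocab_size)       # bounds after my change

print("original bounds valid:", uniform_bounds_ok(*original))
print("patched bounds valid:", uniform_bounds_ok(*patched))
```

Under these assumptions the patched bounds are equal, so the precondition holds, but every value in the dummy source would presumably be the same constant. I have not confirmed whether that matters for the dummy batch.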
Is this a valid fix?
Thanks in advance,
Chaitanya