Hello,
I've been experimenting with the stock fairseq examples/speech_recognition/infer.py and with this repo's recognize.py, to see whether it is possible to run inference with a model we made ourselves by fine-tuning a base model. We can get infer.py to work, but I've noticed that it needs to find the original base model on disk. That makes moving the checkpoint to a different machine cumbersome: the base model has to sit at the same path on the target machine.
I've spent almost a day studying how the model loading works, but I can't wrap my head around it. I think the checkpoint only needs some args from the original base model, but there is a lot of conversion going on between formats and names: cfg, w2v_args, OmegaConf, and Namespace.
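For reference, here is a small probe I've been using to see which of these formats a given checkpoint actually carries. The fallback through convert_namespace_to_omegaconf is my reading of how fairseq bridges the old Namespace format and the new OmegaConf one, so treat it as a sketch rather than gospel:

# Quick probe: which config format does a checkpoint store?
import torch
from fairseq.dataclass.utils import convert_namespace_to_omegaconf

state = torch.load("checkpoint_best.pt", map_location="cpu")
cfg = state.get("cfg")    # OmegaConf DictConfig in newer checkpoints
args = state.get("args")  # argparse.Namespace in older ones
if cfg is None and args is not None:
    cfg = convert_namespace_to_omegaconf(args)

# For a fine-tuned wav2vec 2.0 model I expect either an embedded base-model
# config (w2v_args) or only a pointer to the base model on disk (w2v_path).
print(cfg.model.get("w2v_args"))
print(cfg.model.get("w2v_path"))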
Both recognize.py and recognize.hydra.py break when loading a checkpoint file (they do work on the published fine-tuned models). It would help me if there were a way to produce a model file that works with recognize.py from the original base model plus a checkpoint. I have not been able to find such a tool. I suspect it is as simple as adding the correct .cfg.w2v_args info to the checkpoint, but I don't understand how.
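To make the question concrete, this is roughly what I imagine such a tool would do. It is an untested sketch; the key layout (cfg, then model, then w2v_args / w2v_path) is my reading of wav2vec2_asr.py, not a documented interface:

# Untested sketch: graft the base model's config into a fine-tuned
# checkpoint so inference no longer needs the base model on disk.
import torch
from fairseq.dataclass.utils import convert_namespace_to_omegaconf

def embed_w2v_args(finetuned_path, base_path, out_path):
    state = torch.load(finetuned_path, map_location="cpu")
    base = torch.load(base_path, map_location="cpu")

    # Newer checkpoints store an OmegaConf "cfg"; older ones a Namespace
    # "args" that fairseq upgrades on load.
    base_cfg = base.get("cfg")
    if base_cfg is None:
        base_cfg = convert_namespace_to_omegaconf(base["args"])

    # Wav2VecEncoder appears to read cfg.model.w2v_args when it is set, and
    # only falls back to loading cfg.model.w2v_path from disk when it is None.
    state["cfg"]["model"]["w2v_args"] = base_cfg
    state["cfg"]["model"]["w2v_path"] = None

    torch.save(state, out_path)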
I can get recognize.py to work with a checkpoint file using the patch below, but model loading then still refers to the original base model.
@@ -139,13 +162,24 @@ class Wav2VecPredictor:
         return feats

     def _load_model(self, model_path, target_dict):
-        w2v = torch.load(model_path)
-
-        # Without create a FairseqTask
-        args = base_architecture(w2v["args"])
-        model = Wav2VecCtc(args, Wav2VecEncoder(args, target_dict))
-        model.load_state_dict(w2v["model"], strict=True)
-        return model
+        # Let fairseq build the task and model from the checkpoint instead of
+        # constructing Wav2VecCtc by hand; this also copes with checkpoints
+        # whose "args" entry is None.
+        # Needs: from fairseq import utils
+        #        from fairseq.checkpoint_utils import load_model_ensemble_and_task
+        models, saved_cfg, task = load_model_ensemble_and_task(
+            utils.split_paths(model_path),
+            arg_overrides=None,  # or e.g. ast.literal_eval(args.model_overrides)
+            task=None,
+            suffix="",
+            strict=True,
+            num_shards=1,
+            state=None,
+        )
+        return models[0]
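As an aside: when the base model is present on the target machine but at a different path, the arg_overrides hook above (exposed as --model-overrides in infer.py) seems to be the intended escape hatch, since overwrite_args_by_name matches override keys recursively through the config. Something like this is what I have in mind, though I have not verified it end to end, and the path is just a placeholder:

# Assumed usage: rewrite the stored w2v_path before the task loads it.
models, saved_cfg, task = load_model_ensemble_and_task(
    utils.split_paths(model_path),
    arg_overrides={"w2v_path": "/path/on/this/machine/wav2vec_small.pt"},
)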