Errors for Running Experiment OISST

Hello!

I've recently been testing the DYffusion model on the OISST dataset. I used the following command to train the interpolation model:
`python run.py experiment=oisst_pacific_interpolation work_dir=./myrun/interpolation_oisst trainer.max_epochs=5 datamodule.horizon=7 datamodule.window=4 datamodule.prediction_horizon=7`

Then I used the following command to train the DYffusion model:
`python run.py experiment=oisst_pacific_dyffusion work_dir=./myrun/dyffusion_oisst trainer.max_epochs=5 datamodule.horizon=7 datamodule.window=4 datamodule.prediction_horizon=7 diffusion.interpolator_run_id=on6bffjf`

The interpolation model trained successfully, but training the DYffusion model failed with the following error:
```
Error executing job with overrides: ['experiment=oisst_pacific_dyffusion', 'work_dir=./myrun/dyffusion_oisst', 'trainer.max_epochs=5', 'datamodule.horizon=7', 'datamodule.window=4', 'datamodule.prediction_horizon=7', 'diffusion.interpolator_run_id=on6bffjf']
Traceback (most recent call last):
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/run.py", line 22, in <module>
    main()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/run.py", line 12, in main
    return run_model(config)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/train.py", line 101, in run_model
    raise e
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/train.py", line 97, in run_model
    fit(ckpt_filepath=ckpt_path)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/train.py", line 93, in fit
    trainer.fit(model, datamodule=datamodule, ckpt_path=ckpt_filepath)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 560, in fit
    call._call_and_handle_interrupt(
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 49, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 598, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1011, in _run
    results = self._run_stage()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1055, in _run_stage
    self.fit_loop.run()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 216, in run
    self.advance()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 458, in advance
    self.epoch_loop.run(self._data_fetcher)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 152, in run
    self.advance(data_fetcher)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 348, in advance
    batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 185, in run
    closure()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 146, in __call__
    self._result = self.closure(*args, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 131, in closure
    step_output = self._step_fn()
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 319, in _training_step
    training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 329, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 391, in training_step
    return self.lightning_module.training_step(*args, **kwargs)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/experiment_types/_base_experiment.py", line 438, in training_step
    loss_output = self.get_loss(batch)  # either a scalar or a dict with key 'loss'
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/experiment_types/forecasting_multi_horizon.py", line 419, in get_loss
    loss = self.model.get_loss(inputs=inputs, targets=x_last, **extra_kwargs)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/diffusion/_base_diffusion.py", line 116, in get_loss
    results = self(inputs, targets, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mpathc/wpeng/codes/0_python_env/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl
    result = forward_call(*args, **kwargs)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/diffusion/_base_diffusion.py", line 106, in forward
    return self.p_losses(targets, t=t, **kwargs)
  File "/mpathc/wpeng/codes/python_github_2/dyffusion-main/src/diffusion/dyffusion.py", line 526, in p_losses
    x_t[t_nonzero] = x_interpolated.to(x_t.dtype)
RuntimeError: The expanded size of the tensor (14400) must match the existing size (60) at non-singleton dimension 4.  Target sizes: [63, 1, 60, 60, 14400].  Tensor sizes: [63, 1, 60, 60]
```

What I want to test is using 4 historical images to predict 7 future images.
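
To make the intended setup concrete, here is a minimal sketch of the sample layout I have in mind. It assumes that `window` counts the input frames and `horizon` counts the frames to predict; the helper `make_samples` is hypothetical and is not part of the DYffusion code base, so the actual datamodule may slice the data differently.

```python
from typing import List, Sequence, Tuple


def make_samples(frames: Sequence, window: int, horizon: int) -> List[Tuple[list, list]]:
    """Split a time-ordered sequence into (inputs, targets) pairs.

    Hypothetical helper for illustration only: each sample takes `window`
    consecutive frames as input and the next `horizon` frames as targets.
    """
    total = window + horizon
    samples = []
    # Slide over the sequence one frame at a time.
    for start in range(len(frames) - total + 1):
        inputs = list(frames[start:start + window])
        targets = list(frames[start + window:start + total])
        samples.append((inputs, targets))
    return samples


frames = list(range(15))  # stand-in for 15 consecutive frames
samples = make_samples(frames, window=4, horizon=7)
first_inputs, first_targets = samples[0]
print(len(samples), first_inputs, first_targets)
```

Under this assumption each sample spans 11 consecutive frames: 4 as input followed by 7 to predict.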

Why did this error occur, and how should I train the model? Thank you!


