System Info
The environment matches the project's default setup, and the GPU is an NVIDIA A40.
Who can help?
No response
Information
Tasks
Reproduction
From v4.47 onwards, when a model cache is to be returned, generate will return a Cache instance instead by default (as opposed to the legacy tuple of tuples format). If you want to keep returning the legacy format, please set return_legacy_cache=True.
Traceback (most recent call last):
  File "/home/tank/o1_train/openr/train/mat/scripts/train_math.py", line 107, in <module>
    main(sys.argv[1:])
  File "/home/tank/o1_train/openr/train/mat/scripts/train_math.py", line 99, in main
    runner.run()
  File "/home/tank/o1_train/openr/train/mat/scripts/../../mat/runner/shared/math_runner.py", line 76, in run
    rewards = self.prm.get_reward(obs, actions)
  File "/home/tank/miniconda3/envs/open_reasoner/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/tank/o1_train/openr/train/mat/scripts/../../mat/models/ms_prm.py", line 40, in get_reward
    last_step_score = step_score[-1]
IndexError: index -1 is out of bounds for dimension 0 with size 0
Expected behavior
How should I fix this? Do I need to switch to an A100 GPU, or is something else wrong? I can't simply check the length of the tensor and skip the sample with continue.
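A note that may help narrow this down: the IndexError reports that step_score has size 0, which is a data/shape condition and not obviously tied to the GPU model. One possible cause, assuming get_reward in ms_prm.py picks scores at the positions of a step-tag token (an assumption, since that code is not shown here), is that the tokenized input contains no step tag at all, for example because the tag was never appended or was truncated away. The sketch below is plain Python over token-ID lists; find_step_positions, last_step_score, and step_tag_id are hypothetical names used only for illustration.

```python
from typing import List, Optional, Sequence


def find_step_positions(input_ids: Sequence[int], step_tag_id: int) -> List[int]:
    """Return the indices in input_ids where the step-tag token occurs."""
    return [i for i, tok in enumerate(input_ids) if tok == step_tag_id]


def last_step_score(
    scores: Sequence[float],
    input_ids: Sequence[int],
    step_tag_id: int,
) -> Optional[float]:
    """Return the score at the last step-tag position, or None if there is none.

    Assumption: scores[i] is the per-token score aligned with input_ids[i].
    Returning None makes the "no step tag found" case explicit, so the caller
    can log the offending input instead of crashing on an empty selection.
    """
    positions = find_step_positions(input_ids, step_tag_id)
    if not positions:
        return None
    return scores[positions[-1]]


if __name__ == "__main__":
    # Hypothetical token IDs: 9 stands in for the step-tag token.
    print(last_step_score([0.1, 0.8, 0.3, 0.6], [5, 9, 7, 9], step_tag_id=9))
    # An input without any step tag reproduces the "size 0" situation.
    print(last_step_score([0.1, 0.2], [5, 7], step_tag_id=9))
```

Logging the inputs for which this returns None should show whether the step tag is missing from the text or is being lost during tokenization or truncation.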