When I use the trained model for inference, I get the following error:
Traceback (most recent call last):
  File "/data/lilumin/Qwen-VL/test.py", line 47, in <module>
    generated_ids = model.generate(**inputs, max_new_tokens=128)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/generation/utils.py", line 2465, in generate
    result = self._sample(
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/generation/utils.py", line 3434, in _sample
    outputs = model_forward(**model_inputs, return_dict=True)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1834, in forward
    outputs = self.model(
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1180, in forward
    layer_outputs = decoder_layer(
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1042, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/lilumin/anaconda3/envs/qwen/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 963, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
RuntimeError: The expanded size of the tensor (293) must match the existing size (147) at non-singleton dimension 3. Target sizes: [1, 28, 147, 293]. Tensor sizes: [1, 1, 147, 147]
In the file modeling_qwen2_5_vl.py, I commented out the code at line 931:

if past_key_value is not None:
    cache_kwargs = {"sin": sin, "cos": cos, "cache_position": cache_position}  # Specific to RoPE models
    key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
With these lines commented out, the model can run inference. So what is the function of update, and how should I revise the code so that the cache update works correctly?
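For context on what I understand update to be doing: as far as I can tell, the cache's update method appends the current step's key/value states to the states cached from previous steps for that layer, and returns the full (past + current) sequence that attention then runs over. A minimal toy sketch of that behavior (ToyCache is hypothetical, not the real transformers DynamicCache, and it ignores the RoPE-specific cache_kwargs):

```python
class ToyCache:
    """Toy KV cache: one list of per-token key/value states per layer."""

    def __init__(self):
        self.key_cache = {}    # layer_idx -> list of cached key states
        self.value_cache = {}  # layer_idx -> list of cached value states

    def update(self, key_states, value_states, layer_idx):
        # Append the new tokens' states to whatever is already cached
        # for this layer, then return the concatenated past + current
        # states -- this is why the key/value length seen by attention
        # keeps growing during generation.
        self.key_cache.setdefault(layer_idx, []).extend(key_states)
        self.value_cache.setdefault(layer_idx, []).extend(value_states)
        return self.key_cache[layer_idx], self.value_cache[layer_idx]


cache = ToyCache()
# Prefill: three prompt tokens for layer 0.
k, v = cache.update(["k0", "k1", "k2"], ["v0", "v1", "v2"], layer_idx=0)
# One decode step: update() returns past + new states (length 4).
k, v = cache.update(["k3"], ["v3"], layer_idx=0)
print(len(k))  # -> 4
```

If that picture is right, commenting out update only hides the shape mismatch: each decoder step then attends only to its own tokens instead of the whole prefix, so generation quality would suffer even though the forward pass no longer crashes.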