@Mr-Neutr0n

Summary

  • Fixed a typo in `utils.py` `load_pretrained()` where the dimension mismatch check for `absolute_pos_embed` compares `C1 != C1` (always False) instead of the intended `C1 != C2`; see the sketch after this list.
  • As a result, when pretrained weights have a different embedding channel dimension than the current model, the guard clause never triggers: instead of a clear warning message, users get a confusing runtime error downstream.
  • The analogous check for `relative_position_bias_table` correctly uses `nH1 != nH2`, confirming this is a typo rather than an intentional no-op.
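A minimal sketch of the guard before and after the fix; the tensor shapes, the shape-unpacking pattern, and the warning text below are illustrative assumptions, not quotes from `utils.py`:

```python
import torch

# Hypothetical pretrained vs. current absolute_pos_embed tensors whose
# channel dims differ (96 vs. 128); the shapes are made up for illustration.
absolute_pos_embed_pretrained = torch.zeros(1, 196, 96)
absolute_pos_embed_current = torch.zeros(1, 196, 128)

_, L1, C1 = absolute_pos_embed_pretrained.size()
_, L2, C2 = absolute_pos_embed_current.size()

if C1 != C1:  # BUG: compares C1 with itself, so this branch is unreachable
    print("warning: channel mismatch, skipping")  # never printed

if C1 != C2:  # FIX: the intended comparison; fires on a real mismatch
    print("warning: channel mismatch, skipping")  # printed for 96 vs. 128
```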

Test plan

  • Verified the one-character fix (`C1` -> `C2`) matches the pattern used elsewhere in the same function
  • No functional change for the common case where dimensions match (the else branch executes either way)
  • For the mismatch case: previously the load would error out in `interpolate`; now it correctly logs a warning and skips the weight (a toy repro follows this list)
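As a toy illustration of the downstream failure the warning now preempts (the shapes and the `copy_` call are hypothetical stand-ins for the real loading path):

```python
import torch

pretrained = torch.zeros(1, 196, 96)   # pretrained embed, C1 = 96
current = torch.zeros(1, 196, 128)     # current model embed, C2 = 128

# With the broken guard, loading proceeds despite C1 != C2 and fails
# later with an opaque shape error, e.g.:
try:
    current.copy_(pretrained)
except RuntimeError as err:
    print(f"RuntimeError: {err}")
```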

In `load_pretrained()`, the dimension mismatch guard for
`absolute_pos_embed` compares `C1` with itself (`C1 != C1`), which is
always False. This means a dimension mismatch between pretrained and
current model embedding channels is never detected, leading to a
confusing runtime error instead of a clear warning.

The fix changes the comparison to `C1 != C2` so the check works as
intended, matching the analogous check for
`relative_position_bias_table` (which correctly uses `nH1 != nH2`).