Fix SDPA dropout during eval #160
`F.scaled_dot_product_attention` applies dropout whenever `dropout_p > 0`, even when the module is in eval mode, because it is a stateless function with no access to the module's `training` flag.

This does not affect the default pretrained V-JEPA 2.1 inference because dropout defaults to `0.0`, but if the attention modules are configured with nonzero `proj_drop`, eval outputs become nondeterministic and differ from the dropout-free result.

This PR sets the `dropout_p` argument to zero when `self.training` is false, matching the behavior of `nn.Dropout`, which is a no-op in eval mode.
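
For reviewers, here is a minimal sketch of the gating described above. The `Attention` class, its layer names, and the `proj_drop_prob` attribute are assumptions made for illustration and are not copied from the repository; only the `dropout_p` selection line reflects the change this PR describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Attention(nn.Module):
    """Hypothetical minimal attention block (illustrative, not the repo's class)."""

    def __init__(self, dim, num_heads=4, proj_drop=0.0):
        super().__init__()
        self.num_heads = num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Assumed attribute name: the probability later passed to SDPA.
        self.proj_drop_prob = proj_drop

    def forward(self, x):
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        # Split into q, k, v, each of shape (B, heads, N, head_dim).
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)
        # SDPA is a plain function and cannot see self.training,
        # so dropout has to be disabled explicitly outside training.
        dropout_p = self.proj_drop_prob if self.training else 0.0
        x = F.scaled_dot_product_attention(q, k, v, dropout_p=dropout_p)
        x = x.transpose(1, 2).reshape(B, N, C)
        return self.proj(x)


# Usage: with a nonzero drop probability, two eval passes should now agree.
torch.manual_seed(0)
attn = Attention(dim=32, num_heads=4, proj_drop=0.5).eval()
x = torch.randn(2, 8, 32)
with torch.no_grad():
    out_a = attn(x)
    out_b = attn(x)
```

Without the `self.training` check, the two eval passes would each draw a fresh dropout mask inside SDPA, which is the behavior this PR removes.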