I noticed that VoRA modifies the causal attention mask to enable bidirectional modeling for vision tokens.
Given that, is VoRA compatible with Flash Attention or SDPA?
Or does it have to fall back to eager attention for both training and inference?
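
To make the question concrete, here is a minimal sketch of the kind of mask I mean. It is not VoRA's actual code: the helper `build_mask`, the `is_vision` flags, and the assumption that each contiguous run of vision tokens attends bidirectionally within itself are all hypothetical. PyTorch's `torch.nn.functional.scaled_dot_product_attention` accepts a boolean `attn_mask` where `True` means the position may be attended to, so a mask of this shape could in principle be passed to SDPA; whether the Flash Attention kernels can use it is the part I am unsure about.

```python
def build_mask(is_vision):
    """Return an n x n boolean mask; mask[i][j] is True if query i may attend to key j.

    Assumption (not taken from VoRA's code): text tokens stay causal, and each
    contiguous run of vision tokens attends bidirectionally within that run.
    """
    n = len(is_vision)

    # Label each contiguous run of vision tokens with its own span id.
    span_id = [None] * n
    current = -1
    for i, flag in enumerate(is_vision):
        if flag:
            if i == 0 or not is_vision[i - 1]:
                current += 1
            span_id[i] = current

    # Start from a standard causal (lower-triangular) mask.
    mask = [[j <= i for j in range(n)] for i in range(n)]

    # Open up full attention inside each vision span.
    for i in range(n):
        for j in range(n):
            if span_id[i] is not None and span_id[i] == span_id[j]:
                mask[i][j] = True
    return mask


# Hypothetical sequence: 1 text token, 3 vision tokens, 2 text tokens.
is_vision = [False, True, True, True, False, False]
mask = build_mask(is_vision)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

Converted to a boolean tensor and broadcast to the batch and head dimensions, this is the mask I would expect to hand to SDPA as `attn_mask` instead of relying on `is_causal=True`.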