Hi, why is `patch_out` zero-initialized in the hourglass transformer? It makes the output zero at the beginning. What's the intuition behind it?