Hi, this is a really great repo and very helpful. Thanks. Could I ask why gem_out_dim is 2 rho_0 instead of the 64 rho_0 mentioned in the paper?