Hi!
I'm not sure you still do Q&A support here 😊, but I'm stuck on a problem that's beyond my math skills, and I hope you can help me.
My question is about the loss function of your RSSM, which takes a variational approach. The reconstruction term of a VAE is p(o_t | s_t), since the decoder maps from latent to image, and in that case the observation (= image) has a much higher dimension than the latent. But when o_t has a much smaller dimension than the latent (for example, the 4 values of CartPole in OpenAI Gym's classic_control versus a latent of, say, 32–64), I suspect p(o_t | s_t) cannot learn any meaningful distribution. Because s_t is sampled from the variational posterior q(s_t | a_{1:t}, o_{1:t}), which has already seen the current observation o_t, s_t could simply learn to copy the full o_t into itself, since its dimension is much larger.
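To make the setup I mean concrete, here is a minimal sketch of the two loss terms for a 4-dimensional observation and a 32-dimensional latent. The names (`posterior_net`, `decoder_net`, `elbo_terms`) are my own placeholders, not your code, and I use a fixed standard-normal prior instead of the learned, history-dependent RSSM prior, just to show the imbalance between the terms:

```python
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence

# Hypothetical dimensions for the CartPole-like case described above.
obs_dim, latent_dim = 4, 32

# Toy stand-ins for the posterior and decoder (my placeholders, not the repo's code).
posterior_net = nn.Linear(obs_dim, 2 * latent_dim)  # q(s_t | o_t), ignoring history for brevity
decoder_net = nn.Linear(latent_dim, 2 * obs_dim)    # p(o_t | s_t)

def elbo_terms(obs):
    # Posterior q(s_t | o_t): parameterized from the observation it has already seen.
    mean, log_std = posterior_net(obs).chunk(2, dim=-1)
    q = Normal(mean, log_std.exp())
    s = q.rsample()  # sampled latent, 32 dims >> 4-dim observation

    # Decoder p(o_t | s_t): reconstructs the 4-dim observation.
    dec_mean, dec_log_std = decoder_net(s).chunk(2, dim=-1)
    p = Normal(dec_mean, dec_log_std.exp())
    recon = p.log_prob(obs).sum(-1)  # only 4 reconstruction terms

    # KL to a fixed standard-normal prior (the real RSSM prior is learned and
    # history-dependent; this is only to show the balance of the two terms).
    kl = kl_divergence(q, Normal(0.0, 1.0)).sum(-1)  # 32 KL terms
    return recon, kl

obs = torch.randn(16, obs_dim)
recon, kl = elbo_terms(obs)
loss = (-recon + kl).mean()
```

My intuition is that in this regime the posterior can stash o_t in a few of the 32 latent dimensions and the decoder simply reads it back, so the reconstruction term is trivially satisfied.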
In this situation (non-image observations with a small dimension), can we still use this VAE-like approach?
Or is there another technique that would be more reasonable in this case?
I hope this worry makes sense to you. 😕