Hi,
Thanks for your work!
However, I am a little confused about the number of params/trainable weights.
From the paper, it looks like each child capsule has its own weight matrix to produce the "prediction vector" for the parent capsule in the next layer. For example, from "primarycaps (ConvCapsuleLayer)" to "conv_cap_2_1 (ConvCapsuleLayer)", there are 2 capsules in primarycaps, so should the # of params be multiplied by 2? Say 25664*2?
The same question also arises for me in the following layers.
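To make the arithmetic concrete, here is a rough sketch of the two counts I am comparing. The layer sizes are my assumptions, not something taken from the summary: a 5x5 kernel, 16 input atoms, and 4 output capsules of 16 atoms, chosen because they reproduce 25664. The helper `conv_caps_params` is hypothetical and only illustrates the counting. It also assumes the bias is per output capsule and is not duplicated per child.

```python
def conv_caps_params(kernel_size, in_atoms, out_capsules, out_atoms,
                     in_capsules=1, shared=True):
    """Count trainable params of a hypothetical convolutional capsule layer.

    shared=True  : one kernel reused by every child capsule type.
    shared=False : a separate kernel for each child capsule type.
    """
    weights = kernel_size * kernel_size * in_atoms * out_capsules * out_atoms
    if not shared:
        weights *= in_capsules
    # Assumption: one bias per output atom, independent of the child capsules.
    biases = out_capsules * out_atoms
    return weights + biases


# Assumed sizes for primarycaps -> conv_cap_2_1 (not confirmed by the summary).
shared_count = conv_caps_params(5, 16, 4, 16, in_capsules=2, shared=True)
per_child_count = conv_caps_params(5, 16, 4, 16, in_capsules=2, shared=False)

print("shared across child capsules:", shared_count)
print("separate per child capsule:  ", per_child_count)
```

Under these assumptions the shared count matches the 25664 in the summary, while per-child weights would give slightly less than 25664*2, because only the kernel is doubled and the bias is not.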
Could anyone please help me figure this out? Thanks!
The following is part of the model params summary for your reference.
