Hi, sorry to bother you again. I found some code in the WeightRegressor class like:
out = torch.matmul(out, self.w1) + self.b1
out = out.view(bs, self.in_channels, self.hidden_dim)
out = torch.matmul(out, self.w2) + self.b2
kernel = out.view(bs, self.out_channels, self.in_channels, self.kernel_size, self.kernel_size) # e.g. (bs, 512, 512, 3, 3)
Why not use linear layers instead, like:
w1 = nn.Linear(128, 65536)
w2 = nn.Linear(65536, 2359296)
out = w1(out)
out = w2(out)
kernel = out.view(bs, self.out_channels, self.in_channels, self.kernel_size, self.kernel_size)
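To make the shape question concrete, here is a minimal runnable sketch of the matmul version. The concrete sizes (hidden_dim=128, in_channels=out_channels=512, kernel_size=3) are assumptions inferred from the comment "(bs, 512, 512, 3, 3)" and the 65536/2359296 figures above, not taken from the actual repository. Note that in this version w2 is applied with a batched matmul, so it is shared across the channel axis rather than being one giant 65536-to-2359296 matrix:

```python
import torch

# Assumed sizes, inferred from the snippet (65536 = 512 * 128, 2359296 = 512 * 512 * 3 * 3)
bs, in_channels, out_channels = 2, 512, 512
hidden_dim, kernel_size = 128, 3

x = torch.randn(bs, hidden_dim)
# First projection: hidden_dim -> in_channels * hidden_dim (128 -> 65536)
w1 = torch.randn(hidden_dim, in_channels * hidden_dim)
b1 = torch.randn(in_channels * hidden_dim)
# Second projection, shared across channels: hidden_dim -> out_channels * k * k (128 -> 4608)
w2 = torch.randn(hidden_dim, out_channels * kernel_size * kernel_size)
b2 = torch.randn(out_channels * kernel_size * kernel_size)

out = torch.matmul(x, w1) + b1                # (bs, 65536)
out = out.view(bs, in_channels, hidden_dim)   # (bs, 512, 128)
# Batched matmul: the same w2 is applied to every one of the 512 channel rows
out = torch.matmul(out, w2) + b2              # (bs, 512, 4608)
kernel = out.view(bs, out_channels, in_channels, kernel_size, kernel_size)
print(kernel.shape)  # torch.Size([2, 512, 512, 3, 3])
```

Under these assumed sizes, w2 holds 128 * 4608 ≈ 0.6M parameters, whereas the proposed nn.Linear(65536, 2359296) would hold 65536 * 2359296 ≈ 1.5e11 parameters, so the two formulations are not equivalent in cost even though the output shape matches.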