Hi,
Thanks for the contribution and updated code on Supervised Contrastive Learning.
My question is related to this part of the loss:
https://github.com/google-research/syn-rep-learn/blob/main/StableRep/models/losses.py#L79
local_batch_size = feats.size(0)
...
# Create label matrix, since in our specific case the
# label matrix in side each batch is the same, so
# we can just create it once and reuse it. For other
# cases, user need to compute it for each batch
if local_batch_size != self.last_local_batch_size:
etc....
My understanding is that, for a given batch in distributed setting, the label tensor (after all_gather) will be identical across all the gpus so no need to compute it multiple times. Just once per batch is enough.
My question is then on the condition local_batch_size != self.last_local_batch_size:: Why the check is done on the batch size and not on the tensors values, isn't the batch size pretty much the same during training ?
Thank you !
Hi,
Thanks for the contribution and updated code on Supervised Contrastive Learning.
My question is related to this part of the loss:
https://github.com/google-research/syn-rep-learn/blob/main/StableRep/models/losses.py#L79
My understanding is that, for a given batch in distributed setting, the label tensor (after all_gather) will be identical across all the gpus so no need to compute it multiple times. Just once per batch is enough.
My question is then on the condition
local_batch_size != self.last_local_batch_size:: Why the check is done on the batch size and not on the tensors values, isn't the batch size pretty much the same during training ?Thank you !