Skip to content

Code Review Questions — MobileStyleGAN.pytorch #63

@bes-dev

Description

@bes-dev

Code Review Questions

Automated code review findings for bes-dev/MobileStyleGAN.pytorch.

  1. train.py:47raise "Unknown export format." raises a string literal rather than an exception instance; in Python 3 this causes a TypeError instead of the intended error.
  2. train.py:22ModelCheckpoint is constructed with the deprecated filepath keyword argument (renamed to dirpath/filename in PyTorch Lightning ≥1.2), which will break on current library versions.
  3. core/distiller.py:76 — In validation_step, self.student is called twice with identical inputs; the first call's result is immediately discarded, doubling unnecessary inference cost.
  4. core/distiller.py:121compute_mean_style accepts a batch_size parameter but always uses the hardcoded literal 4096 when sampling, so passing a different value has no effect.
  5. core/distiller.py:100 — The else branch of make_sample generates a noise tensor of shape (self.wsize, style_dim) reshaped to (1, wsize, style_dim), always producing a batch of exactly 1 regardless of cfg.batch_size, causing a shape mismatch when the generator/discriminator steps expect batch_size samples.
  6. core/distiller.py:84 — A TODO comment acknowledges that validation_epoch_end does not all_gather tensors across GPUs, meaning multi-GPU KID/loss aggregation is silently incorrect in distributed training.
  7. core/models/synthesis_network.py:72 — Inside the layer loop, _style is correctly computed as the per-block W+ style slice, but style (the full un-sliced tensor) is passed to m(hidden, style, _noise) instead of _style, effectively ignoring W+ style mixing for the teacher synthesis network.
  8. core/models/modules/mobile_synthesis_block.py:41 — Both the up (IDWTUpsample) and conv1 layers index the same style slot (style[:, 0, :]), so the upsampling modulation and the first conv share an identical style vector; given wsize() returns 3 this looks like a copy-paste oversight.
  9. core/models/modules/idwt_upsample.py:9 — The class is named IDWTUpsaplme — a typo for IDWTUpsample — which makes the public API confusing and error-prone.
  10. core/models/modules/multichannel_image.py:6 — The class is named MultichannelIamge — a typo for MultichannelImage — propagated throughout imports.
  11. core/models/modules/ops/fused_act.py:30raise NotImplemented raises the built-in singleton (not an exception) instead of raise NotImplementedError(); on non-CUDA, non-CPU devices this silently produces a TypeError rather than the intended error.
  12. core/models/modules/ops/upfirdn2d.py:18 — Same raise NotImplemented bug as in fused_act.py: should be raise NotImplementedError().
  13. core/model_zoo.py:6json.load(open(zoo_path)) opens the file without a with statement, leaving the file descriptor unclosed if an exception is raised.
  14. core/loss/perceptual_loss.py:14 — VGG16 is loaded with the deprecated pretrained=True kwarg; newer torchvision versions require weights=VGG16_Weights.DEFAULT and will emit warnings or break.
  15. core/models/inception_v3.py:118models.inception_v3(pretrained=True) and models.inception_v3(pretrained=False, ...) use the deprecated pretrained keyword, which is removed in recent torchvision versions.
  16. core/loss/non_saturating_gan_loss.py:54real.requires_grad = True mutates the incoming tensor in-place inside reg_d, which can silently corrupt the gradient graph of the caller and cause unexpected side-effects in subsequent training steps.
  17. core/utils.py:11 — In tensor_to_img, if to_numpy=False is passed with the default rgb2bgr=True, the code attempts img[:, :, ::-1] on a PyTorch tensor, which raises a RuntimeError because negative-step slicing is unsupported on tensors.
  18. evaluate_fid.py:102 — A BatchNorm2d layer is trained on the dataset and then used to normalize images before Inception embedding, which is a non-standard normalisation method that will produce FID scores incomparable to any published benchmark.
  19. core/models/modules/noise_injection.py:12 — In trace_model mode, self.register_buffer("noise", noise) is called every forward pass if the buffer does not yet exist, but re-running register_buffer with a new value after the first call silently overwrites the buffer — the guard not hasattr(self, "noise") is insufficient because hasattr will be True after the first call even if the buffer was registered on a different input size.
  20. core/distiller.py:155 — In to_onnx, after tracing the mapping network with var of shape (wsize, style_dim) for W+ mode, the Wrapper is initialised with self.student(style_tmp) inside its __init__, which runs a full forward pass (including noise registration) before the ONNX trace starts, potentially capturing stale buffer state.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions