Code Review Questions

## Code Review Questions

Automated code review findings for `bes-dev/MobileStyleGAN.pytorch`.

1. `train.py:47` — `raise "Unknown export format."` raises a string literal rather than an exception instance; in Python 3 this causes a `TypeError` instead of the intended error.
2. `train.py:22` — `ModelCheckpoint` is constructed with the deprecated `filepath` keyword argument (renamed to `dirpath`/`filename` in PyTorch Lightning ≥1.2), which will break on current library versions.
3. `core/distiller.py:76` — In `validation_step`, `self.student` is called twice with identical inputs; the first call's result is immediately discarded, doubling unnecessary inference cost.
4. `core/distiller.py:121` — `compute_mean_style` accepts a `batch_size` parameter but always uses the hardcoded literal `4096` when sampling, so passing a different value has no effect.
5. `core/distiller.py:100` — The `else` branch of `make_sample` generates a noise tensor of shape `(self.wsize, style_dim)` reshaped to `(1, wsize, style_dim)`, always producing a batch of exactly 1 regardless of `cfg.batch_size`, causing a shape mismatch when the generator/discriminator steps expect `batch_size` samples.
6. `core/distiller.py:84` — A TODO comment acknowledges that `validation_epoch_end` does not `all_gather` tensors across GPUs, meaning multi-GPU KID/loss aggregation is silently incorrect in distributed training.
7. `core/models/synthesis_network.py:72` — Inside the layer loop, `_style` is correctly computed as the per-block W+ style slice, but `style` (the full un-sliced tensor) is passed to `m(hidden, style, _noise)` instead of `_style`, effectively ignoring W+ style mixing for the teacher synthesis network.
8. `core/models/modules/mobile_synthesis_block.py:41` — Both the `up` (IDWTUpsample) and `conv1` layers index the same style slot (`style[:, 0, :]`), so the upsampling modulation and the first conv share an identical style vector; given `wsize()` returns 3 this looks like a copy-paste oversight.
9. `core/models/modules/idwt_upsample.py:9` — The class is named `IDWTUpsaplme` — a typo for `IDWTUpsample` — which makes the public API confusing and error-prone.
10. `core/models/modules/multichannel_image.py:6` — The class is named `MultichannelIamge` — a typo for `MultichannelImage` — propagated throughout imports.
11. `core/models/modules/ops/fused_act.py:30` — `raise NotImplemented` raises the built-in singleton (not an exception) instead of `raise NotImplementedError()`; on non-CUDA, non-CPU devices this silently produces a `TypeError` rather than the intended error.
12. `core/models/modules/ops/upfirdn2d.py:18` — Same `raise NotImplemented` bug as in `fused_act.py`: should be `raise NotImplementedError()`.
13. `core/model_zoo.py:6` — `json.load(open(zoo_path))` opens the file without a `with` statement, leaving the file descriptor unclosed if an exception is raised.
14. `core/loss/perceptual_loss.py:14` — VGG16 is loaded with the deprecated `pretrained=True` kwarg; newer torchvision versions require `weights=VGG16_Weights.DEFAULT` and will emit warnings or break.
15. `core/models/inception_v3.py:118` — `models.inception_v3(pretrained=True)` and `models.inception_v3(pretrained=False, ...)` use the deprecated `pretrained` keyword, which is removed in recent torchvision versions.
16. `core/loss/non_saturating_gan_loss.py:54` — `real.requires_grad = True` mutates the incoming tensor in-place inside `reg_d`, which can silently corrupt the gradient graph of the caller and cause unexpected side-effects in subsequent training steps.
17. `core/utils.py:11` — In `tensor_to_img`, if `to_numpy=False` is passed with the default `rgb2bgr=True`, the code attempts `img[:, :, ::-1]` on a PyTorch tensor, which raises a `RuntimeError` because negative-step slicing is unsupported on tensors.
18. `evaluate_fid.py:102` — A `BatchNorm2d` layer is trained on the dataset and then used to normalize images before Inception embedding, which is a non-standard normalisation method that will produce FID scores incomparable to any published benchmark.
19. `core/models/modules/noise_injection.py:12` — In `trace_model` mode, `self.register_buffer("noise", noise)` is called every forward pass if the buffer does not yet exist, but re-running `register_buffer` with a new value after the first call silently overwrites the buffer — the guard `not hasattr(self, "noise")` is insufficient because `hasattr` will be `True` after the first call even if the buffer was registered on a different input size.
20. `core/distiller.py:155` — In `to_onnx`, after tracing the mapping network with `var` of shape `(wsize, style_dim)` for W+ mode, the `Wrapper` is initialised with `self.student(style_tmp)` inside its `__init__`, which runs a full forward pass (including noise registration) before the ONNX trace starts, potentially capturing stale buffer state.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code Review Questions — MobileStyleGAN.pytorch #63

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Code Review Questions — MobileStyleGAN.pytorch #63

Description