fix(torch): honor drop_last in all to_dataloader() modes by d-laub · Pull Request #207 · mcvickerlab/GenVarLoader

d-laub · 2026-06-05T09:10:27Z

Summary

Dataset.to_dataloader() did not honor drop_last consistently across modes. This fixes both directions of the defect:

Buffered / double_buffered modes silently dropped the final partial batch even when drop_last=False. Now the trailing partial batch is kept.
Default mode (mode=None) crashed with drop_last=True because drop_last was forwarded to td.DataLoader alongside batch_size=None (PyTorch rejects that combination). The BatchSampler is now the sole authority on dropping the partial batch.
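For reviewers who want the intended semantics spelled out, here is a minimal pure-Python sketch of how a batch sampler conventionally treats the final partial batch. It is an illustration only: `iter_batches` is a hypothetical helper, not code from this PR or from PyTorch, and it assumes that only the last batch can ever be short.

```python
from typing import Iterator, List, Sequence


def iter_batches(
    indices: Sequence[int], batch_size: int, drop_last: bool
) -> Iterator[List[int]]:
    """Yield consecutive index batches; hypothetical stand-in for a batch sampler."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(indices), batch_size):
        batch = list(indices[start : start + batch_size])
        # Only the final batch can be short. Drop it only when asked to.
        if len(batch) < batch_size and drop_last:
            return
        yield batch


# 10 indices with batch_size=4 leave a trailing batch of 2.
kept = list(iter_batches(range(10), batch_size=4, drop_last=False))
dropped = list(iter_batches(range(10), batch_size=4, drop_last=True))
```

With `drop_last=False` the trailing two indices survive as a third batch; with `drop_last=True` only the two full batches remain. Keeping this decision in one place is what "sole authority" means above.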

Changes

_chunked.py — ChunkPlanner no longer requires the index count to be a multiple of batch_size; it emits the trailing partial batch as a remainder entry in batch_totals and clamps the final chunk slice to n.
_torch.py — _resolve_buffered_inputs gates partial-batch truncation on drop_last; get_dataloader stops forwarding drop_last to td.DataLoader in default mode; a directly-passed BatchSampler's batch_size is adopted for buffered re-batching (with a warning when it conflicts with an explicit batch_size).
_buffered_loader.py — __len__ floor → ceil so it matches what iteration yields.
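The planner and `__len__` changes can be pictured with a rough sketch. Everything here is an assumption made for illustration: `plan_chunks`, `num_batches`, and the `batches_per_chunk` parameter are hypothetical names, and `batch_totals` is modeled as a list of per-batch instance counts, which may not match the real data structure in `_chunked.py`.

```python
from typing import List, Tuple


def plan_chunks(
    n: int, batch_size: int, batches_per_chunk: int
) -> List[Tuple[slice, List[int]]]:
    """Split n indices into chunks and return (index slice, batch sizes) per chunk."""
    if n < 0 or batch_size <= 0 or batches_per_chunk <= 0:
        raise ValueError("n must be >= 0; batch_size and batches_per_chunk must be > 0")
    chunk_len = batch_size * batches_per_chunk
    plan = []
    for start in range(0, n, chunk_len):
        stop = min(start + chunk_len, n)  # clamp the final chunk slice to n
        full, remainder = divmod(stop - start, batch_size)
        batch_totals = [batch_size] * full
        if remainder:
            batch_totals.append(remainder)  # trailing partial batch
        plan.append((slice(start, stop), batch_totals))
    return plan


def num_batches(n: int, batch_size: int) -> int:
    """Ceil division, so the reported length matches what the plan yields."""
    return -(-n // batch_size)


# n=10 is not a multiple of batch_size=4, which the old planner would reject.
plan = plan_chunks(10, batch_size=4, batches_per_chunk=2)
```

In this sketch the second chunk covers `slice(8, 10)` and carries a single batch of 2 instances, and the number of planned batches agrees with `num_batches(10, 4)`.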

Test Plan

pytest tests/unit/test_chunk_planner.py tests/unit/test_torch.py tests/unit/test_buffered_loader.py tests/unit/test_double_buffered_loader.py → 39 passed, 1 skipped (1kg-gated)
New tests cover the full mode × drop_last matrix, partial-batch instance count, default-mode drop_last=True regression, and a DDP-shaped custom-BatchSampler case
End-to-end repro on the dummy dataset confirms ceil(N/bs) batches with drop_last=False and N//bs with drop_last=True in both default and buffered modes
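The invariants that the test matrix checks can be written down as a small reference oracle. This is a sketch under stated assumptions, not the PR's test code: `expected_num_batches` and `expected_num_instances` are hypothetical helpers that encode ceil(N/bs) versus N//bs and the matching instance counts.

```python
def expected_num_batches(n: int, batch_size: int, drop_last: bool) -> int:
    """N // bs when dropping the partial batch, ceil(N / bs) otherwise."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, remainder = divmod(n, batch_size)
    if drop_last or remainder == 0:
        return full
    return full + 1


def expected_num_instances(n: int, batch_size: int, drop_last: bool) -> int:
    """Total instances yielded across all batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return n - (n % batch_size) if drop_last else n


# Sweep both drop_last settings over sizes around a batch boundary.
matrix = {
    (n, drop_last): expected_num_batches(n, 4, drop_last)
    for n in (7, 8, 9, 10)
    for drop_last in (False, True)
}
```

A loader under test can then be compared against these values for every mode, for example by asserting that `len(loader)` and the number of batches actually iterated both equal `expected_num_batches(N, bs, drop_last)`.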

🤖 Generated with Claude Code

Two defects: buffered modes ignore drop_last=False (unconditional n_keep truncation + ChunkPlanner divisibility requirement), and default mode crashes on drop_last=True (drop_last forwarded to DataLoader alongside batch_size=None). Design teaches ChunkPlanner about a trailing partial batch and stops forwarding drop_last in the default path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

d-laub and others added 8 commits June 5, 2026 01:07

docs(plan): to_dataloader drop_last fix implementation plan

b033a41

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fix(chunked): keep trailing partial batch in ChunkPlanner

e78c1f9

fix(torch): do not forward drop_last to DataLoader in default mode

b1d1e1d

fix(torch): buffered modes honor drop_last=False

f5eead9

fix(torch): warn when BatchSampler overrides explicit batch_size

33b0009

style: apply ruff format to test_torch.py

96f45f0

style(torch): use f-string for BatchSampler warning

42061a3

d-laub merged commit 5e49833 into main Jun 5, 2026
7 checks passed

d-laub deleted the fix/dataloader-drop-last branch June 5, 2026 09:25
