[Torchvision API] Input metadata #6364
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
@greptileai review
| Filename | Overview |
|---|---|
| dali/python/nvidia/dali/experimental/torchvision/v2/functional/image_metadata.py | New pure Python/PIL/torch compatibility layer implementing get_image_size and get_dimensions; no DALI-specific logic, straightforward mirroring of torchvision behavior. |
| dali/python/nvidia/dali/experimental/torchvision/v2/functional/crop.py | New functional crop operator delegating to ndd.slice with out_of_bounds_policy=pad; validates PIL vs tensor layout, rounds PIL box coordinates, delegates size validation to RandomCrop.verify_args. |
| dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py | New RandomCrop operator; uses fn.slice with negative anchor + out_of_bounds_policy for padding; _randint clamps negative max_value to 0; needs_padding should be _needs_padding per internal-state convention. |
| dali/test/python/torchvision/test_tv_image_metadata.py | New tests cross-validating against torchvision for both PIL and tensor inputs including GPU; error-case tests are missing glob= patterns on assert_raises calls. |
| dali/test/python/torchvision/test_tv_randomcrop.py | Comprehensive RandomCrop tests covering identity crops, padding modes, pad_if_needed, asymmetric padding, randomness sampling, and invalid-arg rejection; two dict-fill tests correctly skipped with @unittest.skip. |
| dali/test/python/torchvision/test_tv_crop.py | New crop functional tests; covers tensor, PIL (L/RGB/RGBA), batched tensor, dtype preservation, float-param rejection, and out-of-bounds; validates against torchvision reference. |
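
For reviewers unfamiliar with the two metadata helpers, here is a minimal sketch of the return conventions they are expected to mirror. It assumes torchvision's conventions, where `get_dimensions` returns `[channels, height, width]` and `get_image_size` returns `[width, height]`. The functions below are hypothetical, operate on plain shape tuples, and are not the code in `image_metadata.py`.

```python
from typing import List, Sequence


def get_dimensions_from_shape(shape: Sequence[int]) -> List[int]:
    """Return [channels, height, width] for a (..., C, H, W) or (H, W) shape.

    Hypothetical helper: assumes a channels-first tensor layout and treats
    a 2-D shape as a single-channel image.
    """
    if len(shape) < 2:
        raise ValueError(f"expected at least 2 dimensions, got {len(shape)}")
    height, width = shape[-2], shape[-1]
    channels = 1 if len(shape) == 2 else shape[-3]
    return [channels, height, width]


def get_image_size_from_shape(shape: Sequence[int]) -> List[int]:
    """Return [width, height]; note the order is swapped relative to the shape."""
    _, height, width = get_dimensions_from_shape(shape)
    return [width, height]


def pil_like_dimensions(size: Sequence[int], num_bands: int) -> List[int]:
    """Return [channels, height, width] from a PIL-style (width, height) size."""
    width, height = size
    return [num_bands, height, width]
```

The point worth checking in review is the ordering: PIL reports size as `(width, height)`, while tensor shapes end in `(height, width)`, so the two input paths need different indexing to produce the same result.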
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User input: PIL Image or torch.Tensor] --> B{Input type}
    B -->|torch.Tensor| C[adjust_input decorator converts to DALI tensor]
    B -->|PIL Image| D[adjust_input decorator converts to DALI tensor HWC]
    C --> E[crop or RandomCrop._kernel]
    D --> E
    E --> F{needs padding?}
    F -->|Yes| G[fn.slice negative anchor + out_of_bounds_policy=pad]
    F -->|No| H[fn.slice / ndd.slice no padding policy]
    G --> I[DALI output tensor / batch]
    H --> I
```
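
The "negative anchor" branch in the flowchart can be illustrated with a small pure-Python sketch. It assumes the anchor is sampled in padded-image coordinates and then shifted back into original-image coordinates, so that a negative value means the crop window starts inside the padded region. All names here (`randint_clamped`, `sample_anchor`, `window_needs_padding`) are hypothetical and only approximate the behavior described for `_randint` and `RandomCrop`; no DALI calls are used.

```python
import random
from typing import Optional, Tuple


def randint_clamped(max_value: int, rng: random.Random) -> int:
    """Sample an integer in [0, max_value], treating a negative bound as 0."""
    return rng.randint(0, max(max_value, 0))


def sample_anchor(
    image_hw: Tuple[int, int],
    crop_hw: Tuple[int, int],
    padding_ltrb: Tuple[int, int, int, int] = (0, 0, 0, 0),
    rng: Optional[random.Random] = None,
) -> Tuple[int, int]:
    """Return a (top, left) anchor in original-image coordinates."""
    if rng is None:
        rng = random.Random()
    height, width = image_hw
    crop_h, crop_w = crop_hw
    left, top, right, bottom = padding_ltrb
    # Sample inside the (virtual) padded image ...
    top_in_padded = randint_clamped(height + top + bottom - crop_h, rng)
    left_in_padded = randint_clamped(width + left + right - crop_w, rng)
    # ... then shift back; the result may be negative.
    return top_in_padded - top, left_in_padded - left


def window_needs_padding(
    image_hw: Tuple[int, int], crop_hw: Tuple[int, int], anchor: Tuple[int, int]
) -> bool:
    """True if the crop window extends outside the original image."""
    height, width = image_hw
    crop_h, crop_w = crop_hw
    top, left = anchor
    return top < 0 or left < 0 or top + crop_h > height or left + crop_w > width
```

Under these assumptions, a 4x4 image padded by 2 on every side and cropped to 8x8 always yields the anchor `(-2, -2)`, which is the case where an out-of-bounds padding policy on the slice would be required.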
Reviews (4): Last reviewed commit: "Review fixes"
15c1775 to c024bac
@greptileai re-review
!build
CI MESSAGE: [52613366]: BUILD STARTED
CI MESSAGE: [52613366]: BUILD PASSED
* Different index support for tensor and PILImages Signed-off-by: Marek Dabek <mdabek@nvidia.com>
c024bac to b88990d
Category:
New feature
Description:
Torchvision functional API operator to get metadata of input:
Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A