Skip to content

Limit CPU video decoder codec support#6352

Merged
JanuszL merged 3 commits into
NVIDIA:mainfrom
JanuszL:remove_cpu_codecs
May 26, 2026
Merged

Limit CPU video decoder codec support#6352
JanuszL merged 3 commits into
NVIDIA:mainfrom
JanuszL:remove_cpu_codecs

Conversation

@JanuszL
Copy link
Copy Markdown
Contributor

@JanuszL JanuszL commented May 18, 2026

Limit CPU video decoder codec support

Restrict the CPU frames decoder to codecs supported by the currently
compiled libavcodec configuration. H264 and HEVC are no longer
advertised for the CPU variant while VP8, VP9, and MJPEG remain
enabled.

Make ReadRegularFrame mark end-of-stream by setting next_frame_idx_
to -1 when the index reaches NumFrames(), mirroring the existing
guard in ReadFlushFrame. Without this, codecs with no decoder latency
(VP9 on the new test inputs) deliver the final frame via the regular
path, leaving next_frame_idx_ at NumFrames() and causing
VideoInput depletion to be reported one batch late.

Reset the decoder when an indexed next frame falls outside the valid
range, avoiding reuse of an invalid decoder position.

Update video decoder tests to expect CPU failures for unsupported
codecs instead of skipping only MPEG4. Use VP9 CFR/VFR test inputs
and device-less CPU pipelines where appropriate. Point the CFR/VFR
reference frame folders at `frames_{1,2}_vp9/` so CPU decode of the
new VP9 fixtures matches at the existing eps=10 tolerance. Drop the
CPU HEVC frames-decoder tests (`ConstantFrameRateHevc`,
`VariableFrameRateHevc`, `VariableFrameRateHevcNoIndex`) — HEVC is
no longer in the CPU codec allow-list.

Tolerate up to 16 isolated subpixel deviations exceeding eps in
TestVideo::CompareFrame (out of ~2.7M subpixels per frame). The CPU
VP9 decode path occasionally produces a single byte that differs by
~32 — a SIMD glitch inside libavcodec/sws_scale that Valgrind cannot
instrument. The budget is orders of magnitude below what any genuine
regression would produce, so test sensitivity is preserved.

In dali/test/python/input/test_video.py, filter out h264 from the
round-robin fixture (the unsuffixed test_{1,2}.mp4 in cfr//vfr/
are h264) and restrict test_video_input_audio_stream to the mixed
backend — the only DALI_extra video with an audio stream is h264.

Category:

Bug fix (non-breaking change which fixes an issue)

Description:

Restricts the CPU video frames decoder to the codecs supported by the
currently compiled libavcodec configuration. H264 and HEVC are no longer
advertised for the CPU variant, while VP8, VP9, and MJPEG remain enabled.

ReadRegularFrame now mirrors ReadFlushFrame and signals end-of-stream
by setting next_frame_idx_ to -1 once NumFrames() is reached, so
codecs with no decoder latency report depletion immediately instead of
one batch late.

Resets the decoder when an indexed next frame falls outside the valid
range, avoiding reuse of an invalid decoder position.

The video decoder tests now expect CPU failures for unsupported codecs
instead of skipping only MPEG4. The affected CFR/VFR test inputs are
switched to VP9 variants, and CPU pipelines use device_id=None where
appropriate. The CFR/VFR reference frame folders are repointed at the
new VP9-derived frames_{1,2}_vp9/ so CPU decode matches at the
existing eps=10 tolerance. CPU HEVC frames-decoder tests are removed.

TestVideo::CompareFrame now tolerates up to 16 isolated subpixel
deviations exceeding eps per frame (out of ~2.7M). The CPU VP9 decode
path occasionally produces a single byte that differs by ~32 — a SIMD
glitch inside libavcodec/sws_scale that Valgrind cannot instrument.
The budget is orders of magnitude below any genuine regression, so
test sensitivity is preserved.

dali/test/python/input/test_video.py filters out h264 from the
round-robin fixture and restricts test_video_input_audio_stream to
the mixed backend — the only DALI_extra video with an audio stream is
h264, which CPU can no longer decode.

Additional information:

Affected modules and functionalities:

  • CPU video frames decoder: codec allow-list and ReadRegularFrame
    end-of-stream signalling.
  • Video frames decoder seek path (reset when seek target is out of
    range).
  • Video decoder gtests: reference frame paths, dropped CPU HEVC cases,
    relaxed CompareFrame tolerance.
  • dali/test/python/input/test_video.py: h264 fixture filter and
    audio-stream test backend restriction.
  • DALI deps and DALI_extra version pins.

Key points relevant for the review:

  • DALI_DEPS_VERSION and DALI_EXTRA_VERSION are temporary ToDo
    placeholders until the corresponding dali_deps and dali_extra
    repository changes merge.
  • CPU unsupported-codec behavior is now tested as an expected failure
    instead of being skipped for a subset of codecs.
  • The ReadRegularFrame EOS guard is required for VideoInput
    depletion to fire on the right batch with VP9 inputs (h264 hid the
    off-by-one through its decoder latency, which routed the tail
    through ReadFlushFrame).
  • The 16-subpixel tolerance in CompareFrame is a flake mitigation,
    not a tolerance loosening: a real codec/colorspace bug would touch
    thousands of subpixels.

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Not run locally.

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-4712

@JanuszL
Copy link
Copy Markdown
Contributor Author

JanuszL commented May 18, 2026

NVIDIA/DALI_extra#135 & NVIDIA/DALI_deps#162 are related to this change.

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51666287]: BUILD STARTED

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR restricts the CPU video frames decoder to codecs supported by the compiled libavcodec (VP8, VP9, MJPEG only — H264 and HEVC removed), fixes an off-by-one end-of-stream signal in ReadRegularFrame, and resets the decoder when a seek target falls outside the valid index range.

  • Core decoder changes: SelectVideoStream now uses a correctly-sized std::array<AVCodecID, 3>, ReadRegularFrame sets next_frame_idx_ = -1 on exhaustion (mirroring the existing ReadFlushFrame guard), and SeekFrame gains a reset path for out-of-bounds indexed positions.
  • Test infrastructure updates: CPU HEVC gtest cases are dropped; unsupported-CPU-codec paths now assert a RuntimeError instead of being skipped; cfr/vfr reference fixtures are repointed to new VP9 variants; CompareFrame tolerates up to 16 isolated subpixel deviations to suppress an sws_scale SIMD flake.
  • CI setup: TL0_videoreader_test/test.sh is updated to copy the VP9 multi-resolution directory and use the VP9 Sintel container.

Confidence Score: 4/5

Safe to merge once any outstanding questions around NumFrames() re-entrancy after Reset() are confirmed; the codec restriction and EOS-fix logic are sound.

The codec allow-list change, EOS guard, and seek-reset path are all well-reasoned and internally consistent. The early NumFrames() call in ReadNextFrame is the right approach to pre-populate the index, but its interaction with Reset() — which zeroes next_frame_idx_ and reopens the stream — means ParseNumFrames() could be re-invoked after every seek-triggered reset. Whether that is safe depends on ParseNumFrames() being idempotent after the file is reopened.

frames_decoder_cpu.cc — specifically the NumFrames() / ParseNumFrames() interaction after Reset() in a seek-heavy workload.

Important Files Changed

Filename Overview
dali/operators/video/frames_decoder_cpu.cc Restricts supported codecs to VP8/VP9/MJPEG (array size correctly fixed to 3), adds EOS guard in ReadRegularFrame, pre-populates index via NumFrames() on first frame, and tolerates AVERROR_EOF from avcodec_send_packet(null). Logic looks sound; one open question around NumFrames() re-entrancy after Reset().
dali/operators/video/frames_decoder_base.cc SeekFrame gains a guard to reset the decoder when next_frame_idx_ is beyond NumFrames() (guarded by HasIndex()), mirroring the existing negative-index reset. Reset() sets next_frame_idx_=0, so the assert after the new condition holds.
dali/operators/video/video_test.cc CompareFrame switched from early-exit on first mismatch to counting all bad subpixels; tolerates up to 16 isolated deviations to suppress VP9/sws_scale SIMD flake. Reference frame paths updated to _vp9 variants. Copyright header is stale (2021-2022).
dali/test/python/decoder/test_video.py Adds assert_unsupported_cpu_codec helper and exercises all previously-skipped CPU unsupported codecs as expected-failure tests; fixes device_id=None for CPU pipelines; updates cfr/vfr paths to VP9 fixtures; removes pipe.build() in several places relying on auto-build.
dali/test/python/input/test_video.py Filters out H264 (test_1/test_2.mp4) from the round-robin fixture; switches audio-stream test to sintel_trailer_vp9.mp4. The test_video_input_audio_stream docstring may no longer accurately describe the scenario if the VP9 file has no audio track.
qa/TL0_videoreader_test/test.sh Switches resolution test fixture from the full video_resolution tree to the VP9 subdirectory; switches sintel container to the VP9 variant. The previous find -name 'vp9' bug was fixed.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[ReadNextFrame called] --> B{next_frame_idx_ == 0?}
    B -->|yes| C[Call NumFrames to populate index]
    C --> D{next_frame_idx_ == -1?}
    B -->|no| D
    D -->|yes| E[return false EOS]
    D -->|no| F{flush_state_?}
    F -->|no| G[ReadRegularFrame]
    F -->|yes| H[ReadFlushFrame]
    G --> I{frame received?}
    I -->|yes| J[CopyToOutput + increment idx]
    J --> K{idx >= NumFrames?}
    K -->|yes| L[idx = -1]
    K -->|no| M[return true]
    L --> M
    I -->|no| N[send null packet, flush_state_=true, return false]
    H --> O{frame received?}
    O -->|no| P[flush_state_=false, idx=-1, return false]
    O -->|yes| Q[CopyToOutput + increment idx]
    Q --> R{idx >= NumFrames?}
    R -->|yes| S[idx = -1]
    R -->|no| T[return true]
    S --> T
Loading

Fix All in Claude Code

Reviews (21): Last reviewed commit: "Typo fix" | Re-trigger Greptile

Comment thread dali/operators/video/frames_decoder_cpu.cc Outdated
Comment thread DALI_DEPS_VERSION Outdated
@@ -1 +1 @@
b270f29e9d7655512e7e8eaf055cca4d19b55f55
ToDo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 [Bug] Version pin files set to literal ToDo placeholder.

Both DALI_DEPS_VERSION and DALI_EXTRA_VERSION now contain the string ToDo instead of a commit SHA. Any CI job that reads these files to fetch the matching dali_deps / dali_extra artefacts will either fail outright or pick up an incorrect/stale revision, breaking reproducibility for the entire build. These files should be updated to the real commit SHAs before this PR merges (or the dependent PRs should land first).

Fix in Claude Code

Comment thread dali/test/python/decoder/test_video.py Outdated
Comment thread dali/test/python/decoder/test_video.py
@JanuszL JanuszL force-pushed the remove_cpu_codecs branch 2 times, most recently from 13fd2c1 to 8375cb2 Compare May 18, 2026 22:02
@JanuszL
Copy link
Copy Markdown
Contributor Author

JanuszL commented May 18, 2026

@greptile review

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from 8375cb2 to 9389ee9 Compare May 18, 2026 22:18
@JanuszL
Copy link
Copy Markdown
Contributor Author

JanuszL commented May 18, 2026

@greptile review

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from 9389ee9 to d364075 Compare May 18, 2026 22:27
@JanuszL
Copy link
Copy Markdown
Contributor Author

JanuszL commented May 18, 2026

@greptile review

Comment on lines 160 to 165
++next_frame_idx_;
if (next_frame_idx_ >= NumFrames()) {
next_frame_idx_ = -1;
LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl;
}
return true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The new EOS guard calls NumFrames() unconditionally on every decoded frame, which can invoke ParseNumFrames() when no index is built and nb_frames is zero in the container. ParseNumFrames() reads all remaining demuxer packets to completion, so the very first frame's increment will exhaust the packet stream and cause all subsequent av_read_frame calls to return EOF — silently dropping every frame after the first. The existing guard in ReadFlushFrame has the same limitation (documented with a TODO) but that function runs only after the demuxer is already exhausted. The fix is to guard the check with HasIndex(), mirroring the SeekFrame condition added in this same PR.

Suggested change
++next_frame_idx_;
if (next_frame_idx_ >= NumFrames()) {
next_frame_idx_ = -1;
LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl;
}
return true;
++next_frame_idx_;
// TODO(awolant): Figure out how to handle this during index building
// Or when NumFrames is unavailable
if (HasIndex() && next_frame_idx_ >= NumFrames()) {
next_frame_idx_ = -1;
LOG_LINE << "Next frame index out of bounds (regular), setting to -1" << std::endl;
}
return true;

Fix in Claude Code

Comment thread dali/test/python/decoder/test_video.py Outdated
Comment on lines 711 to 712
batch_size = 3
pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 test_multichannel_fill_value hard-codes device_id=0 even though the test body uses fn.experimental.decoders.video which is a CPU/mixed operator; on a device-less CI machine this will fail at pipeline construction. Other tests in this PR were correctly updated to derive device_id from device, so this one was apparently missed.

Suggested change
batch_size = 3
pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=0)
batch_size = 3
device_id = None if device == "cpu" else 0
pipe = test_pipeline(batch_size=batch_size, num_threads=3, device_id=device_id)

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51735694]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51735694]: BUILD FAILED

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch 2 times, most recently from a8f64cc to fba22e3 Compare May 19, 2026 05:11
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51770028]: BUILD STARTED

Comment on lines 311 to 322
# test overflow of frame_buffer_
filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4")
filenames = filter(lambda filename: "mpeg4" not in filename, filenames)
filenames = filter(lambda filename: "hevc" not in filename, filenames)
filenames = filter(lambda filename: "av1" not in filename, filenames)
if device == "cpu":
# some formats are not yet supported in the CPU operator itself
filenames = filter(lambda filename: "mpeg4" not in filename, filenames)
filenames = filter(
lambda filename: "test_1.mp4" not in filename and "test_2.mp4" not in filename,
filenames,
)
filenames = cycle(filenames)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 cfr_test.mp4 is H264 and is not filtered for CPU.

The DALI_extra README.rst shows cfr_test.mp4 is generated with -c:v libx264, so it's H264. This file is appended to filenames before the CPU-conditional filters run. The CPU block filters mpeg4, test_1.mp4, and test_2.mp4, but cfr_test.mp4 slips through. When the CPU decoder encounters it, SelectVideoStream issues a DALI_WARN and returns false, causing a RuntimeError that fails the test.

The fix is to guard the filenames.append call (or add "cfr_test" not in filename to the CPU filter) so that the H264-specific overflow fixture is only used when the device can decode H264.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from fba22e3 to c87a527 Compare May 19, 2026 05:51
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51774937]: BUILD STARTED

Comment thread dali/test/python/decoder/test_video.py Outdated
Comment on lines +312 to +313
if device == "gpu":
filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 [Bug] Dead condition — cfr_test.mp4 is never appended for the "mixed" device.

test_multi_gpu_video is decorated with @params("cpu", "mixed"), so device is never "gpu". The frame-buffer overflow fixture (cfr_test.mp4) is silently skipped for the "mixed" case, which was the very path the file was added to stress. The condition should be device == "mixed" (or device != "cpu").

Suggested change
if device == "gpu":
filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4")
if device == "mixed":
filenames.append(f"{get_dali_extra_path()}/db/video/cfr_test.mp4")

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from c87a527 to 4e1a23c Compare May 19, 2026 06:14
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51777473]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51777473]: BUILD FAILED

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from 4e1a23c to fd7b970 Compare May 20, 2026 07:42
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51925234]: BUILD FAILED

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from fd7b970 to dfa6ff4 Compare May 20, 2026 07:49
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [51925957]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52057797]: BUILD FAILED

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch 2 times, most recently from d53209a to ce13c1e Compare May 22, 2026 04:56
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52186630]: BUILD STARTED

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from ce13c1e to 22563c3 Compare May 22, 2026 05:08
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52188114]: BUILD STARTED

Comment thread dali/test/python/decoder/test_video.py Outdated
}

unsupported_cpu_codec_error = r"is not supported by the CPU variant of this operator\."
unsupported_cpu_codecs = {"h264", "hevc", "mpeg4"}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Maintainability] This Python-side set duplicates the implicit allow-list in frames_decoder_cpu.cc:218-227 (everything not in {VP8, VP9, MJPEG}). When the C++ array changes, this set has to track it manually with nothing to enforce the link. Consider either:

  • A short comment here pointing to the C++ definition (# Must match the inverse of frames_decoder_cpu.cc SelectVideoStream's codec table), or
  • Defining the supported codecs once and deriving unsupported_cpu_codecs (the latter is harder across C++/Python).

Lowest-cost option is the cross-reference comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the point. It list the codecs which are not supported as in C++ definition. I don't think we can track this easily besides adding a comment that this comes from the CPU decoder implementation.

Comment thread dali/test/python/decoder/test_video.py Outdated
# some formats are not yet supported in the CPU operator itself
filenames = filter(lambda filename: "mpeg4" not in filename, filenames)
filenames = filter(
lambda filename: "test_1.mp4" not in filename and "test_2.mp4" not in filename,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Robustness] The substring filter "test_1.mp4" not in filename and "test_2.mp4" not in filename is fragile — any future file containing test_1.mp4 anywhere in its path (e.g. prefix_test_1.mp4, or a directory named test_1.mp4_backup) would be unintentionally dropped. Prefer an exact basename check:

excluded = {"test_1.mp4", "test_2.mp4"}
filenames = filter(lambda f: os.path.basename(f) not in excluded, filenames)

Same applies to the equivalent block at lines 356-358.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


ret = avcodec_send_packet(codec_ctx_, nullptr);
DALI_ENFORCE(ret >= 0,
DALI_ENFORCE(ret >= 0 || ret == AVERROR_EOF,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nit/Wording] Now that AVERROR_EOF is accepted as a non-error, the message "Failed to send packet to decoder" is misleading for ret == AVERROR_EOF — except, of course, that branch no longer triggers the ENFORCE. Still, if the ENFORCE does fire, the message reads better as "avcodec_send_packet (flush) failed: ..." so it's clear which call site it is.

More importantly: dropping into flush_state_ = true on AVERROR_EOF is the right thing, but worth a 1-line comment here explaining why EOF is acceptable (the decoder has already drained — no more packets to send) so this doesn't look like accidental error-swallowing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment

// Tolerate a tiny number of isolated subpixel deviations exceeding eps. The CPU VP9
// decode path occasionally produces a single byte that differs by ~32 (suspected SIMD
// glitch in libavcodec/sws_scale that Valgrind cannot instrument). A genuine regression
// produces orders of magnitude more bad subpixels, so this budget still catches real
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Question/Design] Tolerating 16 deviating subpixels per frame to mask a suspected libavcodec/sws_scale SIMD glitch is pragmatic, but it does mean a regression that introduces ≤16 wrong subpixels per frame will be silently absorbed. Given the suspected upstream cause, is there any plan to file an upstream bug and/or pin a known-good FFmpeg version, or to scope this tolerance only to the VP9 paths it actually affects (e.g. an extra arg to CompareFrame rather than a global cap)?

Not blocking — the rationale in the comment is clear and the 16/2.7M ratio is tiny — just want to make sure we don't lose the thread on the underlying glitch.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The threshold is set empirically based on the failures in tests observed. I don't know if there is a good FFmpeg we can pin. I still believe that 16 off pixels is still something we can tolerate, if something is seriously off the number if bigger.

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52188114]: BUILD FAILED

Comment thread dali/operators/video/frames_decoder_cpu.cc
Comment thread dali/operators/video/frames_decoder_base.cc


@params("cpu", "gpu")
@params("gpu")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Coverage] Dropping CPU from @params here is a real regression in test surface, not just a fixture-availability artifact. The test uses /db/video/full_dynamic_range/video.mp4 (H.264), which the new CPU codec policy rejects — so the only fix is a VP9 version of that fixture in DALI_extra. Worth filing a follow-up so this CPU path doesn't quietly stay uncovered, or extending the PR #135 set with video_vp9.mp4 here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code path that detect full range videos applies only to h264/h265, and for VP9 we would need to implement if separately. I will add this to our ToDo list.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up #6367

Comment thread dali/test/python/input/test_video.py Outdated
@params(*device_values)
# The only available test video with an audio stream (sintel) is h264, which is not supported by
# the CPU variant of the operator. Run this test only on the mixed (GPU) backend.
@params("mixed")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Coverage / Question] Same shape as the test_full_range_video_in_memory coverage loss — CPU dropped because the only audio-stream fixture is H.264. However: NVIDIA/DALI_extra#135 adds sintel_trailer_vp9.mp4. Does that file retain the audio stream from the original sintel? If yes, this test can stay parameterized over both devices by switching the input to the VP9 variant. If no, worth re-encoding to preserve audio so we don't lose this CPU-side audio-stream regression check permanently.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment thread dali/test/python/decoder/test_video.py Outdated
}

unsupported_cpu_codec_error = r"is not supported by the CPU variant of this operator\."
unsupported_cpu_codecs = {"h264", "hevc", "mpeg4"}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Drift risk — separate from the maintainability comment on the original review] This set claims to enumerate "unsupported on CPU" codecs but is missing AV1, which is also unsupported on the CPU path (no decoder in libavcodec build + commented out in frames_decoder_cpu.cc:223). It works today only because both test_decoder_operator_codec_support and test_reader_operator_codec_support early-skip AV1 with raise SkipTest("...Ampere+ GPUs..."). If/when AV1 becomes generally available and that skip is removed, the codec-rejection branch will be silently bypassed and the test will attempt a real decode on CPU. Add "av1" here defensively, or guard the skip with if codec == "av1" and not has_ampere(): raise SkipTest(...).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch 2 times, most recently from c200f32 to 63ae75a Compare May 22, 2026 12:09
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52225500]: BUILD STARTED

Comment on lines 261 to 263
@params(*device_values)
def test_video_input_audio_stream(device):
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Comment contradicts decorator — and audio-stream coverage may be lost.

The block comment says "Run this test only on the mixed (GPU) backend", but the decorator @params(*device_values) expands to @params("cpu", "mixed"), so the CPU case still runs. The file was switched from sintel_trailer-720p.mp4 (h264 + audio) to sintel_trailer_vp9.mp4: if the VP9 re-encode preserved the audio track, CPU can handle it and the comment is simply stale. If the audio track was dropped during re-encoding, neither device is actually testing the "video decoding when audio stream is present" scenario described in the docstring — the test would pass vacuously. Please confirm whether sintel_trailer_vp9.mp4 retains an audio stream, and update the comment (or restrict the decorator to @params("mixed")) accordingly.

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from 63ae75a to 8428bfe Compare May 22, 2026 12:19
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52226023]: BUILD STARTED

Restrict the CPU frames decoder to codecs supported by the currently
compiled libavcodec configuration. H264 and HEVC are no longer
advertised for the CPU variant while VP8, VP9, and MJPEG remain
enabled.

Make `ReadRegularFrame` mark end-of-stream by setting `next_frame_idx_`
to -1 when the index reaches `NumFrames()`, mirroring the existing
guard in `ReadFlushFrame`. Without this, codecs with no decoder latency
(VP9 on the new test inputs) deliver the final frame via the regular
path, leaving `next_frame_idx_` at `NumFrames()` and causing
`VideoInput` depletion to be reported one batch late.

Reset the decoder when an indexed next frame falls outside the valid
range, avoiding reuse of an invalid decoder position.

Update video decoder tests to expect CPU failures for unsupported
codecs instead of skipping only MPEG4. Use VP9 CFR/VFR test inputs
and device-less CPU pipelines where appropriate. Point the CFR/VFR
reference frame folders at `frames_{1,2}_vp9/` so CPU decode of the
new VP9 fixtures matches at the existing eps=10 tolerance. Drop the
CPU HEVC frames-decoder tests (`ConstantFrameRateHevc`,
`VariableFrameRateHevc`, `VariableFrameRateHevcNoIndex`) — HEVC is no
longer in the CPU codec allow-list.

Tolerate up to 16 isolated subpixel deviations exceeding eps in
`TestVideo::CompareFrame` (out of ~2.7M subpixels per frame). The CPU
VP9 decode path occasionally produces a single byte that differs by
~32 — a SIMD glitch inside libavcodec/sws_scale that Valgrind cannot
instrument. The budget is orders of magnitude below what any genuine
regression would produce, so test sensitivity is preserved.

In `dali/test/python/input/test_video.py`, filter out h264 from the
round-robin fixture (the unsuffixed `test_{1,2}.mp4` in `cfr/`/`vfr/`
are h264) and restrict `test_video_input_audio_stream` to the mixed
backend — the only DALI_extra video with an audio stream is h264.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the remove_cpu_codecs branch from 8428bfe to 5c1c613 Compare May 22, 2026 13:20
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52231011]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52231011]: BUILD PASSED

bool FramesDecoderCpu::ReadNextFrame(uint8_t *data) {
LOG_LINE << "FramesDecoderCpu::ReadNextFrame: next_frame_idx_=" << next_frame_idx_ << std::endl;
if (next_frame_idx_ == 0) {
// call NumFrames() to populate the index before nay frames are read
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there's a typo here? Should it be new frames are read?

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52531032]: BUILD STARTED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52552124]: BUILD STARTED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52552124]: BUILD FAILED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52552124]: BUILD PASSED

@JanuszL JanuszL merged commit a8d5ec5 into NVIDIA:main May 26, 2026
6 checks passed
@JanuszL JanuszL deleted the remove_cpu_codecs branch May 26, 2026 05:30
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52531032]: BUILD PASSED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants