[feat] Add PyNvBatchAsyncStreamReader for 2D batched async stream dec… #20
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,217 @@ | ||
| --- | ||
| name: docs-compliance | ||
| description: ACCV-Lab documentation conventions and pre-PR compliance check. INVOKE when creating or editing any .md or .rst file under docs/ or packages/*/docs/, when modifying a Python module/class/function/method docstring, or before opening a PR that touches documentation. Provides hard rules for Sphinx role usage, docstring formatting, public-API export requirements, admonition syntax, and a pre-PR checklist with verification commands. | ||
| --- | ||
|
|
||
| # ACCV-Lab Documentation Compliance | ||
|
|
||
| ## When this skill applies | ||
|
|
||
| - Creating or editing any `.md` / `.rst` file under `docs/` or `packages/*/docs/` | ||
| - Modifying a Python module, class, function, or method docstring (anything autodoc renders) | ||
| - Preparing a PR that touches documentation, samples, or public-API docstrings | ||
|
|
||
| ## Authoritative project references | ||
|
|
||
| Read these for ground truth before deviating from any rule below: | ||
|
|
||
| - `docs/conf.py` — Sphinx configuration (extensions, autodoc options, custom handlers) | ||
| - `docs/guides/DOCUMENTATION_SETUP_GUIDE.md` — build pipeline & directory structure | ||
| - `docs/guides/FORMATTING_GUIDE.md` — Python/C++ formatting (also affects docstring rendering) | ||
| - `docs/spelling_wordlist.txt` — accepted technical-term whitelist | ||
| - `docs/_ext/` — local Sphinx extensions (`note_literalinclude`, `module_docstring`, `markdown_note_admonitions`) | ||
|
|
||
| ## Hard rules | ||
|
|
||
| ### Rule 1 — API references: use Sphinx roles, never bare backticks | ||
|
|
||
| For any `accvlab.*` symbol mentioned in narrative text, use the appropriate role so it cross-links in the rendered HTML. | ||
|
|
||
| | Symbol kind | MyST role (`.md`) | RST role (`.rst`) | | ||
| |---|---|---| | ||
| | Class | `` {py:class}`~accvlab.<pkg>.<Class>` `` | `` :class:`~accvlab.<pkg>.<Class>` `` | | ||
| | Method | `` {py:meth}`~accvlab.<pkg>.<Class>.<method>` `` | `` :meth:`~accvlab.<pkg>.<Class>.<method>` `` | | ||
| | Module function | `` {py:func}`~accvlab.<pkg>.<func>` `` | `` :func:`~accvlab.<pkg>.<func>` `` | | ||
| | Attribute | `` {py:attr}`~accvlab.<pkg>.<Class>.<attr>` `` | `` :attr:`~accvlab.<pkg>.<Class>.<attr>` `` | | ||
|
|
||
| ``` | ||
| Bad: | ||
| See `lookup()` and `put()` for details. | ||
|
|
||
| Good: | ||
| See {py:meth}`~accvlab.on_demand_video_decoder.SharedGopStore.lookup` | ||
| and {py:meth}`~accvlab.on_demand_video_decoder.SharedGopStore.put` | ||
| for details. | ||
| ``` | ||
|
|
||
| **Exclusions** — keep bare backticks for: | ||
| - Stdlib types (`RuntimeWarning`, `NamedTuple`, `multiprocessing.Lock`) — this project does not cross-ref stdlib in user docs | ||
| - Parameter names, field names, prose terms (`access_tick`, `flock`, `spawn`) | ||
| - API names appearing inside fenced code blocks (` ```python ` … ` ``` `) — only narrative prose gets roles | ||
|
|
||
| ### Rule 2 — `Returns:` block formatting gotcha | ||
|
|
||
| In Google/NumPy-style docstrings, the **first line** after `Returns:` is silently parsed as a return-type annotation if it ends with `:`, even when a real type annotation is on the signature. This produces malformed return docs that look fine in the source but break in the rendered API table. | ||
|
|
||
| ``` | ||
| Bad: | ||
| Returns: | ||
| Tuple of three things: | ||
| - first | ||
| - second | ||
| - third | ||
|
|
||
| Good: | ||
| Returns: | ||
| Tuple containing | ||
|
|
||
| - first | ||
| - second | ||
| - third | ||
| ``` | ||
|
|
||
| Lead with prose that does **not** end in `:`, then a blank line, then the bullets. | ||
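A minimal sketch of how the Rule 2 pattern could be detected programmatically, assuming the docstring is available as a plain string. The helper name `returns_lines_ending_in_colon` and the sample docstrings are hypothetical and not part of the project.

```python
from typing import List


def returns_lines_ending_in_colon(docstring: str) -> List[int]:
    """Return 1-based line numbers where the first body line of a
    ``Returns:`` section ends in a colon."""
    lines = docstring.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if line.strip() != "Returns:":
            continue
        # Inspect only the first non-blank line after the section header.
        for j in range(i + 1, len(lines)):
            if lines[j].strip():
                if lines[j].rstrip().endswith(":"):
                    hits.append(j + 1)
                break
    return hits


BAD = """Summary.

Returns:
    Tuple of three things:
    - first
"""

GOOD = """Summary.

Returns:
    Tuple containing

    - first
"""

print(returns_lines_ending_in_colon(BAD))   # [4]
print(returns_lines_ending_in_colon(GOOD))  # []
```

Unlike a line-oriented grep, this skips blank lines after the header, so it only reports the line that the docstring parser would actually see first.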
|
|
||
| ### Rule 3 — Public API must be exported | ||
|
|
||
| A new public class or function will not appear in the auto-generated `api.rst` unless **both** of these hold: | ||
|
|
||
| 1. It is imported in `packages/<pkg>/accvlab/<pkg>/__init__.py` | ||
| 2. It is listed in that file's `__all__` | ||
|
|
||
| Internal helpers belong under `_internal/` and are not exported. | ||
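The two conditions of Rule 3 can be checked mechanically. The sketch below parses the source of an `__init__.py` with the standard `ast` module and reports names that are imported from a relative module but missing from `__all__`. The function name `missing_from_all` and the module paths in `EXAMPLE_INIT` are illustrative assumptions, not the project's actual layout.

```python
import ast
from typing import Set


def missing_from_all(init_source: str) -> Set[str]:
    """Names imported via ``from .x import Name`` but absent from __all__."""
    tree = ast.parse(init_source)
    imported: Set[str] = set()
    exported: Set[str] = set()
    for node in tree.body:
        if isinstance(node, ast.ImportFrom) and node.level >= 1:
            # Relative import: record the name as it is bound in the module.
            imported.update(a.asname or a.name for a in node.names)
        elif isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if "__all__" in targets:
                exported.update(ast.literal_eval(node.value))
    # Underscore-prefixed names are treated as private and ignored.
    return {n for n in imported - exported if not n.startswith("_")}


# Hypothetical __init__.py content used only for this demonstration.
EXAMPLE_INIT = '''
from .store import SharedGopStore, GopRef
from ._internal.helpers import _lock_path

__all__ = ["SharedGopStore"]
'''

print(missing_from_all(EXAMPLE_INIT))  # {'GopRef'}
```

This only handles a literal `__all__ = [...]` assignment. An `__all__` built dynamically would need a different approach.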
|
|
||
| ### Rule 4 — Type annotations on public APIs | ||
|
|
||
| Every public function parameter and return value must have a type annotation. `sphinx_autodoc_typehints` renders them into the docs; missing annotations produce gaps in the rendered API table. | ||
|
|
||
| ```python | ||
| # Good: | ||
| def get_batch(self, refs: List[GopRef]) -> List[np.ndarray]: | ||
| ... | ||
| ``` | ||
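One way to spot missing annotations before the doc build is to inspect signatures at import time. This is a rough sketch using the standard `inspect` module. `unannotated_parts` and the two sample functions are hypothetical names chosen for illustration.

```python
import inspect
from typing import Callable, List


def unannotated_parts(func: Callable) -> List[str]:
    """List parameter names (and 'return') that lack a type annotation."""
    sig = inspect.signature(func)
    missing = [
        name
        for name, param in sig.parameters.items()
        if name not in ("self", "cls")
        and param.annotation is inspect.Parameter.empty
    ]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def get_batch(refs: List[int]) -> List[bytes]:
    return [bytes(r) for r in refs]


def get_batch_untyped(refs, limit: int = 4):
    return refs[:limit]


print(unannotated_parts(get_batch))          # []
print(unannotated_parts(get_batch_untyped))  # ['refs', 'return']
```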
|
|
||
| ### Rule 5 — Annotation must match docstring | ||
|
|
||
| When changing a function's signature (parameter types, return type, parameter names), update the corresponding `Args:` / `Returns:` lines in the docstring **in the same edit**. Stale docstrings vs. live signatures are caught in review. | ||
|
|
||
| ### Rule 6 — No implementation details in user-facing docs | ||
|
|
||
| User-facing docs (`docs/`, `packages/*/docs/`, public-class docstrings) describe **what the user does**, not **how the framework is implemented**. | ||
|
|
||
| ``` | ||
| Bad (jargon / impl detail leaked to user): | ||
| - put() acquires an flock for atomicity (double-check after acquiring the lock) | ||
| - Returns the original decoder | ||
| - Uses C++ GetGOP under the hood | ||
|
|
||
| Good: | ||
| - put() acquires an flock for atomicity | ||
| - Returns the underlying PyNvGopDecoder | ||
| - Returns cached data without re-demuxing | ||
| ``` | ||
|
|
||
| If a phrase would prompt the question *"is there something the user should do?"*, rewrite it. Implementation notes belong in source-level comments or developer-facing docstrings under `_internal/`, not user docs. | ||
|
|
||
| ### Rule 7 — Doc build must be warning-free | ||
|
|
||
| `./scripts/build_docs.sh` warnings and errors are **blocking**. Before requesting review: | ||
|
|
||
| ```bash | ||
| ./scripts/build_docs.sh 2>&1 | tee /tmp/docs_build.log | ||
| grep -iE 'warning|error' /tmp/docs_build.log | ||
| ``` | ||
|
|
||
| Resolve every new warning. Common sources: bad role syntax, missing `__all__` exports, malformed `Returns:` blocks, broken cross-refs, unknown spelling. | ||
|
|
||
| ### Rule 8 — Admonitions: blockquote form for dual-readable files | ||
|
|
||
| Files that must render correctly in **both** GitHub/IDE preview **and** Sphinx HTML use the blockquote admonition pattern: | ||
|
|
||
| ```md | ||
| > **ℹ️ Note**: Short tip for the reader. | ||
|
|
||
| > **⚠️ Important**: Crucial warning users must not miss. | ||
| ``` | ||
|
|
||
| The local `markdown_note_admonitions` extension converts these to Sphinx admonitions at build time. Multi-line notes are supported as long as every line starts with `>`. | ||
|
|
||
| Use fenced admonitions ```` ```{note} ```` / ```` ```{important} ```` **only** in files that are exclusively part of the built docs and never opened in GitHub/IDE. | ||
|
|
||
| ### Rule 9 — Edit source, not mirror | ||
|
|
||
| Source-of-truth lives at `packages/<pkg>/docs/`. Files under `docs/contained_package_docs_mirror/<pkg>/docs/` are symlinks regenerated by `mirror_referenced_dirs.py` at build time. Editing the mirror is at best a no-op and at worst destructive (overwritten on next build). | ||
|
|
||
| ### Rule 10 — Relative paths in include/image/literalinclude | ||
|
|
||
| Paths inside `.md` / `.rst` directives (`include`, `image`, `literalinclude`, etc.) must be **relative to the current document**. This keeps docs portable: links resolve correctly both in the original package directory and after mirroring into `docs/contained_package_docs_mirror/`. | ||
|
|
||
| ``` | ||
| Bad: | ||
| .. literalinclude:: /home/user/project/packages/foo/examples/demo.py | ||
|
|
||
| Good: | ||
| .. literalinclude:: ../examples/demo.py | ||
| ``` | ||
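A small scan along these lines could flag directive paths that start with `/`. It is a sketch under stated assumptions: it covers only the RST directive syntax shown above (not MyST fenced directives), and the helper name `absolute_directive_paths` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches RST directives such as ".. literalinclude:: <path>".
_DIRECTIVE = re.compile(r"^\s*\.\.\s+(include|image|literalinclude)::\s+(\S+)")


def absolute_directive_paths(text: str) -> List[Tuple[int, str]]:
    """Return (line number, path) for directive paths beginning with '/'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = _DIRECTIVE.match(line)
        if match and match.group(2).startswith("/"):
            hits.append((lineno, match.group(2)))
    return hits


SAMPLE = """\
.. literalinclude:: /home/user/project/packages/foo/examples/demo.py
.. literalinclude:: ../examples/demo.py
"""

print(absolute_directive_paths(SAMPLE))
# [(1, '/home/user/project/packages/foo/examples/demo.py')]
```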
|
|
||
| Inside Python docstrings, paths are relative to the file that includes the docstring (the autodoc directive's location), so absolute paths are acceptable there. | ||
|
Collaborator
Absolute paths (w.r.t. the local file system) would still break.
||
|
|
||
| ### Rule 11 — Sample docs: explain real-use-case provenance and cross-link | ||
|
|
||
| When a sample uses hard-coded values that would normally come from runtime sources (parser output, demuxer results, model outputs, etc.), explicitly document where those values come from in production. Cross-link to a related sample that demonstrates the real flow. | ||
|
|
||
| ```python | ||
| # Good: | ||
| # Each task tuple: (video_path, target_frame_id, gop_first_frame, gop_len). | ||
| # | ||
| # In a real pipeline, gop_first_frame and gop_len would come from a | ||
| # demuxer (e.g. GetGOPList returning first_frame_ids / gop_lens). | ||
| # See samples/SampleSeparationAccessGOPListAPI.py for an end-to-end | ||
| # example. Hard-coded values here keep the demo dependency-free. | ||
| tasks = [...] | ||
| ``` | ||
|
|
||
| ## Pre-PR compliance checklist | ||
|
|
||
| Run through these before requesting review on any PR that touches docs or docstrings: | ||
|
|
||
| - [ ] All `accvlab.*` API references in narrative use `{py:meth}` / `{py:func}` / `{py:class}` roles | ||
| - [ ] No bare backticks for `accvlab.*` names except in code blocks | ||
| - [ ] Admonitions in dual-readable files use the blockquote pattern | ||
| - [ ] Edits land in `packages/<pkg>/docs/`, not in the mirror | ||
| - [ ] Paths in directives are relative | ||
| - [ ] New technical terms added to `docs/spelling_wordlist.txt` | ||
| - [ ] `./scripts/build_docs.sh` runs with no new warnings or errors | ||
| - [ ] `./scripts/build_docs.sh --spelling` reviewed; report at `docs/_build/spelling/output.txt` | ||
| - [ ] Sample docs reference real-use-case origin and cross-link to related samples | ||
|
|
||
| ## Verification commands | ||
|
|
||
| Quick scans to surface common violations before review: | ||
|
|
||
| ```bash | ||
| # 1. Bare backtick API references that should be sphinx roles. | ||
| # Customise the regex with the symbols touched by your PR. | ||
| grep -rnE '`(SharedGopStore|GopRef|CachedGopDecoder|PyNvGopDecoder|CreateGopDecoder)[A-Za-z_]*\(?\)?`' \ | ||
| docs/ packages/*/docs/ 2>/dev/null | grep -v '```' | ||
|
|
||
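| # 2. Returns: block immediately followed by a line that ends in ':' (Rule 2 violation). | ||
| #    The second grep keeps only -A1 context lines (file-NN-text), so the 'Returns:' line itself is not reported. | ||
| grep -rEn -A1 'Returns:$' packages/*/accvlab/ | grep -E -- '-[0-9]+-.*:\s*$' | ||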
|
|
||
| # 3. Public symbols not exported in root __init__.py (manual diff). | ||
| # After adding `class Foo` or `def bar`, confirm: | ||
| # - `from .<sub> import Foo` (or `bar`) appears in __init__.py | ||
| # - `'Foo'` (or `'bar'`) appears in __all__ | ||
| grep -E '^(class|def) [A-Za-z]' packages/<pkg>/accvlab/<pkg>/<file>.py | ||
| grep -E "(<symbol>|__all__)" packages/<pkg>/accvlab/<pkg>/__init__.py | ||
|
|
||
| # 4. Accidental edits inside the mirror directory (Rule 9 violation). | ||
| git diff --name-only | grep contained_package_docs_mirror | ||
|
|
||
| # 5. Full doc build with warning surface. | ||
| ./scripts/build_docs.sh 2>&1 | grep -iE 'warning|error' | grep -v -i 'INFO' | ||
|
|
||
| # 6. Spelling check. | ||
| ./scripts/build_docs.sh --spelling | ||
| cat docs/_build/spelling/output.txt 2>/dev/null | ||
| ``` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -209,3 +209,4 @@ unlinking | |
| atomicity | ||
| picklable | ||
| ABI | ||
| aggregator | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,6 +25,7 @@ section helps you quickly locate the sample code that matches your requirements. | |
| | [SampleDecodeFromGopFilesToListAPI.py](../samples/SampleDecodeFromGopFilesToListAPI.py) | Selective GOP loading | {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.LoadGopsToList`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.DecodeFromGOPListRGB` | | ||
| | [SampleDecodeFromGopList.py](../samples/SampleDecodeFromGopList.py) | Batch decode from multiple demux results (N demux → 1 decode) | {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.DecodeFromGOPListRGB` | | ||
| | [SampleStreamAsyncAccess.py](../samples/SampleStreamAsyncAccess.py) | Async stream decoding with prefetching | {py:func}`~accvlab.on_demand_video_decoder.CreateSampleReader`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsync`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsyncGetBuffer` | | ||
| | [SampleBatchAsyncStreamAccess.py](../samples/SampleBatchAsyncStreamAccess.py) | 2D async stream decoding — multiple frames per video per call, with prefetching | {py:func}`~accvlab.on_demand_video_decoder.CreateBatchAsyncStreamReader`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer` | | ||
| | [SampleSharedGopStore.py](../samples/SampleSharedGopStore.py) | Cross-process shared GOP cache for DataLoader | {py:class}`~accvlab.on_demand_video_decoder.SharedGopStore`, {py:class}`~accvlab.on_demand_video_decoder.GopRef` | | ||
|
|
||
| For details on the **Key APIs**, please refer to the API documentation of the corresponding functions and classes. | ||
|
|
@@ -43,7 +44,9 @@ If you need random frame access: | |
| → Use SampleRandomAccess | ||
|
|
||
| If you need sequential frame decoding: | ||
| If you need async decoding with prefetching for lower latency: | ||
| If you need multiple frames per video per call (2D batch): | ||
| → Use SampleBatchAsyncStreamAccess | ||
| Else if you need async decoding with prefetching for lower latency: | ||
| → Use SampleStreamAsyncAccess | ||
| Otherwise: | ||
| → Use SampleStreamAccess | ||
|
|
@@ -554,6 +557,141 @@ cd packages/on_demand_video_decoder/samples | |
| python SampleStreamAsyncAccess.py | ||
| ``` | ||
|
|
||
| #### 3.2.4 Sample: Batch Async Stream Access (2D) | ||
|
Collaborator
It may be good to mention in the description that for each
||
|
|
||
| **File:** `packages/on_demand_video_decoder/samples/SampleBatchAsyncStreamAccess.py` | ||
|
|
||
| **When to Use** | ||
|
Collaborator
Does this improve performance or is it "just" a convenience feature? It may be good to add a short note on that as especially the third point in the list below seems to indicate that this is a convenience feature.
||
|
|
||
| The 2D batch async API is preferred over basic async stream access when: | ||
| - Each iteration consumes **multiple frames per video** (e.g. multi-sweep | ||
| StreamPETR-like training where one batch needs F sweeps × V cameras) | ||
| - You want a single in-flight submission to cover V × F frames instead of | ||
| V frames | ||
| - You want the output as a 2D structure ``out[v][f]`` rather than re-batching | ||
| V results F times in Python | ||
|
|
||
| The 1D async API ({py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsync`) | ||
| remains the right choice when you only need one frame per video per | ||
| iteration. | ||
|
|
||
| **Key Differences from 1D Async Stream Access** | ||
|
Collaborator
It may be good to mention that the file list remains the same (i.e. a flat list with one file name per camera) for both methods. Although this is already implied by the description above, I think it may help with understanding.
||
|
|
||
| | Feature | 1D Async ({py:class}`~accvlab.on_demand_video_decoder.PyNvSampleReader`) | 2D Batch Async ({py:class}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader`) | | ||
| |---------|---------|---------| | ||
| | Frame ids shape | ``List[int]`` (len V) | ``List[List[int]]`` (V × F) | | ||
| | Returned structure | ``List[RGBFrame]`` (len V) | ``List[List[RGBFrame]]`` (V × F) | | ||
| | Frames decoded per call | V | V × F | | ||
| | Result buffer | 1 result, V frames | 1 result, V × F frames | | ||
| | Pool sized at construction by | (n/a — per-reader) | ``max_frames_per_decode_call`` | | ||
|
|
||
| **Core APIs** | ||
|
|
||
| - {py:func}`~accvlab.on_demand_video_decoder.CreateBatchAsyncStreamReader`: Construct a 2D batch async reader | ||
| - {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode`: Submit an async 2D decode (returns immediately) | ||
| - {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer`: Block until decode is done and return decoded frames | ||
|
|
||
| **Code Walkthrough** | ||
|
|
||
| Construct the reader. ``max_frames_per_decode_call`` is the F upper bound | ||
| (per ``Decode()`` call, not per video file): | ||
|
|
||
| ```python | ||
| import accvlab.on_demand_video_decoder as nvc | ||
|
|
||
| reader = nvc.CreateBatchAsyncStreamReader( | ||
| num_of_set=1, | ||
| num_of_file=6, # V upper bound | ||
| max_frames_per_decode_call=4, # F upper bound (per Decode() call) | ||
| iGpu=0, | ||
| ) | ||
| ``` | ||
|
|
||
| Build a 2D ``frame_ids`` list (one inner list per video) and submit it: | ||
|
|
||
| ```python | ||
| V = len(file_path_list) | ||
| F = 4 | ||
| # frame_ids[v][f] = f-th frame requested for video v. | ||
| # All inner lists must be the same length (jagged inner lengths are rejected). | ||
| frame_ids = [[0, 7, 14, 21] for _ in range(V)]  # independent inner list per video | ||
|
|
||
| reader.Decode(file_path_list, frame_ids, as_bgr=False) | ||
| # Returns immediately; decoding happens on a background worker thread. | ||
| ``` | ||
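The shape constraints described in the comments above (rectangular V × F input, bounded by the values passed at construction) can be sketched as a pre-flight check. This is an illustration only: `check_frame_ids` is a hypothetical helper, and treating `max_frames` as a bound on F alone is an assumption based on the walkthrough text. The reader performs its own validation.

```python
from typing import Sequence, Tuple


def check_frame_ids(
    frame_ids: Sequence[Sequence[int]],
    max_videos: int,
    max_frames: int,
) -> Tuple[int, int]:
    """Return (V, F) for a rectangular V x F request, or raise ValueError."""
    num_videos = len(frame_ids)
    if num_videos == 0 or num_videos > max_videos:
        raise ValueError(f"V={num_videos} must be in 1..{max_videos}")
    lengths = {len(row) for row in frame_ids}
    if len(lengths) != 1:
        raise ValueError(f"jagged inner lists: lengths {sorted(lengths)}")
    num_frames = lengths.pop()
    if num_frames == 0 or num_frames > max_frames:
        raise ValueError(f"F={num_frames} must be in 1..{max_frames}")
    return num_videos, num_frames


# Six videos, four frames each; every row is its own list object.
frame_ids = [[0, 7, 14, 21] for _ in range(6)]
print(check_frame_ids(frame_ids, max_videos=6, max_frames=4))  # (6, 4)
```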
|
|
||
| Retrieve the result: | ||
|
|
||
| ```python | ||
| out = reader.GetBuffer(file_path_list, frame_ids, as_bgr=False) | ||
| # out is List[List[RGBFrame]] indexed [v][f]. | ||
| # out[v][f].shape == (H, W, 3), dtype uint8, GPU memory. | ||
| ``` | ||
|
|
||
| **Two Contracts to Remember** | ||
|
|
||
| > **ℹ️ Note**: When | ||
| > {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer` | ||
| > returns, all GPU work (decode + internal copies) is already complete. You | ||
| > can read the returned frames on any CUDA stream — including PyTorch's | ||
| > default stream — without additional synchronization. | ||
|
|
||
| > **⚠️ Important**: The returned | ||
| > {py:class}`~accvlab.on_demand_video_decoder.RGBFrame` objects are zero-copy | ||
| > views into the reader's internal aggregator pool. Submitting the next | ||
| > {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode` | ||
| > reuses that memory. You **must** clone every frame you want to keep | ||
| > **before** the next ``Decode()`` call. Skipping the clone is silent data | ||
|
Collaborator
Nitpick: I would say "Skipping the clone [leads to]" instead of "[is]"
||
| > corruption. | ||
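
The aliasing hazard in the second contract can be reproduced without a GPU. The toy below is not the real reader: `ToyReader` is a hypothetical stand-in that hands out a `memoryview` over one reusable `bytearray`, which is enough to show why a view must be copied before the next submission.

```python
class ToyReader:
    """Toy stand-in: one reusable buffer, returned as a zero-copy view."""

    def __init__(self, size: int) -> None:
        self._pool = bytearray(size)

    def decode(self, fill: int) -> memoryview:
        # Overwrite the pool in place, as a reused buffer would be.
        for i in range(len(self._pool)):
            self._pool[i] = fill
        return memoryview(self._pool)


reader = ToyReader(4)
view_0 = reader.decode(1)
kept_0 = bytes(view_0)   # explicit copy, analogous to .clone()

reader.decode(2)         # the next submission reuses the same memory

print(list(view_0))  # [2, 2, 2, 2]  (the old view now shows new data)
print(list(kept_0))  # [1, 1, 1, 1]  (the copy is unaffected)
```

Nothing raises when the stale view is read, which is what makes the failure silent.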
|
|
||
| **Canonical Prefetch Pattern** | ||
|
|
||
| ```python | ||
| import torch  # required for torch.as_tensor(...).clone() below | ||
| # Iteration 0: prime the pipeline | ||
| reader.Decode(files, frame_ids_0, as_bgr=False) | ||
| out = reader.GetBuffer(files, frame_ids_0, as_bgr=False) | ||
|
|
||
| # Clone before submitting the next batch | ||
| tensors_0 = [ | ||
| [torch.as_tensor(out[v][f], device="cuda").clone() for f in range(F)] | ||
| for v in range(V) | ||
| ] | ||
|
|
||
| # Prefetch iteration 1 in parallel with processing iteration 0 | ||
| reader.Decode(files, frame_ids_1, as_bgr=False) | ||
| # ... process tensors_0 here (model forward, etc.) ... | ||
|
|
||
| # Iteration 1: thanks to the prefetch, GetBuffer usually finds the result already complete | ||
| out = reader.GetBuffer(files, frame_ids_1, as_bgr=False) | ||
| tensors_1 = [ | ||
| [torch.as_tensor(out[v][f], device="cuda").clone() for f in range(F)] | ||
| for v in range(V) | ||
| ] | ||
| reader.Decode(files, frame_ids_2, as_bgr=False) | ||
| # ... process tensors_1 ... | ||
| ``` | ||
|
|
||
| **Resolution Handling** | ||
|
|
||
| Videos in a single ``Decode()`` call may have **different resolutions** — each | ||
| video gets its own per-slot aggregator pool, sized lazily to that video's | ||
| ``F * H_v * W_v * 3`` on the first ``Decode()`` that hits the slot. If a later | ||
| ``Decode()`` swaps in a video at the same slot with a different resolution, | ||
| the pool is reallocated automatically (grows if larger; reuses the existing | ||
| allocation if same or smaller). | ||
|
|
||
| Per-frame shape consequence: ``out[v][f].shape == (H_v, W_v, 3)`` may vary | ||
| across ``v``. The frames are not stackable into a single | ||
| ``[V, F, H, W, 3]`` tensor without resize/pad — that is a physical fact of | ||
| mixed-resolution input, not an API limitation. | ||
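The grow-or-reuse rule for a per-slot pool can be sketched in a few lines. This models only the sizing arithmetic described above (`F * H_v * W_v * 3` bytes for uint8 RGB). `SlotPool` and `pool_bytes` are hypothetical names, and no real allocation takes place.

```python
from dataclasses import dataclass


def pool_bytes(frames: int, height: int, width: int) -> int:
    """Bytes needed for F uint8 RGB frames: F * H * W * 3."""
    return frames * height * width * 3


@dataclass
class SlotPool:
    capacity: int = 0  # bytes currently reserved for this slot

    def ensure(self, frames: int, height: int, width: int) -> bool:
        """Grow if the request exceeds capacity; return True if it grew."""
        needed = pool_bytes(frames, height, width)
        if needed > self.capacity:
            self.capacity = needed
            return True
        return False


slot = SlotPool()
print(slot.ensure(4, 1080, 1920))  # True  (first use sizes the pool)
print(slot.ensure(4, 720, 1280))   # False (smaller video reuses it)
print(slot.ensure(4, 2160, 3840))  # True  (larger video forces a grow)
print(slot.capacity)               # 99532800
```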
|
|
||
| **Running the Sample** | ||
|
|
||
| ```bash | ||
| cd packages/on_demand_video_decoder/samples | ||
| python SampleBatchAsyncStreamAccess.py | ||
| ``` | ||
|
|
||
| ### 3.3 Separation Access Decoding | ||
|
|
||
| Separation Access mode decouples demuxing and decoding into two separate stages. This provides fine-grained control over the video processing pipeline and enables advanced optimization strategies. | ||
|
|
||
Nitpick: In the best case, the edit would actually be done to the original file (as symlinks are used). Maybe say something like: