217 changes: 217 additions & 0 deletions .claude/skills/docs-compliance/SKILL.md
@@ -0,0 +1,217 @@
---
name: docs-compliance
description: ACCV-Lab documentation conventions and pre-PR compliance check. INVOKE when creating or editing any .md or .rst file under docs/ or packages/*/docs/, when modifying a Python module/class/function/method docstring, or before opening a PR that touches documentation. Provides hard rules for Sphinx role usage, docstring formatting, public-API export requirements, admonition syntax, and a pre-PR checklist with verification commands.
---

# ACCV-Lab Documentation Compliance

## When this skill applies

- Creating or editing any `.md` / `.rst` file under `docs/` or `packages/*/docs/`
- Modifying a Python module, class, function, or method docstring (anything autodoc renders)
- Preparing a PR that touches documentation, samples, or public-API docstrings

## Authoritative project references

Read these for ground truth before deviating from any rule below:

- `docs/conf.py` — Sphinx configuration (extensions, autodoc options, custom handlers)
- `docs/guides/DOCUMENTATION_SETUP_GUIDE.md` — build pipeline & directory structure
- `docs/guides/FORMATTING_GUIDE.md` — Python/C++ formatting (also affects docstring rendering)
- `docs/spelling_wordlist.txt` — accepted technical-term whitelist
- `docs/_ext/` — local Sphinx extensions (`note_literalinclude`, `module_docstring`, `markdown_note_admonitions`)

## Hard rules

### Rule 1 — API references: use Sphinx roles, never bare backticks

For any `accvlab.*` symbol mentioned in narrative text, use the appropriate role so it cross-links in the rendered HTML.

| Symbol kind | MyST role (`.md`) | RST role (`.rst`) |
|---|---|---|
| Class | `` {py:class}`~accvlab.<pkg>.<Class>` `` | `` :class:`~accvlab.<pkg>.<Class>` `` |
| Method | `` {py:meth}`~accvlab.<pkg>.<Class>.<method>` `` | `` :meth:`~accvlab.<pkg>.<Class>.<method>` `` |
| Module function | `` {py:func}`~accvlab.<pkg>.<func>` `` | `` :func:`~accvlab.<pkg>.<func>` `` |
| Attribute | `` {py:attr}`~accvlab.<pkg>.<Class>.<attr>` `` | `` :attr:`~accvlab.<pkg>.<Class>.<attr>` `` |

```
Bad:
See `lookup()` and `put()` for details.

Good:
See {py:meth}`~accvlab.on_demand_video_decoder.SharedGopStore.lookup`
and {py:meth}`~accvlab.on_demand_video_decoder.SharedGopStore.put`
for details.
```

**Exclusions** — keep bare backticks for:
- Stdlib types (`RuntimeWarning`, `NamedTuple`, `multiprocessing.Lock`) — this project does not cross-ref stdlib in user docs
- Parameter names, field names, prose terms (`access_tick`, `flock`, `spawn`)
- API names appearing inside fenced code blocks (` ```python ` … ` ``` `) — only narrative prose gets roles

### Rule 2 — `Returns:` block formatting gotcha

In Google/NumPy-style docstrings, the **first line** after `Returns:` is silently parsed as a return-type annotation if it ends with `:`, even when a real type annotation is on the signature. This produces malformed return docs that look fine in the source but break in the rendered API table.

```
Bad:
    Returns:
        Tuple of three things:
        - first
        - second
        - third

Good:
    Returns:
        Tuple containing

        - first
        - second
        - third
```

Lead with prose that does **not** end in `:`, then a blank line, then the bullets.
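
The rule above can also be checked mechanically. The following is a minimal sketch, not part of the project's tooling: `returns_block_violations` is a hypothetical helper name, and it only inspects the single line that directly follows `Returns:`.

```python
from typing import List


def returns_block_violations(docstring: str) -> List[str]:
    """Collect first lines after 'Returns:' that end with a colon."""
    lines = docstring.splitlines()
    bad = []
    for i, line in enumerate(lines[:-1]):
        if line.strip() == "Returns:":
            first = lines[i + 1].strip()
            # A trailing colon is what triggers the misparse described above.
            if first.endswith(":"):
                bad.append(first)
    return bad


BAD_DOC = "Summary.\n\nReturns:\n    Tuple of three things:\n    - first\n"
GOOD_DOC = "Summary.\n\nReturns:\n    Tuple containing\n\n    - first\n"
print(returns_block_violations(BAD_DOC))
print(returns_block_violations(GOOD_DOC))
```

An empty result means no violation was found in that docstring; it does not prove the block renders correctly.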

### Rule 3 — Public API must be exported

A new public class or function will not appear in the auto-generated `api.rst` unless **both** of these hold:

1. It is imported in `packages/<pkg>/accvlab/<pkg>/__init__.py`
2. It is listed in that file's `__all__`

Internal helpers belong under `_internal/` and are not exported.
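
As an illustration of the two conditions, the sketch below checks a module object for names that are either not importable from it or not listed in `__all__`. `missing_exports` and the stand-in module `fake_pkg` are hypothetical names used only for this example; in practice the module would be the package's real root module.

```python
import types
from typing import Iterable, List


def missing_exports(module: types.ModuleType, names: Iterable[str]) -> List[str]:
    """Return names that fail either export condition for `module`."""
    exported = set(getattr(module, "__all__", ()))
    return [n for n in names if not hasattr(module, n) or n not in exported]


# Stand-in module: Foo is imported and listed, bar is imported but not listed.
fake_pkg = types.ModuleType("fake_pkg")
fake_pkg.Foo = type("Foo", (), {})
fake_pkg.bar = lambda: None
fake_pkg.__all__ = ["Foo"]

print(missing_exports(fake_pkg, ["Foo", "bar", "baz"]))
```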

### Rule 4 — Type annotations on public APIs

Every public function parameter and return value must have a type annotation. `sphinx_autodoc_typehints` renders them into the docs; missing annotations produce gaps in the rendered API table.

```python
# Good:
def get_batch(self, refs: List[GopRef]) -> List[np.ndarray]:
    ...
```
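
Missing annotations can be spotted before the doc build with the standard `inspect` module. This is a minimal sketch under the assumption that `self` and `cls` are exempt; `unannotated` is a hypothetical helper name.

```python
import inspect
from typing import Callable, List


def unannotated(func: Callable) -> List[str]:
    """List parameters (and 'return') that lack a type annotation."""
    sig = inspect.signature(func)
    missing = [
        name
        for name, param in sig.parameters.items()
        if name not in ("self", "cls")
        and param.annotation is inspect.Parameter.empty
    ]
    if sig.return_annotation is inspect.Signature.empty:
        missing.append("return")
    return missing


def fully_annotated(a: int, b: str = "x") -> int:
    return a


def partly_annotated(a, b: int):
    return a


print(unannotated(fully_annotated))
print(unannotated(partly_annotated))
```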

### Rule 5 — Annotation must match docstring

When changing a function's signature (parameter types, return type, parameter names), update the corresponding `Args:` / `Returns:` lines in the docstring **in the same edit**. Stale docstrings vs. live signatures are caught in review.
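
One way to catch a stale `Args:` block early is to compare it with the live signature. The sketch below is a heuristic that assumes Google-style docstrings; `undocumented_params` is a hypothetical helper, and a continuation line that happens to look like `word:` may be mistaken for a parameter entry.

```python
import inspect
import re
from typing import Callable, List

# Matches "    name:" or "    name (type):" entries inside an Args: block.
ARG_ENTRY = re.compile(r"\s+\*{0,2}(\w+)\s*(?:\([^)]*\))?:")


def undocumented_params(func: Callable) -> List[str]:
    """Return signature parameters that have no entry under 'Args:'."""
    doc = inspect.getdoc(func) or ""
    documented = set()
    in_args = False
    for line in doc.splitlines():
        if line.strip() == "Args:":
            in_args = True
            continue
        if in_args:
            if line.strip() and not line.startswith(" "):
                break  # next section header (e.g. Returns:) ends the block
            match = ARG_ENTRY.match(line)
            if match:
                documented.add(match.group(1))
    params = [p for p in inspect.signature(func).parameters if p not in ("self", "cls")]
    return [p for p in params if p not in documented]


def example(refs, timeout):
    """Fetch a batch.

    Args:
        refs (list): References to fetch.

    Returns:
        Decoded frames.
    """


print(undocumented_params(example))
```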

### Rule 6 — No implementation details in user-facing docs

User-facing docs (`docs/`, `packages/*/docs/`, public-class docstrings) describe **what the user does**, not **how the framework is implemented**.

```
Bad (jargon / impl detail leaked to user):
- put() acquires an flock for atomicity (double-check after acquiring the lock)
- Returns the original decoder
- Uses C++ GetGOP under the hood

Good:
- put() acquires an flock for atomicity
- Returns the underlying PyNvGopDecoder
- Returns cached data without re-demuxing
```

If a phrase would prompt the question *"is there something the user should do?"*, rewrite it. Implementation notes belong in source-level comments or developer-facing docstrings under `_internal/`, not user docs.

### Rule 7 — Doc build must be warning-free

`./scripts/build_docs.sh` warnings and errors are **blocking**. Before requesting review:

```bash
./scripts/build_docs.sh 2>&1 | tee /tmp/docs_build.log
grep -iE 'warning|error' /tmp/docs_build.log
```

Resolve every new warning. Common sources: bad role syntax, missing `__all__` exports, malformed `Returns:` blocks, broken cross-refs, unknown spelling.

### Rule 8 — Admonitions: blockquote form for dual-readable files

Files that must render correctly in **both** GitHub/IDE preview **and** Sphinx HTML use the blockquote admonition pattern:

```md
> **ℹ️ Note**: Short tip for the reader.

> **⚠️ Important**: Crucial warning users must not miss.
```

The local `markdown_note_admonitions` extension converts these to Sphinx admonitions at build time. Multi-line notes are supported as long as every line starts with `>`.

Use fenced admonitions ```` ```{note} ```` / ```` ```{important} ```` **only** in files that are exclusively part of the built docs and never opened in GitHub/IDE.
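
To make the blockquote pattern concrete, here is a hypothetical sketch of how a single-line admonition of this shape could be recognized. It is not the code of `markdown_note_admonitions`; `parse_admonition` and the regular expression are assumptions made for illustration only.

```python
import re
from typing import Optional, Tuple

# "> **<icon> Note**: text" or "> **<icon> Important**: text"
ADMONITION = re.compile(r"^>\s*\*\*\S+\s+(Note|Important)\*\*:\s*(.*)$")


def parse_admonition(line: str) -> Optional[Tuple[str, str]]:
    """Return (kind, text) for a blockquote admonition line, else None."""
    match = ADMONITION.match(line)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)


print(parse_admonition("> **ℹ️ Note**: Short tip for the reader."))
print(parse_admonition("> An ordinary blockquote line."))
```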

### Rule 9 — Edit source, not mirror

Source-of-truth lives at `packages/<pkg>/docs/`. Files under `docs/contained_package_docs_mirror/<pkg>/docs/` are symlinks regenerated by `mirror_referenced_dirs.py` at build time. Editing the mirror is at best a no-op and at worst destructive (overwritten on next build).
Collaborator

Nitpick: In the best case, the edit would actually be done to the original file (as symlinks are used). Maybe say something like:

In the best case, this would work accidentally (due to symlink behavior), but in the worst case, ...


### Rule 10 — Relative paths in include/image/literalinclude

Paths inside `.md` / `.rst` directives (`include`, `image`, `literalinclude`, etc.) must be **relative to the current document**. This keeps docs portable: links resolve correctly both in the original package directory and after mirroring into `docs/contained_package_docs_mirror/`.

```
Bad:
.. literalinclude:: /home/user/project/packages/foo/examples/demo.py

Good:
.. literalinclude:: ../examples/demo.py
```
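
A quick scan for the "Bad" form can be sketched as follows. It covers only RST-style directives (`.. name:: path`), not MyST fenced directives, and `absolute_directive_paths` is a hypothetical helper name.

```python
import re
from typing import List

DIRECTIVE = re.compile(r"^\s*\.\.\s+(include|image|figure|literalinclude)::\s+(\S+)")


def absolute_directive_paths(text: str) -> List[str]:
    """Return directive paths that start with '/' instead of being relative."""
    hits = []
    for line in text.splitlines():
        match = DIRECTIVE.match(line)
        if match and match.group(2).startswith("/"):
            hits.append(match.group(2))
    return hits


SAMPLE = (
    ".. literalinclude:: /home/user/project/packages/foo/examples/demo.py\n"
    ".. literalinclude:: ../examples/demo.py\n"
)
print(absolute_directive_paths(SAMPLE))
```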

Inside Python docstrings, paths are relative to the file that includes the docstring (the autodoc directive's location), so absolute paths are acceptable there.
Collaborator

Absolute paths (w.r.t. the local file system) would still break.
Maybe something like:

For directives inside Python docstrings rendered by autodoc:
1. Find the docs file that renders the docstring, usually `packages/<pkg>/docs/api*.rst`.
2. Resolve directive paths from that docs file, not from the Python source file.
3. Prefer files under `packages/<pkg>/docs/`, e.g. `.. image:: images/foo.png`.
4. For package-root sibling dirs listed in `docu_referenced_dirs.txt`, use paths from `docs/`, e.g. `.. literalinclude:: ../example/foo.py`.
5. Never use machine-local absolute filesystem paths such as `/home/user/...`.


### Rule 11 — Sample docs: explain real-use-case provenance and cross-link

When a sample uses hard-coded values that would normally come from runtime sources (parser output, demuxer results, model outputs, etc.), explicitly document where those values come from in production. Cross-link to a related sample that demonstrates the real flow.

```python
# Good:
# Each task tuple: (video_path, target_frame_id, gop_first_frame, gop_len).
#
# In a real pipeline, gop_first_frame and gop_len would come from a
# demuxer (e.g. GetGOPList returning first_frame_ids / gop_lens).
# See samples/SampleSeparationAccessGOPListAPI.py for an end-to-end
# example. Hard-coded values here keep the demo dependency-free.
tasks = [...]
```

## Pre-PR compliance checklist

Run through these before requesting review on any PR that touches docs or docstrings:

- [ ] All `accvlab.*` API references in narrative use `{py:meth}` / `{py:func}` / `{py:class}` roles
- [ ] No bare backticks for `accvlab.*` names except in code blocks
- [ ] Admonitions in dual-readable files use the blockquote pattern
- [ ] Edits land in `packages/<pkg>/docs/`, not in the mirror
- [ ] Paths in directives are relative
- [ ] New technical terms added to `docs/spelling_wordlist.txt`
- [ ] `./scripts/build_docs.sh` runs with no new warnings or errors
- [ ] `./scripts/build_docs.sh --spelling` reviewed; report at `docs/_build/spelling/output.txt`
- [ ] Sample docs reference real-use-case origin and cross-link to related samples

## Verification commands

Quick scans to surface common violations before review:

```bash
# 1. Bare backtick API references that should be sphinx roles.
# Customise the regex with the symbols touched by your PR.
grep -rnE '`(SharedGopStore|GopRef|CachedGopDecoder|PyNvGopDecoder|CreateGopDecoder)[A-Za-z_]*\(?\)?`' \
docs/ packages/*/docs/ 2>/dev/null | grep -v '```'

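# 2. Returns: block immediately followed by a line that ends in ':' (Rule 2 violation).
#    With -n and -A1, grep prints context lines as 'file-NN-text'; keep only those so the
#    'Returns:' match lines themselves (printed as 'file:NN:text') are not reported.
grep -rEn -A1 'Returns:$' packages/*/accvlab/ | grep -E -- '-[0-9]+-.*:\s*$'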

# 3. Public symbols not exported in root __init__.py (manual diff).
# After adding `class Foo` or `def bar`, confirm:
# - `from .<sub> import Foo` (or `bar`) appears in __init__.py
# - `'Foo'` (or `'bar'`) appears in __all__
grep -E '^(class|def) [A-Za-z]' packages/<pkg>/accvlab/<pkg>/<file>.py
grep -E "(<symbol>|__all__)" packages/<pkg>/accvlab/<pkg>/__init__.py

# 4. Accidental edits inside the mirror directory (Rule 9 violation).
git diff --name-only | grep contained_package_docs_mirror

# 5. Full doc build with warning surface.
./scripts/build_docs.sh 2>&1 | grep -iE 'warning|error' | grep -v -i 'INFO'

# 6. Spelling check.
./scripts/build_docs.sh --spelling
cat docs/_build/spelling/output.txt 2>/dev/null
```
1 change: 1 addition & 0 deletions docs/spelling_wordlist.txt
@@ -209,3 +209,4 @@ unlinking
atomicity
picklable
ABI
aggregator
@@ -83,10 +83,12 @@ def _preload_local_ffmpeg() -> None:
# C++ core interfaces
'PyNvGopDecoder',
'PyNvSampleReader',
'PyNvBatchAsyncStreamReader',
'FastStreamInfo',
'DecodedFrameExt',
'RGBFrame',
'CreateSampleReader',
'CreateBatchAsyncStreamReader',
'GetFastInitInfo',
'SavePacketsToFile',
# Python decoder with caching
140 changes: 139 additions & 1 deletion packages/on_demand_video_decoder/docs/sample.md
@@ -25,6 +25,7 @@ section helps you quickly locate the sample code that matches your requirements.
| [SampleDecodeFromGopFilesToListAPI.py](../samples/SampleDecodeFromGopFilesToListAPI.py) | Selective GOP loading | {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.LoadGopsToList`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.DecodeFromGOPListRGB` |
| [SampleDecodeFromGopList.py](../samples/SampleDecodeFromGopList.py) | Batch decode from multiple demux results (N demux → 1 decode) | {py:meth}`~accvlab.on_demand_video_decoder.PyNvGopDecoder.DecodeFromGOPListRGB` |
| [SampleStreamAsyncAccess.py](../samples/SampleStreamAsyncAccess.py) | Async stream decoding with prefetching | {py:func}`~accvlab.on_demand_video_decoder.CreateSampleReader`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsync`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsyncGetBuffer` |
| [SampleBatchAsyncStreamAccess.py](../samples/SampleBatchAsyncStreamAccess.py) | 2D async stream decoding — multiple frames per video per call, with prefetching | {py:func}`~accvlab.on_demand_video_decoder.CreateBatchAsyncStreamReader`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode`, {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer` |
| [SampleSharedGopStore.py](../samples/SampleSharedGopStore.py) | Cross-process shared GOP cache for DataLoader | {py:class}`~accvlab.on_demand_video_decoder.SharedGopStore`, {py:class}`~accvlab.on_demand_video_decoder.GopRef` |

For details on the **Key APIs**, please refer to the API documentation of the corresponding functions and classes.
@@ -43,7 +44,9 @@ If you need random frame access:
→ Use SampleRandomAccess

If you need sequential frame decoding:
If you need async decoding with prefetching for lower latency:
If you need multiple frames per video per call (2D batch):
→ Use SampleBatchAsyncStreamAccess
Else if you need async decoding with prefetching for lower latency:
→ Use SampleStreamAsyncAccess
Otherwise:
→ Use SampleStreamAccess
@@ -554,6 +557,141 @@ cd packages/on_demand_video_decoder/samples
python SampleStreamAsyncAccess.py
```

#### 3.2.4 Sample: Batch Async Stream Access (2D)
Collaborator

It may be good to mention in the description that for each V, the same number of frames must be used.


**File:** `packages/on_demand_video_decoder/samples/SampleBatchAsyncStreamAccess.py`

**When to Use**
Collaborator

Does this improve performance or is it "just" a convenience feature? It may be good to add a short note on that as especially the third point in the list below seems to indicate that this is a convenience feature.


The 2D batch async API is preferred over basic async stream access when:
- Each iteration consumes **multiple frames per video** (e.g. multi-sweep
StreamPETR-like training where one batch needs F sweeps × V cameras)
- You want a single in-flight submission to cover V × F frames instead of
V frames
- You want the output as a 2D structure ``out[v][f]`` rather than re-batching
V results F times in Python

The 1D async API ({py:meth}`~accvlab.on_demand_video_decoder.PyNvSampleReader.DecodeN12ToRGBAsync`)
remains the right choice when you only need one frame per video per
iteration.

**Key Differences from 1D Async Stream Access**
Collaborator

It may be good to mention that the file list remains the same (i.e. a flat list with one file name per camera) for both methods. Although this is already implied by the description above, I think it may help with understanding.


| Feature | 1D Async ({py:class}`~accvlab.on_demand_video_decoder.PyNvSampleReader`) | 2D Batch Async ({py:class}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader`) |
|---------|---------|---------|
| Frame ids shape | ``List[int]`` (len V) | ``List[List[int]]`` (V × F) |
| Returned structure | ``List[RGBFrame]`` (len V) | ``List[List[RGBFrame]]`` (V × F) |
| Frames decoded per call | V | V × F |
| Result buffer | 1 result, V frames | 1 result, V × F frames |
| Pool sized at construction by | (n/a — per-reader) | ``max_frames_per_decode_call`` |

**Core APIs**

- {py:func}`~accvlab.on_demand_video_decoder.CreateBatchAsyncStreamReader`: Construct a 2D batch async reader
- {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode`: Submit an async 2D decode (returns immediately)
- {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer`: Block until decode is done and return decoded frames

**Code Walkthrough**

Construct the reader. ``max_frames_per_decode_call`` is the F upper bound
(per ``Decode()`` call, not per video file):

```python
import accvlab.on_demand_video_decoder as nvc

reader = nvc.CreateBatchAsyncStreamReader(
    num_of_set=1,
    num_of_file=6,                 # V upper bound
    max_frames_per_decode_call=4,  # F upper bound (per Decode() call)
    iGpu=0,
)
```

Build a 2D frame_ids and submit:

```python
V = len(file_path_list)
F = 4
# frame_ids[v][f] = f-th frame requested for video v.
# All inner lists must be the same length (jagged inner lengths are rejected).
# Build a fresh inner list per video so the rows do not alias one another.
frame_ids = [[0, 7, 14, 21] for _ in range(V)]

reader.Decode(file_path_list, frame_ids, as_bgr=False)
# Returns immediately; decoding happens on a background worker thread.
```
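
Because jagged input is rejected, it can help to validate `frame_ids` before submitting. The following is a sketch of such a pre-check in plain Python; `check_frame_ids` is a hypothetical helper, not part of the package, and the exact error raised by the real reader may differ.

```python
from typing import Sequence


def check_frame_ids(frame_ids: Sequence[Sequence[int]], max_frames: int) -> int:
    """Return F if frame_ids is a rectangular V x F grid with F <= max_frames."""
    if not frame_ids:
        raise ValueError("frame_ids is empty")
    lengths = {len(row) for row in frame_ids}
    if len(lengths) != 1:
        raise ValueError(f"jagged frame_ids: inner lengths {sorted(lengths)}")
    (frames_per_video,) = lengths
    if frames_per_video > max_frames:
        raise ValueError(
            f"{frames_per_video} frames per video exceeds the limit of {max_frames}"
        )
    return frames_per_video


grid = [[0, 7, 14, 21] for _ in range(6)]
print(check_frame_ids(grid, max_frames=4))
```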

Retrieve the result:

```python
out = reader.GetBuffer(file_path_list, frame_ids, as_bgr=False)
# out is List[List[RGBFrame]] indexed [v][f].
# out[v][f].shape == (H, W, 3), dtype uint8, GPU memory.
```

**Two Contracts to Remember**

> **ℹ️ Note**: When
> {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.GetBuffer`
> returns, all GPU work (decode + internal copies) is already complete. You
> can read the returned frames on any CUDA stream — including PyTorch's
> default stream — without additional synchronization.

> **⚠️ Important**: The returned
> {py:class}`~accvlab.on_demand_video_decoder.RGBFrame` objects are zero-copy
> views into the reader's internal aggregator pool. Submitting the next
> {py:meth}`~accvlab.on_demand_video_decoder.PyNvBatchAsyncStreamReader.Decode`
> reuses that memory. You **must** clone every frame you want to keep
> **before** the next ``Decode()`` call. Skipping the clone is silent data
Collaborator

Nitpick: I would say "Skipping the clone [leads to]" instead of "[is]"

> corruption.

**Canonical Prefetch Pattern**

```python
# Iteration 0: prime the pipeline
reader.Decode(files, frame_ids_0, as_bgr=False)
out = reader.GetBuffer(files, frame_ids_0, as_bgr=False)

# Clone before submitting the next batch
tensors_0 = [
    [torch.as_tensor(out[v][f], device="cuda").clone() for f in range(F)]
    for v in range(V)
]

# Prefetch iteration 1 in parallel with processing iteration 0
reader.Decode(files, frame_ids_1, as_bgr=False)
# ... process tensors_0 here (model forward, etc.) ...

# Iteration 1: GetBuffer is usually already-ready because of the prefetch
out = reader.GetBuffer(files, frame_ids_1, as_bgr=False)
tensors_1 = [
    [torch.as_tensor(out[v][f], device="cuda").clone() for f in range(F)]
    for v in range(V)
]
reader.Decode(files, frame_ids_2, as_bgr=False)
# ... process tensors_1 ...
```
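
The unrolled pattern above generalizes to a loop. The sketch below shows the ordering (get, clone, prefetch, process) with a hypothetical `FakeReader` that mimics only the buffer-reuse contract. It runs without a GPU and is not the real reader: the strings stand in for frames, and `copy.deepcopy` stands in for `.clone()`.

```python
import copy
from typing import List


class FakeReader:
    """Hypothetical stand-in: one buffer that every Decode() overwrites."""

    def __init__(self) -> None:
        self._pool: List[List[str]] = []

    def Decode(self, files, frame_ids, as_bgr=False) -> None:
        # Overwrite the pool in place, mirroring the memory reuse described above.
        self._pool[:] = [
            [f"{path}@{fid}" for fid in ids] for path, ids in zip(files, frame_ids)
        ]

    def GetBuffer(self, files, frame_ids, as_bgr=False):
        return self._pool  # a view into the pool, not a copy


def run_prefetch_loop(reader, files, batches):
    """Return one cloned result per batch, prefetching the next batch each time."""
    if not batches:
        return []
    results = []
    reader.Decode(files, batches[0])  # prime the pipeline
    for i, frame_ids in enumerate(batches):
        out = reader.GetBuffer(files, frame_ids)
        kept = copy.deepcopy(out)  # clone before the next Decode()
        if i + 1 < len(batches):
            reader.Decode(files, batches[i + 1])  # prefetch the next batch
        results.append(kept)  # stand-in for "process this batch"
    return results


files = ["cam0.mp4", "cam1.mp4"]
batches = [[[0, 1], [0, 1]], [[2, 3], [2, 3]]]
for batch in run_prefetch_loop(FakeReader(), files, batches):
    print(batch)
```

Replacing `copy.deepcopy(out)` with `out` makes every stored result alias the pool, so earlier batches show the contents of the latest `Decode()`.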

**Resolution Handling**

Videos in a single ``Decode()`` call may have **different resolutions** — each
video gets its own per-slot aggregator pool, sized lazily to that video's
``F * H_v * W_v * 3`` on the first ``Decode()`` that hits the slot. If a later
``Decode()`` swaps in a video at the same slot with a different resolution,
the pool is reallocated automatically (grows if larger; reuses the existing
allocation if same or smaller).

Per-frame shape consequence: ``out[v][f].shape == (H_v, W_v, 3)`` may vary
across ``v``. The frames are not stack-able into a single
``[V, F, H, W, 3]`` tensor without resize/pad — that is a physical fact of
mixed-resolution input, not an API limitation.
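
The grow-or-reuse behavior described above can be pictured with a toy model. This is an illustrative sketch only: `SlotPool` is a hypothetical class that tracks a byte count, assuming 3 bytes per pixel (uint8 RGB), and says nothing about how the real pool allocates GPU memory.

```python
class SlotPool:
    """Toy model of one per-video slot: grows when needed, otherwise reuses."""

    def __init__(self) -> None:
        self.capacity = 0  # bytes currently allocated for this slot

    def ensure(self, frames: int, height: int, width: int) -> bool:
        """Make room for frames * height * width * 3 bytes; True if it grew."""
        needed = frames * height * width * 3
        if needed > self.capacity:
            self.capacity = needed
            return True
        return False


slot = SlotPool()
print(slot.ensure(4, 720, 1280))   # first use of the slot allocates
print(slot.ensure(4, 480, 640))    # smaller video reuses the allocation
print(slot.ensure(4, 1080, 1920))  # larger video forces a bigger allocation
```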

**Running the Sample**

```bash
cd packages/on_demand_video_decoder/samples
python SampleBatchAsyncStreamAccess.py
```

### 3.3 Separation Access Decoding

Separation Access mode decouples demuxing and decoding into two separate stages. This provides fine-grained control over the video processing pipeline and enables advanced optimization strategies.