Skip to content

Add more Python-related leak suppressions.#6370

Open
JanuszL wants to merge 1 commit into
NVIDIA:mainfrom
JanuszL:more_lsan_supres
Open

Add more Python-related leak suppressions.#6370
JanuszL wants to merge 1 commit into
NVIDIA:mainfrom
JanuszL:more_lsan_supres

Conversation

@JanuszL
Copy link
Copy Markdown
Contributor

@JanuszL JanuszL commented May 26, 2026

  • After moving to Python 3.12 for the sanitized build,
    new false positives have started appearing. This requires
    adding additional suppression entries and modifies sanitizer
    options.

Category:

Other (e.g. Documentation, Tests, Configuration)

Description:

  • After moving to Python 3.12 for the sanitized build,
    new false positives have started appearing. This requires
    adding additional suppression entries.

Additional information:

Affected modules and functionalities:

  • sanitizer suppression list
  • sanitizer options

Key points relevant for the review:

  • NA

Tests:

  • Existing tests apply
    • all tests with sanitizers on
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52651772]: BUILD STARTED

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 26, 2026

Greptile Summary

This PR addresses false-positive leak reports that appeared after upgrading the sanitized build to Python 3.12. The core change is switching ASAN from start_deactivated=true to start_deactivated=false, which causes ASAN to monitor allocations from the very start of the process (including Python interpreter initialization), exposing new CPython-internal allocations that were previously invisible. The suppression list and test scripts are updated accordingly.

  • qa/leak.sup: Removes the now-redundant specific _PyObject_GC_New/_PyObject_GC_NewVar entries (covered by the new _PyObject_GC_* wildcard) and adds 10 additional Python 3.12 suppressions plus leak:/usr/bin/sort.
  • qa/test_template_impl.sh: Reorders enable_sanitizer() to set up LD_LIBRARY_PATH (now including nvjpeg2k and nvtiff dependency paths) before exporting ASAN_OPTIONS; adds malloc_context_size=64 for richer stack traces.
  • qa/TL0_cpu_only/test_nofw.sh & test_pytorch.sh: Mirror the LD_LIBRARY_PATH ordering fix, prepending nvjpeg2k/nvtiff dependency directories before nvimgcodec.

Confidence Score: 5/5

Safe to merge — all changes are confined to CI/sanitizer configuration files with no impact on production DALI code.

The PR touches only QA scripts and a leak suppression list. The switch from start_deactivated=true to false is intentional and well-motivated by the Python 3.12 migration. The new suppression entries are scoped to known CPython internals and an external binary, and the wildcard PyObject_GC* cleanly replaces two redundant specific entries. No correctness or data-loss risk.

No files require special attention.

Important Files Changed

Filename Overview
qa/leak.sup Removes two specific GC suppression entries (_PyObject_GC_New, _PyObject_GC_NewVar) now covered by the new wildcard PyObject_GC*, and adds 10 new Python 3.12 false-positive suppressions plus leak:/usr/bin/sort; file still lacks a trailing newline (pre-existing).
qa/test_template_impl.sh Rewrites enable_sanitizer(): adds nvjpeg2k/nvtiff dep paths before nvimgcodec, moves ASAN_OPTIONS after path setup, switches start_deactivated from true to false, and adds malloc_context_size=64. When either nvjpeg2k or nvtiff is absent the
qa/TL0_cpu_only/test_nofw.sh Mirrors test_template_impl.sh change: prepends nvjpeg2k and nvtiff lib paths before nvimgcodec; same trailing-colon risk when those packages are absent.
qa/TL0_cpu_only/test_pytorch.sh Identical LD_LIBRARY_PATH expansion pattern as test_nofw.sh; same trailing-colon risk on missing packages.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[DALI_ENABLE_SANITIZERS set?] -->|yes| B[enable_sanitizer]
    A -->|no| Z[run test_body normally]
    B --> C[export PYTHONMALLOC=malloc]
    C --> D[compile & preload libfakeclose.so]
    D --> E[save OLD_LD_LIBRARY_PATH2]
    E --> F[append nvjpeg2k /lib path]
    F --> G[append nvtiff /lib path]
    G --> H[append nvimgcodec dir]
    H --> I["export ASAN_OPTIONS\nstart_deactivated=false\nmalloc_context_size=64\nleaks+symbolizer enabled"]
    I --> J[export ASAN_SYMBOLIZER_PATH]
    J --> K[run test_body]
    K --> L[disable_sanitizer]
    L --> M[process_sanitizers_logs]
    M --> N{grep ERROR in sanitizer.log?}
    N -->|yes| O[exit 1]
    N -->|no| P[continue]
Loading

Fix All in Claude Code

Reviews (4): Last reviewed commit: "Add more Python-related leak suppression..." | Re-trigger Greptile

Comment thread qa/test_template_impl.sh Outdated
Comment on lines +79 to +81
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvjpeg2k as n, os; print(os.path.dirname(n.__file__) + "/lib")' 2>/dev/null)"
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvtiff as n, os; print(os.path.dirname(n.__file__) + "/lib")' 2>/dev/null)"
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvimgcodec as n, os; print(os.path.dirname(n.__file__))' 2>/dev/null)"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 [Nit] The two new lines in enable_sanitizer() use 2>/dev/null without || echo '', while the equivalent lines in test_nofw.sh and test_pytorch.sh use 2>/dev/null || echo ''. Both produce an empty string on failure, so the behaviour is identical, but the inconsistency may surprise a future reader. Consider aligning the style.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@JanuszL JanuszL force-pushed the more_lsan_supres branch from 44bf62a to 5bd9383 Compare May 26, 2026 15:17
@rostan-t rostan-t self-assigned this May 26, 2026
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52651772]: BUILD FAILED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52687092]: BUILD STARTED

- After moving to Python 3.12 for the sanitized build,
  new false positives have started appearing. This requires
  adding additional suppression entries and modifies sanitizer
  options.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the more_lsan_supres branch from 5bd9383 to 853a10c Compare May 26, 2026 22:28
@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52701305]: BUILD FAILED

@dali-automaton
Copy link
Copy Markdown
Collaborator

CI MESSAGE: [52702362]: BUILD STARTED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants