Skip to content

fix(noise): suppress recurring ffprobe demuxing-error lines in non-verbose activity log (closes #847)#953

Closed
Salem874 wants to merge 1 commit into
mainfrom
fix/847-ffprobe-demuxing-noise-filter
Closed

fix(noise): suppress recurring ffprobe demuxing-error lines in non-verbose activity log (closes #847)#953
Salem874 wants to merge 1 commit into
mainfrom
fix/847-ffprobe-demuxing-noise-filter

Conversation

@Salem874

Copy link
Copy Markdown
Contributor

Summary

Closes #847 — the most-aggravating recurring log noise in the user-facing activity log. ffprobe's "Invalid data found when processing input" warning fires from stderr on virtually every track during enrichment as ffmpeg encounters the partial-moov-atom case at byte 0. Downloads complete; ffprobe falls through to a valid result on retry. But ~20 noise lines per album turned the activity log into static.

Implementation modelled exactly on the shipped #660 pattern (`is_python_traceback_noise` + non-verbose suppression in `download_queue.rs::emit_subprocess_line` gate). Verbose mode still surfaces everything; on-disk activity-log writer still records every line so support requests stay debuggable.

What changed

Design choices

Why a tight structural match rather than a regex on the literal demuxer list? Future ffmpeg may auto-select a different comma-separated demuxer list (the canonical `mov,mp4,m4a,3gp,3g2,mj2` set could shift to include / exclude formats). The match keys on the bracket shape `[in#N/...` + the `@ 0x]` pointer suffix, which has been stable across ffmpeg 4.x / 5.x / 6.x / 7.x. If upstream ever switches the pointer format the gate fails OPEN (noise resumes) rather than over-suppressing genuine errors. Safe-default direction.

Why combine with the existing #660 gate rather than a separate emit path? Single `is_known_noise` boolean keeps the call site readable + avoids the "every new noise filter doubles the branching" pattern. Future noise gates land in the same OR chain.

Tests

6 new unit tests pin the contract — all green:

Test What it pins
canonical Apple Music shape verbatim from #847 issue report
leading-whitespace + wider hex pointer tolerance ffmpeg builds + 64-bit platform variations
alternate demuxer lists (aac, wav,mp3) future-proof against ffmpeg's auto-selection changes
genuine ffmpeg errors pass through muxer errors, parameter errors, generic "Invalid data" without the demuxing prefix all still visible
empty / unrelated / traceback lines pass through no false positives
wrapped GAMDL `[INFO HH:MM:SS]` prefix doesn't trigger structlog-prefixed lines aren't confused with ffmpeg's prefix
requires `0x` pointer form upstream format change → gate fails open (safe)

Verification

  • ✅ `cargo test --lib is_ffprobe_demux_noise` — 6/6 pass
  • ✅ Full test suite green (no regression)
  • ✅ Verbose mode unchanged — every line still emits both on-disk + UI
  • ✅ Disk writer unchanged — forensic record stays complete

Test plan

  • Download an album in non-verbose mode → ffprobe demuxing-error lines no longer appear in the activity log (was ~4 lines per track).
  • Toggle verbose mode → lines reappear immediately.
  • Check `{app_data_dir}/logs/activity-YYYY-MM-DD.log` — every suppressed line is still recorded on disk.
  • Provoke a genuine non-demuxing ffmpeg error (e.g. paste an empty file at the codec-detection step) → it should still surface in the activity log (the gate is tight enough not to swallow real errors).

🤖 Generated with Claude Code

…rbose activity log (#847)

`Error during demuxing: Invalid data found when processing input` fires from ffprobe stderr on virtually every track during enrichment — modern ffmpeg's partial-moov-atom warning at byte 0. Downloads complete; ffprobe falls through to a valid result on retry. But the noise produces ~20 entries per album in the activity log, reported in #847 as the most-aggravating recurring noise.

Implementation modelled exactly on the shipped #660 pattern (`is_python_traceback_noise` + non-verbose suppression in `download_queue.rs::emit_subprocess_line` gate):

- New `process::is_ffprobe_demux_noise(line: &str) -> bool` in `utils/process.rs`. Match keys on the structural `[in#<digit>/<demuxer-list> @ 0x<hex>]` prefix plus both required tail substrings (`Error during demuxing:` AND `Invalid data found when processing input`). Tight enough that genuine ffmpeg demuxer errors that share words pass through; loose enough to survive future demuxer-list shifts (the comma-separated demuxer names vary by ffmpeg version).
- Wired into both stdout (line 10958) and stderr (line 11155) reader gates in `download_queue.rs` — `is_known_noise` now combines both `is_python_traceback_noise` and `is_ffprobe_demux_noise`. The 5th arg to `emit_subprocess_line` (`show_in_ui`) stays `verbose || !is_known_noise` so the on-disk activity-log writer still records every line — forensic record intact for support requests.

6 new unit tests pin the contract:
- Canonical Apple Music shape (verbatim from the #847 issue report)
- Leading-whitespace + wider-hex-pointer tolerance
- Alternate demuxer lists (aac, wav,mp3) — future-proof against ffmpeg auto-selection changes
- Genuine ffmpeg errors pass through (muxer errors, parameter errors, generic "Invalid data" without the demuxing prefix)
- Empty / unrelated / traceback lines pass through
- Wrapped GAMDL `[INFO     HH:MM:SS]` prefix doesn't false-positive
- Requires `0x<hex>` pointer form — gate fails open (noise resumes) if upstream switches format rather than over-suppressing

Closes #847
@github-actions

Copy link
Copy Markdown
Contributor

PR Security Checks

⚠️ 1 category(ies) of findings — these are heuristics, please review each:

Hardcoded absolute filesystem paths (derive from app_data_dir / std::env, not literals)

src-tauri/src/services/download_queue.rs:15180:        assert!(super::super::config_service::validate_path_safe("/home/user/Music").is_ok());
src-tauri/src/services/download_queue.rs:15181:        assert!(super::super::config_service::validate_path_safe("C:\\Users\\Music").is_ok());
src-tauri/src/services/download_queue.rs:15188:        assert!(super::super::config_service::validate_path_safe("/home/../root").is_err());
src-tauri/src/utils/process.rs:2030:                r#"File "/Users/user/Library/Application Support/io.github.meedyadl/python/lib/python3.12/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions"#

Generated by .github/workflows/pr-security.yml. Non-blocking — the merge gate is ci.yml. Cross-source checks live in tools/audit-checks/ and run locally. False positives are expected.

@Salem874

Copy link
Copy Markdown
Contributor Author

Superseded by PR #954 (combined alpha-targeting housekeeping).

Entire content (the 2d7701e0 ffprobe noise filter commit + 6 unit tests) landed cleanly in PR #954 via cherry-pick onto an alpha base. Closing per user instruction to consolidate.

@Salem874 Salem874 closed this Jun 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

investigate: repeated 'Error during demuxing: Invalid data found when processing input' on every track

1 participant