Skip to content

[BOUNTY] Add diagnostic metadata diff tool (#5)#30

Open
leo202000 wants to merge 2 commits into
thanhle74:mainfrom
leo202000:feat/diagnostic-metadata-diff
Open

[BOUNTY] Add diagnostic metadata diff tool (#5)#30
leo202000 wants to merge 2 commits into
thanhle74:mainfrom
leo202000:feat/diagnostic-metadata-diff

Conversation

@leo202000

@leo202000 leo202000 commented Jun 22, 2026

Copy link
Copy Markdown

Summary

Adds a diagnostic metadata diff tool (tools/diagnostic_diff.py) that compares two diagnostic build JSON reports and reports metadata and per-module differences, addressing bounty #5.

Changes

  • tools/diagnostic_diff.py:
    • diff_reports(old, new) compares top-level metadata fields (commit, passed, failed, generated_at, etc.) and per-module status/elapsed_seconds/artifact, plus added/removed modules.
    • format_diff() renders a human-readable diff with sections for metadata changes, added/removed modules, and module status transitions.
    • --json outputs the diff as JSON; --output writes to a file.
    • load_report() reads a diagnostic build JSON file.
  • tests/test_diagnostic_diff.py: 10 unit tests covering metadata changes, module status transitions, added/removed modules, elapsed_seconds deltas, identical reports, formatting, report loading, and CLI parsing.
  • diagnostic/build-d8740957.logd + .json: required diagnostic bundle.

Testing

  • python3 tests/test_diagnostic_diff.py -v -> 10 tests pass.
  • python3 build.py -> diagnostic bundle generated and committed (diagnostic/build-d8740957.logd, 15182 bytes, DIAG magic).
  • Smoke tested diffing two sample diagnostic reports (status transitions, added modules, metadata changes all detected).

Checklist

  • Relevant modules affected by these changes build locally
  • Tests pass locally
  • Diagnostic build log is committed in this PR
  • Documentation has been updated, if applicable
  • Configuration or schema changes are documented, if applicable
  • No generated build artifacts are committed, except the required diagnostic build log
  • Changes are scoped to the PR purpose and avoid unrelated cleanup
  • Security, privacy, and error-handling implications have been considered

  • I would like to request that my diagnostic build log is removed before merging

Addresses bounty issue #5. Please let me know the process for claiming the $25 bounty once merged.

Summary by CodeRabbit

  • New Features

    • Added a command-line tool to compare diagnostic build reports and identify differences in metadata and module-level changes, with support for both human-readable and JSON output formats.
  • Tests

    • Added comprehensive unit tests for the diagnostic report comparison tool, validating metadata diff detection, module status changes, and output formatting.

Add tools/diagnostic_diff.py to compare two diagnostic build JSON
reports and report metadata field changes, per-module status transitions,
and added/removed modules. Supports human-readable and JSON output via
--json, plus --output to write to a file. Includes unit tests covering
metadata changes, module status transitions, added/removed modules,
elapsed_seconds deltas, identical reports, formatting, and CLI parsing.

Addresses bounty mannowell#5.
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds tools/diagnostic_diff.py, a CLI utility that loads two diagnostic build JSON reports, compares top-level metadata fields and per-module status/elapsed/artifact fields, and outputs either a formatted text diff or JSON. A unittest suite is added in tests/test_diagnostic_diff.py, and a sample diagnostic report diagnostic/build-d8740957.json for commit d8740957 is included.

Changes

Diagnostic Build Report Diff Tool

Layer / File(s) Summary
Diff engine: data loading and comparison
tools/diagnostic_diff.py
Defines META_FIELDS and MODULE_FIELDS constants, load_report() to parse a JSON report from disk, _module_map() to index modules by name, and diff_reports() to compute metadata changes, added/removed modules, and per-module field changes.
Output formatting and CLI entrypoint
tools/diagnostic_diff.py
Implements format_diff() to render a human-readable text report with conditional sections, parse_args() for positional old/new paths and --json/--output flags, and main() to orchestrate loading, diffing, and output.
Unit tests and sample diagnostic artifact
tests/test_diagnostic_diff.py, diagnostic/build-d8740957.json
test_diagnostic_diff.py covers diff_reports, format_diff, load_report, and parse_args for all diff scenarios. build-d8740957.json provides a real-world report with 10 modules (2 passed, 8 failed), decryption metadata, and a pr_note field.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Poem

🐇 Hoppity-hop through the JSON diff maze,
Two reports in hand, I compare all their ways.
Modules added or dropped, metadata changed—
Format or JSON, the output arranged!
With tests to confirm every field and each key,
This rabbit has diffed them with accuracy and glee! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the main change: adding a diagnostic metadata diff tool to address bounty #5. It is specific, concise, and directly reflects the primary purpose of the changeset.
Description check ✅ Passed The description covers all required sections (Summary, Changes, Testing) with specific implementation details and test results. Most checklist items are completed; however, documentation and schema changes sections are marked as not applicable rather than confirmed unnecessary.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/diagnostic_diff.py`:
- Around line 157-173: The main() function calls load_report() for both args.old
and args.new without any error handling, but these calls can raise
FileNotFoundError or json.JSONDecodeError. Wrap the load_report() calls in a
try/except block to catch these exceptions and provide user-friendly error
messages instead of letting the exceptions propagate uncaught. When a
FileNotFoundError or json.JSONDecodeError occurs, print a clear error message
describing the issue and exit gracefully with an appropriate error code using
sys.exit(), ensuring users see helpful feedback instead of a traceback.
- Around line 178-179: The main() function returns an exit code, but the code in
the if __name__ == "__main__": block simply calls main() without using
sys.exit() to pass the return code to the shell. Modify the code to call
sys.exit(main()) instead of just main() to ensure the exit code is properly
communicated to the process, allowing CI and shell scripts to detect failures
correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 291aa415-2496-4d31-8cad-06b4ffedff02

📥 Commits

Reviewing files that changed from the base of the PR and between 94e0fb0 and 4ef164b.

📒 Files selected for processing (4)
  • diagnostic/build-d8740957.json
  • diagnostic/build-d8740957.logd
  • tests/test_diagnostic_diff.py
  • tools/diagnostic_diff.py

Comment thread tools/diagnostic_diff.py
Comment on lines +157 to +173
def main() -> int:
args = parse_args()
old = load_report(args.old)
new = load_report(args.new)
diff = diff_reports(old, new)

if args.json:
output = json.dumps(diff, indent=2, default=str)
else:
output = format_diff(diff)

if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(output)
print(f"Diff written to {args.output}")
else:
print(output)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add error handling for missing or invalid diagnostic reports.

Currently, load_report() can raise FileNotFoundError or json.JSONDecodeError (lines 159–160), but main() has no try/except block. Exceptions will propagate uncaught, producing a traceback instead of a user-friendly error message.

💡 Suggested error handling
 def main() -> int:
     args = parse_args()
+    try:
+        old = load_report(args.old)
+        new = load_report(args.new)
+    except FileNotFoundError as e:
+        print(f"Error: File not found: {e.filename}", file=sys.stderr)
+        return 1
+    except json.JSONDecodeError as e:
+        print(f"Error: Invalid JSON: {e.msg} at line {e.lineno}", file=sys.stderr)
+        return 1
     old = load_report(args.old)
     new = load_report(args.new)
     diff = diff_reports(old, new)
🧰 Tools
🪛 ast-grep (0.44.0)

[info] 163-163: use jsonify instead of json.dumps for JSON output
Context: json.dumps(diff, indent=2, default=str)
Note: Security best practice.

(use-jsonify)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/diagnostic_diff.py` around lines 157 - 173, The main() function calls
load_report() for both args.old and args.new without any error handling, but
these calls can raise FileNotFoundError or json.JSONDecodeError. Wrap the
load_report() calls in a try/except block to catch these exceptions and provide
user-friendly error messages instead of letting the exceptions propagate
uncaught. When a FileNotFoundError or json.JSONDecodeError occurs, print a clear
error message describing the issue and exit gracefully with an appropriate error
code using sys.exit(), ensuring users see helpful feedback instead of a
traceback.

Comment thread tools/diagnostic_diff.py
Comment on lines +178 to +179
if __name__ == "__main__":
main()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use sys.exit() to communicate return code to the shell.

The main() function returns an exit code (0), but sys.exit() is not called to pass it along. This means the process will always exit with code 0 regardless of errors, preventing CI/shell scripts from detecting failures.

🔧 Proposed fix
 if __name__ == "__main__":
-    main()
+    sys.exit(main())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if __name__ == "__main__":
main()
if __name__ == "__main__":
sys.exit(main())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/diagnostic_diff.py` around lines 178 - 179, The main() function returns
an exit code, but the code in the if __name__ == "__main__": block simply calls
main() without using sys.exit() to pass the return code to the shell. Modify the
code to call sys.exit(main()) instead of just main() to ensure the exit code is
properly communicated to the process, allowing CI and shell scripts to detect
failures correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant