feat: LLM-friendly flags — entry IDs, --overview, asset stripping, compact JSON#1
Conversation
…, compact JSON PR 1 of a series making hargrep more token-efficient for LLM agents. Based on eval data showing hargrep needlessly verbose vs. naive jq on small HARs while winning clearly on correctness and volume. Changes: - Entry IDs. Every output entry now includes an `id` field (original 0-indexed position in the HAR). IDs are stable across filter changes, enabling the pointer-then-fetch pattern: an agent lists matches with `--fields id,url`, then drills into a specific one with `--entry N` (returns a single JSON object instead of an array). Adds a new `id` value for `--fields`. - `--overview`. One-shot HAR dashboard: entry count, status/method/MIME histograms, top 10 domains by request count, total body size, total time. Respects filters — call it after `--status-range 4xx` to see a filtered picture. Replaces a cascade of exploratory queries with one call. - Asset-body auto-strip. Static-asset response bodies (images, fonts, CSS, JS, WASM, video, audio) are stripped by default. They dominate real-HAR size but rarely help debug API behaviour. `--include-all-bodies` restores the old behaviour; `--no-body` still strips everything. - TTY-aware compact JSON. `--output json` is pretty in a TTY, compact when piped. Saves ~30% on the common subprocess case without hurting human readability. All new flags have conflict rules (e.g. `--overview` and `--count` are mutually exclusive). 99 tests pass (49 unit + 50 integration), clippy clean. Breaking change note: default response bodies for static assets are now stripped. Pass `--include-all-bodies` to recover the prior behaviour.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 87219f81fa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let Some(id) = cli.entry { | ||
| let total = har.log.entries.len(); | ||
| let entry = har.log.entries.into_iter().nth(id).ok_or_else(|| { | ||
| anyhow::anyhow!("entry id {id} out of range (HAR has {total} entries)") | ||
| })?; | ||
| let output = output::format_single_entry(id, &entry, body_mode)?; | ||
| print!("{output}"); | ||
| return Ok(0); |
There was a problem hiding this comment.
Reject filter flags when using --entry
The --entry path short-circuits before any filter logic runs, so filter flags like --status, --method, --url, etc. are accepted but silently ignored. In practice, hargrep --entry 0 --status 500 ... will still emit entry 0 even if it is not a 500 response, which can mislead automation that expects all supplied predicates to be enforced. Either make --entry conflict with filter flags or apply filtering before resolving the entry ID.
Useful? React with 👍 / 👎.
| serde_json::to_string(&doc)? | ||
| }; | ||
| println!("{serialized}"); | ||
| return Ok(0); |
There was a problem hiding this comment.
Return grep-style no-match code for --overview
This branch always returns exit code 0, even when filters produce zero matches. That diverges from the tool’s documented/established contract (1 for no matches) used by other output modes, so scripts that rely on exit status to detect empty result sets will incorrectly treat an empty overview as success. The overview path should return the precomputed exit_code instead of hardcoding success.
Useful? React with 👍 / 👎.
…s exit code Two P2 issues flagged by Codex on PR #1: 1. --entry silently ignored filter flags (e.g. `--entry 0 --status 500` returned entry 0 regardless of its status). --entry is a direct lookup, not a filter operation — adding conflicts_with_all for method, status, status-range, url, url-regex, header, mime, min-time so the combination errors at parse time with exit code 2 instead of misleading automation. 2. --overview always returned exit 0, breaking the grep-like "1 on no matches" contract documented for every other output mode. Now returns the precomputed exit_code so scripts can distinguish empty-result from matched-result runs. Regression tests added for both (102 tests pass total).
|
Addressed both Codex review points in d614e16:
Added regression tests covering both (102 tests total). CI green. |
- clippy 1.95 on CI flagged unnecessary_sort_by in largest_bodies; switched to sort_by_key with Reverse. Local clippy on older Rust didn't trigger it. - Pin fixture-exact sizes in test_fields_includes_content_size so a sort-key swap would surface immediately (per pr-test-analyzer review). - Pin the #1 winner (id 3, PNG) in --largest-bodies tests instead of the tautological sorted-desc check. Added limit=1 test. - Widen test_largest_bodies_conflicts_with_other_views to cover all four view flags (was testing only --overview). - Add 4 inline unit tests to src/aggregates.rs for largest_bodies covering sort, limit truncation, limit=0, and -1 (unknown) sinking with stable tie-breaking. - Doc comment on largest_bodies explicitly notes -1 semantics and stable sort. - Updated aggregate_exit_code docstring to list --largest-bodies. - README notes the -1-sinks-to-bottom behavior. 140 tests pass (59 unit + 81 integration). Clippy + fmt clean.
…e eval regression) (#5) * feat: add content-size field + --largest-bodies view (fixes size-aggregate eval regression) Closes the size-aggregate gap surfaced in the post-PR4 eval rerun, where hargrep regressed 64% on "which URL has the largest body?" because the agent chased --size-by-type (MIME-level aggregate) when it needed URL-level body sizes. - content-size: new valid --fields value. Emits as contentSize (HAR camelCase convention). Source is response.content.size (i64; -1 when the HAR logger didn't know, surfaced raw so callers can filter). - --largest-bodies[=N]: new aggregate view. Emits [{id, url, mime_type, content_size}] sorted by content_size desc, limited to top N. Default N=10. Uses --largest-bodies=N (equals) syntax because plain space delimiters would be ambiguous with the FILE positional arg — clap's require_equals keeps the grammar unambiguous. Respects filters and honors grep-like exit semantics (1 on empty). Conflicts with every other output/view flag. Updated --help-llm cheatsheet and README to document both additions. 135 tests pass (55 unit + 80 integration). Clippy + fmt clean. * fix: use sort_by_key; address PR review feedback - clippy 1.95 on CI flagged unnecessary_sort_by in largest_bodies; switched to sort_by_key with Reverse. Local clippy on older Rust didn't trigger it. - Pin fixture-exact sizes in test_fields_includes_content_size so a sort-key swap would surface immediately (per pr-test-analyzer review). - Pin the #1 winner (id 3, PNG) in --largest-bodies tests instead of the tautological sorted-desc check. Added limit=1 test. - Widen test_largest_bodies_conflicts_with_other_views to cover all four view flags (was testing only --overview). - Add 4 inline unit tests to src/aggregates.rs for largest_bodies covering sort, limit truncation, limit=0, and -1 (unknown) sinking with stable tie-breaking. - Doc comment on largest_bodies explicitly notes -1 semantics and stable sort. - Updated aggregate_exit_code docstring to list --largest-bodies. - README notes the -1-sinks-to-bottom behavior. 140 tests pass (59 unit + 81 integration). Clippy + fmt clean.
Summary
Makes hargrep materially more token-efficient for LLM coding agents without sacrificing human usability. First of a planned series; scope intentionally small.
Based on measured eval data (3 arms × 10 tasks × 3 trials on a 201-entry HAR): hargrep clearly beats naive `Read`+`Grep` (-90% tokens, +20% correctness), but loses to generic `jq`+`bash` on small HARs due to verbose default output and prompt overhead. This PR closes that gap on the default-output and structural axes; aggregate flags come in PR 2.
What changed
Breaking change
Default response bodies for static-asset MIME types are now stripped. Scripts that need the prior behaviour should pass `--include-all-bodies`. JSON/HTML/XML/text bodies are unchanged.
Test plan
Follow-up (not in this PR)
PR 2 will add aggregate views that collapse multi-turn queries: `--domains`, `--size-by-type`, `--redirects`, `--body-grep`. PR 3 will trim the LLM-facing prompt material and add a compact `--help-llm` output. The eval harness that produced the baseline numbers lives in a separate branch and is deliberately excluded from this PR.
🤖 Generated with Claude Code