feat: LLM-friendly flags — entry IDs, --overview, asset stripping, compact JSON#1

Merged
brunojm merged 2 commits into main from feat/llm-improvements-pr1 on Apr 17, 2026
brunojm (Owner) commented Apr 17, 2026

Summary

Makes hargrep materially more token-efficient for LLM coding agents without sacrificing human usability. First of a planned series; scope intentionally small.

Based on measured eval data (3 arms × 10 tasks × 3 trials on a 201-entry HAR): hargrep clearly beats naive `Read`+`Grep` (-90% tokens, +20% correctness), but loses to generic `jq`+`bash` on small HARs due to verbose default output and prompt overhead. This PR closes that gap on the default-output and structural axes; aggregate flags come in PR 2.

What changed

  • Entry IDs. Every output entry includes a stable `id` field (its original 0-indexed position in the HAR). Enables the pointer-then-fetch pattern — an agent lists matches with `--fields id,url,status`, then drills into one with `--entry N` (returns a single JSON object, not an array). `id` is also valid in `--fields`.
  • `--overview`. One-shot JSON dashboard: entry count, status/method/MIME histograms, top 10 domains, total body size, total time. Respects filters — use it after `--status-range 4xx` for a scoped view. Replaces a cascade of exploratory queries with a single call.
  • Static-asset body stripping by default. Images, fonts, CSS, JS, WASM, video, and audio response bodies are dropped by default. They dominate real-HAR size but rarely help debug API behaviour. `--include-all-bodies` restores prior behaviour; `--no-body` still strips everything.
  • TTY-aware compact JSON. `--output json` is pretty-printed when stdout is a terminal and compact when piped. ~30% savings on the typical subprocess case, no change for humans.
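
The pointer-then-fetch pattern from the first bullet can be sketched as follows. This is a minimal illustration, not hargrep's actual code: the `Entry` struct, `list_matches`, `fetch_entry`, and the sample URLs are all hypothetical. The point it shows is that enumerating before filtering keeps each `id` equal to the entry's original position, so an id seen in a filtered listing is still valid for a later direct lookup.

```rust
/// Hypothetical, trimmed-down entry; a real HAR entry has many more fields.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    url: String,
    status: u16,
}

/// List matches together with their original 0-indexed position.
/// Enumerating *before* filtering keeps ids stable across filter changes.
fn list_matches<'a>(
    entries: &'a [Entry],
    pred: impl Fn(&Entry) -> bool,
) -> Vec<(usize, &'a Entry)> {
    entries
        .iter()
        .enumerate()
        .filter(|&(_, e)| pred(e))
        .collect()
}

/// Direct lookup by id, in the spirit of `--entry N`: one entry or an error.
fn fetch_entry(entries: &[Entry], id: usize) -> Result<&Entry, String> {
    let total = entries.len();
    entries
        .get(id)
        .ok_or_else(|| format!("entry id {id} out of range (HAR has {total} entries)"))
}

/// Made-up sample data for the sketch.
fn sample_entries() -> Vec<Entry> {
    vec![
        Entry { url: "https://example.com/".to_string(), status: 200 },
        Entry { url: "https://example.com/app.js".to_string(), status: 200 },
        Entry { url: "https://api.example.com/orders".to_string(), status: 503 },
    ]
}

fn main() {
    let entries = sample_entries();
    // Step 1: list cheap pointers (id, status, url) for the matches.
    for (id, e) in list_matches(&entries, |e| e.status >= 500) {
        println!("{id}\t{}\t{}", e.status, e.url);
    }
    // Step 2: fetch one full entry using an id from the listing.
    match fetch_entry(&entries, 2) {
        Ok(e) => println!("{e:?}"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

Because the id is the original index rather than the position within the filtered list, changing the filter between step 1 and step 2 does not invalidate it.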

Breaking change

Response bodies for static-asset MIME types are now stripped by default. Scripts that need the prior behaviour should pass `--include-all-bodies`. JSON/HTML/XML/text bodies are unchanged.
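
A rough sketch of how the three body-handling modes might interact is shown below. Everything in it is an assumption for illustration: the `BodyMode` variant names, the `is_static_asset` helper, and the exact MIME list are not taken from hargrep's source, only inferred from the categories named in this description.

```rust
/// Hypothetical body-handling modes; hargrep's real `body_mode` type may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BodyMode {
    /// Default after this PR: drop static-asset bodies only.
    StripAssets,
    /// `--include-all-bodies`: keep every body (prior behaviour).
    IncludeAll,
    /// `--no-body`: drop every body.
    NoBody,
}

/// Assumed classification of static-asset MIME types (illustrative, not exhaustive).
fn is_static_asset(mime: &str) -> bool {
    // Drop parameters such as "; charset=utf-8" and normalise case.
    let essence = mime
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    essence.starts_with("image/")
        || essence.starts_with("font/")
        || essence.starts_with("video/")
        || essence.starts_with("audio/")
        || matches!(
            essence.as_str(),
            "text/css" | "text/javascript" | "application/javascript" | "application/wasm"
        )
}

/// Decide whether a response body should be emitted for the given mode.
fn keep_body(mode: BodyMode, mime: &str) -> bool {
    match mode {
        BodyMode::NoBody => false,
        BodyMode::IncludeAll => true,
        BodyMode::StripAssets => !is_static_asset(mime),
    }
}

fn main() {
    for mime in ["application/json", "image/png", "text/css; charset=utf-8"] {
        println!("{mime}: keep = {}", keep_body(BodyMode::StripAssets, mime));
    }
}
```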

Test plan

  • `cargo test` — 99 tests pass (49 unit + 50 integration). Includes new tests for: entry-id emission across all output formats, `--entry` happy + out-of-range + conflicts, asset-body stripping semantics, `--include-all-bodies` override, `--no-body` still strips everything, `--overview` shape + status/method histograms + filter composition + conflict rules, compact JSON when piped.
  • `cargo clippy --all-targets -- -D warnings` — clean.
  • `cargo fmt --check` — clean.
  • Manual smoke on `samples/igvita.har` for `--overview` and `--entry`.

Follow-up (not in this PR)

PR 2 will add aggregate views that collapse multi-turn queries: `--domains`, `--size-by-type`, `--redirects`, `--body-grep`. PR 3 will trim the LLM-facing prompt material and add a compact `--help-llm` output. The eval harness that produced the baseline numbers lives in a separate branch and is deliberately excluded from this PR.

🤖 Generated with Claude Code

…, compact JSON

PR 1 of a series making hargrep more token-efficient for LLM agents. Based on
eval data showing hargrep needlessly verbose vs. naive jq on small HARs while
winning clearly on correctness and volume.

Changes:

- Entry IDs. Every output entry now includes an `id` field (original 0-indexed
  position in the HAR). IDs are stable across filter changes, enabling the
  pointer-then-fetch pattern: an agent lists matches with `--fields id,url`,
  then drills into a specific one with `--entry N` (returns a single JSON
  object instead of an array). Adds a new `id` value for `--fields`.

- `--overview`. One-shot HAR dashboard: entry count, status/method/MIME
  histograms, top 10 domains by request count, total body size, total time.
  Respects filters — call it after `--status-range 4xx` to see a filtered
  picture. Replaces a cascade of exploratory queries with one call.

- Asset-body auto-strip. Static-asset response bodies (images, fonts, CSS, JS,
  WASM, video, audio) are stripped by default. They dominate real-HAR size but
  rarely help debug API behaviour. `--include-all-bodies` restores the old
  behaviour; `--no-body` still strips everything.

- TTY-aware compact JSON. `--output json` is pretty in a TTY, compact when
  piped. Saves ~30% on the common subprocess case without hurting human
  readability.
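
The TTY-aware choice described in the last bullet could look roughly like this. It is a standard-library-only sketch: `JsonStyle`, `choose_style`, and the hand-rolled `render` function are hypothetical stand-ins (a real tool would more likely delegate to a JSON library), while `std::io::IsTerminal` is the standard way to ask whether stdout is a terminal.

```rust
use std::io::{self, IsTerminal};

/// Hypothetical output styles for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JsonStyle {
    Pretty,
    Compact,
}

/// Pretty for humans at a terminal, compact for pipes and subprocesses.
fn choose_style(stdout_is_terminal: bool) -> JsonStyle {
    if stdout_is_terminal {
        JsonStyle::Pretty
    } else {
        JsonStyle::Compact
    }
}

/// Render a flat object from (key, already-encoded JSON value) pairs.
/// Keys are assumed to need no escaping; this only illustrates the size difference.
fn render(pairs: &[(&str, &str)], style: JsonStyle) -> String {
    let (open, sep, kv, close) = match style {
        JsonStyle::Pretty => ("{\n  ", ",\n  ", ": ", "\n}"),
        JsonStyle::Compact => ("{", ",", ":", "}"),
    };
    let body: Vec<String> = pairs
        .iter()
        .map(|(k, v)| format!("\"{k}\"{kv}{v}"))
        .collect();
    format!("{open}{}{close}", body.join(sep))
}

fn main() {
    // Detect the terminal once, at the edge of the program.
    let style = choose_style(io::stdout().is_terminal());
    println!("{}", render(&[("id", "3"), ("status", "200")], style));
}
```

Passing the terminal check in as a plain `bool` keeps the formatting logic testable without a real TTY.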

All new flags have conflict rules (e.g. `--overview` and `--count` are
mutually exclusive). 99 tests pass (49 unit + 50 integration), clippy clean.

Breaking change note: response bodies for static assets are now stripped by
default. Pass `--include-all-bodies` to recover the prior behaviour.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87219f81fa

Comment thread src/main.rs
Comment on lines +133 to +140
if let Some(id) = cli.entry {
    let total = har.log.entries.len();
    let entry = har.log.entries.into_iter().nth(id).ok_or_else(|| {
        anyhow::anyhow!("entry id {id} out of range (HAR has {total} entries)")
    })?;
    let output = output::format_single_entry(id, &entry, body_mode)?;
    print!("{output}");
    return Ok(0);

P2: Reject filter flags when using --entry

The --entry path short-circuits before any filter logic runs, so filter flags like --status, --method, --url, etc. are accepted but silently ignored. In practice, hargrep --entry 0 --status 500 ... will still emit entry 0 even if it is not a 500 response, which can mislead automation that expects all supplied predicates to be enforced. Either make --entry conflict with filter flags or apply filtering before resolving the entry ID.
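
The first option suggested here, rejecting the combination up front, might be sketched as below. This is a hand-rolled, standard-library-only illustration and not how an argument-parsing library would express it: the `Cli` struct with only three filter fields, `validate`, `run`, and `EXIT_USAGE` are all hypothetical names, and the exit code 2 for usage errors is an assumption of the sketch.

```rust
/// Hypothetical subset of parsed CLI options.
#[derive(Debug, Default)]
struct Cli {
    entry: Option<usize>,
    status: Option<u16>,
    method: Option<String>,
    url: Option<String>,
}

/// Assumed exit code for usage errors.
const EXIT_USAGE: i32 = 2;

/// Reject `--entry` combined with any filter flag instead of silently ignoring the filter.
fn validate(cli: &Cli) -> Result<(), String> {
    if cli.entry.is_none() {
        return Ok(());
    }
    let mut conflicting = Vec::new();
    if cli.status.is_some() {
        conflicting.push("--status");
    }
    if cli.method.is_some() {
        conflicting.push("--method");
    }
    if cli.url.is_some() {
        conflicting.push("--url");
    }
    if conflicting.is_empty() {
        Ok(())
    } else {
        Err(format!("--entry cannot be used with {}", conflicting.join(", ")))
    }
}

/// Map validation to an exit code before any HAR data is touched.
fn run(cli: &Cli) -> i32 {
    match validate(cli) {
        Ok(()) => 0,
        Err(msg) => {
            eprintln!("error: {msg}");
            EXIT_USAGE
        }
    }
}

fn main() {
    let cli = Cli { entry: Some(0), status: Some(500), ..Default::default() };
    println!("exit code: {}", run(&cli));
}
```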


Comment thread src/main.rs Outdated
    serde_json::to_string(&doc)?
};
println!("{serialized}");
return Ok(0);

P2: Return grep-style no-match code for --overview

This branch always returns exit code 0, even when filters produce zero matches. That diverges from the tool’s documented/established contract (1 for no matches) used by other output modes, so scripts that rely on exit status to detect empty result sets will incorrectly treat an empty overview as success. The overview path should return the precomputed exit_code instead of hardcoding success.
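
A small sketch of the behaviour requested here: build the overview from the filtered set, always produce a document, and derive the exit code from the match count. The `Overview` struct, `build_overview`, `exit_code_for`, and `run_overview` are hypothetical names, and entries are reduced to bare status codes to keep the example short.

```rust
use std::collections::BTreeMap;

/// Hypothetical, minimal overview: entry count plus a status histogram.
#[derive(Debug)]
struct Overview {
    entry_count: usize,
    status_histogram: BTreeMap<u16, usize>,
}

/// Summarise the statuses of the entries that survived filtering.
fn build_overview(statuses: &[u16]) -> Overview {
    let mut status_histogram = BTreeMap::new();
    for &s in statuses {
        *status_histogram.entry(s).or_insert(0) += 1;
    }
    Overview { entry_count: statuses.len(), status_histogram }
}

/// grep-style contract: 0 when something matched, 1 when nothing did.
fn exit_code_for(match_count: usize) -> i32 {
    if match_count == 0 { 1 } else { 0 }
}

/// Filter first, then summarise; the overview exists even when it is empty.
fn run_overview(all_statuses: &[u16], keep: impl Fn(u16) -> bool) -> (Overview, i32) {
    let matched: Vec<u16> = all_statuses.iter().copied().filter(|&s| keep(s)).collect();
    let overview = build_overview(&matched);
    let code = exit_code_for(overview.entry_count);
    (overview, code)
}

fn main() {
    let (overview, code) = run_overview(&[200, 200, 404, 500], |s| (400..500).contains(&s));
    println!("{overview:?} -> exit {code}");
    let (overview, code) = run_overview(&[200, 200], |s| s >= 400);
    println!("{overview:?} -> exit {code}");
}
```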


…s exit code

Two P2 issues flagged by Codex on PR #1:

1. --entry silently ignored filter flags (e.g. `--entry 0 --status 500`
   returned entry 0 regardless of its status). --entry is a direct lookup,
   not a filter operation — adding conflicts_with_all for method, status,
   status-range, url, url-regex, header, mime, min-time so the combination
   errors at parse time with exit code 2 instead of misleading automation.

2. --overview always returned exit 0, breaking the grep-like "1 on no
   matches" contract documented for every other output mode. Now returns
   the precomputed exit_code so scripts can distinguish empty-result from
   matched-result runs.

Regression tests added for both (102 tests pass total).
brunojm (Owner, Author) commented Apr 17, 2026

Addressed both Codex review points in d614e16:

  • --entry now conflicts with all filter flags (--method, --status, --status-range, --url, --url-regex, --header, --mime, --min-time). Clap rejects the combination at parse time with exit code 2, so hargrep --entry 0 --status 500 ... now errors instead of silently ignoring the --status predicate.
  • --overview respects the grep-like exit contract: returns 1 when the filtered result set is empty, 0 otherwise. The empty document is still printed so downstream tooling sees well-formed output.

Added regression tests covering both (102 tests total). CI green.

@brunojm brunojm merged commit e351c87 into main Apr 17, 2026
4 checks passed
@brunojm brunojm deleted the feat/llm-improvements-pr1 branch April 17, 2026 01:34
brunojm added a commit that referenced this pull request Apr 17, 2026
…e eval regression) (#5)

* feat: add content-size field + --largest-bodies view (fixes size-aggregate eval regression)

Closes the size-aggregate gap surfaced in the post-PR4 eval rerun, where
hargrep regressed 64% on "which URL has the largest body?" because the
agent chased --size-by-type (MIME-level aggregate) when it needed URL-level
body sizes.

- content-size: new valid --fields value. Emits as contentSize (HAR
  camelCase convention). Source is response.content.size (i64; -1 when the
  HAR logger didn't know, surfaced raw so callers can filter).

- --largest-bodies[=N]: new aggregate view. Emits [{id, url, mime_type,
  content_size}] sorted by content_size desc, limited to top N. Default
  N=10. Uses --largest-bodies=N (equals) syntax because plain space
  delimiters would be ambiguous with the FILE positional arg — clap's
  require_equals keeps the grammar unambiguous.

Respects filters and honors grep-like exit semantics (1 on empty).
Conflicts with every other output/view flag.

Updated --help-llm cheatsheet and README to document both additions.

135 tests pass (55 unit + 80 integration). Clippy + fmt clean.

* fix: use sort_by_key; address PR review feedback

- clippy 1.95 on CI flagged unnecessary_sort_by in largest_bodies; switched
  to sort_by_key with Reverse. Local clippy on older Rust didn't trigger it.
- Pin fixture-exact sizes in test_fields_includes_content_size so a sort-key
  swap would surface immediately (per pr-test-analyzer review).
- Pin the #1 winner (id 3, PNG) in --largest-bodies tests instead of the
  tautological sorted-desc check. Added limit=1 test.
- Widen test_largest_bodies_conflicts_with_other_views to cover all four
  view flags (was testing only --overview).
- Add 4 inline unit tests to src/aggregates.rs for largest_bodies covering
  sort, limit truncation, limit=0, and -1 (unknown) sinking with stable
  tie-breaking.
- Doc comment on largest_bodies explicitly notes -1 semantics and stable
  sort.
- Updated aggregate_exit_code docstring to list --largest-bodies.
- README notes the -1-sinks-to-bottom behavior.

140 tests pass (59 unit + 81 integration). Clippy + fmt clean.
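
The sorting behaviour described in these notes (descending by size, unknown `-1` sizes sinking to the bottom, ties kept in original order) can be illustrated with a short sketch. The `BodyRow` struct and this signature for `largest_bodies` are assumptions made for the example, not the actual definitions in `src/aggregates.rs`.

```rust
use std::cmp::Reverse;

/// Hypothetical row type; the real view also carries a MIME type.
#[derive(Debug, Clone, PartialEq)]
struct BodyRow {
    id: usize,
    url: String,
    content_size: i64,
}

/// Top `limit` rows by content size, largest first.
fn largest_bodies(mut rows: Vec<BodyRow>, limit: usize) -> Vec<BodyRow> {
    // `sort_by_key` is a stable sort, so rows with equal sizes keep their
    // original relative order. Because -1 (unknown) is smaller than any real
    // size, reversing the key makes unknown sizes sink to the bottom.
    rows.sort_by_key(|r| Reverse(r.content_size));
    rows.truncate(limit);
    rows
}

/// Made-up sample data for the sketch.
fn sample_rows() -> Vec<BodyRow> {
    let sizes: [i64; 5] = [100, -1, 5000, 100, -1];
    sizes
        .iter()
        .enumerate()
        .map(|(id, &content_size)| BodyRow {
            id,
            url: format!("https://example.com/{id}"),
            content_size,
        })
        .collect()
}

fn main() {
    for row in largest_bodies(sample_rows(), 3) {
        println!("{}\t{}\t{}", row.id, row.content_size, row.url);
    }
}
```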