feat: aggregate views (--domains, --size-by-type, --redirects) + --body-grep filter #3
…and --body-grep filter
PR 2 of the LLM-friendly series. Stacks on feat/llm-improvements-pr1.
Each new aggregate view answers a question that previously forced an agent to
chain several hargrep calls and post-process the output. --body-grep replaces
falling back to rg/grep on the raw HAR (noisy and unaware of JSON escaping).
Flags:
- --domains: [{domain, count}] sorted by count desc. Respects filters, so
e.g. --status-range 4xx --domains shows which hosts are erroring.
- --size-by-type: [{mime_type, total_bytes, count}] sorted by total_bytes
desc. Makes "where's my bandwidth going?" a one-liner.
- --redirects: [{id, url, status, location}] for every 3xx entry. Raw pairs
rather than stitched chains — stitching is one step in the agent and keeps
the format simple.
- --body-grep SUBSTRING: new filter that matches against request postData.text
or response content.text. Composes with the existing filter pipeline.
All four are mutually exclusive with each other and with --overview,
--count, --fields, --entry, --no-body, --include-all-bodies, --output where
combining would be nonsensical.
116 tests pass (55 unit + 61 integration). Clippy clean, fmt clean.
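The commit message notes that `--redirects` emits raw pairs and leaves stitching to the agent. A minimal sketch of that stitching step follows. The `Redirect` struct, its field types, and the `stitch_chain` helper are hypothetical and not part of hargrep. The sketch assumes each `location` is an absolute URL that matches the next row's `url` exactly.

```rust
use std::collections::{HashMap, HashSet};

/// One row of `--redirects` output. Field types are assumptions.
#[derive(Debug, Clone)]
pub struct Redirect {
    pub id: usize,
    pub url: String,
    pub status: u16,
    pub location: String,
}

/// Follow `location` hops starting at `start`, returning every URL visited in order.
/// Stops when no further redirect row exists or when a URL repeats (redirect loop).
pub fn stitch_chain(rows: &[Redirect], start: &str) -> Vec<String> {
    // Map each redirecting URL to its target; if a URL appears twice, the last row wins.
    let next: HashMap<&str, &str> = rows
        .iter()
        .map(|r| (r.url.as_str(), r.location.as_str()))
        .collect();

    let mut chain = vec![start.to_string()];
    let mut seen: HashSet<&str> = HashSet::from([start]);
    let mut current = start;
    while let Some(&target) = next.get(current) {
        if !seen.insert(target) {
            break; // already visited: stop instead of looping forever
        }
        chain.push(target.to_string());
        current = target;
    }
    chain
}
```

Relative `location` values would need resolving against the request URL first, which this sketch deliberately leaves out.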
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 956adc2789
/// Filter by substring match against request or response body text.
/// Matches when either contains the pattern. Case-sensitive.
#[arg(long)]
body_grep: Option<String>,
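The doc comment above describes the matching rule: a case-sensitive substring test that succeeds when either body contains the pattern. A minimal sketch of that predicate follows. The `Bodies` struct and `body_matches` function are hypothetical names, and the actual hargrep implementation may be structured differently.

```rust
/// Hypothetical minimal entry: only the two body fields `--body-grep` looks at.
pub struct Bodies {
    /// Request `postData.text`, if present.
    pub request_text: Option<String>,
    /// Response `content.text`, if present.
    pub response_text: Option<String>,
}

/// True when either body contains `needle`. Plain substring match, case-sensitive.
pub fn body_matches(entry: &Bodies, needle: &str) -> bool {
    // A missing body never matches.
    let hit = |text: &Option<String>| text.as_deref().map_or(false, |t| t.contains(needle));
    hit(&entry.request_text) || hit(&entry.response_text)
}
```

Because the match runs on the decoded text fields rather than the raw HAR file, JSON escaping in the file does not affect the result.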
Disallow combining --entry with --body-grep
The new --body-grep flag is a filter, but it is not included in --entry's conflict set. In run, the --entry branch returns before any filtering is applied, so hargrep --entry N --body-grep ... succeeds and silently ignores --body-grep, which can mislead automation expecting filter semantics. --entry already blocks other filter flags for this exact reason, so body_grep should be treated the same way.
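The failure mode described here can be illustrated with a hand-rolled check. This is a sketch only: the `Cli` struct is a hypothetical subset of the parsed arguments, `check_entry_conflicts` is an invented helper, and the real fix presumably extends the existing conflict set on the `--entry` argument rather than adding a function like this.

```rust
/// Hypothetical subset of the parsed CLI; field names other than `body_grep` are assumptions.
#[derive(Default)]
pub struct Cli {
    pub entry: Option<usize>,
    pub body_grep: Option<String>,
    pub body_regex: Option<String>,
}

/// Reject `--entry` combined with a body filter instead of silently ignoring the filter.
/// On conflict, returns the name of the first offending flag.
pub fn check_entry_conflicts(cli: &Cli) -> Result<(), &'static str> {
    if cli.entry.is_none() {
        return Ok(()); // filters are fine on their own
    }
    let filters = [
        ("--body-grep", cli.body_grep.is_some()),
        ("--body-regex", cli.body_regex.is_some()),
    ];
    match filters.iter().find(|(_, set)| *set) {
        Some((name, _)) => Err(*name),
        None => Ok(()),
    }
}
```

Failing loudly at argument-parsing time is the point: automation that passes a filter gets an error rather than an unfiltered entry.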
Codex flagged on PR #3 that --body-grep wasn't in --entry's conflict set, so `hargrep --entry N --body-grep foo` silently ignored the filter. Same class of bug as the earlier --entry fix; body-grep was just added later and missed the sweep. Adding body_grep + the newly-introduced body_regex to the list. Extended the existing conflict test to cover both.
* feat: add --body-regex filter and --help-llm compact cheatsheet
Third PR in the LLM-friendly series. Two small additions that complete the filter/help surface area for agents.
- --body-regex REGEX: regex variant of --body-grep, mirroring how --url pairs with --url-regex. Compiled at CLI parse time so bad patterns error with exit code 2 before any file is read. Supports (?i) for case-insensitive. Composes with --body-grep and all other filters as AND.
- --help-llm: prints a compact flag reference and exits. 1566 bytes vs 3511 for clap's default --help (-55%). Tuned for LLM consumption: one line per flag group, no examples, exit codes documented. Lets an agent discover flags on-demand for ~400 tokens instead of carrying a 1k+ token cheatsheet in every system prompt.
129 tests pass (55 unit + 74 integration). Clippy clean, fmt clean.
* fix: --entry also conflicts with --body-grep and --body-regex
Codex flagged on PR #3 that --body-grep wasn't in --entry's conflict set, so `hargrep --entry N --body-grep foo` silently ignored the filter. Same class of bug as the earlier --entry fix; body-grep was just added later and missed the sweep. Adding body_grep + the newly-introduced body_regex to the list. Extended the existing conflict test to cover both.
* docs: note --entry conflicts with filters; add --body-regex + --help-llm examples
Summary
Aggregate views that collapse multi-turn agent interactions into single calls, plus a first-class `--body-grep` filter. Second PR in the LLM-friendly series; follows up on #1 (now merged).
(This is a re-opened version of #2, which was auto-closed when its stacked base branch was deleted during the #1 squash-merge. Same code.)
What changed
All aggregate views respect filters, so `--status-range 4xx --domains` scopes to erroring hosts. All views honor the grep-like exit contract: exit 1 when the emitted document is empty, 0 otherwise — derived from the aggregate rows themselves, not just the pre-aggregate filter set, so `--redirects` on a HAR with no 3xx entries correctly exits 1 even if there are plenty of 2xx matches.
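The filter-then-aggregate behavior can be sketched as follows, using `--status-range 4xx --domains` as the example. The `Entry` and `DomainRow` types and both functions are hypothetical simplifications, not hargrep's actual types. Breaking ties by domain name is an assumption made here so the output order is deterministic.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a HAR entry; the real types may differ.
#[derive(Debug, Clone)]
pub struct Entry {
    pub domain: String,
    pub status: u16,
}

/// One row of `--domains` output: `{domain, count}`.
#[derive(Debug, PartialEq, Eq)]
pub struct DomainRow {
    pub domain: String,
    pub count: usize,
}

/// Count entries per domain, sorted by count descending (ties by domain name, an assumption).
pub fn domain_counts<'a>(entries: impl IntoIterator<Item = &'a Entry>) -> Vec<DomainRow> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for e in entries {
        *counts.entry(e.domain.as_str()).or_insert(0) += 1;
    }
    let mut rows: Vec<DomainRow> = counts
        .into_iter()
        .map(|(domain, count)| DomainRow { domain: domain.to_string(), count })
        .collect();
    rows.sort_by(|a, b| b.count.cmp(&a.count).then_with(|| a.domain.cmp(&b.domain)));
    rows
}

/// Filter first, then aggregate: the shape of `--status-range 4xx --domains`.
pub fn erroring_domains(entries: &[Entry]) -> Vec<DomainRow> {
    domain_counts(entries.iter().filter(|e| (400..500).contains(&e.status)))
}
```

An empty result from `erroring_domains` corresponds to the exit-1 case: the emitted array has no rows even though the HAR itself may contain many entries.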
Codex review on #2 (already addressed)
Fixed in the commit. Exit-code logic now derives from `aggregate_exit_code(doc)` which inspects the emitted array's length or the overview's `entries` count.
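A rough sketch of the exit-code rule described above. Only the function name `aggregate_exit_code` comes from this PR. The `Doc` enum is a hypothetical stand-in for the emitted document, which in the real code is more likely a JSON value that gets inspected directly.

```rust
/// Hypothetical stand-in for the emitted document.
pub enum Doc {
    /// Array-shaped views such as `--domains` or `--redirects`, reduced to their row count.
    Rows(usize),
    /// `--overview`, reduced to its `entries` count.
    Overview { entries: usize },
}

/// Grep-like contract: exit 1 when nothing was emitted, 0 otherwise.
pub fn aggregate_exit_code(doc: &Doc) -> i32 {
    let emitted = match doc {
        Doc::Rows(n) => *n,
        Doc::Overview { entries } => *entries,
    };
    if emitted == 0 { 1 } else { 0 }
}
```

Deciding from the emitted document rather than the pre-aggregate filter set is what makes `--redirects` exit 1 on a HAR with no 3xx entries.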
Test plan
Follow-up (PR 3)
🤖 Generated with Claude Code