feat(context): per-tool output caps + stale tool-result pruning#38

Merged
arniesaha merged 1 commit into main from feat/tool-caps-stale-pruning
Apr 23, 2026
@arniesaha
Owner

Summary

Two independent levers for in-session token hygiene:

1. Tool-level output caps

  • read_file gains offset / limit params (line-based), defaults to 500 lines capped at ~20K chars, includes pagination hint in the truncated response so the model knows how to page
  • run_shell and claude-subagent switch to head+tail truncation (70/30) via a shared helper src/tools/truncate.ts#headAndTail — so neither the initial plan nor the final error/answer is silently lost
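
The two caps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the signatures, the marker text, the 0-based `offset`, and the name `readWindow` are all assumptions; only the helper name `headAndTail`, the 70/30 split, the 500-line default, and the ~20K-char cap come from the description.

```typescript
// Hypothetical sketch; signatures and marker wording are assumptions.

/** Keep roughly the first 70% and last 30% of the budget, dropping the middle. */
function headAndTail(text: string, maxChars: number, headRatio = 0.7): string {
  if (text.length <= maxChars) return text;
  const headLen = Math.floor(maxChars * headRatio);
  const tailLen = maxChars - headLen;
  const omitted = text.length - maxChars;
  const head = text.slice(0, headLen);
  const tail = text.slice(text.length - tailLen);
  // The marker itself is extra, so the result is slightly longer than maxChars.
  return `${head}\n[... ${omitted} chars omitted ...]\n${tail}`;
}

/** Line-based window over file content with a character cap and a paging hint. */
function readWindow(
  content: string,
  offset = 0, // assumed 0-based line index
  limit = 500,
  maxChars = 20_000,
): string {
  const lines = content.split("\n");
  const start = Math.min(Math.max(offset, 0), lines.length);
  const maxLines = Math.max(1, limit);
  const kept: string[] = [];
  let chars = 0;
  for (let i = start; i < lines.length && kept.length < maxLines; i++) {
    const line = lines[i] ?? "";
    const cost = line.length + 1; // +1 for the newline
    // Always keep at least one line so paging makes progress.
    if (kept.length > 0 && chars + cost > maxChars) break;
    kept.push(line);
    chars += cost;
  }
  const next = start + kept.length;
  const body = kept.join("\n");
  if (next >= lines.length) return body; // nothing left to page through
  return (
    `${body}\n[truncated: showing lines ${start}-${next - 1} of ${lines.length}; ` +
    `call read_file with offset=${next} to continue]`
  );
}
```

Keeping both ends matters for shell output in particular: the head usually shows what command ran and how it started, while the tail holds the exit status or final error.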

2. pruneStaleToolResults in transformContext

  • Runs every turn (before the compaction threshold check)
  • Preserves the last FRESH_TURNS user turns (default 4, override via MAX_FRESH_TURNS)
  • Older toolResult bodies replaced with a short stub — tool name, length, head prefix
  • Structure-preserving: toolCallId / role / isError untouched so tool_use ↔ tool_result pairing remains valid for Anthropic's API
  • Idempotent (second pass returns the same reference)
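
A rough sketch of the pruning pass described above, under stated assumptions: the `Message` shape, the stub wording, the `minChars` threshold for the "skip-small" case, and the parameter names are hypothetical, and `freshTurns` stands in for the value that the real code reads from `MAX_FRESH_TURNS`. It is meant to show the mechanics, not to reproduce the PR's implementation.

```typescript
// Hypothetical message shape; the real types in the repository may differ.
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string }
  | { role: "toolResult"; toolCallId: string; toolName: string; isError: boolean; content: string };

const STUB_PREFIX = "[pruned stale tool result]"; // assumed marker text

function pruneStaleToolResults(
  messages: Message[],
  freshTurns = 4,
  minChars = 500, // assumed threshold below which bodies are left alone
  headChars = 120,
): Message[] {
  // Walk backwards to find the Nth-from-last user message; it and everything after it is fresh.
  let seen = 0;
  let cutoff = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user" && ++seen === freshTurns) {
      cutoff = i;
      break;
    }
  }
  if (cutoff < 0) return messages; // fewer user turns than the window: nothing is stale

  let out: Message[] | null = null;
  for (let i = 0; i < cutoff; i++) {
    const m = messages[i];
    if (m.role !== "toolResult") continue;
    // Skip small bodies and bodies that were already stubbed on an earlier pass.
    if (m.content.length < minChars || m.content.startsWith(STUB_PREFIX)) continue;
    if (out === null) out = messages.slice(); // copy lazily, only when something changes
    const head = m.content.slice(0, headChars);
    // Only `content` changes; toolCallId, role and isError are carried over as-is.
    out[i] = { ...m, content: `${STUB_PREFIX} ${m.toolName}, ${m.content.length} chars. Starts: ${head}` };
  }
  return out ?? messages; // same reference when nothing was pruned
}
```

Returning the input array when nothing changes is what makes the second pass cheap: a caller can compare references to tell whether any pruning happened.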

Why

Before: a single heavy gh pr diff / browser-scrape early in a session re-bills itself at full size on every subsequent agent iteration. After: stale bodies collapse to a ~200-char breadcrumb after 4 turns.

This is the biggest in-session lever other than caching: it operates on the active messages array on every turn, not just at compaction time.

Test plan

  • 7 new unit tests in tests/context.test.ts (no-op, prune-stale, keep-fresh, skip-small, idempotence, structural preservation)
  • Existing claude-subagent truncation test updated to verify head+tail
  • Full npm test — 78/78 pass
  • npm run build clean
  • Manual: heavy session, check context_info shows token drop after turn 4

🤖 Generated with Claude Code

Two independent knobs for in-session token hygiene:

1. Tool-level output caps
   - read_file gains offset/limit (line-based), defaults to 500 lines
     capped at ~20K chars; includes pagination hint in the truncated
     response so the model knows how to page.
   - run_shell and claude-subagent switch from head-only / tail-only to
     head+tail truncation (70/30) via a shared helper
     src/tools/truncate.ts#headAndTail — so neither the initial plan
     nor the final error/answer is silently lost.

2. pruneStaleToolResults in transformContext
   - Runs every turn before compaction. Preserves the last FRESH_TURNS
     user turns (default 4, env MAX_FRESH_TURNS) intact; older
     toolResult bodies are replaced with a short stub that keeps the
     tool name, length, and a head prefix for continuity.
   - Structure-preserving: toolCallId/role/isError untouched so
     tool_use ↔ tool_result pairing stays valid for Anthropic's API.
   - Idempotent (second pass returns same reference).

Before: a single heavy gh-diff / browser-scrape early in a session
re-billed itself at full size on every subsequent iteration. After:
stale bodies collapse to a ~200-char breadcrumb after a few turns.

Tests: 7 new in tests/context.test.ts covering no-op, prune-stale,
keep-fresh, skip-small, idempotence, and structural preservation.
Updated existing claude-subagent truncation test to the head+tail
behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arniesaha arniesaha force-pushed the feat/tool-caps-stale-pruning branch from ff85ed4 to 361f76a April 23, 2026 07:42
@arniesaha arniesaha merged commit c398bdb into main Apr 23, 2026
1 check passed
@arniesaha arniesaha deleted the feat/tool-caps-stale-pruning branch April 23, 2026 07:42