Add Anthropic prompt caching breakpoints by smolpaws · Pull Request #1017 · enyst/OpenHands-Tab

smolpaws · 2026-05-11T15:23:27Z

Summary

split cacheable and dynamic system prompt content in the TypeScript agent-sdk runtime
apply Anthropic prompt caching breakpoints for native Anthropic and LiteLLM Claude request payloads
add focused tests for prompt splitting, provider support, and cache marker placement
add a reusable live smoke runner for validating Anthropic prompt caching against a local profile

Verification

npm test -w @smolpaws/agent-sdk
npm run build -w @smolpaws/agent-sdk
npm run typecheck
npm run lint
npm test
node scripts/anthropic-cache-smoke.mjs
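
The last command runs the live smoke runner, whose internals are not shown in this thread. As a rough sketch of the kind of check such a runner could make, the snippet below scans per-turn usage counters for cache reads. The `cache_creation_input_tokens` and `cache_read_input_tokens` field names follow the Anthropic Messages API usage object; the `TurnUsage` type, the `turnsMissingCacheRead` helper, and the sample numbers are invented for illustration and are not taken from `scripts/anthropic-cache-smoke.mjs`.

```typescript
// Usage counters as reported per response; only the fields needed here.
interface TurnUsage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

// Hypothetical helper: returns the 1-based turn numbers that show no cache read.
function turnsMissingCacheRead(turns: TurnUsage[]): number[] {
  const missing: number[] = [];
  turns.forEach((usage, index) => {
    if ((usage.cache_read_input_tokens ?? 0) <= 0) {
      missing.push(index + 1);
    }
  });
  return missing;
}

// Made-up sample: the first turn only writes the cache, later turns read it.
const sampleTurns: TurnUsage[] = [
  { input_tokens: 40, output_tokens: 120, cache_creation_input_tokens: 5200, cache_read_input_tokens: 0 },
  { input_tokens: 55, output_tokens: 90, cache_creation_input_tokens: 300, cache_read_input_tokens: 5200 },
  { input_tokens: 61, output_tokens: 75, cache_creation_input_tokens: 280, cache_read_input_tokens: 5500 },
];

console.log(turnsMissingCacheRead(sampleTurns)); // [ 1 ]
```

An empty result would mean every turn reported cache-read tokens.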

Review

Agent Mail review was requested from PinkStone; no additional blocking findings came back before merge.
CodeRabbit and Gemini both focused on cache marker placement. The native Anthropic path was corrected to put cache_control on the tool_result content block, while the LiteLLM/OpenAI-compatible tool-message envelope behavior was intentionally kept for Python parity and the LiteLLM tool-result quirk.
Devin raised prompt-ordering and allowlist coverage questions. We kept the prompt split ordering intentionally and expanded the Claude 3 alias allowlist where it was worth doing.
Final live smoke validation used the local opus-46 profile and showed cache markers on outgoing requests plus cache-read tokens on all five turns of a single conversation, including turns with terminal and file_editor tool use.
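
The marker-placement difference discussed in the review can be sketched as follows. This is a minimal illustration under assumed message shapes, not the code in `anthropic.ts` or `openai-compatible.ts`: on the native Anthropic path the marker sits on the `tool_result` content block, while on the OpenAI-compatible path it sits on the tool-message envelope. The helper names are hypothetical.

```typescript
type CacheControl = { type: "ephemeral" };
const EPHEMERAL: CacheControl = { type: "ephemeral" };

// Native Anthropic shape: tool output travels as a content block in a user message.
interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  cache_control?: CacheControl;
}
interface AnthropicUserMessage {
  role: "user";
  content: ToolResultBlock[];
}

// OpenAI-compatible shape: tool output is its own message.
interface OpenAIToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
  cache_control?: CacheControl;
}

// Hypothetical helper: marker goes on the content block itself.
function anthropicToolResult(id: string, output: string, mark: boolean): AnthropicUserMessage {
  const block: ToolResultBlock = {
    type: "tool_result",
    tool_use_id: id,
    content: output,
    ...(mark ? { cache_control: EPHEMERAL } : {}),
  };
  return { role: "user", content: [block] };
}

// Hypothetical helper: marker goes on the message envelope.
function openAIToolMessage(id: string, output: string, mark: boolean): OpenAIToolMessage {
  return {
    role: "tool",
    tool_call_id: id,
    content: output,
    ...(mark ? { cache_control: EPHEMERAL } : {}),
  };
}

const nativeMessage = anthropicToolResult("toolu_1", "ls output", true);
const compatMessage = openAIToolMessage("call_1", "ls output", true);
console.log(JSON.stringify(nativeMessage));
console.log(JSON.stringify(compatMessage));
```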

Summary by CodeRabbit

New Features
- Prompt caching enabled for Anthropic Claude models on both the native Anthropic and OpenAI-compatible (LiteLLM) request paths, to reduce latency and the cost of repeated input tokens.
- System prompts are split into cacheable and dynamic parts so stable context is preserved while dynamic context can change.
Tests
- Added tests covering prompt-caching behavior, cache-marker placement, and system-prompt composition and stability across providers.

Split static and dynamic system prompt content so Anthropic-compatible providers can cache only the stable prefix, matching the Python agent-sdk behavior. Mark the last user/tool turn for cache extension and cover the native Anthropic and LiteLLM Claude request shapes with focused tests.
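
As an illustration of the split described above, the sketch below builds an Anthropic-style `system` array in which only the stable block carries the ephemeral marker. The `cacheableSystemPrompt` and `dynamicSystemPrompt` names come from this PR; the `buildSystemBlocks` helper, the `SplitPrompt` type, and the sample strings are assumptions made for the example rather than the actual `requestBody` code.

```typescript
type CacheControl = { type: "ephemeral" };

interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Hypothetical carrier for the two prompt segments.
interface SplitPrompt {
  cacheableSystemPrompt: string;
  dynamicSystemPrompt?: string | null;
}

// Hypothetical helper: stable prefix first (optionally marked), dynamic suffix after it.
function buildSystemBlocks(prompt: SplitPrompt, cachingEnabled: boolean): SystemTextBlock[] {
  const stable: SystemTextBlock = { type: "text", text: prompt.cacheableSystemPrompt };
  if (cachingEnabled) {
    stable.cache_control = { type: "ephemeral" };
  }
  const blocks: SystemTextBlock[] = [stable];
  if (prompt.dynamicSystemPrompt) {
    // The dynamic suffix is left unmarked so changes to it do not invalidate the cached prefix.
    blocks.push({ type: "text", text: prompt.dynamicSystemPrompt });
  }
  return blocks;
}

const systemBlocks = buildSystemBlocks(
  {
    cacheableSystemPrompt: "You are a coding agent. Tools: terminal, file_editor.",
    dynamicSystemPrompt: "Active file: src/index.ts",
  },
  true,
);
console.log(JSON.stringify(systemBlocks, null, 2));
```

Keeping the dynamic text after the marked block matters because a cache breakpoint covers the prompt prefix up to and including the marked block.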

coderabbitai · 2026-05-11T15:23:41Z

Walkthrough

Splits system prompts into cacheable and dynamic segments, detects Anthropic models that support prompt caching, emits ephemeral cache_control in Anthropic and OpenAI-compatible request payloads, updates Agent/condensation plumbing to carry split prompts, and adds tests validating marker placement and prompt stability.
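
The model-detection step can be pictured as an allowlist check on the model name. The sketch below is a guess at the general shape only: the list entries, the `ModelConfigSketch` type, and the prefix-stripping rule are illustrative assumptions, and the real `PROMPT_CACHE_MODELS` contents and `supportsPromptCaching(config)` matching logic may differ.

```typescript
// Illustrative entries only; not the PR's PROMPT_CACHE_MODELS list.
const ILLUSTRATIVE_CACHE_MODELS: string[] = [
  "claude-3-5-sonnet",
  "claude-3-opus",
  "claude-opus-4",
];

// Hypothetical config shape with just the field this sketch needs.
interface ModelConfigSketch {
  model: string;
}

// Hypothetical stand-in for supportsPromptCaching(config).
function supportsPromptCachingSketch(config: ModelConfigSketch): boolean {
  // Drop routing prefixes such as "anthropic/" by keeping the last path segment.
  const name = config.model.toLowerCase().split("/").pop() ?? "";
  return ILLUSTRATIVE_CACHE_MODELS.some((entry) => name.startsWith(entry));
}

console.log(supportsPromptCachingSketch({ model: "anthropic/claude-3-5-sonnet-20241022" })); // true
console.log(supportsPromptCachingSketch({ model: "gpt-4o" })); // false
```

An allowlist errs on the side of sending no marker to a model that might reject or ignore it, at the cost of needing an update when new model aliases appear.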

Changes

Prompt Caching Implementation

Layer / File(s)	Summary
Type Contract for Cacheable Prompts `packages/agent-sdk/src/sdk/llm/types.ts`	`ChatCompletionRequest` gains optional `cacheableSystemPrompt` and `dynamicSystemPrompt` fields to carry split prompt segments through the LLM pipeline.
Provider Cache Support Detection `packages/agent-sdk/src/sdk/llm/providerQuirks.ts`	Adds `PROMPT_CACHE_MODELS` list and exports `supportsPromptCaching(config)` which returns true for Anthropic cache-capable models.
Anthropic Provider Cache Integration `packages/agent-sdk/src/sdk/llm/anthropic.ts`	Introduces `EPHEMERAL_CACHE_CONTROL`, extends content/message types with optional `cache_control`, updates `toAnthropicMessages` to mark the last eligible user/tool content block when requested, and adjusts `requestBody` to emit a cached system segment plus optional dynamic segment.
OpenAI-Compatible Provider Cache Integration `packages/agent-sdk/src/sdk/llm/openai-compatible.ts`	Adds `EPHEMERAL_CACHE_CONTROL` and `cache_control` to content/message shapes, extends `toOpenAIMessage` with per-message `cachePrompt` to mark final user/tool blocks (ensuring text blocks for images), and rewrites request-body generation to include cached system content and per-message cache flags.
Agent System Prompt Refactoring `packages/agent-sdk/src/sdk/runtime/Agent.ts`	Splits system-prompt generation into `buildCacheableSystemPrompt()` (stable base with tools) and `buildDynamicSystemPrompt()` (context-specific suffix), recomposes for non-condensation uses, and passes both into condensation requests.
Condensation Function Integration `packages/agent-sdk/src/sdk/runtime/condensation.ts`	`buildChatRequestWithCondensation` accepts `dynamicSystemPrompt`, composes the dynamic portion with the conversation summary block, and returns `cacheableSystemPrompt` and optional `dynamicSystemPrompt` alongside the final `systemPrompt`.
Provider Support Detection Tests `packages/agent-sdk/src/sdk/llm/__tests__/providerQuirks.test.ts`	Adds `supportsPromptCaching` test cases for allowed Anthropic models, LiteLLM Anthropic routing, excluded Anthropic variants, and non-Anthropic models.
End-to-End Provider Cache Tests `packages/agent-sdk/src/sdk/llm/__tests__/promptCaching.test.ts`	New Vitest suite mocking fetch/streaming that asserts `cache_control: { type: 'ephemeral' }` placements: static system blocks and final user/tool messages for Anthropic, and cached system content plus final tool/user message marking for OpenAI-compatible requests.
Tests Adjustment `packages/agent-sdk/src/sdk/llm/__tests__/thinkingBlocks.test.ts`	Loosens an Anthropic tool-result assertion from `toEqual` to `toMatchObject`.
Agent and Condensation Tests `packages/agent-sdk/src/sdk/runtime/__tests__/Agent.system-prompt.test.ts`, `packages/agent-sdk/src/sdk/runtime/__tests__/condensation.test.ts`	Agent test verifies `cacheableSystemPrompt` stability across runs while `dynamicSystemPrompt` changes; condensation test verifies separated cacheable/dynamic segments and exact newline/summary formatting.
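
To make the condensation row above more concrete, here is a minimal sketch of composing the dynamic segment with a conversation summary while leaving the cacheable segment untouched. The returned field names match the table; the `composeSplitPrompt` helper, the `<conversation_summary>` wrapper, and the double-newline separator are assumptions, since the exact formatting used by `buildChatRequestWithCondensation` is not shown in this thread.

```typescript
interface SplitSystemPrompt {
  systemPrompt: string;
  cacheableSystemPrompt: string;
  dynamicSystemPrompt?: string;
}

// Hypothetical helper; separator and summary wrapper are assumed, not confirmed.
function composeSplitPrompt(
  cacheable: string,
  dynamic: string | null,
  summary: string | null,
): SplitSystemPrompt {
  const summaryBlock = summary
    ? `<conversation_summary>\n${summary}\n</conversation_summary>`
    : null;

  // The summary changes as the conversation grows, so it joins the dynamic part.
  const dynamicParts = [dynamic, summaryBlock].filter((part): part is string => Boolean(part));
  const dynamicSystemPrompt = dynamicParts.length > 0 ? dynamicParts.join("\n\n") : undefined;

  const systemPrompt = [cacheable, dynamicSystemPrompt]
    .filter((part): part is string => Boolean(part))
    .join("\n\n");

  return {
    systemPrompt,
    cacheableSystemPrompt: cacheable,
    ...(dynamicSystemPrompt ? { dynamicSystemPrompt } : {}),
  };
}

const composed = composeSplitPrompt("BASE PROMPT", "Active file: a.ts", "User asked for a refactor.");
console.log(composed.systemPrompt);
```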

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

enyst/OpenHands-Tab#706: Modifies condensation helpers and Agent usage related to splitting cacheable vs dynamic system prompts.
enyst/OpenHands-Tab#986: Also touches Agent system-prompt construction and LLM-context extraction, overlapping system prompt responsibilities.
enyst/OpenHands-Tab#584: Modifies LLM provider message builders (toAnthropicMessages / toOpenAIMessage) and related message formats.

Suggested labels

codex

"I hopped through prompts both old and new,
Static stays cozy, dynamic hops through.
Ephemeral crumbs tucked safe in a trail,
Anthropic and OpenAI follow the tale.
🐇✨ Cache kept neat, while context prevails."

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding prompt caching breakpoints for Anthropic LLM provider support.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

smolpaws · 2026-05-11T15:24:15Z

@coderabbitai review

coderabbitai · 2026-05-11T15:24:21Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

smolpaws · 2026-05-11T15:25:44Z

/gemini review

coderabbitai

🧹 Nitpick comments (3)

packages/agent-sdk/src/sdk/runtime/Agent.ts (3)

1021-1049: 💤 Low value

Optional: Consider clarifying JSDoc terminology.

The JSDoc mentions "registered tool summaries" but the code actually uses tool definitions (name + description from getToolDefinitions()). Consider rewording to "tool definitions" for clarity, though this is a minor nitpick.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/agent-sdk/src/sdk/runtime/Agent.ts` around lines 1021 - 1049, Update
the JSDoc on buildCacheableSystemPrompt to replace the phrase "registered tool
summaries" with clearer wording such as "tool definitions (name + description)";
locate the comment above the private buildCacheableSystemPrompt() method and
change the description to reference getToolDefinitions() semantics so the doc
matches the implementation that extracts tool.function.name and
tool.function.description.

1051-1066: ⚡ Quick win

Consider adding JSDoc for consistency.

Since buildCacheableSystemPrompt() has comprehensive JSDoc explaining the caching strategy, adding similar documentation to buildDynamicSystemPrompt() would improve consistency and help developers understand the split. For example, explain that this returns runtime-mutated context (editor state, secrets, etc.) that should not be cached.

📝 Example JSDoc

+  /**
+   * Builds the dynamic system-prompt suffix that changes at runtime.
+   *
+   * This suffix contains runtime-mutated context such as the current editor state,
+   * available secrets, and LLM provider details. It should NOT be cached by
+   * Anthropic prompt caching, as it varies across runs.
+   *
+   * @returns The dynamic suffix, or null if no agentContext is configured.
+   */
   private buildDynamicSystemPrompt(): string | null {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/agent-sdk/src/sdk/runtime/Agent.ts` around lines 1051 - 1066, Add a
JSDoc block above the buildDynamicSystemPrompt method (matching the style used
for buildCacheableSystemPrompt) that explains this function returns the
runtime-mutated system prompt pieces (editor state, secrets, runtime context)
and therefore must not be cached; mention the inputs used (agentContext,
secrets.getRegisteredNames(), resolved llmModel/llmProvider/llmBaseUrl via
resolveSystemPromptLlmContext) and the intended usage so readers understand why
caching is split between buildCacheableSystemPrompt and
buildDynamicSystemPrompt.

1068-1073: ⚡ Quick win

Consider adding JSDoc for clarity.

This method recombines the cacheable and dynamic prompt parts for use in system prompt events. Adding JSDoc would clarify its role in the caching strategy and improve code documentation consistency.

📝 Example JSDoc

+  /**
+   * Recombines cacheable and dynamic system prompt parts into a single string.
+   *
+   * Used by ensureSystemPrompt() to emit the full system prompt event.
+   * The split parts are sent separately to LLM clients that support prompt caching.
+   */
   private buildSystemPrompt(): string {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/agent-sdk/src/sdk/runtime/Agent.ts` around lines 1068 - 1073, The
buildSystemPrompt method lacks documentation explaining its purpose and relation
to caching; add a concise JSDoc comment above the buildSystemPrompt method
describing that it recombines cacheable and dynamic prompt parts (via
buildCacheableSystemPrompt and buildDynamicSystemPrompt), explains the returned
string format (joined with double newlines), and notes its role in system prompt
events and caching strategy so callers understand when caching applies versus
when dynamic content is included.
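
The relationship between the three methods named in these comments can be sketched as a small class. Method names, the `string | null` return type, and the double-newline join come from the review text; the `PromptBuilderSketch` class, its fields, and the way the tool lines are formatted are hypothetical and far simpler than the real `Agent`.

```typescript
class PromptBuilderSketch {
  private readonly basePrompt: string;
  private readonly toolLines: string[];
  private context: string | null;

  constructor(basePrompt: string, toolLines: string[], context: string | null) {
    this.basePrompt = basePrompt;
    this.toolLines = toolLines;
    this.context = context;
  }

  // Stable prefix: base instructions plus tool descriptions.
  buildCacheableSystemPrompt(): string {
    return [this.basePrompt, ...this.toolLines].join("\n");
  }

  // Runtime-dependent suffix, or null when there is no context.
  buildDynamicSystemPrompt(): string | null {
    return this.context;
  }

  // Recombined prompt for consumers that want a single string.
  buildSystemPrompt(): string {
    const cacheable = this.buildCacheableSystemPrompt();
    const dynamic = this.buildDynamicSystemPrompt();
    return dynamic ? `${cacheable}\n\n${dynamic}` : cacheable;
  }

  // Hypothetical setter used here to simulate a context change between runs.
  setContext(context: string | null): void {
    this.context = context;
  }
}

const builder = new PromptBuilderSketch("You are an agent.", ["- terminal: run commands"], "Active file: a.ts");
const cacheableBefore = builder.buildCacheableSystemPrompt();
builder.setContext("Active file: b.ts");
const cacheableAfter = builder.buildCacheableSystemPrompt();
console.log(cacheableBefore === cacheableAfter); // true
```

The stability check at the end mirrors what the Agent test in this PR is described as verifying: the cacheable segment stays identical while the dynamic segment changes.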

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c11df5ac-36e6-4ca5-8762-6ee769f1abd7

📥 Commits

Reviewing files that changed from the base of the PR and between a841c0f and 139fcf6.

📒 Files selected for processing (7)

packages/agent-sdk/src/sdk/llm/__tests__/promptCaching.test.ts
packages/agent-sdk/src/sdk/llm/__tests__/thinkingBlocks.test.ts
packages/agent-sdk/src/sdk/llm/anthropic.ts
packages/agent-sdk/src/sdk/llm/openai-compatible.ts
packages/agent-sdk/src/sdk/llm/providerQuirks.ts
packages/agent-sdk/src/sdk/runtime/Agent.ts
packages/agent-sdk/src/sdk/runtime/__tests__/Agent.system-prompt.test.ts

🚧 Files skipped from review as they are similar to previous changes (5)

packages/agent-sdk/src/sdk/runtime/__tests__/Agent.system-prompt.test.ts
packages/agent-sdk/src/sdk/llm/providerQuirks.ts
packages/agent-sdk/src/sdk/llm/openai-compatible.ts
packages/agent-sdk/src/sdk/llm/anthropic.ts
packages/agent-sdk/src/sdk/llm/__tests__/promptCaching.test.ts

github-actions · 2026-05-11T17:33:32Z

🔧 VSCode Extension Built Successfully

• File: openhands-tab-0.9.3.vsix (548 KB)
• Download: https://github.com/enyst/OpenHands-Tab/actions/runs/25686354917

To install:

Download the artifact from the run page above
VS Code → Command Palette → "Extensions: Install from VSIX..."
Select the downloaded .vsix

Built with Node 22. Commit 63b0c15.

Support claude-opus-4-7 prompt caching

a841c0f

Fix Anthropic tool-result cache placement

2250e22

Update packages/agent-sdk/src/sdk/llm/providerQuirks.ts

139fcf6

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>

coderabbitai Bot reviewed May 11, 2026

Add Anthropic cache smoke runner

f5bff0e

smolpaws merged commit 41d4f04 into develop May 11, 2026
9 checks passed

smolpaws deleted the codex/anthropic-prompt-caching branch May 11, 2026 17:37
