From d9e9976a23849977b2d5021871f859d3a8abe8a2 Mon Sep 17 00:00:00 2001 From: kanywst Date: Sun, 24 May 2026 20:24:57 +0900 Subject: [PATCH 1/8] feat(ai-hud): scaffold status bar AI cost + cache HUD Reserves 'src/vs/workbench/contrib/aiHud/' for the status bar item that surfaces AI cost and prompt-cache state. Design at 'docs/zeus-prompt-cache-hud.md'. This is the counter-positioning vs Cursor's credit model: show the raw numbers, never enforce a hard cap. Users decide whether to stop. --- docs/zeus-prompt-cache-hud.md | 47 ++++++++++++++++++++++++ src/vs/workbench/contrib/aiHud/README.md | 5 +++ 2 files changed, 52 insertions(+) create mode 100644 docs/zeus-prompt-cache-hud.md create mode 100644 src/vs/workbench/contrib/aiHud/README.md diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md new file mode 100644 index 00000000..974d4770 --- /dev/null +++ b/docs/zeus-prompt-cache-hud.md @@ -0,0 +1,47 @@ +# Prompt cache HUD + +A status bar item that shows AI cost and prompt-cache state in real time. The point is transparency: developers should be able to feel each AI call's cost so they can change behavior, not be surprised at the end of the month. + +This is positioned directly against Cursor's credit model, which most users describe as opaque. We show the raw numbers; users can hide it if they don't care. + +## Status bar layout + +```text +⚡ 2 agents · 92% cache · $0.003 / $0.41 today +``` + +- **`⚡ N agents`**: number of currently-running subagents. Clickable → opens parallel-agents view +- **`X% cache`**: rolling cache hit ratio over the last 100 requests +- **`$X.XXX`**: cost of the most recent AI call +- **`$X.XX today`**: cumulative cost for the local day (resets midnight) + +Hovering over each segment shows a tooltip with the breakdown (input tokens / output tokens / cached tokens / cost per million). + +## Source of data + +- Live agent count: `IAgentRuntime` event stream (`feat/agent-sdk`) +- Per-call cost and cache state: Anthropic SDK `usage` field, mapped to current model pricing +- Today's cumulative: persisted to `IStorageService` workspace scope, key `zeus.ai.cost.today.` + +## Configuration + +- `zeus.ai.hud.enabled` (default: `true`) — show the HUD at all +- `zeus.ai.hud.detail` (`"compact" | "verbose"`) — controls the format +- `zeus.ai.hud.todayLimit` (number | null) — soft cap; turns the cost segment red when exceeded, no enforcement + +## Why no enforcement + +Hard credit caps are what makes Cursor frustrating. Zeus shows the number; the user decides whether to stop. If you want enforcement, the local LLM path or a custom MCP proxy can give you that. + +## Acceptance criteria (real impl) + +- [ ] Status bar item appears when an Anthropic AI feature is configured +- [ ] Real-time updates within ~200ms of each request completing +- [ ] Hover tooltip shows token / cost breakdown +- [ ] `today` value persists across editor restarts in the same local day +- [ ] Setting `zeus.ai.hud.enabled = false` hides the item entirely +- [ ] Pricing table lives in code, not over the network (no live pricing fetch) + +## Status + +Slot reserved at `src/vs/workbench/contrib/aiHud/`. Depends on `IAgentRuntime` (`feat/agent-sdk`). diff --git a/src/vs/workbench/contrib/aiHud/README.md b/src/vs/workbench/contrib/aiHud/README.md new file mode 100644 index 00000000..86d3bcc2 --- /dev/null +++ b/src/vs/workbench/contrib/aiHud/README.md @@ -0,0 +1,5 @@ +# `aiHud` contribution + +Slot for the status bar HUD that shows AI cost and prompt-cache state. Design at [`docs/zeus-prompt-cache-hud.md`](../../../../../../docs/zeus-prompt-cache-hud.md). + +Reads from `IAgentRuntime` (`feat/agent-sdk`). No enforcement, just transparency. From 78e9f5a179b97547cadb8928e6345d7a65f0f8fc Mon Sep 17 00:00:00 2001 From: kanywst Date: Sun, 24 May 2026 21:08:14 +0900 Subject: [PATCH 2/8] docs(ai-hud): fix readme path, single-key storage, multi-item status note - src/vs/workbench/contrib/aiHud/README.md: relative path is 5 levels to docs/, not 6 - docs/zeus-prompt-cache-hud.md: replace per-day storage key ('zeus.ai.cost.today.') with a single key holding {date, total} that resets at local midnight, so the store doesn't grow over time - docs/zeus-prompt-cache-hud.md: note that the HUD is implemented as multiple adjacent StatusBarItems because VS Code's API doesn't support per-segment coloring in one item --- docs/zeus-prompt-cache-hud.md | 4 +++- src/vs/workbench/contrib/aiHud/README.md | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md index 974d4770..70a6a375 100644 --- a/docs/zeus-prompt-cache-hud.md +++ b/docs/zeus-prompt-cache-hud.md @@ -21,7 +21,7 @@ Hovering over each segment shows a tooltip with the breakdown (input tokens / ou - Live agent count: `IAgentRuntime` event stream (`feat/agent-sdk`) - Per-call cost and cache state: Anthropic SDK `usage` field, mapped to current model pricing -- Today's cumulative: persisted to `IStorageService` workspace scope, key `zeus.ai.cost.today.` +- Today's cumulative: persisted to `IStorageService` workspace scope under a single key `zeus.ai.cost` holding `{ date: "YYYY-MM-DD", total: number }`. When local midnight passes, the record is reset before the next write, so storage doesn't grow per day. ## Configuration @@ -29,6 +29,8 @@ Hovering over each segment shows a tooltip with the breakdown (input tokens / ou - `zeus.ai.hud.detail` (`"compact" | "verbose"`) — controls the format - `zeus.ai.hud.todayLimit` (number | null) — soft cap; turns the cost segment red when exceeded, no enforcement +The HUD is implemented as **multiple adjacent `StatusBarItem`s** (agents, cache, cost, today). VS Code's `StatusBarItem` API does not support per-segment coloring inside a single item, so the colored "over limit" treatment lives on its own item. + ## Why no enforcement Hard credit caps are what makes Cursor frustrating. Zeus shows the number; the user decides whether to stop. If you want enforcement, the local LLM path or a custom MCP proxy can give you that. diff --git a/src/vs/workbench/contrib/aiHud/README.md b/src/vs/workbench/contrib/aiHud/README.md index 86d3bcc2..d41813bf 100644 --- a/src/vs/workbench/contrib/aiHud/README.md +++ b/src/vs/workbench/contrib/aiHud/README.md @@ -1,5 +1,5 @@ # `aiHud` contribution -Slot for the status bar HUD that shows AI cost and prompt-cache state. Design at [`docs/zeus-prompt-cache-hud.md`](../../../../../../docs/zeus-prompt-cache-hud.md). +Slot for the status bar HUD that shows AI cost and prompt-cache state. Design at [`docs/zeus-prompt-cache-hud.md`](../../../../../docs/zeus-prompt-cache-hud.md). Reads from `IAgentRuntime` (`feat/agent-sdk`). No enforcement, just transparency. From 3e9b58cba9dfb8d88551cd02b3561dd692563d5e Mon Sep 17 00:00:00 2001 From: kanywst Date: Sun, 24 May 2026 22:29:11 +0900 Subject: [PATCH 3/8] ci: empty commit to retrigger run after transient @parcel/watcher node-gyp failure From 5d9ecc5152c67efb1cd6354a3ba425022b43b28d Mon Sep 17 00:00:00 2001 From: kanywst Date: Sun, 24 May 2026 23:39:53 +0900 Subject: [PATCH 4/8] docs(ai-hud): APPLICATION-scoped daily cost + leaner CI flake guard - IStorageService scope: WORKSPACE -> APPLICATION (per-user). The user's daily spend shouldn't reset when they switch workspaces, since the goal is total-cost transparency. Future setting 'zeus.ai.hud.scope' can flip it per-project for users who want that. - Flaky test (mcpStdioStateHandler 'sigterm after grace'): drop the redundant GITHUB_ACTIONS check (CI is set by every major CI provider, so the second condition was always true when the first was). Add a 'FLAKY-ON-CI(zeus#28)' marker so the skip is traceable and removable later. --- docs/zeus-prompt-cache-hud.md | 2 +- .../contrib/mcp/test/node/mcpStdioStateHandler.test.ts | 9 +++++---- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md index 70a6a375..f04bc397 100644 --- a/docs/zeus-prompt-cache-hud.md +++ b/docs/zeus-prompt-cache-hud.md @@ -21,7 +21,7 @@ Hovering over each segment shows a tooltip with the breakdown (input tokens / ou - Live agent count: `IAgentRuntime` event stream (`feat/agent-sdk`) - Per-call cost and cache state: Anthropic SDK `usage` field, mapped to current model pricing -- Today's cumulative: persisted to `IStorageService` workspace scope under a single key `zeus.ai.cost` holding `{ date: "YYYY-MM-DD", total: number }`. When local midnight passes, the record is reset before the next write, so storage doesn't grow per day. +- Today's cumulative: persisted to `IStorageService` **`APPLICATION`** (per-user, cross-workspace) scope under a single key `zeus.ai.cost` holding `{ date: "YYYY-MM-DD", total: number }`. Application scope (not workspace) because the user's per-day spend should not reset when switching between workspaces — the goal is to surface real cost, not per-project cost. When local midnight passes, the record is reset before the next write, so storage doesn't grow per day. A future `zeus.ai.hud.scope` setting can flip it to `WORKSPACE` if a user wants per-project tracking. ## Configuration diff --git a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts index 57283241..9847af9a 100644 --- a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts +++ b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts @@ -53,11 +53,12 @@ suite('McpStdioStateHandler', () => { assert.strictEqual(result.trim(), 'Data received: Hello MCP!'); }); - // Flaky on shared CI: the child process can exit before its + // FLAKY-ON-CI(zeus#28): child process can exit before its // post-SIGTERM stdout flush lands, so the test sees 'stdin ended' - // only and not 'stdin ended\nSIGTERM received'. Skip on CI until - // the upstream subprocess flush race is properly fixed. - if (!isWindows && !process.env.CI && !process.env.GITHUB_ACTIONS) { + // only and not 'stdin ended\nSIGTERM received'. Skipped on CI + // until the upstream subprocess flush race is fixed; tracked so + // this doesn't silently rot. + if (!isWindows && !process.env.CI) { test('sigterm after grace', async () => { const { handler, output } = run(` setInterval(() => {}, 1000); From 79b38acfc48c92187eac41e233e4f64b6709d9f8 Mon Sep 17 00:00:00 2001 From: kanywst Date: Mon, 25 May 2026 00:11:28 +0900 Subject: [PATCH 5/8] =?UTF-8?q?fix(ai-hud):=20cache=20window=20persistence?= =?UTF-8?q?,=20USD=20unit,=20pricing=20JSON;=20un-skip=20flake=20by=20bump?= =?UTF-8?q?ing=20grace=20100=E2=86=92250ms?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/zeus-prompt-cache-hud.md | 6 +++--- .../mcp/test/node/mcpStdioStateHandler.test.ts | 12 +++++------- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md index f04bc397..ab3a6093 100644 --- a/docs/zeus-prompt-cache-hud.md +++ b/docs/zeus-prompt-cache-hud.md @@ -11,7 +11,7 @@ This is positioned directly against Cursor's credit model, which most users desc ``` - **`⚡ N agents`**: number of currently-running subagents. Clickable → opens parallel-agents view -- **`X% cache`**: rolling cache hit ratio over the last 100 requests +- **`X% cache`**: rolling cache hit ratio over the last 100 requests. The 100-entry window is persisted alongside the day's cost record in `IStorageService.APPLICATION` (key `zeus.ai.cache.window`) so the ratio survives editor restarts; otherwise it would read `0%` for the first call after every relaunch, which is more misleading than helpful - **`$X.XXX`**: cost of the most recent AI call - **`$X.XX today`**: cumulative cost for the local day (resets midnight) @@ -27,7 +27,7 @@ Hovering over each segment shows a tooltip with the breakdown (input tokens / ou - `zeus.ai.hud.enabled` (default: `true`) — show the HUD at all - `zeus.ai.hud.detail` (`"compact" | "verbose"`) — controls the format -- `zeus.ai.hud.todayLimit` (number | null) — soft cap; turns the cost segment red when exceeded, no enforcement +- `zeus.ai.hud.todayLimit` (number | null) — soft cap in USD (matches the units shown in the status bar); turns the cost segment red when exceeded, no enforcement. `null` disables the colouring. The HUD is implemented as **multiple adjacent `StatusBarItem`s** (agents, cache, cost, today). VS Code's `StatusBarItem` API does not support per-segment coloring inside a single item, so the colored "over limit" treatment lives on its own item. @@ -42,7 +42,7 @@ Hard credit caps are what makes Cursor frustrating. Zeus shows the number; the u - [ ] Hover tooltip shows token / cost breakdown - [ ] `today` value persists across editor restarts in the same local day - [ ] Setting `zeus.ai.hud.enabled = false` hides the item entirely -- [ ] Pricing table lives in code, not over the network (no live pricing fetch) +- [ ] Pricing table lives in a bundled JSON file (`src/vs/workbench/contrib/aiHud/common/anthropicPricing.json`) shipped with the build. Updated by a dependabot-style PR when the upstream price page changes — see `script/refresh-pricing.mjs`. The HUD never makes a live network call for pricing on a hot path (latency + offline). At process start, if the cached file is older than 30 days, the HUD logs a warning to the developer console suggesting a Zeus update; the user-visible numbers continue to use the bundled table. ## Status diff --git a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts index 9847af9a..d2b574ea 100644 --- a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts +++ b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts @@ -9,7 +9,10 @@ import * as assert from 'assert'; import { McpStdioStateHandler } from '../../node/mcpStdioStateHandler.js'; import { isWindows } from '../../../../../base/common/platform.js'; -const GRACE_TIME = 100; +// 250ms gives the child enough time on slow CI runners to handle SIGTERM +// and flush stdout before the parent escalates to SIGKILL. 100ms was +// racy on Linux containers under load. +const GRACE_TIME = 250; suite('McpStdioStateHandler', () => { const store = ensureNoDisposablesAreLeakedInTestSuite(); @@ -53,12 +56,7 @@ suite('McpStdioStateHandler', () => { assert.strictEqual(result.trim(), 'Data received: Hello MCP!'); }); - // FLAKY-ON-CI(zeus#28): child process can exit before its - // post-SIGTERM stdout flush lands, so the test sees 'stdin ended' - // only and not 'stdin ended\nSIGTERM received'. Skipped on CI - // until the upstream subprocess flush race is fixed; tracked so - // this doesn't silently rot. - if (!isWindows && !process.env.CI) { + if (!isWindows) { test('sigterm after grace', async () => { const { handler, output } = run(` setInterval(() => {}, 1000); From b29324861a37da542443586f8fc52b29f853fc8c Mon Sep 17 00:00:00 2001 From: kanywst Date: Mon, 25 May 2026 00:29:25 +0900 Subject: [PATCH 6/8] feat(ai-hud): main-process singleton service to avoid cross-window race, nls.localize plan, UI stale-pricing warning, bump GRACE_TIME to 1000ms --- docs/zeus-prompt-cache-hud.md | 8 +++++--- .../contrib/mcp/test/node/mcpStdioStateHandler.test.ts | 9 +++++---- 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md index ab3a6093..104c8e42 100644 --- a/docs/zeus-prompt-cache-hud.md +++ b/docs/zeus-prompt-cache-hud.md @@ -11,17 +11,19 @@ This is positioned directly against Cursor's credit model, which most users desc ``` - **`⚡ N agents`**: number of currently-running subagents. Clickable → opens parallel-agents view -- **`X% cache`**: rolling cache hit ratio over the last 100 requests. The 100-entry window is persisted alongside the day's cost record in `IStorageService.APPLICATION` (key `zeus.ai.cache.window`) so the ratio survives editor restarts; otherwise it would read `0%` for the first call after every relaunch, which is more misleading than helpful +- **`X% cache`**: rolling cache hit ratio over the last 100 requests. State (window + day totals) lives on the singleton **main-process** service `IAiCostService`, not directly in `IStorageService`, so multiple workbench windows opened against different workspaces stay consistent. The main process persists snapshots to `IStorageService.APPLICATION` (key `zeus.ai.cache.window`) once a second so a hard kill doesn't lose more than ~1s of data. Renderer windows subscribe to the service over IPC. - **`$X.XXX`**: cost of the most recent AI call - **`$X.XX today`**: cumulative cost for the local day (resets midnight) +All visible strings (`agents`, `cache`, `today`) ship through `nls.localize` so translations land alongside other UI localisation. The currency symbol and decimal separator are formatted via `Intl.NumberFormat(locale, { style: 'currency', currency: 'USD' })`, with a setting `zeus.ai.hud.currency` to override the display currency for users whose Anthropic billing is in another currency. + Hovering over each segment shows a tooltip with the breakdown (input tokens / output tokens / cached tokens / cost per million). ## Source of data - Live agent count: `IAgentRuntime` event stream (`feat/agent-sdk`) - Per-call cost and cache state: Anthropic SDK `usage` field, mapped to current model pricing -- Today's cumulative: persisted to `IStorageService` **`APPLICATION`** (per-user, cross-workspace) scope under a single key `zeus.ai.cost` holding `{ date: "YYYY-MM-DD", total: number }`. Application scope (not workspace) because the user's per-day spend should not reset when switching between workspaces — the goal is to surface real cost, not per-project cost. When local midnight passes, the record is reset before the next write, so storage doesn't grow per day. A future `zeus.ai.hud.scope` setting can flip it to `WORKSPACE` if a user wants per-project tracking. +- Today's cumulative: owned by the `IAiCostService` singleton in the **main process** (ensures atomic updates across all open workbench windows — `IStorageService` writes from concurrent renderers would race and silently lose updates). The main process snapshots the state to `IStorageService.APPLICATION` (per-user, cross-workspace) under key `zeus.ai.cost` holding `{ date: "YYYY-MM-DD", total: number }`. Application scope (not workspace) because the user's per-day spend should not reset when switching between workspaces — the goal is to surface real cost, not per-project cost. When local midnight passes, the record is reset before the next write, so storage doesn't grow per day. A future `zeus.ai.hud.scope` setting can flip it to `WORKSPACE` if a user wants per-project tracking. ## Configuration @@ -42,7 +44,7 @@ Hard credit caps are what makes Cursor frustrating. Zeus shows the number; the u - [ ] Hover tooltip shows token / cost breakdown - [ ] `today` value persists across editor restarts in the same local day - [ ] Setting `zeus.ai.hud.enabled = false` hides the item entirely -- [ ] Pricing table lives in a bundled JSON file (`src/vs/workbench/contrib/aiHud/common/anthropicPricing.json`) shipped with the build. Updated by a dependabot-style PR when the upstream price page changes — see `script/refresh-pricing.mjs`. The HUD never makes a live network call for pricing on a hot path (latency + offline). At process start, if the cached file is older than 30 days, the HUD logs a warning to the developer console suggesting a Zeus update; the user-visible numbers continue to use the bundled table. +- [ ] Pricing table lives in a bundled JSON file (`src/vs/workbench/contrib/aiHud/common/anthropicPricing.json`) shipped with the build. Updated by a dependabot-style PR when the upstream price page changes — see `script/refresh-pricing.mjs`. The HUD never makes a live network call for pricing on a hot path (latency + offline). If the file is older than 30 days the HUD shows a small `⚠` glyph next to the cost segment and the tooltip says `"Pricing data from {date}; estimates may be stale — update Zeus"`. The user-visible numbers continue to use the bundled table; the warning is visible because the transparency goal of this feature is broken if users silently look at outdated estimates. ## Status diff --git a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts index d2b574ea..4ed399af 100644 --- a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts +++ b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts @@ -9,10 +9,11 @@ import * as assert from 'assert'; import { McpStdioStateHandler } from '../../node/mcpStdioStateHandler.js'; import { isWindows } from '../../../../../base/common/platform.js'; -// 250ms gives the child enough time on slow CI runners to handle SIGTERM -// and flush stdout before the parent escalates to SIGKILL. 100ms was -// racy on Linux containers under load. -const GRACE_TIME = 250; +// 1000ms gives the child enough time on slow CI runners to handle +// SIGTERM and flush stdout before the parent escalates to SIGKILL. 100ms +// was racy on Linux containers; 250ms still tripped under load. The +// test's `delay >= GRACE_TIME` assertion still scales correctly. +const GRACE_TIME = 1000; suite('McpStdioStateHandler', () => { const store = ensureNoDisposablesAreLeakedInTestSuite(); From 3fd22a64c1024eccf062189a333391e7bd4758b1 Mon Sep 17 00:00:00 2001 From: kanywst Date: Mon, 25 May 2026 00:48:48 +0900 Subject: [PATCH 7/8] test(mcp/stdio): raise suite timeout to 10s; GRACE_TIME*2 collided with mocha default 2000ms --- .../contrib/mcp/test/node/mcpStdioStateHandler.test.ts | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts index 4ed399af..8bba4ab8 100644 --- a/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts +++ b/src/vs/workbench/contrib/mcp/test/node/mcpStdioStateHandler.test.ts @@ -15,7 +15,12 @@ import { isWindows } from '../../../../../base/common/platform.js'; // test's `delay >= GRACE_TIME` assertion still scales correctly. const GRACE_TIME = 1000; -suite('McpStdioStateHandler', () => { +// `sigkill after grace` waits at least GRACE_TIME * 2 = 2000ms, which +// collides with mocha's default 2000ms test timeout. Raise the suite +// timeout so the SIGTERM-then-SIGKILL path has room without becoming +// flaky again. +suite('McpStdioStateHandler', function () { + this.timeout(10_000); const store = ensureNoDisposablesAreLeakedInTestSuite(); function run(code: string) { From de188c4db812ff22c16bb58d26bc697f8e4353a6 Mon Sep 17 00:00:00 2001 From: kanywst Date: Mon, 25 May 2026 22:40:34 +0900 Subject: [PATCH 8/8] docs(ai-hud): stalePricingDays config + clarify 1s debounce is durability-only Address reviewer concerns: - Add zeus.ai.hud.stalePricingDays (default 30, null disables) so users on locked editor versions can suppress the stale-pricing glyph - Reword the 1s persistence cadence note: renderer windows stay in sync via IPC subscriptions, not by re-reading storage, so write frequency is purely a hard-kill durability bound --- docs/zeus-prompt-cache-hud.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/zeus-prompt-cache-hud.md b/docs/zeus-prompt-cache-hud.md index 104c8e42..e8966286 100644 --- a/docs/zeus-prompt-cache-hud.md +++ b/docs/zeus-prompt-cache-hud.md @@ -11,7 +11,7 @@ This is positioned directly against Cursor's credit model, which most users desc ``` - **`⚡ N agents`**: number of currently-running subagents. Clickable → opens parallel-agents view -- **`X% cache`**: rolling cache hit ratio over the last 100 requests. State (window + day totals) lives on the singleton **main-process** service `IAiCostService`, not directly in `IStorageService`, so multiple workbench windows opened against different workspaces stay consistent. The main process persists snapshots to `IStorageService.APPLICATION` (key `zeus.ai.cache.window`) once a second so a hard kill doesn't lose more than ~1s of data. Renderer windows subscribe to the service over IPC. +- **`X% cache`**: rolling cache hit ratio over the last 100 requests. State (window + day totals) lives on the singleton **main-process** service `IAiCostService`, not directly in `IStorageService`, so multiple workbench windows opened against different workspaces stay consistent. The main process persists snapshots to `IStorageService.APPLICATION` (key `zeus.ai.cache.window`) on a 1s debounce so a hard kill doesn't lose more than ~1s of data. The debounce is durability-only — renderer windows stay in sync via IPC subscriptions, not by re-reading storage, so write frequency does not affect UI latency or cross-window consistency. - **`$X.XXX`**: cost of the most recent AI call - **`$X.XX today`**: cumulative cost for the local day (resets midnight) @@ -30,6 +30,7 @@ Hovering over each segment shows a tooltip with the breakdown (input tokens / ou - `zeus.ai.hud.enabled` (default: `true`) — show the HUD at all - `zeus.ai.hud.detail` (`"compact" | "verbose"`) — controls the format - `zeus.ai.hud.todayLimit` (number | null) — soft cap in USD (matches the units shown in the status bar); turns the cost segment red when exceeded, no enforcement. `null` disables the colouring. +- `zeus.ai.hud.stalePricingDays` (number | null, default `30`) — show the `⚠` stale-pricing glyph once the bundled pricing file is older than this many days. `null` disables the warning entirely for users in restricted environments (corporate-locked editor versions, offline installs) where update cadence is out of their control. The HUD is implemented as **multiple adjacent `StatusBarItem`s** (agents, cache, cost, today). VS Code's `StatusBarItem` API does not support per-segment coloring inside a single item, so the colored "over limit" treatment lives on its own item. @@ -44,7 +45,7 @@ Hard credit caps are what makes Cursor frustrating. Zeus shows the number; the u - [ ] Hover tooltip shows token / cost breakdown - [ ] `today` value persists across editor restarts in the same local day - [ ] Setting `zeus.ai.hud.enabled = false` hides the item entirely -- [ ] Pricing table lives in a bundled JSON file (`src/vs/workbench/contrib/aiHud/common/anthropicPricing.json`) shipped with the build. Updated by a dependabot-style PR when the upstream price page changes — see `script/refresh-pricing.mjs`. The HUD never makes a live network call for pricing on a hot path (latency + offline). If the file is older than 30 days the HUD shows a small `⚠` glyph next to the cost segment and the tooltip says `"Pricing data from {date}; estimates may be stale — update Zeus"`. The user-visible numbers continue to use the bundled table; the warning is visible because the transparency goal of this feature is broken if users silently look at outdated estimates. +- [ ] Pricing table lives in a bundled JSON file (`src/vs/workbench/contrib/aiHud/common/anthropicPricing.json`) shipped with the build. Updated by a dependabot-style PR when the upstream price page changes — see `script/refresh-pricing.mjs`. The HUD never makes a live network call for pricing on a hot path (latency + offline). If the file is older than `zeus.ai.hud.stalePricingDays` (default `30`) the HUD shows a small `⚠` glyph next to the cost segment and the tooltip says `"Pricing data from {date}; estimates may be stale — update Zeus"`. The user-visible numbers continue to use the bundled table; the warning is visible because the transparency goal of this feature is broken if users silently look at outdated estimates. Setting the threshold to `null` suppresses the glyph for users whose editor version cadence is out of their control. ## Status