fix: sidebar agent uses real tab URL instead of stale Playwright URL (v0.12.6.0) (garrytan#544)
* fix: sidebar agent uses extension's activeTabUrl instead of stale Playwright URL
When the user navigates manually in headed Chrome, Playwright's page.url()
stays on the old page. The sidebar agent was using this stale URL in its
system prompt, causing it to navigate to the wrong page (e.g., Hacker News
instead of the user's current page).
The Chrome extension now captures the active tab URL via chrome.tabs.query()
and sends it as activeTabUrl in the /sidebar-command POST body. The server
prefers this over Playwright's URL. The URL is sanitized (http/https only,
control chars stripped, 2048 char limit) to prevent prompt injection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
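The sanitization rules and the override order described in this commit could be sketched roughly as follows. This is a minimal illustration, not the code in the repository. Only the http/https restriction, the control-character stripping, the 2048-character limit, and the name `sanitizeExtensionUrl` (mentioned in a later commit of this PR) come from the commit messages. The function body, the `resolvePromptUrl` helper, and the choice to truncate rather than reject over-long URLs are assumptions.

```typescript
// Hypothetical sketch; the real sanitizeExtensionUrl() may differ.
const MAX_URL_LENGTH = 2048; // limit named in the commit message

function sanitizeExtensionUrl(raw: unknown): string | null {
  if (typeof raw !== "string") return null;
  // Drop ASCII control characters (newlines, tabs, DEL, ...) so the URL
  // cannot smuggle extra lines into the system prompt.
  const cleaned = raw.replace(/[\u0000-\u001f\u007f]/g, "").trim();
  if (cleaned === "") return null;

  let parsed: URL;
  try {
    parsed = new URL(cleaned);
  } catch {
    return null; // not an absolute URL
  }
  // Only web pages are accepted; chrome://, javascript:, file: etc. are rejected.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;

  // Assumption: over-long URLs are truncated rather than rejected.
  return parsed.href.slice(0, MAX_URL_LENGTH);
}

// Hypothetical helper: prefer the extension-reported URL and fall back
// to Playwright's URL when the extension value is missing or rejected.
function resolvePromptUrl(activeTabUrl: unknown, playwrightUrl: string): string {
  return sanitizeExtensionUrl(activeTabUrl) ?? playwrightUrl;
}
```

With this shape, a rejected or absent `activeTabUrl` degrades to the old behavior instead of failing the request, which matches the "override/fallback" cases listed in the integration tests.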
* feat: connect-chrome pre-flight cleanup + improved onboarding docs
Adds Step 0 pre-flight cleanup that kills stale browse servers and cleans
Chromium profile locks before connecting. Improves the onboarding flow with
clearer instructions for finding the extension, opening the Side Panel, and
troubleshooting connection issues. Fixes Mode check from cdp to headed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: sidebar agent test suite (layers 1-2)
Layer 1 (unit): 18 tests for URL sanitization in sidebar-utils.ts — http/https
pass, chrome:// rejected, javascript: rejected, control chars stripped, truncation.
Layer 2 (integration): 13 tests for server HTTP endpoints — auth, sidebar-command
queue writes, activeTabUrl override/fallback, event relay to chat buffer, message
queuing, queue overflow (429), chat clear, agent kill.
Source changes for testability:
- Extract sanitizeExtensionUrl() to browse/src/sidebar-utils.ts
- Add BROWSE_HEADLESS_SKIP env var to skip browser launch in HTTP-only tests
- Add SIDEBAR_QUEUE_PATH env var to both server.ts and sidebar-agent.ts
- Add SIDEBAR_AGENT_TIMEOUT env var to sidebar-agent.ts
- Sync package.json version to match VERSION (0.12.2.0)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
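The environment overrides listed under "Source changes for testability" might be read along these lines. The three variable names come from the commit message. The config shape, the default values, and the unit of the timeout are assumptions made for illustration.

```typescript
// Hypothetical sketch: variable names come from the commit message,
// but the defaults and the config shape are illustrative assumptions.
interface SidebarTestConfig {
  skipBrowserLaunch: boolean;
  queuePath: string;
  agentTimeoutMs: number;
}

type Env = Record<string, string | undefined>;

function readSidebarConfig(env: Env): SidebarTestConfig {
  const timeout = Number(env["SIDEBAR_AGENT_TIMEOUT"]);
  return {
    // Assumption: any non-empty value enables the skip.
    skipBrowserLaunch: Boolean(env["BROWSE_HEADLESS_SKIP"]),
    // Assumption: placeholder default path, not the project's real one.
    queuePath: env["SIDEBAR_QUEUE_PATH"] ?? "/tmp/sidebar-queue.jsonl",
    // Assumption: milliseconds, with a made-up 5-minute fallback
    // when the variable is unset, empty, or not a positive number.
    agentTimeoutMs: Number.isFinite(timeout) && timeout > 0 ? timeout : 300_000,
  };
}
```

In a real process this would be called as `readSidebarConfig(process.env)`. Passing the environment in as an argument lets a test give each server/agent pair its own queue file without touching global state.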
* test: sidebar agent round-trip tests with mock claude (layer 3)
Starts server + sidebar-agent together with a mock claude binary (shell script
outputting canned stream-json). Verifies the full queue-based message flow:
- Full round-trip: POST /sidebar-command → queue → agent → mock claude → events → chat
- Claude crash recovery: mock exits 1, agent_error appears, status returns to idle
- Sequential queue drain: two rapid messages both process in order
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
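The behavior these round-trip tests check (in-order processing, and recovery to idle after a crash) can be modeled with a small in-memory sketch. Only the event names `agent_done` and `agent_error` appear in the commit messages. The types, the `drainQueue` function, and the synchronous runner standing in for the mock claude binary are assumptions; the real agent works asynchronously through a queue file.

```typescript
// Hypothetical model of the flow the round-trip tests exercise.
type AgentStatus = "idle" | "processing";
type ChatEvent =
  | { type: "agent_done"; text: string }
  | { type: "agent_error"; error: string };

// Stand-in for the mock claude binary: returns canned output or throws.
type Runner = (prompt: string) => string;

function drainQueue(
  queue: string[],
  run: Runner,
): { events: ChatEvent[]; status: AgentStatus } {
  const events: ChatEvent[] = [];
  let status: AgentStatus = "idle";
  while (queue.length > 0) {
    const prompt = queue.shift() as string; // FIFO: oldest message first
    status = "processing";
    try {
      events.push({ type: "agent_done", text: run(prompt) });
    } catch (err) {
      // A crashed run is reported, then the loop moves on to the next message.
      const error = err instanceof Error ? err.message : String(err);
      events.push({ type: "agent_error", error });
    } finally {
      status = "idle"; // status always returns to idle, even after a crash
    }
  }
  return { events, status };
}
```

The `finally` branch is the point of the crash-recovery case: a failing run must not leave the agent marked as busy, or later messages would never be picked up.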
* test: sidebar agent E2E tests with real Claude (layer 4)
Two E2E tests that exercise the full sidebar agent flow with real Claude:
- sidebar-navigate: POST /sidebar-command asking Claude to describe a fixture
page, verify it responds with page content through the chat buffer
- sidebar-url-accuracy: POST with activeTabUrl differing from Playwright URL,
verify the queue prompt uses the extension URL (the core bug fix)
Both registered as periodic tier (~$0.80 total, non-deterministic).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: sidebar E2E tests — sequential execution + eval collector fix
Both tests now pass:
- sidebar-url-accuracy: deterministic queue file check (no Claude needed)
- sidebar-navigate: real Claude responds through sidebar agent queue
Fixed: the tests now use testIfSelected (sequential, not concurrent) to avoid
queue file conflicts. Added cost_usd field for eval collector compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: kill stale sidebar-agent processes before starting new one
Each /connect-chrome starts a new sidebar-agent subprocess with unref()
but never kills the previous one. Old agents accumulate as zombies with
stale auth tokens. When they pick up queue entries, their event relay
fails (401), so the server never receives agent_done and marks the agent
as "hung". The user sees the sidebar freeze.
Fix: pkill any existing sidebar-agent.ts processes before spawning.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
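The fix is mostly about ordering: clean up first, then spawn. A minimal sketch of that ordering, with the process operations injected so it can be exercised without touching real processes, might look like this. The `ProcessOps` interface, the function name, and the decision to ignore cleanup failures are assumptions, not the repository's code.

```typescript
// Hypothetical sketch: the ProcessOps interface and names are illustrative.
interface ProcessOps {
  // Kill every process whose command line matches `pattern`
  // (for example by running `pkill -f <pattern>`).
  killMatching(pattern: string): void;
  // Start a detached agent process and return its pid.
  spawnAgent(scriptPath: string): number;
}

function restartSidebarAgent(ops: ProcessOps, scriptPath: string): number {
  // Match on the script file name, e.g. "sidebar-agent.ts".
  const pattern = scriptPath.split("/").pop() ?? scriptPath;
  try {
    ops.killMatching(pattern);
  } catch {
    // Assumption: a failed cleanup should not block starting a fresh agent.
  }
  // Only after cleanup is a new agent started, so at most one agent
  // (holding the current auth token) reads the queue.
  return ops.spawnAgent(scriptPath);
}
```

In Node, `killMatching` could be wired to `spawnSync("pkill", ["-f", pattern])` from `node:child_process`, and `spawnAgent` to `spawn(...)` followed by `unref()`. Note that `pkill -f` matches any command line containing the pattern, so a narrow pattern is safer.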
* chore: bump version and changelog (v0.12.6.0)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: add P1 TODO for sidebar Write tool + error visibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CHANGELOG.md (15 additions, 0 deletions)
@@ -1,5 +1,20 @@
 # Changelog

+## [0.12.6.0] - 2026-03-27 — Sidebar Knows What Page You're On
+
+The Chrome sidebar agent used to navigate to the wrong page when you asked it to do something. If you'd manually browsed to a site, the sidebar would ignore that and go to whatever Playwright last saw (often Hacker News from the demo). Now it works.
+
+### Fixed
+
+- **Sidebar uses the real tab URL.** The Chrome extension now captures the actual page URL via `chrome.tabs.query()` and sends it to the server. Previously the sidebar agent used Playwright's stale `page.url()`, which didn't update when you navigated manually in headed mode.
+- **URL sanitization.** The extension-provided URL is validated (http/https only, control characters stripped, 2048 char limit) before being used in the Claude system prompt. Prevents prompt injection via crafted URLs.
+- **Stale sidebar agents killed on reconnect.** Each `/connect-chrome` now kills leftover sidebar-agent processes before starting a new one. Old agents had stale auth tokens and would silently fail, causing the sidebar to freeze.
+
+### Added
+
+- **Pre-flight cleanup for `/connect-chrome`.** Kills stale browse servers and cleans Chromium profile locks before connecting. Prevents "already connected" false positives after crashes.
+- **Sidebar agent test suite (36 tests).** Four layers: unit tests for URL sanitization, integration tests for server HTTP endpoints, mock-Claude round-trip tests, and E2E tests with real Claude. All free except layer 4.
+
 ## [0.12.5.1] - 2026-03-27 — Eng Review Now Tells You What to Parallelize

 `/plan-eng-review` automatically analyzes your plan for parallel execution opportunities. When your plan has independent workstreams, the review outputs a dependency table, parallel lanes, and execution order so you know exactly which tasks to split into separate git worktrees.
+**What:** Two issues with the sidebar agent (`sidebar-agent.ts`): (1) `--allowedTools` is hardcoded to `Bash,Read,Glob,Grep`, missing `Write`. Claude can't create files (like CSVs) when asked. (2) When Claude errors or returns empty, the sidebar UI shows nothing, just a green dot. No error message, no "I tried but failed", nothing.
+
+**Why:** Users ask "write this to a CSV" and the sidebar silently can't. Then they think it's broken. The UI needs to surface errors visibly, and Claude needs the tools to actually do what's asked.
+
+**Context:** `sidebar-agent.ts:163` hardcodes `--allowedTools`. The event relay (`handleStreamEvent`) handles `agent_done` and `agent_error` but the extension's sidepanel.js may not be rendering error states. The sidebar should show "Error: ..." or "Claude finished but produced no output" instead of staying on the green dot forever.
+
+**Effort:** S (human: ~2h / CC: ~10min)
+**Priority:** P1
+**Depends on:** None
+
 ### Chrome Web Store publishing

 **What:** Publish the gstack browse Chrome extension to Chrome Web Store for easier install.