Grego-GT
diff --git a/.agents/skills/gstack-connect-chrome/SKILL.md b/.agents/skills/gstack-connect-chrome/SKILL.md
Lines changed: 110 additions & 44 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 15 additions & 0 deletions
diff --git a/TODOS.md b/TODOS.md
Lines changed: 12 additions & 0 deletions
diff --git a/VERSION b/VERSION
Lines changed: 1 addition & 1 deletion
diff --git a/browse/src/cli.ts b/browse/src/cli.ts
Lines changed: 43 additions & 3 deletions
@@ -342,72 +342,126 @@ If `NEEDS_SETUP`:
 2. Run: `cd <SKILL_DIR> && ./setup`
 3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
 
+## Step 0: Pre-flight cleanup
+
+Before connecting, kill any stale browse servers and clean up lock files that
+may have persisted from a crash. This prevents "already connected" false
+positives and Chromium profile lock conflicts.
+
+```bash
+# Kill any existing browse server
+if [ -f "$(git rev-parse --show-toplevel 2>/dev/null)/.gstack/browse.json" ]; then
+  _OLD_PID=$(grep -o '"pid":[[:space:]]*[0-9]*' "$(git rev-parse --show-toplevel)/.gstack/browse.json" 2>/dev/null | grep -o '[0-9]*')
+  [ -n "$_OLD_PID" ] && kill "$_OLD_PID" 2>/dev/null || true
+  sleep 1
+  [ -n "$_OLD_PID" ] && kill -9 "$_OLD_PID" 2>/dev/null || true
+  rm -f "$(git rev-parse --show-toplevel)/.gstack/browse.json"
+fi
+# Clean Chromium profile locks (can persist after crashes)
+_PROFILE_DIR="$HOME/.gstack/chromium-profile"
+for _LF in SingletonLock SingletonSocket SingletonCookie; do
+  rm -f "$_PROFILE_DIR/$_LF" 2>/dev/null || true
+done
+echo "Pre-flight cleanup done"
+```
+
 ## Step 1: Connect
 
 ```bash
 $B connect
 ```
 
-This launches your system Chrome via Playwright with:
-- A visible window (headed mode, not headless)
-- The gstack Chrome extension pre-loaded
-- A green shimmer line + "gstack" pill so you know which window is controlled
+This launches Playwright's bundled Chromium in headed mode with:
+- A visible window you can watch (not your regular Chrome — it stays untouched)
+- The gstack Chrome extension auto-loaded via `launchPersistentContext`
+- A golden shimmer line at the top of every page so you know which window is controlled
+- A sidebar agent process for chat commands
 
-If Chrome is already running, the server restarts in headed mode with a fresh
-Chrome instance. Your regular Chrome stays untouched.
+The `connect` command auto-discovers the extension from the gstack install
+directory. It always uses port **34567** so the extension can auto-connect.
 
-After connecting, print the output to the user.
+After connecting, print the full output to the user. Confirm you see
+`Mode: headed` in the output.
+
+If the output shows an error or the mode is not `headed`, run `$B status` and
+share the output with the user before proceeding.
 
 ## Step 2: Verify
 
 ```bash
 $B status
 ```
 
-Confirm the output shows `Mode: cdp`. Print the port number — the user may need
-it for the Side Panel.
+Confirm the output shows `Mode: headed`. Read the port from the state file:
+
+```bash
+grep -o '"port":[[:space:]]*[0-9]*' "$(git rev-parse --show-toplevel 2>/dev/null)/.gstack/browse.json" 2>/dev/null | grep -o '[0-9]*'
+```
+
+The port should be **34567**. If it's different, note it — the user may need it
+for the Side Panel.
+
+Also find the extension path so you can help the user if they need to load it manually:
+
+```bash
+_EXT_PATH=""
+_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
+[ -n "$_ROOT" ] && [ -f "$_ROOT/.agents/skills/gstack/extension/manifest.json" ] && _EXT_PATH="$_ROOT/.agents/skills/gstack/extension"
+[ -z "$_EXT_PATH" ] && [ -f "$HOME/.agents/skills/gstack/extension/manifest.json" ] && _EXT_PATH="$HOME/.agents/skills/gstack/extension"
+echo "EXTENSION_PATH: ${_EXT_PATH:-NOT FOUND}"
+```
 
 ## Step 3: Guide the user to the Side Panel
 
 Use AskUserQuestion:
 
-> Chrome is launched with gstack control. You should see a green shimmer line at the
-> top of the Chrome window and a small "gstack" pill in the bottom-right corner.
->
-> The Side Panel extension is pre-loaded. To open it:
-> 1. Look for the **puzzle piece icon** (Extensions) in Chrome's toolbar
-> 2. Click it → find **gstack browse** → click the **pin icon** to pin it
-> 3. Click the **gstack icon** in the toolbar
-> 4. Click **Open Side Panel**
+> The browser is launched under gstack control. You should see Playwright's Chromium
+> (not your regular Chrome) with a golden shimmer line at the top of the page.
 >
-> The Side Panel shows a live feed of every browse command in real time.
+> The Side Panel extension should be auto-loaded. To open it:
+> 1. Look for the **puzzle piece icon** (Extensions) in the toolbar — it may
+>    already show the gstack icon if the extension loaded successfully
+> 2. Click the **puzzle piece** → find **gstack browse** → click the **pin icon**
+> 3. Click the pinned **gstack icon** in the toolbar
+> 4. The Side Panel should open on the right showing a live activity feed
 >
-> **Port:** The browse server is on port {PORT} — the extension auto-detects it
-> if you're using the Playwright-controlled Chrome. If the badge stays gray, click
-> the gstack icon and enter port {PORT} manually.
+> **Port:** 34567 (auto-detected — the extension connects automatically in the
+> Playwright-controlled Chrome).
 
 Options:
 - A) I can see the Side Panel — let's go!
 - B) I can see Chrome but can't find the extension
 - C) Something went wrong
 
 If B: Tell the user:
-> The extension should be auto-loaded, but Chrome sometimes doesn't show it
-> immediately. Try:
+
+> The extension is loaded into Playwright's Chromium at launch time, but
+> sometimes it doesn't appear immediately. Try these steps:
+>
 > 1. Type `chrome://extensions` in the address bar
-> 2. Look for "gstack browse" — it should be listed and enabled
-> 3. If not listed, click "Load unpacked" → navigate to the extension folder
->    (press Cmd+Shift+G in the file picker, paste this path):
->    `{EXTENSION_PATH}`
+> 2. Look for **"gstack browse"** — it should be listed and enabled
+> 3. If it's there but not pinned, go back to any page, click the puzzle piece
+>    icon, and pin it
+> 4. If it's NOT listed at all, click **"Load unpacked"**, then in the file picker:
+>    - Press **Cmd+Shift+G** (macOS "Go to Folder")
+>    - Paste this path: `{EXTENSION_PATH}` (use the path from Step 2)
+>    - Click **Select**
+>
+> After loading, pin it and click the icon to open the Side Panel.
 >
-> Then pin it from the puzzle piece icon and open the Side Panel.
+> If the Side Panel badge stays gray (disconnected), click the gstack icon
+> and enter port **34567** manually.
+
+If C:
 
-If C: Run `$B status` and show the output. Check if the server is healthy.
+1. Run `$B status` and show the output
+2. If the server is not healthy, re-run Step 0 cleanup + Step 1 connect
+3. If the server IS healthy but the browser isn't visible, try `$B focus`
+4. If that fails, ask the user what they see (error message, blank screen, etc.)
 
 ## Step 4: Demo
 
-After the user confirms the Side Panel is working, run a quick demo so they
-can see the activity feed in action:
+After the user confirms the Side Panel is working, run a quick demo:
 
 ```bash
 $B goto https://news.ycombinator.com
@@ -420,16 +474,17 @@ $B snapshot -i
 ```
 
 Tell the user: "Check the Side Panel — you should see the `goto` and `snapshot`
-commands appear in the activity feed. Every command Claude runs will show up here
+commands appear in the activity feed. Every command Claude runs shows up here
 in real time."
 
 ## Step 5: Sidebar chat
 
 After the activity feed demo, tell the user about the sidebar chat:
 
 > The Side Panel also has a **chat tab**. Try typing a message like "take a
-> snapshot and describe this page." A child Claude instance will execute your
-> request in the browser — you'll see the commands appear in the activity feed.
+> snapshot and describe this page." A sidebar agent (a child Claude instance)
+> executes your request in the browser — you'll see the commands appear in
+> the activity feed as they happen.
 >
 > The sidebar agent can navigate pages, click buttons, fill forms, and read
 > content. Each task gets up to 5 minutes. It runs in an isolated session, so
@@ -439,17 +494,28 @@ After the activity feed demo, tell the user about the sidebar chat:
 
 Tell the user:
 
-> You're all set! Chrome is under Claude's control with the Side Panel showing
-> live activity and a chat sidebar for direct commands. Here's what you can do:
+> You're all set! Here's what you can do with the connected Chrome:
+>
+> **Watch Claude work in real time:**
+> - Run any gstack skill (`/qa`, `/design-review`, `/benchmark`) and watch
+>   every action happen in the visible Chrome window + Side Panel feed
+> - No cookie import needed: the Playwright browser keeps its own session in a persistent profile
+>
+> **Control the browser directly:**
+> - **Sidebar chat** — type natural language in the Side Panel and the sidebar
+>   agent executes it (e.g., "fill in the login form and submit")
+> - **Browse commands** — `$B goto <url>`, `$B click <sel>`, `$B fill <sel> <val>`,
+>   `$B snapshot -i` — all visible in Chrome + Side Panel
+>
+> **Window management:**
+> - `$B focus` — bring Chrome to the foreground anytime
+> - `$B disconnect` — close headed Chrome and return to headless mode
 >
-> - **Chat in the sidebar** — type natural language instructions and Claude
->   executes them in the browser
-> - **Run any browse command** — `$B goto`, `$B click`, `$B snapshot` — and
->   watch it happen in Chrome + the Side Panel
-> - **Use /qa or /design-review** — they'll run in the visible Chrome window
->   instead of headless. No cookie import needed.
-> - **`$B focus`** — bring Chrome to the foreground anytime
-> - **`$B disconnect`** — return to headless mode when done
+> **What skills look like in headed mode:**
+> - `/qa` runs its full test suite in the visible browser — you see every page
+>   load, every click, every assertion
+> - `/design-review` takes screenshots in the real browser — same pixels you see
+> - `/benchmark` measures performance in the headed browser
 
 Then proceed with whatever the user asked to do. If they didn't specify a task,
 ask what they'd like to test or browse.
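The `grep` pipelines in Step 0 and Step 2 pull `pid` and `port` out of `.gstack/browse.json` by pattern matching. As a hedged alternative, the same fields could be read with a JSON parser, which does not depend on how the file is formatted. This is a minimal sketch: `BrowseState` and `parseBrowseState` are hypothetical names, and only the `pid` and `port` keys are taken from the skill text; the real state file may contain other fields.

```typescript
// Hypothetical shape: only `pid` and `port` are known from the skill's grep patterns.
interface BrowseState {
  pid: number | null;
  port: number | null;
}

// Parse the text of .gstack/browse.json and extract `pid` and `port`.
// Returns nulls instead of throwing, so a corrupt or empty state file
// can be treated the same as "no server running".
function parseBrowseState(text: string): BrowseState {
  const asPositiveInt = (value: unknown): number | null =>
    typeof value === "number" && Number.isInteger(value) && value > 0 ? value : null;

  try {
    const raw: unknown = JSON.parse(text);
    if (typeof raw !== "object" || raw === null) {
      return { pid: null, port: null };
    }
    const record = raw as Record<string, unknown>;
    return { pid: asPositiveInt(record.pid), port: asPositiveInt(record.port) };
  } catch {
    // Not valid JSON at all.
    return { pid: null, port: null };
  }
}

const example = parseBrowseState('{"pid": 4242, "port": 34567}');
console.log(example.pid, example.port);
```

Returning `null` for a missing or malformed field keeps the caller's logic simple: kill the old server only when `pid` is non-null, and compare `port` against 34567 only when it is present.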
@@ -1,5 +1,20 @@
 # Changelog
 
+## [0.12.6.0] - 2026-03-27 — Sidebar Knows What Page You're On
+
+The Chrome sidebar agent used to navigate to the wrong page when you asked it to do something. If you'd manually browsed to a site, the sidebar would ignore that and go to whatever Playwright last saw (often Hacker News from the demo). Now it acts on the page you're actually viewing.
+
+### Fixed
+
+- **Sidebar uses the real tab URL.** The Chrome extension now captures the actual page URL via `chrome.tabs.query()` and sends it to the server. Previously the sidebar agent used Playwright's stale `page.url()`, which didn't update when you navigated manually in headed mode.
+- **URL sanitization.** The extension-provided URL is validated (http/https only, control characters stripped, 2048 char limit) before being used in the Claude system prompt. Prevents prompt injection via crafted URLs.
+- **Stale sidebar agents killed on reconnect.** Each `/connect-chrome` now kills leftover sidebar-agent processes before starting a new one. Old agents had stale auth tokens and would silently fail, causing the sidebar to freeze.
+
+### Added
+
+- **Pre-flight cleanup for `/connect-chrome`.** Kills stale browse servers and cleans Chromium profile locks before connecting. Prevents "already connected" false positives after crashes.
+- **Sidebar agent test suite (36 tests).** Four layers: unit tests for URL sanitization, integration tests for server HTTP endpoints, mock-Claude round-trip tests, and E2E tests with real Claude. Layers 1-3 are free to run; only layer 4 calls real Claude.
+
 ## [0.12.5.1] - 2026-03-27 — Eng Review Now Tells You What to Parallelize
 
 `/plan-eng-review` automatically analyzes your plan for parallel execution opportunities. When your plan has independent workstreams, the review outputs a dependency table, parallel lanes, and execution order so you know exactly which tasks to split into separate git worktrees.
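The URL sanitization rule in the 0.12.6.0 entry (http/https only, control characters stripped, 2048-character limit) could be sketched as follows. This is not the code from the PR: `sanitizeTabUrl` and `MAX_URL_LENGTH` are hypothetical names, and the sketch assumes that an over-long URL is rejected rather than truncated, which the changelog does not specify.

```typescript
// Limit stated in the changelog entry.
const MAX_URL_LENGTH = 2048;

// Hypothetical helper. Returns the cleaned URL, or null when the input
// should not be passed on to the system prompt.
function sanitizeTabUrl(input: string): string | null {
  // Strip ASCII control characters (U+0000 to U+001F, plus U+007F).
  const stripped = input.replace(/[\u0000-\u001f\u007f]/g, "").trim();

  // Assumption: empty or over-long URLs are rejected outright.
  if (stripped.length === 0 || stripped.length > MAX_URL_LENGTH) {
    return null;
  }

  // Allow only the http and https schemes.
  if (!/^https?:\/\//i.test(stripped)) {
    return null;
  }
  return stripped;
}

console.log(sanitizeTabUrl("https://example.com/\u0007page"));
console.log(sanitizeTabUrl("javascript:alert(1)"));
```

Stripping control characters before the scheme check matters: it prevents an input such as a newline followed by extra text from smuggling additional lines into the prompt.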
 
@@ -185,6 +185,18 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace
 **Priority:** P3
 **Depends on:** Headed mode (shipped)
 
+### Sidebar agent needs Write tool + better error visibility
+
+**What:** Two issues with the sidebar agent (`sidebar-agent.ts`): (1) `--allowedTools` is hardcoded to `Bash,Read,Glob,Grep`, missing `Write`. Claude can't create files (like CSVs) when asked. (2) When Claude errors or returns empty, the sidebar UI shows nothing, just a green dot. No error message, no "I tried but failed", nothing.
+
+**Why:** Users ask "write this to a CSV" and the sidebar silently can't. Then they think it's broken. The UI needs to surface errors visibly, and Claude needs the tools to actually do what's asked.
+
+**Context:** `sidebar-agent.ts:163` hardcodes `--allowedTools`. The event relay (`handleStreamEvent`) handles `agent_done` and `agent_error` but the extension's sidepanel.js may not be rendering error states. The sidebar should show "Error: ..." or "Claude finished but produced no output" instead of staying on the green dot forever.
+
+**Effort:** S (human: ~2h / CC: ~10min)
+**Priority:** P1
+**Depends on:** None
+
 ### Chrome Web Store publishing
 
 **What:** Publish the gstack browse Chrome extension to Chrome Web Store for easier install.
 
@@ -1 +1 @@
-0.12.5.1
+0.12.6.0
@@ -511,8 +511,27 @@ Refs:           After 'snapshot', use @e1, @e2... as selectors:
       }
     }
 
-    // Clean up Chromium profile locks (can persist after crashes)
+    // Kill orphaned Chromium processes that may still hold the profile lock.
+    // The server PID is the Bun process; Chromium is a child that can outlive it
+    // if the server is killed abruptly (SIGKILL, crash, manual rm of state file).
     const profileDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
+    try {
+      const singletonLock = path.join(profileDir, 'SingletonLock');
+      const lockTarget = fs.readlinkSync(singletonLock); // e.g. "hostname-12345"
+      const orphanPid = parseInt(lockTarget.split('-').pop() || '', 10);
+      if (orphanPid && isProcessAlive(orphanPid)) {
+        try { process.kill(orphanPid, 'SIGTERM'); } catch {}
+        await new Promise(resolve => setTimeout(resolve, 1000));
+        if (isProcessAlive(orphanPid)) {
+          try { process.kill(orphanPid, 'SIGKILL'); } catch {}
+          await new Promise(resolve => setTimeout(resolve, 500));
+        }
+      }
+    } catch {
+      // No lock symlink or not readable — nothing to kill
+    }
+
+    // Clean up Chromium profile locks (can persist after crashes)
     for (const lockFile of ['SingletonLock', 'SingletonSocket', 'SingletonCookie']) {
       try { fs.unlinkSync(path.join(profileDir, lockFile)); } catch {}
     }
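The orphan-kill logic above depends on reading a PID out of the `SingletonLock` symlink target, which the diff's comment describes as looking like `hostname-12345`. A stricter version of that parsing step might look like the sketch below. `parseLockPid` is a hypothetical helper, not part of the PR; it differs from the inline `parseInt` call by rejecting targets whose last segment is not purely numeric.

```typescript
// Extract the PID from a Chromium SingletonLock symlink target.
// Hostnames can themselves contain hyphens, so only the last
// hyphen-separated segment is treated as the PID.
function parseLockPid(lockTarget: string): number | null {
  const lastSegment = lockTarget.split("-").pop() ?? "";

  // Require digits only: parseInt alone would accept "12abc" as 12.
  if (!/^[0-9]+$/.test(lastSegment)) {
    return null;
  }

  const pid = Number(lastSegment);
  // PID 0 is never a valid target for a kill of a single process.
  return Number.isSafeInteger(pid) && pid > 0 ? pid : null;
}

console.log(parseLockPid("hostname-12345"));
console.log(parseLockPid("my-build-host-42"));
console.log(parseLockPid("hostname-12abc"));
```

Rejecting a partially numeric segment is a conservative choice: sending `SIGTERM` to a PID derived from a malformed lock target could hit an unrelated process.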
@@ -545,17 +564,38 @@ Refs:           After 'snapshot', use @e1, @e2... as selectors:
       console.log(`Connected to real Chrome\n${status}`);
 
       // Auto-start sidebar agent
-      const agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
+      // __dirname is inside $bunfs in compiled binaries — resolve from execPath instead
+      let agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
+      if (!fs.existsSync(agentScript)) {
+        agentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'sidebar-agent.ts');
+      }
       try {
+        if (!fs.existsSync(agentScript)) {
+          throw new Error(`sidebar-agent.ts not found at ${agentScript}`);
+        }
         // Clear old agent queue
         const agentQueue = path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
         try { fs.writeFileSync(agentQueue, ''); } catch {}
 
+        // Resolve the browse binary: prefer ../dist/browse, else fall back to process.execPath
+        let browseBin = path.resolve(__dirname, '..', 'dist', 'browse');
+        if (!fs.existsSync(browseBin)) {
+          browseBin = process.execPath; // the compiled binary itself
+        }
+
+        // Kill any existing sidebar-agent processes before starting a new one.
+        // Old agents have stale auth tokens and will silently fail to relay events,
+        // causing the server to mark the agent as "hung".
+        try {
+          const { spawnSync } = require('child_process');
+          spawnSync('pkill', ['-f', 'sidebar-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
+        } catch {}
+
         const agentProc = Bun.spawn(['bun', 'run', agentScript], {
           cwd: config.projectDir,
           env: {
             ...process.env,
-            BROWSE_BIN: path.resolve(__dirname, '..', 'dist', 'browse'),
+            BROWSE_BIN: browseBin,
             BROWSE_STATE_FILE: config.stateFile,
             BROWSE_SERVER_PORT: String(newState.port),
           },