Releases: mikuh/browser-ctl

v0.2.13

@mikuh mikuh released this 26 Feb 12:38

What's Changed

  • Resilient window/tab handling: added getLastFocusedNormalWindowId() fallback for headless/background scenarios
  • Navigate now creates a new window when no active tab exists
  • Safer doTabs with try/catch for getLastFocused
  • Force-click improvements: auto-retry with force when element is obscured by overlays
  • Content-script robustness for edge cases
  • SKILL.md updates: resilience notes, SPA best practices, press keyCode documentation

v0.2.12: Pinned tab session, multi-connection server, SW resilience

@mikuh mikuh released this 26 Feb 09:05

What's New

Pinned Tab Session

  • bctl tab <id> now pins a tab — all subsequent commands auto-target it
  • bctl tab --clear to unpin, bctl tab to show current pin status
  • Stale pinned tab auto-clears and retries on failure
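The stale-pin behaviour can be sketched as follows. This is an illustration only: these notes do not say whether the logic lives in the server or the extension, and PinState, TabGone, send_with_pin, and the "tabId" parameter are hypothetical names, not the project's API.

```python
from typing import Any, Callable, Dict, Optional


class TabGone(Exception):
    """Hypothetical error raised when the targeted tab no longer exists."""


class PinState:
    """Holds the pinned tab and its window (both hypothetical fields)."""

    def __init__(self) -> None:
        self.tab_id: Optional[int] = None
        self.window_id: Optional[int] = None

    def pin(self, tab_id: int, window_id: int) -> None:
        self.tab_id = tab_id
        self.window_id = window_id

    def clear(self) -> None:
        # Clear both fields so no stale window ID outlives the pin.
        self.tab_id = None
        self.window_id = None


def send_with_pin(
    pin: PinState,
    send: Callable[[str, Dict[str, Any]], Any],
    action: str,
    params: Dict[str, Any],
) -> Any:
    """Target the pinned tab; if it is gone, unpin and retry once."""
    if pin.tab_id is None:  # `is None`, so a tab ID of 0 still counts as pinned
        return send(action, params)
    try:
        return send(action, {**params, "tabId": pin.tab_id})
    except TabGone:
        pin.clear()
        return send(action, params)  # retry against the default target
```

The retry happens at most once, so a command that fails for any other reason still surfaces its error to the caller.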

Multi-Connection Server

  • Server now supports multiple extension WebSocket connections
  • Seamless handling of Chrome service worker restarts without dropping commands
  • Pending futures only cancelled when all connections are gone
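A minimal asyncio sketch of the "cancel only when all connections are gone" rule, assuming pending commands are tracked as futures keyed by a request ID. ConnectionRegistry and its method names are hypothetical and are not taken from the project's server code.

```python
import asyncio
from typing import Any, Dict, Hashable, Set


class ConnectionRegistry:
    """Tracks live extension connections and the futures awaiting replies."""

    def __init__(self) -> None:
        self.connections: Set[Hashable] = set()
        self.pending: Dict[int, "asyncio.Future[Any]"] = {}

    def add(self, conn: Hashable) -> None:
        self.connections.add(conn)

    def create_pending(self, request_id: int) -> "asyncio.Future[Any]":
        # Must be called from inside a running event loop.
        fut = asyncio.get_running_loop().create_future()
        self.pending[request_id] = fut
        return fut

    def resolve(self, request_id: int, result: Any) -> None:
        fut = self.pending.pop(request_id, None)
        if fut is not None and not fut.done():
            fut.set_result(result)

    def remove(self, conn: Hashable) -> None:
        self.connections.discard(conn)
        if self.connections:
            return  # another connection can still deliver the replies
        # Last connection gone: nothing can answer, so cancel every waiter.
        for fut in self.pending.values():
            if not fut.done():
                fut.cancel()
        self.pending.clear()
```

With this shape, a restarted service worker that reconnects before the old socket closes keeps in-flight commands alive, because the registry is never empty during the handover.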

Service Worker Resilience (MV3)

  • Reduced waitForTabLoad timeout to 5s to prevent SW termination
  • Auto-recover from disconnect on new-tab/navigate commands
  • Use lastFocusedWindow instead of unreliable currentWindow in SW context

Screenshot Improvements

  • CDP-based screenshot for pinned (non-visible) tabs
  • Proper tabId null-check in screenshotViaCDP
  • handle_screenshot now respects pinned tab context

Enhanced URL Extraction

  • Canvas element support (converts to data URL)
  • CSS background-image fallback
  • Blob URL async conversion to data URL

Download Improvements

  • MIME-based file extension correction for image downloads
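As an illustration of MIME-based extension correction, the sketch below rewrites a filename's suffix from the response's content type. The mapping contents and the correct_extension helper are assumptions; the notes name a _MIME_EXT constant but do not show what it contains.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping; the project's actual _MIME_EXT is not shown in these notes.
_MIME_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
    "image/svg+xml": ".svg",
}


def correct_extension(filename: str, content_type: Optional[str]) -> str:
    """Return `filename` with a suffix that matches a known image MIME type."""
    if not content_type:
        return filename
    # Drop parameters such as "; charset=binary" before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    ext = _MIME_EXT.get(mime)
    if ext is None:
        return filename  # unknown type: leave the name untouched
    suffix = Path(filename).suffix.lower()
    if suffix == ext or (ext == ".jpg" and suffix == ".jpeg"):
        return filename  # already consistent with the MIME type
    return str(Path(filename).with_suffix(ext))
```

For example, a file saved as cat.jpg but served as image/png would be renamed to cat.png, while a non-image type such as application/pdf is left unchanged.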

Bug Fixes

  • Fix stale pin retry to also clean up windowId
  • Fix screenshotViaCDP falsy check for tab ID 0
  • Move _MIME_EXT to module-level constant

v0.2.11: Fix browser automation when Chrome is minimized

@mikuh mikuh released this 26 Feb 02:32

What's Changed

Bug Fix: Chrome minimized support

Previously, browser-ctl commands would hang or fail when Chrome was minimized because several rendering-dependent APIs don't work on hidden pages:

  • requestAnimationFrame callbacks are paused → stability checks hung indefinitely
  • document.elementFromPoint() returns null → hit-test always failed
  • offsetParent / getBoundingClientRect() may return incorrect values

Changes

  • click.js: Skip rAF stability check and elementFromPoint hit-test when document.hidden is true
  • content-script.js: Same fix for all actionability checks (ensureActionable) — affects click, dblclick, hover, drag, type, focus, check/uncheck, and snapshot
  • actions.js: Add CDP fallback for doScreenshot() — uses Page.captureScreenshot when captureVisibleTab fails (minimized window)

All commands now work correctly with Chrome minimized or in a background desktop.

v0.2.10

@mikuh mikuh released this 15 Feb 12:57

  • Improve SPA/browser automation reliability
  • Better click text matching and stale ref diagnostics
  • Batch and interaction stability updates

v0.2.9: Refactor extension into ES modules

@mikuh mikuh released this 15 Feb 11:29

What's Changed

Refactor extension into ES modules

Split the monolithic background.js into separate, focused modules for better maintainability:

  • actions.js — Chrome API action handlers (navigation, tabs, screenshot, download, eval, etc.)
  • click.js — Unified click implementation with Playwright-style actionability checks and CDP fallback
  • content-script.js — DOM operations handler with Shadow DOM support
  • background.js — Slim entry point with WebSocket lifecycle only
  • manifest.json — Enable ES module service worker

Full Changelog: v0.2.8...v0.2.9

v0.2.8: Local pure-sleep, multi-window tab switching, SPA resilience

@mikuh mikuh released this 14 Feb 13:53

What's Changed

  • Local pure-sleep: bctl wait <seconds> now runs locally in Python, avoiding the extension round-trip, which can time out on heavy SPA pages (YouTube, Gmail, etc.) where the service worker is busy during page load.
  • Multi-window tab switching: bctl tabs returns a windowId per tab plus focusedWindowId. bctl tab <id> automatically focuses the containing window before activating the tab, enabling reliable cross-window switching.
  • SPA form interaction guidance — Updated SKILL.md with best practices: never use eval to set form values or click buttons on SPA sites; use type/input-text/click instead.
  • Updated SKILL.md — Added SPA best practices, multi-window usage guidance, and updated tips.
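The local pure-sleep path might look roughly like the sketch below. It assumes that a single numeric argument means "sleep locally" and that any other form of wait is still forwarded to the extension; parse_sleep_seconds, handle_wait, the send_to_extension callback, and the returned dictionary are all hypothetical.

```python
import math
import time
from typing import Any, Callable, List, Optional


def parse_sleep_seconds(args: List[str]) -> Optional[float]:
    """Return the delay if `args` is one finite, non-negative number, else None."""
    if len(args) != 1:
        return None
    try:
        seconds = float(args[0])
    except ValueError:
        return None
    if not math.isfinite(seconds) or seconds < 0:
        return None
    return seconds


def handle_wait(
    args: List[str],
    send_to_extension: Callable[[str, List[str]], Any],
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Sleep in-process for numeric waits; forward everything else."""
    seconds = parse_sleep_seconds(args)
    if seconds is None:
        return send_to_extension("wait", args)
    sleep(seconds)  # no WebSocket round-trip, so a busy service worker cannot stall it
    return {"success": True, "waited": seconds}
```

The sleep function is injectable only so the behaviour can be checked without actually waiting.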

v0.2.7: Fix garbled non-ASCII output on Windows

@mikuh mikuh released this 13 Feb 08:07

What's Changed

Bug Fix

  • Fix garbled non-ASCII (Chinese, etc.) output on Windows — Reconfigure stdout/stderr to UTF-8 at CLI startup. Windows console defaults to the system code page (e.g. CP936 for Chinese locales), which caused garbled output when printing Unicode characters via json.dumps(ensure_ascii=False).

Details

  • Added _ensure_utf8_stdio() in cli.py that calls sys.stdout.reconfigure(encoding='utf-8') on Windows
  • Safe: only activates on sys.platform == 'win32', no-op on macOS/Linux
  • reconfigure() requires Python 3.7+, which the project's 3.11+ requirement already covers

Full Changelog: v0.2.6...v0.2.7

v0.2.6: Fast CLI startup, SPA-compatible click

@mikuh mikuh released this 13 Feb 07:27

What's New

⚡ Fast CLI Startup

  • Replaced urllib with raw-socket HTTP in the client — eliminates heavy imports
  • Lazy module loading in CLI — only imports what's needed per command
  • Cold start reduced to ~5ms (previously ~30ms)
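To illustrate the raw-socket approach, here is a small HTTP client that uses only the socket and json modules. It is a sketch under stated assumptions: it speaks HTTP/1.0 so the server replies with a plain body and then closes the connection, and the host, port, and path passed in are placeholders, not the project's real endpoint.

```python
import json
import socket
from typing import Any, Tuple


def build_request(host: str, port: int, path: str, payload: Any) -> bytes:
    """Serialize a JSON POST as raw HTTP/1.0 bytes."""
    body = json.dumps(payload).encode("utf-8")
    head = (
        f"POST {path} HTTP/1.0\r\n"
        f"Host: {host}:{port}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def parse_response(raw: bytes) -> Tuple[int, bytes]:
    """Split a complete raw response into (status code, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return int(status_line.split(" ", 2)[1]), body


def post_json(host: str, port: int, path: str, payload: Any,
              timeout: float = 5.0) -> Tuple[int, Any]:
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, port, path, payload))
        chunks = []
        while True:
            chunk = sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    status, body = parse_response(b"".join(chunks))
    return status, json.loads(body)
```

The trade-off is that this client handles none of the cases urllib covers (redirects, chunked encoding, proxies), which is acceptable only when talking to a known local server.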

🖱️ SPA-Compatible Click

  • New three-phase click mechanism for maximum SPA compatibility:
    1. Phase 1 (ISOLATED): Dispatch pointer/mouse events for hover states and tracking
    2. Phase 2 (MAIN): Hook window.open() + dispatch click in page context
    3. Phase 3 (background): Navigate via chrome.tabs.create if popup was intercepted
  • bctl click now works reliably on Tencent Video, Bilibili, and other Vue/React SPAs that use window.open() for navigation

📦 Server Improvements

  • Expanded batchable operations: dblclick, focus, input-text, check, uncheck, snapshot, is-visible, get-value can now be batched via pipe/batch

📝 Documentation

  • SKILL.md streamlined for AI agents — removed project structure section, added SPA tips
  • README/README_CN updated with bctl setup subcommands, SPA click feature, and performance notes
  • Added SPA click row to comparison table

Full Changelog: v0.2.5...v0.2.6

v0.2.5

@mikuh mikuh released this 13 Feb 05:33

Full Changelog: v0.2.4...v0.2.5

v0.2.4: Fix click for Vue/React SPAs

@mikuh mikuh released this 13 Feb 05:12

What's Changed

Fix click not triggering navigation on Vue.js/React rendered pages

Replace the bare el.click() call with a full pointer and mouse event sequence that ends in the native el.click():

pointerdown → mousedown → pointerup → mouseup → el.click()

Problem

On SPA sites like Tencent Video, Bilibili, etc., bctl click on search result cards would fire the DOM click but not trigger the framework's navigation handler, requiring manual URL extraction as a workaround.

Fix

  • Dispatch PointerEvent (pointerdown/pointerup) + MouseEvent (mousedown/mouseup) with element center coordinates
  • Retain native el.click() at the end to preserve isTrusted:true behavior required by sites like GitHub

Full Changelog: v0.2.3...v0.2.4