feat(window): add KWin scripting backend with 5 read+control tools #26

Open
isac322 wants to merge 6 commits into main from overhaul/pr5-window

isac322 (Owner) commented May 5, 2026

Why

AT-SPI handles widget semantics well but is weak at window-level enumeration, geometry, and activation. KWin scripting (KDE-specific) is the deterministic path for those operations.

What

  • src/kwin_mcp/window.py (NEW, 301L): KWinScriptingBackend(dbus_address)
    • 5 JS templates: JS_LIST_WINDOWS, JS_ACTIVE_WINDOW, JS_GEOMETRY_BY_ID, JS_ACTIVATE_BY_ID, JS_CLOSE_BY_ID
    • loadScriptFromText is unsupported (KWin 6.6.4 — see PR 1b spike), so the backend writes JS to a tempfile and uses loadScript(path, name) + UUID per call
    • Result delivery: temporary D-Bus name + dbus.service.Object.Result(s) listener + callDBus(...) from JS. Uses DBusGMainLoop(set_as_default=True) plus scoped GLib.MainContext.default().iteration() polling — no global mainloop.
    • JS templates were written from scratch from KDE Plasma 6 scripting docs; kdotool (GPL-3.0) was not borrowed
  • src/kwin_mcp/server.py: 5 new @mcp.tool() entries (window_list, active_window, window_geometry, window_activate, window_close). Tool count: 30 → 35.
  • src/kwin_mcp/core.py: engine methods + _check_window_mutation_allowed (blocks mutation in live sessions)
  • tests/integration/test_window_backend.py: 7 tests (kcalc lifecycle + live mode gate + invalid id)
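
The tempfile-plus-UUID step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `window.py`: the template body, the `/Result` object path, and the `write_script` helper are all assumptions made for the example, and only `callDBus` is taken from the KWin scripting API.

```python
import tempfile
import uuid
from pathlib import Path
from string import Template

# Hypothetical stand-in for one of the PR's JS templates; the real
# JS_LIST_WINDOWS body is not shown in this description.
JS_TEMPLATE = Template(
    'callDBus("$bus_name", "/Result", "$iface", "Result", "payload");\n'
)


def write_script(bus_name: str, iface: str) -> tuple[Path, str]:
    """Render the template to a temp .js file and mint a per-call script name."""
    # A fresh UUID per call keeps concurrent or repeated loads from colliding.
    script_name = f"kwin_mcp_{uuid.uuid4().hex}"
    source = JS_TEMPLATE.substitute(bus_name=bus_name, iface=iface)
    with tempfile.NamedTemporaryFile(
        "w", suffix=".js", prefix=script_name + "_", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(source)
    # delete=False keeps the file on disk so KWin can read it after we close it.
    return Path(handle.name), script_name
```

The returned path and name are what a caller would hand to `loadScript(path, name)`; the caller is also responsible for unlinking the file once the script has run.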

Docs sync

  • README.md: "30 MCP tools" → "35 MCP tools" (×2), "(30 tools)" → "(35 tools)" (arch diagram), new tool reference table, updated arch description
  • integrations/claude-code/skills/kwin-desktop-automation/SKILL.md (source) + opencode mirror (auto-synced via scripts/sync_plugin_version.py)
  • CHANGELOG.md: entry under [Unreleased]
  • scripts/check_docs_seo.py: TOOL_COUNT_CANONICAL = 35
  • .claude/positioning.yml: tool_count: 35, tool_count_canonical: 35

Safety

In live sessions, window_activate and window_close return "Error: window mutation not supported in live session (v1 safety). Use virtual session.". window_list / active_window / window_geometry work in both modes.
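
A rough sketch of how such a gate might look, assuming the check is an `isinstance` test against the live-session class. The two session classes here are placeholders, and the `"closed ..."` return value is invented for the example; only the error string comes from this PR.

```python
from dataclasses import dataclass
from typing import Optional

LIVE_MUTATION_ERROR = (
    "Error: window mutation not supported in live session (v1 safety). "
    "Use virtual session."
)


@dataclass
class LiveSession:
    """Placeholder for the real live-session class."""


@dataclass
class VirtualSession:
    """Placeholder for the real virtual-session class."""


def check_window_mutation_allowed(session: object) -> Optional[str]:
    """Return the error string for live sessions, or None when mutation may proceed."""
    if isinstance(session, LiveSession):
        return LIVE_MUTATION_ERROR
    return None


def window_close(session: object, window_id: str) -> str:
    """Mutating tool: consult the gate before touching the window."""
    error = check_window_mutation_allowed(session)
    if error is not None:
        return error
    # The real tool would ask the KWin backend to close the window here;
    # this return value is hypothetical.
    return f"closed {window_id}"
```

Returning the error as a string rather than raising keeps the tool's result shape the same in both modes, which matches the behaviour described above.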

Verify

uv run pytest -m kwin tests/integration/test_window_backend.py   # → 7 passed
python3 scripts/check_docs_seo.py                                # → PASS

Series: Stacked on top of PR 4. Base = overhaul/pr4-screenshot. Final PR in the series.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
isac322 changed the base branch from overhaul/pr4-screenshot to launch/backend-overhaul May 5, 2026 13:34
isac322 force-pushed the overhaul/pr5-window branch from 698b982 to 6376973 May 5, 2026 13:44
github-actions Bot commented May 5, 2026

📝 Docs & SEO Review

Source files changed in this PR:

integrations/claude-code/.claude-plugin/plugin.json
integrations/claude-code/skills/kwin-desktop-automation/SKILL.md
integrations/opencode/plugin/skill/kwin-desktop-automation/SKILL.md
pyproject.toml
src/kwin_mcp/accessibility_worker.py
src/kwin_mcp/core.py
src/kwin_mcp/dbus_args.py
src/kwin_mcp/screenshot.py
src/kwin_mcp/server.py
src/kwin_mcp/session.py
src/kwin_mcp/window.py

Consistency check results:

✅  All documentation/plugin SEO checks passed.

Run @docs-seo in Claude Code to perform a full documentation review.

isac322 force-pushed the overhaul/pr5-window branch 2 times, most recently from 240a2e0 to a346962 May 5, 2026 14:10
isac322 added 5 commits May 5, 2026 23:12
- Add dbus_args.py: full dbus-send recursive-descent parser (12 basic
  types + array/dict/variant containers) returning dbus-python types.
- Add typed-JSON args dispatcher: args list now accepts both legacy
  'type:value' strings and {type, value} dicts, mixed in one call.
  Schema widened (additive): list[str] -> list[str | dict].
- core.py:dbus_call now uses dbus.bus.BusConnection + Interface in-process
  (no subprocess). Errors surface as 'D-Bus error: <name>: <msg>'.
- _format_dbus_result: void -> empty, primitive -> bare, container -> JSON.
- 80 unit tests for the parser; 6 integration tests against virtual KWin.
- docs/design/dbus-call-call-sites.md documents the public-tool contract.

Pre-commit: ruff/ty clean, ci_guards pass, 87 tests green.
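
As an illustration of the mixed argument forms and the result-formatting rule in this commit, here is a minimal sketch. The function names `normalize_arg` and `format_result` are hypothetical, and the sketch stops short of converting values into dbus-python types, which the real parser in `dbus_args.py` does.

```python
import json
from typing import Any, Union

Arg = Union[str, dict]


def normalize_arg(arg: Arg) -> tuple[str, Any]:
    """Split one argument into (type_name, raw_value) without converting it."""
    if isinstance(arg, str):
        # Legacy form: only the first colon separates type from value.
        type_name, sep, value = arg.partition(":")
        if not sep:
            raise ValueError(f"expected 'type:value', got {arg!r}")
        return type_name, value
    if isinstance(arg, dict) and {"type", "value"} <= arg.keys():
        # Typed-JSON form: {"type": ..., "value": ...}.
        return arg["type"], arg["value"]
    raise ValueError(f"unsupported argument: {arg!r}")


def format_result(result: Any) -> str:
    """void -> empty string, primitive -> bare text, container -> JSON."""
    if result is None:
        return ""
    if isinstance(result, (list, tuple, dict)):
        return json.dumps(result)
    return str(result)
```

Because both forms normalize to the same pair, a single call can mix them, for example `["string:hello", {"type": "int32", "value": 7}]`.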
…l worker

Replaces per-call `subprocess.run([sys.executable, '-m', 'kwin_mcp.accessibility', ...])`
with a long-lived spawn-context multiprocessing.Pool(processes=1) worker.

Key changes:
- New module `kwin_mcp.accessibility_worker` containing picklable top-level
  callables `_init_atspi_worker` and `do_atspi_op`. Module is NEVER imported
  at module-top by core.py (CI guard 3 enforces this); deferred imports inside
  `_ensure_atspi_pool` and `_run_atspi` only.
- Worker init validates `Atspi.get_desktop(0).get_child_count() >= 0` against
  the bus address and raises RuntimeError with the bus address on failure.
- Pool teardown protocol survives hung workers within 7s:
  close → join 5s → terminate → join 2s → SIGKILL → join 1s.
- IPC error handling catches `(EOFError, BrokenPipeError, ConnectionResetError)`
  and `OSError errno in (32, 104)` for transparent recovery; one retry, then
  re-raise.
- Defensive `__del__` teardown swallows exceptions for partial-init safety.
- AGENT_EXEC_APPROVAL=1 prefix used for all uv invocations during this change
  per session permission.
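
The "one retry, then re-raise" rule for IPC errors can be sketched as below. This is an assumption-laden illustration: `call_with_one_retry`, `is_recoverable`, and the `rebuild` callback are hypothetical names, and the real code rebuilds a `multiprocessing.Pool` worker rather than calling an arbitrary callback.

```python
import errno
from typing import Callable, TypeVar

T = TypeVar("T")

# EPIPE is 32 and ECONNRESET is 104 on Linux, matching the errno values above.
_RECOVERABLE_ERRNOS = {errno.EPIPE, errno.ECONNRESET}


def is_recoverable(exc: BaseException) -> bool:
    """True for the broken-pipe style failures that justify a worker rebuild."""
    if isinstance(exc, (EOFError, BrokenPipeError, ConnectionResetError)):
        return True
    return isinstance(exc, OSError) and exc.errno in _RECOVERABLE_ERRNOS


def call_with_one_retry(op: Callable[[], T], rebuild: Callable[[], None]) -> T:
    """Run op; on a recoverable IPC error, rebuild the worker and try once more."""
    try:
        return op()
    except Exception as exc:
        if not is_recoverable(exc):
            raise
    rebuild()
    # No try/except here: a second failure propagates to the caller.
    return op()
```

Limiting recovery to a single retry keeps a persistently broken worker from turning into a silent loop.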

Performance:
- Cold start ≤ 1.5s (verified)
- Warm calls < 200ms (verified, plan-mandated threshold)
- vs ~700ms per subprocess on the previous design

Tests:
- tests/integration/test_atspi_pool.py — 6 tests, all passing under
  `AGENT_EXEC_APPROVAL=1 uv run pytest -m kwin`:
  cold start <1.5s; warm call <200ms; recovers from external SIGKILL;
  init failure surfaces RuntimeError with bus address; teardown within 7s
  even with SIGSTOPped worker; no zombie processes after session_stop.

Plan tasks: 11, 12, 13, 14, 15.
…ckend

PR 4 lands the screenshot-backend overhaul as a single atomic unit.

What changed:
- New `_probe_screenshot_capability(dbus_address, wayland_socket)` runs at
  session_start. Tries a 1x1 `CaptureArea` via in-process ScreenShot2 D-Bus,
  falls back to a real `spectacle` capture, returns one of:
  "screenshot2_dbus" | "spectacle_cli" | "unavailable".
  10s wall-clock cap; bus address GUID redacted in logs.
- `SessionInfo.screenshot_backend: str = "unavailable"` carries the probe
  result; `Session.start` and `LiveSession.__init__` populate it via a
  helper that swallows probe exceptions and warns on failure.
- `capture_screenshot_to_file` and `capture_frame_burst` now accept
  `screenshot_backend` as an explicit keyword argument (no global state).
  Dispatch is a `match` on the three values; "unavailable" raises
  `RuntimeError("No screenshot backend available; install spectacle or
  fix KWin EglBackend")`.
- Two `core.py` call sites pass `screenshot_backend=info.screenshot_backend`.

Tests:
- New `tests/integration/test_screenshot_backends.py` (4 tests, marked
  `@pytest.mark.kwin`):
  1. probe returns one of the three documented strings
  2. screenshot writes a real PNG (signature byte-checked)
  3. burst capture preserves backend invariant + valid PNGs per frame
  4. forcing backend="unavailable" raises with the exact message
  All 4 pass against a real virtual KWin session.

Plan tasks: 16, 17, 18, 19.
PR 5 - Wave 5 of kwin-mcp backend overhaul.

New module src/kwin_mcp/window.py: KWinScriptingBackend with 5 JS templates
(JS_LIST_WINDOWS, JS_ACTIVE_WINDOW, JS_GEOMETRY_BY_ID, JS_ACTIVATE_BY_ID,
JS_CLOSE_BY_ID) loaded via tempfile + KWin.Scripting.loadScript (because KWin
6.6.4 lacks loadScriptFromText). callDBus result-passing via per-call unique
bus name + dbus.service.Object listener with scoped GLib MainContext iteration
(no perpetual mainloop). JS templates rewritten from KDE Plasma 6 scripting
docs (no kdotool GPL-3.0 copy).
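
The scoped-iteration idea (pump the main context only while waiting for one result, with a deadline) can be sketched without GLib as follows. `wait_for_result` and its parameters are hypothetical; the `iterate` callable is assumed to stand in for a non-blocking `GLib.MainContext.default().iteration(False)` call.

```python
import time
from typing import Callable, Optional


def wait_for_result(
    get_result: Callable[[], Optional[str]],
    iterate: Callable[[], object],
    timeout_s: float = 5.0,
    idle_sleep_s: float = 0.01,
) -> str:
    """Pump events until the listener has stored a result or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = get_result()
        if result is not None:
            return result
        # In the real backend this would be a non-blocking
        # GLib.MainContext.default().iteration(False) call (assumption).
        if not iterate():
            # Nothing was dispatched; back off briefly instead of spinning.
            time.sleep(idle_sleep_s)
    raise TimeoutError(f"no result within {timeout_s}s")
```

Because the loop exits as soon as the result arrives or the deadline passes, nothing keeps running between tool calls.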

5 new MCP tools in server.py + corresponding AutomationEngine methods:
window_list, active_window, window_geometry, window_activate, window_close.
Tool count: 30 -> 35.

Live-mode safety gate: window_activate/window_close return error string when
session is LiveSession instance (read-only ops unchanged in both modes).

Tests: tests/integration/test_window_backend.py with 7 tests (kcalc launch,
list, active, geometry, activate, close, live-mode mock, invalid id) -- all
pass against virtual KWin session.

Docs sync: README/SKILL.md/check_docs_seo.py/positioning.yml all updated to
35 tools; opencode mirror auto-generated via sync_plugin_version.py.
isac322 force-pushed the overhaul/pr5-window branch from a346962 to ec0b4c5 May 5, 2026 14:12
Base automatically changed from launch/backend-overhaul to main May 29, 2026 02:24