Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
191 changes: 191 additions & 0 deletions docs/superpowers/specs/2026-07-03-targeted-feedback-design.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,191 @@
# Targeted feedback: pin console comments to panes and elements

**Date:** 2026-07-03
**Status:** Draft — awaiting review

## Problem

The experimental agent↔human console (issue #108) lets the human send free-text messages to the
listening agent, but a message is scoped to the whole board (`project/agent`). "The p95 line is
mislabeled" forces the agent to guess *which* chart, node, or section the human means. The human
should be able to pin a comment to a specific pane or UI element, and the agent should receive
enough structured context to resolve exactly what was meant — without a screenshot and without
guessing.

## Goals

- Pin a console message to a pane, flow node, component node (`props.id`), checklist item, or
markdown heading.
- Deliver the pin to the agent through the existing inbox channel, in terms the agent already
thinks in (spec ids, pane indexes, labels) — not DOM paths.
- Capture enough human-readable context at pin time (labels, pane title, text excerpt, board
version) that the agent understands the reference even if it can't or doesn't re-pull the board.
- Zero change to the security model: pinning rides the existing URL-gated `POST /inbox`.

## Non-goals (explicitly out of scope for v1)

- Persistent annotations or pin badges rendered on the board. The inbox stays a transient
conversation ring; a durable annotation layer is future work.
- Comment threading, replies, or resolution state.
- Vega-Lite mark-level, mermaid element-level, or calltree node targeting (no stable ids today).
- Flow *edge* targeting (thin hit targets, selector reliability unverified) — nodes only in v1.
- Multi-element selection or region/box selection.

## Approaches considered

1. **Extend the inbox event with a structured `target` + client-side context snapshot**
(**chosen**). Rides the entire existing console machinery — SSE echo, read receipts, unread
nudge, `termchart inbox` polling. Small server delta (validate one new optional field). The
agent gets spec-level identifiers it can resolve against `termchart pull`.
2. **DOM-path anchoring** (capture CSS path + bounding box for any element). Pins anything
immediately, but anchors are fragile across re-renders and the agent reasons in spec terms,
not DOM terms — poor fit for "the agent fully understands".
3. **Persistent annotation layer** (new store, pin badges, CRUD, new CLI verb). Durable and
threaded, but a large new surface for what is fundamentally a live feedback loop. YAGNI.

Approach 1 wins: it extends a proven channel instead of building a parallel one. Approach 2's one
real advantage (pin anything) is folded in as a fallback — an unaddressable click resolves to its
enclosing pane plus a text excerpt of what was clicked.

## Design

### 1. Wire format: `InboxEvent.target`

`InboxEvent` (packages/viewer/src/state.ts) gains one optional field, present only on `message`
events created with a pin:

```ts
/** Where a console message is pinned. Captured client-side at pin time; agent-facing. */
export interface InboxTarget {
kind: "pane" | "flow-node" | "component" | "checklist-item" | "heading";
/** Pane indexes from the board root into nested `panes` arrays; [] on a non-panes board. */
panePath?: number[];
paneTitle?: string;
/** Stable spec id: flow node id, component props.id, "checklistId/itemId", heading slug. */
id?: string;
/** Human-readable label captured at pin time (node label, heading text, item text). */
label?: string;
/** Rendered text of/around the clicked element, for context when `id` is absent or stale. */
excerpt?: string;
/** Board history seq at pin time, so the agent can tell the board has since changed.
* Reserved in the wire format; the v1 viewer omits it (the board version isn't tracked
* client-side) — `label`/`excerpt`/`paneTitle` carry the staleness resilience instead. */
seq?: number;
}
```

Resolution contract for the agent: `panePath` indexes the (possibly nested) `panes` arrays of the
scope's current spec; `id` resolves within the pane it names (flow `nodes[].id`, component
`props.id`, checklist `<checklistId>/<itemId>`, markdown heading slug). `kind: "pane"` with an
`excerpt` means "this pane, specifically the part reading `<excerpt>`" — the fallback for elements
with no stable id.

### 2. Server: validation and caps (server.ts)

`POST /w/<wsid>/inbox` accepts an optional `target` object on `kind: "message"` bodies. Validation
mirrors the existing inbox caps (reject with 400, never truncate silently):

- `kind` must be one of the five values above; unknown keys stripped.
- `panePath`: array of ≤8 non-negative integers.
- `id`, `label`, `paneTitle`: strings ≤200 chars (same cap as `MAX_INBOX_REF`).
- `excerpt`: string ≤400 chars.
- `seq`: number.

The validated target is stored on the event and flows unchanged through the ring, SSE `inbox`
events, and `GET /inbox`. No new endpoints; read receipts, long-poll, and the unread-nudge header
are untouched.

### 3. Viewer client: gesture-triggered inline composer

No mode toggle and no extra buttons — the gesture itself is the entry point. Three equivalent
triggers, all resolving the element under the pointer and opening an **inline composer popover**
anchored to it:

- **Double-click / double-tap** on an element (primary, desktop + touch).
- **Long-press** (~500 ms, primary on touch where double-tap may zoom).
- **`c` while hovering** (desktop accelerator; ignored when focus is in an input).

**Inline composer** (`client/comment-target.ts`, new): a small floating popover at the element —
target chip (e.g. `📌 pane "Latency" › node api-gw`) + a one-line textarea. `Enter` sends via the
existing `postInbox()` with `{kind:"message", text, target}`; `Esc` or clicking elsewhere cancels.
While open, the resolved element gets an outline highlight (reusing the pulse styling approach).
The console panel does **not** need to be open; if it is, the sent message appears in the
transcript with the same chip. Listeners are installed once, document-level, capture-phase;
double-click on an unresolvable area (outside any pane/board) does nothing.

**Text-selection caveat**: double-click in prose (markdown/text panes) natively selects a word.
The composer opens anyway and collapses the accidental selection; users who are double-clicking
*to select/copy* can `Esc` (instant, no state), or use long-press/`c`-hover in text-heavy panes.
Acceptable for a beta-flagged feature; revisit if it proves annoying in practice.

Resolution walks the gesture's target element upward, most specific first:

1. `.react-flow__node[data-id]` → `flow-node` (label = trimmed node text).
2. `[data-item]` inside a checklist → `checklist-item` (id = `checklistId/itemId`).
3. Element with an `id` attribute inside a component renderer root → `component`
(component nodes already render `props.id` as the HTML id; chrome ids are excluded by
requiring the ancestor to be inside a renderer mount).
4. `h1–h4[id]` inside a markdown body → `heading` (requires the markdown renderer change below).
5. `.pane` → `pane`, with `excerpt` = trimmed text of the originally clicked element (≤400).

`panePath` and `paneTitle` are computed for every kind by walking the DOM up through nested pane
containers (index among sibling panes at each level). On a non-panes board, `panePath` is `[]`.

**Flow note**: React Flow selection stays disabled (`elementsSelectable: false`); the gesture
listener is a plain DOM capture handler and never involves React Flow's selection state.

### 4. Markdown heading ids (`client/renderers/markdown.ts`)

The markdown renderer adds slug `id`s to `h1–h4` (lowercase, alphanumeric + hyphens, `-2` suffix
on duplicates — GitHub-style). This is the one renderer change and is independently useful
(deep-linkable reports).

### 5. Agent side: CLI rendering + skill docs

`termchart inbox` (packages/cli/src/inbox.ts) renders the target as an indented context line:

```
[3] message: "the p95 line is mislabeled"
📌 pane[1] "Latency" › flow-node api-gw ("API Gateway") · board v37
```

`--json` passes `target` through verbatim.

Docs updated so a listening agent knows the contract:
- `plugin/skills/diagram-recipes/status-state.md` — the inbox HTTP/CLI contract section gains the
`target` shape and the resolution rules above.
- `plugin/skills/inbox-watch/SKILL.md` — "when an event carries a target, resolve `id`/`panePath`
against `termchart pull` for that scope before acting; if `target.seq` is behind the current
board version, re-check that the element still exists."

### 6. Error handling

- Stale pins: the board may change between pin and read. The agent detects this via `target.seq`
vs current version; `label`/`excerpt`/`paneTitle` keep the comment intelligible even when the
`id` no longer resolves. The server does **not** validate targets against the live spec —
a pin is testimony about what the human saw, not a reference the server guarantees.
- Unresolvable gesture (outside any pane/board): ignored — no popover, no highlight.
- Oversized/malformed target: 400 from the server; the console surfaces the same send-failure
state it already has.

### 7. Testing

- Server unit tests: `POST /inbox` with valid target (round-trips through `GET /inbox` and SSE),
rejects bad kind / oversized fields / deep panePath; `message` without target unchanged.
- Client unit test (if the harness allows): target resolution from a synthetic DOM (flow node,
nested pane path, heading, fallback-to-pane).
- E2E: double-click a flow node in a nested pane, type into the inline composer, send, assert
`termchart inbox --json` returns the expected target and the console transcript (when open)
renders the chip.

### 8. Rollout

Independent release channels all touched: viewer Cloud Run deploy (server + client), npm CLI
publish (inbox target rendering), plugin marketplace bump (skill doc updates). Feature stays
behind the existing console-beta toggle — no default-experience change.

## Future work (recorded, not planned)

- Durable annotations with on-board pin badges and resolution state.
- Flow edge and vega mark targeting.
- Agent-initiated pins ("look here") via `status`/`focus` carrying a target.
17 changes: 15 additions & 2 deletions packages/cli/src/inbox.ts
Original file line number Diff line number Diff line change
Expand Up @@ -37,12 +37,25 @@ function parse(argv: string[]): Args {
return a;
}

interface InboxEvent { seq: number; ts: number; kind: "message" | "action"; text?: string; ref?: string; }
interface InboxTarget { kind: string; panePath?: number[]; paneTitle?: string; id?: string; label?: string; excerpt?: string; seq?: number; }
interface InboxEvent { seq: number; ts: number; kind: "message" | "action"; text?: string; ref?: string; target?: InboxTarget; }
interface InboxResponse { events: InboxEvent[]; cursor: number; }

/** One indented context line for a pinned message: where the human pinned it, in spec terms. */
function fmtTarget(t: InboxTarget): string {
const parts: string[] = [];
if (t.panePath?.length || t.paneTitle) parts.push(`pane[${(t.panePath ?? []).join(",")}]${t.paneTitle ? ` "${t.paneTitle}"` : ""}`);
if (t.kind !== "pane") parts.push(`${t.kind}${t.id ? ` ${t.id}` : ""}${t.label ? ` ("${t.label}")` : ""}`);
let line = ` 📌 ${parts.join(" › ")}`;
if (t.excerpt) line += ` · "${t.excerpt}"`;
if (t.seq !== undefined) line += ` · board v${t.seq}`;
return line;
}

function fmt(e: InboxEvent): string {
if (e.kind === "action") return `[${e.seq}] action ${e.ref ?? ""}`;
return `[${e.seq}] message ${e.text ?? ""}`;
const head = `[${e.seq}] message ${e.text ?? ""}`;
return e.target ? `${head}\n${fmtTarget(e.target)}` : head;
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));
Expand Down
34 changes: 34 additions & 0 deletions packages/cli/test/inbox.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,40 @@ describe("inbox", () => {
expect(errs.join("")).toContain("cursor: 2");
});

it("renders a pinned target as an indented context line (targeted feedback)", async () => {
const { url } = await stub({
events: [
{ seq: 3, ts: 3, kind: "message", text: "the p95 line is mislabeled", target: { kind: "flow-node", panePath: [1, 0], paneTitle: "Latency", id: "api-gw", label: "API Gateway", seq: 37 } },
{ seq: 4, ts: 4, kind: "message", text: "tighten this section", target: { kind: "pane", panePath: [0], excerpt: "Rollout risks: none identified" } },
{ seq: 5, ts: 5, kind: "message", text: "untargeted, no pin line" },
],
cursor: 5,
});
const out: string[] = [];
const so = vi.spyOn(process.stdout, "write").mockImplementation((s) => (out.push(String(s)), true));
const se = vi.spyOn(process.stderr, "write").mockImplementation(() => true);
const code = await inbox(["--project", "p", "--agent", "a"], { env: env(url) });
so.mockRestore(); se.mockRestore();
expect(code).toBe(0);
const text = out.join("");
expect(text).toContain('📌 pane[1,0] "Latency" › flow-node api-gw ("API Gateway") · board v37');
expect(text).toContain('📌 pane[0] · "Rollout risks: none identified"');
// Exactly two pin lines — the untargeted message must not grow one.
expect(text.match(/📌/g)).toHaveLength(2);
});

it("--json passes a target through verbatim", async () => {
const target = { kind: "component", id: "risk-card", label: "Risks", seq: 2 };
const { url } = await stub({ events: [{ seq: 1, ts: 1, kind: "message", text: "hi", target }], cursor: 1 });
const out: string[] = [];
const so = vi.spyOn(process.stdout, "write").mockImplementation((s) => (out.push(String(s)), true));
const se = vi.spyOn(process.stderr, "write").mockImplementation(() => true);
const code = await inbox(["--project", "p", "--agent", "a", "--json"], { env: env(url) });
so.mockRestore(); se.mockRestore();
expect(code).toBe(0);
expect(JSON.parse(out.join("")).events[0].target).toEqual(target);
});

it("--json prints the raw { events, cursor }", async () => {
const { url } = await stub(resp);
const out: string[] = [];
Expand Down
Empty file modified packages/viewer/bin/termchart-viewer.mjs
100644 → 100755
Empty file.
38 changes: 30 additions & 8 deletions packages/viewer/e2e/rich.e2e.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -335,8 +335,8 @@ try {
}));
ok("agent-checked items show as checked from the spec",
await p.evaluate(() => {
// The Mantine Checkbox forwards data-item onto its <input>, so the attr IS the input.
const build = document.querySelector('#diagram [data-id="ci"] input[data-item="build"]');
// data-item sits on the Checkbox ROOT row (wrapperProps); the input is its descendant.
const build = document.querySelector('#diagram [data-id="ci"] [data-item="build"] input');
return !!build && build.checked;
}));

Expand All @@ -346,15 +346,15 @@ try {
await p2.waitForFunction(() => document.querySelector("#status .live-dot"), null, { timeout: 10000 });
await p2.locator(`#scopes button[title="e2e/checklist"]`).click();
await p2.waitForFunction(
() => document.querySelector('#diagram [data-id="approval"] input[data-item="changelog"]'),
() => document.querySelector('#diagram [data-id="approval"] [data-item="changelog"] input'),
null,
{ timeout: 20000 },
);

// USER toggles an editable item on client 1 → it ticks immediately (optimistic).
await p.locator('#diagram [data-id="approval"] input[data-item="changelog"]').click();
await p.locator('#diagram [data-id="approval"] [data-item="changelog"] input').click();
await p.waitForFunction(
() => document.querySelector('#diagram [data-id="approval"] input[data-item="changelog"]').checked,
() => document.querySelector('#diagram [data-id="approval"] [data-item="changelog"] input').checked,
null,
{ timeout: 5000 },
);
Expand All @@ -371,7 +371,7 @@ try {

// …and the SECOND client sees it via the SSE update.
await p2.waitForFunction(
() => document.querySelector('#diagram [data-id="approval"] input[data-item="changelog"]').checked,
() => document.querySelector('#diagram [data-id="approval"] [data-item="changelog"] input').checked,
null,
{ timeout: 8000 },
);
Expand All @@ -380,13 +380,13 @@ try {
// AGENT side: a setChecked patch ticks a read-only (agent-driven) item live on both clients.
await patchCheck("checklist", "ci", "e2e", true);
await p.waitForFunction(
() => document.querySelector('#diagram [data-id="ci"] input[data-item="e2e"]').checked,
() => document.querySelector('#diagram [data-id="ci"] [data-item="e2e"] input').checked,
null,
{ timeout: 8000 },
);
ok("agent setChecked patch ticks a read-only item live (client 1)", true);
await p2.waitForFunction(
() => document.querySelector('#diagram [data-id="ci"] input[data-item="e2e"]').checked,
() => document.querySelector('#diagram [data-id="ci"] [data-item="e2e"] input').checked,
null,
{ timeout: 8000 },
);
Expand Down Expand Up @@ -456,6 +456,28 @@ try {
ok("clicking a chip posts an action to the inbox (agent reads the ref)",
events.some((e) => e.kind === "action" && e.ref === "rerun"));
}
// ---- targeted feedback: pin a comment to an element (dblclick → inline composer) ------------
// While the console flag is on: double-click a checklist item → the inline composer opens with
// the target resolved in spec terms; Enter delivers it to the inbox with a structured `target`.
await p.locator("#diagram .tc-checklist [data-item]").first().dblclick();
await p.waitForFunction(() => document.querySelector(".tc-pin-popover textarea"), null, { timeout: 5000 });
ok("dblclick on a checklist item opens the pinned composer", true);
ok("composer chip names the element in spec terms", await p.evaluate(() => {
const chip = document.querySelector(".tc-pin-popover .tc-pin-chip");
return !!chip && /checklist-item/.test(chip.textContent);
}));
await p.locator(".tc-pin-popover textarea").fill("this step is obsolete");
await p.locator(".tc-pin-popover textarea").press("Enter");
await p.waitForFunction(() => !document.querySelector(".tc-pin-popover"), null, { timeout: 5000 });
ok("Enter sends the pinned comment and closes the composer", true);
await p.waitForTimeout(300);
{
const events = await readInbox("checklist");
const ev = events.find((e) => e.kind === "message" && e.text === "this step is obsolete");
ok("pinned comment lands in the inbox with its structured target (agent-resolvable)",
!!ev && !!ev.target && ev.target.kind === "checklist-item" && typeof ev.target.id === "string" && ev.target.id.includes("/"));
}

// Toggle it back off so the rest of the run sees the default (off) experience.
await p.locator("#console-toggle").click();
await p.waitForFunction(() => {
Expand Down
Loading
Loading