feat: Predictive Local Echo (mosh-style)#266
Conversation
There was a problem hiding this comment.
Pull request overview
Adds mosh-style predictive local echo to the web terminal by painting “tentative” glyphs immediately (underline/dim) and reconciling/rolling back against the real inbound echo stream, reducing perceived typing latency under load and setting groundwork for future remote rk serve.
Changes:
- Introduces a pure, DOM-free predictive-echo engine (queue + confidence state machine + SRTT estimator + VT apply/confirm/rollback builders) with unit tests.
- Wires the engine into
TerminalClientoutbound input and inbound flush (string + binary frames), plus DEV-gatedwindow.__rkPredictionsfor e2e measurement. - Extends echo-latency e2e benchmark/spec docs and updates UI patterns memory docs.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
app/frontend/src/components/terminal/predictive-echo.ts |
New pure prediction engine (VT builders, queue/reconcile, confidence + RTT estimator). |
app/frontend/src/components/terminal/predictive-echo.test.ts |
Unit tests for the pure engine (builders, state machine, queue behavior, gates). |
app/frontend/src/components/terminal/predictive-echo-binding.ts |
Thin xterm binding implementing the engine IO seam via public xterm APIs. |
app/frontend/src/components/terminal-client.tsx |
Integrates prediction into onData + inbound flush paths; adds __rkPredictions handle. |
app/frontend/src/components/terminal-client.test.tsx |
Integration/regression tests for window switch reset, reconnect reset, binary reconciliation. |
app/frontend/tests/e2e/echo-latency.spec.ts |
Adds perceived-echo metric + alternate-screen suppression scenario. |
app/frontend/tests/e2e/echo-latency.spec.md |
Documents the new perceived-echo and alternate-screen tests. |
docs/memory/run-kit/ui-patterns.md |
Documents the predictive-local-echo pattern, constraints, and integration points. |
docs/memory/run-kit/index.md |
Updates memory index entry to include predictive local echo. |
fab/changes/260613-dxqh-predictive-local-echo/* |
Plan/intake/status/history artifacts for the change. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const lastResetWindowIdRef = useRef(windowId); | ||
| useEffect(() => { | ||
| if (lastResetWindowIdRef.current === windowId) return; // initial mount / no change | ||
| lastResetWindowIdRef.current = windowId; | ||
| predictionRef.current?.reset(); | ||
| }, [windowId]); |
There was a problem hiding this comment.
Fixed — the reset effect now re-keys __rkPredictions on a same-session windowId switch: it unregisters the old id and re-registers the surviving engine under the current windowId (when an engine exists). (d33009e)
| prediction.reset(); | ||
| if (predictionRef.current === prediction) predictionRef.current = null; | ||
| unregisterTestPrediction(predictionWindowId); | ||
| if (wsRef.current) { |
There was a problem hiding this comment.
Fixed — connection-effect teardown now unregisters BOTH the connect-time predictionWindowId and the current windowIdRef.current (double-delete is a safe no-op), so the re-keyed entry can't leak after a same-session switch. (d33009e)
| // Shared decoder for reconciling BINARY inbound frames against the predictive- | ||
| // echo engine (allocated once, mirroring textEncoder). The relay frames ALL PTY | ||
| // output as binary (websocket.BinaryMessage), so every server echo arrives on | ||
| // the binary path — reconciliation must decode these bytes to feed the engine | ||
| // (NMF-1). Decoding with `{ stream: true }` carries any trailing partial UTF-8 | ||
| // sequence across chunk boundaries so a multi-byte glyph split between two relay | ||
| // frames decodes correctly rather than emitting a replacement char. | ||
| const textDecoder = new TextDecoder(); |
There was a problem hiding this comment.
Deferred — this is a known, accepted v1 limitation tracked in our memory (shared module-level TextDecoder with { stream: true } across board panes). It is v1-safe: matching is ASCII-only, so the worst case is a spurious divergence → rollback → PASSIVE → authoritative server repaint, never a false confirm. Future fix: per-connection decoder or drop { stream: true }.
| private enqueue(glyph: string, paint: boolean): void { | ||
| const pos = this.shadowCursor(); | ||
| if (!pos) return; // can't establish a position -> skip safely | ||
| const snapshot = this.io.readCell(pos.absRow, pos.col); | ||
| if (!snapshot) return; // can't read the cell -> skip safely | ||
|
|
||
| if (paint) { | ||
| this.io.write(buildApplySequence(snapshot, glyph, this.io.baseY())); | ||
| this.painted++; | ||
| } | ||
| this.queue.push({ | ||
| expectedEcho: glyph, | ||
| snapshot, | ||
| glyph, | ||
| enqueuedAt: this.io.now(), | ||
| painted: paint, | ||
| }); | ||
| // Advance the shadow cursor past the painted cell. Wrap is delegated to | ||
| // xterm (it reflows the real buffer); the shadow simply tracks the next | ||
| // column so the following prediction in this burst lands one cell over. | ||
| this.shadow = { absRow: pos.absRow, col: pos.col + 1 }; | ||
| } |
There was a problem hiding this comment.
Deferred — this is a known, accepted v1 limitation tracked in our memory (unbounded PASSIVE observe-only queue in a non-echoing pane, e.g. a password prompt). It is correctness-safe — nothing is painted; the only effect is unbounded growth until a reset and hasPending() staying true (defeating the decode gate in that one pane). Future fix: cap/age-out observe-only predictions.
| /** Snapshot of engine state for the DEV-gated `__rkPredictions` test handle. */ | ||
| export interface PredictionDebugState { | ||
| state: ConfidenceState; | ||
| /** Predictions painted but not yet confirmed. */ | ||
| unconfirmed: number; | ||
| /** Cumulative mispredictions (divergence events) since construction. */ | ||
| mispredictions: number; | ||
| /** Cumulative predictions painted (for the perceived-echo metric). */ | ||
| painted: number; | ||
| confirmWindowMs: number; | ||
| } |
There was a problem hiding this comment.
Fixed — adjusted the unconfirmed doc to match the behavior: it reports queue.length (queued-not-yet-reconciled), which equals painted-unconfirmed while ACTIVE and additionally includes observe-only PASSIVE bootstrap entries. Kept it as a single field rather than splitting, since the debug handle's shape is consumed by the e2e harness and the doc was the ambiguity. (d33009e)
| @@ -505,6 +592,10 @@ test.describe("Echo latency benchmark", () => { | |||
| if (render.length) console.log(summarize("└─ render (recv→glyph)", render)); | |||
| if (load.length) console.log(summarize("under-load (ticks)", load)); | |||
| if (base.length) console.log(summarize("baseline (tmux only)", base)); | |||
| if (perceivedIdle.length) | |||
| console.log(summarize("perceived idle (pred.)", perceivedIdle)); | |||
| if (perceivedLoad.length) | |||
| console.log(summarize("perceived load (pred.)", perceivedLoad)); | |||
| if (full.length && base.length) { | |||
There was a problem hiding this comment.
Fixed — removed the dead perceived-idle label from the benchmark report (the pick, the summary row, and the histogram row) plus its unused resetSamples call. The test only ever records perceived-load, so the output now matches what is measured. (d33009e)
- Re-key DEV __rkPredictions registry on same-session windowId switch (drop old id, register surviving engine under current id); unregister both connect-time and current windowId on connection teardown so no stale/leaked handle survives. - Clarify PredictionDebugState.unconfirmed doc to match impl (reports queue.length: painted while ACTIVE, includes observe-only entries while PASSIVE). - Remove dead perceived-idle label from echo-latency benchmark report (only perceived-load is ever recorded).
8efb1af to
f2d068a
Compare
Meta
Pipeline: intake ✓ → apply ✓ → review ✓ → hydrate ✓ → ship → review-pr
Impact: +2191/−14 code (excluding
fab/,docs/) · +2751/−16 totalSummary
Keystroke→echo latency in the web terminal is bimodal under load — PR #255 proved the slow mode (~22ms vs. ~4ms idle) originates upstream of the client flush, in tmux pacing. This PR adds mosh-style predictive local echo: on a predictable keystroke, the glyph is painted immediately in the terminal buffer with tentative styling (underline), then confirmed or rolled back when the real echo arrives. Perceived latency drops to near-0ms regardless of pane load or network RTT — making this mandatory infrastructure for future remote
rk serveas well as a local UX fix.Changes
app/frontend/src/components/terminal/predictive-echo.ts— pure prediction engine (pending-prediction queue, adaptive confidence state machine, VT apply/rollback string builders, mosh-style SRTT estimator); no DOM dependency, fully unit-testableapp/frontend/src/components/terminal/predictive-echo-binding.ts— thin binding that wires the engine'swrite()/cell-snapshot reads/inbound observation intoTerminalClientapp/frontend/src/components/terminal/predictive-echo.test.ts— unit tests for the pure engine (confidence transitions, queue match/divergence, backspace, VT apply/rollback)app/frontend/src/components/terminal-client.tsx— wrapsonDatawith the prediction hook, taps the inbound flush path for reconciliation, exposes DEV-gatedwindow.__rkPredictionstest handleapp/frontend/src/components/terminal-client.test.tsx— integration tests for the prediction bindingapp/frontend/tests/e2e/echo-latency.spec.ts+.spec.md— perceived-echo metric, misprediction counter across idle/under-load/alternate-screen scenarios (audit-style, not asserted budgets)docs/memory/run-kit/ui-patterns.md+ regenerated index — documents the buffer-write + VT-rollback pattern and the adaptive confidence reflexTesting
tsc --noEmit): cleanecho-latency.spec.ts): NOT run — requires the isolated port-3020 harness (just test-e2e), which is environment-sensitive. The spec exists and is wired; run it manually to validate the perceived-echo metric and misprediction counter.