Skip to content

Follow-ups: structured diagnostics logger + begin App() de-monolith (find-in-page hook)#17

Open
Floofy6 wants to merge 5 commits into
claude/five-huge-improvementsfrom
claude/followups-logging-demonolith
Open

Follow-ups: structured diagnostics logger + begin App() de-monolith (find-in-page hook)#17
Floofy6 wants to merge 5 commits into
claude/five-huge-improvementsfrom
claude/followups-logging-demonolith

Conversation

@Floofy6

@Floofy6 Floofy6 commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Summary

Two follow-ups from the audit, stacked on #16 (base is claude/five-huge-improvements, so this PR's diff shows only the follow-up work — retarget to main once #16 merges).

  1. Structured diagnostics logger (electron/logger.cjs). Operational failures were console.error at best and silent catch {} at worst — no durable breadcrumb when a user lost data or a tab crashed. The logger writes leveled JSON lines to a size-capped, single-backup file under the profile (prism-diagnostics.log), still mirrors to the console, and never throws into its caller (it's used from catch blocks and crash handlers). Wired into the points that previously had no record: secure-store corruption quarantine, uncaughtException, unhandledRejection, and guest render-process-gone. Level via PRISM_LOG_LEVEL. Covered by tests/logger.cjs (level filtering, structured fields, rotation, never-throws).

  2. Begin the App() de-monolith — extract find-in-page (src/hooks/useFindInPage.ts). First slice of the renderer's biggest structural issue (one ~6,200-line App() holding all state, with ~59 hand-maintained ref-mirrors). Lifts the entire find-in-page feature — state, input ref, in-page search calls, focus-on-open + auto-close effects — into a hook that takes only what it needs (activeIdRef, isActiveWebview, a stable getActiveWebview getter). Return names match the old locals, so only the 4 setFindOpen(true) calls and the onFound prop changed; every read-site stayed put. The run/close callbacks are now stable (read the active webview via a stable getter instead of capturing activeId), so they no longer churn on tab switches. Adds tests/find-in-page.cjs — this feature had no test before. Establishes the pattern for the next slices (useTabModel, useAssistantChat, useAgentRun).

Testing

npm test — all 23 suites pass (incl. the new logger and find-in-page suites); tsc clean.

🤖 Generated with Claude Code

Floofy6 and others added 5 commits June 19, 2026 01:23
…atches

Operational failures were console.error at best and silent catch {} at worst, so a user
losing data or a tab crashing left no durable breadcrumb. electron/logger.cjs writes leveled
JSON lines to a size-capped, single-backup file under the profile (prism-diagnostics.log),
still mirrors to the console, and never throws into its caller (it's used from catch blocks
and crash handlers). It's tiny and synchronous on purpose — only low-frequency events log,
never per-request.

Wired into the points that previously had no durable record: secure-store corruption
quarantine, uncaughtException, unhandledRejection, and guest render-process-gone. Level is
configurable via PRISM_LOG_LEVEL.

tests/logger.cjs covers level filtering, structured fields, rotation, and the never-throws
guarantee. electron-smoke boots clean with the logger in place.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
First slice of the renderer de-monolith (the audit's biggest structural finding: App() is
one ~6,200-line component holding all state, with ~59 hand-maintained ref-mirrors). Lifts the
entire find-in-page feature — open/query/matches state, the input ref, the in-page search
calls, and the focus-on-open + auto-close effects — out of App() into src/hooks/useFindInPage.ts.

The hook takes only what it needs (activeIdRef, isActiveWebview, a stable getActiveWebview
getter) and returns a small API. App() keeps the JSX and wires reportFound into each
BrowserFrame's onFound; the keyboard handler / command palette / agent find-text tool call
openFindBar(). Return names match the old locals, so only the 4 setFindOpen(true) calls and
the onFound prop changed — every read-site stayed put. The run/close callbacks are now stable
(they read the active webview via a stable getter instead of capturing activeId), so they no
longer churn on tab switches.

Adds tests/find-in-page.cjs — find-in-page had NO coverage before; it now has an e2e (Cmd+F
open, controlled query, Escape close) that also serves as the regression net for the extraction.
Establishes the pattern for future slices (useTabModel, useAssistantChat, useAgentRun).

Full suite + tsc green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… cache on re-encode

Addressing low-severity findings from the xhigh code review:
- secure-store writeAsync: re-check the write sequence before stamping the cache after the
  awaited rename, so a beforeunload sync flush that superseded us can't leave the cache holding
  our stale items under the newer file's mtime. Also drop the dead 'const seq =' binding in
  writeSync (the writeSeq bump is the load-bearing effect; the unused local was a refactor smell).
- match-pattern: collapse runs of '*' before building the path regex so a '**' glob doesn't
  compile to adjacent '.*.*' (needless backtracking on long non-matching paths in did-navigate).
- reencodeSecureStoresToPlaintext: invalidate the chat/memory store caches after the out-of-band
  atomicWriteFileSync rewrites (wires the previously-unused invalidateCache), so a same-mtime-tick
  re-encode can't be masked by a stale cache.

Refuted in review (no change): wouldClobberEncrypted TOCTOU (isEncryptionAvailable checked first,
serialize is synchronous after), cross-extension token leak (per-origin realm isolation),
uncaughtException semantics (intentional, audit-recommended).

secure-store + match-pattern + logger unit tests + tsc pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…le renames

Adversarially-verified Windows-portability audit of the session's changes (29 agents, 6
confirmed / 16 refuted) surfaced one HIGH cross-platform regression and a few file-lock edges:

1. Guest crash recovery (HIGH). The new main-process render-process-gone handler auto-reloaded
   the crashed guest, but Electron fires that same event on BOTH the guest WebContents (main) and
   the <webview> DOM element (renderer). The renderer's pre-existing handleCrash deliberately does
   NOT auto-reload — it shows an error page and lets the user retry — specifically to avoid reload
   loops under Windows GPU contention (the path it guards with a per-URL/lifetime cap). The
   main-process reloader was a second, uncoordinated reloader with only a 15s time throttle and no
   lifetime cap, reintroducing exactly that loop on Windows. Fix at altitude: the renderer owns
   recovery; the main process now ONLY logs the crash (its stated purpose) and steps aside. This
   also resolves the no-lifetime-cap and reason-allow-list findings (it no longer reloads at all).

2. Log rotation (LOW, Windows file lock). rotateIfNeeded renamed the log over its ".1" backup;
   on Windows a rename can fail if the log is held open (a tail, antivirus, the search indexer),
   silently skipping rotation so the log grows unbounded. Now: remove the prior ".1" first (rename
   over an existing target is fragile on win32), and if the rename still fails, truncate in place
   so the size cap always holds — we lose the backup, not the bound.

3. Store renames (LOW, Windows file lock). atomicWriteFileSync, the writeAsync commit, and the
   corrupt-file quarantine all rename over a live file; under a transient AV/indexer lock that
   throws EPERM/EBUSY on Windows. Added a small bounded EPERM/EBUSY/EACCES retry (sync + async).
   POSIX never hits it, so the first attempt wins there. Newly load-bearing because memory/runs
   gained atomic temp+rename this session.

Refuted by the audit (no change): the new electron/*.cjs ARE bundled (build.files has
"electron/**/*"); mode 0o600 is correctly a guarded no-op on Windows; the 0700 profile chmod and
0600 token file are pre-existing guarded no-ops (not session regressions); safeStorage use is
platform-abstract (DPAPI on Windows); the npm test scripts and worker-revival are cross-platform.

Full suite (23) + tsc green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The grounding gate let a factual final answer through after ANY grounding tool ran — including a
web-search that only returned snippets. The v6 failure mode it missed: a model searched, saw the
result list, then answered a population number from priors instead of opening and reading a source.

Splits grounding into two tiers:
- grounding tools (web-search, read-active-page, read-tabs-content, read-selected-text,
  extract-data) gather leads, as before;
- source-read tools (read-active-page, read-selected-text, read-tabs-content, extract-data) are
  the subset that bring actual source text into the observations.

New pure helpers in agent.ts: isAgentSourceReadTool classifies a tool, and
getAgentFinalGroundingBlock returns "none" | "no-evidence" | "needs-source-read" from the run
counters (autonomy, grounding/source-read observation counts, action tools run, pushbacks). Both
are unit-tested in agent-harness-logic.cjs.

App.tsx runAgentTask wires it in:
- track sourceReadObservations alongside groundingObservations;
- on a "needs-source-read" block, push back with guidance to open the 1-3 best result URLs and
  read/extract them (bounded by the existing <2 pushback budget so no-source and future-event
  traps still land honestly);
- as a last resort when a source tab is open but unread after the budget, auto-issue a
  read-tabs-content on the run's opened tabs instead of finalizing;
- defer memory writes until source-read evidence exists, so a fact never gets stored from priors;
- if still ungrounded after the budget, land an honest safety-fallback answer and skip the critic
  resynthesis pass on that fallback.

Telemetry + eval: agent-live-run.cjs now records a stepTimeline and memoryEvents (with timing) so a
run can be judged on WHEN it read a source vs. wrote memory. Adds tests/agent-policy-compare.cjs — a
live prism-brain vN-vs-vM policy/action comparison (source-read before factual answers, no
memory-before-source, valid-directive rate, agent-call efficiency) — wired as the eval:agent-policy
script. It needs local Ollama + the live web, so it is intentionally NOT part of npm test.

tsc + agent-harness-logic green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant