🤖 feat: add memory harvest pipeline #3558
Implements exactly-once compaction completion metadata and a Harvest → Sweep memory consolidation path, with host-validated harvest inbox writes and Memory tab status updates.
---
_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `2391569{MUX_COSTS_USD:-unknown}`_
<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=75.32 -->
Fixes deep-review findings around harvest secret filtering, reset/malformed boundaries, archive ordering, stale harvest recovery, prompt chunking, candidate dedupe, sidecar compatibility/retention, and accessible failure status.
---
_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `2675211{MUX_COSTS_USD:-unknown}`_
<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=75.32 -->
@codex review Please review the compaction memory Harvest → Sweep pipeline and the deep-review follow-up fixes.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5b45fa70c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Ensures a successful zero-candidate harvest removes any stale inbox file from a previous attempt before the sweep can read it.
---
_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `2921039{MUX_COSTS_USD:-unknown}`_
<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=171.97 -->
Addressed Codex finding
@codex review Please take another look after the stale harvest inbox fix.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 44f4248ce7
Uses the compaction summary message id for harvest inbox paths so reset-induced epoch reuse cannot overwrite or delete another boundary's inbox.
---
_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `2990562{MUX_COSTS_USD:-unknown}`_
<!-- mux-attribution: model=openai:gpt-5.5 thinking=xhigh costs=171.97 -->
Addressed Codex finding
@codex review Please take another look after the boundary-keyed harvest inbox fix.
Codex Review: Didn't find any major issues. Bravo.
Deduplicate the `completedAt ?? startedAt` harvest-record ordering key in `src/node/services/memoryConsolidationService.ts`.
After #3558 ("add memory harvest pipeline"), the same effective-time expression was computed inline four times across two functions that must rank records identically:
- `findNewestHarvestRecord` picks the record with the max `completedAt`/`startedAt`.
- `pruneHarvestRecords` sorts descending by the same key to keep the newest 20.
Both now derive recency from one `harvestRecordTime(record)` helper, documenting that they must agree on "newest" so the format can't drift between call sites.
Behavior-preserving: the helper returns the identical value, the reduce/sort comparisons are unchanged, and `normalizeHarvestRecord`'s distinct `completedAt ?? Date.now()` (finalizing a stale record) is intentionally left inline.
Verified with `bun test memoryConsolidationService.test.ts` (pass) plus eslint + tsc.
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-8` • Thinking: `xhigh` • Cost: `n/a`_
<!-- mux-attribution: model=anthropic:claude-opus-4-8 thinking=xhigh costs=n/a -->
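The shared ordering key described in the commit message above can be sketched roughly as follows. This is an illustration, not the code from `memoryConsolidationService.ts`: the `HarvestRecord` shape, the `id` field, and the `MAX_HARVEST_RECORDS` constant name are assumptions; only `completedAt ?? startedAt`, the three function names, and the limit of 20 come from the text.

```typescript
// Hypothetical record shape: only the two timestamp fields are named in the PR text.
interface HarvestRecord {
  id: string;
  startedAt: number;
  completedAt?: number;
}

// The limit of 20 comes from the commit message; the constant name is an assumption.
const MAX_HARVEST_RECORDS = 20;

// Effective time used for ranking: completion time when known, otherwise start time.
function harvestRecordTime(record: HarvestRecord): number {
  return record.completedAt ?? record.startedAt;
}

// Picks the record with the largest effective time, or undefined for an empty list.
function findNewestHarvestRecord(records: HarvestRecord[]): HarvestRecord | undefined {
  return records.reduce<HarvestRecord | undefined>(
    (newest, record) =>
      newest === undefined || harvestRecordTime(record) > harvestRecordTime(newest)
        ? record
        : newest,
    undefined
  );
}

// Sorts a copy newest-first by the same key and keeps the newest MAX_HARVEST_RECORDS.
function pruneHarvestRecords(records: HarvestRecord[]): HarvestRecord[] {
  return [...records]
    .sort((a, b) => harvestRecordTime(b) - harvestRecordTime(a))
    .slice(0, MAX_HARVEST_RECORDS);
}
```

Because both functions call the same helper, a later change to what "newest" means only has to be made in one place.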
Summary
Implements the compaction-triggered memory Harvest → Sweep pipeline: successful compaction now emits one structured completion signal, harvest reads the just-compacted epoch, writes host-validated workspace-scope candidate inbox files, and then runs the existing Dream sweep to merge/promote/prune memory.
Background
The previous compaction completion path could trigger background dream consolidation twice for a single successful compaction. This change makes the completion lifecycle single-source and adds a conservative harvest step so durable memories can be extracted from raw compaction evidence before the existing sweep organizes them.
Implementation
- Added `CompactionCompletionMetadata` and made `CompactionHandler` the single post-compaction completion hook owner.
- Added `HistoryService.getMessagesForCompactionEpoch` with reset/malformed-boundary-safe epoch slicing.
- Added `runMemoryHarvest` with structured candidate submission, host-side evidence/confidence/secret validation, candidate dedupe, deterministic input chunking, and workspace-only inbox writes via `MemoryService`.
- Updated `MemoryConsolidationService` to orchestrate Harvest → Sweep, recover retryable stale harvest records, preserve archive final-pass ordering, and retain bounded harvest sidecar status.
Validation
- `bun test src/node/services/memoryHarvest.test.ts src/node/services/memoryConsolidationService.test.ts src/node/services/historyService.test.ts src/node/services/compactionHandler.test.ts src/browser/features/RightSidebar/Memory/MemoryTab.test.tsx`
- `make test`
- `make static-check`
- Ran `deep-review-workflow` and fixed the verified review findings before opening this PR.
Risks
This touches compaction lifecycle, history slicing, background memory orchestration, and Memory UI status. The main risks are incorrect boundary selection, harvest retries/order causing extra model work, or stale sidecar compatibility; regression tests cover reset/malformed boundaries, archive ordering, retry recovery, sidecar compatibility/retention, candidate guardrails, and UI failure display.
Pains
Full-suite validation occasionally hit unrelated Bun/runtime instability earlier in the workspace, but final validation after review fixes and final rebase passed locally.
📋 Implementation Plan
Plan: Deduplicate compaction-complete dream trigger
Context and verified evidence
- The completion callback is created in `WorkspaceService.createSession()` as `onCompactionComplete`, which schedules metadata refresh and calls `memoryConsolidationService?.triggerInBackground(workspaceId, "compaction")` (`src/node/services/workspaceService.ts:2096-2101`).
- `CompactionHandler.handleCompletion()` calls `this.onCompactionComplete?.()` after `performCompaction()` succeeds and telemetry is recorded (`src/node/services/compactionHandler.ts:804-836`).
- `AgentSession` then calls `this.onCompactionComplete?.()` again when `compactionHandler.handleCompletion(streamEndPayload)` returns `handled === true` (`src/node/services/agentSession.ts:4650-4695`).
- The consolidation trigger is currently debounced (`src/node/services/memoryConsolidationService.ts:271-289`), but a future harvest-by-compaction-boundary pipeline should not rely on debounce to provide once-per-boundary semantics.
- `new CompactionHandler(...)` is only constructed by `AgentSession` in product code; tests construct it directly without using `onCompactionComplete`.
Goal
Make a successful compaction emit exactly one lifecycle completion signal to `WorkspaceService`, without changing compaction persistence, metadata refresh, pending follow-up dispatch, auto-compaction UI events, or current dream consolidation behavior beyond removing the duplicate background trigger.
Recommended approach: keep `CompactionHandler` as the sole completion-hook owner
Net product LoC estimate: about -1 to -4 LoC.
- Keep the completion hook in `CompactionHandler`.
- Keep `onCompactionComplete?: () => void` in `CompactionHandlerOptions`.
- Keep the `onCompactionComplete` field and constructor assignment.
- Keep the `this.onCompactionComplete?.()` call in `handleCompletion()` immediately after a fresh successful `performCompaction()`.
- Remove the `this.onCompactionComplete?.()` call from the `AgentSession` handled-compaction branch.
- `AgentSession` should still reset stale usage state, emit `auto-compaction-completed` for auto-compaction, reset active stream state, and dispatch pending follow-ups as it does today.
- Ownership stays with `CompactionHandler`: the hook fires only for a newly completed compaction, not for the `processedCompactionRequestIds` dedupe path where `handleCompletion()` returns `true` for an already-processed request.
- Keep passing `onCompactionComplete` into `new CompactionHandler(...)` from `AgentSession`.
- Add the regression test as an `AgentSession` test, because the duplication only happens when `AgentSession` and `CompactionHandler` are wired together.
- Construct `AgentSession` with an `onCompactionComplete` mock, seed real `HistoryService` with a compaction request, emit a successful `stream-end`, wait for the async handler, and assert the hook was called exactly once.
- Emit the same `stream-end` again and assert the hook is still called exactly once, covering replay/dedupe behavior.
- Keep the existing `CompactionHandler` tests intact.
- A direct `CompactionHandler` callback test is optional, but the most valuable regression coverage is at the `AgentSession` wiring layer.
Alternatives considered
Alternative A: make `AgentSession` the sole completion-hook owner
Net product LoC estimate: about -4 to -10 LoC.
Rejected for this fix: `CompactionHandler.handleCompletion()` returns `true` both for a freshly completed compaction and for the `processedCompactionRequestIds` dedupe path. If `AgentSession` called the hook for every `handled === true`, a replayed duplicate `stream-end` could still emit another lifecycle signal. Making `AgentSession` ownership correct would require changing `handleCompletion()` to return a richer result such as `{ handled: boolean; completedNow: boolean }`, which is more invasive than needed for this cleanup.
Alternative B: keep both calls but dedupe by compaction boundary id
Net product LoC estimate: about +25 to +60 LoC.
Rejected for this fix: boundary-id idempotency is important for the future harvest job, but using it to tolerate duplicate hook sources preserves unnecessary lifecycle ambiguity. First make the signal single-source; add explicit boundary idempotency later when implementing harvest.
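The ownership rule that decides between these alternatives (the handler reports `handled` for both fresh and replayed requests, but only a fresh completion fires the hook) can be sketched as follows. `CompactionHandlerSketch` is a hypothetical stand-in, not the real `CompactionHandler`; persistence, telemetry, and the stream payload type are omitted, and the `compactionRequestId` parameter is an assumption.

```typescript
type CompletionHook = () => void;

// Hypothetical stand-in for the real handler; only the dedupe/hook interaction is modeled.
class CompactionHandlerSketch {
  private readonly processedCompactionRequestIds = new Set<string>();
  private readonly onCompactionComplete?: CompletionHook;

  constructor(onCompactionComplete?: CompletionHook) {
    this.onCompactionComplete = onCompactionComplete;
  }

  // Returns true when the stream end belonged to a compaction request,
  // whether it was freshly completed or already processed.
  handleCompletion(compactionRequestId: string | undefined): boolean {
    if (compactionRequestId === undefined) {
      return false; // not a compaction stream
    }
    if (this.processedCompactionRequestIds.has(compactionRequestId)) {
      return true; // replay: still "handled", but no second lifecycle signal
    }
    this.processedCompactionRequestIds.add(compactionRequestId);
    // The real handler would persist the summary here before signaling.
    this.onCompactionComplete?.();
    return true;
  }
}
```

A caller that fired its own hook on every `true` return would double-fire on the replay path, which is why the sketch keeps the hook inside the handler.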
Implementation phases and quality gates
Phase 1 — Surgical code change
- Remove the `this.onCompactionComplete?.()` call from the `AgentSession` handled-compaction branch.
- Keep `CompactionHandler`'s `onCompactionComplete` option/field/call intact.
- Keep the `onCompactionComplete` property in the `new CompactionHandler(...)` call intact.
Quality gate: TypeScript should pass without stale references or unused fields.
Phase 2 — Regression coverage
- Add the `AgentSession` exactly-once compaction-complete test described above.
- Use the `createTestHistoryService()` fixture, matching repo guidance for history tests.
Quality gate: Run the targeted test file(s):
`bun test src/node/services/agentSession.postCompactionRefresh.test.ts src/node/services/compactionHandler.test.ts src/node/services/memoryConsolidationService.test.ts`
Adjust the exact targeted list if the new regression test lands in a different `AgentSession` test file.
Phase 3 — Broader validation
Run:
`make typecheck`
`make test`
If full `make test` hits a known repo-wide flake, confirm with the targeted tests and note the known flake separately rather than hiding it.
Dogfooding / self-verification
Because this is a backend lifecycle-trigger fix, dogfooding should prove the observable behavior rather than just tests.
- `/compact`.
- `/compact`.
- `/compact` flow if practical.
Advisor review status
- Making `AgentSession` the sole hook owner would have double-fired on `CompactionHandler`'s replay/dedupe path unless `handleCompletion()` returned richer state.
- Keep `CompactionHandler` as the sole hook owner and remove only the duplicate `AgentSession` call.
Acceptance criteria
- A successful compaction calls `WorkspaceService`'s `onCompactionComplete` callback exactly once.
- A replayed `stream-end` for the same compaction request does not emit another completion lifecycle signal.
Addendum: Harvest + Sweep memory pipeline
The approved dedupe plan above should remain as-is and land first. This addendum appends the follow-up dream-agent change: keep harvesting separate from sweeping so compaction can produce new candidate memories before the existing dream consolidation pass organizes them.
Context and verified evidence
- The current dream consolidation runs a `streamText` loop with no chat history, no `StreamManager`, and only a guarded memory tool (`src/node/services/memoryConsolidation.ts:4-6`, `src/node/services/memoryConsolidation.ts:228-235`).
- (`src/node/builtinAgents/dream.md:13-23`).
- `performCompaction()` writes or updates a durable summary boundary with `metadata.compactionBoundary === true`, `metadata.compactionEpoch`, and `historySequence` rather than deleting old transcript rows (`src/node/services/compactionHandler.ts:1068-1115`).
- Boundary-aware readers already exist: `HistoryService.findLastBoundaryByteOffset(...)`, `readHistoryFromOffset(...)`, and `getHistoryFromLatestBoundary(...)` in `src/node/services/historyService.ts`.
Goal
After a successful compaction, run a background Harvest → Sweep pipeline:
Manual `/dream` should remain sweep-only unless a separate manual harvest command is intentionally added later.
Recommended approach: structured harvest runner + existing dream sweep
Net product LoC estimate: about +550 to +900 LoC if the Memory tab shows harvest status; about +400 to +650 LoC for a backend-only MVP.
- Add the harvest runner in `src/node/services/memoryHarvest.ts`.
- Optionally add `src/node/builtinAgents/harvest.md`, registered like `dream` if we want global prompt overrides/model defaults for harvest.
- Keep `src/node/services/memoryConsolidation.ts` focused on sweep/consolidation.
- Extend the `onCompactionComplete` signal to carry compaction completion metadata.
- `CompactionHandler.performCompaction()` has the newly persisted `summaryMessage` and the pre-compaction boundary-aware `messages`, so it can compute this payload after persistence succeeds and before firing the hook.
- Add a `HistoryService` helper such as `getMessagesForCompactionEpoch(workspaceId, completionMetadata)` or `getHistoryBySequenceRange(...)`.
- Prefer the existing boundary/offset readers (`findLastBoundaryByteOffset`/`readHistoryFromOffset`) and fall back to full history only for first compaction or malformed boundary recovery.
- Select messages with `previousBoundaryHistorySequence < historySequence < summaryHistorySequence`, with an absent `previousBoundaryHistorySequence` treated as an unbounded lower range for the first compaction; then explicitly filter out `compactionRequestMessageId` and any summary/boundary rows. The new compaction summary may be included only as orientation.
- Expose a single structured tool, `submit_memory_candidates`.
- `harvest.md`, if added, should not require the `memory` tool; host code performs all writes after validation.
- Each candidate carries `category`, `memoryText`, `evidenceMessageIds`, `confidence`, and a short `rationale`.
- Write accepted candidates to `/memories/workspace/harvest-inbox.md` or `/memories/workspace/harvest/compaction-<epoch>.md`.
- Use `MemoryService` APIs for path validation, metadata/status invalidation, and consistency; do not write memory files directly with raw filesystem calls.
- Keep harvest writes under `/memories/workspace/...`.
- Extend `src/common/orpc/schemas/memory.ts` while preserving backward compatibility for existing `memory-consolidation.json` sidecars.
- `MemoryHarvestRecord` should distinguish `pending`/`attempted`, `completed`, and `failed` states, with timestamps, `startedAt`, `attemptCount`, accepted/skipped candidate counts, usage, error summary, and the compaction metadata key.
- Use `summaryMessageId` as the durable boundary key and store `compactionEpoch` for display/audit.
- Orchestrate the pipeline in `MemoryConsolidationService`.
- Inject `HistoryService` into `MemoryConsolidationService` from `src/node/services/coreServices.ts`.
- Add an explicit entry point such as `triggerHarvestThenSweepInBackground(completionMetadata)` or `triggerInBackground(workspaceId, "compaction", { completionMetadata })`; do not leave boundary harvest hidden behind a bare `triggerInBackground(workspaceId, "compaction")` call.
Harvest prompt/rails
The harvest prompt should be stricter than dream because transcript content is untrusted source material:
Alternatives considered
Alternative A: give the current dream agent the transcript and memory tool
Net product LoC estimate: about +100 to +250 LoC.
Rejected: this is the simplest patch, but it gives one prompt both extraction and broad consolidation authority. That increases risk of prompt injection, noisy memory creation, accidental global/project writes, and destructive cleanup decisions based on raw transcript content.
Alternative B: separate harvest agent with guarded workspace memory tool
Net product LoC estimate: about +300 to +550 LoC.
Acceptable but not preferred for MVP: workspace-only, append/update-only guards would be much safer than broad dream access, but the model would still decide exact file paths and write operations. A structured candidate tool plus host-side inbox writes is more auditable and easier to make idempotent.
Alternative C: harvest from compaction summary only
Net product LoC estimate: about +150 to +300 LoC.
Rejected: the summary is compact and cheap, but it is lossy and may blend older boundary summaries with new transcript facts. Harvest should use raw evidence messages from the just-compacted epoch, with the compaction summary only as orientation.
Implementation phases and quality gates
Phase 0 — Land the duplicate-trigger fix above
Quality gate: The exactly-once/replay regression test from the first plan passes.
Phase 1 — Completion metadata and history epoch reader
- Extend the `onCompactionComplete` types through `CompactionHandler`, `AgentSession`, and `WorkspaceService` to carry compaction completion metadata.
- Add a `HistoryService` helper to read the just-compacted epoch by boundary/sequence metadata.
Quality gate: Unit tests cover first compaction, later compaction after a previous boundary, streamed summary update-in-place, malformed boundary fallback, and duplicate stream-end idempotence.
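The epoch selection rule from the plan (messages strictly between the previous boundary and the new summary, minus the compaction request and any boundary rows) can be sketched as a pure filter. This is a sketch under assumptions: the `HistoryMessage` shape and the exact field layout of the metadata are hypothetical, `selectEpochMessages` is an illustrative name rather than the real `HistoryService` method, and offset-based reading and malformed-boundary fallback are left out.

```typescript
// Hypothetical, simplified message shape.
interface HistoryMessage {
  id: string;
  historySequence: number;
  compactionBoundary?: boolean;
}

// Field names follow identifiers mentioned in the plan; the exact shape is an assumption.
interface CompactionCompletionMetadata {
  summaryMessageId: string;
  summaryHistorySequence: number;
  previousBoundaryHistorySequence?: number;
  compactionRequestMessageId?: string;
}

// Returns the raw evidence messages for the epoch that was just compacted.
function selectEpochMessages(
  history: HistoryMessage[],
  meta: CompactionCompletionMetadata
): HistoryMessage[] {
  // An absent previous boundary means the first compaction: no lower bound.
  const lower = meta.previousBoundaryHistorySequence ?? Number.NEGATIVE_INFINITY;
  return history.filter(
    (message) =>
      message.historySequence > lower &&
      message.historySequence < meta.summaryHistorySequence &&
      message.id !== meta.compactionRequestMessageId &&
      message.id !== meta.summaryMessageId &&
      message.compactionBoundary !== true
  );
}
```

Both bounds are strict, so neither the previous boundary row nor the new summary row can end up in the evidence set even if the boundary flag were missing.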
Phase 2 — Harvest runner
- Add `memoryHarvest.ts` with transcript formatting, chunking/token budgeting, `submit_memory_candidates` tool schema, candidate validation, host-side inbox writing, usage capture, and failure reporting.
- Add the `harvest.md` prompt if using the built-in-agent override pattern.
Quality gate: Runner tests cover no-candidate output, accepted candidates, rejected out-of-evidence candidates, rejected secret-looking candidates, chunked transcript behavior, mutation budget, and stream failure behavior.
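The host-side guardrails named in this phase (evidence, confidence, secret, and dedupe checks) might look roughly like the sketch below. The candidate fields come from the plan; everything else is an assumption made for illustration, including the `MIN_CONFIDENCE` threshold, the secret patterns, the rejection reason names, and the `validateCandidates` function itself.

```typescript
// Candidate fields as listed in the plan.
interface MemoryCandidate {
  category: string;
  memoryText: string;
  evidenceMessageIds: string[];
  confidence: number;
  rationale: string;
}

type RejectionReason = "out-of-evidence" | "low-confidence" | "secret-looking" | "duplicate";

// Assumed threshold and patterns, chosen only for this sketch.
const MIN_CONFIDENCE = 0.7;
const SECRET_PATTERNS: RegExp[] = [
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,
  /\b(?:api[_-]?key|secret|password|token)\b\s*[:=]\s*\S+/i,
];

// Case- and whitespace-insensitive key used for dedupe.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Splits model-submitted candidates into accepted and rejected sets.
function validateCandidates(
  candidates: MemoryCandidate[],
  epochMessageIds: ReadonlySet<string>
): { accepted: MemoryCandidate[]; rejected: { candidate: MemoryCandidate; reason: RejectionReason }[] } {
  const accepted: MemoryCandidate[] = [];
  const rejected: { candidate: MemoryCandidate; reason: RejectionReason }[] = [];
  const seen = new Set<string>();
  for (const candidate of candidates) {
    const key = normalize(candidate.memoryText);
    let reason: RejectionReason | undefined;
    if (
      candidate.evidenceMessageIds.length === 0 ||
      !candidate.evidenceMessageIds.every((id) => epochMessageIds.has(id))
    ) {
      reason = "out-of-evidence"; // must cite messages from the harvested epoch
    } else if (candidate.confidence < MIN_CONFIDENCE) {
      reason = "low-confidence";
    } else if (SECRET_PATTERNS.some((pattern) => pattern.test(candidate.memoryText))) {
      reason = "secret-looking";
    } else if (seen.has(key)) {
      reason = "duplicate";
    }
    if (reason !== undefined) {
      rejected.push({ candidate, reason });
    } else {
      seen.add(key);
      accepted.push(candidate);
    }
  }
  return { accepted, rejected };
}
```

Keeping validation in host code means the model only proposes candidates; it never chooses file paths or performs writes itself.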
Phase 3 — Service orchestration and sidecar status
- Update `MemoryConsolidationService` to run harvest before sweep for compaction triggers that include completion metadata.
Quality gate: Service tests prove harvest runs before sweep; a recent sweep debounce does not prevent boundary harvest; an in-flight sweep does not permanently drop a boundary harvest; duplicate compaction triggers for the same boundary do not duplicate inbox entries; failed harvest does not falsely mark the boundary harvested; stale pending records recover/retry instead of blocking forever; sweep still runs after failed/no-op harvest according to the chosen failure semantics; and existing debounce behavior remains intentional for bare sweep triggers.
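The per-boundary retry and idempotency rules in this quality gate can be sketched as a pure decision function over the sidecar record for one boundary. The `HarvestAttempt` shape, the `STALE_PENDING_MS` value, and both function names are assumptions for illustration; the real `MemoryHarvestRecord` carries more fields, and the service's actual staleness policy is not stated in the text.

```typescript
type HarvestStatus = "pending" | "completed" | "failed";

// Hypothetical, trimmed-down record keyed by the summary message id of the boundary.
interface HarvestAttempt {
  boundaryKey: string;
  status: HarvestStatus;
  startedAt: number;
  completedAt?: number;
  attemptCount: number;
}

// Assumed staleness window for a pending record (10 minutes).
const STALE_PENDING_MS = 10 * 60 * 1000;

// Decides whether a compaction trigger for this boundary should run harvest.
function shouldHarvest(record: HarvestAttempt | undefined, now: number): boolean {
  if (record === undefined) {
    return true; // boundary never attempted
  }
  switch (record.status) {
    case "completed":
      return false; // duplicate trigger: do not write inbox entries again
    case "failed":
      return true; // a failure must not mark the boundary harvested
    case "pending":
      // A fresh pending record means another run is in flight; a stale one is retried.
      return now - record.startedAt > STALE_PENDING_MS;
  }
}

// Creates the pending record for a new attempt, carrying the attempt count forward.
function beginAttempt(
  previous: HarvestAttempt | undefined,
  boundaryKey: string,
  now: number
): HarvestAttempt {
  return {
    boundaryKey,
    status: "pending",
    startedAt: now,
    attemptCount: (previous?.attemptCount ?? 0) + 1,
  };
}
```

In this sketch the sweep is deliberately left out: whether it runs after a skipped or failed harvest is a separate choice that the plan leaves to "the chosen failure semantics".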
Phase 4 — Memory tab/status surface
Quality gate: UI tests cover old sidecars without harvest fields and new records with harvest details.
Phase 5 — Validation
Run targeted tests first, then broader checks:
Adjust the targeted list to the final test file layout.
Dogfooding / self-verification for Harvest → Sweep
- `/compact`.
- `/compact` → Harvest → Sweep flow.
Harvest addendum advisor review status
- Use `MemoryService` rather than raw filesystem writes.
Acceptance criteria for Harvest → Sweep
- `/dream` remains sweep-only unless a separate explicit harvest command is added.
_Generated with `mux` • Model: `openai:gpt-5.5` • Thinking: `xhigh` • Cost: `2824451{MUX_COSTS_USD:-unknown}`_