Skip to content

feat(proposal-a): Phase 1 recall governance (Issue #569)#736

Merged
rwmjhb merged 7 commits intoCortexReach:masterfrom
jlin53882:proposal-a-phase1-v3
May 5, 2026
Merged

feat(proposal-a): Phase 1 recall governance (Issue #569)#736
rwmjhb merged 7 commits intoCortexReach:masterfrom
jlin53882:proposal-a-phase1-v3

Conversation

@jlin53882
Copy link
Copy Markdown
Contributor

@jlin53882 jlin53882 commented May 1, 2026

Summary

Phase 1 of Proposal A: implements a feedback-driven recall governance loop that detects when auto-recalled memories are actually used vs not referenced in the agent response, and applies configurable scoring penalties accordingly.

See RFC: #569


What this PR does

Core feature: isRecallUsed() feedback loop

When auto-recall injects memories into context, the system tracks which recalled items the agent references in its response. If a recalled memory is not mentioned, it incurs a scoring penalty — reducing its chance of being recalled again.

Key components:

  • isRecallUsed(): keyword-based usage heuristic — checks if response text contains natural usage markers (e.g. "記得", "記住了", "之前說的", "根據...") that indicate the agent is referencing the recalled memory
  • Per-recall scoring: each recalled-but-not-used memory gets a penalty applied via store.update({ importance }) + store.patchMetadata({ bad_recall_count })
  • Configurable thresholds: feedbackActivationThreshold (how many recalls before feedback activates), suppressionThreshold (>= 3 consecutive unused before penalty)

Bugs fixed during review

Bug Severity Description
C1 🚨 CRITICAL before_prompt_build regex parse was dead code — prependContext format is [category:scope], not hex IDs, so regex /\[([a-f0-9]{8,})\]/gi always returned []. Scoring loop never ran. Fix: use pending.recallIds directly (populated by auto-recall hook).
C3 🚨 CRITICAL pendingRecall.delete(sessionKey) was only called in session_end hook. Without it, the same recallIds would be re-scored on every subsequent turn indefinitely. Fix: add pendingRecall.delete(sessionKey) after scoring loop completes.
TTL gap 🚨 CRITICAL pendingRecall Map had no TTL eviction. If session_end never fires (crash, SIGKILL, unhandled exception), entries accumulate forever. Fix: add 10-min max-age eviction when reading stale entries.

Previous (pre-review) claims that were incorrect

The original PR description claimed these fixes existed in the diff — they did not:

  • P0: "pendingRecall TTL max-age cleanup" — not in original diff (added by this review fix)
  • P1: "errorKeywords checked before usage heuristic, scoring penalty threshold aligned, natural usage detection" — not in diff
  • P2: "suppression threshold aligned with scoring path (>= 2), TTL cleanup on both read and set paths" — not in diff

Files changed

  • index.ts: before_prompt_build scoring hook (C1/C3/TTL fix), agent_end hook, feedback config parsing
  • src/reflection-slices.ts: isRecallUsed exports

Build notes

Rebuilt from scratch onto latest upstream master (0545c91), preserving all upstream changes:

Closes #569

… master

Rebuilt from scratch onto origin/master (0545c91).

Feature summary:
- isRecallUsed() feedback loop: detects when auto-recalled memories are
  actually used vs not referenced in agent response
- per-recall scoring: penalty for recalled-but-not-referenced memories
- configurable feedback amplitudes and min_recall_count threshold
- Summary path: natural usage detection without reliance on recall markers
- session_end hook: clean pendingRecall composite keys to prevent memory leak
- P0/P1/P2 fixes: pendingRecall TTL cleanup, AND gate in isRecallUsed,
  suppression threshold alignment, errorKeywords precedence

Closes CortexReach#569
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 17efa0ebc8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread index.ts Outdated
Comment on lines +3190 to +3194
if (event.prependContext && typeof event.prependContext === "string") {
// Parse IDs from injected context - format is typically "- [category:scope] summary"
// We'll check if any recall IDs are present in the context
const match = event.prependContext.match(/\[([a-f0-9]{8,})\]/gi);
if (match) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use stored recall IDs instead of parsing prepend context

The feedback hook derives injectedIds by regex-parsing event.prependContext, but auto-recall injection formats lines as category/scope prefixes (e.g. [category:scope]) and does not include memory IDs, so this match path returns no IDs in normal runs. Because injectedIds stays empty, the scoring loop never executes and bad_recall_count/importance are never updated, effectively disabling the new recall-governance feature despite pendingRecall already storing the selected IDs at injection time.

Useful? React with 👍 / 👎.

Comment thread index.ts
Comment on lines +3166 to +3169
const existing = pendingRecall.get(sessionKey);
if (existing && lastMsgText && lastMsgText.trim().length > 0) {
existing.responseText = lastMsgText;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear scored pending recall entries to avoid re-scoring

Once a pendingRecall entry exists, this agent_end handler keeps overwriting its responseText on every subsequent turn, even when no new recall was injected, and the map is only cleared on session_end. Without deleting the entry after a scoring pass, later turns can repeatedly reward/penalize the same old recallIds based on unrelated responses, which will skew importance over long sessions.

Useful? React with 👍 / 👎.

jlin53882 pushed a commit to jlin53882/memory-lancedb-pro that referenced this pull request May 1, 2026
… (Codex review)

P1: before_prompt_build was trying to parse memory IDs from
prependContext via regex /\[[a-f0-9]{8,}\]/gi, but auto-recall
injects memories with [category:scope] prefixes (e.g. [reflection:correction])
and does NOT include memory IDs. pending.recallIds already contains the
correct IDs — use them directly instead of parsing a text format that
never contains IDs. This unblocks the scoring loop so bad_recall_count
and importance are actually updated.

P2: pendingRecall entry was never deleted after scoring, causing
subsequent turns to re-score the same recallIds against a new
responseText, distorting importance over time. Delete the entry
immediately after the scoring loop completes.

Fixes two Codex review comments on PR CortexReach#736.
@jlin53882 jlin53882 force-pushed the proposal-a-phase1-v3 branch from e5e60eb to 17efa0e Compare May 1, 2026 16:04
jlin53882 added 3 commits May 4, 2026 00:20
- Remove broken prependContext regex parse that always returned empty
  (prependContext format is `[category:scope]`, not hex IDs).
  Scoring now uses pending.recallIds directly (populated by auto-recall hook).
- Add pendingRecall TTL eviction (10min max-age) to prevent unbounded Map
  growth when session_end never fires (crash, SIGKILL, etc.).
- Add pendingRecall.delete(sessionKey) after scoring loop to prevent
  re-scoring same recallIds on subsequent turns (Codex P2).

Fixes: C1 (CRITICAL - scoring dead code), C3 (CRITICAL - re-scoring),
       TTL gap (CRITICAL - process crash leaks).
…s, boundary conditions, and realistic scenarios
Issue CortexReach#736 recall governance - isRecallUsed() 48-unit-test coverage
Copy link
Copy Markdown
Collaborator

@rwmjhb rwmjhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review action: COMMENT

Thanks for the PR. GitHub currently reports this branch as not mergeable (mergeable=CONFLICTING, merge_state_status=DIRTY), so I am deferring deep review until the diff can be reviewed against the current base.

Please rebase onto the latest base branch, resolve the merge conflicts, and push the updated branch. Once it is cleanly mergeable again, I will re-run the full review on the updated diff.

@jlin53882
Copy link
Copy Markdown
Contributor Author

Conflict Resolution Done ✅

已解決 scripts/ci-test-manifest.mjs 的衝突。

衝突解決內容

變更 說明
❌ 移除 test/to-import-specifier-windows.test.mjs(master 已移除)
✅ 保留 test/is-recall-used.test.mjs(PR #736 新增的單元測試)
✅ 加入 test/agentid-validation.test.mjs(master 新增)
✅ 加入 test/command-reflection-guard.test.mjs(master 新增)

Commit: dab4b62推送到 branch

如 GitHub 仍顯示 dirty 狀態,可能需要 maintainer 協助一下。PR 開了 maintainer_can_modify: true

@rwmjhb rwmjhb merged commit 09255df into CortexReach:master May 5, 2026
7 checks passed
jlin53882 pushed a commit to jlin53882/memory-lancedb-pro that referenced this pull request May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RFC] Proposal A Phase 1-4 Stack Chain 順序確認

2 participants