[feature] active consolidation engine (dream cycle) — synthesize, merge, extract #37

@ramonlimaramos

Problem

synapto has decay (memories fade over time) but no active consolidation. duplicate working memories accumulate, similar memories never merge into stable patterns, and the agent re-learns the same lesson over and over. competitor byterover-cli ships a "dream engine" that synthesizes/consolidates/prunes in background — this is the difference between "storage that fades" and "memory that learns".

Proposed Solution

new module synapto.consolidation with three background operations, runnable via synapto maintain --consolidate or as a scheduled job.

operations

| op | what it does | trigger |
|---|---|---|
| synthesize | take N similar working memories → produce one stable summary memory; preserve provenance (parent_ids[]) | when ≥3 memories share entities + cosine sim > threshold |
| merge | dedupe near-identical memories; keep highest-trust, link rest as superseded | when cosine sim > 0.92 AND entity overlap = 1.0 |
| extract | detect cross-memory patterns ("user always X", "project Y depends on Z") → produce core rule memory | when ≥5 memories support same proposition |
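
a minimal sketch of the merge trigger above, to make the two conditions concrete. everything here is hypothetical: `cosine_sim`, `entity_overlap`, and `should_merge` are illustrative names, and reading "entity overlap = 1.0" as a Jaccard ratio of 1.0 (identical entity sets) is an assumption, not something the issue specifies.

```python
import math

MERGE_SIM = 0.92  # threshold taken from the merge row


def cosine_sim(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def entity_overlap(ents_a, ents_b):
    """Jaccard overlap of two entity sets (assumed definition of 'entity overlap')."""
    a, b = set(ents_a), set(ents_b)
    if not a or not b:
        return 0.0  # assumption: no entities means no evidence of overlap
    return len(a & b) / len(a | b)


def should_merge(vec_a, ents_a, vec_b, ents_b):
    """True when both merge conditions hold: sim > 0.92 AND identical entity sets."""
    return (
        cosine_sim(vec_a, vec_b) > MERGE_SIM
        and entity_overlap(ents_a, ents_b) == 1.0
    )
```

the synthesize trigger would reuse the same two helpers with a looser similarity threshold and a cluster-size check instead of a pairwise one.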

candidate detection (cheap, no LLM)

  • reuse HRR vectors: high similarity + shared entities = candidate cluster
  • run dbscan/agglomerative clustering on phase vectors
  • only invoke LLM on confirmed clusters
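
the candidate step above could look roughly like this. to stay dependency-free the sketch swaps dbscan for plain threshold linking with union-find (equivalent to single-linkage agglomerative clustering cut at a fixed similarity), and it treats the vectors as ordinary float lists compared by cosine similarity. `candidate_clusters`, the `(id, vector, entities)` tuple shape, and the 0.8 default are all assumptions for illustration.

```python
import math
from collections import defaultdict


def cosine_sim(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_clusters(memories, sim_threshold=0.8, min_size=3):
    """Group memories that are similar AND share at least one entity.

    memories: list of (memory_id, vector, entity_set) tuples (hypothetical shape).
    Returns a list of sorted id lists, one per cluster with >= min_size members.
    """
    parent = list(range(len(memories)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # O(n^2) pairwise pass; fine for a sketch, too slow for large tenants.
    for i in range(len(memories)):
        _, vec_i, ents_i = memories[i]
        for j in range(i + 1, len(memories)):
            _, vec_j, ents_j = memories[j]
            if ents_i & ents_j and cosine_sim(vec_i, vec_j) > sim_threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for idx, (mem_id, _, _) in enumerate(memories):
        groups[find(idx)].append(mem_id)
    return [sorted(ids) for ids in groups.values() if len(ids) >= min_size]
```

only the clusters returned here would be handed to the LLM; everything else is skipped without a model call.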

LLM-driven step

  • summarization prompt produces synthesized content
  • log provenance: synthesized_from=[id1, id2, ...], synthesis_prompt_hash, model
  • new memory inherits max(trust) − 0.05 (slight discount for derivation)
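
a sketch of the provenance record and trust rule described above. the field names `synthesized_from`, `synthesis_prompt_hash`, and `model` come from the list; the `SynthesizedMemory` dataclass, `build_synthesized`, the choice of sha-256 for the prompt hash, and the clamp at 0.0 are assumptions.

```python
import hashlib
from dataclasses import dataclass

TRUST_DISCOUNT = 0.05  # slight discount for derived memories


@dataclass
class SynthesizedMemory:
    content: str
    trust: float
    synthesized_from: list
    synthesis_prompt_hash: str
    model: str


def build_synthesized(content, parents, prompt, model):
    """Build a synthesized memory from its parents.

    parents: list of (memory_id, trust) pairs (hypothetical shape).
    """
    if not parents:
        raise ValueError("a synthesized memory needs at least one parent")
    trust = max(t for _, t in parents) - TRUST_DISCOUNT
    return SynthesizedMemory(
        content=content,
        trust=max(0.0, trust),  # assumption: trust never goes negative
        synthesized_from=[mem_id for mem_id, _ in parents],
        synthesis_prompt_hash=hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        model=model,
    )
```

hashing the prompt rather than storing it keeps the record small while still letting two syntheses be compared for "same prompt, same model".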

CLI / MCP surface

synapto maintain --consolidate            # full pass
synapto maintain --consolidate --dry-run  # report candidates only

new MCP tool consolidate_pending() returning candidate clusters + diff preview, paired with consolidate_apply(cluster_id) for human-in-the-loop approval.
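
the pending/apply pairing could behave like the in-memory sketch below. it does not show MCP tool registration; `ConsolidationQueue`, `propose`, and the dict layout of a cluster are hypothetical, and only the two method names `consolidate_pending` and `consolidate_apply` come from the proposal.

```python
class ConsolidationQueue:
    """Holds candidate clusters until a human approves them (illustrative only)."""

    def __init__(self):
        self._pending = {}
        self.applied = []

    def propose(self, cluster_id, member_ids, preview):
        """Register a candidate cluster with a human-readable diff preview."""
        self._pending[cluster_id] = {
            "cluster_id": cluster_id,
            "member_ids": list(member_ids),
            "preview": preview,
        }

    def consolidate_pending(self):
        """Return all clusters awaiting approval, without changing anything."""
        return list(self._pending.values())

    def consolidate_apply(self, cluster_id):
        """Approve one cluster; unknown or already-applied ids raise KeyError."""
        if cluster_id not in self._pending:
            raise KeyError(f"no pending cluster {cluster_id!r}")
        cluster = self._pending.pop(cluster_id)
        self.applied.append(cluster)
        return cluster
```

popping on apply makes a second `consolidate_apply` call for the same cluster fail loudly instead of consolidating twice.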

Trade-offs

  • cost: LLM calls per cluster (~$0.001 each with haiku). mitigation: only run on dirty tenants, cache by content hash.
  • drift: bad summaries lose nuance. mitigation: keep originals (soft-link, not delete) for 30d; trust feedback can revert.
  • latency: jobs are async; reuse existing redis queue infra.
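
for the cost mitigation, "cache by content hash" might be as small as the sketch below. `SummaryCache` and its `summarize` callback are hypothetical stand-ins for the real LLM call; the key is a sha-256 over the sorted member contents, so the same cluster in a different order still hits the cache.

```python
import hashlib


class SummaryCache:
    """Memoizes summaries by a hash of the cluster's contents (illustrative only)."""

    def __init__(self, summarize):
        self._summarize = summarize  # callable: list[str] -> str (the LLM call)
        self._cache = {}
        self.llm_calls = 0

    @staticmethod
    def cluster_key(contents):
        """Order-independent content hash for a cluster."""
        digest = hashlib.sha256()
        for text in sorted(contents):
            digest.update(text.encode("utf-8"))
            digest.update(b"\x00")  # separator so ["ab", "c"] != ["a", "bc"]
        return digest.hexdigest()

    def get(self, contents):
        """Return a cached summary, calling the summarizer only on a miss."""
        key = self.cluster_key(contents)
        if key not in self._cache:
            self.llm_calls += 1
            self._cache[key] = self._summarize(list(contents))
        return self._cache[key]
```

an unchanged cluster then costs nothing on the next pass, which fits the "only run on dirty tenants" idea.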

Success criteria

  • 1k working memories → ~100 stable + 10 core after one full cycle
  • recall accuracy on LongMemEval-S improves vs no-consolidation baseline
  • provenance chain queryable (graph_query walks synthesized_from edges)
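
the provenance criterion amounts to a graph walk over synthesized_from edges. a rough sketch, assuming the edges are available as a plain dict from memory id to parent ids; `walk_provenance` is a hypothetical helper, not the actual graph_query implementation.

```python
from collections import deque


def walk_provenance(memory_id, synthesized_from):
    """Return every ancestor of memory_id, breadth-first, each id once.

    synthesized_from: dict mapping a memory id to the list of ids it was
    synthesized from (hypothetical in-memory stand-in for the graph store).
    """
    seen = {memory_id}  # guards against cycles back to the start
    order = []
    queue = deque(synthesized_from.get(memory_id, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        queue.extend(synthesized_from.get(current, []))
    return order
```

for a core rule built from two stable summaries, the walk returns the summaries first and then the working memories underneath them.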
