Is your feature request related to a problem? Please describe.
The current memory deduplication in MemoryManager._deduplicate_and_store_facts() (src/suzent/memory/manager.py) uses a fixed cosine similarity threshold of 0.85 to decide whether a newly extracted fact is a duplicate of an existing memory. When similarity exceeds the threshold, the new fact is silently dropped.
This has several issues:
1. Contradictions and updates are treated as duplicates
If a user previously said "I work at Google" and later says "I just joined Microsoft", the two facts — both short sentences about employment at a tech company — can score above 0.85 cosine similarity. The system silently discards the newer fact, retaining stale information indefinitely.
More generally, any factual update to previously stored information (job changes, preference changes, project pivots) is likely to be dropped because the old and new facts are semantically close.
2. "Updated" memories are not actually updated
When a duplicate is detected, the code appends the existing memory's ID to memories_updated but performs no actual modification to the stored memory. The new content is lost.
# manager.py L510-515
if similar and similar[0].get("similarity", 0) > DEDUPLICATION_SIMILARITY_THRESHOLD:
    result.memories_updated.append(str(similar[0]["id"]))  # nothing is actually updated
else:
    memory_id = await self._add_memory_internal(...)
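For comparison, here is a minimal sketch of a duplicate branch that records the newer content instead of only the ID. It is not code from manager.py: the in-memory dict standing in for the store, the `Result` dataclass, the `store_or_update` helper, and the `mem-N` ID scheme are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

THRESHOLD = 0.85  # mirrors the fixed value described in this issue


@dataclass
class Result:
    # Hypothetical stand-in for the real result object.
    memories_created: List[str] = field(default_factory=list)
    memories_updated: List[str] = field(default_factory=list)


def store_or_update(
    store: Dict[str, str],
    new_fact: str,
    best_match: Optional[Tuple[str, float]],
    result: Result,
) -> None:
    """best_match is (memory_id, similarity) for the closest stored memory, or None."""
    if best_match and best_match[1] > THRESHOLD:
        memory_id = best_match[0]
        store[memory_id] = new_fact  # overwrite so the newer wording is kept
        result.memories_updated.append(memory_id)
    else:
        memory_id = f"mem-{len(store) + 1}"  # hypothetical ID scheme
        store[memory_id] = new_fact
        result.memories_created.append(memory_id)
```

Overwriting on every near-match still collapses facts that are close but distinct, so a change like this would probably need to be paired with some way of telling updates apart from unrelated near-matches.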
3. Threshold is embedding-model-dependent
A fixed 0.85 does not generalize across embedding models. Different models produce different similarity distributions for the same text pairs. Swapping embedding models (e.g. gemini-embedding-001 → text-embedding-3-small) changes what 0.85 means in practice, with no automatic adjustment.
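One way to make the cutoff follow the embedding model is to derive it from a small labelled set of fact pairs scored with whichever model is active. The sketch below assumes such a labelled set exists; `calibrate_threshold` and the accuracy-based selection are illustrative choices, not something the current code does.

```python
from typing import Iterable, Tuple


def calibrate_threshold(scored_pairs: Iterable[Tuple[float, bool]]) -> float:
    """Pick the cutoff that best separates duplicate from non-duplicate pairs.

    scored_pairs holds (cosine_similarity, is_duplicate) tuples for a labelled
    sample, scored with the embedding model currently in use.
    """
    pairs = sorted(scored_pairs)
    if not pairs:
        raise ValueError("need at least one labelled pair")

    best_threshold, best_correct = 1.0, -1
    # Try each observed similarity as a candidate cutoff.
    for candidate, _ in pairs:
        # A pair is predicted "duplicate" when its similarity is strictly
        # above the cutoff, matching the `>` comparison in the existing code.
        correct = sum((sim > candidate) == is_dup for sim, is_dup in pairs)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold
```

With made-up scores such as `[(0.95, True), (0.92, True), (0.88, False), (0.80, False), (0.60, False)]`, the function returns 0.88, the highest similarity among the non-duplicates. Rerunning the calibration after swapping models would yield a cutoff suited to the new similarity distribution.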
4. Short single-sentence embeddings cluster tightly
Each fact is embedded as a single sentence. Short texts produce vectors dominated by broad topic signal, so two facts about the same topic but with meaningfully different content (e.g. "Uses React for frontend" vs "Migrating from React to Vue") can exceed 0.85 and be collapsed.
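The effect can be illustrated with hand-made vectors. These are not real embeddings: the four components are invented so that the first two act as shared "topic" signal and the last two as the differing detail.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed, not model output): large shared topic components,
# small components carrying the part that actually differs.
uses_react = [0.9, 0.4, 0.1, 0.0]
migrating_to_vue = [0.9, 0.4, 0.0, 0.1]

similarity = cosine_similarity(uses_react, migrating_to_vue)
```

Here `similarity` is 0.97 / 0.98, about 0.99, even though the two vectors share nothing in their detail components. When topic signal dominates the norm, a 0.85 cutoff cannot distinguish the two.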
Possible future paths
- LLM-assisted dedup: Use embedding similarity as a cheap pre-filter, then ask an LLM to classify near-matches as duplicate / update / distinct (similar to Mem0's approach).
- Adaptive thresholds: Auto-calibrate per embedding model rather than using a fixed constant.
- Entity-level merging: Extract entities and relations, dedup at the entity graph level (similar to LangMem).
- Actually update on conflict: When a near-match is detected, update or replace the existing memory rather than silently dropping the new fact.
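The first and last paths can be combined, as in the rough sketch below: embedding similarity acts only as a pre-filter, and a pluggable classifier decides what to do with a near-match. Everything named here is an assumption for illustration: the `Verdict` enum, the `resolve_fact` helper, the 0.7 pre-filter value, the dict-based store, and the `mem-N` IDs. The classifier is a plain callable so that an LLM call (or anything else) could be supplied. No particular LLM API is implied.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Verdict(Enum):
    DUPLICATE = "duplicate"
    UPDATE = "update"
    DISTINCT = "distinct"


# (existing_fact, new_fact) -> Verdict; in practice this might wrap an LLM call.
Classifier = Callable[[str, str], Verdict]


def resolve_fact(
    store: Dict[str, str],
    new_fact: str,
    nearest: Optional[Tuple[str, float]],
    classify: Classifier,
    prefilter: float = 0.7,  # assumed value; deliberately looser than 0.85
) -> str:
    """Return the action taken: 'added', 'skipped', or 'replaced'."""
    # Cheap path: nothing similar enough, so skip the classifier entirely.
    if nearest is None or nearest[1] < prefilter:
        store[f"mem-{len(store) + 1}"] = new_fact
        return "added"

    memory_id, _ = nearest
    verdict = classify(store[memory_id], new_fact)
    if verdict is Verdict.DUPLICATE:
        return "skipped"  # same information; keep the stored memory as-is
    if verdict is Verdict.UPDATE:
        store[memory_id] = new_fact  # newer fact supersedes the stale one
        return "replaced"
    store[f"mem-{len(store) + 1}"] = new_fact  # related topic, separate fact
    return "added"
```

With a classifier that labels "I work at Google" / "I just joined Microsoft" as an update, the stored memory is replaced rather than the new fact being dropped. The classifier only runs for near-matches, which keeps the extra cost bounded.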