[FEAT] Memory deduplication: fixed cosine threshold causes silent data loss #34

## Is your feature request related to a problem? Please describe.

The current memory deduplication in `MemoryManager._deduplicate_and_store_facts()` (`src/suzent/memory/manager.py`) uses a **fixed cosine similarity threshold of 0.85** to decide whether a newly extracted fact is a duplicate of an existing memory. When similarity exceeds the threshold, the new fact is silently dropped.

This has several issues:

### 1. Contradictions and updates are treated as duplicates

If a user previously said "I work at Google" and later says "I just joined Microsoft", the two facts can score above 0.85 cosine similarity, because both are short sentences about employment at a tech company. The system then silently discards the newer fact and retains the stale one indefinitely.

More generally, any factual *update* to previously stored information (job changes, preference changes, project pivots) is likely to be dropped because the old and new facts are semantically close.

### 2. "Updated" memories are not actually updated

When a duplicate is detected, the code appends the existing memory's ID to `memories_updated` but performs no actual modification to the stored memory. The new content is lost.

```python
# manager.py L510-515
if similar and similar[0].get("similarity", 0) > DEDUPLICATION_SIMILARITY_THRESHOLD:
    result.memories_updated.append(str(similar[0]["id"]))  # nothing is actually updated
else:
    memory_id = await self._add_memory_internal(...)
```
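For contrast with the snippet above, here is a minimal sketch of what "actually updating" could look like. `InMemoryStore`, `DedupResult`, and `store_fact` are hypothetical stand-ins invented for illustration, not the real `MemoryManager` API; only the 0.85 constant and the `similar[0]` shape (`id`, `similarity`) are taken from the code quoted in this issue.

```python
from __future__ import annotations

from dataclasses import dataclass, field

DEDUPLICATION_SIMILARITY_THRESHOLD = 0.85  # value quoted in this issue


@dataclass
class DedupResult:
    """Hypothetical result object mirroring the `memories_updated` list."""
    memories_created: list[str] = field(default_factory=list)
    memories_updated: list[str] = field(default_factory=list)


class InMemoryStore:
    """Hypothetical stand-in for the real memory store (assumption)."""

    def __init__(self) -> None:
        self.rows: dict[str, str] = {}

    def add(self, content: str) -> str:
        memory_id = str(len(self.rows) + 1)
        self.rows[memory_id] = content
        return memory_id

    def update(self, memory_id: str, content: str) -> None:
        self.rows[memory_id] = content


def store_fact(store: InMemoryStore, fact: str, similar: list[dict], result: DedupResult) -> None:
    """Store `fact`, overwriting the best match when it is a near-duplicate."""
    best = similar[0] if similar else None
    if best and best.get("similarity", 0) > DEDUPLICATION_SIMILARITY_THRESHOLD:
        memory_id = str(best["id"])
        store.update(memory_id, fact)  # the write that is currently missing
        result.memories_updated.append(memory_id)
    else:
        result.memories_created.append(store.add(fact))
```

Overwriting on every near-match is only one possible policy. It keeps the newest fact, but it would also clobber facts that are similar yet distinct, so it probably needs to be paired with some way of telling updates apart from unrelated near-neighbours.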

### 3. Threshold is embedding-model-dependent

A fixed 0.85 does not generalize across embedding models. Different models produce different similarity distributions for the same text pairs. Swapping embedding models (e.g. `gemini-embedding-001` → `text-embedding-3-small`) changes what 0.85 means in practice, with no automatic adjustment.
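One way to make the threshold follow the model is to derive it from similarity scores of pairs that are known to be distinct, computed with the model in use. The sketch below assumes such a labelled set exists; `calibrate_threshold` and `max_false_merge_rate` are hypothetical names, and nothing like this is in the codebase today.

```python
import math


def calibrate_threshold(distinct_scores: list[float], max_false_merge_rate: float = 0.01) -> float:
    """Pick a threshold that only a small share of known-distinct pairs exceed.

    `distinct_scores` are cosine similarities, computed with the embedding
    model in use, for pairs of facts labelled as NOT duplicates.
    """
    if not distinct_scores:
        raise ValueError("need at least one labelled pair")
    ordered = sorted(distinct_scores)
    # Number of distinct pairs we tolerate scoring strictly above the threshold.
    allowed = math.floor(max_false_merge_rate * len(ordered))
    # The dedup rule uses a strict `>`, so returning this score leaves at most
    # `allowed` pairs above it (ignoring ties).
    return ordered[len(ordered) - allowed - 1]
```

Running this once per embedding model, for example at startup or when the model setting changes, would give each model its own cut-off instead of sharing 0.85.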

### 4. Short single-sentence embeddings cluster tightly

Each fact is embedded as a single sentence. Short texts produce vectors dominated by broad topic signal, so two facts about the same topic but with meaningfully different content (e.g. "Uses React for frontend" vs "Migrating from React to Vue") can exceed 0.85 and be collapsed.
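A toy calculation illustrates the effect. The 3-dimensional vectors below are invented for illustration and are not real embeddings: one shared "topic" axis carries most of the weight, and each fact has its own small "detail" axis.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy weights (assumption): the shared topic dominates, the detail is small.
topic, detail = 0.95, 0.30
fact_a = [topic, detail, 0.0]  # e.g. "Uses React for frontend"
fact_b = [topic, 0.0, detail]  # e.g. "Migrating from React to Vue"

similarity = cosine(fact_a, fact_b)
print(f"{similarity:.3f}")  # roughly 0.91, above the 0.85 threshold
```

Even though the detail components are orthogonal, the shared topic component alone pushes the pair past 0.85.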

## Possible future paths

- **LLM-assisted dedup**: Use embedding similarity as a cheap pre-filter, then ask an LLM to classify near-matches as duplicate / update / distinct (similar to Mem0's approach).
- **Adaptive thresholds**: Auto-calibrate per embedding model rather than using a fixed constant.
- **Entity-level merging**: Extract entities and relations, dedup at the entity graph level (similar to LangMem).
- **Actually update on conflict**: When a near-match is detected, update or replace the existing memory rather than silently dropping the new fact.
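As a rough sketch of how the first and last options could combine: similarity acts as a cheap pre-filter, and a classifier decides what to do with near-matches. Everything here is hypothetical. `decide_action`, the `classify` callable (which would wrap an LLM call), the `content` key on the match, and the 0.6 pre-filter value are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable, Literal, Optional

Verdict = Literal["duplicate", "update", "distinct"]
Action = Literal["add", "skip", "replace"]


def decide_action(
    new_fact: str,
    best_match: Optional[dict],
    classify: Callable[[str, str], Verdict],
    prefilter_threshold: float = 0.6,  # arbitrary placeholder value
) -> Action:
    """Return what to do with `new_fact` given its closest stored memory."""
    # Stage 1: cheap embedding pre-filter. Far-apart facts are simply added.
    if best_match is None or best_match.get("similarity", 0) <= prefilter_threshold:
        return "add"
    # Stage 2: ask the classifier (e.g. an LLM) about the near-match only.
    verdict = classify(best_match["content"], new_fact)
    if verdict == "duplicate":
        return "skip"      # same information, nothing to store
    if verdict == "update":
        return "replace"   # newer fact supersedes the stored one
    return "add"           # similar wording, different fact
```

With this split the classifier is only consulted for pairs that pass the pre-filter, so most facts would still be handled without an LLM call.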
