
Push keyword search down to Qdrant via a full-text payload index #72

Merged: imonroe merged 3 commits into main from claude/wizardly-planck-pwvl93-i64 on Jun 9, 2026

imonroe commented Jun 9, 2026


Closes #64

What

keyword_search() previously pulled up to 5,000 full payloads out of Qdrant on every keyword query and substring-matched them in Python — linear cost with store size, and memories beyond the scan cap silently became unsearchable. Matching is now pushed down to Qdrant in two stages:

  1. Indexed prefilter — a full-text payload index on the data field (created lazily and idempotently on first keyword search; the outcome is cached process-wide behind a lock) lets Qdrant return only points containing every query token via a MatchText filter, combined with the existing exact-match scoping (user_id, provenance filters).
  2. Substring verification — the original case-insensitive substring check runs over those candidates, so the contract is unchanged: exact substring matches, most recent first.
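
The two stages can be sketched without a running Qdrant instance. This is a minimal illustration, not the merged code: `Memory`, `tokenize`, `indexed_prefilter`, and `verify_substring` are hypothetical names, the `created_at` field is assumed, and the token filter only approximates what a lowercase word tokenizer would do. In the real change, stage 1 is presumably expressed as a Qdrant filter combining a `MatchText` condition on `data` with `MatchValue` conditions for scoping.

```python
import re
from dataclasses import dataclass


@dataclass
class Memory:
    data: str
    created_at: int  # assumed field: larger means more recent
    user_id: str


def tokenize(text: str) -> set[str]:
    """Lowercased word tokens, a rough stand-in for the index tokenizer."""
    return set(re.findall(r"\w+", text.lower()))


def indexed_prefilter(store: list[Memory], query: str, user_id: str) -> list[Memory]:
    """Stage 1 stand-in: keep points that contain every query token, scoped to one user."""
    wanted = tokenize(query)
    return [m for m in store if m.user_id == user_id and wanted <= tokenize(m.data)]


def verify_substring(candidates: list[Memory], query: str) -> list[Memory]:
    """Stage 2: case-insensitive substring check, most recent first."""
    needle = query.lower()
    hits = [m for m in candidates if needle in m.data.lower()]
    return sorted(hits, key=lambda m: m.created_at, reverse=True)


store = [
    Memory("Bought a Philips Hue bulb", 1, "u1"),
    Memory("Hue bridge is by Philips", 2, "u1"),   # both tokens, but not contiguous
    Memory("Philips Hue app updated", 3, "u1"),
    Memory("Philips Hue at a friend's", 4, "u2"),  # different user, filtered out
]

candidates = indexed_prefilter(store, "philips hue", "u1")
results = verify_substring(candidates, "philips hue")
print([m.data for m in results])
# → ['Philips Hue app updated', 'Bought a Philips Hue bulb']
```

The second record shows why stage 2 is still needed: it passes the token filter but does not contain the query as a contiguous substring.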

Recall never regresses

MatchText is token-based and the index may be unavailable, so three cases transparently fall back to the legacy full scan:

  • the index can't be created (old Qdrant version, permissions) — cached, not retried per query;
  • the indexed query raises;
  • nothing survives substring verification (e.g. a mid-token fragment like hil matching Philips, or non-contiguous tokens).
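
The fallback rules above might look roughly like this. It is a sketch under assumptions: `keyword_search` here takes the indexed query and the full scan as injected callables (`indexed_search`, `full_scan`) and an `index_ready` flag, none of which are confirmed to match the real signature, and result ordering is omitted.

```python
from typing import Callable


def substring_hits(candidates: list[str], query: str) -> list[str]:
    """Case-insensitive substring verification."""
    needle = query.lower()
    return [text for text in candidates if needle in text.lower()]


def keyword_search(
    query: str,
    index_ready: bool,
    indexed_search: Callable[[str], list[str]],
    full_scan: Callable[[], list[str]],
) -> list[str]:
    query = query.strip()
    if not query:
        return []  # empty query: nothing to match
    if index_ready:  # case 1: a cached creation failure skips the indexed path
        try:
            hits = substring_hits(indexed_search(query), query)
        except Exception:
            hits = []  # case 2: the indexed query raised
        if hits:
            return hits
    # case 3 (nothing survived verification) also lands here
    return substring_hits(full_scan(), query)


store = ["Philips Hue bulb", "Pasta recipe"]


def token_search(query: str) -> list[str]:
    """Toy whole-token matcher standing in for the MatchText prefilter."""
    tokens = query.lower().split()
    return [t for t in store if all(tok in t.lower().split() for tok in tokens)]


# "hil" is not a token, so the indexed path finds nothing and the scan recovers it.
print(keyword_search("hil", True, token_search, lambda: store))
# → ['Philips Hue bulb']
```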

Worst case (a query with no matches at all) costs one indexed query plus one scan — the scan being the status-quo price. The typical whole-word query transfers only the matching points.

No new dependencies (qdrant-client was already pinned) and no settings changes; the index is created automatically with no operator action.

Tests

8 new tests in tests/test_memory.py (index created once and with the right schema, MatchText+MatchValue filter composition, full-scan skipped when the index hits, token-but-not-substring rejection falls back, mid-token fragment recall via fallback, creation-failure caching, indexed-query failure fallback, empty-query short-circuit) plus an autouse conftest fixture isolating the index-state cache between tests. Existing keyword tests now exercise the scan path explicitly. Full suite: 174 passed, ruff clean.
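
As an illustration of why the cache needs isolating between tests, here is a hedged sketch using only the standard library. The PR uses a pytest autouse fixture; `setUp` plays that role here. `_index_state` and `ensure_index` are hypothetical stand-ins for the real cache and helper, and the mock call omits the text-index schema the real call would pass.

```python
import unittest
from unittest.mock import MagicMock

# Hypothetical process-wide cache: collection name -> whether the index is usable.
_index_state: dict[str, bool] = {}


def ensure_index(client, collection: str) -> bool:
    """Create the index at most once per collection and cache the outcome."""
    if collection not in _index_state:
        try:
            client.create_payload_index(collection_name=collection, field_name="data")
            _index_state[collection] = True
        except Exception:
            _index_state[collection] = False
    return _index_state[collection]


class IndexCacheTests(unittest.TestCase):
    def setUp(self):
        # Without this reset, a failure cached by one test would leak into the next.
        _index_state.clear()

    def test_index_created_once(self):
        client = MagicMock()
        self.assertTrue(ensure_index(client, "memories"))
        self.assertTrue(ensure_index(client, "memories"))
        client.create_payload_index.assert_called_once()

    def test_creation_failure_is_cached(self):
        client = MagicMock()
        client.create_payload_index.side_effect = RuntimeError("unsupported")
        self.assertFalse(ensure_index(client, "memories"))
        self.assertFalse(ensure_index(client, "memories"))
        client.create_payload_index.assert_called_once()
```

Saved as a test module, this can be run with `python -m unittest`.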

Review

Adversarial review surfaced 8 candidates; 2 were fixed (a lock around the index-state cache so racing first searches can't clobber a successful creation with a transient failure's False; an explicit scroll stub in the REST keyword test instead of relying on a MagicMock unpack failure). Case-handling concerns are covered by Qdrant tokenizing queries with the index's lowercase=true config plus the substring fallback.
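
The race fixed in review can be sketched as follows. This is one possible shape, not necessarily the merged one: `ensure_text_index`, `_index_state`, and `_index_lock` are hypothetical names, and holding the lock across the creation call is an assumption made here for simplicity. The `create_index` callable would presumably wrap qdrant-client's `create_payload_index` with a text schema.

```python
import threading
from typing import Callable

_index_state: dict[str, bool] = {}
_index_lock = threading.Lock()


def ensure_text_index(collection: str, create_index: Callable[[], None]) -> bool:
    """Return whether the full-text index is usable, creating it at most once."""
    with _index_lock:
        # Check and write happen under the same lock, so a late failure from a
        # racing caller cannot overwrite an earlier success with False.
        if collection in _index_state:
            return _index_state[collection]
        try:
            create_index()
            _index_state[collection] = True
        except Exception:
            # Cached so that later queries go straight to the scan path.
            _index_state[collection] = False
        return _index_state[collection]


calls = []
results = []


def worker() -> None:
    results.append(ensure_text_index("memories", lambda: calls.append(1)))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), all(results))
# → 1 True
```

The trade-off in this sketch is that concurrent first searches wait for the single creation attempt instead of each issuing their own.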

https://claude.ai/code/session_01H2Dbh6kD8bseWZZEf7kGhx


Generated by Claude Code

claude added 3 commits June 9, 2026 20:23
keyword_search() previously transferred up to 5000 full payloads per query
and substring-matched in Python; past that cap, older memories silently
became unsearchable. Now a full-text index on the data field (created
lazily and idempotently on first keyword search; outcome cached) lets
Qdrant prefilter candidates server-side with MatchText, and the original
substring check verifies them so semantics are unchanged.

MatchText is token-based, so queries that only match inside a word (or
yield no surviving candidates) transparently fall back to the legacy scan:
recall never regresses, and stores on Qdrant versions that cannot create
the index keep working via the scan path.

Closes #64

https://claude.ai/code/session_01H2Dbh6kD8bseWZZEf7kGhx
imonroe merged commit ef21ddc into main on Jun 9, 2026
1 check passed
imonroe deleted the claude/wizardly-planck-pwvl93-i64 branch on Jun 9, 2026 21:09

Development

Successfully merging this pull request may close these issues.

Replace in-Python keyword search scan with a Qdrant full-text payload index

2 participants