feat: cross-session memory persistence with hybrid search (closes #133) by luceinaltis · Pull Request #151 · luceinaltis/qracer

luceinaltis · 2026-04-08T11:22:10Z

Summary

Closes #133.

Wires up the 3-tier memory system so conversation context survives across REPL sessions.

Tier 2 — disk persistence (`qracer/conversation/engine.py`)

ConversationEngine now takes an optional summaries_dir kwarg.
When set, _maybe_compact() calls SessionCompactor.compact_and_save() instead of the in-memory compact(), writing ~/.qracer/summaries/<session_id>.md after the turn log crosses the 8 000-token threshold.
After a successful save, if memory_searcher is also set, the summary is auto-indexed into Tier 3 via MemorySearcher.index_summary(session_id, summary) so the very next session can find it.
Backward compatible: without summaries_dir, behaviour is unchanged (calls compact()).
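
The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `StubCompactor`, `StubSearcher`, the free function `maybe_compact`, and every argument list are assumptions made up for the example. Only the method names `compact()`, `compact_and_save()` and `index_summary(session_id, summary)`, and the 8 000-token threshold, come from the description.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

TOKEN_THRESHOLD = 8_000  # threshold quoted in the PR description


class StubCompactor:
    """Hypothetical stand-in for SessionCompactor; real signatures may differ."""

    def compact(self, log: list[str]) -> str:
        # In-memory path: produce a summary and write nothing to disk.
        return f"summary of {len(log)} turns"

    def compact_and_save(self, log: list[str], session_id: str, out_dir: Path) -> str:
        # Disk path: same summary, also written to <out_dir>/<session_id>.md.
        summary = self.compact(log)
        out_dir.mkdir(parents=True, exist_ok=True)
        (out_dir / f"{session_id}.md").write_text(summary, encoding="utf-8")
        return summary


class StubSearcher:
    """Hypothetical stand-in for MemorySearcher; records what was indexed."""

    def __init__(self) -> None:
        self.indexed: dict[str, str] = {}

    def index_summary(self, session_id: str, summary: str) -> None:
        self.indexed[session_id] = summary


def maybe_compact(
    log: list[str],
    token_count: int,
    session_id: str,
    compactor: StubCompactor,
    summaries_dir: Path | None = None,
    memory_searcher: StubSearcher | None = None,
) -> str | None:
    """Compact once the log crosses the threshold; persist and index if configured."""
    if token_count <= TOKEN_THRESHOLD:
        return None  # nothing to do yet
    if summaries_dir is None:
        return compactor.compact(log)  # unchanged in-memory behaviour
    summary = compactor.compact_and_save(log, session_id, summaries_dir)
    if memory_searcher is not None:
        # Index only after the save succeeded, so the next session can find it.
        memory_searcher.index_summary(session_id, summary)
    return summary


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        searcher = StubSearcher()
        maybe_compact(["turn"] * 3, 9_000, "s1", StubCompactor(), Path(tmp), searcher)
        print(sorted(p.name for p in Path(tmp).iterdir()), searcher.indexed)
```

Keeping the `summaries_dir is None` branch first mirrors the backward-compatibility claim: callers that never pass the new kwarg hit exactly the old path.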

Tier 3 — hybrid search (`qracer/memory/memory_searcher.py`)

New embedding_fn: Callable[[str], list[float]] | None constructor parameter. Leave it None for keyword-only search (existing behaviour); pass any callable (Claude API, sentence-transformers, a stub, …) to get hybrid search.
New session_embeddings (session_id VARCHAR PK, embedding FLOAT[], indexed_at TIMESTAMP) table stores vectors alongside the existing FTS session_index.
_vector_search() runs cosine similarity via DuckDB's native list_cosine_similarity, joining back to session_index for the summary text.
_merge_results() fuses keyword + vector hits with reciprocal rank fusion (k=60), so BM25 and cosine scores combine without normalisation.
search() stays backward compatible: falls back to keyword-only when no embedding_fn is configured.
FTS extension loading is now lazy and cached per process. Pure vector workloads and offline environments no longer pay the extension download cost. When FTS can't be loaded, _keyword_search() degrades to an empty result set (with a warning), and hybrid search still proceeds on vector hits alone.
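
To illustrate why reciprocal rank fusion needs no score normalisation, here is a minimal sketch of the technique on two ranked lists of session IDs. The function name `rrf_merge`, the 1-based ranks, and the alphabetical tie-break are assumptions for the example; the actual `_merge_results()` may differ in all three. Only k=60 is taken from the description.

```python
from collections import defaultdict


def rrf_merge(keyword_ids: list[str], vector_ids: list[str], k: int = 60) -> list[str]:
    """Fuse two best-first rankings with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) per item, with rank starting at 1
    (an assumption). Only positions matter, so raw BM25 and cosine scores
    never have to share a scale.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in (keyword_ids, vector_ids):
        for rank, session_id in enumerate(ranking, start=1):
            scores[session_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep the order stable.
    return sorted(scores, key=lambda sid: (-scores[sid], sid))


if __name__ == "__main__":
    # "b" is ranked 2nd and 1st, "a" is ranked 1st and 3rd, so "b" wins.
    print(rrf_merge(["a", "b", "c"], ["b", "d", "a"]))  # ['b', 'a', 'd', 'c']
```

An item that appears in only one list still receives a score, which is also why an empty keyword list (for example when FTS is unavailable) leaves the vector ranking intact.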

Cross-session loading (`qracer/cli.py`)

qracer repl now creates ~/.qracer/summaries/, instantiates a file-backed MemorySearcher at ~/.qracer/memory_index.duckdb, indexes every Markdown summary under summaries/, and prints ✓ Loaded N past session summaries from … so returning users immediately see how much prior memory is in scope.
Both memory_searcher and summaries_dir are passed into the ConversationEngine, closing the loop.
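
The startup pass over the summaries directory might look something like the sketch below. `load_past_summaries` is a hypothetical helper, and deriving the session ID from the file stem is an assumption based on the `<session_id>.md` naming above; the real CLI code may be organised differently. The `index_summary` argument is any callable taking `(session_id, summary)`.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_past_summaries(
    summaries_dir: Path,
    index_summary: Callable[[str, str], None],
) -> int:
    """Index every Markdown summary in summaries_dir; return how many were loaded."""
    # Create the directory on first run so later saves have somewhere to go.
    summaries_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(summaries_dir.glob("*.md")):
        # Assumption: the file stem is the session ID.
        index_summary(path.stem, path.read_text(encoding="utf-8"))
        count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "summaries"
        root.mkdir()
        (root / "s1.md").write_text("first session", encoding="utf-8")
        seen: dict[str, str] = {}
        n = load_past_summaries(root, seen.__setitem__)
        print(n, seen)
```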

Tests

tests/memory/test_memory_searcher.py — new TestHybridSearch class covering has_embeddings flag, embedding row storage, semantic vector search, hybrid search(), _merge_results() RRF ordering, removal from both tables, empty vector branch when no embedding_fn, and graceful handling when the embedding function raises.
tests/conversation/test_engine.py — new TestCompactionPersistence class with two tests: (1) _maybe_compact() writes the Markdown file and auto-indexes the row when summaries_dir + memory_searcher are set; (2) without summaries_dir, it still calls the in-memory compact() path for backward compatibility.
tests/conversation/test_topic_resolver.py — updated the existing FTS-availability skip pattern to pre-warm FTS now that MemorySearcher() construction no longer triggers it.
docs/memory-system.md: removed the 구현 예정 ("to be implemented") markers and described the new wiring.

Results

uv run pytest → 636 passed, 13 skipped (skipped tests require DuckDB's fts extension, which this sandbox can't download; they run on CI with internet access).
uv run ruff check . → clean.
uv run ruff format --check → clean.
uv run pyright → 0 errors.

How to test

uv run pytest tests/memory/test_memory_searcher.py tests/memory/test_session_compactor.py tests/conversation/test_engine.py -v
uv run ruff check . && uv run ruff format --check . && uv run pyright

End-to-end smoke test:

uv run qracer repl
# have a conversation long enough to trigger compaction (> 8 000 tokens of log)
ls ~/.qracer/summaries/            # new markdown file
uv run qracer repl
# startup should print: ✓ Loaded 1 past session summaries from …

Wire up the 3-tier memory system so conversation context survives across REPL sessions:

- Tier 2: ConversationEngine._maybe_compact() now calls compact_and_save() when a summaries directory is configured, writing a Markdown summary to disk and auto-indexing it into Tier 3.
- Tier 3: MemorySearcher gains an optional embedding_fn callable. When provided, summaries are embedded and stored in a new session_embeddings table; search() fuses BM25 keyword hits with cosine-similarity vector hits via reciprocal rank fusion. Keyword-only and vector-only search paths still work on their own.
- Cross-session loading: qracer repl instantiates a file-backed MemorySearcher at ~/.qracer/memory_index.duckdb, indexes every Markdown file in ~/.qracer/summaries/ on startup, and reports how many past contexts were loaded.

Side improvements driven by the tests:

- FTS extension loading is now lazy (deferred to first keyword search) and cached per process, so pure-vector and offline workflows aren't blocked by a missing fts extension.
- _keyword_search() degrades gracefully to an empty result set when the FTS extension can't be loaded, allowing vector-only hybrid search to proceed.

Updated docs/memory-system.md to reflect the new wiring.

luceinaltis force-pushed the feat/133-cross-session-memory-persistence branch from 63b581b to 343b9bd Compare April 8, 2026 14:45

luceinaltis merged commit e9cf5e7 into main Apr 8, 2026
3 checks passed

luceinaltis deleted the feat/133-cross-session-memory-persistence branch April 8, 2026 14:47

luceinaltis merged 1 commit into main from feat/133-cross-session-memory-persistence
