The knowledge-graph pipeline answers "what shape is this corpus?" but not "given a query, what are the most relevant moments in it?". Three concrete needs converge on the same missing capability, chunk-level retrieval:
The full implementation across embedder, retriever, synthesis rewiring, and UI is non-trivial. A small PoC keeps the cost down and protects against committing to the wrong choices (model, store, distribution).
Hypothesis to validate (PoC)
Retrieval quality: For a sample of ~100 real sessions, hybrid retrieval (BM25 + dense embedding) returns turns that a human judges more relevant to a Skill candidate than the current cluster-blob context.
Footprint: Index size at turn granularity, even unquantized, fits within a deployable budget for the static frontend (or, failing that, fits comfortably in skill-server memory).
Latency: Embedding generation runs in minutes, not hours, on a developer laptop. Query latency is sub-100ms for a few thousand chunks.
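As a rough sanity check on the footprint hypothesis, the unquantized index size can be estimated before any code is written. The sketch below assumes 384-dimensional float32 vectors (the output size of both candidate models), 5,000 chunks as a stand-in for "a few thousand", and no per-chunk metadata or BM25 term index. The function name `indexSizeBytes` is illustrative, and the int8 figure is only a comparison point, not something the PoC requires.

```typescript
// Back-of-envelope footprint for a flat embedding index.
// Assumptions (not measured): 384 dims, 5,000 chunks, vectors only.
function indexSizeBytes(chunks: number, dims: number, bytesPerValue = 4): number {
  return chunks * dims * bytesPerValue;
}

const chunkCount = 5_000; // stand-in for "a few thousand chunks"
const dims = 384;         // bge-small-en-v1.5 and MiniLM-L12-v2 both emit 384 dims

const float32Bytes = indexSizeBytes(chunkCount, dims);  // unquantized
const int8Bytes = indexSizeBytes(chunkCount, dims, 1);  // hypothetical int8 variant

console.log(`float32: ${(float32Bytes / 1e6).toFixed(2)} MB`); // float32: 7.68 MB
console.log(`int8:    ${(int8Bytes / 1e6).toFixed(2)} MB`);    // int8:    1.92 MB
```

The PoC measurement should still be taken from the real serialized index, since chunk text, IDs, and the lexical index add to this number.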
PoC scope (do this in a branch, in this issue)
Pick ~100 sessions from a real ~/.claude/projects/ directory.
Generate turn-level embeddings with one local model (bge-small-en-v1.5 or paraphrase-multilingual-MiniLM-L12-v2 via Transformers.js).
Implement a flat-search (brute-force) retriever; ~5k chunks do not need an approximate-nearest-neighbor index.
For 5 Skill candidates from that corpus, build two contexts:
A: today's cluster-blob
B: top-k turns from hybrid retrieval (BM25 + dense)
Eyeball B vs A for relevance. Optionally feed both into claude -p and compare resulting Skill markdown.
Measure index size, retrieval latency p50/p95, embedding throughput.
Output: short writeup at docs/rag-poc.md with answers + numbers + a recommendation.
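The retrieval and measurement steps above could be prototyped roughly as follows. This is a minimal sketch under several assumptions, not the planned implementation: embeddings are precomputed and L2-normalized (so a dot product equals cosine similarity), BM25 uses the common defaults k1 = 1.2 and b = 0.75, and the two rankings are combined with reciprocal rank fusion, which the issue does not specify. The `Chunk` shape and all function names are hypothetical.

```typescript
// Hypothetical chunk: one conversation turn plus its precomputed, normalized embedding.
interface Chunk { id: string; text: string; vector: Float32Array; }

const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9_]+/g) ?? [];

// Okapi BM25 score of every chunk against the query.
function bm25Scores(chunks: Chunk[], query: string, k1 = 1.2, b = 0.75): number[] {
  const docs = chunks.map((c) => tokenize(c.text));
  const avgLen = docs.reduce((n, d) => n + d.length, 0) / Math.max(docs.length, 1);
  const df = new Map<string, number>(); // document frequency per term
  docs.forEach((d) => Array.from(new Set(d)).forEach((t) => df.set(t, (df.get(t) ?? 0) + 1)));
  const terms = Array.from(new Set(tokenize(query)));
  return docs.map((d) => {
    let score = 0;
    for (const t of terms) {
      const tf = d.filter((w) => w === t).length;
      if (tf === 0) continue;
      const n = df.get(t) ?? 0;
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * d.length) / avgLen));
    }
    return score;
  });
}

// Flat dense search: dot product against every chunk, no index structure.
function denseScores(chunks: Chunk[], q: Float32Array): number[] {
  return chunks.map((c) => {
    let dot = 0;
    for (let i = 0; i < q.length; i++) dot += c.vector[i] * q[i];
    return dot;
  });
}

// Rank position (0 = best) of each item given its score.
function ranks(scores: number[]): number[] {
  const order = scores.map((_, i) => i).sort((a, b) => scores[b] - scores[a]);
  const r = new Array<number>(scores.length);
  order.forEach((idx, pos) => { r[idx] = pos; });
  return r;
}

// Reciprocal rank fusion of the lexical and dense rankings (an assumed fusion rule).
function hybridSearch(chunks: Chunk[], query: string, qVec: Float32Array, topK = 5, rrfK = 60): Chunk[] {
  const lex = ranks(bm25Scores(chunks, query));
  const den = ranks(denseScores(chunks, qVec));
  return chunks
    .map((c, i) => ({ c, s: 1 / (rrfK + lex[i] + 1) + 1 / (rrfK + den[i] + 1) }))
    .sort((a, b) => b.s - a.s)
    .slice(0, topK)
    .map((x) => x.c);
}

// Nearest-rank percentile, used for p50/p95 latency.
function percentile(samples: number[], p: number): number {
  const sorted = samples.slice().sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(Math.max(idx, 0), sorted.length - 1)];
}

// Toy data with 2-d vectors; real vectors would come from the embedding model.
const demoChunks: Chunk[] = [
  { id: "a", text: "fix failing jest test by mocking fetch", vector: new Float32Array([1, 0]) },
  { id: "b", text: "rename variable inside config loader", vector: new Float32Array([0, 1]) },
  { id: "c", text: "jest snapshot update after refactor", vector: new Float32Array([0.6, 0.8]) },
];
const demoQuery = "mock fetch in jest test";
const demoQueryVec = new Float32Array([1, 0]);

const latenciesMs: number[] = [];
for (let i = 0; i < 200; i++) {
  const t0 = performance.now();
  hybridSearch(demoChunks, demoQuery, demoQueryVec, 2);
  latenciesMs.push(performance.now() - t0);
}
console.log(hybridSearch(demoChunks, demoQuery, demoQueryVec, 2).map((c) => c.id));
console.log(`p50=${percentile(latenciesMs, 50).toFixed(3)}ms p95=${percentile(latenciesMs, 95).toFixed(3)}ms`);
```

One caveat with rank fusion in this form: chunks that match no query term all score zero under BM25, so their lexical rank is decided by input order. For the PoC it may be worth filtering those out of the lexical ranking or comparing against a weighted score sum.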
Decision gate
Implementation issues (gated on this PoC)
Related issues
Risk to call out