[Epic] Evaluate adding a RAG layer on top of the knowledge graph (PoC-gated rollout) #35

@chigichan24

Why

Hypotheses to validate (PoC)

  1. Retrieval quality: For a sample of ~100 real sessions, hybrid retrieval (BM25 + dense embeddings) returns turns that a human judges more relevant to a Skill candidate than the current cluster-blob context.
  2. Footprint: Index size at turn granularity, even unquantized, fits within a deployable budget for the static frontend (or, failing that, fits comfortably in skill-server memory).
  3. Latency: Embedding generation for the sample runs in minutes, not hours, on a developer laptop. Query latency stays under 100 ms when searching a few thousand chunks.
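For hypotheses 2 and 3, the numbers can be sanity-checked with two small helpers. This is a minimal sketch, not part of the issue: the function names are hypothetical, the 5,000-chunk figure is the rough corpus size used in the PoC scope, and 384 is the embedding width of both candidate models. The size estimate covers raw vectors only and ignores chunk text and metadata.

```typescript
// Hypothetical helpers for the PoC writeup; names are illustrative only.

/** Raw size in bytes of a flat vector index (vectors only, no metadata). */
function flatIndexBytes(chunks: number, dims: number, bytesPerValue: number = 4): number {
  return chunks * dims * bytesPerValue;
}

/** Nearest-rank percentile of latency samples, with p in (0, 100]. */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// ~5k turns, 384-dim float32 vectors: 5000 * 384 * 4 = 7,680,000 bytes (~7.7 MB).
const float32IndexBytes = flatIndexBytes(5000, 384);
// Same index with 1 byte per value (e.g. int8 quantization): 1,920,000 bytes.
const int8IndexBytes = flatIndexBytes(5000, 384, 1);
```

Under these assumptions the unquantized index is a few megabytes, so the footprint question mostly comes down to the embedding model download and the chunk text rather than the vectors themselves.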

PoC scope (do this in a branch, in this issue)

  • Pick ~100 sessions from a real ~/.claude/projects/ directory.
  • Generate turn-level embeddings with one local model (bge-small-en-v1.5 or paraphrase-multilingual-MiniLM-L12-v2 via Transformers.js).
  • Implement a flat-search retriever — no fancy index needed for ~5k chunks.
  • For 5 Skill candidates from that corpus, build two contexts:
    • A: today's cluster-blob
    • B: top-k turns from hybrid retrieval (BM25 + dense)
  • Eyeball B vs A for relevance. Optionally feed both into claude -p and compare resulting Skill markdown.
  • Measure index size, retrieval latency p50/p95, embedding throughput.
  • Output: short writeup at docs/rag-poc.md with answers + numbers + a recommendation.
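The flat-search hybrid retriever from the scope above could look roughly like the sketch below. Several things here are assumptions rather than anything the issue specifies: the `Chunk` shape and all function names are hypothetical, turns are assumed to be tokenized already, embeddings are assumed to be precomputed by the chosen local model, and the two rankings are merged with reciprocal rank fusion (the issue does not say how BM25 and dense scores should be combined).

```typescript
// One chunk = one conversation turn with a precomputed embedding (hypothetical shape).
interface Chunk {
  id: string;
  tokens: string[];    // lower-cased, pre-tokenized turn text
  embedding: number[]; // e.g. 384 floats from the local model
}

/** Cosine similarity between two equal-length vectors. */
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

/** Okapi BM25 score of every chunk for the query (common default k1 and b). */
function bm25Scores(query: string[], chunks: Chunk[], k1 = 1.2, b = 0.75): number[] {
  const n = chunks.length;
  const avgLen = chunks.reduce((s, c) => s + c.tokens.length, 0) / Math.max(n, 1);
  const terms = query.filter((t, i) => query.indexOf(t) === i); // dedupe query terms
  // Document frequency and IDF per query term; brute force is fine at ~5k chunks.
  const idf = terms.map((term) => {
    const df = chunks.filter((c) => c.tokens.indexOf(term) !== -1).length;
    return Math.log(1 + (n - df + 0.5) / (df + 0.5));
  });
  return chunks.map((c) => {
    let score = 0;
    terms.forEach((term, j) => {
      const tf = c.tokens.filter((t) => t === term).length;
      if (tf === 0) return;
      const norm = tf + k1 * (1 - b + (b * c.tokens.length) / avgLen);
      score += (idf[j] * tf * (k1 + 1)) / norm;
    });
    return score;
  });
}

/** 1-based rank of each position when sorted by descending score. */
function ranks(scores: number[]): number[] {
  const order = scores.map((_, i) => i).sort((x, y) => scores[y] - scores[x]);
  const r = new Array<number>(scores.length);
  order.forEach((idx, pos) => { r[idx] = pos + 1; });
  return r;
}

/** Hybrid top-k: fuse BM25 and dense rankings with reciprocal rank fusion. */
function hybridTopK(
  queryTokens: string[], queryEmbedding: number[], chunks: Chunk[], k: number, rrfK = 60
): Chunk[] {
  const sparse = ranks(bm25Scores(queryTokens, chunks));
  const dense = ranks(chunks.map((c) => cosine(queryEmbedding, c.embedding)));
  return chunks
    .map((c, i) => ({ c, s: 1 / (rrfK + sparse[i]) + 1 / (rrfK + dense[i]) }))
    .sort((x, y) => y.s - x.s)
    .slice(0, k)
    .map((x) => x.c);
}

// Toy usage with made-up 2-dim embeddings (real ones would come from the model).
const sampleChunks: Chunk[] = [
  { id: "a", tokens: ["fix", "flaky", "test"], embedding: [1, 0] },
  { id: "b", tokens: ["deploy", "static", "frontend"], embedding: [0, 1] },
  { id: "c", tokens: ["flaky", "test", "retry", "logic"], embedding: [0.9, 0.1] },
];
const topTurns = hybridTopK(["flaky", "test"], [1, 0], sampleChunks, 2);
```

Rank fusion is used here because it avoids having to normalize BM25 and cosine scores onto a common scale; a weighted score sum would be an equally reasonable choice for the PoC. Timing repeated `hybridTopK` calls over the real corpus gives the samples needed for the p50/p95 latency measurement.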

Decision gate

Implementation issues (gated on this PoC)

Related issues

Risk to call out

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), question (Further information is requested)

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
