34 commits
5d92b61
doc: Adds RAG to the roadmap
nka11 Feb 24, 2026
6005756
docs(m8): add implementation plan for pluggable vector store
nka11 Feb 26, 2026
e6a1b79
feat(m8): convert rust/ to Cargo workspace
nka11 Feb 26, 2026
c10d70f
feat(m8): add vector_store crate with VectorStore trait and types
nka11 Feb 26, 2026
67bba6d
feat(m8): implement InMemoryVectorStore with HNSW backend and tests
nka11 Feb 26, 2026
9b9968b
chore(m8): update Cargo.lock with vector_store dependencies
nka11 Feb 26, 2026
7b23983
Merge pull request 'feat/m8-vector-store' (#15) from feat/m8-vector-s…
Feb 26, 2026
a66f4e3
plan(m9): add RAG pipeline implementation plan
nka11 Feb 26, 2026
fa1758b
feat(m9): scaffold rag_pipeline crate with module declarations
nka11 Feb 26, 2026
8b30063
feat(m9): add EmbeddingProvider trait and MockEmbeddingProvider
nka11 Feb 26, 2026
ae3084a
feat(m9): add RDF canonicalization for deterministic chunk text
nka11 Feb 26, 2026
9f0a53c
feat(m9): add Reranker trait and PassThroughReranker
nka11 Feb 26, 2026
a925a56
feat(m9): add ContextCompressor trait and TruncatingCompressor
nka11 Feb 26, 2026
feece8e
feat(m9): add RagPipeline orchestrator with integration tests
nka11 Feb 26, 2026
38d64bf
chore(m9): update Cargo.lock with rag_pipeline dependencies
nka11 Feb 26, 2026
25b5296
plan(m10): add agent orchestrator implementation plan
nka11 Feb 26, 2026
579e249
feat(m10): scaffold agent_orchestrator crate with core types
nka11 Feb 26, 2026
7f857fd
feat(m10): add SparqlTool agent tool wrapper
nka11 Feb 26, 2026
33b26af
feat(m10): add RagTool agent tool wrapper
nka11 Feb 26, 2026
0a62d02
feat(m10): add CodegenTool and LlmClient trait
nka11 Feb 26, 2026
017bec0
feat(m10): add prompt contract for citation validation and assembly
nka11 Feb 26, 2026
be1990f
feat(m10): add AgentRouter with planner/router and integration tests
nka11 Feb 26, 2026
d4af6c0
feat(m10): wire agent orchestrator into MCP server as agent_query tool
nka11 Feb 26, 2026
fae2bf0
chore(m10): update Cargo.lock with agent_orchestrator dependencies
nka11 Feb 26, 2026
1e8e343
plan(m10.5): add graph-to-vector indexer implementation plan
nka11 Feb 27, 2026
d37bb00
feat(m10.5): add HttpEmbeddingProvider for OpenAI-compatible endpoints
nka11 Feb 27, 2026
24511a0
refactor(m10.5): change RagPipeline from Box to Arc for shared state
nka11 Feb 27, 2026
5935841
feat(m10.5): add GraphIndexer for triplestore-to-vector pipeline
nka11 Feb 27, 2026
ea1c028
feat(m10.5): wire shared state and add index_graph MCP tool
nka11 Feb 28, 2026
f1506dd
feat(m10.5): add Qdrant docker-compose and MCP config
nka11 Feb 28, 2026
0fc4566
feat(m10.5): add QdrantVectorStore backend
nka11 Feb 28, 2026
64b6100
feat: add Python code loader (M3)
Feb 23, 2026
5c2491a
docs: update PLAN.md and TASKS.md to reflect current progress
nka11 Feb 28, 2026
7a79d9f
fix: address top 5 code quality issues from review
nka11 Feb 28, 2026
7 changes: 5 additions & 2 deletions .mcp.json
@@ -1,8 +1,11 @@
 {
   "mcpServers": {
     "semantic-code-mcp": {
-      "command": "cargo",
-      "args": ["run", "--manifest-path", "./rust/Cargo.toml"]
+      "command": "./rust/target/release/semantic-code-mcp",
+      "env": {
+        "OXIGRAPH_STORE_PATH": "/tmp/oxigraph-test-store",
+        "QDRANT_URL": "http://localhost:6333"
+      }
     }
   }
 }
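Review note: the new `QDRANT_URL` entry ties in with the PLAN.md item in this change set that says Qdrant is auto-selected when `QDRANT_URL` is set, with a fallback to the in-memory store. A minimal sketch of that selection logic follows. The `VectorBackend` enum, both function names, and the rule that a blank value counts as unset are assumptions made for illustration and are not taken from the PR.

```rust
use std::env;

/// Which vector backend to start. Hypothetical enum, not from the PR.
#[derive(Debug, Clone, PartialEq)]
pub enum VectorBackend {
    Qdrant { url: String },
    InMemory,
}

/// Pure selection logic, kept apart from the environment read so it is easy to test.
/// Assumption: an empty or whitespace-only value is treated as unset.
pub fn select_backend(qdrant_url: Option<&str>) -> VectorBackend {
    match qdrant_url.map(str::trim) {
        Some(url) if !url.is_empty() => VectorBackend::Qdrant {
            url: url.to_string(),
        },
        _ => VectorBackend::InMemory,
    }
}

/// Reads `QDRANT_URL` from the process environment and applies the rule above.
pub fn backend_from_env() -> VectorBackend {
    select_backend(env::var("QDRANT_URL").ok().as_deref())
}
```

Keeping the decision in a pure function means the fallback path can be unit-tested without mutating the process environment.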
72 changes: 66 additions & 6 deletions PLAN.md
@@ -35,10 +35,10 @@ Plugin system foundation plus the first code loader.
- Module hierarchy resolution
- Manual testing: load a Rust project and query its structure via SPARQL

-### M3 — Python Loader
+### M3 — Python Loader
 - Implement Python loader (`load_python_code`):
-  - pyproject.toml / setup.py / requirements.txt parsing
-  - `.py` file AST extraction (modules, functions, classes, decorators, imports, docstrings)
+  - pyproject.toml parsing (project metadata, dependencies)
+  - `.py` file AST extraction via `rustpython-parser` (modules, functions, classes, decorators, imports, type annotations, docstrings, async)
- Manual testing: load a Python project and query its structure via SPARQL

### M4 — TypeScript Loader ✅
@@ -47,13 +47,13 @@ Plugin system foundation plus the first code loader.
- `.ts`/`.tsx`/`.js`/`.jsx` file AST extraction via `oxc_parser` (modules, functions, classes, interfaces, type aliases, enums, imports/exports, JSDoc)
- Manual testing: load a TypeScript project and query its structure via SPARQL

-### M5 — Testing and Documentation
+### M5 — Testing and Documentation
 - Integration tests for generic RDF tools
-- Integration tests for each language loader
+- Integration tests for each language loader (Rust, TypeScript, Python)
- User-facing README with installation and usage instructions
- Claude Code MCP configuration examples

-### M6 — Git History Loader
+### M6 — Git History Loader
Load git commit history into the RDF knowledge graph, enabling queries that join code structure with change history.

- Add `git2` crate (libgit2 bindings) for native repository access
@@ -74,3 +74,63 @@ Load git commit history into the RDF knowledge graph, enabling queries that join
- Namespace prefix management (`list_namespaces`, `add_namespace`)
- Bulk loading with progress reporting
- Additional language loaders (Go, Java, C/C++, etc.)

+### M8 — Pluggable Vector Store ✅
+Introduce the `VectorStore` trait and in-memory default backend as a separate crate.
+
+- Restructure `rust/` as a Cargo workspace
+- Create `crates/vector_store/` crate
+- Define `VectorStore` async trait, `RagChunk`, `SearchHit`, `Filter` types
+- Implement `InMemoryVectorStore` using `hnsw_rs` + `dashmap`
+- Unit tests: upsert, delete, search, filtering
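Review note: to make the M8 contract concrete, here is a rough sketch of what a `VectorStore`-style trait with upsert, delete, search, and filtering could look like. It is not the crate's code. The trait here is synchronous so the example runs with the standard library alone (the plan describes an async trait), every field name is an assumption, and the backend is a brute-force cosine scan named `BruteForceStore`, swapped in for the `hnsw_rs` index the plan names.

```rust
use std::collections::HashMap;

/// A stored chunk. Field names are assumptions, not the crate's definitions.
#[derive(Clone, Debug, PartialEq)]
pub struct RagChunk {
    pub id: String,
    pub text: String,
    pub embedding: Vec<f32>,
    pub graph: String,
}

/// One search result: chunk id plus similarity score.
#[derive(Clone, Debug, PartialEq)]
pub struct SearchHit {
    pub id: String,
    pub score: f32,
}

/// Optional restriction to a single named graph (hypothetical filter shape).
#[derive(Clone, Debug, Default)]
pub struct Filter {
    pub graph: Option<String>,
}

/// Synchronous stand-in for the async trait described in the plan.
pub trait VectorStore {
    fn upsert(&mut self, chunk: RagChunk);
    fn delete(&mut self, id: &str) -> bool;
    fn search(&self, query: &[f32], top_k: usize, filter: &Filter) -> Vec<SearchHit>;
}

/// Exhaustive-scan backend used only for this illustration (not HNSW).
#[derive(Default)]
pub struct BruteForceStore {
    chunks: HashMap<String, RagChunk>,
}

/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

impl VectorStore for BruteForceStore {
    fn upsert(&mut self, chunk: RagChunk) {
        // Inserting under an existing id replaces the previous chunk.
        self.chunks.insert(chunk.id.clone(), chunk);
    }

    fn delete(&mut self, id: &str) -> bool {
        self.chunks.remove(id).is_some()
    }

    fn search(&self, query: &[f32], top_k: usize, filter: &Filter) -> Vec<SearchHit> {
        let mut hits: Vec<SearchHit> = self
            .chunks
            .values()
            .filter(|c| filter.graph.as_ref().map_or(true, |g| &c.graph == g))
            .map(|c| SearchHit {
                id: c.id.clone(),
                score: cosine(query, &c.embedding),
            })
            .collect();
        // Highest score first, then keep the best `top_k`.
        hits.sort_by(|a, b| b.score.total_cmp(&a.score));
        hits.truncate(top_k);
        hits
    }
}
```

An exhaustive scan is exact but linear in the number of chunks, which is the cost an approximate index such as HNSW is meant to avoid.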

+### M9 — RAG Pipeline ✅
+Build the retrieval pipeline connecting embeddings to the vector store.
+
+- Create `crates/rag_pipeline/` crate
+- Implement embedding provider abstraction (`EmbeddingProvider` trait, `MockEmbeddingProvider`, `HttpEmbeddingProvider`)
+- Implement RDF canonicalization for chunk text generation
+- Implement retrieval: embed query → `VectorStore.search()` → return ranked chunks
+- Pass-through reranker (extensible via `Reranker` trait)
+- Context compression via `TruncatingCompressor`
+- Integration tests with in-memory vector store
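Review note: the M9 flow (embed query, search, rerank, compress) can be sketched end to end as below. The trait and struct names follow the plan, but every signature is an assumption. The embedder is a toy byte-bucket counter, the search step is a plain dot-product scan over a slice rather than a `VectorStore`, and the compressor's whole-chunk character budget is a guess at what "truncating" means.

```rust
/// Hypothetical shape of the embedding abstraction named in the plan.
pub trait EmbeddingProvider {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Deterministic toy embedder: counts bytes into `dims` buckets.
pub struct MockEmbeddingProvider {
    pub dims: usize,
}

impl EmbeddingProvider for MockEmbeddingProvider {
    fn embed(&self, text: &str) -> Vec<f32> {
        let mut v = vec![0.0f32; self.dims];
        if self.dims == 0 {
            return v;
        }
        for b in text.bytes() {
            v[b as usize % self.dims] += 1.0;
        }
        v
    }
}

#[derive(Clone, Debug, PartialEq)]
pub struct Chunk {
    pub id: String,
    pub text: String,
}

pub trait Reranker {
    fn rerank(&self, query: &str, chunks: Vec<Chunk>) -> Vec<Chunk>;
}

/// Leaves the retrieval order untouched.
pub struct PassThroughReranker;

impl Reranker for PassThroughReranker {
    fn rerank(&self, _query: &str, chunks: Vec<Chunk>) -> Vec<Chunk> {
        chunks
    }
}

pub trait ContextCompressor {
    fn compress(&self, chunks: Vec<Chunk>) -> Vec<Chunk>;
}

/// Keeps whole chunks, in order, until the character budget would be exceeded.
pub struct TruncatingCompressor {
    pub max_chars: usize,
}

impl ContextCompressor for TruncatingCompressor {
    fn compress(&self, chunks: Vec<Chunk>) -> Vec<Chunk> {
        let mut used = 0;
        let mut out = Vec::new();
        for c in chunks {
            let n = c.text.chars().count();
            if used + n > self.max_chars {
                break;
            }
            used += n;
            out.push(c);
        }
        out
    }
}

/// Embed the query, score every chunk by dot product, keep `top_k`, rerank, compress.
pub fn retrieve(
    embedder: &dyn EmbeddingProvider,
    corpus: &[Chunk],
    query: &str,
    top_k: usize,
    reranker: &dyn Reranker,
    compressor: &dyn ContextCompressor,
) -> Vec<Chunk> {
    let q = embedder.embed(query);
    let mut scored: Vec<(f32, &Chunk)> = corpus
        .iter()
        .map(|c| {
            let e = embedder.embed(&c.text);
            let dot: f32 = q.iter().zip(&e).map(|(a, b)| a * b).sum();
            (dot, c)
        })
        .collect();
    // Highest score first; ties broken by id so the order is deterministic.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0).then_with(|| a.1.id.cmp(&b.1.id)));
    let top: Vec<Chunk> = scored
        .into_iter()
        .take(top_k)
        .map(|(_, c)| c.clone())
        .collect();
    compressor.compress(reranker.rerank(query, top))
}
```

Because each stage sits behind its own trait, a real reranker or compressor could replace the pass-through and truncating versions without touching `retrieve`.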

+### M10 — Agent Orchestrator ✅
+Planner/router that dispatches to SPARQL, RAG, and codegen tools.
+
+- Create `crates/agent_orchestrator/` crate
+- Define `AgentTool` trait and `ToolInput` / `ToolOutput` types
+- Implement `SparqlTool`, `RagTool`, `CodegenTool` wrappers
+- Implement planner/router logic (`AgentRouter`)
+- Enforce prompt contract (IRI citation, chunk ID citation, grounding)
+- Wire orchestrator into the MCP server as `agent_query` tool
+- Integration tests with mock LLM
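Review note: as an illustration of the M10 pieces, the sketch below shows a tool trait, a router that picks a tool by name, and a citation check. All of it is hypothetical. The keyword-based `plan` heuristic, the `EchoTool` stand-in, and the `[chunk:ID]` citation syntax are invented for the example, and the PR's `AgentRouter` may plan quite differently (for instance through an LLM).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct ToolInput {
    pub query: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolOutput {
    pub tool: String,
    pub content: String,
}

/// Assumed minimal tool interface; the real trait may differ.
pub trait AgentTool {
    fn name(&self) -> &str;
    fn run(&self, input: &ToolInput) -> ToolOutput;
}

/// Stand-in tool that reports which tool handled the query.
pub struct EchoTool(pub &'static str);

impl AgentTool for EchoTool {
    fn name(&self) -> &str {
        self.0
    }
    fn run(&self, input: &ToolInput) -> ToolOutput {
        ToolOutput {
            tool: self.0.to_string(),
            content: format!("{} handled: {}", self.0, input.query),
        }
    }
}

#[derive(Default)]
pub struct AgentRouter {
    tools: HashMap<String, Box<dyn AgentTool>>,
}

impl AgentRouter {
    pub fn register(&mut self, tool: Box<dyn AgentTool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    /// Toy planner: SPARQL keywords go to "sparql", generation requests to
    /// "codegen", and everything else to "rag".
    pub fn plan(&self, query: &str) -> &'static str {
        let q = query.trim_start().to_ascii_lowercase();
        if ["select ", "construct ", "prefix "].iter().any(|k| q.starts_with(k)) {
            "sparql"
        } else if q.contains("generate") {
            "codegen"
        } else {
            "rag"
        }
    }

    /// Runs the planned tool, or returns `None` if it is not registered.
    pub fn dispatch(&self, input: &ToolInput) -> Option<ToolOutput> {
        self.tools.get(self.plan(&input.query)).map(|t| t.run(input))
    }
}

/// Returns cited chunk ids (written as `[chunk:ID]`) that were never retrieved.
pub fn unknown_citations(answer: &str, known_ids: &[&str]) -> Vec<String> {
    let marker = "[chunk:";
    let mut unknown = Vec::new();
    let mut rest = answer;
    while let Some(start) = rest.find(marker) {
        let after = &rest[start + marker.len()..];
        match after.find(']') {
            Some(end) => {
                let id = &after[..end];
                if !known_ids.contains(&id) {
                    unknown.push(id.to_string());
                }
                rest = &after[end + 1..];
            }
            None => break,
        }
    }
    unknown
}
```

A non-empty result from `unknown_citations` would be one way to reject an answer as ungrounded before it is returned to the caller.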

+### M10.5 — Graph Indexer & Qdrant Backend ✅
+Bridge the triplestore and the vector store; add production-grade vector DB support.
+
+- Implement `GraphIndexer` (SPARQL → canonicalize → embed → upsert) in `rag_pipeline`
+- Wire `index_graph` as MCP tool
+- Add `HttpEmbeddingProvider` for OpenAI-compatible embedding endpoints
+- Implement `QdrantVectorStore` in `crates/vector_store/src/qdrant.rs` (gRPC via `qdrant-client`)
+- Auto-select Qdrant when `QDRANT_URL` is set, fall back to in-memory
+- Add `docker-compose.yml` for Qdrant (v1.13.2, REST + gRPC)
+- Setup docs in `docs/qdrant-setup.md`
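Review note: the canonicalize, embed, and upsert steps of the `GraphIndexer` flow can be sketched as follows, with the SPARQL step replaced by an in-memory slice of triples. The `Triple` type, the one-chunk-per-subject grouping, the N-Triples-like rendering (not valid N-Triples), and the `BTreeMap` standing in for the vector store are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Simplified triple with plain string terms (hypothetical type).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Triple {
    pub s: String,
    pub p: String,
    pub o: String,
}

impl Triple {
    pub fn new(s: &str, p: &str, o: &str) -> Self {
        Triple {
            s: s.to_string(),
            p: p.to_string(),
            o: o.to_string(),
        }
    }
}

/// Renders one text chunk per subject. Triples are sorted and deduplicated
/// first, so the output does not depend on the order the store returned them in.
pub fn canonicalize(triples: &[Triple]) -> BTreeMap<String, String> {
    let mut sorted: Vec<&Triple> = triples.iter().collect();
    sorted.sort();
    sorted.dedup();
    let mut out: BTreeMap<String, String> = BTreeMap::new();
    for t in sorted {
        let text = out.entry(t.s.clone()).or_default();
        text.push_str(&format!("<{}> <{}> {} .\n", t.s, t.p, t.o));
    }
    out
}

/// Canonicalizes, embeds each chunk with the supplied function, and upserts it
/// keyed by subject. Returns the number of chunks written.
pub fn index_graph<E>(
    triples: &[Triple],
    embed: E,
    store: &mut BTreeMap<String, (String, Vec<f32>)>,
) -> usize
where
    E: Fn(&str) -> Vec<f32>,
{
    let chunks = canonicalize(triples);
    let written = chunks.len();
    for (subject, text) in chunks {
        let vector = embed(&text);
        // Same subject key on re-index means the old entry is overwritten.
        store.insert(subject, (text, vector));
    }
    written
}
```

Deterministic chunk text matters here because re-indexing an unchanged graph then yields identical text per subject, so upserts overwrite entries instead of accumulating near-duplicates.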

+### M11 — External Vector DB Adapters
+Additional production vector database adapters and configuration.
+
+- Implement Milvus adapter (`milvus` feature flag)
+- TOML-based configuration for backend selection
+- Integration tests with containerized Qdrant/Milvus
+- Fallback-to-inmemory on adapter failure

+### M12 — Observability & Production Hardening
+Tracing, metrics, security, and multi-tenancy.
+
+- Add `request_id` propagation and structured tracing spans
+- Implement Precision@K and Faithfulness evaluation hooks
+- Implement PII redaction before embedding
+- Implement namespace isolation and graph-level ACL filtering
+- Hash-based chunk deduplication
+- Performance benchmarks (latency, recall, memory)
20 changes: 20 additions & 0 deletions README.md
@@ -120,6 +120,26 @@ cargo test
| AST Parsing | syn 2.x |
| Transport | stdio (JSON-RPC) |

+## Roadmap
+
+See [PLAN.md](PLAN.md) for full milestone details and [SPECIFICATIONS.md](SPECIFICATIONS.md) for technical specifications.
+
+| Milestone | Description | Status |
+|---|---|---|
+| M0–M2 | Core server, generic RDF tools, Rust code loader | ✅ Done |
+| M4 | TypeScript code loader | ✅ Done |
+| M3 | Python code loader | Planned |
+| M5 | Testing & documentation | In progress |
+| M6 | Git history loader | Planned |
+| M7 | Advanced features (graph management, store stats) | Planned |
+| M8 | Pluggable vector store (`VectorStore` trait + in-memory backend) | Planned |
+| M9 | RAG pipeline (embedding, retrieval, reranking) | Planned |
+| M10 | Agent orchestrator (planner/router, tool dispatch) | Planned |
+| M11 | External vector DB adapters (Qdrant, Milvus) | Planned |
+| M12 | Observability & production hardening | Planned |
+
+**Future enhancements:** Hybrid lexical + vector retrieval, SHACL-aware scoring, multi-vector per RDF node, incremental embeddings, WASM reranker.
+
## License

See [LICENSE](LICENSE) for details.