A Claude Code plugin that gives Claude improved memory, context intelligence, and session observability. Three modules working together:
| Module | Language | What it does |
|---|---|---|
| Brainiac | Python | Semantic knowledge graph with local embeddings, auto-linking, and intent-aware retrieval |
| Cortex | TypeScript | Hook processor that tracks sessions + MCP server with 5 dashboard tools |
| ContextScore | Python | 7-dimension context quality scoring with snapshot/recovery |
Built on research from A-MEM (NeurIPS 2025), MAGMA (2026), SmartSearch (Derehag et al. 2026), LCM (Ehrlich 2026), and Anthropic Harness Design (2025).
# From your project directory:
claude plugin add github:Peaky8linders/claude-cortex
# Or clone and install manually:
git clone https://github.com/Peaky8linders/claude-cortex.git
cd claude-cortex
pip install -e . # Python dependencies (brainiac + contextscore)
cd cortex && npm install && npm run build # TypeScript dependencies (cortex engine + MCP server)After install, the plugin auto-registers:
- 7 hooks — session tracking, context snapshots, compaction recovery
- 14 slash commands —
/learn,/hypothesis,/cortex-status,/review-and-ship, etc. - 5 MCP tools — token timeline, activity map, quality heatmap, graph explorer, unified dashboard
- 3 agents — cortex-advisor (haiku), graph-maintainer (haiku), work-evaluator (sonnet)
Claude forgets everything between sessions. Cortex doesn't. Use /learn at the end of a session to capture patterns, decisions, and solutions into a knowledge graph. Next session, Claude automatically searches the graph and applies what it learned.
Query your session in real-time via 5 MCP tools:
| Tool | What it shows |
|---|---|
cortex_token_timeline |
Minute-by-minute token usage with spike detection and cost estimates |
cortex_activity_map |
Gantt-like timeline of which hooks, skills, and tools fired |
cortex_quality_heatmap |
7-dimension radar chart of context quality (semantic relevance, redundancy, economics, etc.) |
cortex_graph_explorer |
Interactive force-directed graph visualization (JSON for terminal, HTML for browser) |
cortex_dashboard |
Unified session dashboard — KPIs, charts, cost breakdown, and knowledge graph in one page |
When Claude's context window compacts, critical decisions and patterns are lost. Cortex snapshots them before compaction and re-injects them after — lossless context management.
| Level | Command | What it does |
|---|---|---|
| L3 | /run-tasks |
Execute task YAML with generator-evaluator pattern and sprint contracts |
| L4 | /ralph-start |
Autonomous loop with dual quality gate and optional context resets |
| L5 | /auto-research |
Structured experiment runner with /hypothesis tracking and eval loops |
Inspired by Anthropic's harness design research: separating work production from assessment is more effective than self-evaluation.
- work-evaluator agent independently grades output across 5 weighted dimensions (correctness, architecture, completeness, safety, craft)
- Sprint contracts align generator and evaluator on testable success criteria before coding begins
- Dual quality gate combines graph health (30%) + work output quality (70%) into a composite score; halts at < 40
- Context reset strategy (
--reset-strategy reset) provides clean-slate iterations with structured handoffs for long-running tasks
| Command | Purpose |
|---|---|
/learn |
Extract session learnings into graph nodes with auto-linking |
/hypothesis |
Track testable claims with causal edge evidence |
/brainiac |
Direct graph CLI: search, stats, add, consolidate |
/cortex-status |
Graph health: node/edge counts by type |
/cortex-recommend |
Actionable optimization suggestions |
/cortex-dashboard |
Open interactive graph visualization |
/cortex-graph |
Full graph dump |
/cortex-snapshot |
Save graph state to disk |
/review-and-ship |
Deep code review (3 parallel agents) + fix + test + PR |
/run-tasks |
Execute task list with quality gating |
/ralph-start |
Start autonomous loop |
/ralph-stop |
Stop autonomous loop |
/ralph-status |
Monitor active loop |
/auto-research |
Structured experiment runner |
claude-cortex/
├── .claude-plugin/plugin.json Plugin manifest (auto-discovered by Claude Code)
├── .mcp.json MCP server registration
├── hooks/
│ ├── hooks.json 7 hook event definitions
│ └── scripts/ Shell handlers for each hook event
├── .claude/
│ ├── commands/ 14 slash command definitions (.md)
│ ├── skills/cortex/SKILL.md Auto-invoked cortex advisor skill
│ ├── agents/
│ │ ├── cortex-advisor.md Deep graph analysis (Haiku)
│ │ ├── graph-maintainer.md Graph hygiene — consolidation, pruning (Haiku)
│ │ └── work-evaluator.md Independent QA grading, 5 dimensions (Sonnet)
│ └── rules/ Path-scoped instruction modules
│
├── brainiac/ Python: semantic graph engine
│ ├── graph.py Core data model (nodes, edges, JSON CRUD)
│ ├── embeddings.py Local sentence-transformer (384-dim)
│ ├── linker.py Auto-linking (semantic, temporal, entity)
│ ├── retriever.py Intent-aware multi-hop BFS retrieval
│ ├── consolidator.py Propose-only: merge, abstract, prune
│ └── cli.py CLI entry point
│
├── cortex/ TypeScript: hook processor + MCP server
│ ├── src/engine/ Hook processor core
│ ├── src/hooks/ Event routing
│ ├── src/graph/ Knowledge graph wrapper
│ ├── src/mcp/ MCP server + 5 dashboard tools
│ └── dist/ Pre-built output (committed for portability)
│
├── contextscore/ Python: 7-dimension quality scoring
│ ├── src/contextscore/analyzers/ 7 quality analyzers
│ ├── src/contextscore/scorer.py Weighted aggregation
│ └── src/contextscore/snapshot/ Extract + store + recover context
│
├── dashboard/ Self-contained HTML visualizations
└── scripts/ tmux launcher for autonomous sessions
SessionStart ──► Load graph stats, inject quality summary
│
PostToolUse ───► Track tool events async (zero latency)
│ ├── Write/Edit → file + subsystem tracking
│ ├── Bash → command + test detection
│ └── Read → compaction signal detection
│
PreCompact ────► Snapshot decisions, entities, files to disk
PostCompact ───► Re-inject recovery pointers (lossless)
│
Stop ──────────► Persist graph, output summary, check Ralph loop
| Type | Prefix | Purpose |
|---|---|---|
| Pattern | pat- |
Reusable approaches that work |
| Antipattern | anti- |
What NOT to do, with evidence |
| Workflow | wf- |
Effective Claude Code workflows |
| Hypothesis | hyp- |
Testable claims being validated |
| Solution | sol- |
Proven debugging solutions |
| Decision | dec- |
Architecture decisions with rationale |
| Type | Meaning | Creation |
|---|---|---|
semantic |
Conceptually related | Auto: cosine similarity >= 0.7 |
temporal |
Learned in sequence | Auto: same project, 7-day window |
causal |
X led to discovering Y | Manual: via /learn |
entity |
Share project/domain | Auto: tag overlap >= 2 |
# TypeScript (cortex engine + MCP server)
cd cortex && npm install && npm run build && npx vitest run
# Python (brainiac graph engine)
pip install -e . && pytest tests/ --ignore=tests/shell -v
# Python (contextscore)
cd contextscore && pip install -e . && pytest tests/ -v
# Shell tests (hooks + autonomy scripts)
bash tests/shell/run_all.sh- JSON persistence, not SQLite — zero deps, graph stays under 200KB
- Async hooks for PostToolUse — zero latency impact on Claude's tool loop
- Sync hooks for context injection — SessionStart and PostCompact must complete before Claude continues
- Propose-only consolidation — never auto-merge or auto-delete graph nodes
- Pre-built dist/ committed — plugin works without requiring
npm install - No database, no Docker, no daemon — one install is the entire setup
MIT