Skip to content

Peaky8linders/claude-cortex

Repository files navigation

Claude Cortex

CI License: MIT Plugin Version

A Claude Code plugin that gives Claude improved memory, context intelligence, and session observability. Three modules working together:

Module Language What it does
Brainiac Python Semantic knowledge graph with local embeddings, auto-linking, and intent-aware retrieval
Cortex TypeScript Hook processor that tracks sessions + MCP server with 5 dashboard tools
ContextScore Python 7-dimension context quality scoring with snapshot/recovery

Built on research from A-MEM (NeurIPS 2025), MAGMA (2026), SmartSearch (Derehag et al. 2026), LCM (Ehrlich 2026), and Anthropic Harness Design (2025).

Install as Claude Code Plugin

# From your project directory:
claude plugin add github:Peaky8linders/claude-cortex

# Or clone and install manually:
git clone https://github.com/Peaky8linders/claude-cortex.git
cd claude-cortex
pip install -e .                    # Python dependencies (brainiac + contextscore)
cd cortex && npm install && npm run build  # TypeScript dependencies (cortex engine + MCP server)

After install, the plugin auto-registers:

  • 7 hooks — session tracking, context snapshots, compaction recovery
  • 14 slash commands/learn, /hypothesis, /cortex-status, /review-and-ship, etc.
  • 5 MCP tools — token timeline, activity map, quality heatmap, graph explorer, unified dashboard
  • 3 agents — cortex-advisor (haiku), graph-maintainer (haiku), work-evaluator (sonnet)

What You Get

Persistent Memory Across Sessions

Claude forgets everything between sessions. Cortex doesn't. Use /learn at the end of a session to capture patterns, decisions, and solutions into a knowledge graph. Next session, Claude automatically searches the graph and applies what it learned.

Session Dashboard (MCP Tools)

Query your session in real-time via 5 MCP tools:

Tool What it shows
cortex_token_timeline Minute-by-minute token usage with spike detection and cost estimates
cortex_activity_map Gantt-like timeline of which hooks, skills, and tools fired
cortex_quality_heatmap 7-dimension radar chart of context quality (semantic relevance, redundancy, economics, etc.)
cortex_graph_explorer Interactive force-directed graph visualization (JSON for terminal, HTML for browser)
cortex_dashboard Unified session dashboard — KPIs, charts, cost breakdown, and knowledge graph in one page

Context Compaction Survival

When Claude's context window compacts, critical decisions and patterns are lost. Cortex snapshots them before compaction and re-injects them after — lossless context management.

Autonomous Workflows

Level Command What it does
L3 /run-tasks Execute task YAML with generator-evaluator pattern and sprint contracts
L4 /ralph-start Autonomous loop with dual quality gate and optional context resets
L5 /auto-research Structured experiment runner with /hypothesis tracking and eval loops

Generator-Evaluator Pattern

Inspired by Anthropic's harness design research: separating work production from assessment is more effective than self-evaluation.

  • work-evaluator agent independently grades output across 5 weighted dimensions (correctness, architecture, completeness, safety, craft)
  • Sprint contracts align generator and evaluator on testable success criteria before coding begins
  • Dual quality gate combines graph health (30%) + work output quality (70%) into a composite score; halts at < 40
  • Context reset strategy (--reset-strategy reset) provides clean-slate iterations with structured handoffs for long-running tasks

Slash Commands

Command Purpose
/learn Extract session learnings into graph nodes with auto-linking
/hypothesis Track testable claims with causal edge evidence
/brainiac Direct graph CLI: search, stats, add, consolidate
/cortex-status Graph health: node/edge counts by type
/cortex-recommend Actionable optimization suggestions
/cortex-dashboard Open interactive graph visualization
/cortex-graph Full graph dump
/cortex-snapshot Save graph state to disk
/review-and-ship Deep code review (3 parallel agents) + fix + test + PR
/run-tasks Execute task list with quality gating
/ralph-start Start autonomous loop
/ralph-stop Stop autonomous loop
/ralph-status Monitor active loop
/auto-research Structured experiment runner

Architecture

claude-cortex/
├── .claude-plugin/plugin.json     Plugin manifest (auto-discovered by Claude Code)
├── .mcp.json                      MCP server registration
├── hooks/
│   ├── hooks.json                 7 hook event definitions
│   └── scripts/                   Shell handlers for each hook event
├── .claude/
│   ├── commands/                  14 slash command definitions (.md)
│   ├── skills/cortex/SKILL.md     Auto-invoked cortex advisor skill
│   ├── agents/
│   │   ├── cortex-advisor.md      Deep graph analysis (Haiku)
│   │   ├── graph-maintainer.md    Graph hygiene — consolidation, pruning (Haiku)
│   │   └── work-evaluator.md      Independent QA grading, 5 dimensions (Sonnet)
│   └── rules/                     Path-scoped instruction modules
│
├── brainiac/                      Python: semantic graph engine
│   ├── graph.py                   Core data model (nodes, edges, JSON CRUD)
│   ├── embeddings.py              Local sentence-transformer (384-dim)
│   ├── linker.py                  Auto-linking (semantic, temporal, entity)
│   ├── retriever.py               Intent-aware multi-hop BFS retrieval
│   ├── consolidator.py            Propose-only: merge, abstract, prune
│   └── cli.py                     CLI entry point
│
├── cortex/                        TypeScript: hook processor + MCP server
│   ├── src/engine/                Hook processor core
│   ├── src/hooks/                 Event routing
│   ├── src/graph/                 Knowledge graph wrapper
│   ├── src/mcp/                   MCP server + 5 dashboard tools
│   └── dist/                      Pre-built output (committed for portability)
│
├── contextscore/                  Python: 7-dimension quality scoring
│   ├── src/contextscore/analyzers/  7 quality analyzers
│   ├── src/contextscore/scorer.py   Weighted aggregation
│   └── src/contextscore/snapshot/   Extract + store + recover context
│
├── dashboard/                     Self-contained HTML visualizations
└── scripts/                       tmux launcher for autonomous sessions

Hook Lifecycle

SessionStart ──► Load graph stats, inject quality summary
     │
PostToolUse ───► Track tool events async (zero latency)
     │            ├── Write/Edit → file + subsystem tracking
     │            ├── Bash → command + test detection
     │            └── Read → compaction signal detection
     │
PreCompact ────► Snapshot decisions, entities, files to disk
PostCompact ───► Re-inject recovery pointers (lossless)
     │
Stop ──────────► Persist graph, output summary, check Ralph loop

Knowledge Graph

Node Types

Type Prefix Purpose
Pattern pat- Reusable approaches that work
Antipattern anti- What NOT to do, with evidence
Workflow wf- Effective Claude Code workflows
Hypothesis hyp- Testable claims being validated
Solution sol- Proven debugging solutions
Decision dec- Architecture decisions with rationale

Edge Types

Type Meaning Creation
semantic Conceptually related Auto: cosine similarity >= 0.7
temporal Learned in sequence Auto: same project, 7-day window
causal X led to discovering Y Manual: via /learn
entity Share project/domain Auto: tag overlap >= 2

Development

# TypeScript (cortex engine + MCP server)
cd cortex && npm install && npm run build && npx vitest run

# Python (brainiac graph engine)
pip install -e . && pytest tests/ --ignore=tests/shell -v

# Python (contextscore)
cd contextscore && pip install -e . && pytest tests/ -v

# Shell tests (hooks + autonomy scripts)
bash tests/shell/run_all.sh

Design Principles

  • JSON persistence, not SQLite — zero deps, graph stays under 200KB
  • Async hooks for PostToolUse — zero latency impact on Claude's tool loop
  • Sync hooks for context injection — SessionStart and PostCompact must complete before Claude continues
  • Propose-only consolidation — never auto-merge or auto-delete graph nodes
  • Pre-built dist/ committed — plugin works without requiring npm install
  • No database, no Docker, no daemon — one install is the entire setup

License

MIT

About

No description, website, or topics provided.

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors