Releases: DeusData/codebase-memory-mcp
v0.5.3
Incremental Reindex
Auto-detects previously indexed projects and re-parses only changed files.
- mtime+size classification against stored hashes
- Surgical node deletion (edges cascade), re-parse only deltas
- Instant no-op (<1ms) when nothing changed
- Auto-routes: first run = full RAM pipeline, subsequent = incremental disk
| Scenario | Time |
|---|---|
| Nothing changed | <1ms |
| 1 file modified | ~2ms |
| 1 file added/deleted | ~1ms |
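The classification step listed above can be sketched roughly as follows. This is an illustrative sketch only, not the project's C implementation: `FileStat` and `classify` are hypothetical names, and it assumes the stored record per file is just its mtime and size (the real index may also keep content hashes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStat:
    """Hypothetical per-file record: modification time and size."""
    mtime_ns: int
    size: int


def classify(stored: dict[str, FileStat], current: dict[str, FileStat]):
    """Split paths into added, modified, deleted, and unchanged sets."""
    added = {p for p in current if p not in stored}
    deleted = {p for p in stored if p not in current}
    # A file counts as modified when either mtime or size differs.
    modified = {p for p in current if p in stored and current[p] != stored[p]}
    unchanged = set(current) - added - modified
    return added, modified, deleted, unchanged
```

Only the `added` and `modified` sets would need re-parsing; nodes for `modified` and `deleted` files would be removed first. When all three sets are empty, the run is a no-op.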
ADR Hints
- index_repository: adr_present + adr_hint when no ADR exists
- get_graph_schema: adr_present + adr_hint per project
- manage_adr GET: creation hint when no ADR
Simplified get_code_snippet
Streamlined to exact QN + suffix matching. Guides users to search_graph when symbol not found.
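A minimal sketch of exact-then-suffix matching on qualified names (QNs), to illustrate the lookup order described above. The function name `resolve_symbol` and the `.` separator are assumptions for illustration, not the tool's actual internals.

```python
def resolve_symbol(query: str, qualified_names: set[str]) -> list[str]:
    """Return QNs matching the query: exact match first, then suffix match."""
    if query in qualified_names:
        return [query]
    # Require a separator before the suffix so "run" does not match "rerun".
    suffix = "." + query
    return sorted(qn for qn in qualified_names if qn.endswith(suffix))
```

An empty result corresponds to the "symbol not found" case, where the caller would be pointed to search_graph instead.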
Upgrading
```bash
codebase-memory-mcp update
```
v0.5.2
Fixes
- Release RAM after indexing: Call `mi_collect(true)` after pipeline completion to return mimalloc pages to the OS. On Linux this immediately reduces RSS; on macOS pages are marked reusable (cosmetically retained until memory pressure).
- Standalone Windows binary: Add `-static` to Windows linker flags. The binary no longer requires `libc++.dll`, `libunwind.dll`, or any MSYS2/CLANG64 runtime DLLs — fully self-contained .exe.
Upgrading
```bash
codebase-memory-mcp update
```
v0.5.1
Hotfix: MCP protocol handshake
Fixes the MCP server failing to connect to Claude Code (and other MCP clients).
Bug: protocolVersion in the initialize response was returned as a nested object {"version":"2024-11-05"} instead of the plain string "2024-11-05" required by the MCP specification. Claude Code rejected the malformed response and marked the server as failed.
Fix: One-line change — protocolVersion is now a plain string value.
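To make the difference concrete, here is a small sketch of the two response shapes. The helper names and the `serverInfo` values are placeholders chosen for illustration; only the `protocolVersion` shapes come from the description above.

```python
import json


def build_initialize_result(protocol_version: str) -> dict:
    """Build an initialize result with protocolVersion as a plain string."""
    return {
        "protocolVersion": protocol_version,
        "capabilities": {},
        # Placeholder server identity, not the real server's values.
        "serverInfo": {"name": "example-server", "version": "0.0.0"},
    }


def is_valid_protocol_version(result: dict) -> bool:
    """Check the shape a client would expect: a string, not an object."""
    return isinstance(result.get("protocolVersion"), str)


fixed = build_initialize_result("2024-11-05")
broken = {"protocolVersion": {"version": "2024-11-05"}}  # the shape described in the bug

print(json.dumps(fixed["protocolVersion"]), is_valid_protocol_version(fixed))
print(json.dumps(broken["protocolVersion"]), is_valid_protocol_version(broken))
```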
Upgrading from v0.5.0
```bash
codebase-memory-mcp update
```
All other v0.5.0 features (Go-to-C rewrite, 8-agent install, UI, auto-index, update check) are unchanged.
v0.5.0
Complete Go to C Rewrite
v0.5.0 is the first release built entirely in C. The entire codebase -- pipeline, store, MCP server, CLI, watcher -- has been rewritten from Go to C, and the tree-sitter grammars previously compiled via CGo are now vendored C source for all 64 languages.
What this enables
- RAM-first pipeline: All indexing runs in memory (LZ4 HC compressed read, in-memory SQLite, single dump at end). Zero disk I/O between bulk load and final write.
- Fused Aho-Corasick multi-pattern matching: Call resolution uses a single pass over the AST with all patterns loaded simultaneously, replacing sequential per-function grep.
- C/C++ hybrid LSP resolver: Template substitution, smart pointer chains, overload scoring, lambda/decltype inference, virtual dispatch -- 700+ dedicated tests.
- mimalloc global allocator: Tracks all allocations (C + C++ via global override), enabling precise memory budgeting per worker.
- No CGo boundary overhead: All tree-sitter parsing happens in pure C -- no per-file CGo hop.
Performance
Indexing the Linux kernel (28M LOC, 75K files):
- Full mode: 2.1M nodes, 5m33s (Apple M3 Pro)
- Fast mode: 1.88M nodes, 1m12s (Apple M3 Pro)
New: Graph Visualization UI
v0.5.0 ships in two variants:
- standard -- MCP server only (smaller binary)
- ui -- MCP server + embedded 3D graph visualization
Enable the UI:
```bash
codebase-memory-mcp --ui=true --port=9749
```
Then open http://localhost:9749 to explore your knowledge graph visually. The UI runs as a background thread on localhost, serving embedded frontend assets and proxying queries to a read-only SQLite connection.
Session Auto-Detect + Auto-Index
The MCP server now detects your project root from the working directory on session start. Combined with the config store:
```bash
codebase-memory-mcp config set auto_index true
```
When enabled, new projects are automatically indexed on first MCP connection, and the watcher registers them for ongoing git-based change detection. Previously-indexed projects are always registered with the watcher regardless of this setting.
The config CLI supports: `list`, `get`, `set`, `reset`.
Multi-Agent Install (8 Coding Agents)
`codebase-memory-mcp install` now auto-detects all installed coding agents and configures each one with MCP server entries, instruction files, and pre-tool hooks where supported.
| Agent | MCP Config | Instructions | Hooks |
|---|---|---|---|
| Claude Code | .claude/.mcp.json | 4 Skills (directive pattern) | PreToolUse on Grep/Glob/Read |
| Codex CLI | .codex/config.toml | .codex/AGENTS.md | -- |
| Gemini CLI | .gemini/settings.json | .gemini/GEMINI.md | BeforeTool on grep/read |
| Zed | settings.json (JSONC) | -- | -- |
| OpenCode | opencode.json | .config/opencode/AGENTS.md | -- |
| Antigravity | mcp_config.json | .gemini/antigravity/AGENTS.md | -- |
| Aider | -- | CONVENTIONS.md | -- |
| KiloCode | mcp_settings.json | ~/.kilocode/rules/ | -- |
Agentic Behavior Improvements
Agents now actively prefer MCP tools over grep/glob/read for code discovery:
- Directive skill descriptions achieve ~100% auto-activation (up from ~37% with the old descriptive pattern)
- PreToolUse / BeforeTool hooks print advisory reminders when agents reach for built-in search tools
- Keyword-rich MCP tool descriptions improve Claude's Tool Search discovery
- Instruction files with concrete examples and explicit fallback guidance for all agents
- Startup update check notifies on first tool call if a newer release is available
These improvements were driven by community reports and contributions:
- @David34920 -- reported Claude ignoring MCP tools without CLAUDE.md edits (#69)
- @sonicviz -- detailed analysis of agent tool-selection heuristics (#34)
- @chitralverma -- reported Gemini CLI defaulting to built-in tools (#19)
- @noelkurian -- identified Zed config format bug and JSONC parsing issue (#24)
- @zeval -- OpenCode install PR with config path research (#36)
- @harshil480 -- KiloCode install PR with config format and test plan (#53)
Fixes
- Zed: Config now uses args:[""] instead of broken source:"custom" (#24)
- Zed: Parser handles JSONC (comments + trailing commas) in existing settings.json
- Install is idempotent: Running install twice produces no duplicates -- marker-based upsert for instructions, key-based upsert for JSON/TOML configs
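The two upsert styles mentioned in the last item can be sketched as below. This is an assumption-laden illustration: the marker strings, function names, and config layout are hypothetical and not taken from the installer.

```python
# Hypothetical markers delimiting the managed block in an instruction file.
BEGIN = "<!-- example-begin -->"
END = "<!-- example-end -->"


def upsert_block(text: str, block: str) -> str:
    """Replace the managed block if present, otherwise append it once."""
    managed = f"{BEGIN}\n{block}\n{END}"
    start = text.find(BEGIN)
    end = text.find(END, start + len(BEGIN)) if start != -1 else -1
    if start != -1 and end != -1:
        return text[:start] + managed + text[end + len(END):]
    sep = "" if not text or text.endswith("\n") else "\n"
    return f"{text}{sep}{managed}\n"


def upsert_entry(config: dict, section: str, name: str, entry: dict) -> dict:
    """Key-based upsert: writing the same key twice overwrites, never duplicates."""
    config.setdefault(section, {})[name] = entry
    return config
```

Running either function twice with the same input yields the same output as running it once, which is the property the fix describes.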
Cross-Platform Support
Fully tested on all platforms with ASan + LeakSanitizer + UBSan:
| Platform | Tests | Build |
|---|---|---|
| macOS arm64 | 2030 passed | OK |
| macOS amd64 | OK | OK |
| Linux arm64 | 2012 passed | OK |
| Linux amd64 | OK | OK |
| Windows amd64 | OK | OK (CLANG64) |
Vendored dependencies (zero system library requirements): sqlite3, mimalloc, tree-sitter runtime, yyjson, zlib, TRE regex (Windows only).
Upgrading
```bash
codebase-memory-mcp update
```
Or fresh install:
```bash
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash
codebase-memory-mcp install -y
```
Existing indexes will be rebuilt automatically -- the C pipeline produces a different (improved) graph format.
v0.4.10
Explicit Watch List for Watcher (#49)
Fixes the OOM issue where the watcher would open every indexed project's database on startup, even projects the user isn't actively working on.
What changed
- Replaced scan-all polling with an explicit watch list: the watcher no longer calls `ListProjects()` to discover databases on disk. Only projects in the watch list get polled for file changes.
- Watch on index: projects are added to the watch list when `index_repository` succeeds or auto-index completes.
- Unwatch on delete: `delete_project` removes the project from the watch list immediately.
- Cross-project touch: when a tool call references a non-session project (e.g., `search_graph(project="other-repo")`), that project is added to the watch list so it stays fresh for the duration of the session.
- Removed dead code: `cachedProjects`, `projectsCacheTime`, `projectsCacheTTL`, and `InvalidateProjectsCache()` are all gone.
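As a rough model of the behavior listed above, an explicit watch list can be as small as a set with three entry points. The class and method names here are hypothetical, and `check_changes` stands in for whatever per-project change detection the watcher runs.

```python
from typing import Callable


class WatchList:
    """Tracks only the projects that should be polled for changes."""

    def __init__(self) -> None:
        self._projects: set[str] = set()

    def watch(self, project: str) -> None:
        """Called after a successful index or auto-index."""
        self._projects.add(project)

    def unwatch(self, project: str) -> None:
        """Called when a project is deleted."""
        self._projects.discard(project)

    def touch(self, project: str) -> None:
        """Called when a tool call references a non-session project."""
        self.watch(project)

    def poll(self, check_changes: Callable[[str], bool]) -> list[str]:
        """Check only watched projects; return those reporting changes."""
        return [p for p in sorted(self._projects) if check_changes(p)]
```

The cost of each poll then scales with the number of watched projects rather than with every database on disk.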
Why
On machines with many indexed projects, the old watcher would open every .db file and capture file snapshots for all of them every 60 seconds. This caused unbounded memory growth proportional to the total number of indexed projects × their file counts. The new approach only watches projects the user is actively interacting with.
Behavior change
Projects indexed in a previous session are not automatically watched in the current session. They become watched again when the user interacts with them (index, query, trace, etc.). This is the desired behavior — don't spend resources on projects the user isn't using.
Upgrade
`codebase-memory-mcp update`
Full Changelog: v0.4.9...v0.4.10
v0.4.9
Dynamic Memory Limit
Replaces the static 2GB GOMEMLIMIT from v0.4.8 with platform-aware auto-detection.
What changed
- Auto-detect system memory on all platforms:
  - Linux: `syscall.Sysinfo`
  - macOS: `sysctl hw.memsize`
  - Windows: `GlobalMemoryStatusEx`
- GOMEMLIMIT set to 25% of physical RAM, clamped to [2GB, 8GB]
- Falls back to 4GB if detection fails
- User-configured `mem_limit` still takes priority
Why
The static 2GB default in v0.4.8 could cause excessive GC pressure on machines with plenty of RAM (e.g., a 64GB workstation was limited to 2GB). The new approach adapts to the system: a 16GB laptop gets a 4GB limit, a 32GB+ machine gets 8GB.
GOMEMLIMIT is a soft limit — hitting it causes more frequent garbage collection (slightly slower indexing) but never crashes or refuses allocations.
Examples
| System RAM | GOMEMLIMIT |
|---|---|
| 8 GB | 2 GB (min clamp) |
| 16 GB | 4 GB |
| 32 GB | 8 GB (max clamp) |
| 64 GB | 8 GB (max clamp) |
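The table follows directly from the 25% rule and the clamp. A short sketch of the arithmetic (the function name and signature are illustrative; the release itself implemented this in Go):

```python
GIB = 1024 ** 3


def default_mem_limit(total_ram_bytes, user_limit_bytes=None):
    """Return the memory limit in bytes under the rules described above."""
    if user_limit_bytes is not None:
        return user_limit_bytes          # user-configured mem_limit wins
    if not total_ram_bytes:
        return 4 * GIB                   # detection failed: fall back to 4GB
    quarter = total_ram_bytes // 4       # 25% of physical RAM
    return max(2 * GIB, min(quarter, 8 * GIB))  # clamp to [2GB, 8GB]


for ram_gib in (8, 16, 32, 64):
    print(ram_gib, "GB ->", default_mem_limit(ram_gib * GIB) // GIB, "GB")
```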
Upgrade
`codebase-memory-mcp update`
v0.4.8
Stability & Performance
This release addresses the top-reported stability and performance issues.
SQLite Lock Contention Fix (#52)
- Fix ref counting race condition: `ListProjects` and watcher's `pollAll` now use `AcquireStore` with proper ref counting, preventing the evictor from closing database connections mid-query. This was the most likely root cause of server hangs requiring SIGKILL.
- Stale SHM recovery: After an unclean shutdown (SIGKILL), stale `-shm` files with orphaned lock state are automatically detected and removed on next startup, preventing deadlocks
- Increased busy_timeout: From 5s to 10s for better tolerance on large databases
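The ref counting idea in the first item can be illustrated with a small cache in which the evictor skips any store that still has active users. This is a simplified model under assumed names (`StoreCache`, `acquire`, `release`, `evict_idle`), not the project's Go code.

```python
import threading


class StoreCache:
    """Caches open stores and refuses to evict ones that are in use."""

    def __init__(self, opener):
        self._opener = opener            # hypothetical callable: name -> store
        self._lock = threading.Lock()
        self._entries = {}               # name -> [store, refcount]

    def acquire(self, name):
        """Open (or reuse) a store and bump its ref count."""
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                entry = [self._opener(name), 0]
                self._entries[name] = entry
            entry[1] += 1
            return entry[0]

    def release(self, name):
        """Drop one reference once the caller's query has finished."""
        with self._lock:
            self._entries[name][1] -= 1

    def evict_idle(self):
        """Close only stores with no active references; return their names."""
        with self._lock:
            idle = [n for n, (_, refs) in self._entries.items() if refs == 0]
            for n in idle:
                self._entries.pop(n)[0].close()
            return idle
```

A caller that goes through `acquire` and `release` cannot have its connection closed mid-query, because the evictor and the callers take the same lock and the evictor checks the count first.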
OOM Prevention (#49, #46)
- Default GOMEMLIMIT: 2GB memory limit applied by default when not user-configured, preventing unbounded memory growth that caused 13GB+ OOM kills
- Reduced mmap_size: From 256MB to 64MB per database — with multiple projects open, the old value could consume excessive virtual memory
CPU Usage Reduction (#45)
- Watcher base interval: Increased from 1s to 5s, giving ~5x fewer polling ticks
- Poll interval base: Increased from 1s to 5s, which reduces `git status` frequency for change detection
- Net effect: significantly lower idle CPU usage, especially with multiple projects
Windows Duplicate Database Fix (#50)
- Drive letter normalization: `D:\project` and `d:\project` now resolve to the same database, preventing duplicate indexing on Windows
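A minimal sketch of the normalization idea. It assumes the drive letter is folded to uppercase; the release notes do not say which case is used, and the function name is hypothetical.

```python
import re


def normalize_drive_letter(path: str) -> str:
    """Fold a leading Windows drive letter to one case (uppercase here)."""
    if re.match(r"^[A-Za-z]:", path):
        return path[0].upper() + path[1:]
    return path  # non-drive paths are left untouched


print(normalize_drive_letter(r"d:\project") == normalize_drive_letter(r"D:\project"))
```

Any key derived from the normalized path (such as a database filename) is then identical for both spellings.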
Upgrade
`codebase-memory-mcp update`
After updating, restart your editor/Claude Code session. Stale SHM files from previous crashes will be automatically cleaned up on first launch.
v0.4.7
Highlights
Go LSP Hybrid Type Resolution (experimental)
A new tree-sitter + LSP hybrid engine for Go brings cross-file type-aware call resolution — a first for codebase-memory-mcp. The C-based type resolver (internal/cbm/lsp/) combines tree-sitter AST parsing with a lightweight type registry, scope tracker, and 30,000+ Go stdlib definitions to resolve method calls, interface dispatches, and struct field accesses across package boundaries.
This is a foundational step: the same hybrid approach will be extended to TypeScript, Python, Java, and other languages in upcoming releases.
Key components:
- `lsp_bridge.go`: `CrossFileDef` struct and `RunGoLSPCrossFile` CGo bridge
- `go_lsp_cross.go`: cross-file definition index with struct field and interface method enrichment
- `go_dep_registry.go`: third-party Go module parser for dependency-aware resolution
- `lsp/go_lsp.c`: C type resolver with scope-aware variable tracking, method set resolution, and channel direction inference
- `lsp/generated/go_stdlib_data.c`: 30K+ Go stdlib type/function definitions for out-of-the-box resolution
.gitignore and .cbmignore Support
The indexer now respects .gitignore patterns — generated code, build artifacts, and vendored dependencies that are gitignored are no longer indexed. This is the most requested feature since launch.
- Full .gitignore hierarchy support: nested .gitignore files, .git/info/exclude, negation (!), globstar (**), directory-only patterns (logs/)
- .cbmignore — a new file that stacks additional ignore patterns on top of .gitignore, specific to codebase-memory-mcp indexing
- .cgrignore remains supported for backwards compatibility
- Zero file handle leaks — custom repository matcher reads and closes files immediately, avoiding the upstream library's handle leak (critical for Windows)
Memory Safety Fixes
- Swift scanner heap-buffer-overflow — the vendored Swift tree-sitter scanner called calloc(0, sizeof(ScannerState)), allocating a 0-or-1-byte region, then wrote a uint32_t (4 bytes) to it. Fixed to calloc(1, ...). Detected by the new AddressSanitizer CI job.
- File handle leak on Windows — the go-gitignore library's NewRepository opens .gitignore files but never closes them. Replaced with a custom repoMatcher that uses os.ReadFile + strings.NewReader — zero leaked handles. Fixes Windows CI failures.
New Features
- Persistent config store — ConfigStore backed by SQLite for server settings (auto_index, auto_index_limit, mem_limit). CLI config subcommand for get/set/list/delete.
- Comprehensive --help / -h — all 14 MCP tools documented with parameter schemas and JSON payload examples
- Symlink skipping — symlinked files and directories are no longer indexed, preventing duplicate nodes in the graph
- .worktrees directory skipping — git worktree directories are excluded from indexing (thanks @wassertim, #37)
Bug Fixes
- Cypher LIMIT respected — explicit LIMIT clauses in query_graph Cypher queries are now honored instead of being silently capped (thanks @re-thc, #40)
- 54 golangci-lint issues fixed — errcheck, gocognit, funlen, nilerr, gocritic, gosec, noctx across 10 files
- Expanded hardcoded ignore list — 28 new ecosystem-specific directories added (.next, .nuxt, .terraform, zig-cache, .cargo, bazel-out, etc.). Generic directories (bin, build, out) moved to fast-mode-only to avoid false exclusions in Go/CMake/Maven projects.
Infrastructure
- AddressSanitizer CI job — new test-asan workflow job catches heap-buffer-overflow, use-after-free, and other memory bugs in the C code
- CI dry-run improvements — workflow enhancements for cross-platform testing
Contributors
- @wassertim — .worktrees directory skipping (#37)
- @re-thc — Cypher LIMIT fix (#40)
Full changelog: v0.4.6...v0.4.7
v0.4.6
What's New
Wolfram Language support (63 → 64 languages)
- Wolfram Language (`.wl`, `.wls`): vendors LumaKernel/tree-sitter-wolfram grammar (ABI 13)
- Function extraction: `f[x_] := ...` and `f[x_] = ...` at top-level and nested inside `Module`/`Block`
- Call extraction: `apply` nodes with LHS-definition filtering (definition heads are not treated as calls)
- Import extraction: `<< "file.wl"` (`get_top`) and `Needs["Package"]`
- Caller attribution fix: `compute_func_qn` in the unified walker now resolves Wolfram function names from `apply(user_symbol("f"), ...)` LHS, so CALLS edges are attributed to the enclosing function rather than the file module
MATLAB call extraction
- Added `function_call` and `command` node types to call extraction
- Resolves function references like `inv(A)`, `eig(M)`, `disp hello` in `.m` files
Lean 4 call extraction
- Added `apply` call nodes with type-position filtering to exclude type annotation false positives
Windows linker fix
- Added `-Wl,--allow-multiple-definition` CGo LDFLAGS for Windows to work around a GCC 15.2.0 / MSYS2 UCRT64 regression
v0.4.5
What's New
4 New Languages (59 → 63)
- MATLAB: function definitions, usages, `.m` file disambiguation via Linguist heuristics
- Lean 4: theorem/definition extraction, imports
- FORM: `#procedure`/`#call` extraction for symbolic computation scripts
- Magma: function/procedure/intrinsic definitions, `load` imports, call graph extraction
Magma Graph Quality Fix
- Fixed `load_statement` import extraction: added `field('path', ...)` to grammar so all `load` statements are captured (previously only the first per file)
- Verified call extraction works end-to-end: cross-function CALLS edges, recursive calls, trace_call_path
Import Linker Improvement
- `passImports()` now resolves file-path imports (e.g. `load "utils.mag"`, `#include "helpers.h"`) by trying `fqn.ModuleQN()` when the raw path doesn't match. This is a general fix benefiting any language with file-path-based imports.
Other Changes
- Unified version management via build-injected ldflags
- Fixed Windows file URI path parsing in watcher
- Statically linked Windows binary to fix missing DLL error
- Fixed binary move command in setup script (#16)