perf: search result caches, background warmup, symbol-scan gating by justrach · Pull Request #613 · justrach/codedb

justrach · 2026-06-12T14:06:05Z

Summary

Six commits attacking the production latency tail (search p90 30ms, codedb_find p90 17.7ms) measured from the 2,467-call production query log:

Whole-query result LRUs for searchContent/renderPlainSearch (7c60f7d) and the BM25 searchContentRanked path (2e1c148). 64 entries / 4MB each; entries validated against both Explorer.search_gen and a fingerprint of the nine ranking kill-switch env vars. CODEDB_NO_SEARCH_CACHE=1 disables. Measured: error query 20.7us -> 2.0us on hit.
Race fix (2a40e62): generation bumps moved inside the exclusive lock — bumping before the lock let a concurrent search cache pre-mutation results under the post-mutation generation (permanent stale hit). Cache hits also restore the producing search's breakdown for telemetry/provenance.
Background warmup (9fdbc15): serve/mcp/cli-daemon spawn a thread that builds+persists the word index off the query path and replays the most-repeated queries from queries.log through the real search entry points. 62% of production calls are exact repeats of an earlier (tool, query) pair. First-call MCP search: 21.8ms -> 6.3ms. CODEDB_NO_WARMUP=1 disables; skipped under CODEDB_LOW_MEMORY (956c8d9).
Symbol-scan gating (c7fec7e): findSymbol/findAllSymbols/renderSymbols ran full O(files x symbols) outline safety scans on every call, a #310-era safety net that predates the symbol_index_complete flag (#564). Gated on the flag: ~6ms/call -> 50-100ns for index misses on a 20k-file corpus. ensureSymbolIndex now rebuilds from scratch to avoid duplicate entries.

Stacked follow-up: the skip_trigram_files reconciliation fix (separate PR, based on this branch).

Test plan

zig build test — 835/835
python3 scripts/e2e_mcp_test.py — 20/20
Live MCP measurement on this repo (first-call latency, cache-hit timings)
Repo benchmark runs with CODEDB_NO_SEARCH_CACHE=1 for comparable per-query rows + explicit cached row

Generated with Devin

…earch

Agents re-issue identical searches constantly. Two sibling caches now serve repeats: SearchResultCache (searchContent - a hit dupes the cached results into the caller's allocator, same ownership contract as a fresh search) and PlainRenderCache (renderPlainSearch - the MCP fast path renders straight to text and never reaches searchContent). 64 entries / 4 MB each, LRU.

An entry is served only when BOTH its generation and env fingerprint still match. Explorer.search_gen bumps (atomically - searches hold the SHARED lock) on every mutation that can change results: commitParsedFileOwnedOutline, removeFile, rebuildWordIndex, and the one-shot lazy ranking builds (ensureSymbolIndex, call-graph, co-change). The fingerprint hashes the nine ranking kill-switch env vars, so tests that toggle CODEDB_LEX_FREQ_PENALTY et al. mid-process can never be served results computed under the other setting. The generation is read BEFORE a search runs, so a concurrent mutation makes the stored entry stale immediately.

CODEDB_NO_SEARCH_CACHE=1 disables both caches. The repo benchmark sets it for its per-query rows (numbers stay comparable across versions) and adds one explicit "cached" row: the query "error" takes 20.7us uncached -> 2.0us on a hit (10x).

8 new tests: hit identity + caller ownership, indexFile/removeFile invalidation, env-fingerprint staleness, kill-switch bypass, LRU bound, and the renderPlainSearch pair. 822/822 total, e2e MCP 20/20.

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
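The validation rule described in this commit message (an entry is served only when both its generation and its env fingerprint still match) can be sketched as follows. This is a minimal Python illustration, not the project's Zig code: `ResultCache`, `env_fingerprint`, and `RANKING_ENV_VARS` are hypothetical names, only one of the nine env vars is named in the source, and the 4 MB byte cap is left out for brevity.

```python
import hashlib
import os
from collections import OrderedDict

# Hypothetical list: the source names only CODEDB_LEX_FREQ_PENALTY out of nine.
RANKING_ENV_VARS = ("CODEDB_LEX_FREQ_PENALTY",)


def env_fingerprint(environ=os.environ):
    """Hash the ranking env vars so that toggling one invalidates old entries."""
    h = hashlib.sha256()
    for name in RANKING_ENV_VARS:
        h.update(name.encode())
        h.update(b"=")
        h.update(environ.get(name, "").encode())
        h.update(b"\0")
    return h.hexdigest()


class ResultCache:
    """LRU keyed by (query, max_results); entries carry generation + fingerprint."""

    def __init__(self, max_entries=64):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # key -> (generation, fingerprint, results)

    def get(self, key, generation, fingerprint):
        entry = self._entries.get(key)
        if entry is None:
            return None
        gen, fp, results = entry
        if gen != generation or fp != fingerprint:
            del self._entries[key]  # stale: drop it instead of serving it
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return list(results)  # copy, so the caller owns what it gets back

    def put(self, key, generation, fingerprint, results):
        self._entries[key] = (generation, fingerprint, tuple(results))
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Returning a copy on a hit mirrors the ownership contract the commit describes: the caller may mutate or free its result without touching the cached entry.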

…breakdown on hits

Bumping search_gen BEFORE taking the exclusive lock let a concurrent search load the new generation, win the shared lock, and cache pre-mutation results under the post-mutation generation: a permanent stale hit. Moved the bumps in commitParsedFileOwnedOutline, removeFile, and rebuildWordIndex inside the exclusive lock and documented the ordering invariant on bumpSearchGen.

Cache hits also now restore the producing search's breakdown (tier/candidate/result counts, timings zeroed, cache_hit flag) instead of leaving last_search_breakdown pointing at whatever search ran last. mcp.zig's telemetry and the JSON provenance meta both read it after every search call.

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
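The ordering invariant from this commit (bump the generation while holding the exclusive lock; read it before searching) can be illustrated with a small Python sketch. It is an assumption-laden stand-in: `Index`, `commit`, `search`, and `cached` are hypothetical names, and a plain `threading.Lock` replaces the shared/exclusive lock of the real code.

```python
import threading


class Index:
    def __init__(self):
        self.lock = threading.Lock()  # stand-in for the shared/exclusive lock
        self.generation = 0
        self.docs = {}

    def commit(self, path, text):
        with self.lock:
            self.docs[path] = text
            # Bump INSIDE the lock: no search can observe the new generation
            # and still read pre-mutation data.
            self.generation += 1

    def search(self, needle, cache):
        gen = self.generation  # read BEFORE the search runs
        with self.lock:
            hits = sorted(p for p, t in self.docs.items() if needle in t)
        cache[needle] = (gen, hits)  # stored under the generation read up front
        return list(hits)

    def cached(self, needle, cache):
        entry = cache.get(needle)
        if entry is not None and entry[0] == self.generation:
            return list(entry[1])
        return None  # missing, or stored under an older generation
```

If the bump happened before the lock was taken, a search could read the new generation, win the lock ahead of the mutation, and store pre-mutation hits under a generation that never changes again. With the bump inside the lock, the worst case is an entry stored under an old generation, which `cached` simply refuses to serve.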

…ries.log

Production query logs (2,467 calls) show the latency tail is lazy work charged to an innocent first query: the word-index rebuild after a snapshot fast-load runs 50ms-2s and lands on the first codedb_word/search call. In addition, 62% of calls are exact repeats of an earlier (tool, query) pair that the result caches could serve in microseconds, but only within one process lifetime.

The serve/mcp/cli-daemon modes now spawn a background warmup thread that waits for the scan to be ready, then (1) loads-or-rebuilds and persists the word index off the query path, and (2) replays the most repeated recent queries from the project's queries.log WAL through the same entry points real codedb_search calls use (renderPlainSearch with the handler's default max_results, searchContentAuto fallback), so the caches are warm before the first real call and the replayed searches trigger the lazy ranking builds too. CODEDB_NO_WARMUP=1 disables it.

Live MCP measurement on this repo: first-call search latency 21.8ms -> 6.3ms (remaining cost is JSON-RPC round-trip), variance 21-40ms -> 6.2-6.5ms.

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
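The replay step described above (pick the most repeated (tool, query) pairs from the log and run them on a background thread once the scan is ready) might look like the Python sketch below. The log format is not given in the source, so a tab-separated `tool<TAB>query` line is assumed here; `top_repeated`, `warm_up`, and `start_warmup` are hypothetical names.

```python
import os
import threading
from collections import Counter


def top_repeated(log_lines, limit=3):
    """Return the most repeated (tool, query) pairs, most frequent first."""
    counts = Counter()
    for line in log_lines:
        # Assumed format: "tool<TAB>query"; malformed lines are skipped.
        tool, sep, query = line.rstrip("\n").partition("\t")
        if sep and query:
            counts[(tool, query)] += 1
    return [pair for pair, n in counts.most_common(limit) if n > 1]


def warm_up(log_lines, search, ready, limit=3):
    ready.wait()  # do not replay until the initial scan has finished
    for tool, query in top_repeated(log_lines, limit):
        search(tool, query)  # same entry point a real call would use


def start_warmup(log_lines, search, ready, environ=os.environ):
    if environ.get("CODEDB_NO_WARMUP") == "1":
        return None  # kill switch: no background thread at all
    thread = threading.Thread(
        target=warm_up, args=(log_lines, search, ready), daemon=True
    )
    thread.start()
    return thread
```

Routing the replay through the ordinary search callable is the point of the design: the warmup populates the same caches and triggers the same lazy builds a real first call would.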

Multi-word queries route through searchContentAuto to searchContentRanked, which had no result cache, so repeated conceptual/NL searches (and the warmup replay of logged multi-word queries) always paid the full BM25 + centrality pass.

Uses a SEPARATE SearchResultCache instance with the same generation + env-fingerprint validation: the BM25 ranking returns different results than searchContent for an identical (query, max_results) key, so the two must never share entries (covered by a dedicated non-collision test).

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
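Why the two paths need separate cache instances can be shown with a toy Python example. Everything here is hypothetical: `search_plain` and `search_ranked` are simplified stand-ins (a term-count ordering takes the place of BM25), and plain dicts take the place of SearchResultCache.

```python
DOCS = {"a.zig": "error", "b.zig": "error error handler"}


def search_plain(query):
    """Stand-in for the unranked path: matching paths in path order."""
    return sorted(p for p, t in DOCS.items() if query in t)


def search_ranked(query):
    """Stand-in for the ranked path: more occurrences of the term rank first."""
    scored = [(t.split().count(query), p) for p, t in DOCS.items() if query in t]
    return [p for n, p in sorted(scored, key=lambda s: (-s[0], s[1]))]


def cached(cache, search, query, max_results):
    key = (query, max_results)  # identical key shape on both paths
    if key not in cache:
        cache[key] = search(query)[:max_results]
    return cache[key]
```

With one cache per path, `("error", 1)` maps to `a.zig` on the plain path and `b.zig` on the ranked path. With a single shared dict, whichever path runs first decides what the other one is served.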

findSymbol/findAllSymbols/renderSymbols all ran full O(files × symbols) outline scans on EVERY call (renderSymbols twice: count + render) to catch symbols the index missed, a #310-era safety net that predates the symbol_index_complete flag (#564). All three entry points call ensureSymbolIndex first, and a complete index is maintained by every commit (rebuildSymbolIndexFor) and removal (removeSymbolIndexFor), so when complete the scans were pure overhead: ~6ms per call on a 20k-file corpus, matching the production codedb_find tail (med 4.5ms, p90 17.7ms). Now 50-100ns for index misses.

ensureSymbolIndex also rebuilds from scratch now: entries indexed before markSymbolIndexIncomplete would otherwise be duplicated by the rebuild loop (had_prior=false skips eviction). This was previously latent and becomes load-bearing once the index is authoritative.

benchmark: print tier3/4/5 in CODEDB_BENCH_BREAKDOWN rows (they were silently omitted, hiding 6ms of tier-3 time on zero-hit queries).

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
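The gating idea (trust a complete index on a miss, fall back to the full outline scan only while the index is incomplete, and rebuild from scratch so entries are never duplicated) is sketched below in Python. `SymbolTable` and its methods are hypothetical, and the control flow is simplified relative to the real entry points, which call ensureSymbolIndex themselves.

```python
class SymbolTable:
    def __init__(self, outlines):
        self.outlines = outlines  # path -> list of symbol names
        self.index = {}  # symbol name -> list of paths
        self.index_complete = False
        self.scans = 0  # counts full outline scans, for illustration

    def mark_incomplete(self):
        self.index_complete = False

    def ensure_index(self):
        if self.index_complete:
            return
        self.index = {}  # rebuild from scratch: stale entries cannot duplicate
        for path, symbols in self.outlines.items():
            for name in symbols:
                self.index.setdefault(name, []).append(path)
        self.index_complete = True

    def find(self, name):
        hits = self.index.get(name, [])
        if hits or self.index_complete:
            return list(hits)  # a complete index is authoritative, even on a miss
        # Safety net: O(files x symbols) scan, only while the index is incomplete.
        self.scans += 1
        return [p for p, syms in self.outlines.items() if name in syms]
```

Once the index is complete, a miss costs one hash lookup instead of a walk over every outline, which is the difference the commit reports as ~6ms versus 50-100ns.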

Low-memory mode trades latency for RSS everywhere else (see compactMcpReadyMemory); do not pre-pay index builds + result caches there.

Measured warmup cost on a 620-file repo: ~70ms one-time background CPU, +4.4MB steady-state RSS vs the post-first-query baseline (caches are hard-capped at 4MB each).

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

github-actions · 2026-06-12T14:06:18Z

👋 Thanks for the contribution! Quick heads-up: this repo lands changes on the current release/* branch, not main.

Please retarget this PR via Edit → base branch to the active release branch (currently release/0.2.5825).

(Automated hint — reply here if you need a hand.)

github-actions · 2026-06-12T14:09:06Z

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
| --- | ---: | ---: | ---: | ---: | --- |
| `codedb_bundle` | 106150 | 102656 | -3.29% | -3494 | OK |
| `codedb_changes` | 10649 | 10269 | -3.57% | -380 | OK |
| `codedb_context` | 1156508 | 791949 | -31.52% | -364559 | OK |
| `codedb_deps` | 277 | 388 | +40.07% | +111 | NOISE |
| `codedb_edit` | 78084 | 39900 | -48.90% | -38184 | OK |
| `codedb_find` | 9698 | 2546 | -73.75% | -7152 | OK |
| `codedb_hot` | 25449 | 24132 | -5.18% | -1317 | OK |
| `codedb_outline` | 35501 | 37114 | +4.54% | +1613 | OK |
| `codedb_read` | 17452 | 16641 | -4.65% | -811 | OK |
| `codedb_search` | 27486 | 13183 | -52.04% | -14303 | OK |
| `codedb_snapshot` | 64413 | 70517 | +9.48% | +6104 | OK |
| `codedb_status` | 8794 | 8947 | +1.74% | +153 | OK |
| `codedb_symbol` | 21419 | 51948 | +142.53% | +30529 | NOISE |
| `codedb_tree` | 48138 | 19416 | -59.67% | -28722 | OK |
| `codedb_word` | 13784 | 11405 | -17.26% | -2379 | OK |
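The two-threshold rule stated above the table (10.00% and 50,000 ns) can be reproduced with a short Python sketch. The function name `classify`, the `FAIL` label, and the exact boundary comparisons are assumptions; the report itself only shows `OK` and `NOISE` rows.

```python
def classify(base_ns, head_ns, pct_threshold=10.0, abs_threshold_ns=50_000):
    """Classify one benchmark row; only slowdowns can leave the OK state."""
    delta = head_ns - base_ns
    pct = 100.0 * delta / base_ns
    if pct <= pct_threshold:
        return "OK"  # faster, or slower by no more than the percentage threshold
    # Percentage threshold exceeded: the absolute delta decides between
    # a real regression (assumed label "FAIL") and measurement noise.
    return "FAIL" if delta > abs_threshold_ns else "NOISE"
```

Applied to the table, `codedb_symbol` (21419 -> 51948 ns) exceeds the percentage threshold but its +30529 ns delta stays under 50,000 ns, so it is reported as NOISE rather than failing CI.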

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 956c8d9d70

chatgpt-codex-connector · 2026-06-12T14:11:09Z

+                if (list.items.len >= spec.max_results) break;
+            }
+            if (list.items.len >= spec.max_results) break;


Rank symbol matches before applying max_results

When a broad symbol query matches more than max_results entries (for example fuzzy=true on a typo with many candidate names, or kind=function in a large repo), this loop stops while iterating the symbol_index hash map and only sorts that arbitrary prefix afterward. That means better-scoring fuzzy matches, or alphabetically earlier matches that happen to be later in the map/outlines, are never considered and the new symbol search can return non-top results. Collect all matches (or maintain a top-k heap) before truncating.
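The difference between truncating before ranking and selecting a true top-k, as the review suggests, can be seen in a small Python sketch. Both function names and the `(name, score)` tuple shape are hypothetical; the reviewed code is Zig and iterates a hash map rather than a list.

```python
import heapq


def truncate_then_sort(matches, k):
    """The pattern the review flags: stop at k matches, then sort that prefix."""
    prefix = []
    for match in matches:
        prefix.append(match)
        if len(prefix) >= k:
            break  # later, possibly better-scoring matches are never seen
    return sorted(prefix, key=lambda m: -m[1])


def top_k(matches, k):
    """Consider every match and keep the k best, highest score first."""
    return heapq.nlargest(k, matches, key=lambda m: m[1])
```

`heapq.nlargest` keeps at most `k` candidates in memory while scanning, so it gives the top-k behaviour without first collecting and sorting every match.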

justrach and others added 6 commits June 12, 2026 17:24

justrach mentioned this pull request Jun 12, 2026

fix(search): reconcile skip_trigram_files with disk-loaded trigram indexes #614

Closed

4 tasks

chatgpt-codex-connector Bot reviewed Jun 12, 2026

justrach changed the base branch from main to release/0.2.5825 June 12, 2026 15:21

justrach merged commit 568c194 into release/0.2.5825 Jun 12, 2026
2 of 3 checks passed

justrach deleted the perf/background-warmup branch June 12, 2026 15:21

justrach mentioned this pull request Jun 12, 2026

fix(search): reconcile skip_trigram_files with disk-loaded trigram indexes #615

Merged

4 tasks
