perf: search result caches, background warmup, symbol-scan gating #613

Merged
justrach merged 6 commits into release/0.2.5825 from perf/background-warmup
Jun 12, 2026

@justrach

Owner

Summary

Six commits attacking the production latency tail (search p90 30ms, codedb_find p90 17.7ms) measured from the 2,467-call production query log:

  • Whole-query result LRUs for searchContent/renderPlainSearch (7c60f7d) and the BM25 searchContentRanked path (2e1c148). 64 entries / 4MB each; entries validated against both Explorer.search_gen and a fingerprint of the nine ranking kill-switch env vars. CODEDB_NO_SEARCH_CACHE=1 disables. Measured: error query 20.7us -> 2.0us on hit.
  • Race fix (2a40e62): generation bumps moved inside the exclusive lock — bumping before the lock let a concurrent search cache pre-mutation results under the post-mutation generation (permanent stale hit). Cache hits also restore the producing search's breakdown for telemetry/provenance.
  • Background warmup (9fdbc15): serve/mcp/cli-daemon spawn a thread that builds+persists the word index off the query path and replays the most-repeated queries from queries.log through the real search entry points. 62% of production calls are exact repeats of an earlier (tool, query) pair. First-call MCP search: 21.8ms -> 6.3ms. CODEDB_NO_WARMUP=1 disables; skipped under CODEDB_LOW_MEMORY (956c8d9).
  • Symbol-scan gating (c7fec7e): findSymbol/findAllSymbols/renderSymbols ran full O(files x symbols) outline safety scans on every call, a #310-era safety net that predates symbol_index_complete (#564). Gated on the flag: ~6ms/call -> 50-100ns for index misses on a 20k-file corpus. ensureSymbolIndex now rebuilds from scratch to avoid duplicate entries.

Stacked follow-up: the skip_trigram_files reconciliation fix (separate PR, based on this branch).

Test plan

  • zig build test — 835/835
  • python3 scripts/e2e_mcp_test.py — 20/20
  • Live MCP measurement on this repo (first-call latency, cache-hit timings)
  • Repo benchmark runs with CODEDB_NO_SEARCH_CACHE=1 for comparable per-query rows + explicit cached row

Generated with Devin

justrach and others added 6 commits June 12, 2026 17:24
…earch

Agents re-issue identical searches constantly. Two sibling caches now serve
repeats: SearchResultCache (searchContent - a hit dupes the cached results
into the caller's allocator, same ownership contract as a fresh search) and
PlainRenderCache (renderPlainSearch - the MCP fast path renders straight to
text and never reaches searchContent). 64 entries / 4 MB each, LRU.

An entry is served only when BOTH its generation and env fingerprint still
match. Explorer.search_gen bumps (atomically - searches hold the SHARED
lock) on every mutation that can change results: commitParsedFileOwnedOutline,
removeFile, rebuildWordIndex, and the one-shot lazy ranking builds
(ensureSymbolIndex, call-graph, co-change). The fingerprint hashes the nine
ranking kill-switch env vars, so tests that toggle CODEDB_LEX_FREQ_PENALTY
et al mid-process can never be served results computed under the other
setting. The generation is read BEFORE a search runs, so a concurrent
mutation makes the stored entry stale immediately.
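
For illustration only, the validation rule above can be sketched in Python (the real caches live in the Zig code). The class name, the second variable name in `KILL_SWITCH_VARS`, and the use of SHA-256 are assumptions; the 4 MB byte cap is left out.

```python
import hashlib
import os
from collections import OrderedDict

# Hypothetical subset of the ranking kill-switch variables. Only
# CODEDB_LEX_FREQ_PENALTY is named in this PR; the second name is made up.
KILL_SWITCH_VARS = ("CODEDB_LEX_FREQ_PENALTY", "CODEDB_EXAMPLE_SWITCH")


def env_fingerprint(environ=os.environ):
    """Hash the current values of the kill-switch variables."""
    h = hashlib.sha256()
    for name in KILL_SWITCH_VARS:
        h.update(name.encode())
        h.update(b"=")
        h.update(environ.get(name, "").encode())
        h.update(b"\0")
    return h.hexdigest()


class ResultCache:
    """LRU of whole-query results, validated by generation and fingerprint."""

    def __init__(self, max_entries=64):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # key -> (generation, fingerprint, results)

    def get(self, key, generation, fingerprint):
        entry = self.entries.get(key)
        if entry is None:
            return None
        gen, fp, results = entry
        if gen != generation or fp != fingerprint:
            del self.entries[key]  # stale entry: drop it and report a miss
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return list(results)  # copy, so the caller owns what it gets back

    def put(self, key, generation, fingerprint, results):
        self.entries[key] = (generation, fingerprint, list(results))
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used
```

A caller would read the generation before running the search and pass that same value to `put`, so an entry produced while a mutation was in flight never validates afterwards.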

CODEDB_NO_SEARCH_CACHE=1 disables both caches. The repo benchmark sets it
for its per-query rows (numbers stay comparable across versions) and adds
one explicit "cached" row: error 20.7us uncached -> 2.0us hit (10x).

8 new tests: hit identity + caller ownership, indexFile/removeFile
invalidation, env-fingerprint staleness, kill-switch bypass, LRU bound, and
the renderPlainSearch pair. 822/822 total, e2e MCP 20/20.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…breakdown on hits

Bumping search_gen BEFORE taking the exclusive lock let a concurrent
search load the new generation, win the shared lock, and cache
pre-mutation results under the post-mutation generation — a permanent
stale hit. Moved the bumps in commitParsedFileOwnedOutline, removeFile,
and rebuildWordIndex inside the exclusive lock and documented the
ordering invariant on bumpSearchGen.
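
A scripted, single-threaded Python replay of the interleaving described above may make the ordering easier to see. It is a sketch, not the Zig implementation: `Index`, `lookup`, and `run` are hypothetical names, and the lock acquisitions are represented only by the order of the statements.

```python
class Index:
    def __init__(self):
        self.gen = 0        # stands in for Explorer.search_gen
        self.data = "old"   # stands in for the indexed content
        self.cache = {}     # query -> (generation, result)


def lookup(index, query):
    """Serve a cached result only if its generation is still current."""
    entry = index.cache.get(query)
    if entry is not None and entry[0] == index.gen:
        return entry[1]
    return None


def run(bump_inside_lock):
    """Replay: the search wins the shared lock just before the writer's exclusive lock."""
    index = Index()
    if not bump_inside_lock:
        index.gen += 1                      # writer: bump BEFORE taking the lock (old behaviour)
    seen_gen = index.gen                    # search: reads the generation first
    result = index.data                     # search: holds the shared lock, sees old data
    index.cache["q"] = (seen_gen, result)   # search: caches under the generation it read
    index.data = "new"                      # writer: now holds the exclusive lock and mutates
    if bump_inside_lock:
        index.gen += 1                      # writer: bump INSIDE the lock (fixed behaviour)
    return lookup(index, "q")


print(run(bump_inside_lock=False))  # old behaviour: the stale "old" result is served
print(run(bump_inside_lock=True))   # fixed behaviour: the entry no longer validates
```

With the bump inside the lock, the worst case for a racing search is a wasted cache entry that never validates, not a wrong answer.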

Cache hits also now restore the producing search's breakdown
(tier/candidate/result counts, timings zeroed, cache_hit flag) instead
of leaving last_search_breakdown pointing at whatever search ran last —
mcp.zig's telemetry and the JSON provenance meta both read it after
every search call.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…ries.log

Production query logs (2,467 calls) show the latency tail is lazy work
charged to an innocent first query — the word-index rebuild after a
snapshot fast-load runs 50ms-2s and lands on the first codedb_word/
search call — and 62% of calls are exact repeats of an earlier
(tool, query) pair that the result caches could serve at microseconds,
but only within one process lifetime.

The serve/mcp/cli-daemon modes now spawn a background warmup thread
that waits for the scan to be ready, then (1) loads-or-rebuilds and
persists the word index off the query path, and (2) replays the most
repeated recent queries from the project's queries.log WAL through the
same entry points real codedb_search calls use (renderPlainSearch with
the handler's default max_results, searchContentAuto fallback), so the
caches are warm before the first real call and the replayed searches
trigger the lazy ranking builds too. CODEDB_NO_WARMUP=1 disables it.
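
As a rough Python sketch of the replay selection: pick the most-repeated (tool, query) pairs and measure how many calls are exact repeats. The tab-separated `tool<TAB>query` line format is an assumption made for this example; the actual queries.log layout is not shown in this PR.

```python
from collections import Counter


def parse(line):
    """Split one hypothetical 'tool<TAB>query' log line; return None if malformed."""
    line = line.rstrip("\n")
    if "\t" not in line:
        return None
    tool, query = line.split("\t", 1)
    return (tool, query)


def most_repeated(log_lines, limit=3):
    """Return up to `limit` (tool, query) pairs that occur more than once."""
    counts = Counter(p for p in map(parse, log_lines) if p is not None)
    return [pair for pair, n in counts.most_common(limit) if n > 1]


def repeat_fraction(log_lines):
    """Fraction of calls that exactly repeat an earlier (tool, query) pair."""
    seen, repeats, total = set(), 0, 0
    for pair in map(parse, log_lines):
        if pair is None:
            continue
        total += 1
        if pair in seen:
            repeats += 1
        seen.add(pair)
    return repeats / total if total else 0.0


lines = [
    "codedb_search\terror",
    "codedb_find\tExplorer",
    "codedb_search\terror",
    "codedb_search\terror",
    "codedb_find\tExplorer",
    "codedb_word\tonce",
]
print(most_repeated(lines, limit=2))
print(repeat_fraction(lines))
```

A warmup thread would then feed each selected pair through the same search entry point a live call uses, so the replay populates the caches as a side effect.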

Live MCP measurement on this repo: first-call search latency 21.8ms ->
6.3ms (remaining cost is JSON-RPC round-trip), observed range 21-40ms ->
6.2-6.5ms.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Multi-word queries route through searchContentAuto to searchContentRanked,
which had no result cache — so repeated conceptual/NL searches (and the
warmup replay of logged multi-word queries) always paid the full BM25 +
centrality pass.

Uses a SEPARATE SearchResultCache instance with the same generation +
env-fingerprint validation: the BM25 ranking returns different results
than searchContent for an identical (query, max_results) key, so the two
must never share entries (covered by a dedicated non-collision test).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
findSymbol/findAllSymbols/renderSymbols all ran full O(files × symbols)
outline scans on EVERY call (renderSymbols twice: count + render) to
catch symbols the index missed — a #310-era safety net that predates
the symbol_index_complete flag (#564). All three entry points call
ensureSymbolIndex first, and a complete index is maintained by every
commit (rebuildSymbolIndexFor) and removal (removeSymbolIndexFor), so
when complete the scans were pure overhead: ~6ms per call on a
20k-file corpus, matching the production codedb_find tail (med 4.5ms,
p90 17.7ms). Now 50-100ns for index misses.
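
The gating can be pictured with a small Python sketch: the fallback scan runs only while the index is not known to be complete. `SymbolTable`, `build_index`, and the scan counter are hypothetical names for illustration, not the Zig API.

```python
class SymbolTable:
    def __init__(self, outlines):
        self.outlines = outlines      # file path -> list of symbol names
        self.index = {}               # symbol name -> list of file paths
        self.index_complete = False   # stands in for symbol_index_complete
        self.scans = 0                # counts full outline scans, for the demo

    def build_index(self):
        """Build the index from every outline and mark it authoritative."""
        self.index = {}
        for path, names in self.outlines.items():
            for name in names:
                self.index.setdefault(name, []).append(path)
        self.index_complete = True

    def _scan(self, name):
        """O(files x symbols) safety scan over all outlines."""
        self.scans += 1
        for path, names in self.outlines.items():
            if name in names:
                return path
        return None

    def find(self, name):
        hit = self.index.get(name)
        if hit:
            return hit[0]
        if self.index_complete:
            return None               # authoritative miss: skip the scan
        return self._scan(name)       # index may be missing entries: fall back


table = SymbolTable({"a.zig": ["findSymbol"], "b.zig": ["renderSymbols"]})
print(table.find("findSymbol"), table.scans)   # found by the fallback scan
table.build_index()
print(table.find("missing"), table.scans)      # miss answered by the index alone
```

The design relies on every commit and removal keeping the index in sync; if that invariant broke, a gated miss would be a false negative.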

ensureSymbolIndex also rebuilds from scratch now: entries indexed
before markSymbolIndexIncomplete would otherwise be duplicated by the
rebuild loop (had_prior=false skips eviction) — previously latent,
load-bearing once the index is authoritative.
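
The duplicate-entry hazard is easy to reproduce in miniature. In this hedged Python sketch (function names are hypothetical), re-adding symbols on top of entries that were indexed earlier doubles them, while starting from an empty index does not.

```python
def rebuild_in_place(index, outlines):
    """Re-add every symbol without evicting what is already indexed."""
    for path, names in outlines.items():
        for name in names:
            index.setdefault(name, []).append(path)
    return index


def rebuild_from_scratch(outlines):
    """Discard the old index and build a fresh one."""
    return rebuild_in_place({}, outlines)


outlines = {"a.zig": ["foo"]}
already_indexed = {"foo": ["a.zig"]}   # entry added before the index was marked incomplete

print(rebuild_in_place(already_indexed, outlines))   # "foo" now lists a.zig twice
print(rebuild_from_scratch(outlines))                # "foo" lists a.zig once
```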

benchmark: print tier3/4/5 in CODEDB_BENCH_BREAKDOWN rows (they were
silently omitted, hiding 6ms of tier-3 time on zero-hit queries).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Low-memory mode trades latency for RSS everywhere else (see
compactMcpReadyMemory); do not pre-pay index builds + result caches
there. Measured warmup cost on a 620-file repo: ~70ms one-time
background CPU, +4.4MB steady-state RSS vs the post-first-query
baseline (caches are hard-capped at 4MB each).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@github-actions


👋 Thanks for the contribution! Quick heads-up: this repo lands changes on the current release/* branch, not main.

Please retarget this PR via Edit → base branch to the active release branch (currently release/0.2.5825).

(Automated hint — reply here if you need a hand.)

@github-actions


Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
|---|---:|---:|---:|---:|---|
| codedb_bundle | 106150 | 102656 | -3.29% | -3494 | OK |
| codedb_changes | 10649 | 10269 | -3.57% | -380 | OK |
| codedb_context | 1156508 | 791949 | -31.52% | -364559 | OK |
| codedb_deps | 277 | 388 | +40.07% | +111 | NOISE |
| codedb_edit | 78084 | 39900 | -48.90% | -38184 | OK |
| codedb_find | 9698 | 2546 | -73.75% | -7152 | OK |
| codedb_hot | 25449 | 24132 | -5.18% | -1317 | OK |
| codedb_outline | 35501 | 37114 | +4.54% | +1613 | OK |
| codedb_read | 17452 | 16641 | -4.65% | -811 | OK |
| codedb_search | 27486 | 13183 | -52.04% | -14303 | OK |
| codedb_snapshot | 64413 | 70517 | +9.48% | +6104 | OK |
| codedb_status | 8794 | 8947 | +1.74% | +153 | OK |
| codedb_symbol | 21419 | 51948 | +142.53% | +30529 | NOISE |
| codedb_tree | 48138 | 19416 | -59.67% | -28722 | OK |
| codedb_word | 13784 | 11405 | -17.26% | -2379 | OK |
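
The Status column is consistent with a two-threshold rule, sketched here in Python. This is a reconstruction from the report text, not the CI script itself: the `FAIL` label and the exact comparison operators at the boundaries are assumptions.

```python
PCT_THRESHOLD = 10.0       # percent, from the report header
ABS_THRESHOLD_NS = 50_000  # nanoseconds, from the report header


def classify(base_ns, head_ns):
    """Classify one benchmark row as OK, NOISE, or FAIL (label assumed)."""
    delta = head_ns - base_ns
    pct = 100.0 * delta / base_ns
    if pct <= PCT_THRESHOLD:
        return "OK"      # improvement, or a regression within the percentage budget
    if delta < ABS_THRESHOLD_NS:
        return "NOISE"   # large percentage, but too small in absolute terms to fail CI
    return "FAIL"        # both thresholds exceeded


print(classify(277, 388))        # codedb_deps row
print(classify(21419, 51948))    # codedb_symbol row
print(classify(27486, 13183))    # codedb_search row
```

Under this rule codedb_symbol (+142.53%) stays NOISE only because its +30,529 ns delta is below the 50,000 ns absolute threshold.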

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 956c8d9d70


Comment thread src/explore.zig
Comment on lines +2807 to +2809
if (list.items.len >= spec.max_results) break;
}
if (list.items.len >= spec.max_results) break;


[P2] Rank symbol matches before applying max_results

When a broad symbol query matches more than max_results entries (for example fuzzy=true on a typo with many candidate names, or kind=function in a large repo), this loop stops while iterating the symbol_index hash map and only sorts that arbitrary prefix afterward. That means better-scoring fuzzy matches, or alphabetically earlier matches that happen to be later in the map/outlines, are never considered and the new symbol search can return non-top results. Collect all matches (or maintain a top-k heap) before truncating.
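
A small Python sketch of the difference the review describes, using made-up match data: truncating during iteration and sorting afterwards ranks only an arbitrary prefix, whereas a top-k selection considers every match. The function names and scores are hypothetical.

```python
import heapq


def truncated_then_sorted(matches, k):
    """Stop after k matches in iteration order, then sort only that prefix."""
    prefix = []
    for match in matches:
        prefix.append(match)
        if len(prefix) >= k:
            break
    return sorted(prefix, key=lambda m: m[1], reverse=True)


def top_k(matches, k):
    """Consider every match and keep the k best by score."""
    return heapq.nlargest(k, matches, key=lambda m: m[1])


# (name, score) pairs in an arbitrary, hash-map-like iteration order.
matches = [("fooz", 0.2), ("fob", 0.4), ("foo", 1.0), ("food", 0.8)]

print(truncated_then_sorted(matches, 2))  # best of the first two seen only
print(top_k(matches, 2))                  # best two overall
```

`heapq.nlargest` keeps at most k candidates in memory, which matches the reviewer's top-k heap suggestion without collecting every match into a list first.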


@justrach justrach changed the base branch from main to release/0.2.5825 June 12, 2026 15:21
@justrach justrach merged commit 568c194 into release/0.2.5825 Jun 12, 2026
2 of 3 checks passed
@justrach justrach deleted the perf/background-warmup branch June 12, 2026 15:21