perf(search): whole-query result LRU (10x on repeated searches) by justrach · Pull Request #612 · justrach/codedb

justrach · 2026-06-12T09:24:36Z

Summary

Whole-query LRU caches for repeated searches - agents re-issue identical queries constantly, and until now every repeat re-ran the full tier pipeline.

SearchResultCache (searchContent): a hit dupes the cached results into the caller's allocator - identical ownership contract to a fresh search.
PlainRenderCache (renderPlainSearch): the MCP codedb_search fast path renders straight to text and never reaches searchContent, so it gets its own rendered-bytes cache.
Both: 64 entries / 4 MB, LRU eviction, single-entry byte cap.
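
The bounds above can be sketched as follows. This is a minimal Python illustration, not the project's Zig code: the class name `BoundedLRU` and the `max_entry_bytes` default are assumptions (the PR gives 64 entries / 4 MB but does not state the single-entry cap), and values are plain bytes rather than search results.

```python
from collections import OrderedDict


class BoundedLRU:
    """LRU cache bounded by entry count, total bytes, and a per-entry byte cap."""

    def __init__(self, max_entries=64, max_bytes=4 * 1024 * 1024,
                 max_entry_bytes=256 * 1024):
        # 64 entries / 4 MB come from the PR text; max_entry_bytes is a
        # made-up value used only for illustration.
        self.max_entries = max_entries
        self.max_bytes = max_bytes
        self.max_entry_bytes = max_entry_bytes
        self.total_bytes = 0
        self._entries = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key):
        value = self._entries.get(key)
        if value is None:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        # Return a fresh copy so the caller owns what it receives.
        return bytearray(value)

    def put(self, key, value):
        size = len(value)
        if size > self.max_entry_bytes:
            return False  # oversized values are never cached
        old = self._entries.pop(key, None)
        if old is not None:
            self.total_bytes -= len(old)
        self._entries[key] = bytes(value)
        self.total_bytes += size
        # Evict from the least recently used end until both bounds hold.
        while self._entries and (len(self._entries) > self.max_entries
                                 or self.total_bytes > self.max_bytes):
            _, evicted = self._entries.popitem(last=False)
            self.total_bytes -= len(evicted)
        return True
```

Returning a copy from `get` mirrors the ownership contract described for SearchResultCache: the caller can mutate or free what it receives without touching the cached entry.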

Correctness model

An entry is served only when BOTH validators still match:

Generation (Explorer.search_gen, atomic): bumped by every mutation that can change results - commitParsedFileOwnedOutline (all indexing funnels through it), removeFile, rebuildWordIndex, and the one-shot lazy ranking builds (ensureSymbolIndex, call-graph, co-change - these flip ranking gates mid-flight on first use).
Env fingerprint: hash of the nine ranking kill-switch env vars (CODEDB_NO_COCHANGE, CODEDB_LEX_FREQ_PENALTY, ...). The existing test suite toggles these mid-process between identical searches on one explorer; the fingerprint guarantees those can never see stale results.

The generation is read BEFORE a search runs, so a concurrent mutation makes the stored entry stale immediately (conservative direction). CODEDB_NO_SEARCH_CACHE=1 disables both caches entirely.
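
The two-validator check can be sketched like this. It is a Python illustration under stated assumptions, not the project's Zig implementation: `ValidatedCache`, `env_fingerprint`, and the SHA-256 hash are hypothetical choices, and only the two env var names quoted in this description are included because the remaining seven are not listed here.

```python
import hashlib
import os
from dataclasses import dataclass

# Only these two names appear in the PR text; the other seven are elided there.
RANKING_ENV_VARS = ("CODEDB_NO_COCHANGE", "CODEDB_LEX_FREQ_PENALTY")


def env_fingerprint(environ=os.environ):
    """Hash the ranking env vars, distinguishing unset from empty."""
    h = hashlib.sha256()
    for name in RANKING_ENV_VARS:
        value = environ.get(name)
        marker = b"\x00" if value is None else b"\x01" + value.encode()
        h.update(name.encode() + b"=" + marker + b"\n")
    return h.hexdigest()


@dataclass
class Entry:
    generation: int
    fingerprint: str
    results: list


class ValidatedCache:
    def __init__(self):
        self.generation = 0  # stands in for Explorer.search_gen
        self.entries = {}

    def bump(self):
        """Called by any mutation that can change results."""
        self.generation += 1

    def search(self, query, run_search, environ=os.environ):
        if environ.get("CODEDB_NO_SEARCH_CACHE") == "1":
            return run_search(query)  # kill switch: bypass the cache
        gen = self.generation  # read BEFORE the search runs
        fp = env_fingerprint(environ)
        entry = self.entries.get(query)
        if entry is not None and entry.generation == gen and entry.fingerprint == fp:
            return list(entry.results)  # hit: the caller gets its own copy
        results = run_search(query)
        self.entries[query] = Entry(gen, fp, list(results))
        return results
```

Tagging the stored entry with the generation read before the search means a mutation that bumps during the search leaves the entry already stale, which is the conservative direction described above.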

Benchmark honesty

The repo benchmark sets CODEDB_NO_SEARCH_CACHE=1 for its per-query rows so numbers stay comparable across versions, and adds one explicit cached row:

error    20.7 us   50   search    (uncached, unchanged)
error     2.0 us   50   cached    (10x on repeats)

Test plan

8 new tests: hit identity + caller-owned copies (testing.allocator catches UAF/leaks), indexFile/removeFile invalidation, env-fingerprint staleness + re-hit, kill-switch bypass, LRU entry bound, renderPlainSearch hit + invalidation
zig build test --summary all - 822/822
python3 scripts/e2e_mcp_test.py - 20/20

Generated with Devin

…earch

Agents re-issue identical searches constantly. Two sibling caches now serve repeats: SearchResultCache (searchContent - a hit dupes the cached results into the caller's allocator, same ownership contract as a fresh search) and PlainRenderCache (renderPlainSearch - the MCP fast path renders straight to text and never reaches searchContent). 64 entries / 4 MB each, LRU.

An entry is served only when BOTH its generation and env fingerprint still match. Explorer.search_gen bumps (atomically - searches hold the SHARED lock) on every mutation that can change results: commitParsedFileOwnedOutline, removeFile, rebuildWordIndex, and the one-shot lazy ranking builds (ensureSymbolIndex, call-graph, co-change). The fingerprint hashes the nine ranking kill-switch env vars, so tests that toggle CODEDB_LEX_FREQ_PENALTY et al mid-process can never be served results computed under the other setting.

The generation is read BEFORE a search runs, so a concurrent mutation makes the stored entry stale immediately. CODEDB_NO_SEARCH_CACHE=1 disables both caches. The repo benchmark sets it for its per-query rows (numbers stay comparable across versions) and adds one explicit "cached" row: error 20.7us uncached -> 2.0us hit (10x).

8 new tests: hit identity + caller ownership, indexFile/removeFile invalidation, env-fingerprint staleness, kill-switch bypass, LRU bound, and the renderPlainSearch pair. 822/822 total, e2e MCP 20/20.

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c60f7d981

chatgpt-codex-connector · 2026-06-12T09:26:48Z

    }

    pub fn commitParsedFileOwnedOutline(self: *Explorer, path: []const u8, content: []const u8, outline: FileOutline, full_index: bool, skip_trigram: bool) !void {
+        self.bumpSearchGen();


Move search_gen bumps after the protected mutation

Because this bump happens before the exclusive lock and before the indexed state is updated, a concurrent search can read the new generation at searchContent, acquire the shared lock first, compute results from the old index, and then store those old results under the new generation. After this file commit completes, later identical searches will see the same generation and can serve that stale cache entry indefinitely; the same ordering also affects removals/rebuilds that bump before taking the write lock. Bump the generation while holding the write lock after the mutation, or otherwise make the generation read/state read atomic with respect to the mutation.
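
The interleaving described above can be replayed deterministically. The following is a single-threaded Python sketch, not the project's code: the function name, the one-file index, and the dict-based cache are all hypothetical stand-ins, and the locks are represented only by the order of the steps.

```python
def run_interleaving(bump_before_mutation):
    """Replay one writer/reader interleaving and return what a later search sees."""
    index = {"a.zig": "old"}  # stands in for the indexed state
    state = {"gen": 0}        # stands in for Explorer.search_gen
    cache = {}                # query -> (generation, results)

    def bump():
        state["gen"] += 1

    # Writer, step 1: with the ordering under review, the bump comes first,
    # before the exclusive lock is taken and before the index changes.
    if bump_before_mutation:
        bump()

    # Reader: reads the generation, then searches the not-yet-updated index.
    seen_gen = state["gen"]
    results = list(index.values())

    # Writer, step 2: the mutation itself (under the exclusive lock in real code).
    index["a.zig"] = "new"
    if not bump_before_mutation:
        bump()  # suggested ordering: bump after the mutation, lock still held

    # Reader stores what it computed, tagged with the generation it read.
    cache["q"] = (seen_gen, results)

    # A later identical search serves the entry only if the generation matches.
    stored_gen, stored = cache["q"]
    if stored_gen == state["gen"]:
        return stored
    return list(index.values())


print(run_interleaving(True))   # ['old']: stale entry served under the new generation
print(run_interleaving(False))  # ['new']: entry rejected, fresh results computed
```

With the bump first, the reader tags old results with the already-bumped generation, so nothing later invalidates them. With the bump after the mutation, the reader's tag is one generation behind and the entry is rejected.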

github-actions · 2026-06-12T09:27:35Z

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
| --- | ---: | ---: | ---: | ---: | --- |
| `codedb_bundle` | 105570 | 108913 | +3.17% | +3343 | OK |
| `codedb_changes` | 13777 | 10500 | -23.79% | -3277 | OK |
| `codedb_context` | 1703188 | 1562024 | -8.29% | -141164 | OK |
| `codedb_deps` | 329 | 317 | -3.65% | -12 | OK |
| `codedb_edit` | 35996 | 48338 | +34.29% | +12342 | NOISE |
| `codedb_find` | 13601 | 2692 | -80.21% | -10909 | OK |
| `codedb_hot` | 25725 | 17081 | -33.60% | -8644 | OK |
| `codedb_outline` | 35662 | 36315 | +1.83% | +653 | OK |
| `codedb_read` | 16280 | 16423 | +0.88% | +143 | OK |
| `codedb_search` | 10719 | 9352 | -12.75% | -1367 | OK |
| `codedb_snapshot` | 67883 | 81114 | +19.49% | +13231 | NOISE |
| `codedb_status` | 14581 | 9388 | -35.61% | -5193 | OK |
| `codedb_symbol` | 50667 | 51889 | +2.41% | +1222 | OK |
| `codedb_tree` | 35050 | 45581 | +30.05% | +10531 | NOISE |
| `codedb_word` | 11939 | 12861 | +7.72% | +922 | OK |
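
The status column is consistent with a two-threshold rule, sketched below in Python. This is an assumption-laden reading of the report rather than the CI script itself: the function name, the `FAIL` label, and the exact boundary comparisons are hypothetical, and only slowdowns are treated as candidates for failure.

```python
def classify(base_ns, head_ns, pct_threshold=10.0, abs_threshold_ns=50_000):
    """Classify a benchmark row using a percentage and an absolute threshold."""
    delta = head_ns - base_ns
    pct = 100.0 * delta / base_ns
    if pct <= pct_threshold:
        return "OK"      # improvement, or a slowdown within the percentage threshold
    if delta < abs_threshold_ns:
        return "NOISE"   # percentage exceeded, absolute delta too small to fail CI
    return "FAIL"        # assumed label: both thresholds exceeded


print(classify(35996, 48338))      # codedb_edit -> NOISE
print(classify(1703188, 1562024))  # codedb_context -> OK
```

Applied to the rows in the report, this rule gives NOISE for `codedb_edit`, `codedb_snapshot`, and `codedb_tree`, and OK for the rest.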

justrach merged commit 7c60f7d into release/0.2.5825 Jun 12, 2026
2 checks passed

justrach merged 1 commit into release/0.2.5825 from perf/search-result-cache
