fix(search): reconcile skip_trigram_files with disk-loaded trigram indexes #615

Merged
justrach merged 1 commit into release/0.2.5825 from perf/skip-trigram-reconcile
Jun 12, 2026

@justrach

Owner

Summary

Replaces #614 (auto-closed when its stacked base branch was deleted). Follow-up to #613, now merged into release/0.2.5825.

Fixes the dominant production search-tail bug: on the standard serve/mcp/cli-daemon startup path, tier 3 content-scanned the entire project on every fall-through query with recall_complete=false.

Two compounding failures:

  1. Snapshot restore parks every file in skip_trigram_files (it cannot know what a disk trigram index covers), and nothing reconciled the set when the disk index was later mmap-loaded.
  2. loadTrigramFromDiskIfPresent early-returned whenever trigram fileCount() > 0 — and the snapshot freshness pass reindexes changed files into the heap trigram before that check runs, so a single dirty file blocked the disk trigram load entirely.
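The second failure is easiest to see by putting the old gate next to the corrected one. The following is a minimal Python sketch, not the project's actual code: the function names are hypothetical, "the heap covers the whole project" is modeled as a simple file-count comparison (an assumption), and the 3-of-616 numbers are illustrative.

```python
def old_gate_skips_disk_load(heap_file_count: int) -> bool:
    """Pre-fix gate (as described): any file in the heap trigram blocks the disk load."""
    return heap_file_count > 0


def new_gate_skips_disk_load(disk_backed: bool, heap_file_count: int,
                             project_file_count: int) -> bool:
    """Post-fix gate (sketch): skip only if already disk-backed or the heap covers everything."""
    return disk_backed or heap_file_count >= project_file_count


# Illustrative startup state: the freshness pass has already reindexed 3 dirty
# files into the heap trigram of a 616-file project before the gate runs.
heap_files, project_files = 3, 616
old_skips = old_gate_skips_disk_load(heap_files)
new_skips = new_gate_skips_disk_load(False, heap_files, project_files)
print(old_skips, new_skips)  # prints "True False"
```

Under the old gate a single dirty file is enough to skip the disk load; under the sketched new gate the load still happens unless the index is already disk-backed or the heap is complete.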

Measured live on this repo: the scan set went from 613/616 files to 0 files, and negative searches went from 9.2ms to 0.5–0.9ms, now with recall_complete=true. This matches the production search p90 (30ms).

Fix:

  • adoptTrigramIndex: single funnel for trigram replacement — swaps, bumps the search generation, prunes covered files from the skip set
  • adoptTrigramBase: mmap disk load keeps freshness-reindexed files as a masking overlay, so their newer content wins over stale base entries
  • rebuildTrigrams: now prunes what it covers + bumps the generation (it did neither)
  • both disk-load gates skip only when already disk-backed or the heap covers the whole project
  • watcher: CODEDB_TRIGRAM_CAP env override (measured on a 20k-file corpus: uncapped = zero-hit queries 7.1ms -> 1.4ms for +110MB peak RSS, +300ms index time); provenance meta reports the effective cap. Default unchanged.
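The adopt-and-overlay behavior in the first two bullets can be illustrated with a small Python model. This is a sketch under stated assumptions, not the Zig implementation: `TrigramState`, `adopt_base`, `candidates`, and the file names are hypothetical, the index is modeled as a plain path-to-trigram-set map, and overlay files are assumed to count as "covered" when the skip set is pruned.

```python
from dataclasses import dataclass, field


def trigrams(text: str) -> set[str]:
    """All 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


@dataclass
class TrigramState:
    base: dict[str, set[str]] = field(default_factory=dict)     # disk-loaded index
    overlay: dict[str, set[str]] = field(default_factory=dict)  # freshness-reindexed files
    skip_trigram_files: set[str] = field(default_factory=set)   # files needing a content scan
    generation: int = 0

    def adopt_base(self, base: dict[str, set[str]]) -> None:
        """Single funnel: swap the base, bump the generation, prune covered files."""
        self.base = base
        self.generation += 1
        self.skip_trigram_files -= set(base) | set(self.overlay)

    def candidates(self, query: str) -> set[str]:
        """Files whose trigrams contain every trigram of the query."""
        needed = trigrams(query)
        hits = {p for p, g in self.overlay.items() if needed <= g}
        # Base entries for overlay files are masked: the newer content wins.
        hits |= {p for p, g in self.base.items()
                 if p not in self.overlay and needed <= g}
        return hits


state = TrigramState(skip_trigram_files={"a.zig", "b.zig", "c.zig"})
state.overlay["a.zig"] = trigrams("new_content")  # reindexed before the disk load
state.adopt_base({"a.zig": trigrams("old_content"), "b.zig": trigrams("base_only")})
```

After `adopt_base`, a query for `new_con` resolves to `a.zig` through the overlay, `old_con` matches nothing because the stale base entry is masked, `base_on` resolves to `b.zig`, and only `c.zig` (covered by neither index) remains in the skip set.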

Test plan

  • zig build test — 835/835 (includes new overlay test: new content wins / stale masked / base-only files resolve)
  • python3 scripts/e2e_mcp_test.py — 20/20
  • Live verification: skip set 613 -> 0 files after mmap load; overlay search finds new-content matches in freshness-reindexed files alongside base files
  • 20k-file corpus RSS/latency measurements for the cap override

Generated with Devin

…ndexes

Snapshot restore parks EVERY file in skip_trigram_files (it cannot know
what a disk trigram index covers), and nothing ever removed entries when
the index was later mmap-loaded. Worse, loadTrigramFromDiskIfPresent
early-returned whenever trigram fileCount() > 0 — and the snapshot
freshness pass reindexes changed files into the heap trigram BEFORE that
check runs, so one dirty file blocked the disk load entirely. Net
effect on the dominant serve/mcp/cli-daemon startup path: tier 3
content-scanned the ENTIRE project on every fall-through query, with
recall_complete=false. Measured live on this repo: 613/616 files in the
scan set, 9.2ms negative searches -> 0 files, 0.5-0.9ms,
recall_complete=true. This matches the production search tail (p90 30ms).

- adoptTrigramIndex: single funnel for trigram replacement — swaps,
  bumps the search generation, and prunes covered files from the skip set
- adoptTrigramBase: mmap disk load keeps freshness-reindexed files as a
  masking overlay (their newer content wins; stale base entries masked)
- rebuildTrigrams: prunes what it covers + bumps the generation (it did
  neither)
- both disk-load gates now skip only when already disk-backed or the
  heap covers the whole project
- watcher: trigram cap env override CODEDB_TRIGRAM_CAP (measured on a
  20k-file corpus: uncapped = zero-hit queries 7.1->1.4ms for +110MB
  peak RSS); provenance meta reports the effective cap

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@justrach justrach merged commit f6670e1 into release/0.2.5825 Jun 12, 2026
3 checks passed
@justrach justrach deleted the perf/skip-trigram-reconcile branch June 12, 2026 15:22
@github-actions

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
|---|---|---|---|---|---|
| codedb_bundle | 105203 | 109979 | +4.54% | +4776 | OK |
| codedb_changes | 10707 | 12326 | +15.12% | +1619 | NOISE |
| codedb_context | 798289 | 791905 | -0.80% | -6384 | OK |
| codedb_deps | 326 | 331 | +1.53% | +5 | OK |
| codedb_edit | 36682 | 40252 | +9.73% | +3570 | OK |
| codedb_find | 3016 | 4587 | +52.09% | +1571 | NOISE |
| codedb_hot | 24400 | 24714 | +1.29% | +314 | OK |
| codedb_outline | 36330 | 36819 | +1.35% | +489 | OK |
| codedb_read | 17228 | 16527 | -4.07% | -701 | OK |
| codedb_search | 13148 | 13413 | +2.02% | +265 | OK |
| codedb_snapshot | 73249 | 77025 | +5.16% | +3776 | OK |
| codedb_status | 9841 | 12963 | +31.72% | +3122 | NOISE |
| codedb_symbol | 52225 | 55042 | +5.39% | +2817 | OK |
| codedb_tree | 19387 | 20686 | +6.70% | +1299 | OK |
| codedb_word | 11647 | 14224 | +22.13% | +2577 | NOISE |
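
The two-threshold rule behind the Status column can be sketched as follows. This is a reading of the report's own description, not the CI script itself: the `classify` name, the `FAIL` label for a real regression, and the strict `>` comparisons at the boundaries are assumptions.

```python
PCT_THRESHOLD = 10.0       # percent, from the report header
ABS_THRESHOLD_NS = 50_000  # nanoseconds, from the report header


def classify(base_ns: int, head_ns: int) -> str:
    """Classify one benchmark row using both thresholds."""
    delta_ns = head_ns - base_ns
    delta_pct = delta_ns / base_ns * 100
    if delta_pct <= PCT_THRESHOLD:
        return "OK"
    if delta_ns <= ABS_THRESHOLD_NS:
        # Percentage threshold exceeded, but the absolute delta is too small to fail CI.
        return "NOISE"
    return "FAIL"  # hypothetical label; no such row appears in this report


print(classify(10707, 12326))  # codedb_changes row
print(classify(36682, 40252))  # codedb_edit row
```

Applied to the rows above, the sketch gives NOISE for codedb_changes (+15.12%, +1619 ns) and OK for codedb_edit (+9.73%), which agrees with the reported statuses.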
