perf: 37x faster Calendar, correct-plan Tags, no idle refetch #153
Merged
Tier 1 of the performance roadmap. The analytics read path opened a fresh, untuned SQLite connection per query; the day view opened six per click on a default-journal-mode DB.

- db::tune() applies WAL, synchronous=NORMAL, mmap_size=256MB, cache_size=64MB, temp_store=MEMORY on both the writer open() and the read-path open_ro(). WAL stops reader/scanner contention; mmap+cache keep the hot DB warm across opens.
- open_ro() now hands back a pooled, tuned connection via a Deref guard (PooledConn) that returns itself to a process-wide pool on drop, so the PRAGMA setup and page cache stay warm. Callsites compile unchanged thanks to Deref coercion.

Measured on a real 245MB / 147k-row DB: /api/day 477.7 -> 130.7 ms (3.6x), tags-summary 73 -> 45 ms, daily 88 -> 65 ms. Adds a perf_probe example to keep each tier honest with before/after numbers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
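The pooled Deref guard described in Tier 1 can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: `Conn` is a hypothetical stand-in for a tuned SQLite connection, and the `POOL` and `OPENED` statics and the `describe` helper are assumptions made for the example. Only the names `PooledConn` and `open_ro` come from the commit message.

```rust
use std::ops::Deref;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a tuned SQLite connection.
#[derive(Debug)]
struct Conn {
    id: u32,
}

/// Process-wide pool of idle connections (assumed shape).
static POOL: Mutex<Vec<Conn>> = Mutex::new(Vec::new());
/// Counts how many connections were actually opened.
static OPENED: AtomicU32 = AtomicU32::new(0);

/// Guard that lends out a connection and returns it to the pool on drop.
struct PooledConn {
    conn: Option<Conn>,
}

impl Deref for PooledConn {
    type Target = Conn;
    fn deref(&self) -> &Conn {
        self.conn.as_ref().expect("connection is present until drop")
    }
}

impl Drop for PooledConn {
    fn drop(&mut self) {
        if let Some(conn) = self.conn.take() {
            POOL.lock().unwrap().push(conn);
        }
    }
}

/// Reuse an idle connection if one exists, otherwise "open" a new one.
fn open_ro() -> PooledConn {
    let reused = POOL.lock().unwrap().pop();
    let conn = reused.unwrap_or_else(|| Conn {
        id: OPENED.fetch_add(1, Ordering::SeqCst) + 1,
    });
    PooledConn { conn: Some(conn) }
}

/// A callsite written against `&Conn`; it accepts `&PooledConn` via Deref coercion.
fn describe(conn: &Conn) -> String {
    format!("conn #{}", conn.id)
}

fn main() {
    let first = open_ro();
    let a = describe(&first);
    drop(first); // the connection goes back to the pool here

    let second = open_ro();
    let b = describe(&second);
    println!("{a} then {b}"); // prints "conn #1 then conn #1"
}
```

Because the guard dereferences to the connection type, functions that take `&Conn` keep compiling when handed a `&PooledConn`, which is the property the commit relies on for unchanged callsites.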
Tier 2 of the performance roadmap. Every day/* query filters `WHERE substr(timestamp,1,10) = ?`, which made the plain timestamp index unusable and full-scanned messages: six scans per calendar day-click.

Add expression indexes idx_messages_day (substr(timestamp,1,10), type) and idx_tools_day. SQLite uses an expression index when the WHERE clause repeats the expression, so the existing SQL needs no rewrite. Indexes ship in SCHEMA with IF NOT EXISTS, so init_db builds them once on next launch for existing DBs.

day_by_model EXPLAIN: SCAN -> SEARCH ... USING INDEX idx_messages_day. /api/day 130.7 -> 14.0 ms (34x vs baseline), under the 50ms target.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tier 3 of the performance roadmap. With no sqlite_stat1, the planner drove tag_aggregates from messages (SCAN m over 147k rows) instead of the small session_tags table. ANALYZE flips it to SCAN st -> SEARCH m USING idx_messages_thread.

Rather than build precomputed rollup tables (write-path complexity + drift risk), refresh stats in two cheap spots:

- scan_dir runs `PRAGMA analysis_limit=400; PRAGMA optimize` after any scan that added rows.
- init_db runs a one-time sampled ANALYZE when sqlite_stat1 is absent, so existing DBs get stats on next launch.

tags-summary 46 -> 39 ms here; much larger win for users with few tagged sessions (no longer scans every message). All hot queries now under the 50ms target, so no rollups needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tier 4 of the performance roadmap. The background scan loop ticks every 10s and emitted scan_complete unconditionally, so every connected frontend refetched its whole endpoint registry every 10s even when the user generated no new transcripts.

Emit only when the scan ingested rows. Suppressing idle events is safe because liveness is maintained by the server's 15s keep-alive ping, the frontend watchdog is one-shot, and the manual /api/scan route returns stats over HTTP.

The planned query cache and SSE debounce were dropped as YAGNI: after T1-T3 every hot query is 13-40ms warm.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
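The "emit only when the scan ingested rows" rule can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `ScanStats`, `Event`, and `notify_if_changed` are hypothetical names, and a standard-library channel stands in for whatever broadcast mechanism the server actually uses.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical result of one background scan tick.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ScanStats {
    rows_ingested: usize,
}

/// Hypothetical event type delivered to connected frontends.
#[derive(Debug, PartialEq)]
enum Event {
    ScanComplete(ScanStats),
}

/// Send `ScanComplete` only when the scan added rows.
/// Returns true if an event was actually sent.
fn notify_if_changed(tx: &Sender<Event>, stats: ScanStats) -> bool {
    if stats.rows_ingested == 0 {
        return false; // idle tick: nothing changed, so nothing to refetch
    }
    tx.send(Event::ScanComplete(stats)).is_ok()
}

fn main() {
    let (tx, rx) = channel();

    // Four simulated ticks; only the third one ingests rows.
    let ticks = [0, 0, 12, 0];
    for rows in ticks {
        notify_if_changed(&tx, ScanStats { rows_ingested: rows });
    }
    drop(tx); // close the channel so the receiver loop ends

    let events: Vec<Event> = rx.iter().collect();
    println!("{} event(s) for {} ticks", events.len(), ticks.len());
    // prints "1 event(s) for 4 ticks"
}
```

The guard sits at the emit site rather than in the frontend, so idle ticks cost no network traffic at all, while a separate keep-alive (not modeled here) continues to signal liveness.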
Arylmera added a commit that referenced this pull request May 31, 2026
…bottleneck (#154)

Follow-up to PR #153, which suspected that first_prompts dominated /api/day (~95ms of ~108ms end-to-end). Measured directly: first_prompts is ~6ms for ~188 sessions and the endpoint is ~31ms server-side (raw HTTP). The earlier ~108ms was PowerShell Invoke-RestMethod deserializing the 4.6KB response into objects, a client-side artifact absent in the app, whose JS frontend parses the payload natively in <1ms.

No query change warranted: rewriting the windowed query to GROUP BY MIN(timestamp) would risk the identical-timestamp tie case for a gain of ~4ms on an endpoint already under the 50ms target. Add the first_prompts timing + EXPLAIN to perf_probe so the finding stays measurable.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Why
Calendar day-click and the Tags tab were slow on a real-sized DB (245 MB, 147k message rows). Root cause was structural in the read path, not any one view.
What
Four tiers, each measured against the real DB with a perf_probe example (median of 5 warm runs). Endpoints measured: /api/day (calendar day-click), /api/tags-summary, /api/daily.

- db::tune() applies WAL + synchronous=NORMAL + mmap/cache/temp_store on both the writer and read paths; open_ro() now hands back a pooled, tuned connection via a Deref guard so the day view stops opening six cold connections per click. 477→131 ms.
- Expression indexes on substr(timestamp,1,10). The day queries were full-scanning messages because substr() defeated the timestamp index; now they seek (SCAN → SEARCH … USING INDEX idx_messages_day). 131→14 ms.
- tag_aggregates drove its join from messages (full scan) for lack of statistics. PRAGMA optimize after scans + a one-time ANALYZE in init_db flip it to drive from the small session_tags table. No rollup tables needed.
- scan_complete now broadcasts only when a scan ingested rows, so idle dashboards stop refetching the whole endpoint registry every 10s. Liveness rides the existing 15s keep-alive ping.

Verification

- cargo test --workspace: all pass
- cargo fmt --check clean, cargo clippy --all-targets --workspace -- -D warnings clean
- /api/day + /api/tags-summary return correct payloads, all responsive.

Notes

- The new indexes and statistics are built by init_db (on launch) for existing DBs; no manual migration.
- first_prompts (windowed query over a day's sessions) is now the dominant remaining cost of /api/day end-to-end (~95 ms of ~108 ms); could be shaved if sub-50 ms end-to-end is wanted.

🤖 Generated with Claude Code