perf: cache hot-path work in retrieval, logging, and health#74
Merged
Conversation
Four low-risk optimizations on frequently-executed paths: - retrieval/embeddings.py: memoize get_embedding() so repeated query text skips the SentenceTransformer forward pass (the costliest retrieval step). Cache returns an immutable tuple; get_embedding returns a fresh list copy so callers cannot poison the cache. Adds reset_embedding_cache(). - utils/logger.py: precompile redaction regexes once per privacy config instead of recompiling on every audited field of every query. - utils/health.py: cache parsed config per path so /health stops re-reading and re-parsing config.yaml on every request. - utils/personality.py: remove a duplicated _SQL_INSERT_SOUL_VERSION definition (dead reassignment). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WqUukeKob4K5cfg2GXBqwP
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Four low-risk performance optimizations on frequently-executed code paths. Each was verified against the actual code (and functionally exercised) rather than applied blindly — a couple of candidate findings were discarded as incorrect or already implemented.
Optimizations
retrieval/embeddings.py— memoize query embeddings (highest impact)get_embedding()ran a full SentenceTransformer forward pass on every call, including for repeated query text. It now caches results vialru_cache. The cached value is an immutable tuple andget_embeddingreturns a freshlistcopy, so callers cannot poison the cache. Addedreset_embedding_cache().utils/logger.py— precompile redaction regexesredact_sensitive()runs on every audited field of every query and previously recompiled its email/IP/secret regexes each call. Patterns are now compiled once per privacy configuration (keyed on the hashable privacy settings, so a config change still yields a fresh set). Signature unchanged.utils/health.py— cache parsed config per pathcheck_all()runs on every/healthrequest and re-read + re-parsedconfig.yamleach time. Config is now parsed once per path vialru_cache.utils/personality.py— remove dead codeDeleted a duplicated
_SQL_INSERT_SOUL_VERSIONdefinition (the first was immediately overwritten by an identical reassignment).Discarded candidates (for the record)
stem_tagsas a native list in ChromaDB metadata — not possible; ChromaDB metadata only accepts scalar types, sojson.dumpsis required.embeddings.py/logger.py— already cached at HEAD.Verification
redact_sensitive(text, cfg)signature preserved;_SQL_INSERT_SOUL_VERSIONstill defined once and used at 4 call sites;check_allis patched in gate tests).🤖 Generated with Claude Code
Generated by Claude Code