perf: cache hot-path work in retrieval, logging, and health by CGFixIT · Pull Request #74 · CGFixIT/CyClaw

CGFixIT · 2026-06-20T07:36:42Z

Summary

Four low-risk performance optimizations on frequently-executed code paths. Each was verified against the actual code (and functionally exercised) rather than applied blindly; two candidate findings were discarded as incorrect or already implemented, and a third group was skipped as risky or negligible.

Optimizations

retrieval/embeddings.py — memoize query embeddings (highest impact)
get_embedding() ran a full SentenceTransformer forward pass on every call, including for repeated query text. It now caches results via lru_cache. The cached value is an immutable tuple and get_embedding returns a fresh list copy, so callers cannot poison the cache. Added reset_embedding_cache().
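
A minimal sketch of the tuple-in-cache, list-on-return pattern described above. The `encode` function is a hypothetical stand-in for the SentenceTransformer forward pass, and `maxsize=1024` is an assumed value; neither is taken from the PR diff.

```python
from functools import lru_cache
from typing import List, Tuple

# Call counter so the effect of the cache is observable in this sketch.
ENCODE_CALLS = {"count": 0}


def encode(text: str) -> List[float]:
    """Hypothetical stand-in for the expensive model forward pass."""
    ENCODE_CALLS["count"] += 1
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


@lru_cache(maxsize=1024)  # assumed size, not taken from the PR
def cached_embedding(text: str) -> Tuple[float, ...]:
    # Store an immutable tuple so the cached value itself cannot be mutated.
    return tuple(encode(text))


def get_embedding(text: str) -> List[float]:
    # Build a fresh list per call; mutating it leaves the cached tuple intact.
    return list(cached_embedding(text))


def reset_embedding_cache() -> None:
    # Drop all memoized embeddings, e.g. after swapping the model.
    cached_embedding.cache_clear()
```

Repeated calls with the same text hit the cache, and appending to a returned list does not affect later results.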
utils/logger.py — precompile redaction regexes
redact_sensitive() runs on every audited field of every query and previously recompiled its email/IP/secret regexes each call. Patterns are now compiled once per privacy configuration (keyed on the hashable privacy settings, so a config change still yields a fresh set). Signature unchanged.
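
One way to compile patterns once per privacy configuration, sketched under assumptions: the regexes, the `redact_emails` / `redact_ips` / `extra_patterns` keys, and the `[REDACTED]` placeholder are all illustrative and not taken from the project. Only the `redact_sensitive(text, cfg)` signature comes from the PR.

```python
import re
from functools import lru_cache
from typing import Tuple

# Illustrative patterns only; the project's real regexes are not shown in the PR.
EMAIL = r"[\w.+-]+@[\w-]+\.[\w.-]+"
IPV4 = r"\b(?:\d{1,3}\.){3}\d{1,3}\b"


def privacy_key(cfg: dict) -> Tuple[bool, bool, Tuple[str, ...]]:
    """Reduce the (hypothetical) privacy settings to a hashable cache key."""
    privacy = cfg.get("privacy", {})
    return (
        bool(privacy.get("redact_emails", True)),
        bool(privacy.get("redact_ips", True)),
        tuple(privacy.get("extra_patterns", ())),
    )


@lru_cache(maxsize=8)
def compiled_patterns(
    redact_emails: bool, redact_ips: bool, extra: Tuple[str, ...]
) -> Tuple[re.Pattern, ...]:
    sources = []
    if redact_emails:
        sources.append(EMAIL)
    if redact_ips:
        sources.append(IPV4)
    sources.extend(extra)
    compiled = []
    for source in sources:
        try:
            compiled.append(re.compile(source))
        except re.error:
            continue  # skip malformed patterns instead of failing the log call
    return tuple(compiled)


def redact_sensitive(text: str, cfg: dict) -> str:
    # A changed privacy config produces a different key, hence a fresh set.
    for pattern in compiled_patterns(*privacy_key(cfg)):
        text = pattern.sub("[REDACTED]", text)
    return text
```

Keying the cache on a tuple of settings rather than on the config dict itself sidesteps the fact that dicts are unhashable.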
utils/health.py — cache parsed config per path
check_all() runs on every /health request and previously re-read and re-parsed config.yaml each time. Config is now parsed once per path via lru_cache.
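
A rough sketch of per-path config caching. JSON stands in for YAML here so the example needs only the standard library, and `load_config` is a hypothetical name; the actual helper in utils/health.py is not shown in the PR.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_config(path: str) -> dict:
    """Read and parse the file once per distinct path string."""
    # JSON is an assumption for this sketch; the project parses config.yaml.
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

In this sketch every caller receives the same dict object, and edits to the file on disk are not seen until `load_config.cache_clear()` is called.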
utils/personality.py — remove dead code
Deleted a duplicated _SQL_INSERT_SOUL_VERSION definition (the first was immediately overwritten by an identical reassignment).

Discarded candidates (for the record)

Storing stem_tags as a native list in ChromaDB metadata — not possible; ChromaDB metadata only accepts scalar types, so json.dumps is required.
"Config not cached" in embeddings.py / logger.py — already cached at HEAD.
Hybrid-merge dict consolidation & micro config-lookup hoisting — skipped as risky/negligible.
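
To illustrate the first discarded candidate: if metadata values must be scalars, a list such as stem_tags has to be serialized to a string on write and parsed on read. A minimal sketch using only the standard library; the `source` field and both helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def to_metadata(stem_tags: List[str], source: str) -> Dict[str, Any]:
    # The list is stored as a JSON string so every metadata value is a scalar.
    return {"source": source, "stem_tags": json.dumps(stem_tags)}


def stem_tags_from(metadata: Dict[str, Any]) -> List[str]:
    # Reverse the encoding on read; a missing key yields an empty list.
    return json.loads(metadata.get("stem_tags", "[]"))
```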

Verification

Syntax-checked all four files.
Functionally exercised the changed logic directly (deps stubbed where chromadb/torch aren't installable in this env):
- logger redaction produces correct output incl. malformed-regex skip;
- embedding cache returns hits and the copy-on-return prevents mutation poisoning;
- health config returns the same cached object across calls.
Confirmed existing callers/tests remain compatible (redact_sensitive(text, cfg) signature preserved; _SQL_INSERT_SOUL_VERSION still defined once and used at 4 call sites; check_all is patched in gate tests).
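
Stubbing unavailable dependencies, as mentioned above, can be sketched roughly like this. It is an assumption about the approach, not the exact method used for this PR; `stub_if_missing` is a hypothetical helper.

```python
import importlib
import sys
import types


def stub_if_missing(name: str) -> types.ModuleType:
    """Return the real module if importable, else register an empty stub."""
    try:
        return importlib.import_module(name)
    except ImportError:
        stub = types.ModuleType(name)
        # Later `import name` statements will resolve to this placeholder.
        sys.modules[name] = stub
        return stub
```

An empty stub only satisfies the import statement; any attribute the code under test actually uses still has to be added to the stub by hand.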

Note: the full pytest suite couldn't run in this environment because chromadb/torch aren't installed (conftest imports them). Recommend running pytest in CI to confirm.

🤖 Generated with Claude Code


Four low-risk optimizations on frequently-executed paths:

- retrieval/embeddings.py: memoize get_embedding() so repeated query text skips the SentenceTransformer forward pass (the costliest retrieval step). Cache returns an immutable tuple; get_embedding returns a fresh list copy so callers cannot poison the cache. Adds reset_embedding_cache().
- utils/logger.py: precompile redaction regexes once per privacy config instead of recompiling on every audited field of every query.
- utils/health.py: cache parsed config per path so /health stops re-reading and re-parsing config.yaml on every request.
- utils/personality.py: remove a duplicated _SQL_INSERT_SOUL_VERSION definition (dead reassignment).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01WqUukeKob4K5cfg2GXBqwP

CGFixIT marked this pull request as ready for review June 20, 2026 07:40

CGFixIT merged commit cadd8cd into main Jun 20, 2026
14 checks passed

CGFixIT deleted the claude/routines-code-optimization-9x7g50 branch June 20, 2026 07:45

CGFixIT restored the claude/routines-code-optimization-9x7g50 branch June 20, 2026 07:45

CGFixIT deleted the claude/routines-code-optimization-9x7g50 branch June 20, 2026 10:23
