feat: opt-in in-memory package-insert cache (ADR-0011)#43

Merged
shin13 merged 5 commits into main from feat/insert-cache
Jul 1, 2026
@shin13 shin13 commented Jul 1, 2026

Summary

Implements ADR-0011 — an opt-in, off-by-default, in-memory cache of raw
GetDrugDoc XML keyed by license code. Cuts repeat egress to mcp.fda.gov.tw
for the shared HTTP service (ADR-0010 Model B), where every clinician's
get_package_insert leaves through one IP. Individual uvx users keep the
ADR-0009 live-fetch behaviour by default (cache off).

  • InsertCache (sources/insert/cache.py): stores raw XML bytes, re-parses
    on hit (one entry serves any fields/response_format, robust to parser
    changes). TTL + entry/byte caps with oldest-first eviction; per-license herd
    lock with a lock-free fast path; periodic INFO stats rollup.
  • Client split: fetch_drug_insert_bytes (network→bytes) under the existing
    fetch_drug_insert (bytes→parsed). check_insert_updates keeps the parsed
    path and never touches the cache.
  • Response: get_package_insert gains from_cache + cache_age_hours;
    retrieved_at reports the real TFDA fetch time even on a hit; last_update_date
    (clinical currency) is unaffected by caching.
  • Safety: empty and unparseable responses are never cached — a transient
    upstream blip cannot pin a false INSERT_NOT_FOUND for the TTL.
  • Config: INSERT_CACHE_ENABLED (off), INSERT_CACHE_TTL_HOURS (6),
    INSERT_CACHE_MAX_ENTRIES (1000), INSERT_CACHE_MAX_MB (128).
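
The four settings above could be read from the environment roughly as follows. This is a minimal sketch, not the project's config module: the `InsertCacheConfig` dataclass, the `load_insert_cache_config` helper, and the set of accepted truthy strings are assumptions. Only the variable names and defaults come from this PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class InsertCacheConfig:
    """Hypothetical container for the INSERT_CACHE_* settings."""

    enabled: bool = False    # cache is off unless explicitly enabled
    ttl_hours: float = 6.0
    max_entries: int = 1000
    max_mb: int = 128

    @property
    def ttl_seconds(self) -> float:
        return self.ttl_hours * 3600

    @property
    def max_bytes(self) -> int:
        return self.max_mb * 1024 * 1024


def load_insert_cache_config(
    env: Optional[Mapping[str, str]] = None,
) -> InsertCacheConfig:
    """Build the config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}  # assumed spelling of "enabled"
    return InsertCacheConfig(
        enabled=env.get("INSERT_CACHE_ENABLED", "").strip().lower() in truthy,
        ttl_hours=float(env.get("INSERT_CACHE_TTL_HOURS", "6")),
        max_entries=int(env.get("INSERT_CACHE_MAX_ENTRIES", "1000")),
        max_mb=int(env.get("INSERT_CACHE_MAX_MB", "128")),
    )
```

With no variables set, the cache stays disabled, which matches the off-by-default behaviour described for uvx users.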

Deviation from ADR §3 phasing: both entry and byte caps ship together (the
byte counter is needed regardless, one eviction loop handles both, and the
Verification section tests the byte cap) — clarity over splitting.
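
To illustrate why one eviction loop can serve both caps, here is a rough sketch of a TTL-bounded store with oldest-first eviction. It is not the `InsertCache` code from this PR: the `BoundedXmlStore` class, its method names, and the injectable `clock` are assumptions made for the example.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class BoundedXmlStore:
    """Hypothetical store of raw XML bytes keyed by license code."""

    def __init__(
        self,
        ttl_seconds: float,
        max_entries: int,
        max_bytes: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._max_entries = max_entries
        self._max_bytes = max_bytes
        self._clock = clock
        # Insertion order doubles as age order: oldest entry first.
        self._entries: "OrderedDict[str, Tuple[float, bytes]]" = OrderedDict()
        self._total_bytes = 0

    def put(self, license_code: str, xml: bytes) -> None:
        self._discard(license_code)  # a refreshed entry becomes the newest
        self._entries[license_code] = (self._clock(), xml)
        self._total_bytes += len(xml)
        self._evict()

    def get(self, license_code: str) -> Optional[bytes]:
        item = self._entries.get(license_code)
        if item is None:
            return None
        stored_at, xml = item
        if self._clock() - stored_at >= self._ttl:
            self._discard(license_code)  # expired: treat as a miss
            return None
        return xml

    def _discard(self, license_code: str) -> None:
        item = self._entries.pop(license_code, None)
        if item is not None:
            self._total_bytes -= len(item[1])

    def _evict(self) -> None:
        # One loop enforces both caps, dropping the oldest entry each pass.
        while self._entries and (
            len(self._entries) > self._max_entries
            or self._total_bytes > self._max_bytes
        ):
            _, (_, xml) = self._entries.popitem(last=False)
            self._total_bytes -= len(xml)
```

In this sketch a single payload larger than the byte cap evicts itself immediately, so it is simply never retained.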

Test Plan

  • ruff check . clean
  • pyright src — 0 errors
  • pytest — 152 passed (+20 new), 8 integration/smoke deselected, ~1.7s
  • Output schema snapshot regenerated (additive: from_cache, cache_age_hours)
  • CI green on the PR

Assisted-By: Claude <noreply@anthropic.com>

shin13 added 5 commits July 1, 2026 20:17
Assisted-By: Claude <noreply@anthropic.com>
Separates network (bytes) from parse so the cache can store raw XML and
re-parse on hit. check_insert_updates keeps using the parsed wrapper.

Assisted-By: Claude <noreply@anthropic.com>
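
A rough sketch of the network/parse split described in this commit. The function names `fetch_bytes`, `fetch_parsed`, and `parse_insert_xml`, the injected `transport` callable, and the flat XML layout are all assumptions for illustration; the real GetDrugDoc schema and client signatures are not shown in this PR.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

Transport = Callable[[str], bytes]  # hypothetical: license code -> raw XML


def parse_insert_xml(raw: bytes) -> Optional[Dict[str, str]]:
    """Parse a made-up flat XML layout; return None if empty or malformed."""
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return None
    fields = {child.tag: (child.text or "") for child in root}
    return fields or None


def fetch_bytes(license_code: str, transport: Transport) -> bytes:
    """Network step only: this is the value a cache would store."""
    return transport(license_code)


def fetch_parsed(license_code: str, transport: Transport) -> Optional[Dict[str, str]]:
    """Thin wrapper: fetch the bytes, then parse them."""
    return parse_insert_xml(fetch_bytes(license_code, transport))
```

Because parsing is a separate step, a cached byte string can be re-parsed on every hit, so a parser change takes effect without invalidating stored entries.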
In-memory LRU-by-age cache of raw GetDrugDoc XML keyed by license code:
TTL + entry/byte caps (oldest-first eviction), per-key herd lock with a
lock-free fast path. Empty and unparseable results are never cached (no
false 'not found'); periodic INFO stats rollup for tuning.

Assisted-By: Claude <noreply@anthropic.com>
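
The per-key herd lock with a lock-free fast path can be sketched as below, assuming an asyncio server. `HerdGuardedCache`, its constructor arguments, and the `is_cacheable` predicate are hypothetical names. TTL, size caps, and lock cleanup are left out to keep the locking pattern visible.

```python
import asyncio
from typing import Awaitable, Callable, Dict


class HerdGuardedCache:
    """Hypothetical cache: at most one in-flight fetch per key."""

    def __init__(
        self,
        fetch: Callable[[str], Awaitable[bytes]],
        is_cacheable: Callable[[bytes], bool],
    ) -> None:
        self._fetch = fetch
        self._is_cacheable = is_cacheable
        self._data: Dict[str, bytes] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    async def get(self, key: str) -> bytes:
        cached = self._data.get(key)
        if cached is not None:
            return cached  # fast path: a hit never touches a lock

        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            # Re-check: another waiter may have filled the entry meanwhile.
            cached = self._data.get(key)
            if cached is not None:
                return cached
            raw = await self._fetch(key)
            # Empty or unparseable payloads are returned but never stored,
            # so a transient upstream failure is retried on the next call.
            if self._is_cacheable(raw):
                self._data[key] = raw
            return raw
```

Concurrent requests for the same license code wait on one lock and share the first fetch, while requests for different codes proceed independently.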
… (ADR-0011)

Adds from_cache + cache_age_hours to the response; retrieved_at reports the
real TFDA fetch time even on a hit. last_update_date is unaffected by caching.

Assisted-By: Claude <noreply@anthropic.com>
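
One way the cache metadata fields could be derived, as a sketch. The `cache_metadata` helper, the two-decimal rounding, and the choice of `None` for `cache_age_hours` on a live fetch are assumptions; the PR only states that the fields exist and that `retrieved_at` is the real fetch time.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def cache_metadata(
    fetched_at: datetime, now: datetime, from_cache: bool
) -> Dict[str, Any]:
    """Hypothetical helper building the cache-related response fields."""
    age_hours = None
    if from_cache:
        age_hours = round((now - fetched_at).total_seconds() / 3600, 2)
    return {
        "from_cache": from_cache,
        "cache_age_hours": age_hours,
        # Always the time the upstream fetch happened, not the time of
        # the current request, so a hit does not look fresher than it is.
        "retrieved_at": fetched_at.isoformat(),
    }


fetched = datetime(2026, 7, 1, 8, 0, tzinfo=timezone.utc)
hit = cache_metadata(fetched, fetched + timedelta(hours=1, minutes=30), True)
```

Here `hit["cache_age_hours"]` is 1.5 and `hit["retrieved_at"]` still points at the original 08:00 UTC fetch.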
… cache

ADR-0011 -> Accepted. CHANGELOG, .env.example, README deployment note.

Assisted-By: Claude <noreply@anthropic.com>
@shin13 shin13 merged commit d6e0027 into main Jul 1, 2026
2 checks passed
@shin13 shin13 deleted the feat/insert-cache branch July 1, 2026 12:23