security: enforce nonce-store hard cap to bound memory under unique-id bursts #19

Merged
keirsalterego merged 2 commits into main from fix/nonce-store-hard-cap
May 29, 2026


@keirsalterego

Contributor

Item 4 of the pre-launch hardening punch list (proxy side).

Problem

NonceStore::claim_or_replay ran only TTL-based eviction (evict_expired) when at the cap. A burst of more than MAX_RECORDS (100k) unique request_ids inside the RETENTION_SECONDS (600s) window leaves every record younger than the cutoff, so evict_expired frees nothing and the DashMap grows without bound until the process runs out of memory (OOM). The MAX_RECORDS doc comment claimed a hard cap ("evict the oldest entries to make room") that the code never implemented.
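
The failure mode can be reproduced in miniature. This is a sketch only, not the proxy's code: `TtlOnlyStore`, `claim`, and `fresh_burst_len` are hypothetical names, a plain `HashMap` stands in for the DashMap, timestamps are bare seconds, and the cap is passed in so the example can use a small value instead of 100k.

```rust
use std::collections::HashMap;

/// Retention window quoted in the PR description (600s).
const RETENTION_SECONDS: u64 = 600;

/// Hypothetical stand-in for the store: request_id -> insertion time (seconds).
struct TtlOnlyStore {
    records: HashMap<String, u64>,
}

impl TtlOnlyStore {
    fn new() -> Self {
        TtlOnlyStore { records: HashMap::new() }
    }

    /// Removes records older than the retention cutoff and returns how many were freed.
    fn evict_expired(&mut self, now: u64) -> usize {
        let cutoff = now.saturating_sub(RETENTION_SECONDS);
        let before = self.records.len();
        self.records.retain(|_, inserted_at| *inserted_at >= cutoff);
        before - self.records.len()
    }

    /// Pre-fix behavior as described above: TTL eviction at the cap,
    /// then the insert happens whether or not anything was freed.
    fn claim(&mut self, request_id: &str, now: u64, cap: usize) {
        if self.records.len() >= cap {
            self.evict_expired(now);
        }
        self.records.insert(request_id.to_string(), now);
    }
}

/// Inserts `n` unique ids at a single timestamp and returns the final map size.
fn fresh_burst_len(n: usize, cap: usize) -> usize {
    let mut store = TtlOnlyStore::new();
    for i in 0..n {
        store.claim(&format!("req-{}", i), 1_000, cap);
    }
    store.records.len()
}
```

With every record fresh, `fresh_burst_len(150, 100)` ends with 150 entries: the cap check fires on each claim past 100, but `evict_expired` finds nothing older than the cutoff.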

This path is authenticated (a valid HMAC and the 30s replay window gate it), so it is not an open-internet DoS, but the bounded-memory guarantee the comment promises was false.

Fix

evict_to_cap drops the oldest entries down to 90% of the cap when TTL eviction leaves the map still at/above MAX_RECORDS. Keys are collected (iterator fully drained) before removal, so no DashMap iteration guard overlaps a write guard.
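
A minimal sketch of the eviction described above, under stated assumptions: `BoundedStore`, its fields, and `fresh_burst_len` are hypothetical, a plain `HashMap` replaces the DashMap, and age is an insertion timestamp in seconds. The collect-then-remove order mirrors the point about guards. With a `HashMap` the borrow checker enforces that order; with DashMap it has to be done by hand.

```rust
use std::collections::HashMap;

/// Hypothetical bounded store: request_id -> insertion time (seconds).
struct BoundedStore {
    records: HashMap<String, u64>,
    max_records: usize,
    retention_seconds: u64,
}

impl BoundedStore {
    fn new(max_records: usize, retention_seconds: u64) -> Self {
        BoundedStore { records: HashMap::new(), max_records, retention_seconds }
    }

    fn len(&self) -> usize {
        self.records.len()
    }

    fn contains(&self, request_id: &str) -> bool {
        self.records.contains_key(request_id)
    }

    /// TTL eviction: drops records older than the retention cutoff.
    fn evict_expired(&mut self, now: u64) {
        let cutoff = now.saturating_sub(self.retention_seconds);
        self.records.retain(|_, inserted_at| *inserted_at >= cutoff);
    }

    /// Hard-cap eviction: drops the oldest entries until at most 90% of the cap remains.
    fn evict_to_cap(&mut self) {
        let target = self.max_records * 9 / 10;
        if self.records.len() <= target {
            return;
        }
        let excess = self.records.len() - target;
        // Drain the iterator into an owned Vec before removing anything,
        // so no read of the map is still live during the writes.
        let mut by_age: Vec<(u64, String)> = self
            .records
            .iter()
            .map(|(key, inserted_at)| (*inserted_at, key.clone()))
            .collect();
        by_age.sort_unstable(); // oldest timestamp first, ties broken by key
        for (_, key) in by_age.into_iter().take(excess) {
            self.records.remove(&key);
        }
    }

    /// TTL eviction first; the hard cap applies only if the map is still full.
    fn claim(&mut self, request_id: &str, now: u64) {
        if self.records.len() >= self.max_records {
            self.evict_expired(now);
            if self.records.len() >= self.max_records {
                self.evict_to_cap();
            }
        }
        self.records.insert(request_id.to_string(), now);
    }
}

/// Inserts `cap + extra` unique fresh ids and returns the final map size.
fn fresh_burst_len(cap: usize, extra: usize) -> usize {
    let mut store = BoundedStore::new(cap, 600);
    for i in 0..(cap + extra) {
        store.claim(&format!("req-{}", i), 1_000);
    }
    store.len()
}
```

Sorting the whole key set costs O(n log n) per eviction, but evicting down to 90% rather than to cap minus one means that cost is paid once per 10% of the cap in new claims, not on every claim.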

Test

unique_id_burst_is_bounded_by_hard_cap inserts MAX_RECORDS + 1000 unique fresh claims and asserts len() <= MAX_RECORDS.

Validation

  • cargo test nonce — 7 passed (incl. new test)
  • cargo clippy -- -D warnings — clean
  • cargo fmt --check — clean

…ECURE

F7 from the 2026-05-26 CSO review: serving /execute and /audit/export over
plain HTTP on a non-loopback address exposes containment commands and tenant
audit history in cleartext (signed for integrity, not encrypted). The proxy
now refuses to start in that case unless ALLOW_INSECURE=true is set
explicitly, matching the safe-by-default posture of DRY_RUN. A warning was
not enough to stop a misconfigured deploy from serving cleartext to the
internet.

Validation: cargo fmt --check + clippy -D warnings clean, 19 tests pass.
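
The startup gate described in this commit message could look roughly like the following. This is an illustration under assumptions, not the proxy's code: `check_bind_policy` and its parameters are hypothetical, and treating only the exact string "true" as opt-in is a guess at how ALLOW_INSECURE is parsed.

```rust
use std::net::SocketAddr;

/// Hypothetical startup check. Returns Err when the proxy should refuse to start.
/// `allow_insecure` is the raw value of the ALLOW_INSECURE variable, if set;
/// a caller might obtain it with `std::env::var("ALLOW_INSECURE").ok()`.
fn check_bind_policy(
    bind: SocketAddr,
    tls_enabled: bool,
    allow_insecure: Option<&str>,
) -> Result<(), String> {
    // TLS, or plain HTTP confined to loopback, is always acceptable.
    if tls_enabled || bind.ip().is_loopback() {
        return Ok(());
    }
    // Plain HTTP on a non-loopback address requires an explicit opt-in.
    // Assumption: only the exact string "true" counts.
    if allow_insecure == Some("true") {
        return Ok(());
    }
    Err(format!(
        "refusing to serve plain HTTP on non-loopback address {}; set ALLOW_INSECURE=true to override",
        bind
    ))
}
```

Returning a `Result` lets the caller abort startup instead of logging a warning and continuing, which is the behavior change the message describes.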
claim_or_replay ran only TTL-based eviction (evict_expired) at the cap, so a burst of more than MAX_RECORDS unique request_ids inside the RETENTION window left every record younger than the cutoff, freed nothing, and grew the map without bound (OOM). The MAX_RECORDS comment promised a hard cap that did not exist. evict_to_cap now drops the oldest entries down to 90% of the cap when TTL eviction is insufficient. Keys are collected before removal so no DashMap iteration and write guard overlap. Regression test inserts MAX_RECORDS+1000 unique fresh claims and asserts len() stays <= MAX_RECORDS.
Copilot AI review requested due to automatic review settings May 29, 2026 06:10
@keirsalterego keirsalterego merged commit f190ab3 into main May 29, 2026
1 check failed
@keirsalterego keirsalterego deleted the fix/nonce-store-hard-cap branch May 29, 2026 06:10

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

2 participants