Merged
96 changes: 96 additions & 0 deletions docs/design/ADVISOR.md
@@ -0,0 +1,96 @@
# Design: Per-shard background advisor (expert selection + bounded knob autotuning)

Issue: #126. Decisions: ADR-0013 (advisor off/shadow default posture), ADR-0008
(S3-FIFO default eviction). Related: #91 (safety guardrails, ADVISOR_SAFETY.md),
#48 (EvictionPolicy trait, EVICTION.md), #49 (W-TinyLFU filter, WTINYLFU.md), #85
(config snapshot, CONFIG.md), #13 (no per-request inference), #88 (parent,
decomposed).

## Goal and scope

This specifies the per-shard background advisor: an off-path loop that weights
experts over the deterministic policy set and autotunes a bounded knob set, then
publishes its choice through the atomic config-snapshot swap that
ADVISOR_SAFETY.md (#91) defines. The request loop stays inference-free and
deterministic (#13); the advisor only observes counters and proposes snapshots.
The off/shadow default posture is fixed by ADR-0013 and is not re-decided here.

In scope: the expert-weighting mechanism, the policy set it selects among, the
bounded knob set, the cadence/hysteresis shape, the snapshot-swap coupling, and
the binding to the EvictionPolicy trait (#48). Out of scope: numeric tuning
(retune interval, marginal-gain threshold), deferred to the harness (#8); the
safety envelope (bounds enforcement, rollback, kill-switch, seeding), owned by
#91; the promotion gate (#154).

## Design

### Expert weighting

- Each shard runs a regret-minimizing / contextual-bandit controller that weights
experts off the hot path. LeCaR maintains weights over two experts (LRU and LFU)
with regret minimization and outperforms ARC by more than 18x at small
cache-to-working-set ratios [lecar-regret-min-18x]; CACHEUS generalizes this to
an adaptive mixture selected per workload primitive [cacheus-experts]. We borrow
the controller as the off-path selector and reject per-request ensemble
evaluation, which would reintroduce hot-path cost (#13).
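
As a rough illustration of the weighting step, the sketch below applies a Hedge-style multiplicative-weights update, in the spirit of LeCaR but not the paper's exact algorithm. The `ExpertWeights` type, its method names, and the learning rate are hypothetical and are not taken from the IronCache code.

```rust
/// Hypothetical per-shard expert weights; all names here are illustrative.
#[derive(Debug, Clone)]
pub struct ExpertWeights {
    weights: Vec<f64>,
    learning_rate: f64,
}

impl ExpertWeights {
    /// Start with uniform weights over `n` experts.
    pub fn uniform(n: usize, learning_rate: f64) -> Self {
        assert!(n > 0, "need at least one expert");
        Self { weights: vec![1.0 / n as f64; n], learning_rate }
    }

    /// Penalize expert `idx` with a loss in [0, 1] (for example, a miss on a
    /// key that this expert chose to evict), then renormalize to sum to 1.
    pub fn penalize(&mut self, idx: usize, loss: f64) {
        let loss = loss.clamp(0.0, 1.0);
        self.weights[idx] *= (-self.learning_rate * loss).exp();
        let total: f64 = self.weights.iter().sum();
        for w in &mut self.weights {
            *w /= total;
        }
    }

    /// Index of the highest-weighted expert (first one wins on ties).
    pub fn best(&self) -> usize {
        let mut best = 0;
        for (i, w) in self.weights.iter().enumerate() {
            if *w > self.weights[best] {
                best = i;
            }
        }
        best
    }

    pub fn weights(&self) -> &[f64] {
        &self.weights
    }
}
```

The update runs only in the background loop from observed counters, so nothing in this sketch would execute per request.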

### Policy set

- The experts are the cheap deterministic policies already behind the trait:
SIEVE (one FIFO, a hand, a visited bit) [sieve-algorithm], a W-TinyLFU
admission filter [wtinylfu-caffeine-sketch] (the non-ML floor, WTINYLFU.md
#49), and sampled LRU/LFU. The controller selects among them and tunes their
knobs; it never invents a policy. The default eviction core remains S3-FIFO
with its small/main split [s3fifo-small-main-split] (ADR-0008), which the
advisor may select but does not replace as the baseline.
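
To make the "one FIFO, a hand, a visited bit" description of SIEVE concrete, here is a minimal sketch. It assumes `u64` keys and uses a plain `Vec` with linear lookup for brevity; the `Sieve` type and its methods are hypothetical and do not reflect the EvictionPolicy trait's actual signatures (#48).

```rust
/// Minimal SIEVE sketch: one FIFO queue, one hand, one visited bit per entry.
/// Lookup and removal are O(n) here; a real implementation would pair an
/// index with a linked queue.
pub struct Sieve {
    capacity: usize,
    /// (key, visited); index 0 is the oldest entry, the end is the newest.
    entries: Vec<(u64, bool)>,
    /// Position of the eviction hand; it survives across evictions.
    hand: usize,
}

impl Sieve {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, entries: Vec::with_capacity(capacity), hand: 0 }
    }

    /// Returns true on a hit. A hit only sets the visited bit; nothing moves.
    pub fn access(&mut self, key: u64) -> bool {
        if let Some(entry) = self.entries.iter_mut().find(|e| e.0 == key) {
            entry.1 = true;
            return true;
        }
        if self.entries.len() == self.capacity {
            self.evict();
        }
        self.entries.push((key, false));
        false
    }

    /// Sweep from the hand toward newer entries, clearing visited bits, and
    /// evict the first unvisited entry. Wraps to the oldest entry at the end.
    fn evict(&mut self) -> u64 {
        loop {
            if self.hand >= self.entries.len() {
                self.hand = 0;
            }
            if self.entries[self.hand].1 {
                self.entries[self.hand].1 = false;
                self.hand += 1;
            } else {
                // After removal the hand already points at the next-newer entry.
                return self.entries.remove(self.hand).0;
            }
        }
    }

    pub fn contains(&self, key: u64) -> bool {
        self.entries.iter().any(|e| e.0 == key)
    }
}
```

Because a hit only flips a bit and never relinks the queue, the policy stays cheap on the read path, which is the property that makes it a suitable expert here.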

### Bounded knob set

- The advisor tunes a small, bounded set: the active policy, sample count, LFU
log-factor [redis-lfu-log-factor] and decay, ghost size, and
slab/encoding/compression thresholds. The set is deliberately small so the
search space is enumerable and every proposal maps to a documented knob. Bounds
enforcement, clamping, and rejection are the safety spec's job (#91).

### Cadence and hysteresis

- The loop retunes on a fixed cadence (interval deferred to #8), proposes changes
only to knobs in the bounded set, and respects the per-knob hysteresis band and
cooldown from #91 so it cannot flap. A proposal is published only after it beats
the current snapshot on replay (the gate in #154).

### Snapshot swap and trait binding

- The advisor never mutates live policy directly. It builds an immutable seeded
snapshot and hands it to the atomic RCU pointer swap (#91), monotonically
versioned and coordinated with #85. The hot path reads the active policy and
knobs through the EvictionPolicy trait (#48); a swap changes which trait impl
and which knob values the shard uses on the next access, with no reader lock and
no torn read.
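
The swap described above could look roughly like the following sketch, which uses only `std::sync::atomic::AtomicPtr`. The `ConfigSnapshot` fields, `PolicyKind` variants, and `SnapshotCell` are hypothetical. The sketch also assumes a single writer per shard and deliberately never frees old snapshots, so a real implementation would need epoch- or hazard-based reclamation on top.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PolicyKind {
    S3Fifo,
    Sieve,
    SampledLru,
    SampledLfu,
}

/// Immutable once published; the fields are illustrative only.
#[derive(Debug)]
pub struct ConfigSnapshot {
    pub version: u64,
    pub seed: u64,
    pub policy: PolicyKind,
    pub sample_count: u32,
}

pub struct SnapshotCell {
    current: AtomicPtr<ConfigSnapshot>,
}

impl SnapshotCell {
    pub fn new(initial: ConfigSnapshot) -> Self {
        Self { current: AtomicPtr::new(Box::into_raw(Box::new(initial))) }
    }

    /// Hot-path read: a single atomic load, no lock, no torn read.
    pub fn load(&self) -> &ConfigSnapshot {
        // SAFETY: every stored pointer comes from Box::into_raw and this
        // sketch never frees it, so the reference stays valid.
        unsafe { &*self.current.load(Ordering::Acquire) }
    }

    /// Advisor-side publish (single writer assumed). Rejects any snapshot
    /// whose version is not strictly greater than the current one.
    pub fn publish(&self, next: ConfigSnapshot) -> bool {
        if next.version <= self.load().version {
            return false;
        }
        // The old snapshot is intentionally leaked in this sketch.
        self.current.store(Box::into_raw(Box::new(next)), Ordering::Release);
        true
    }
}
```

Readers pick up the new policy and knob values on their next `load`, which matches the "next access" wording above.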

## Open questions

- Retune interval and the marginal-gain threshold for accepting a proposal
(deferred to the harness #8).
- Per-primitive context features the bandit conditions on, and whether the expert
set is fixed or extensible per tenant.
- Whether expert weights persist across restart or reset to the static baseline.
- How shadow-mode recommendations (ADR-0013) are surfaced before active tuning.

## Acceptance and test hooks

- The request loop performs no inference and is deterministic under replay (#13).
- The advisor proposes only knobs in the bounded set, each within its #91 bounds.
- A proposal reaches live policy only via the atomic versioned snapshot swap
(#91/#85), never by direct mutation.
- Policy selection routes through the EvictionPolicy trait (#48); a swap changes
the active impl with no reader lock.
- With the advisor off or in shadow the engine behaves identically to the static
baseline (ADR-0013).

## References

- ADR-0013, ADR-0008; issues #8, #91, #48, #49, #85, #13, #154, #88; specs
ADVISOR_SAFETY.md, EVICTION.md, WTINYLFU.md, CONFIG.md.
- Claims: [lecar-regret-min-18x], [cacheus-experts], [wtinylfu-caffeine-sketch],
[sieve-algorithm], [s3fifo-small-main-split], [redis-lfu-log-factor].
108 changes: 108 additions & 0 deletions docs/design/ADVISOR_SAFETY.md
@@ -0,0 +1,108 @@
# Design: Advisor safety guardrails (bounds, hysteresis, rollback, kill-switch)

Issue: #91. Decisions: ADR-0013 (advisor off/shadow default posture), ADR-0008
(S3-FIFO default eviction). Related: #126 (advisor mechanism, ADVISOR.md), #85
(config sources, CONFIG.md), #48 (EvictionPolicy trait, EVICTION.md), #154
(promotion gate), #88 (parent, decomposed).

## Goal and scope

The background advisor is the only place ML touches IronCache, and it never runs
on the hot path. This spec is the safety envelope around it: per-knob bounds,
anti-oscillation, automatic rollback, a hard kill-switch to a known-good static
baseline, and a seeded monotonic-versioned config-snapshot contract the hot path
reads. The governing guarantee is that the advisor can only ever match or improve
the static baseline, never regress below it. The cache must be correct and fast
with the advisor off, a requirement motivated by the queueing result on hit-path
contention [hit-ratio-can-hurt-throughput].

In scope: knob min/max bounds, hysteresis band plus cooldown, the regression
detector and rollback, the kill-switch, and the immutable seeded snapshot the
advisor publishes and the hot path consumes. Out of scope: the advisor objective
function and the expert algorithms (#126); knob storage and reload semantics
(#85); the policies behind the EvictionPolicy trait (#48); the off/shadow default
posture (ADR-0013).

## Design

### Per-knob bounds

- Every tunable knob has a documented, enforced min and max; an out-of-range
proposal is clamped or rejected, never applied. The knob set is the bounded set
ADVISOR.md (#126) defines (active policy, Redis-style sample count
[redis-maxmemory-samples-5], LFU log-factor and decay
[redis-lfu-morris-counter-params], ghost size, slab/encoding/compression
thresholds). Bounds are a property of the snapshot schema, so a malformed or
adversarial proposal cannot widen them.
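
One way to picture the clamp-or-reject rule is the sketch below. The `KnobBounds` and `Checked` types are hypothetical, and the choice to clamp out-of-range values while rejecting non-finite ones is an assumption; the spec only requires that an out-of-range value is never applied as proposed.

```rust
/// Hypothetical per-knob bounds, fixed by the snapshot schema.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct KnobBounds {
    pub min: f64,
    pub max: f64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Checked {
    /// The proposal was in range and is used as-is.
    Accepted(f64),
    /// The proposal was out of range and was pulled to the nearest bound.
    Clamped(f64),
    /// The proposal was malformed (NaN or infinite) and is dropped.
    Rejected,
}

impl KnobBounds {
    /// Validate a proposed value. The bounds are read-only here, so a
    /// proposal has no way to widen them.
    pub fn check(&self, proposed: f64) -> Checked {
        if !proposed.is_finite() {
            return Checked::Rejected;
        }
        if proposed < self.min {
            Checked::Clamped(self.min)
        } else if proposed > self.max {
            Checked::Clamped(self.max)
        } else {
            Checked::Accepted(proposed)
        }
    }
}
```

For example, with illustrative bounds of 1 to 10 on the sample count, a proposal of 50 would be clamped to 10 and a NaN would be rejected outright.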

### Hysteresis and cooldown

- Each knob carries a hysteresis band and a cooldown timer. A change applies only
when the measured signal crosses the band, and no further change to that knob is
permitted until the cooldown elapses. This bounds change frequency to at most
one change per knob per cooldown period and stops flapping near a threshold,
which a rate limit alone does not.
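
The band-plus-cooldown rule can be sketched as a small gate per knob. `KnobGate` and its fields are hypothetical, time is passed in as a `Duration` on some monotonic clock so the example stays deterministic, and the numeric values are placeholders because the real ones are deferred to the harness (#8).

```rust
use std::time::Duration;

/// Hypothetical per-knob gate combining a hysteresis band with a cooldown.
pub struct KnobGate {
    /// The signal must move by more than this before a change is considered.
    band: f64,
    /// Minimum time between two applied changes to this knob.
    cooldown: Duration,
    /// Time of the last applied change, if any.
    last_change: Option<Duration>,
}

impl KnobGate {
    pub fn new(band: f64, cooldown: Duration) -> Self {
        Self { band, cooldown, last_change: None }
    }

    /// Decide whether a change may apply at time `now`, given how far the
    /// measured signal has moved from its reference (`signal_delta`).
    pub fn allow(&mut self, now: Duration, signal_delta: f64) -> bool {
        // Inside the band: ignore noise around the threshold.
        if signal_delta.abs() <= self.band {
            return false;
        }
        // Outside the band but still cooling down: refuse.
        if let Some(last) = self.last_change {
            if now.saturating_sub(last) < self.cooldown {
                return false;
            }
        }
        self.last_change = Some(now);
        true
    }
}
```

The band filters small oscillations around a threshold, while the cooldown caps how often even a large, sustained signal can move the knob.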

### Regression detector and rollback

- After a swap the detector compares live throughput-per-core and hit ratio
against the pre-change snapshot over the cooldown window. A measured regression
in either rolls the active snapshot back to the immediately prior one. Both
signals matter because a higher hit ratio can still lower throughput on a
relink-bound policy [hit-ratio-can-hurt-throughput]; FIFO-class policies
(S3-FIFO [s3fifo-small-main-split], SIEVE [sieve-algorithm]) avoid that, but the
detector does not assume it.
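
A minimal sketch of the two-signal check follows. `WindowStats`, `Verdict`, and the tolerance parameters are hypothetical; the actual thresholds and window length are open questions deferred to the harness (#8).

```rust
/// Hypothetical metrics aggregated over one cooldown window.
#[derive(Debug, Clone, Copy)]
pub struct WindowStats {
    pub throughput_per_core: f64,
    pub hit_ratio: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Keep,
    Rollback,
}

/// Compare the post-swap window against the pre-change window. A regression
/// in either signal beyond its tolerance triggers a rollback.
pub fn judge(
    before: WindowStats,
    after: WindowStats,
    throughput_tolerance: f64, // allowed relative drop, e.g. 0.02 for 2%
    hit_ratio_tolerance: f64,  // allowed absolute drop, e.g. 0.01
) -> Verdict {
    // Relative throughput drop; treat a non-positive baseline as "no drop"
    // to avoid dividing by zero.
    let throughput_drop = if before.throughput_per_core > 0.0 {
        (before.throughput_per_core - after.throughput_per_core) / before.throughput_per_core
    } else {
        0.0
    };
    let hit_ratio_drop = before.hit_ratio - after.hit_ratio;

    if throughput_drop > throughput_tolerance || hit_ratio_drop > hit_ratio_tolerance {
        Verdict::Rollback
    } else {
        Verdict::Keep
    }
}
```

Note that a window with a higher hit ratio but lower throughput still yields `Rollback`, which is the case the paragraph above calls out.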

### Kill-switch to the static baseline

- A persistent or repeated breach trips a kill-switch that atomically reverts to a
static baseline of W-TinyLFU admission [wtinylfu-caffeine-sketch] over a
FIFO-class core, the deterministic floor any learned change must first beat
[sieve-simpler-than-lru-nsdi24]. The kill-switch is operator-forceable, and the
static baseline it reverts to is the boot default (ADR-0013). The baseline is
chosen over last-known-good because a learned snapshot can itself be subtly bad,
whereas the static path is provably correct and fast.
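
The trip logic might be sketched as a small state machine. Everything here is an assumption made for illustration: the `KillSwitch` and `Mode` names, counting consecutive regressed windows as the "repeated breach" rule, and making a trip sticky until someone re-enables the advisor (stickiness is listed as an open question below). The atomic revert itself would go through the snapshot swap and is not shown.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Static baseline; also the mode at boot.
    Baseline,
    /// Advisor-published snapshots are live.
    Advised,
}

/// Hypothetical kill-switch state for one shard.
pub struct KillSwitch {
    max_breaches: u32,
    breaches: u32,
    mode: Mode,
}

impl KillSwitch {
    /// Boots on the static baseline.
    pub fn new(max_breaches: u32) -> Self {
        Self { max_breaches, breaches: 0, mode: Mode::Baseline }
    }

    /// Explicit opt-in to advised mode; clears the breach counter.
    pub fn enable_advisor(&mut self) {
        self.breaches = 0;
        self.mode = Mode::Advised;
    }

    /// Operator override: revert to the baseline unconditionally.
    pub fn force(&mut self) {
        self.mode = Mode::Baseline;
    }

    /// Record one detector window. Consecutive regressions accumulate, and a
    /// clean window resets the count. Returns the mode after this window.
    pub fn record_window(&mut self, regressed: bool) -> Mode {
        if self.mode == Mode::Advised {
            if regressed {
                self.breaches += 1;
                if self.breaches >= self.max_breaches {
                    self.mode = Mode::Baseline;
                }
            } else {
                self.breaches = 0;
            }
        }
        self.mode
    }

    pub fn mode(&self) -> Mode {
        self.mode
    }
}
```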

### Seeded versioned RCU snapshot contract

- The hot path reads an immutable config snapshot through a single atomic pointer
swap (RCU-style); readers never block and never see a torn set. The advisor
publishes a new snapshot only after a candidate beats the current one on a
sampled replay (the gate detailed in #154). Each snapshot carries a seed and a
strictly monotonic version, coordinated with the config layers in #85. A given
seed plus an input replay yields identical eviction decisions, the determinism
invariant rollback and audit depend on.
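
The determinism invariant can be illustrated with a seeded sampler. The sketch below uses SplitMix64 as a stand-in PRNG and a sampled-LRU victim choice; both are assumptions for illustration, as are the `pick_victim` and `replay` helpers, and none of this reflects how IronCache actually derives randomness from the snapshot seed.

```rust
/// SplitMix64: a small, deterministic PRNG used here as a stand-in.
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

/// Sampled-LRU victim choice: draw `samples` slots at random and return the
/// index of the one with the oldest last-access timestamp.
pub fn pick_victim(rng: &mut SplitMix64, last_access: &[u64], samples: usize) -> usize {
    assert!(!last_access.is_empty() && samples > 0);
    let len = last_access.len() as u64;
    let mut victim = (rng.next_u64() % len) as usize;
    for _ in 1..samples {
        let i = (rng.next_u64() % len) as usize;
        if last_access[i] < last_access[victim] {
            victim = i;
        }
    }
    victim
}

/// Replay `rounds` eviction decisions from a fresh RNG seeded with `seed`.
pub fn replay(seed: u64, last_access: &[u64], samples: usize, rounds: usize) -> Vec<usize> {
    let mut rng = SplitMix64::new(seed);
    (0..rounds)
        .map(|_| pick_victim(&mut rng, last_access, samples))
        .collect()
}
```

Running `replay` twice with the same seed and the same input yields the same victim sequence, which is what lets rollback and audit reproduce a past decision.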

## Open questions

- Regression thresholds and window length per knob class (throughput vs hit-ratio
sensitivity); numeric values deferred to the harness (#8).
- Cooldown duration and band width per knob, and whether they are themselves
bounded-tunable.
- Whether a kill-switch trip is sticky until operator reset or auto-clears after
a quiet period.
- Seed scope (per-shard vs global) and how it is recorded in the snapshot.
- Maximum knob delta per step (bounded step vs jump to any in-range value).

## Acceptance and test hooks

- Every knob has an enforced min/max; an out-of-range proposal is clamped or
rejected (schema test).
- A soak under a shifting workload shows hysteresis and cooldown bound change
frequency with no oscillation.
- An injected throughput or hit-ratio regression triggers rollback to the prior
snapshot.
- The kill-switch reverts to the static baseline atomically and is
operator-forceable; the static baseline is the boot default.
- With the advisor disabled the cache is correct and no path regresses below the
static baseline [hit-ratio-can-hurt-throughput].
- A seeded snapshot plus input replay yields identical eviction decisions, and
snapshots are published and consumed with monotonic versioning (#85).

## References

- ADR-0013, ADR-0008; issues #8, #126, #85, #48, #154, #88; specs ADVISOR.md,
CONFIG.md, EVICTION.md, WTINYLFU.md.
- Claims: [hit-ratio-can-hurt-throughput], [s3fifo-small-main-split],
[sieve-algorithm], [wtinylfu-caffeine-sketch], [sieve-simpler-than-lru-nsdi24],
[redis-maxmemory-samples-5], [redis-lfu-morris-counter-params],
[lecar-regret-min-18x], [cacheus-experts].
6 changes: 6 additions & 0 deletions docs/design/README.md
@@ -100,3 +100,9 @@ Specs added as the M1 milestone progresses.
READWRITE, replica routing, bounded staleness surfaced to clients) (#147).
- [NODE_LIFECYCLE.md](NODE_LIFECYCLE.md): cluster bootstrap and node lifecycle
(seed/MEET join, learner to voter to slot-owner promotion, add/remove-node) (#149).
- [ADVISOR_SAFETY.md](ADVISOR_SAFETY.md): the advisor safety envelope (per-knob
bounds, hysteresis/cooldown, regression detect + rollback, kill-switch, RCU
snapshot contract) (#91).
- [ADVISOR.md](ADVISOR.md): the per-shard background advisor (LeCaR/bandit expert
weighting, bounded knobs, atomic RCU config swap, EvictionPolicy-trait binding,
shadow/off default per ADR-0013) (#126).