diff --git a/docs/design/ADVISOR.md b/docs/design/ADVISOR.md
new file mode 100644
index 0000000..cc76eb5
--- /dev/null
+++ b/docs/design/ADVISOR.md
@@ -0,0 +1,96 @@
+# Design: Per-shard background advisor (expert selection + bounded knob autotuning)
+
+Issue: #126. Decisions: ADR-0013 (advisor off/shadow default posture), ADR-0008
+(S3-FIFO default eviction). Related: #91 (safety guardrails, ADVISOR_SAFETY.md),
+#48 (EvictionPolicy trait, EVICTION.md), #49 (W-TinyLFU filter, WTINYLFU.md), #85
+(config snapshot, CONFIG.md), #13 (no per-request inference), #88 (parent,
+decomposed).
+
+## Goal and scope
+
+This specifies the per-shard background advisor: an off-path loop that weights
+experts over the deterministic policy set and autotunes a bounded knob set, then
+publishes its choice through the atomic config-snapshot swap that
+ADVISOR_SAFETY.md (#91) defines. The request loop stays inference-free and
+deterministic (#13); the advisor only observes counters and proposes snapshots.
+The off/shadow default posture is fixed by ADR-0013 and is not re-decided here.
+
+In scope: the expert-weighting mechanism, the policy set it selects among, the
+bounded knob set, the cadence/hysteresis shape, the snapshot-swap coupling, and
+the binding to the EvictionPolicy trait (#48). Out of scope: numeric tuning
+(retune interval, marginal-gain threshold), deferred to the harness (#8); the
+safety envelope (bounds enforcement, rollback, kill-switch, seeding), owned by
+#91; the promotion gate (#154).
+
+## Design
+
+### Expert weighting
+
+- Each shard runs a regret-minimizing / contextual-bandit controller that weights
+  experts off the hot path. LeCaR maintains weights over two experts (LRU and
+  LFU) with regret minimization and beats ARC by more than 18x at small
+  cache-to-working-set ratios [lecar-regret-min-18x]; CACHEUS generalizes this
+  to an adaptive mixture selected per workload primitive [cacheus-experts]. We
+  borrow the controller as the off-path selector and reject per-request ensemble
+  evaluation, which would reintroduce hot-path cost (#13).
+
+### Policy set
+
+- The experts are the cheap deterministic policies already behind the trait:
+  SIEVE (one FIFO, a hand, a visited bit) [sieve-algorithm], a W-TinyLFU
+  admission filter [wtinylfu-caffeine-sketch] (the non-ML floor, WTINYLFU.md
+  #49), and sampled LRU/LFU. The controller selects among them and tunes their
+  knobs; it never invents a policy. The default eviction core remains S3-FIFO
+  with its small/main split [s3fifo-small-main-split] (ADR-0008), which the
+  advisor may select but does not replace as the baseline.
+
+### Bounded knob set
+
+- The advisor tunes a small, bounded set: the active policy, sample count, LFU
+  log-factor [redis-lfu-log-factor] and decay, ghost size, and
+  slab/encoding/compression thresholds. The set is deliberately small so the
+  search space is enumerable and every proposal maps to a documented knob. Bounds
+  enforcement, clamping, and rejection are the safety spec's job (#91).
+
+### Cadence and hysteresis
+
+- The loop retunes on a fixed cadence (interval deferred to #8), proposes only
+  in-bounds changes to the bounded knob set, and respects the per-knob hysteresis
+  band and cooldown from #91 so it cannot flap. A proposal is published only
+  after it beats the current snapshot on replay (the gate in #154).
+
+### Snapshot swap and trait binding
+
+- The advisor never mutates live policy directly. It builds an immutable, seeded,
+  monotonically versioned snapshot and hands it to the atomic RCU pointer swap
+  (#91), coordinated with #85. The hot path reads the active policy and knobs
+  through the EvictionPolicy trait (#48); a swap changes which trait impl and
+  which knob values the shard uses on the next access, with no reader lock and
+  no torn read.
+
+## Open questions
+
+- Retune interval and the marginal-gain threshold for accepting a proposal
+  (deferred to the harness #8).
+- Per-primitive context features the bandit conditions on, and whether the expert
+  set is fixed or extensible per tenant.
+- Whether expert weights persist across restart or reset to the static baseline.
+- How shadow-mode recommendations (ADR-0013) are surfaced before active tuning.
+
+## Acceptance and test hooks
+
+- The request loop performs no inference and is deterministic under replay (#13).
+- The advisor proposes only knobs in the bounded set, each within its #91 bounds.
+- A proposal reaches live policy only via the atomic versioned snapshot swap
+  (#91/#85), never by direct mutation.
+- Policy selection routes through the EvictionPolicy trait (#48); a swap changes
+  the active impl with no reader lock.
+- With the advisor off or in shadow, the engine behaves identically to the static
+  baseline (ADR-0013).
+
+## References
+
+- ADR-0013, ADR-0008; issues #91, #48, #49, #85, #13, #8, #154, #88; specs
+  ADVISOR_SAFETY.md, EVICTION.md, WTINYLFU.md, CONFIG.md.
+- Claims: [lecar-regret-min-18x], [cacheus-experts], [wtinylfu-caffeine-sketch],
+  [sieve-algorithm], [s3fifo-small-main-split], [redis-lfu-log-factor].
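A minimal sketch of the expert-weighting step described under "Expert weighting" in ADVISOR.md, written in Rust on the assumption that the trait-based engine is a Rust codebase. It uses a Hedge-style multiplicative-weights update rather than LeCaR's exact rule, and every name in it (`Expert`, `EXPERTS`, `ExpertWeights`, `eta`, the per-interval `losses`) is hypothetical; the spec does not define this API. Losses are assumed to be per-expert miss ratios in [0, 1] gathered off the hot path.

```rust
/// Hypothetical expert identifiers mirroring the spec's policy set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Expert {
    S3Fifo,
    Sieve,
    SampledLru,
    SampledLfu,
}

/// Fixed ordering used to index the weight vector.
pub const EXPERTS: [Expert; 4] = [
    Expert::S3Fifo,
    Expert::Sieve,
    Expert::SampledLru,
    Expert::SampledLfu,
];

/// Per-shard weights, updated off the hot path once per retune interval.
pub struct ExpertWeights {
    weights: [f64; 4],
    /// Learning rate (assumed; numeric tuning is deferred to the harness).
    eta: f64,
}

impl ExpertWeights {
    /// Start from uniform weights.
    pub fn new(eta: f64) -> Self {
        Self { weights: [0.25; 4], eta }
    }

    /// Hedge update: scale each weight by exp(-eta * loss), then renormalize.
    /// Each loss is assumed finite and is clamped into [0, 1].
    pub fn update(&mut self, losses: [f64; 4]) {
        for (w, &loss) in self.weights.iter_mut().zip(losses.iter()) {
            *w *= (-self.eta * loss.clamp(0.0, 1.0)).exp();
        }
        let total: f64 = self.weights.iter().sum();
        for w in self.weights.iter_mut() {
            *w /= total;
        }
    }

    /// Highest-weight expert; ties resolve to the earliest entry in `EXPERTS`.
    pub fn best(&self) -> Expert {
        let mut best = 0;
        for i in 1..self.weights.len() {
            if self.weights[i] > self.weights[best] {
                best = i;
            }
        }
        EXPERTS[best]
    }

    /// Current normalized weights, in `EXPERTS` order.
    pub fn weights(&self) -> [f64; 4] {
        self.weights
    }
}
```

Calling `update` once per retune interval and reading `best()` would give the candidate policy for the next snapshot proposal; publication would still go through the replay gate (#154) and the atomic swap (#91), never through direct mutation.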
diff --git a/docs/design/ADVISOR_SAFETY.md b/docs/design/ADVISOR_SAFETY.md
new file mode 100644
index 0000000..7278783
--- /dev/null
+++ b/docs/design/ADVISOR_SAFETY.md
@@ -0,0 +1,108 @@
+# Design: Advisor safety guardrails (bounds, hysteresis, rollback, kill-switch)
+
+Issue: #91. Decisions: ADR-0013 (advisor off/shadow default posture), ADR-0008
+(S3-FIFO default eviction). Related: #126 (advisor mechanism, ADVISOR.md), #85
+(config sources, CONFIG.md), #48 (EvictionPolicy trait, EVICTION.md), #154
+(promotion gate), #88 (parent, decomposed).
+
+## Goal and scope
+
+The background advisor is the only place ML touches IronCache and it never runs
+on the hot path. This spec is the safety envelope around it: per-knob bounds,
+anti-oscillation, automatic rollback, a hard kill-switch to a known-good static
+baseline, and a seeded monotonic-versioned config-snapshot contract the hot path
+reads. The governing guarantee is that the advisor can only ever match or improve
+the static baseline, never regress below it. With the advisor off, the cache is
+correct and fast, a requirement motivated by the queueing result on hit-path
+contention [hit-ratio-can-hurt-throughput].
+
+In scope: knob min/max bounds, hysteresis band plus cooldown, the regression
+detector and rollback, the kill-switch, and the immutable seeded snapshot the
+advisor publishes and the hot path consumes. Out of scope: the advisor objective
+function and the expert algorithms (#126); knob storage and reload semantics
+(#85); the policies behind the EvictionPolicy trait (#48); the off/shadow default
+posture (ADR-0013).
+
+## Design
+
+### Per-knob bounds
+
+- Every tunable knob has a documented, enforced min and max; an out-of-range
+  proposal is clamped or rejected, never applied. The knob set is the bounded set
+  ADVISOR.md (#126) defines (active policy, Redis-style sample count
+  [redis-maxmemory-samples-5], LFU log-factor and decay
+  [redis-lfu-morris-counter-params], ghost size, slab/encoding/compression
+  thresholds). Bounds are a property of the snapshot schema, so a malformed or
+  adversarial proposal cannot widen them.
+
+### Hysteresis and cooldown
+
+- Each knob carries a hysteresis band and a cooldown timer. A change applies only
+  when the measured signal crosses the band, and that knob cannot change again
+  until the cooldown elapses. This caps each knob at one change per cooldown
+  window and stops flapping near a threshold, which a rate limit alone does not.
+
+### Regression detector and rollback
+
+- After a swap the detector compares live throughput-per-core and hit ratio
+  against the pre-change snapshot over the cooldown window. A measured regression
+  in either rolls the active snapshot back to the immediately prior one. Both
+  signals matter because a higher hit ratio can still lower throughput on a
+  relink-bound policy [hit-ratio-can-hurt-throughput]; FIFO-class policies
+  (S3-FIFO [s3fifo-small-main-split], SIEVE [sieve-algorithm]) avoid that, but the
+  detector does not assume it.
+
+### Kill-switch to the static baseline
+
+- A persistent or repeated breach trips a kill-switch that atomically reverts to a
+  static baseline of W-TinyLFU admission [wtinylfu-caffeine-sketch] over a
+  FIFO-class core, the deterministic floor any learned change must first beat
+  [sieve-simpler-than-lru-nsdi24]. The kill-switch is operator-forceable and is
+  the boot default (ADR-0013). The baseline is chosen over last-known-good because
+  a learned snapshot can itself be subtly bad, whereas the static path is
+  provably correct and fast.
+
+### Seeded versioned RCU snapshot contract
+
+- The hot path reads an immutable config snapshot through a single atomic pointer
+  swap (RCU-style); readers never block and never see a torn set. The advisor
+  publishes a new snapshot only after a candidate beats the current one on a
+  sampled replay (the gate detailed in #154). Each snapshot carries a seed and a
+  strictly monotonic version, coordinated with the config layers in #85. A given
+  seed plus an input replay yields identical eviction decisions; rollback and
+  audit depend on this determinism invariant.
+
+## Open questions
+
+- Regression thresholds and window length per knob class (throughput vs hit-ratio
+  sensitivity); numeric values deferred to the harness (#8).
+- Cooldown duration and band width per knob, and whether they are themselves
+  bounded-tunable.
+- Whether a kill-switch trip is sticky until operator reset or auto-clears after
+  a quiet period.
+- Seed scope (per-shard vs global) and how it is recorded in the snapshot.
+- Maximum knob delta per step (bounded step vs jump to any in-range value).
+
+## Acceptance and test hooks
+
+- Every knob has an enforced min/max; an out-of-range proposal is clamped or
+  rejected (schema test).
+- A soak under a shifting workload shows hysteresis and cooldown bound change
+  frequency with no oscillation.
+- An injected throughput or hit-ratio regression triggers rollback to the prior
+  snapshot.
+- The kill-switch reverts to the static baseline atomically, is the boot default,
+  and is operator-forceable.
+- With the advisor disabled the cache is correct and no path regresses below the
+  static baseline [hit-ratio-can-hurt-throughput].
+- A seeded snapshot plus input replay yields identical eviction decisions, and
+  snapshots are published and consumed with monotonic versioning (#85).
+
+## References
+
+- ADR-0013, ADR-0008; issues #126, #85, #48, #8, #154, #88; specs ADVISOR.md,
+  CONFIG.md, EVICTION.md, WTINYLFU.md.
+- Claims: [hit-ratio-can-hurt-throughput], [s3fifo-small-main-split],
+  [sieve-algorithm], [wtinylfu-caffeine-sketch], [sieve-simpler-than-lru-nsdi24],
+  [redis-maxmemory-samples-5], [redis-lfu-morris-counter-params],
+  [lecar-regret-min-18x], [cacheus-experts].
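A minimal sketch of how the "Per-knob bounds" and "Hysteresis and cooldown" rules in ADVISOR_SAFETY.md could compose for a single integer knob, written in Rust on the assumption that the engine is a Rust codebase. The names `KnobGuard`, `Decision`, and `signal_delta` (the measured signal's distance from the knob's threshold) are hypothetical, and the sketch picks clamping where the spec allows either clamping or rejection.

```rust
use std::time::{Duration, Instant};

/// Outcome of one proposal for one knob.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// The (possibly clamped) value was applied.
    Apply(u32),
    /// Nothing changed: inside the band, cooling down, or a no-op.
    Hold,
}

/// Hypothetical per-knob guard combining bounds, hysteresis, and cooldown.
pub struct KnobGuard {
    min: u32,
    max: u32,
    value: u32,
    band: f64,
    cooldown: Duration,
    last_change: Option<Instant>,
}

impl KnobGuard {
    pub fn new(min: u32, max: u32, initial: u32, band: f64, cooldown: Duration) -> Self {
        assert!(min <= max, "knob bounds must be ordered");
        Self {
            min,
            max,
            value: initial.clamp(min, max),
            band,
            cooldown,
            last_change: None,
        }
    }

    /// Current in-bounds value of the knob.
    pub fn value(&self) -> u32 {
        self.value
    }

    /// Evaluate one proposal at time `now`.
    pub fn propose(&mut self, proposed: u32, signal_delta: f64, now: Instant) -> Decision {
        // Cooldown: no further change until the timer since the last change elapses.
        if let Some(last) = self.last_change {
            if now.saturating_duration_since(last) < self.cooldown {
                return Decision::Hold;
            }
        }
        // Hysteresis: the signal must cross the band; a NaN signal also holds.
        if !(signal_delta.abs() > self.band) {
            return Decision::Hold;
        }
        // Bounds: an out-of-range proposal is clamped, never applied as-is.
        let clamped = proposed.clamp(self.min, self.max);
        if clamped == self.value {
            return Decision::Hold;
        }
        self.value = clamped;
        self.last_change = Some(now);
        Decision::Apply(clamped)
    }
}
```

Passing `now` in explicitly, rather than reading the clock inside `propose`, keeps the guard deterministic under replay, which fits the seeded-replay invariant the spec relies on.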
diff --git a/docs/design/README.md b/docs/design/README.md
index 5bb2a24..4972cec 100644
--- a/docs/design/README.md
+++ b/docs/design/README.md
@@ -100,3 +100,9 @@ Specs added as the M1 milestone progresses.
   READWRITE, replica routing, bounded staleness surfaced to clients) (#147).
 - [NODE_LIFECYCLE.md](NODE_LIFECYCLE.md): cluster bootstrap and node lifecycle
   (seed/MEET join, learner to voter to slot-owner promotion, add/remove-node) (#149).
+- [ADVISOR_SAFETY.md](ADVISOR_SAFETY.md): the advisor safety envelope (per-knob
+  bounds, hysteresis/cooldown, regression detect + rollback, kill-switch, RCU
+  snapshot contract) (#91).
+- [ADVISOR.md](ADVISOR.md): the per-shard background advisor (LeCaR/bandit expert
+  weighting, bounded knobs, atomic RCU config swap, EvictionPolicy-trait binding,
+  shadow/off default per ADR-0013) (#126).