From f6a9e2f49ae1cb7a71b320e25c5df47f0afbac68 Mon Sep 17 00:00:00 2001
From: Zeke
Date: Sat, 13 Jun 2026 21:43:08 -0700
Subject: [PATCH 1/2] design: replication model (ADR-0026) + replica-read + node lifecycle (closes #76, closes #147, closes #149)

ADR-0026 (#76): default async primary/replica replication with WAIT for bounded
acks; documented best-effort-not-CP with a named loss window; reject Raft/quorum
by default and Dynamo sloppy-quorum; ship replica-read-only=on,
min-replicas-to-write, min-replicas-max-lag. Strong consistency is opt-in
(#78/#12), never a tax.

REPLICA_READ.md (#147): READONLY/READWRITE pair, replica routing, Envoy-style
bounded staleness surfaced to clients.

NODE_LIFECYCLE.md (#149): seed/MEET join, learner->voter->slot-owner promotion,
add/remove-node, single->first-replica bootstrap.

Adds 2 web-verified claims (envoy read policy, cluster readonly replica).
Authored+reviewed via workflow. CI passes.

Closes #76, closes #147, closes #149.

Signed-off-by: Zeke
---
 .../adr/0026-replication-consistency-model.md | 92 ++++++++++++++++++
 docs/adr/INDEX.md                             |  1 +
 docs/design/NODE_LIFECYCLE.md                 | 93 +++++++++++++++++++
 docs/design/README.md                         |  4 +
 docs/design/REPLICA_READ.md                   | 86 +++++++++++++++++
 docs/prior-art/claims.yaml                    | 59 ++++++++++++
 6 files changed, 335 insertions(+)
 create mode 100644 docs/adr/0026-replication-consistency-model.md
 create mode 100644 docs/design/NODE_LIFECYCLE.md
 create mode 100644 docs/design/REPLICA_READ.md

diff --git a/docs/adr/0026-replication-consistency-model.md b/docs/adr/0026-replication-consistency-model.md
new file mode 100644
index 0000000..1f0bb41
--- /dev/null
+++ b/docs/adr/0026-replication-consistency-model.md
@@ -0,0 +1,92 @@
+# ADR-0026: Default replication and consistency model (async primary/replica plus WAIT)
+
+Status: Accepted
+Issue: #76
+
+## Context
+
+IronCache must ship a single default replication and consistency model before
+the opt-in tiers (#77, #78, #12) can be specified against a fixed baseline.
+The default has to be correct under the failure modes operators actually hit and
+stay drop-in compatible with how Redis clients already reason about durability.
+The top two tenets, Compatible then Efficient, both bear on the choice: clients
+should port over unchanged, and the hot write path should pay no quorum tax.
+Scope here is the default model only; this ADR does not specify the streaming
+protocol, replica handoff mechanics, or the read contract (#147 owns the
+replica-read contract, #149 owns node lifecycle).
+
+Redis Cluster replicates asynchronously between a primary and its replicas by
+default, and exposes WAIT for callers that want bounded synchronous
+acknowledgement [redis-cluster-async-replication]. WAIT confirms in-memory
+replica receipt, not disk persistence, and the Redis docs are explicit that it
+does not make the store strongly consistent: a write synchronously replicated
+to several replicas can still be lost [redis-wait-not-cp]. The Jepsen analysis
+reaches the same conclusion from the failure side, that default async
+replication can drop acknowledged writes on failover [redis-wait-not-strongly-consistent].
+The two strong-consistency alternatives are per-shard Raft or quorum writes,
+which remove single-node-failover write loss at the cost of write latency and
+operational weight on every write, and Dynamo-style leaderless quorums with a
+sloppy quorum over the first N healthy nodes plus hinted handoff and
+app-level conflict resolution [dynamo-quorum-sloppy-hinted].
+
+## Decision
+
+- **Default to asynchronous primary/replica replication, with WAIT exposed for
+  bounded synchronous acks.** This is the Compatible and Efficient choice:
+  clients that already speak Redis replication semantics and tooling port over
+  unchanged, and the steady-state write path pays no quorum round-trip
+  [redis-cluster-async-replication]. WAIT N timeout is offered as a per-command
+  durability floor, not a consistency mode.
+- **Document the default as best-effort, not CP, and name the loss window.**
+  There is a write-loss window: a write acknowledged to the client but not yet
+  replicated can be lost on primary failover or on the minority side of a
+  partition. WAIT bounds this window but does not eliminate it, because it
+  confirms in-memory receipt only and is not strong consistency
+  [redis-wait-not-cp] [redis-wait-not-strongly-consistent]. This honesty is a
+  shipped requirement, surfaced to clients per #147.
+- **Ship three guardrail defaults.** `replica-read-only` is on, so replicas
+  reject writes and cannot silently diverge [redis-replica-read-only-default].
+  `min-replicas-to-write` is wired so a primary can stop accepting writes when
+  too few replicas are in sync [redis-min-replicas-to-write-default].
+  `min-replicas-max-lag` bounds how stale an in-sync replica may be before it
+  stops counting toward that floor [redis-min-replicas-max-lag-default]. The
+  shipped numeric defaults track the pinned upstream values in claims.yaml.
+- **Strong consistency is opt-in, never a tax on every write.**
+  No-acknowledged-write-loss on single-node failover is real value, but it is delivered through
+  an opt-in quorum/Raft tier (#78, #12), layered on this async baseline, not by
+  changing the default. Whether that becomes a headline differentiator is
+  deferred to those issues; this ADR commits only to the async default.
+
+## Rejected Alternatives
+
+- **Per-shard Raft or quorum writes by default.** Removes acknowledged-write
+  loss on single-node failover and gives a clean CP story. Rejected as the
+  default: it adds write latency and operational weight to every write and
+  diverges from Redis defaults, breaking Compatible, which ranks above the
+  consistency gain. It survives as the opt-in tier in #78 and #12, layered on
+  this baseline rather than replacing it.
+- **Dynamo-style sloppy quorum with hinted handoff.** Stays writable during
+  partitions via a sloppy quorum over the first N healthy nodes, hinted handoff,
+  and vector-clock conflict resolution [dynamo-quorum-sloppy-hinted]. Rejected:
+  its conflict, read-repair, and merge model is foreign to the Redis data model,
+  surprising for compatibility-focused users, and complex, so it violates
+  Compatible and Simple for a marginal availability gain. This is the rejection
+  this ADR exists to freeze so it is not relitigated.
+
+## Consequences
+
+- Unmodified Redis clients and replication tooling work against IronCache with
+  no protocol change, and the steady-state write path carries no quorum tax,
+  satisfying Compatible then Efficient [redis-cluster-async-replication].
+- The default is explicitly best-effort, not CP. The
+  acknowledged-but-unreplicated write-loss window on failover and partition is documented and
+  surfaced to clients (#147), and WAIT is positioned as a durability floor that
+  bounds but does not close it [redis-wait-not-cp] [redis-wait-not-strongly-consistent].
+- Three guardrails ship on by intent: read-only replicas, a
+  min-in-sync-replicas write floor, and a max-replica-lag bound, so a misconfigured or
+  lagging fleet fails toward refusing writes rather than silently diverging
+  [redis-replica-read-only-default] [redis-min-replicas-to-write-default]
+  [redis-min-replicas-max-lag-default].
+- The opt-in strong-consistency tier (#78, #12) is unblocked to build on a
+  fixed async baseline, and the replica-read contract (#147) and node lifecycle
+  (#149) are specified against this decision rather than against an open one.
diff --git a/docs/adr/INDEX.md b/docs/adr/INDEX.md
index 20bfc20..37d6c8c 100644
--- a/docs/adr/INDEX.md
+++ b/docs/adr/INDEX.md
@@ -31,6 +31,7 @@ in [OPEN.md](OPEN.md); research questions in [QUESTIONS.md](QUESTIONS.md).
 | [0023](0023-cold-tier-engine.md) | Cold-tier engine (reject RocksDB/LSM, adopt hybrid log) | Accepted | #65 |
 | [0024](0024-geo-command-scope.md) | Geo command family scope (non-goal for v1) | Accepted | #133 |
 | [0025](0025-cluster-partition-count.md) | Cluster keyspace partition count (16384 dual-purpose unit) | Accepted | #72 |
+| [0026](0026-replication-consistency-model.md) | Default replication and consistency model (async primary/replica plus WAIT) | Accepted | #76 |
 
 As `[DECISION]` issues close, each adds its row here and its `NNNN-*.md` record.
 The numbering is monotonic and never reused, even after supersession.
diff --git a/docs/design/NODE_LIFECYCLE.md b/docs/design/NODE_LIFECYCLE.md
new file mode 100644
index 0000000..01e5eb7
--- /dev/null
+++ b/docs/design/NODE_LIFECYCLE.md
@@ -0,0 +1,93 @@
+# Design: Cluster bootstrap and node lifecycle (seed/MEET join, learner-to-voter-to-slot-owner)
+
+Issue: #149. Decisions: ADR-0026 (async primary/replica default, replica
+guardrails). Related: #73 (CONTROL_PLANE: Raft slot map and config epoch), #74
+(MEMBERSHIP: SWIM plus Lifeguard), #69 (single-node-first staged path), #75
+(migration), #1 (vision).
+
+## Goal and scope
+
+The roster owns steady-state membership (#74 SWIM), the authoritative map (#73
+Raft), and migration (#75), but nothing owns how a node enters or leaves the
+cluster. This spec owns the lifecycle: cold-start seed discovery and a CLUSTER
+MEET-equivalent handshake, the staged promotion of a joining node from
+SWIM-discovered to Raft learner to voter to slot owner, the operator/CLI
+add-node and remove-node surface, and the single-node-to-first-replica
+bootstrap that #69's staged path assumes. Scope is the transitions between
+states; the SWIM signal itself (#74), the Raft commit semantics (#73), and the
+slot-migration mechanics (#75) are owned elsewhere and only invoked here.
+
+## Design
+
+### Seed and MEET join
+
+- A new node boots with a seed list (operator-supplied or CLI add-node).
+  It contacts a seed, which performs a MEET-equivalent handshake: the seed
+  introduces the joiner into the SWIM membership view (#74) so the rest of the
+  ring learns of it through normal gossip, no full-mesh fan-out.
+- SWIM membership is a hint, not authority: a node SWIM has discovered is not
+  yet part of the cluster's committed state. Per #74's contract, SWIM proposes
+  and Raft commits, so MEET only makes a node a candidate for promotion.
+
+### Learner to voter to slot-owner promotion
+
+- A SWIM-discovered node is first admitted to the Raft control plane (#73) as a
+  non-voting learner: it receives committed slot-map deltas and config-epoch
+  updates but does not vote, so it cannot affect commit latency or quorum while
+  it catches up [raft-overview].
+- A learner is promoted to voter only by an explicit committed control-plane
+  decision (#73), keeping the voter set small (the #73 3-to-5 voter group) and
+  the slot map linearizable. Voter promotion is a control-plane role change, not
+  a data assignment.
+- Becoming a slot owner is the last and separate step: the control plane assigns
+  slots and, for a replica being promoted toward ownership, applies a
+  replication-lag gate before the node is eligible, since replication is async
+  (ADR-0026). Replica handoff reuses PSYNC2-style secondary-replid resync so
+  promotion does not force a full resync [redis-psync2-secondary-replid]. Slot
+  movement itself runs through migration (#75) under MOVED/ASK.
+
+### Add/remove-node operator surface
+
+- add-node (CLI/operator) supplies a seed and triggers the MEET handshake, then
+  the staged learner-to-voter-to-owner path above; the operator observes each
+  stage via CLUSTER SHARDS health/role (#74) rather than poking internal state.
+- remove-node is the reverse and drains first: slots are migrated off (#75),
+  the node is demoted from voter to learner to leave the quorum cleanly, then
+  removed from the committed membership (#73) and finally from the SWIM view
+  (#74). A node is never removed from the map while it still owns a slot.
+
+### Single-node to first-replica bootstrap
+
+- A single node boots as a degenerate one-voter control plane owning all 16384
+  slots, consistent with #69's single-node-first staged layout. The first
+  replica joins by the same seed/MEET path, enters as a learner, and attaches as
+  an async replica of the primary under ADR-0026 (replica-read-only on,
+  min-replicas guardrails inactive at one replica). This is the transition #69
+  assumes but does not specify: the inter-stage step from standalone to a
+  primary-with-replica pair.
+
+## Open questions
+
+- The replication-lag threshold that gates a replica from learner to
+  slot-owner-eligible (ties to ADR-0026 min-replicas-max-lag and #73's
+  promotion-policy open decision).
+- Whether learner admission to Raft is automatic on SWIM discovery or requires
+  an explicit operator add-node (#73 lists data-nodes-as-learners as open).
+- Seed-list bootstrapping when all seeds are down, and how MEET interacts with a
+  partitioned SWIM view.
+
+## Acceptance and test hooks
+
+- A node added by seed/MEET appears as a SWIM hint, then a Raft learner, then a
+  voter, then a slot owner, with each stage visible in CLUSTER SHARDS and never
+  skipped.
+- remove-node drains all slots (#75) and demotes through learner before the
+  node leaves committed membership; no slot is orphaned.
+- A standalone node accepts a first replica via MEET and reaches a
+  primary-with-async-replica pair per #69 and ADR-0026.
+
+## References
+
+- ADR-0026; issues #149, #73, #74, #69, #75, #1; specs CONTROL_PLANE (#73),
+  MEMBERSHIP (#74).
+- Claims: [raft-overview], [redis-psync2-secondary-replid].
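
Reviewer sketch (not part of the patch): the staged path NODE_LIFECYCLE.md describes, SWIM hint to learner to voter to slot owner with a drain-first removal, can be modeled as a small state machine. This is a minimal illustration under stated assumptions, not IronCache code: `Stage`, `Node`, `promote`, `drain`, `remove`, and the boolean `lag_within_gate` are hypothetical names, and the real lag gate, Raft commits, and slot migration (#73, #75) are reduced to plain function calls.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    """Lifecycle stages in the order the spec walks them."""
    SWIM_HINT = 0    # introduced by seed/MEET, not yet committed state
    LEARNER = 1      # non-voting control-plane member
    VOTER = 2        # voting control-plane member
    SLOT_OWNER = 3   # eligible to hold slots


class LifecycleError(Exception):
    """A transition that would skip a stage or orphan a slot."""


@dataclass
class Node:
    name: str
    stage: Stage = Stage.SWIM_HINT
    slots: set = field(default_factory=set)


def promote(node: Node, lag_within_gate: bool = True) -> Stage:
    """Advance exactly one stage; the last step is gated on replication lag."""
    if node.stage is Stage.SLOT_OWNER:
        raise LifecycleError("already a slot owner")
    nxt = Stage(node.stage + 1)
    if nxt is Stage.SLOT_OWNER and not lag_within_gate:
        raise LifecycleError("replication lag exceeds the promotion gate")
    node.stage = nxt
    return nxt


def drain(node: Node) -> None:
    """Stand-in for slot migration (#75): move every slot off the node."""
    node.slots.clear()
    if node.stage is Stage.SLOT_OWNER:
        node.stage = Stage.VOTER


def remove(node: Node) -> list:
    """Drain-first removal: refuse while the node still owns a slot."""
    if node.slots:
        raise LifecycleError("drain slots before removing the node")
    steps = []
    if node.stage >= Stage.VOTER:
        node.stage = Stage.LEARNER
        steps.append("demote voter -> learner")
    steps += ["remove from committed membership", "remove from SWIM view"]
    return steps
```

Because `promote` only ever advances one stage, the "never skipped" acceptance hook holds by construction, and `remove` raising while slots remain mirrors the rule that a node is never removed from the map while it still owns a slot.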
diff --git a/docs/design/README.md b/docs/design/README.md
index cbb336e..5bb2a24 100644
--- a/docs/design/README.md
+++ b/docs/design/README.md
@@ -96,3 +96,7 @@ Specs added as the M1 milestone progresses.
   authoritative slot map, config epoch, membership, and replica promotion (#73).
 - [MEMBERSHIP.md](MEMBERSHIP.md): SWIM + non-optional Lifeguard data-plane
   membership and failure detection, joined with the Raft-committed map (#74).
+- [REPLICA_READ.md](REPLICA_READ.md): the replica-read contract
+  (READONLY/READWRITE, replica routing, bounded staleness surfaced to clients) (#147).
+- [NODE_LIFECYCLE.md](NODE_LIFECYCLE.md): cluster bootstrap and node lifecycle
+  (seed/MEET join, learner to voter to slot-owner promotion, add/remove-node) (#149).
diff --git a/docs/design/REPLICA_READ.md b/docs/design/REPLICA_READ.md
new file mode 100644
index 0000000..d30e532
--- /dev/null
+++ b/docs/design/REPLICA_READ.md
@@ -0,0 +1,86 @@
+# Design: Replica-read contract (READONLY/READWRITE, replica routing, bounded staleness)
+
+Issue: #147. Decisions: ADR-0026 (async primary/replica default, best-effort not
+CP, replica-read-only on). Related: #70 (CLUSTER_CONTRACT: 16384 slots, CRC16,
+MOVED/ASK), #76 (replication default), #1 (vision).
+
+## Goal and scope
+
+Redis Cluster clients scale reads by sending READONLY on a connection and then
+routing reads to replicas; this is part of the wire contract IronCache promises
+to keep. ADR-0026 fixes replica-read-only on, so replicas reject writes, but it
+does not decide whether clients may READ from replicas, how the
+READONLY/READWRITE connection-state pair behaves, how a replica answers for the
+slots it serves, or how async-replication staleness is bounded and surfaced.
+This spec owns that command-pair and consistency contract. Scope: the
+per-connection READONLY/READWRITE state machine, replica read routing under the
+#70 slot view, and the bounded-staleness signal surfaced to clients.
+Out of scope: the slot map authority (#73), migration redirection mechanics (#70), and
+the write path (#76).
+
+## Design
+
+### READONLY/READWRITE connection state
+
+- A connection carries one bit: read-write (default) or read-only. READONLY
+  sets the bit; READWRITE clears it. The bit is per-connection, not global, and
+  is unaffected by the node role. This mirrors the Redis Cluster
+  READONLY/READWRITE pair that lets a replica serve reads for slots it replicates
+  [redis-cluster-readonly-replica].
+- On a replica, a read for an owned-or-replicated slot succeeds only when the
+  read-only bit is set; otherwise the replica returns MOVED to the primary, so
+  a default (read-write) connection keeps the strong-read behavior unmodified
+  clients expect [redis-cluster-readonly-replica]. Writes on a replica are
+  always rejected per ADR-0026's replica-read-only posture, independent of the
+  bit.
+
+### Replica read routing
+
+- Slot ownership and the CLUSTER SLOTS/SHARDS projection come from
+  CLUSTER_CONTRACT (#70); this spec only adds the replica leg. A read-only
+  connection whose key hashes (CRC16 mod 16384, #70) to a slot this replica
+  replicates is answered locally; a key for a slot this node neither owns nor
+  replicates returns MOVED, driving the client's normal map refresh.
+- Because replication is asynchronous (ADR-0026), a replica read may observe a
+  value older than the primary. This is the Envoy ReadPolicy model: non-primary
+  read targets may return stale data due to async replication
+  [envoy-redis-readpolicy] [redis-cluster-async-replication]. IronCache does
+  not silently proxy reads to the primary to hide this; the client chose the
+  replica by setting READONLY and is told the staleness bound.
+
+### Bounded staleness surfaced to clients
+
+- Each replica tracks its replication lag against the primary using the same
+  in-sync signal ADR-0026 bounds with min-replicas-max-lag.
+  A replica whose lag exceeds the configured staleness bound stops serving read-only reads for its
+  slots and returns MOVED, so a client never silently reads beyond the bound.
+- The bound is observable, not just enforced: it is exposed through INFO
+  replication fields and the CLUSTER SHARDS health/role projection (#70), so a
+  client or operator can reason about the worst-case staleness of any replica
+  read. This makes the best-effort-not-CP property of ADR-0026 legible at the
+  read path rather than hidden.
+
+## Open questions
+
+- Whether to expose an Envoy-style per-request ReadPolicy hint
+  (PREFER_REPLICA/PREFER_MASTER) beyond the binary READONLY/READWRITE bit
+  [envoy-redis-readpolicy], or keep the Redis-native pair only for v1.
+- The exact default staleness bound, and whether it is derived from
+  min-replicas-max-lag (ADR-0026) or set independently per keyspace.
+- How a replica that crosses the staleness bound interacts with the #70 ASK
+  path during an in-flight slot migration.
+
+## Acceptance and test hooks
+
+- A READONLY connection reads from a replica for a replicated slot; the same
+  connection after READWRITE gets MOVED to the primary for that slot.
+- A replica past its staleness bound returns MOVED for read-only reads and its
+  lag is visible in INFO/CLUSTER SHARDS before and after crossing the bound.
+- Unmodified redis-cli, go-redis, lettuce, and ioredis route replica reads via
+  READONLY without errors, matching the #70 contract.
+
+## References
+
+- ADR-0026; issues #147, #76, #70, #1; specs CLUSTER_CONTRACT (#70).
+- Claims: [redis-cluster-readonly-replica], [envoy-redis-readpolicy],
+  [redis-cluster-async-replication].
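
Reviewer sketch (not part of the patch): the replica-side decision REPLICA_READ.md specifies (writes rejected, READONLY bit required, slot must be replicated, lag must be within the bound) can be written as one small function. This is a hedged illustration, not IronCache code: `Replica`, `Connection`, `Outcome`, and `handle` are hypothetical names, lag is assumed to be measured in seconds, the exact reply for a rejected write is left abstract, and `key_slot` omits hash-tag handling, which #70 owns.

```python
import binascii
from dataclasses import dataclass
from enum import Enum

SLOTS = 16384


def key_slot(key: bytes) -> int:
    """CRC16 (XMODEM variant) mod 16384; hash tags are deliberately omitted."""
    return binascii.crc_hqx(key, 0) % SLOTS


class Outcome(Enum):
    SERVE_LOCAL = "serve-local"
    MOVED = "moved"
    REJECT_WRITE = "reject-write"


@dataclass
class Replica:
    replicated_slots: set
    lag_seconds: float                # assumed unit for replication lag
    staleness_bound_seconds: float    # assumed configurable bound


@dataclass
class Connection:
    read_only: bool = False  # READONLY sets this bit, READWRITE clears it


def handle(replica: Replica, conn: Connection, key: bytes, is_write: bool) -> Outcome:
    # Writes on a replica are rejected regardless of the connection bit.
    if is_write:
        return Outcome.REJECT_WRITE
    # A slot this node does not replicate always redirects.
    if key_slot(key) not in replica.replicated_slots:
        return Outcome.MOVED
    # A default (read-write) connection is sent to the primary.
    if not conn.read_only:
        return Outcome.MOVED
    # Past the staleness bound the replica stops serving read-only reads.
    if replica.lag_seconds > replica.staleness_bound_seconds:
        return Outcome.MOVED
    return Outcome.SERVE_LOCAL
```

The order of the checks is a design choice of this sketch: the write check comes first so the read-only posture never depends on the bit, and the lag check comes last so a stale replica degrades to the same MOVED reply a client already handles.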
diff --git a/docs/prior-art/claims.yaml b/docs/prior-art/claims.yaml
index dbb73ed..6c5d429 100644
--- a/docs/prior-art/claims.yaml
+++ b/docs/prior-art/claims.yaml
@@ -6982,3 +6982,62 @@ claims:
     note: rustsec.org states the database is maintained by the Rust Secure Code Working Group, covers
       crates published via crates.io, uses RUSTSEC-YYYY-NNNN IDs (example RUSTSEC-2022-0051), and exports
       to OSV in real time for the GitHub Advisory Database/Dependabot. Cross-checked against github.com/rustsec/advisory-db.
+- id: envoy-redis-readpolicy
+  dimension: resp-protocol-compat
+  system: Envoy Redis proxy (network filter)
+  version: latest (1.39.0-dev-ec0eb7 docs build)
+  claim: 'Envoy''s Redis proxy exposes a ReadPolicy enum on the connection-pool settings that surfaces
+    replica reads with bounded staleness as an explicit, named contract. The enum has five values: MASTER
+    (default, read from the current primary), PREFER_MASTER (read from primary, fall back to replicas
+    if unavailable), REPLICA (read from replica nodes; a random node is chosen among multiple replicas
+    in a shard, healthy nodes preferred), PREFER_REPLICA (read from replicas, fall back to primary if
+    all replicas are unavailable/unhealthy), and ANY (read from any node, primary or replica, healthy
+    nodes preferred). The documentation explicitly warns that all ReadPolicy settings except MASTER may
+    return stale data because replication is asynchronous, so the caller must tolerate staleness.'
+  value: ReadPolicy enum = {MASTER (default), PREFER_MASTER, REPLICA, PREFER_REPLICA, ANY}; non-MASTER
+    policies documented as possibly returning stale data due to asynchronous replication
+  source_url: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/redis_proxy/v3/redis_proxy.proto
+  accessed_date: '2026-06-13'
+  confidence: high
+  confidence_reason: Read directly from the official Envoy v3 API proto documentation; all five enum values
+    and the asynchronous-replication staleness warning are quoted verbatim from the upstream page.
+  load_bearing: true
+  verification:
+    verdict: confirmed
+    best_source_url: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/redis_proxy/v3/redis_proxy.proto
+    note: WebFetch of the latest Envoy v3 redis_proxy.proto docs (build 1.39.0-dev-ec0eb7) returned all
+      five ReadPolicy values with MASTER as DEFAULT and the explicit note that all settings except MASTER
+      may return stale data because replication is asynchronous. Cross-checked against WebSearch summaries
+      of older tagged versions (v1.12.0/v1.14.x) which describe the same enum semantics, confirming this
+      is a long-standing, canonical Envoy feature rather than a recent addition.
+- id: redis-cluster-readonly-replica
+  dimension: resp-protocol-compat
+  system: Redis Cluster (READONLY / READWRITE connection commands)
+  version: 'Redis 3.0.0+ (current docs: redis.io latest)'
+  claim: Redis Cluster surfaces replica reads with bounded staleness as a per-connection contract via
+    the READONLY and READWRITE commands. READONLY tells a Redis Cluster replica node that the client is
+    willing to read possibly stale data and is not interested in running write queries; with the connection
+    in readonly mode the cluster only sends a redirection to the client when the operation involves keys
+    not served by the replica's master (e.g. slots never owned by that master, or after a resharding).
+    READWRITE resets the readonly flag back to the default, where read queries against a replica are redirected
+    to the authoritative master. Both commands have been available since Redis 3.0.0 and belong to the
+    `cluster` command group.
+  value: READONLY = opt-in per-connection 'willing to read possibly stale data' from replica (redirect
+    only for keys not served by replica's master); READWRITE resets to default; since Redis 3.0.0; group=cluster
+  source_url: https://redis.io/docs/latest/commands/readonly/
+  accessed_date: '2026-06-13'
+  confidence: high
+  confidence_reason: Read directly from the official redis.io command reference for READONLY; the 'willing
+    to read possibly stale data' wording, redirection conditions, 'since 3.0.0' version, and 'cluster'
+    command group are quoted from the upstream page, with READWRITE semantics corroborated by the companion
+    redis.io page.
+  load_bearing: false
+  verification:
+    verdict: confirmed
+    best_source_url: https://redis.io/docs/latest/commands/readonly/
+    note: 'WebFetch of redis.io/docs/latest/commands/readonly confirmed the verbatim description (''willing
+      to read possibly stale data and is not interested in running write queries''), the redirection-only-for-unserved-keys
+      behavior, ''Available since: 3.0.0'', and ''Group: cluster''. The complementary READWRITE command
+      (redis.io/docs/latest/commands/readwrite) was surfaced in the same search and resets the readonly
+      flag; redis-py-cluster docs independently note replicas may not return latest data due to asynchronous
+      replication, consistent with the bounded-staleness framing.'
From ecf771717ea3eeb3dca2384d8e6ca155a5e589a4 Mon Sep 17 00:00:00 2001
From: Zeke
Date: Sat, 13 Jun 2026 21:44:37 -0700
Subject: [PATCH 2/2] fix: drop duplicate envoy claim (already exists at claims.yaml as envoy-redis-readpolicy)

The Batch-B research redundantly re-created the envoy read-policy claim that
already existed in claims.yaml; REPLICA_READ.md correctly cites the pre-existing
[envoy-redis-readpolicy]. Only redis-cluster-readonly-replica was genuinely new.

Signed-off-by: Zeke
---
 docs/prior-art/claims.yaml | 28 ----------------------------
 1 file changed, 28 deletions(-)

diff --git a/docs/prior-art/claims.yaml b/docs/prior-art/claims.yaml
index 6c5d429..b1228f0 100644
--- a/docs/prior-art/claims.yaml
+++ b/docs/prior-art/claims.yaml
@@ -6982,34 +6982,6 @@ claims:
     note: rustsec.org states the database is maintained by the Rust Secure Code Working Group, covers
       crates published via crates.io, uses RUSTSEC-YYYY-NNNN IDs (example RUSTSEC-2022-0051), and exports
       to OSV in real time for the GitHub Advisory Database/Dependabot. Cross-checked against github.com/rustsec/advisory-db.
-- id: envoy-redis-readpolicy
-  dimension: resp-protocol-compat
-  system: Envoy Redis proxy (network filter)
-  version: latest (1.39.0-dev-ec0eb7 docs build)
-  claim: 'Envoy''s Redis proxy exposes a ReadPolicy enum on the connection-pool settings that surfaces
-    replica reads with bounded staleness as an explicit, named contract. The enum has five values: MASTER
-    (default, read from the current primary), PREFER_MASTER (read from primary, fall back to replicas
-    if unavailable), REPLICA (read from replica nodes; a random node is chosen among multiple replicas
-    in a shard, healthy nodes preferred), PREFER_REPLICA (read from replicas, fall back to primary if
-    all replicas are unavailable/unhealthy), and ANY (read from any node, primary or replica, healthy
-    nodes preferred).
-    The documentation explicitly warns that all ReadPolicy settings except MASTER may return stale data because replication is asynchronous, so the caller must tolerate staleness.'
-  value: ReadPolicy enum = {MASTER (default), PREFER_MASTER, REPLICA, PREFER_REPLICA, ANY}; non-MASTER
-    policies documented as possibly returning stale data due to asynchronous replication
-  source_url: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/redis_proxy/v3/redis_proxy.proto
-  accessed_date: '2026-06-13'
-  confidence: high
-  confidence_reason: Read directly from the official Envoy v3 API proto documentation; all five enum values
-    and the asynchronous-replication staleness warning are quoted verbatim from the upstream page.
-  load_bearing: true
-  verification:
-    verdict: confirmed
-    best_source_url: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/redis_proxy/v3/redis_proxy.proto
-    note: WebFetch of the latest Envoy v3 redis_proxy.proto docs (build 1.39.0-dev-ec0eb7) returned all
-      five ReadPolicy values with MASTER as DEFAULT and the explicit note that all settings except MASTER
-      may return stale data because replication is asynchronous. Cross-checked against WebSearch summaries
-      of older tagged versions (v1.12.0/v1.14.x) which describe the same enum semantics, confirming this
-      is a long-standing, canonical Envoy feature rather than a recent addition.
 - id: redis-cluster-readonly-replica
   dimension: resp-protocol-compat
   system: Redis Cluster (READONLY / READWRITE connection commands)