Add Label-Based Group Replica Response Strategy#279
Conversation
cmd/thanos/query.go
Outdated
```go
groupReplicaGroupLabel := cmd.Flag("query.group-replica.group-label", "External label name that identifies the group for group-replica partial response strategy. Stores with the same group label value hold replicated data. Must be set together with --query.group-replica.quorum-label.").
	Default("").String()

groupReplicaQuorumLabel := cmd.Flag("query.group-replica.quorum-label", "External label name whose value specifies the minimum number of healthy stores required per group. Must be set together with --query.group-replica.group-label. Stores without these labels or with invalid quorum values (<1) are treated as must-success stores.").
	Default("").String()
```
I didn't understand from the PR description what this quorum-label is for. If a querier connects to multiple store groups like the ones below, which quorum number should we use? A single value for the entire querier doesn't seem to fit semantically:
- pantheon-db: quorum == 2
- pantheon-db-dp: quorum == 2
- pantheon-store: quorum == 1
- pantheon-long-range-store: quorum == 1
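The label-based design answers this concern: the quorum is not one querier-wide number but is read per group from each store's own external labels, so different groups can carry different quorums. A minimal sketch of that resolution, using hypothetical types rather than the actual Thanos implementation:

```go
package main

import (
	"fmt"
	"strconv"
)

// store is a hypothetical stand-in for a Thanos store endpoint and the
// external labels it advertises.
type store struct {
	name   string
	labels map[string]string
}

// quorumPerGroup reads the quorum for each group from the stores' own
// labels, so pantheon-db can require 2 while pantheon-store requires 1
// within one querier. Stores missing the labels, or with quorum < 1, are
// returned separately as "must-succeed" stores, per the flag description.
func quorumPerGroup(stores []store, groupLabel, quorumLabel string) (map[string]int, []store) {
	quorums := map[string]int{}
	var mustSucceed []store
	for _, s := range stores {
		group, hasGroup := s.labels[groupLabel]
		q, err := strconv.Atoi(s.labels[quorumLabel])
		if !hasGroup || err != nil || q < 1 {
			mustSucceed = append(mustSucceed, s)
			continue
		}
		quorums[group] = q // last store wins; a real implementation would validate consistency
	}
	return quorums, mustSucceed
}

func main() {
	stores := []store{
		{"db-0", map[string]string{"receive_group": "pantheon-db", "quorum": "2"}},
		{"db-1", map[string]string{"receive_group": "pantheon-db", "quorum": "2"}},
		{"store-0", map[string]string{"receive_group": "pantheon-store", "quorum": "1"}},
		{"sidecar-0", nil}, // no labels: treated as must-succeed
	}
	quorums, must := quorumPerGroup(stores, "receive_group", "quorum")
	fmt.Println(quorums["pantheon-db"], quorums["pantheon-store"], len(must))
}
```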
```
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  Receive-0   │  │  Receive-1   │  │  Receive-2   │
│              │  │              │  │              │
│ Labels:      │  │ Labels:      │  │ Labels:      │
│ receive_group│  │ receive_group│  │ receive_group│
│ ="group-A"   │  │ ="group-A"   │  │ ="group-A"   │
│ quorum="2"   │  │ quorum="2"   │  │ quorum="2"   │
└──────────────┘  └──────────────┘  └──────────────┘
        │                │                │
        └────────────────┴────────────────┘
                         │
           Group "group-A" (quorum=2)
           Needs 2 of 3 stores healthy
```
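The "2 of 3" check in the diagram reduces to counting healthy stores in a group against that group's quorum. An illustrative sketch, not the actual Thanos code:

```go
package main

import "fmt"

// groupMeetsQuorum reports whether a replica group still satisfies its
// quorum: at least `quorum` of its stores must be healthy, mirroring the
// "needs 2 of 3 stores healthy" condition in the diagram.
func groupMeetsQuorum(healthy []bool, quorum int) bool {
	n := 0
	for _, h := range healthy {
		if h {
			n++
		}
	}
	return n >= quorum
}

func main() {
	// group-A: three Receive replicas, quorum=2.
	fmt.Println(groupMeetsQuorum([]bool{true, true, false}, 2))  // one replica down: quorum held
	fmt.Println(groupMeetsQuorum([]bool{true, false, false}, 2)) // two down: quorum lost
}
```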
Could you explain a bit? Does this mean those Receive instances need to attach the external labels to every time series, all the time? That would implicitly mean a constant tax of network I/O overhead, as well as CPU overhead, in the db pods.
I'm not super sure this approach is optimal. I understand the point about `aligned_ketama` giving us consistent shards, but the overhead might be a lot (each returned series needs to carry its external labels). Is it possible to use `Store.Info` to exchange this additional info instead?
Add Label-Based Group Replica Response Strategy
Summary
This PR enhances the `GROUP_REPLICA` partial response strategy to support label-based group and quorum identification, enabling more flexible failure tolerance for replicated-data setups such as the `aligned_ketama` hashring.
New Flags
- `--query.group-replica.group-label`: External label name identifying the group (stores with the same value hold replicated data)
- `--query.group-replica.quorum-label`: External label name whose value specifies the minimum number of healthy stores required per group
How It Works
Behavior
- Group with `>= quorum` healthy stores: satisfies its quorum requirement
- Group with `< quorum` healthy stores: fails its quorum requirement
Example Configuration
Receive pods with external labels:
Query configuration:
Failure scenarios:
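The concrete configuration blocks above were stripped from this page. Based on the flags this PR adds and the diagram in the review thread, the querier side presumably looks something like the following; the label names `receive_group` and `quorum` come from the diagram, and everything else about the deployment is assumed:

```shell
# Receive pods advertise external labels such as:
#   receive_group="group-A", quorum="2"
#
# Querier: point the new flags at those external label names.
thanos query \
  --query.group-replica.group-label=receive_group \
  --query.group-replica.quorum-label=quorum
```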
Label Stripping
Both `group-label` and `quorum-label` are automatically stripped from query results (similar to replica labels with deduplication).
Backward Compatibility
Existing `GROUP_REPLICA` behavior is preserved.