sync: tag pulled durability evidence with its asserting peer (#104 residual) #133

Open
mbertschler wants to merge 1 commit into main from issue-104-durability-provenance

What

Closes the residual of #104: provenance tagging of pulled durability evidence. The exploitable gating concern was closed by #120 (fail-closed freshness) and the pull-scoping hygiene by #123. This PR records which peer asserted each pulled vector component, so the audit trail answers "which evidence gated this offload" and a compromised peer's assertions become revocable without rewinding the live verified vector.

This PR does not touch the #120 freshness condition or the #123 scoping/drop logic — both remain intact (go test ./... green).

How

Migration v22 adds a nullable source_node_id INTEGER REFERENCES nodes(id) to both destination_run_ids and destination_run_ids_history. SchemaVersion bumped 21 → 22; store/schema.sql regenerated via go test ./store -update-schema.

  • NULL = locally-verified (the trusted class, written by AdvanceDestinationVectorTo and UpsertDestinationRunIDVerified — a transfer this node observed itself).
  • non-NULL = peer-asserted, tagged with the asserting peer's nodes.id. The durability pull resolves the source peer once and routes every merged component through a new UpsertDestinationRunIDPulled entry point. This is the only path that stamps non-local provenance.

Provenance is monotone-safe in the upsert's ON CONFLICT CASE:

  • a strict run advance adopts the incoming provenance;
  • a local (NULL) re-confirmation reclaims a peer-tagged component back to local;
  • a peer re-confirmation at the recorded run never downgrades a locally-verified (NULL) component to peer-asserted — a peer cannot launder local provenance away.
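The three rules above can be sketched as a pure function over the recorded and incoming values. This is an in-memory illustration of the intended semantics, not the SQL `CASE` expression itself; `liveRow`, `peer`, and `mergeProvenance` are hypothetical names. Two behaviours are assumptions that the description does not spell out: an incoming run below the recorded one is ignored (in line with the monotonic merge), and a peer re-confirmation of a component tagged with a different peer leaves the existing tag in place.

```go
package main

// liveRow is an illustrative live vector entry. Source is nil for a
// locally-verified component and points at the asserting peer's node ID
// otherwise.
type liveRow struct {
	Run    int64
	Source *int64
}

// peer is a small helper that returns a pointer to a node ID.
func peer(id int64) *int64 { return &id }

// mergeProvenance applies the provenance rules to an existing row and
// returns the row that should be stored afterwards.
func mergeProvenance(cur, in liveRow) liveRow {
	switch {
	case in.Run > cur.Run:
		// Strict advance: adopt the incoming run and its provenance.
		return in
	case in.Run == cur.Run && in.Source == nil:
		// Local re-confirmation reclaims a peer-tagged component.
		cur.Source = nil
		return cur
	default:
		// Peer re-confirmation at the recorded run never downgrades a
		// local component. Assumption: a stale run, or a re-confirmation
		// by a different peer, also leaves the row untouched.
		return cur
	}
}
```

For example, `mergeProvenance(liveRow{Run: 5}, liveRow{Run: 5, Source: peer(2)})` keeps `Source` nil, while the same call with `Run: 6` on the incoming side adopts peer 2 as the source.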

Revocation: RevokeDestinationRunIDsFromSource(sourceNodeID) deletes the live components a single peer asserted, leaving locally-verified components, other peers' assertions, and the append-only history untouched. A revoked component reverts to "no row" (no floor); a later legitimate pull or verified push re-advances it. History is never rewritten — revocation is a forward act.
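The revocation contract can be illustrated with an in-memory model. This is a sketch under stated assumptions rather than the store implementation: `coord`, `entry`, `memStore`, and `revokeFromSource` are hypothetical names, and the real method runs as a SQL `DELETE` on `destination_run_ids`.

```go
package main

// coord identifies one vector component.
type coord struct {
	Volume, Destination, OriginNode int64
}

// entry is one recorded value; Source is nil for locally-verified evidence.
type entry struct {
	Run    int64
	Source *int64
}

// memStore models the live vector plus its append-only history.
type memStore struct {
	live    map[coord]entry
	history []entry
}

// revokeFromSource deletes the live entries asserted by sourceNodeID and
// returns how many were removed. Local (nil) entries can never match, and
// history is deliberately left alone.
func (s *memStore) revokeFromSource(sourceNodeID int64) int {
	removed := 0
	for key, e := range s.live {
		if e.Source != nil && *e.Source == sourceNodeID {
			delete(s.live, key) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}
```

After a revocation the affected coordinate simply has no entry, so a later pull or verified push re-advances it through the normal upsert path.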

Gate/audit surfacing: the offload gate carries the provenance on its loaded component and names it in decisions — "asserted by peer <name>" or "locally verified" — in the stale and not-content-verified failure reasons.
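A minimal sketch of how such a suffix might be rendered follows. `provenanceSuffix` is a hypothetical helper, and the exact wording of the locally-verified form is an assumption modelled on the peer-asserted form; the real gate code may format it differently.

```go
package main

import "fmt"

// provenanceSuffix renders the provenance part of a gate failure reason.
// An empty peerName stands for a locally-verified component.
func provenanceSuffix(origin, peerName string) string {
	if peerName == "" {
		return fmt.Sprintf("(origin %s, locally verified)", origin)
	}
	return fmt.Sprintf("(origin %s, asserted by peer %s)", origin, peerName)
}
```

Keeping the rendering in one place means the stale and not-content-verified reasons share identical wording, which makes test assertions on the message stable.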

Schema design choice (for sign-off)

I chose a nullable column on the existing tables over a sibling provenance table. Rationale, optimised for the invariants (not SQL ergonomics):

  • The vector is one live row per (volume, destination, origin_node). Provenance is an attribute of that coordinate's current value, not a fact with its own lifecycle. A sibling table would duplicate the PK and require a second write kept in lockstep with every upsert — a place for the live vector and its provenance to diverge. The column keeps them in one transaction, the same "history can never diverge from the live vector" contract the table already honours.
  • NULL keeps locally-verified rows clean and untouched; revocation is a single equality DELETE that cannot reach a NULL (local) row.
  • No index: source_node_id is low-cardinality (a handful of peers) and revocation/audit narrows by the existing PK; an index on it would violate the repo's low-cardinality-index guidance.

The id↔hash binding stays immutable; monotonic merge and the fail-closed freshness behaviour are unchanged.

Decisions that may want your eye

  1. Schema shape — column vs. sibling table (chosen: column; tradeoffs above).
  2. Revocation semantics: RevokeDestinationRunIDsFromSource deletes live peer-asserted rows (reverting them to "no row", which imposes no floor) while preserving history. It is a store method with no CLI surface yet; wire-up can follow once you confirm the semantics. There is no automatic revocation: it is operator-driven only, consistent with the no-auto-prune policy.
  3. Behavioural change in gate messages: failure reasons now carry a provenance suffix, either (origin X, asserted by peer Y) or (... , locally verified). One existing assertion in offload_test.go was updated to match.

Tests

  • TestMigrateV21ToV22AddsSourceNodeID — forward migration on a populated v21 DB; carried-over row reads NULL (locally-verified), coordinate + method preserved.
  • TestUpsertDestinationRunIDPulledTagsSource — pulled advance tagged with the peer on live row + history; a local advance for another origin stays NULL.
  • TestUpsertDestinationRunIDProvenanceTransitions — peer re-confirm never downgrades local; local re-confirm reclaims a peer-tagged component.
  • TestRevokeDestinationRunIDsFromSource — revoke drops only the bad peer's rows; local + other peer + history untouched.
  • TestPullDurabilityTagsSourcePeer (sync) — end-to-end pull tags components with the source peer; local evidence stays untagged.
  • TestOffloadGateNamesPeerProvenance (offload) — gate refusal names the asserting peer.

go vet ./..., go test ./..., golangci-lint run, and the TestSchemaSnapshot golden test all pass.

Record provenance on the durability vector so a peer-pulled component is
distinguishable from a locally-verified one and a compromised peer's
assertions become revocable without rewinding the live verified vector
(issue #104 residual).

Migration v22 adds a nullable source_node_id (FK nodes(id)) to both
destination_run_ids and destination_run_ids_history. NULL is the
locally-verified class; a non-NULL value names the asserting peer. The
durability pull resolves the source peer once and stamps every merged
component through the new UpsertDestinationRunIDPulled entry point;
AdvanceDestinationVectorTo and the verified upsert leave it NULL.

The upsert keeps provenance monotone-safe: a strict advance adopts the
incoming provenance, a local re-confirmation reclaims a peer-tagged
component, and a peer re-confirmation at the recorded run never
downgrades a locally-verified component to peer-asserted.

RevokeDestinationRunIDsFromSource drops a single peer's live components
while leaving locally-verified ones, other peers' assertions, and the
append-only history untouched. The offload gate carries the provenance
and names the asserting peer (or 'locally verified') in its decisions, so
the audit trail answers which evidence gated an offload.