
agent: activate session-caller binding via opt-in per-peer tokens (#110a, safe slice of #110d) #132

Merged: mbertschler merged 2 commits into main from issue-110-agent-hardening on Jun 21, 2026

@mbertschler

What & why

Issue #110 is a cluster of HTTP-agent hardening gaps. Findings 9a, 9b, 9c
were already implemented and merged in PR #119 (origin/main). This PR lands
the safe, incremental slice of 9d that makes the dormant 9a session-caller
binding actually fire, and binds a declared node name to its credential.

Until now the agent authenticated every caller with one shared token, so
callerNodeName(req) returned "" and the 9a lookupSession/takeSession
caller binding (built and unit-tested in #119) was inert in production. This
PR gives the agent a way to recover an authenticated node identity, turning that
protection on.

Per finding

  • 9a — session bound to caller — ACTIVATED. requireBearer now resolves a
    presented token to a node identity (when per-peer tokens are configured),
    stamps it on the request context, and callerNodeName reads it back. The
    already-present session binding then rejects a phase call
    (/plan, /verify, /close, /plan-folders) whose authenticated identity
    differs from the node that opened the session, so a second token-holder can no
    longer hijack another node's in-flight sync or abort it with /close status=failed.
    (The mechanism was merged in #119, "Harden peer-sync receiver: origin poisoning,
    history loss, agent DoS (#105 #106 #110a-c)"; this PR turns enforcement on.)
  • 9b — peer:// placeholder upgrade — already fixed in #119, untouched.
    Receiver passes allowEndpointUpgrade=false; peerEndpoint ignores the wire
    InitiatorEndpoint.
  • 9c — request body size limit — already fixed in #119, untouched.
    decodeJSON wraps the body in http.MaxBytesReader (256 MiB) and /plan
    caps len(Entries) at 1<<20. The OOM vector is closed regardless of the
    deliberately-omitted ReadTimeout (streaming negotiation needs no wall clock;
    rationale is in agent/serve.go).
  • 9d — single token / identity unbound — PARTIAL (safe increment); rest DEFERRED.
    • Done here: optional [agent.auth.peers.<node>] bearer = ... map (literal
      or { env }), wired to agent.Config.PeerTokens. When configured, a token
      maps to a node identity (constant-time shared-token check preserved; per-peer
      set keyed by SHA-256 digest). /begin also refuses a declared
      initiator_node_name that contradicts the authenticated identity, binding the
      declared name to the credential.
    • Deferred (warrants its own design discussion): making per-peer tokens
      mandatory / removing the shared token; per-volume scoping of a credential;
      rotating-token / revocation story. These are larger and out of scope here.
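For readers who have not seen #119, the 9c body cap described above could look roughly like the following. This is a minimal sketch, not the code in agent/serve.go: planRequest, decodePlan and tryDecode are hypothetical names, the payload shape is assumed, and only the 256 MiB and 1<<20 figures come from this description.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Limits quoted in the PR description.
const (
	maxBodyBytes   = 256 << 20 // 256 MiB
	maxPlanEntries = 1 << 20
)

// planRequest is a hypothetical stand-in for the real /plan payload.
type planRequest struct {
	Entries []string `json:"entries"`
}

// decodePlan caps the request body before decoding, then caps the number
// of entries. The limit is a parameter only so the demo can use a tiny one.
func decodePlan(w http.ResponseWriter, r *http.Request, limit int64) (*planRequest, error) {
	r.Body = http.MaxBytesReader(w, r.Body, limit)
	var req planRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		return nil, fmt.Errorf("decode body: %w", err)
	}
	if len(req.Entries) > maxPlanEntries {
		return nil, errors.New("too many entries")
	}
	return &req, nil
}

// tryDecode feeds a body through decodePlan using an in-memory request.
func tryDecode(body string, limit int64) error {
	r := httptest.NewRequest(http.MethodPost, "/plan", strings.NewReader(body))
	_, err := decodePlan(httptest.NewRecorder(), r, limit)
	return err
}

func main() {
	fmt.Println(tryDecode(`{"entries":["a","b"]}`, maxBodyBytes)) // accepted
	fmt.Println(tryDecode(`{"entries":["a","b"]}`, 8) != nil)     // body exceeds an 8-byte cap
}
```

Because the cap is applied to the reader rather than to a wall clock, a slow but small request is still served, which matches the rationale given for omitting ReadTimeout.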

Known limitation (by design, flagged for sign-off)

This increment is opt-in and backward compatible: with no [agent.auth.peers]
configured the agent behaves exactly as before (single shared token, no binding).
A holder of the shared token still authenticates with no identity and so can
still drive any session — strict binding only applies to per-peer-token callers.
Closing that gap (shared-token removal + volume scoping) is the deferred part of
9d above.
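The limitation is easier to see in code. The sketch below is an assumption about the shape of the 9a check, not the merged lookupSession/takeSession logic: session, checkCaller and errSessionForbidden are hypothetical names, and treating a session opened without an identity as unbound is a guess.

```go
package main

import (
	"errors"
	"fmt"
)

var errSessionForbidden = errors.New("session belongs to another node")

// session is a hypothetical, trimmed-down view of an in-flight sync session.
type session struct {
	openedBy string // identity authenticated at /begin; "" for a shared-token caller
}

// checkCaller decides whether a phase call may touch the session.
func checkCaller(s session, caller string) error {
	// A shared-token caller carries no identity, so the binding is a no-op.
	// Assumption: a session opened without an identity is likewise unbound.
	if caller == "" || s.openedBy == "" {
		return nil
	}
	if caller != s.openedBy {
		return errSessionForbidden
	}
	return nil
}

func main() {
	s := session{openedBy: "alpha"}
	fmt.Println(checkCaller(s, "alpha")) // owner is allowed
	fmt.Println(checkCaller(s, "beta"))  // another per-peer identity is refused
	fmt.Println(checkCaller(s, ""))      // shared token: still allowed (the known gap)
}
```

The last call is the gap flagged for sign-off: until the shared token is removed, an identity-less caller bypasses the binding.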

Tests

  • config: per-peer token resolution incl. env secrets; rejection of
    duplicate tokens, collision with the shared token, empty bearer, invalid node
    name; absent map leaves PeerTokens nil.
  • agent: authenticator.authenticate resolution table; end-to-end
    TestPeerTokenSessionBinding (intruder token → 403 on /verify, owner →
    200); TestBeginRejectsImpersonatedNodeName (declared name ≠ credential → 403).

go vet ./..., go test ./..., golangci-lint run all green. No schema change.

Refs #110 (does not fully close it — 9d's larger redesign is deferred).

Add an opt-in per-peer bearer token map keyed by node name. Each entry
carries its own secret in the same shape [nodes.X] uses (literal or
{ env = "VAR" }). Tokens must be distinct and must differ from the
shared auth.token so a credential never maps to two identities;
collisions are rejected at load time. An absent map leaves PeerTokens nil and
preserves the single-shared-token behaviour.
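
The load-time rules in this commit message can be sketched as below. validatePeerTokens is a hypothetical helper, not the function in the config package; it assumes secrets are already resolved to strings and leaves out { env } resolution and node-name validation.

```go
package main

import (
	"fmt"
	"sort"
)

// validatePeerTokens checks a node-name -> bearer map against the shared
// token and returns a copy on success. An absent or empty map yields nil,
// which keeps the single-shared-token behaviour.
func validatePeerTokens(shared string, peers map[string]string) (map[string]string, error) {
	if len(peers) == 0 {
		return nil, nil
	}
	// Sort names so error messages are deterministic.
	names := make([]string, 0, len(peers))
	for name := range peers {
		names = append(names, name)
	}
	sort.Strings(names)

	owner := make(map[string]string, len(peers)) // token -> node that claimed it
	out := make(map[string]string, len(peers))
	for _, name := range names {
		token := peers[name]
		if token == "" {
			return nil, fmt.Errorf("peer %q: empty bearer", name)
		}
		if token == shared {
			return nil, fmt.Errorf("peer %q: bearer collides with the shared token", name)
		}
		if other, dup := owner[token]; dup {
			return nil, fmt.Errorf("peers %q and %q share a bearer", other, name)
		}
		owner[token] = name
		out[name] = token
	}
	return out, nil
}

func main() {
	ok, err := validatePeerTokens("shared-secret", map[string]string{"alpha": "a-secret", "beta": "b-secret"})
	fmt.Println(len(ok), err) // 2 <nil>
	_, err = validatePeerTokens("shared-secret", map[string]string{"alpha": "same", "beta": "same"})
	fmt.Println(err) // peers "alpha" and "beta" share a bearer
}
```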

requireBearer now resolves a presented token to an authenticated node
identity when per-peer tokens are configured: a per-peer match stamps
the caller's node name on the request context, which callerNodeName
reads back and the sync handlers bind each in-flight session to (#110a).
The shared token still authenticates but carries no identity, so its
binding stays a no-op — preserving today's single-token behaviour.
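
A minimal sketch of that stamp-and-read-back flow, assuming a standard net/http middleware. Only the names requireBearer and callerNodeName come from this PR; their signatures here, plus callerKey, withCaller, demoResolve and callAs, are hypothetical, and the toy resolver compares tokens with a plain switch where real code would need a timing-safe lookup.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// callerKey is an unexported context key type, so no other package can collide with it.
type callerKey struct{}

func withCaller(ctx context.Context, node string) context.Context {
	return context.WithValue(ctx, callerKey{}, node)
}

// callerNodeName returns the authenticated node name, or "" for a shared-token caller.
func callerNodeName(r *http.Request) string {
	node, _ := r.Context().Value(callerKey{}).(string)
	return node
}

// requireBearer rejects unauthenticated requests and stamps the resolved
// identity on the request context. resolve is a hypothetical hook.
func requireBearer(resolve func(token string) (node string, ok bool), next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		header := r.Header.Get("Authorization")
		if !strings.HasPrefix(header, "Bearer ") {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		node, ok := resolve(strings.TrimPrefix(header, "Bearer "))
		if !ok {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r.WithContext(withCaller(r.Context(), node)))
	})
}

// demoResolve is a toy resolver for the demo only.
func demoResolve(token string) (string, bool) {
	switch token {
	case "shared-secret":
		return "", true // authenticates, but carries no identity
	case "alpha-secret":
		return "alpha", true
	}
	return "", false
}

// callAs sends one request with the given token and reports status and body.
func callAs(token string) (int, string) {
	h := requireBearer(demoResolve, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "caller=%q", callerNodeName(r))
	}))
	req := httptest.NewRequest(http.MethodPost, "/verify", nil)
	req.Header.Set("Authorization", "Bearer "+token)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(callAs("alpha-secret"))  // 200 caller="alpha"
	fmt.Println(callAs("shared-secret")) // 200 caller=""
	code, _ := callAs("wrong")
	fmt.Println(code) // 401
}
```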

/begin additionally refuses a declared initiator_node_name that
contradicts the authenticated identity, binding the declared node name
to the credential rather than letting it be self-asserted (a safe,
incremental slice of #110d; the full per-peer-token redesign and
volume scoping remain deferred).
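
The /begin rule reduces to a small comparison, sketched here under assumptions: checkDeclaredInitiator is a hypothetical name, and accepting an empty declared name is a guess about behaviour this description does not spell out. The 403 mentioned in the tests would be the handler's response to a non-nil error.

```go
package main

import "fmt"

// checkDeclaredInitiator compares the initiator_node_name declared in a
// /begin request with the identity recovered from the credential.
func checkDeclaredInitiator(authenticated, declared string) error {
	// Shared-token caller: no identity to check against, so the declared
	// name stays self-asserted as before.
	if authenticated == "" {
		return nil
	}
	// Assumption: an omitted name does not contradict the credential.
	if declared == "" || declared == authenticated {
		return nil
	}
	return fmt.Errorf("declared node %q does not match authenticated node %q", declared, authenticated)
}

func main() {
	fmt.Println(checkDeclaredInitiator("alpha", "alpha")) // consistent
	fmt.Println(checkDeclaredInitiator("alpha", "beta"))  // impersonation attempt
	fmt.Println(checkDeclaredInitiator("", "beta"))       // shared token: unchecked
}
```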

The per-peer token set is keyed by SHA-256 digest so the map probe
never compares attacker bytes against a stored secret directly; the
shared-token check stays constant-time.
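
One way to realise the digest-keyed lookup is sketched below. The authenticator type and its authenticate method are named in the tests above, but the fields, constructor and lookup order shown here are assumptions rather than the merged code.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// authenticator resolves a bearer token to a node identity.
type authenticator struct {
	shared []byte
	peers  map[[sha256.Size]byte]string // SHA-256 of token -> node name
}

// newAuthenticator indexes per-peer tokens (node name -> token) by digest,
// so no raw per-peer secret is kept as a map key.
func newAuthenticator(shared string, peerTokens map[string]string) *authenticator {
	a := &authenticator{
		shared: []byte(shared),
		peers:  make(map[[sha256.Size]byte]string, len(peerTokens)),
	}
	for node, token := range peerTokens {
		a.peers[sha256.Sum256([]byte(token))] = node
	}
	return a
}

// authenticate returns (node, true) for a per-peer token, ("", true) for
// the shared token, and ("", false) otherwise.
func (a *authenticator) authenticate(token string) (string, bool) {
	// The probe hashes the presented bytes first, so the map compares
	// digests and never the attacker's input against a stored secret.
	if node, found := a.peers[sha256.Sum256([]byte(token))]; found {
		return node, true
	}
	// Guard against an unset shared token matching an empty bearer.
	if len(a.shared) > 0 && subtle.ConstantTimeCompare([]byte(token), a.shared) == 1 {
		return "", true
	}
	return "", false
}

func main() {
	a := newAuthenticator("shared-secret", map[string]string{"alpha": "alpha-secret"})
	fmt.Println(a.authenticate("alpha-secret"))  // alpha true
	fmt.Println(a.authenticate("shared-secret")) //  true
	fmt.Println(a.authenticate("guess"))         //  false
}
```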

@mbertschler merged commit a367cd0 into main on Jun 21, 2026
2 checks passed
@mbertschler deleted the issue-110-agent-hardening branch on June 21, 2026 08:45