attribution: per-session human identity (STS SourceIdentity + k8s impersonation)#30
Merged
…ersonation)
By default a fab session acts as the pod's tenant IRSA role, bound to no
named human — so every Bedrock call and every `aws`/`kubectl` the agent
runs traces to a role, not a person. That is exactly the gap an evidence
engine surfaces. This is the opt-in that closes it.
─── src/attribution.ts ───
Set FAB_OPERATOR (a named human) + FAB_SESSION_ROLE_ARN and the in-pod
session, before the agent loop:
- assumes the session role carrying the operator as STS SourceIdentity
(via the `aws` CLI already in the image — no new dependency, same
child-process posture as the claude-cli runtime), exports the temp
creds, and drops the pod's IRSA web-identity vars so exactly one
credential mechanism remains. Every AWS call — the Bedrock InvokeModel
inference call and any `aws` the Bash tool runs — is then recorded in
CloudTrail under SourceIdentity=<operator>.
- writes a kubeconfig that authenticates with the SA token but
impersonates the operator (KUBECONFIG), so apiserver audit records
impersonatedUser=<operator>.
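The two bindings above can be sketched as pure functions. This is a minimal illustration, not the code in src/attribution.ts: the function names, the `fab-` session-name prefix, and the in-cluster server/CA/token paths are assumptions. The `aws sts assume-role` flags and the kubeconfig `as` / `tokenFile` fields are standard.

```typescript
// Hypothetical input shape; the real module may differ.
interface AssumeRoleInput {
  roleArn: string;
  operator: string;
  durationSeconds: number;
}

/** argv for `aws sts assume-role`, carrying the operator as SourceIdentity. */
export function assumeRoleArgs(input: AssumeRoleInput): string[] {
  return [
    "sts", "assume-role",
    "--role-arn", input.roleArn,
    // Session name is an assumption; STS limits it to 64 characters.
    "--role-session-name", `fab-${input.operator}`.slice(0, 64),
    "--source-identity", input.operator,
    "--duration-seconds", String(input.durationSeconds),
    "--output", "json",
  ];
}

/** Maps the CLI's JSON output onto the standard AWS credential env vars. */
export function credsToEnv(stdout: string): Record<string, string> {
  const c = JSON.parse(stdout).Credentials;
  for (const key of ["AccessKeyId", "SecretAccessKey", "SessionToken"]) {
    if (!c || typeof c[key] !== "string") {
      throw new Error(`assume-role output is missing Credentials.${key}`);
    }
  }
  return {
    AWS_ACCESS_KEY_ID: c.AccessKeyId,
    AWS_SECRET_ACCESS_KEY: c.SecretAccessKey,
    AWS_SESSION_TOKEN: c.SessionToken,
  };
}

// Conventional in-cluster locations (assumed, not taken from the PR).
const SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount";

/** Kubeconfig that authenticates as the SA but impersonates the operator. */
export function impersonatingKubeconfig(operator: string) {
  return {
    apiVersion: "v1",
    kind: "Config",
    clusters: [{
      name: "in-cluster",
      cluster: {
        server: "https://kubernetes.default.svc",
        "certificate-authority": `${SA_DIR}/ca.crt`,
      },
    }],
    // `as` makes every request carry an Impersonate-User header.
    users: [{ name: "session", user: { tokenFile: `${SA_DIR}/token`, as: operator } }],
    contexts: [{ name: "session", context: { cluster: "in-cluster", user: "session" } }],
    "current-context": "session",
  };
}
```

Serializing the returned object with `JSON.stringify` yields a file kubectl can read, since kubeconfig accepts JSON as well as YAML.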
The operator is validated STS-clean up front so the SAME string binds both
streams. Credential assumption + kubeconfig are computed before any env
mutation (no half-attributed env). Fail-closed: if attribution was
requested but setup fails, the session aborts rather than run unattributed.
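A minimal sketch of the ordering described above: validate, compute both bindings, and only then touch the environment. The names here (`applyIdentitySketch`, `sessionExitCode`, `Bindings`) and the three IRSA variable names are assumptions for illustration. The operator pattern follows the character set and 2 to 64 length that STS documents for SourceIdentity.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result of the "compute" phase (STS creds + kubeconfig path).
interface Bindings {
  awsEnv: Record<string, string>;
  kubeconfigPath: string;
}

// STS SourceIdentity: 2-64 characters from [A-Za-z0-9_+=,.@-].
const SOURCE_IDENTITY_RE = /^[\w+=,.@-]{2,64}$/;

// Assumed names of the pod's IRSA web-identity variables.
const IRSA_VARS = ["AWS_WEB_IDENTITY_TOKEN_FILE", "AWS_ROLE_ARN", "AWS_ROLE_SESSION_NAME"];

export function validateOperator(operator: string): string {
  if (!SOURCE_IDENTITY_RE.test(operator)) {
    throw new Error(`operator "${operator}" is not a valid STS SourceIdentity`);
  }
  return operator;
}

/** Computes everything first; env is mutated only after nothing can throw. */
export function applyIdentitySketch(
  env: Env,
  operator: string,
  compute: (operator: string) => Bindings,
): void {
  const op = validateOperator(operator);
  const bindings = compute(op); // a throw here leaves env untouched
  for (const name of IRSA_VARS) delete env[name];
  Object.assign(env, bindings.awsEnv);
  env["KUBECONFIG"] = bindings.kubeconfigPath;
}

/** Fail-closed wrapper: the session never runs if attribution setup throws. */
export function sessionExitCode(
  env: Env,
  operator: string | undefined,
  compute: (operator: string) => Bindings,
  runSession: () => void,
): number {
  try {
    if (operator !== undefined) applyIdentitySketch(env, operator, compute);
    runSession();
    return 0;
  } catch {
    return 1;
  }
}
```

Because the same validated string is passed to both bindings, the CloudTrail and apiserver records can be joined on it without normalization.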
─── Wiring ───
role-session.ts applies it once per session pod (after a cheap role check
that avoids a wasted STS call on a typo'd role); sdk-k8s.ts forwards
FAB_OPERATOR / FAB_SESSION_ROLE_ARN / FAB_SESSION_DURATION onto the pod.
Inert for every other runtime and when FAB_OPERATOR is unset.
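The forwarding step might look roughly like the following. The function name and the `EnvVar` shape are hypothetical, and gating all three variables on FAB_OPERATOR being set is an assumption drawn from the "inert when unset" wording, not a confirmed detail of sdk-k8s.ts.

```typescript
// Shape of a Kubernetes container env entry.
interface EnvVar {
  name: string;
  value: string;
}

const ATTRIBUTION_VARS = ["FAB_OPERATOR", "FAB_SESSION_ROLE_ARN", "FAB_SESSION_DURATION"];

/** Env entries to copy onto the session pod; empty when attribution is off. */
export function forwardedAttributionEnv(env: Record<string, string | undefined>): EnvVar[] {
  if (!env["FAB_OPERATOR"]) return []; // inert when no operator is named
  const out: EnvVar[] = [];
  for (const name of ATTRIBUTION_VARS) {
    const value = env[name];
    if (value) out.push({ name, value });
  }
  return out;
}
```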
─── Why SourceIdentity, not Bedrock requestMetadata ───
The Agent SDK doesn't expose the InvokeModel request, so fab can't stamp
requestMetadata from this path. SourceIdentity rides the credentials the
SDK already resolves and is crossbearing's strongest binding — it
attributes the agent's aws/kubectl tool-call records, which crossbearing
corroborates.
docs/attribution.md covers the required platform IAM (session-role trust
policy allowing sts:AssumeRole + sts:SetSourceIdentity; the role needs
bedrock:InvokeModel) and the k8s impersonate RBAC, plus limitations
(process-wide operator, credential-TTL hard cliff, success-only records).
20 tests; full suite green; lint + format clean.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
…at model
A quality-check hardening pass over the per-session attribution feature. No
behavior change to the happy path; every edit closes a verified gap from an
adversarial review. Findings were all non-blocking polish — this folds them in.
─── Input validation (src/attribution.ts) ───────────────────────────────
- FAB_SESSION_ROLE_ARN is now shape-validated up front (ROLE_ARN_RE) so a
typo fails fast with a clear message instead of an opaque STS error after a
network round-trip — matching the rigor already applied to the operator.
Accepts every real partition (aws / aws-cn / aws-us-gov / aws-iso-*) and
role paths; rejects non-ARNs, wrong service/resource, and bad account ids.
- FAB_SESSION_DURATION parse is gated on a decimal-only regex before Number(),
so hex/float/garbage ("0x384", "3600.5", "3600abc") reject deterministically
rather than slipping through Number()'s coercion.
- The post-assume credential cleanup is expanded from 3 to 8 env vars
(adds AWS_CONTAINER_CREDENTIALS_FULL_URI/RELATIVE_URI, AWS_PROFILE,
AWS_SHARED_CREDENTIALS_FILE, AWS_CONFIG_FILE) so the assumed SourceIdentity
creds are the only source the default provider chain can resolve regardless
of pod env shape. The comment is rescoped to the default chain — it no longer
overclaims "exactly one mechanism" and notes it cannot stop explicit re-auth
with the still-mounted web-identity token. AWS_REGION is deliberately kept so
Bedrock base-URL derivation still works.
- The MAX_DURATION_SECONDS / durationSeconds comments now record that STS role
chaining caps a chained session at 3600s, so values in (3600, 43200] fail
closed at assume-role time rather than silently.
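A sketch of the validation rules listed above, under stated assumptions: the regex is an illustrative stand-in for ROLE_ARN_RE rather than the pattern in the source, the 900-second lower bound is the STS minimum, and the three IRSA variable names in the cleanup list are guesses (the other five are the ones named above).

```typescript
// Illustrative stand-in for ROLE_ARN_RE: any aws / aws-* partition,
// 12-digit account id, role name with optional path.
const ROLE_ARN_SKETCH_RE = /^arn:aws(?:-[a-z]+)*:iam::\d{12}:role\/[\w+=,.@\/-]+$/;

export function isRoleArn(arn: string): boolean {
  return ROLE_ARN_SKETCH_RE.test(arn);
}

const MIN_DURATION_SECONDS = 900; // STS minimum
const MAX_DURATION_SECONDS = 43200; // role chaining still caps a chained session at 3600

/** Gate on decimal digits before Number(), so "0x384" never coerces to 900. */
export function parseDurationSeconds(raw: string): number {
  if (!/^\d+$/.test(raw)) {
    throw new Error(`FAB_SESSION_DURATION must be a decimal integer, got "${raw}"`);
  }
  const seconds = Number(raw);
  if (seconds < MIN_DURATION_SECONDS || seconds > MAX_DURATION_SECONDS) {
    throw new Error(`FAB_SESSION_DURATION out of range: ${seconds}`);
  }
  return seconds;
}

// Eight variables cleared after assume-role. The first three names are assumed.
const CREDENTIAL_SOURCE_VARS = [
  "AWS_WEB_IDENTITY_TOKEN_FILE",
  "AWS_ROLE_ARN",
  "AWS_ROLE_SESSION_NAME",
  "AWS_CONTAINER_CREDENTIALS_FULL_URI",
  "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
  "AWS_PROFILE",
  "AWS_SHARED_CREDENTIALS_FILE",
  "AWS_CONFIG_FILE",
];

/** Removes competing credential sources; AWS_REGION is deliberately kept. */
export function clearCredentialSources(env: Record<string, string | undefined>): void {
  for (const name of CREDENTIAL_SOURCE_VARS) delete env[name];
}
```

Without the digit gate, `Number("0x384")` evaluates to 900 and would pass a range check, which is the coercion the bullet above refers to.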
─── Tests (+6 net; suite 320 → 326) ─────────────────────────────────────
Pins the security-load-bearing behavior a green baseline couldn't catch:
- role-session.ts fail-closed: a real spy on SdkRuntime.prototype.runRoleSession
asserts exit 1 AND that the runtime is never reached when attribution throws
(deterministic + offline — an invalid operator throws before any aws call).
- applySessionIdentity throw-leaves-env-pristine: a throwing CliRunner proves
the compute-both-bindings-before-mutate ordering leaves no half-attributed
state (IRSA vars survive, no creds/kubeconfig set).
- kubeconfig file mode asserted 0600 (statSync), not just contents.
- non-decimal duration rejection, the 8-var cleanup, and GovCloud-partition
ARN acceptance.
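The file-mode assertion can be illustrated as follows. This is a standalone sketch on a POSIX filesystem using Node's fs module, not the test from the suite. `writePrivateFile` and `permissionBits` are hypothetical helper names.

```typescript
import { mkdtempSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/** Writes contents to a fresh temp file readable only by the owner. */
export function writePrivateFile(contents: string): string {
  // A fresh directory guarantees the file is created, so `mode` applies.
  const dir = mkdtempSync(join(tmpdir(), "fab-kubeconfig-"));
  const path = join(dir, "config");
  writeFileSync(path, contents, { mode: 0o600 });
  return path;
}

/** Permission bits only, with the file-type bits masked off. */
export function permissionBits(path: string): number {
  return statSync(path).mode & 0o777;
}
```

Checking `permissionBits(path) === 0o600` catches a regression that a contents-only assertion would miss, such as the `mode` option being dropped.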
─── Docs ─────────────────────────────────────────────────────────────────
- docs/attribution.md gains a "Threat model" section stating plainly that this
attributes a cooperating agent and is NOT a containment control: both
bindings are droppable under bypassPermissions (drop KUBECONFIG; re-auth via
the mounted web-identity token), AWS resistance relies on STS SourceIdentity
stickiness, and "strongest binding" is relative to requestMetadata, not
absolute. It names the platform backstops (session SA holds no direct RBAC;
session role denies broad sts:AssumeRole). The module header carries the same
scope note.
- A "Duration and the role-chaining cap" section + a config-table note explain
the 3600s ceiling. An operator-must-byte-match-RBAC caveat covers the
silent AWS-attributed/K8s-denied split.
- The feature is no longer orphaned: the three FAB_OPERATOR/FAB_SESSION_* vars
are now surfaced consistently in docs/transports.md, README.md, CLAUDE.md's
architecture map, and .env.example, all pointing at docs/attribution.md.
Build / lint / format / test all green. Still draft — the companion session
role IAM and impersonate RBAC live platform-side (eks-agent-platform) and must
exist before FAB_OPERATOR does anything.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
The attribution and sdk-k8s session-role fixtures hard-coded a real AWS account
id (351619759866) inside their role ARNs. fab is a public repo, so this PR's
branch was publishing a real account number in plain text. Swap it for the
RFC-style documentation placeholder 111111111111 across both files.
Each occurrence is a paired input + expected value (the ARN is fed in and
asserted back unchanged), so the substitution is behavior-neutral; vitest still
passes (30 tests). No production code touched; fixtures only.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
stxkxs added a commit to nanohype/eks-agent-platform that referenced this pull request on Jun 20, 2026
…mpersonate RBAC
A Platform opts into per-session human attribution with spec.attribution; the
operator then provisions, per tenant, the two resources fab's role-session
entrypoint needs, and tears them down on removal or deletion. The consumer side
is documented in nanohype/fab docs/attribution.md.
─── API (api/platform/v1alpha1) ──────────────────────────────────────────
- PlatformSpec.Attribution (*AttributionSpec, optional): Operators []string
(required, min 1) and SessionRoleMaxDurationSeconds (*int32, 900–43200,
default 3600). nil = unattributed, the default.
- PlatformStatus.SessionRoleArn carries the provisioned session role ARN.
- Each operator string is reused verbatim as BOTH an allowed STS SourceIdentity
and an impersonate resourceName, so the same identity binds the AWS and
Kubernetes audit records.
─── Session role (platform_session_iam.go) ───────────────────────────────
- ensureSessionRole mints <env>-<platform>-session (same 64-char + FNV-1a hash
scheme as the tenant role). Trust: only the tenant IRSA role may assume, and
only while setting one of the Platform's operators as SourceIdentity: Action
[sts:AssumeRole, sts:SetSourceIdentity] with a StringEquals condition on
sts:SourceIdentity. Permissions: the tenant baseline policy (Bedrock invoke)
and the same permissions boundary; never broad sts:AssumeRole.
- Idempotent: GetRole→Create on miss; on hit it refreshes the trust policy (the
operator list can change) and converges the baseline attachment.
- Kill-switch parity: when the Platform is suspended the session role's
baseline is detached, so a suspended tenant can't keep invoking Bedrock
through it.
- deleteIamRole/deleteSessionRole share a detachAndDeleteRole helper.
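The session-role trust policy described above could be rendered like this. It is a sketch in TypeScript for consistency with fab; the platform itself is Go, and the function name is hypothetical. The statement shape follows the description: one principal, two actions, and a StringEquals condition listing the allowed operators.

```typescript
/** Trust policy letting only the tenant role assume, and only as a listed operator. */
export function sessionRoleTrustPolicy(tenantRoleArn: string, operators: string[]) {
  if (operators.length === 0) {
    throw new Error("at least one operator is required");
  }
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { AWS: tenantRoleArn },
        Action: ["sts:AssumeRole", "sts:SetSourceIdentity"],
        // An array value matches if SourceIdentity equals any listed operator.
        Condition: { StringEquals: { "sts:SourceIdentity": operators } },
      },
    ],
  };
}
```

An assume-role call that omits `--source-identity`, or sets one outside the list, does not satisfy the condition and is denied.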
─── Impersonate RBAC (platform_rbac.go) ──────────────────────────────────
- ensureOperatorImpersonateRBAC creates a ClusterRole granting impersonate on
exactly the named operator users (never impersonate *) bound to the
tenant-runtime ServiceAccount, named <tenant-ns>-impersonate. fab's session
kubeconfig authenticates with that SA token while impersonating the operator,
so apiserver audit records impersonatedUser=<operator>. Cluster-scoped, so
reaped through the finalizer rather than OwnerReferences.
─── Wiring (platform_controller.go) ──────────────────────────────────────
- Reconcile provisions the pair when spec.attribution is set (after the tenant
IRSA role + SA exist), records status.SessionRoleArn, and tears the pair down
when attribution is removed. Finalizer cleans up both (no-ops when never
enabled). RBAC markers added for clusterroles/clusterrolebindings and (for
escalation prevention) impersonate on users.
─── Tests / codegen ──────────────────────────────────────────────────────
- 10 unit tests (session-role trust/baseline/duration/idempotency/suspend/delete
via fakeIAM; impersonate RBAC create/update/delete via the controller-runtime
fake client). Build, vet, golangci-lint, and the internal + api unit suites
pass locally; CI is green, including the envtest conformance suite.
- Deepcopy, the CRD (config + Helm chart), the RBAC role, and the CRD reference
doc are generated.
Pairs with nanohype/fab#30.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
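For illustration, the impersonate ClusterRole and its binding could be built like this. It is a TypeScript rendering of the manifests the description implies, not the Go code in platform_rbac.go. The function name is hypothetical, and the service account name is left as a parameter because its exact value is not stated.

```typescript
const RBAC_GROUP = "rbac.authorization.k8s.io";

/** ClusterRole + ClusterRoleBinding scoped to exactly the named operators. */
export function impersonateRbac(tenantNamespace: string, serviceAccount: string, operators: string[]) {
  const name = `${tenantNamespace}-impersonate`;
  return {
    clusterRole: {
      apiVersion: `${RBAC_GROUP}/v1`,
      kind: "ClusterRole",
      metadata: { name },
      rules: [
        {
          apiGroups: [""],
          resources: ["users"],
          verbs: ["impersonate"],
          // resourceNames restricts the grant; omitting it would allow any user.
          resourceNames: operators,
        },
      ],
    },
    clusterRoleBinding: {
      apiVersion: `${RBAC_GROUP}/v1`,
      kind: "ClusterRoleBinding",
      metadata: { name },
      roleRef: { apiGroup: RBAC_GROUP, kind: "ClusterRole", name },
      subjects: [{ kind: "ServiceAccount", name: serviceAccount, namespace: tenantNamespace }],
    },
  };
}
```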
Draft: the per-session human-attribution hook. Closes the "every agent action collapses to one anonymous IRSA role" gap so an evidence engine can bind actions to a named human. Marked draft because it needs companion platform IAM/RBAC (below) stood up before it does anything.
What
`FAB_OPERATOR=<human>` + `FAB_SESSION_ROLE_ARN` → the in-pod session, before the agent loop:
- Assumes the session role carrying the operator as STS `SourceIdentity` (via the `aws` CLI already in the image; zero new deps), exports the temp creds, and drops the pod's IRSA web-identity vars. Every AWS call (Bedrock `InvokeModel` and any `aws` the Bash tool runs) is recorded in CloudTrail under `SourceIdentity=<operator>`.
- Writes a kubeconfig that authenticates with the SA token but impersonates the operator, so apiserver audit records `impersonatedUser=<operator>`.
The operator is validated STS-clean up front so the same string binds both streams; creds + kubeconfig are computed before any env mutation (no half-attributed env); fail-closed if setup fails. Inert for every other runtime and when `FAB_OPERATOR` is unset.
Crossbearing consumes it
CloudTrail `SourceIdentity` → `AttrSTSSourceIdentity`; K8s `impersonatedUser` → `AttrK8sImpersonation`. Both genuinely extracted and bound by the engine (verified against its ingesters during review). This is the "after" state of the divergence demo: corroborated agent actions attribute to a named human instead of a faceless role.
Needs (before merge): platform IAM/RBAC
Documented in `docs/attribution.md`:
- A session-role trust policy allowing `sts:AssumeRole` + `sts:SetSourceIdentity`, with `bedrock:InvokeModel` in its permission policy.
- `impersonate` RBAC for the session SA, scoped to the operator user(s).
Notes
- Limitations: process-wide operator (… `AgentSandbox.spec`), credential-TTL hard cliff (fail-safe), success-only records.
🤖 Generated with Claude Code
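To make the two audit fields in the PR description concrete, here is a hypothetical reader for both record types. It is not Crossbearing's ingester; the function names are invented for illustration. The field paths are the standard ones: CloudTrail puts the value at `userIdentity.sessionContext.sourceIdentity`, and Kubernetes audit events put it at `impersonatedUser.username`.

```typescript
// Only the fields this sketch reads; real records carry many more.
interface CloudTrailRecord {
  userIdentity?: { sessionContext?: { sourceIdentity?: string } };
}

interface K8sAuditEvent {
  user?: { username?: string };
  impersonatedUser?: { username?: string };
}

export function operatorFromCloudTrail(record: CloudTrailRecord): string | undefined {
  return record.userIdentity?.sessionContext?.sourceIdentity;
}

export function operatorFromK8sAudit(event: K8sAuditEvent): string | undefined {
  return event.impersonatedUser?.username;
}

/** True only when both streams name the same operator, byte for byte. */
export function sameOperator(ct: CloudTrailRecord, k8s: K8sAuditEvent): boolean {
  const a = operatorFromCloudTrail(ct);
  return a !== undefined && a === operatorFromK8sAudit(k8s);
}
```

An exact string comparison is used on purpose: the operator is meant to be the same validated string in both streams, so any normalization here would hide a mismatch.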