Merged
18 changes: 15 additions & 3 deletions .github/workflows/ci.yaml
@@ -15,11 +15,23 @@ jobs:
with:
path: guardian

- name: Checkout seam-core (replace dep)
- name: Checkout seam (replace dep)
uses: actions/checkout@v4
with:
repository: ontai-dev/seam-core
path: seam-core
repository: ontai-dev/seam
path: seam

- name: Checkout seam-sdk (replace dep)
uses: actions/checkout@v4
with:
repository: ontai-dev/seam-sdk
path: seam-sdk

- name: Checkout platform (replace dep)
uses: actions/checkout@v4
with:
repository: ontai-dev/platform
path: platform

- name: Checkout conductor (replace dep)
uses: actions/checkout@v4
15 changes: 6 additions & 9 deletions CLAUDE.md
@@ -3,23 +3,20 @@

### Schema authority
Primary: docs/guardian-schema.md
Supporting: ~/ontai/conductor/docs/conductor-schema.md (Conductor capabilities and job protocol)
Supporting: ~/ontai/seam-core/docs/seam-core-schema.md (InfrastructureRunnerConfig type definition; Decision G)
Supporting: ~/ontai/conductor/docs/conductor-schema.md (conductor capabilities and job protocol)
Supporting: ~/ontai/seam/docs/seam-schema.md (RunnerConfig type definition)

### Invariants
CS-INV-001 -- The admission webhook is the enforcement mechanism. Policy without enforcement is decoration. The webhook must be operational before any other operator is considered enabled. (root INV-003)
CS-INV-002 -- CNPG is a guardian dependency only. No other component references or accesses the CNPG cluster in seam-system. (root INV-016)
CS-INV-003 -- The two-phase boot (CRD-only to database-backed) is a named, explicit transition. It is never a silent fallback.
CS-INV-003 -- The two-phase boot (migration and bootstrap to full enforcement) is a named, explicit transition. It is never a silent fallback.
CS-INV-004 -- The bootstrap RBAC window has a definite close: when the admission webhook becomes operational. The window is documented, bounded, and reconciled on startup. (root INV-020)
CS-INV-005 -- provisioned=true on RBACProfile is set exclusively by this operator. No other controller writes to RBACProfile status.
CS-INV-006 -- Leader election required. Admission webhook requires a stable leader.
CS-INV-006 -- Leader election required. The admission webhook requires a stable leader.
CS-INV-007 -- Third-party RBAC ownership is wrapping, not replacement. Helm upgrades must remain safe. Drift is surfaced, not silently overwritten.
INV-005 -- ClusterAssignment references, never owns, cluster/pack/security resources.
INV-015 -- Deletion of TalosCluster never triggers physical cluster destruction. ClusterReset is the only path to cluster destruction.
CS-INV-008 -- Three-layer RBAC hierarchy is the authoritative governance model (guardian-schema.md §19). Layer 1: management-policy + management-maximum in seam-system, compiler-authored. Layer 2: cluster-policy + cluster-maximum in seam-tenant-{clusterName}, guardian-authored by ClusterRBACPolicyReconciler. Layer 3: component RBACProfiles only -- no per-component RBACPolicy, no per-component PermissionSet. RBACPolicy is never human-authored.
CS-INV-009 -- Cluster-policy validation against management-maximum happens at ClusterRBACPolicyReconciler creation time (option a). Never at RBACProfile admission time. This is the deadlock-prevention invariant -- admission-time management ceiling checks are prohibited.
CS-INV-010 -- security-system namespace does not exist in this platform. All guardian-owned objects that previously referenced security-system live in seam-system.
CS-INV-009 -- Cluster-policy validation against management-maximum happens at ClusterRBACPolicyReconciler creation time. Never at RBACProfile admission time. This is the deadlock-prevention invariant -- admission-time management ceiling checks are prohibited.

### Session protocol additions
Step 4a -- Read guardian-design.md in this repository.
Step 4a -- Read guardian-design.md in this repository before any implementation work.
Step 4b -- Before implementing any EPG computation change, trace its impact on PermissionSnapshot generation and target cluster delivery. Document the impact in PROGRESS.md before proceeding.
163 changes: 86 additions & 77 deletions README.md
@@ -1,123 +1,132 @@
# guardian

**Seam Security Plane operator**
**API Group:** `security.ontai.dev`
**Image:** `registry.ontai.dev/ontai-dev/guardian:<semver>`
RBAC governance operator for the ONT platform. Guardian owns all RBAC across every
cluster. No operator, application, or human provisions Kubernetes RBAC outside of
guardian.

API group: `guardian.ontai.dev`

---

## What this repository is
## CRD types

`guardian` is the intelligent operator in the Seam platform. It is the only ONT
operator with genuine in-process logic beyond the thin reconciler pattern.
All types under `guardian.ontai.dev/v1alpha1`.

Guardian owns all RBAC on every cluster. No component provisions its own RBAC.
Guardian's admission webhook gates every RBAC resource written to any cluster in
the Seam stack.
| Kind | Short name | Scope | Description |
|---------------------------|------------|------------|------------------------------------------------------------------------------------|
| RBACPolicy | rp | Namespaced | Governing policy constraining what RBACProfiles in its scope may declare |
| RBACProfile | rbp | Namespaced | Per-component per-tenant permission declaration; gates operator enablement |
| PermissionSet | ps | Namespaced | Named, reusable collection of permission rules; used as governance ceiling |
| PermissionSnapshot | psn | Namespaced | Computed, versioned, signed EPG for a specific target cluster; never hand-authored |
| PermissionSnapshotReceipt | psr | Namespaced | Target-cluster acknowledgement record; written by conductor in agent mode |
| IdentityProvider | idp | Namespaced | External identity source declaration (OIDC, PKI, token) |
| IdentityBinding | ib | Namespaced | Maps an external identity to an ONT permission principal |

---

## CRDs
## Architecture

| Kind | API Group | Role |
|---|---|---|
| `RBACProfile` | `security.ontai.dev` | Declares RBAC policy intent for a tenant or operator |
| `RBACPolicy` | `security.ontai.dev` | Concrete policy rule set applied to a principal set |
| `PermissionSet` | `security.ontai.dev` | Compiled effective permissions for a principal |
| `PermissionSnapshot` | `security.ontai.dev` | Signed point-in-time permission record for target delivery |
| `PermissionSnapshotReceipt` | `security.ontai.dev` | Acknowledgement of snapshot delivery on a target cluster |
| `IdentityProvider` | `security.ontai.dev` | OIDC or LDAP identity source configuration |
| `IdentityBinding` | `security.ontai.dev` | Binding between a domain identity and a Kubernetes principal |
| `InfrastructureLineageIndex` | `security.ontai.dev` | Sealed causal chain index (one per root declaration) |
| `SeamMembership` | `security.ontai.dev` | Domain membership record for a principal on a target cluster |
### Management cluster role

---

## Architecture
Guardian runs as a single Deployment on the management cluster with `GUARDIAN_ROLE=management`.
It is provisioned exclusively by the compiler enable bundle. No human, operator, or pipeline
stamps `role=management` on a Guardian Deployment.

Guardian has two operating modes.
Responsibilities in this role:

**Management cluster (role=management):**
- Computes the Effective Permission Graph (EPG) in-process from RBACProfile and RBACPolicy objects.
- Generates signed PermissionSnapshot CRs for delivery to target clusters.
- Runs a CNPG-backed persistent store for EPG state, audit events, and identity resolution logs.
- Deploys first. Its `RBACProfile` reaching `provisioned=true` is the gate that unblocks all other operators.
- EPG computation from provisioned RBACProfiles
- PermissionSnapshot generation (one per target cluster)
- Policy validation and admission webhook enforcement on the management cluster
- ClusterRBACPolicyReconciler: creates `cluster-policy` and `cluster-maximum` per TalosCluster
- APIGroupSweepController: extends `management-maximum` as new CRD API groups are discovered
- PermissionService gRPC: the authoritative authorization decision point for the fleet
- AuditSinkReconciler: receives forwarded audit events from federated tenant Guardians
- CNPG-backed persistence for EPG and audit state

**Target cluster (Conductor agent on target, Guardian runner on admission):**
- Runs the local admission webhook that intercepts all RBAC resources.
- Enforces the `ontai.dev/rbac-owner=guardian` annotation on every RBAC object.
- Serves the local PermissionService gRPC endpoint for authorization decisions.
Guardian deploys first. No other operator is considered enabled until its RBACProfile
reaches `provisioned=true`. INV-003.

---
### Target cluster admission via conductor

## Two-phase boot
Guardian does not deploy a separate agent on target clusters. Conductor in agent mode
(`CONDUCTOR_ROLE=tenant`) hosts the security plane on each target cluster:

Guardian boots in two phases:
- Admission webhook at `/validate/rbac-ownership`: rejects unannotated RBAC resources
- PermissionSnapshotReceipt: acknowledges the current signed snapshot
- Local PermissionService gRPC: serves authorization decisions from the locally acknowledged snapshot

1. **CRD-only phase** (before CNPG is reachable): reconcilers start with an in-memory
EPG stub. The bootstrap RBAC window is open. Guardian emits a named `BootstrapWindow`
condition during this phase.
The conductor local PermissionService means target cluster authorization decisions are
fully operational even when the management cluster is temporarily unreachable.
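The ownership rule enforced at `/validate/rbac-ownership` can be illustrated with a minimal Go sketch. It assumes the rule reduces to checking for the `ontai.dev/rbac-owner=guardian` annotation named in this README. The function name `validateRBACOwnership` and the error messages are hypothetical and are not taken from the conductor or guardian code.

```go
package main

import "fmt"

// The annotation key and value come from the README text.
const (
	rbacOwnerAnnotation = "ontai.dev/rbac-owner"
	rbacOwnerGuardian   = "guardian"
)

// validateRBACOwnership is a hypothetical helper. It returns an error when the
// annotations of an RBAC resource do not mark guardian as the owner, which is
// the condition under which the webhook would reject the resource.
func validateRBACOwnership(annotations map[string]string) error {
	// Reading from a nil map is safe in Go and reports the key as absent.
	owner, ok := annotations[rbacOwnerAnnotation]
	if !ok {
		return fmt.Errorf("missing annotation %s", rbacOwnerAnnotation)
	}
	if owner != rbacOwnerGuardian {
		return fmt.Errorf("annotation %s=%q, want %q",
			rbacOwnerAnnotation, owner, rbacOwnerGuardian)
	}
	return nil
}
```

A real admission handler would decode the incoming object from the AdmissionReview request and pass its metadata annotations to a check of this kind. The sketch isolates only the decision.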

2. **Database-backed phase** (after CNPG connection is confirmed): full EPG computation
starts. The `BootstrapWindow` condition closes permanently when the admission webhook
becomes operational. This transition is named and never a silent fallback.
### Optional tenant Guardian

See `guardian-design.md` for the full boot sequence and `docs/guardian-schema.md` for
the API contract.
Tenants may optionally deploy a Guardian ClusterPack (`GUARDIAN_ROLE=tenant`) via the
Dispatcher pack delivery flow. Role=tenant Guardian is sovereign by default: it connects
to a tenant-local CNPG instance and does not forward audit events to the management
Guardian unless `GUARDIAN_AUDIT_FORWARD=true` is explicitly set. Platform never knows
whether a tenant has deployed Guardian and never depends on its presence.
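The sovereign-by-default behaviour can be sketched as a small configuration loader. This is an illustration only: the `Config` struct and `loadConfig` function are hypothetical, and treating `management` and `tenant` as the only valid values of `GUARDIAN_ROLE` is an assumption based on the two roles this README names.

```go
package main

import "fmt"

// Config is a hypothetical holder for the two settings named in the README.
type Config struct {
	Role         string // value of GUARDIAN_ROLE
	AuditForward bool   // true only when forwarding is explicitly enabled
}

// loadConfig reads settings through getenv (for example os.Getenv), so the
// logic can be exercised without touching the process environment.
func loadConfig(getenv func(string) string) (Config, error) {
	role := getenv("GUARDIAN_ROLE")
	// Assumption: only the two roles mentioned in the README are accepted.
	if role != "management" && role != "tenant" {
		return Config{}, fmt.Errorf("GUARDIAN_ROLE must be management or tenant, got %q", role)
	}
	// Forwarding is off unless a tenant Guardian sets the flag to exactly "true".
	forward := role == "tenant" && getenv("GUARDIAN_AUDIT_FORWARD") == "true"
	return Config{Role: role, AuditForward: forward}, nil
}
```

Calling `loadConfig(os.Getenv)` at startup would yield `AuditForward == false` for any tenant Guardian that leaves `GUARDIAN_AUDIT_FORWARD` unset, matching the default described above.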

---

## Building
## Two-phase boot

```sh
go build ./cmd/guardian
```
Guardian on the management cluster starts after the CNPG operator and CNPG Cluster CR
are pre-provisioned by the compiler enable bundle (phase 0).

The binary is built into a distroless container image:
**Phase 1 - Migration and bootstrap:**

```sh
docker build -t registry.ontai.dev/ontai-dev/guardian:<semver> .
```
1. Startup migration runner connects to CNPG and applies all pending schema migrations.
If CNPG is unreachable, Guardian holds in degraded state until CNPG becomes reachable.
This is the only blocking gate before controller registration.
2. Bootstrap annotation sweep stamps `ontai.dev/rbac-owner=guardian` on all pre-existing
RBAC resources across non-exempt namespaces (audit mode during the sweep).
3. Guardian creates baseline PermissionSet, RBACPolicy, and RBACProfile for each known
third-party component whose namespace is present.

**Phase 2 - Full enforcement:**

4. All role-gated controllers register. The admission webhook becomes operational.
The bootstrap RBAC window closes permanently. INV-020.
5. BootstrapController monitors all RBACProfiles. Once all profiles reach `provisioned=true`,
the webhook advances from `ObserveOnly` to `Enforcing`. Any RBAC resource created or
updated without `ontai.dev/rbac-owner=guardian` is rejected at admission.

The two-phase transition is a named, explicit event. It is never a silent fallback.
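Step 5 can be pictured as a small state machine. The Go sketch below is a simplification under stated assumptions: the mode names `ObserveOnly` and `Enforcing` come from the text above, while the `WebhookMode` type, the `nextMode` and `admit` functions, the rule that an empty profile set does not advance the mode, and the rule that `Enforcing` never regresses are all hypothetical.

```go
package main

// WebhookMode mirrors the two webhook modes named in the README.
type WebhookMode string

const (
	ObserveOnly WebhookMode = "ObserveOnly"
	Enforcing   WebhookMode = "Enforcing"
)

// nextMode is a hypothetical transition function. Each entry in provisioned
// stands for one RBACProfile's provisioned status.
func nextMode(current WebhookMode, provisioned []bool) WebhookMode {
	if current == Enforcing {
		return Enforcing // assumption: the transition is one-way
	}
	if len(provisioned) == 0 {
		return current // assumption: no profiles yet means no advance
	}
	for _, p := range provisioned {
		if !p {
			return ObserveOnly
		}
	}
	return Enforcing
}

// admit reports whether an RBAC resource with the given annotations passes.
// Assumption: ObserveOnly admits everything and only observes.
func admit(mode WebhookMode, annotations map[string]string) bool {
	if mode != Enforcing {
		return true
	}
	return annotations["ontai.dev/rbac-owner"] == "guardian"
}
```

Keeping the transition in one pure function makes the named, explicit event easy to log and test at the single place where the mode changes.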

---

## Testing
## Build

```sh
go test ./test/unit/...
```
make build
make docker-build
make docker-push
```

---

## Schema and design reference
## Test

- `docs/guardian-schema.md` - API contract, field definitions, status conditions
- `guardian-design.md` - Implementation architecture and reconciler design
```
make test
make e2e
```

---
E2e specs live under `test/e2e/` and skip automatically when `MGMT_KUBECONFIG` is absent.
Every skipped spec references the exact backlog item ID required for promotion to live.

## Status
---

Alpha. Deployed and tested on management cluster (ccs-mgmt).
Tenant cluster onboarding is not yet verified end to end.
See [docs/guardian-schema.md](./docs/guardian-schema.md)
for current capability and known gaps.
## Schema

CRDs are deployed and reconciling on the live management cluster.
The schema specification is published at:
https://schema.ontai.dev/v1alpha1/
The authoritative field reference is `docs/guardian-schema.md`.

## Contributing
---

Read [CONTRIBUTING.md](./CONTRIBUTING.md) before opening a pull
request. Every new reconciliation behavior requires a written
specification and senior engineer sign-off before any code is
written.
## Issues

File issues at https://github.com/ontai-dev/guardian/issues.
For security issues contact security@ontai.dev directly.
https://github.com/ontai-dev/guardian/issues

---

*guardian - Seam Security Plane*
*Apache License, Version 2.0*
guardian - Seam RBAC Governance / Apache License, Version 2.0
8 changes: 4 additions & 4 deletions api/v1alpha1/groupversion_info.go
@@ -1,10 +1,10 @@
// Package v1alpha1 contains API types for the security.ontai.dev/v1alpha1 API group.
// Package v1alpha1 contains API types for the guardian.ontai.dev/v1alpha1 API group.
//
// This package is the Kubernetes API contract for guardian. All CRD types are
// registered here. Breaking changes require a version bump to v1alpha2 or v1beta1
// and coordination with all operators that reference these types.
//
// +groupName=security.ontai.dev
// +groupName=guardian.ontai.dev
// +kubebuilder:object:generate=true
package v1alpha1

@@ -15,8 +15,8 @@ import (

var (
// GroupVersion is the group and version for all types in this package.
// API group: security.ontai.dev. INV-008 this value is ground truth.
GroupVersion = schema.GroupVersion{Group: "security.ontai.dev", Version: "v1alpha1"}
// API group: guardian.ontai.dev. INV-008 -- this value is ground truth.
GroupVersion = schema.GroupVersion{Group: "guardian.ontai.dev", Version: "v1alpha1"}

// SchemeBuilder is used to add Go types to the Kubernetes runtime scheme.
// All CRD types in this package register via this builder.