From 572fd50ef3af9ab989a6170eafa6cf27999750a6 Mon Sep 17 00:00:00 2001 From: Chris Roadfeldt Date: Tue, 16 Jun 2026 16:44:31 -0500 Subject: [PATCH] spec: integrations (RHDH, OPA, Kubernetes) RHDH, OPA, and Kubernetes-compatibility integration specs. Signed-off-by: Chris Roadfeldt --- .../dcm-opa-integration-spec.md | 582 ++++++++++++++++ .../dcm-rhdh-integration-spec.md | 623 ++++++++++++++++++ .../kubernetes-compatibility.md | 443 +++++++++++++ 3 files changed, 1648 insertions(+) create mode 100644 docs/specifications/dcm-opa-integration-spec.md create mode 100644 docs/specifications/dcm-rhdh-integration-spec.md create mode 100644 docs/specifications/kubernetes-compatibility.md diff --git a/docs/specifications/dcm-opa-integration-spec.md b/docs/specifications/dcm-opa-integration-spec.md new file mode 100644 index 0000000..c3f791a --- /dev/null +++ b/docs/specifications/dcm-opa-integration-spec.md @@ -0,0 +1,582 @@ +# DCM OPA Integration Specification + +**Document Status:** 📋 Draft — Ready for Implementation Feedback +**Document Type:** Integration Specification + + +> **AEP Alignment:** DCM API endpoints referenced in this spec follow [AEP](https://aep.dev) conventions. `resource_type` accepts FQN string or Registry UUID — DCM resolves internally. See `schemas/openapi/dcm-consumer-api.yaml` and `dcm-admin-api.yaml`. + + +> **📋 Draft** +> +> This specification has been promoted from Work in Progress to Draft status. All questions resolved. All 7 policy types validated with working Rego examples. OPA/Rego confirmed as complete reference implementation. It is ready for implementation feedback but has not yet been formally reviewed for final release. +> +> This specification defines the OPA integration contract for DCM External Policy Evaluators. It is published to share design direction and invite feedback. Do not build production integrations against this specification until it has been formally reviewed for final release.
+ +**Version:** 0.1.0-draft +**Status:** Draft — Ready for implementation feedback +**Document Type:** Technical Specification +**Related Documents:** [Foundational Abstractions](https://github.com/croadfeldt/udlm/blob/main/foundations/foundations.md) | [Policy Profiles](../../architecture/governance-enforcement/policy-profiles.md) | [Control Plane Components](../../architecture/control-plane/components.md) | [DCM Operator Interface Specification](dcm-operator-interface-spec.md) + +--- + +## Abstract + +This specification defines how Open Policy Agent (OPA) integrates with the DCM Policy Engine as the reference implementation for OPA-based policy evaluations. It defines the DCM payload schema as an OPA input document, the expected decision schema as OPA output, the built-in functions DCM provides to Rego policies, and the test harness contract for validating policies before activation. + +OPA is not required to implement DCM — any OPA-based policy evaluation can implement DCM's policy contract. However, OPA with Rego is the recommended reference implementation, and this specification enables implementors and integrators to build standards-compliant DCM policy engines. + +--- + +## 1. Introduction + +### 1.1 The Policy Engine Contract + +DCM's Policy Engine evaluates policies at multiple points in the request lifecycle. The engine receives a payload, evaluates all active matching policies, and accumulates mutations. The OPA integration maps this contract to Rego evaluation. + +DCM policy types include: +- **GateKeeper** — approve or reject; output is a decision (allow/deny + reason) +- **Validation** — verify correctness; output is a validation result (pass/fail + details) +- **Transformation** — enrich or modify; output is a set of field mutations +- **Recovery** — respond to failure/ambiguity; output is a recovery action +- **Orchestration Flow** — coordinate pipeline steps; output is a flow directive + +All policy types share the same OPA input schema. 
The output schema differs per type. + +### 1.2 OPA-based policy evaluation + +An OPA-based policy evaluation executes OPA Rego bundles. DCM dispatches the policy input document to the OPA instance and receives the decision document. The OPA instance may be: +- Embedded within DCM (the reference implementation) +- A sidecar OPA instance (co-located with DCM) +- A remote OPA instance (requires network call; latency considerations apply) + +--- + +## 2. Input Schema — DCM Payload as OPA Document + +Every OPA policy evaluation receives the following input document: + +```rego +# input document structure +input := { + # The current payload being evaluated + "payload": { + "type": "request.initiated", # payload type from the vocabulary + "entity_uuid": "...", + "resource_type": "Compute.VirtualMachine", + "version": "2.1.0", + "fields": { + "cpu_count": { + "value": 4, + "provenance": { "origin": {...}, "modifications": [...] } + } + # ... all assembled fields with provenance + } + }, + + # The requesting actor context + "actor": { + "uuid": "...", + "type": "human", # human | service_account | system + "tenant_uuid": "...", + "roles": ["developer"], + "groups": ["payments-team", "eu-west-users"], + "mfa_verified": true, + "auth_level": "oidc_mfa" + }, + + # The active deployment governance + "deployment": { + "posture": "prod", + "compliance_domains": ["hipaa", "gdpr"], + "recovery_posture": "notify-and-wait", + "profile_uuid": "..." 
+ }, + + # Entity context (null for new requests) + "entity": { + "uuid": "...", + "lifecycle_state": "OPERATIONAL", + "ownership_model": "whole_allocation", + "owned_by_tenant_uuid": "...", + "relationship_count": 3, + "drift_status": "clean" + }, + + # Provider context (null before placement) + "provider": { + "uuid": "...", + "sovereignty_declaration": {...}, + "trust_score": 94, + "capacity_confidence": "high" + }, + + # DCM built-in data (resolved by DCM before OPA evaluation) + "dcm": { + "tenant": { + "uuid": "...", + "display_name": "Payments Platform", + "active_entity_count": { "Compute.VirtualMachine": 47 }, + "compliance_overlays": ["hipaa"] + }, + "cost_estimate": { + "per_hour": 0.32, + "confidence": "high" + } + } +} +``` + +--- + +## 3. Output Schema — OPA Decision Documents + +### 3.1 GateKeeper Output + +```rego +package dcm.gatekeeper.vm_size_limits + +import future.keywords + +# Main decision +allow if { + input.payload.fields.cpu_count.value <= max_cpu +} + +deny contains reason if { + input.payload.fields.cpu_count.value > max_cpu + reason := sprintf("cpu_count %d exceeds maximum %d for tenant %s", + [input.payload.fields.cpu_count.value, max_cpu, input.actor.tenant_uuid]) +} + +# DCM reads the deny set; empty = allow +max_cpu := 32 +``` + +DCM output contract: +```json +{ + "allow": true, + "deny": [], + "warnings": [], + "policy_uuid": "...", + "evaluated_at": "..." 
+} +``` + +### 3.2 Transformation Output + +```rego +package dcm.transformation.inject_monitoring + +import future.keywords + +mutations contains mutation if { + input.payload.type == "request.layers_assembled" + not input.payload.fields.monitoring_endpoint + mutation := { + "field": "monitoring_endpoint", + "value": concat(".", ["https://metrics.internal", input.deployment.posture, "example.com"]), + "source_type": "policy", + "operation_type": "enrichment", + "reason": "Standard monitoring endpoint injection" + } +} +``` + +DCM output contract: +```json +{ + "mutations": [ + { + "field": "monitoring_endpoint", + "value": "https://metrics.internal.prod.example.com", + "source_type": "policy", + "operation_type": "enrichment", + "reason": "Standard monitoring endpoint injection" + } + ], + "policy_uuid": "..." +} +``` + +### 3.3 Recovery Policy Output + +```rego +package dcm.recovery.discard_on_timeout + +import future.keywords + +action := "DISCARD_AND_REQUEUE" if { + input.payload.type == "recovery.timeout_fired" + input.entity.lifecycle_state == "TIMEOUT_PENDING" +} +``` + +DCM output contract: +```json +{ + "action": "DISCARD_AND_REQUEUE", + "action_parameters": { "requeue_delay": "PT0S" }, + "policy_uuid": "..." +} +``` + +--- + +## 4. 
DCM Built-in Functions for Rego + +DCM provides built-in functions callable from Rego policies: + +```rego +# Entity relationship graph queries +dcm.entity.relationships(entity_uuid) + # Returns: array of relationship records for the entity + +dcm.entity.has_relationship(entity_uuid, relationship_type) + # Returns: bool + +dcm.entity.stakeholder_count(entity_uuid, min_stake_strength) + # Returns: int + +# Information Provider data +dcm.entity.field_confidence(entity_uuid, field_path) + # Returns: { band, score, authority_level } + +# Sovereignty checks +dcm.sovereignty.compatible(entity_uuid, provider_uuid) + # Returns: bool + +dcm.sovereignty.violates(entity_uuid, data_residency_requirement) + # Returns: bool + +# Cost queries +dcm.cost.estimate(catalog_item_uuid, fields) + # Returns: { per_hour, currency, confidence } + +# Tenant quota queries +dcm.tenant.active_count(tenant_uuid, resource_type) + # Returns: int + +dcm.tenant.has_authorization(granting_tenant_uuid, consuming_tenant_uuid, resource_type) + # Returns: bool +``` + +--- + +## 5. Policy Bundle Structure + +OPA policies for DCM are packaged as bundles: + +``` +dcm-policy-bundle/ +├── .manifest +│ { +│ "roots": ["dcm"], +│ "metadata": { +│ "dcm_policy_type": "gatekeeper", +│ "resource_types": ["Compute.VirtualMachine"], +│ "domain": "tenant", +│ "handle": "org/policies/vm-size-limits", +│ "version": "1.0.0" +│ } +│ } +├── dcm/ +│ └── gatekeeper/ +│ └── vm_size_limits/ +│ └── policy.rego +└── tests/ + └── vm_size_limits_test.rego +``` + +--- + +## 6. 
Test Harness + +DCM provides a test harness that policy authors use to validate policies against sample payloads before activation: + +``` +POST /api/v1/admin/policies/test + +{ + "policy_bundle": "", + "test_cases": [ + { + "description": "VM within CPU limit should be allowed", + "input": { + "payload": { "type": "request.initiated", "fields": { "cpu_count": { "value": 4 } } }, + "actor": { "roles": ["developer"] }, + "deployment": { "posture": "prod" } + }, + "expected_output": { "allow": true, "deny": [] } + } + ] +} +``` + +The test harness is also used during shadow mode — DCM runs the policy against real traffic and compares actual output to expected output before the policy activates. + +--- + +## 7. Policy Shadow Mode with OPA + +When a policy is in `proposed` status, DCM evaluates it in shadow mode: + +1. Policy bundle loaded into a shadow OPA instance +2. Every real request payload is evaluated by both active policies AND shadow policies +3. Shadow outputs recorded in the Validation Store (not applied to requests) +4. Policy authors review shadow results via the Admin API or Flow GUI +5. On approval (no adverse results): policy status → `active` + + +--- + +## 8. Policy Model Validation — All Seven Types + +This section validates that OPA/Rego can express all seven DCM policy types and both levels of the orchestration model. Each type is shown with a working Rego example and an assessment. 
+ +### 8.1 GateKeeper + +```rego +package dcm.gatekeeper.vm_size_limits + +import future.keywords + +allow if { + input.payload.type == "request.layers_assembled" + input.payload.fields.cpu_count.value <= 32 +} + +deny contains reason if { + input.payload.type == "request.layers_assembled" + input.payload.fields.cpu_count.value > 32 + reason := sprintf("cpu_count %d exceeds maximum 32", + [input.payload.fields.cpu_count.value]) +} + +field_locks contains lock if { + input.deployment.compliance_domains[_] == "hipaa" + lock := {"field": "fields.patient_id", "lock_type": "immutable"} +} +``` +**Assessment:** Clean. Set-based deny with reasons, allow rules, field locks as set output. + +### 8.2 Validation + +```rego +package dcm.validation.memory_alignment + +import future.keywords + +field_results contains result if { + input.payload.fields.memory_gb.value % 2 != 0 + result := { + "field": "fields.memory_gb", + "result": "invalid", + "message": "memory_gb must be a multiple of 2" + } +} + +result := "pass" if count(field_results) == 0 +result := "fail" if count(field_results) > 0 +``` +**Assessment:** Clean. Field results are collected by a partial set rule. + +### 8.3 Transformation + +```rego +package dcm.transformation.inject_monitoring + +import future.keywords + +mutations contains mutation if { + input.payload.type == "request.layers_assembled" + not input.payload.fields.monitoring_endpoint + mutation := { + "field": "fields.monitoring_endpoint", + "operation": "set", + "value": concat(".", ["https://metrics.internal", + input.deployment.posture, "example.com"]), + "reason": "Standard monitoring endpoint injection", + "source_type": "enrichment" + } +} +``` +**Assessment:** Clean. Multiple mutations as independent set members. 
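On the consuming side, a set of independent mutations still has to be folded into the payload. The following is a minimal sketch of how that might look, assuming the mutation record shape shown in 8.3 and the `value` plus `provenance.modifications` field layout from Section 2. The names `Mutation`, `FieldEntry`, and `applyMutations` are hypothetical and not part of any DCM API.

```typescript
// Hypothetical shapes, loosely modelled on the 8.3 mutation record
// and the Section 2 field layout. Not an actual DCM type definition.
interface Mutation {
  field: string; // e.g. "fields.monitoring_endpoint"
  operation: "set";
  value: unknown;
  reason: string;
  source_type: string;
}

interface FieldEntry {
  value: unknown;
  provenance: { modifications: Array<{ reason: string; source_type: string }> };
}

type Fields = Record<string, FieldEntry>;

// Applies each mutation in turn and returns a new field map.
// The input map is not modified; each write appends a provenance record.
function applyMutations(fields: Fields, mutations: Mutation[]): Fields {
  const result: Fields = { ...fields };
  for (const m of mutations) {
    const name = m.field.replace(/^fields\./, "");
    const existing = result[name];
    const previous = existing ? existing.provenance.modifications : [];
    result[name] = {
      value: m.value,
      provenance: {
        modifications: [...previous, { reason: m.reason, source_type: m.source_type }],
      },
    };
  }
  return result;
}
```

Returning a fresh map rather than mutating in place keeps the pre-policy payload available for audit comparison, which seems in keeping with the provenance model, though the real engine may handle this differently.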
+ +### 8.4 Recovery + +```rego +package dcm.recovery.timeout_response + +import future.keywords + +action := "NOTIFY_AND_WAIT" if { + input.payload.type == "recovery.timeout_fired" + input.deployment.posture in ["prod", "fsi", "sovereign"] +} + +action := "DRIFT_RECONCILE" if { + input.payload.type == "recovery.timeout_fired" + input.deployment.posture in ["minimal", "dev", "standard"] +} + +action_parameters := {"deadline": "PT4H", "on_deadline_exceeded": "ESCALATE"} + if action == "NOTIFY_AND_WAIT" +``` +**Assessment:** Clean. Conditional action based on trigger + context. + +### 8.5 Orchestration Flow (Named Workflow) + +```rego +package dcm.orchestration.request_lifecycle + +steps := [ + {"step": 1, "payload_type": "request.initiated", + "policy_handle": "system/orchestration/capture-intent", "on_fail": "halt"}, + {"step": 2, "payload_type": "request.intent_captured", + "policy_handle": "system/orchestration/assemble-layers", "on_fail": "halt"}, + {"step": 3, "payload_type": "request.layers_assembled", + "policy_handle": "system/orchestration/run-placement", "on_fail": "halt"}, + {"step": 4, "payload_type": "request.placement_complete", + "policy_handle": "system/orchestration/dispatch", "on_fail": "halt"} +] + +ordered := true +``` +**Assessment:** Clean. Step sequence as an array with `ordered: true` flag. GateKeeper and Transformation policies declared in separate packages fire on the same payload types independently — the Policy Engine coordinates both. 
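Because OPA only declares the step sequence, something outside OPA has to track progress and enforce the order. The sketch below shows one way a Policy Engine could do that for the `steps` array in 8.5 when `ordered` is true. The `FlowStep` interface and the `nextStep` and `mayFire` helpers are illustrative assumptions, not names taken from DCM.

```typescript
// Hypothetical mirror of one entry in the `steps` array from 8.5.
interface FlowStep {
  step: number;
  payload_type: string;
  policy_handle: string;
  on_fail: string;
}

// Returns the lowest-numbered step that has not completed yet,
// or undefined when every step is done.
function nextStep(steps: FlowStep[], completed: number[]): FlowStep | undefined {
  const ordered = steps.slice().sort((a, b) => a.step - b.step);
  for (const s of ordered) {
    if (completed.indexOf(s.step) === -1) {
      return s;
    }
  }
  return undefined;
}

// With ordered enforcement, a payload type may fire only when it
// matches the next pending step.
function mayFire(steps: FlowStep[], completed: number[], payloadType: string): boolean {
  const pending = nextStep(steps, completed);
  return pending !== undefined && pending.payload_type === payloadType;
}
```

For example, with no steps completed, `mayFire(steps, [], "request.initiated")` holds while `mayFire(steps, [], "request.layers_assembled")` does not; how `on_fail` is handled is left out of this sketch.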
+ +### 8.6 Governance Matrix Rule + +```rego +package dcm.governance_matrix.phi_federation + +import future.keywords + +decision := "DENY" if { + input.data.classification == "phi" + input.target.type == "dcm_peer" + not "hipaa" in input.target.accreditation_held +} + +decision := "ALLOW_WITH_CONDITIONS" if { + input.data.classification == "phi" + input.target.type == "dcm_peer" + "hipaa" in input.target.accreditation_held + input.target.trust_posture == "verified" +} + +field_permissions := { + "mode": "allowlist", + "paths": ["fields.resource_type", "fields.lifecycle_state"], + "on_blocked_field": "STRIP_FIELD" +} if decision == "ALLOW_WITH_CONDITIONS" + +enforcement := "hard" if decision == "DENY" +enforcement := "soft" if decision != "DENY" +``` +**Assessment:** Clean. Four-axis input maps directly to OPA's input document. Decision + field permissions + enforcement as structured output. + +### 8.7 Lifecycle Policy + +```rego +package dcm.lifecycle.required_dependency + +import future.keywords + +on_related_destroy := "cascade" if { + input.payload.type == "relationship.related_entity_destroying" + input.relationship.stake_strength == "required" +} + +on_related_destroy := "notify" if { + input.payload.type == "relationship.related_entity_destroying" + input.relationship.stake_strength == "preferred" +} + +propagation_depth := 1 +action_delay := "PT0S" +``` +**Assessment:** Clean. Relationship event conditions; action output. + +--- + +## 9. Three Things the Policy Engine Does That OPA Does Not + +OPA evaluates each package independently and returns results. The Policy Engine provides three coordination functions that OPA alone cannot: + +**1. Cross-policy ordered enforcement:** OPA produces the Orchestration Flow step sequence; the Policy Engine tracks which steps have fired and enforces ordering. Clean separation — OPA declares; Policy Engine enforces. + +**2. 
Hard enforcement composition:** OPA returns `enforcement: "hard"` as output metadata; the Policy Engine ensures hard DENY wins over all soft decisions. Clean — OPA produces the flag; Policy Engine applies the composition algorithm. + +**3. Domain precedence sequencing:** Multiple packages match the same payload type. The Policy Engine evaluates them in domain precedence order (system → platform → tenant → resource_type → entity) and composes results. Clean — each OPA package is stateless and independently evaluable; Policy Engine manages composition. + +**Conclusion:** OPA/Rego is a complete reference implementation for all seven DCM policy types and both levels of the orchestration model. No model gaps exist. + + +--- + +*Document maintained by the DCM Project. For questions or contributions see [GitHub](https://github.com/dcm-project).* + + +--- + +## Scoring Model — OPA/Rego Patterns + +### Operational GateKeeper Output Schema + +```rego +package dcm.gatekeeper.operational.cost_ceiling + +import future.keywords + +# Operational-class GateKeeper produces risk_score_contribution, not deny +# enforcement_class: operational is declared in policy YAML metadata + +risk_score_contribution contains result if { + input.payload.cost_estimate.per_month > 500 + result := { + "contribution": 35, + "label": "cost_ceiling_exceeded", + "reason": sprintf( + "Estimated monthly cost $%v exceeds Tenant ceiling $500", + [input.payload.cost_estimate.per_month] + ) + } +} + +# Operational GateKeepers can also produce hard deny for extreme values +deny contains reason if { + input.payload.cost_estimate.per_month > 10000 + reason := "Cost exceeds absolute maximum — manual review required before submission" +} +``` + +### Advisory Validation Output Schema + +```rego +package dcm.validation.advisory.cost_center + +import future.keywords + +# Advisory-class Validation produces completeness_contribution + warning +# output_class: advisory is declared in policy YAML metadata + +completeness_warnings contains warning if { + not input.payload.fields.cost_center + 
warning := { + "contribution": 10, + "warning_code": "recommended_field_absent", + "warning_message": "cost_center not provided — cost attribution will use Tenant default", + "field": "fields.cost_center" + } +} +``` + +### Validation — Structural vs Advisory in Same Package + +```rego +package dcm.validation.vm_fields + +import future.keywords + +# Structural validation (output_class: structural) +fail contains reason if { + not input.payload.fields.cpu_count + reason := { + "field": "fields.cpu_count", + "code": "required_field_absent", + "message": "cpu_count is required" + } +} + +# Advisory validation (output_class: advisory — separate policy) +# Never mix structural and advisory in the same policy artifact +``` + diff --git a/docs/specifications/dcm-rhdh-integration-spec.md b/docs/specifications/dcm-rhdh-integration-spec.md new file mode 100644 index 0000000..9217f47 --- /dev/null +++ b/docs/specifications/dcm-rhdh-integration-spec.md @@ -0,0 +1,623 @@ +# DCM Red Hat Developer Hub Integration Specification + +**Document Status:** 🔄 In Progress +**Document Type:** Specification — RHDH / Backstage Integration Architecture +**Related Documents:** [Consumer GUI Specification](dcm-consumer-gui-spec.md) | [Admin GUI Specification](dcm-admin-gui-spec.md) | [Provider GUI Specification](dcm-provider-gui-spec.md) | [Consumer API Specification](consumer-api-spec.md) | [Auth Providers](https://github.com/croadfeldt/udlm/blob/main/governance/auth-providers.md) | [Standards Catalog](https://github.com/croadfeldt/udlm/blob/main/reference/standards-catalog.md) + +> **Status:** Draft — Ready for implementation feedback +> +> This specification defines the complete integration between DCM and Red Hat Developer Hub (RHDH) or upstream Backstage. It covers plugin architecture, entity model, auth delegation, permission mapping, Software Template auto-generation, and deployment. + +--- + +## 1. 
Integration Architecture Overview + +### 1.1 Layering Model + +DCM and RHDH are separate systems that integrate at well-defined boundaries. DCM remains authoritative for all infrastructure state; RHDH provides the developer experience layer. + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ RHDH / Backstage │ +│ Software Catalog │ Scaffolder │ TechDocs │ Search │ +│ ─────────────────────────────────────────────────────────────│ +│ @dcm/plugin suite (Dynamic Plugins) │ +│ ├── Entity Provider ← pulls from DCM API │ +│ ├── Scaffolder Actions → pushes to DCM API │ +│ ├── Frontend Plugin ← reads DCM API via proxy │ +│ ├── Permission Policy ↔ DCM roles │ +│ └── Auth Bridge ↔ DCM Auth Provider (OIDC token exchange) │ +└──────────────────────────────┬──────────────────────────────────┘ + │ DCM Consumer API (HTTPS) + │ X-DCM-Tenant from RHDH group context + ▼ +┌─────────────────────────────────────────────────────────────────┐ +│ DCM Control Plane │ +│ Consumer API │ Policy Engine │ Scoring │ Providers │ +│ Stores: Intent, Requested, Realized, Discovered, Audit │ +└─────────────────────────────────────────────────────────────────┘ +``` + +**DCM is authoritative for:** resource state, policy decisions, audit trail, realized data, cost, drift detection. + +**RHDH is authoritative for:** developer experience, documentation, search index, Software Templates, organization/group model. 
+ +### 1.2 Plugin Packages + +The DCM RHDH integration is delivered as six npm packages, all loadable as RHDH Dynamic Plugins: + +| Package | Type | Purpose | +|---------|------|---------| +| `@dcm/backstage-plugin` | Frontend | Nav, pages, entity tabs, drawers | +| `@dcm/backstage-plugin-backend` | Backend | API proxy, SSE relay, auth middleware | +| `@dcm/backstage-plugin-catalog-backend` | Backend | Entity provider, catalog processor | +| `@dcm/backstage-plugin-scaffolder-backend` | Backend | Custom scaffolder actions | +| `@dcm/backstage-permission-policy` | Backend | DCM → Backstage permission bridge | +| `@dcm/backstage-plugin-auth-backend` | Backend | RHDH as DCM Auth Provider (optional) | + +--- + +## 2. Authentication and Token Flow + +### 2.1 RHDH as DCM Auth Provider + +The recommended pattern: configure RHDH (Keycloak/RHSSO) as the Auth Provider for both RHDH and DCM. DCM trusts OIDC tokens issued by the same IdP that RHDH uses. + +``` +User authenticates → RHDH (via Keycloak/RHSSO OIDC) + │ + RHDH issues Backstage session token + OIDC access token + │ + DCM plugin backend receives OIDC access token + │ + DCM plugin backend presents OIDC token to DCM Consumer API + (/api/v1/auth/token with grant_type: oidc_token_exchange) + │ + DCM issues its own session token (JWT with actor_uuid, roles, tenant_scope) + │ + DCM session token cached in RHDH backend (keyed by Backstage user entity ref) + │ + All subsequent DCM API calls use DCM session token +``` + +DCM is registered as an OIDC Auth Provider with the same issuer as RHDH's Keycloak: + +```yaml +# DCM Auth Provider registration +auth_provider_registration: + provider_type: auth_provider + auth_method: oidc + oidc_config: + issuer: https://keycloak.corp.com/realms/corporate + client_id: dcm-api + trust_level: authoritative + role_mapping: + group_role_map: + - external_group: dcm-consumers + dcm_role: consumer + - external_group: dcm-approvers + dcm_role: approver + - external_group: dcm-platform-admins + 
dcm_role: platform_admin +``` + +### 2.2 Token Lifetime and Refresh + +- RHDH session: governed by Keycloak session settings (typically PT8H) +- DCM session token: PT30M (prod profile) — refreshed transparently by RHDH backend plugin +- DCM plugin backend maintains a token cache: `backstage_user_ref → dcm_session_token` +- Token refresh: triggered when DCM session token is within PT5M of expiry + +### 2.3 Service Account Token for Entity Provider + +The `@dcm/backstage-plugin-catalog-backend` entity provider runs as a background service, not on behalf of a user. It uses a DCM service account: + +```yaml +# DCM service account for RHDH catalog entity provider +service_account: + handle: rhdh-catalog-provider + roles: [catalog_reader] # read-only: catalog items + realized entities + credential_type: api_key + rotation: P30D +``` + +The service account API key is stored as a Kubernetes Secret and mounted into the RHDH backend pod. + +### 2.4 Tenancy from RHDH Group Context + +The active RHDH namespace/group context maps to `X-DCM-Tenant`: + +```typescript +// In @dcm/backstage-plugin-backend — DCM API proxy middleware +const groupContext = request.headers['x-backstage-namespace'] || + userEntity.spec?.memberOf?.[0]; +const tenantUuid = await dcmTenantCache.resolveFromGroup(groupContext); +proxyRequest.headers['X-DCM-Tenant'] = tenantUuid; +``` + +Tenant UUID resolution: `@dcm/backstage-plugin-catalog-backend` maintains a `RHDH Group ref → DCM Tenant UUID` mapping, populated during entity sync. + +--- + +## 3. Entity Model + +### 3.1 Custom Entity Kinds + +DCM introduces two custom Backstage entity kinds: + +#### DCMService (catalog item) + +Represents a DCM service catalog item — something a user can request. 
+ +```yaml +apiVersion: dcm.io/v1alpha1 +kind: DCMService +metadata: + name: compute-vm-standard + namespace: dcm-catalog # shared namespace for all DCM catalog items + annotations: + dcm.io/catalog-item-uuid: "" + dcm.io/resource-type-fqn: "Compute.VirtualMachine" + backstage.io/techdocs-ref: url:/docs/compute-vm + tags: [compute, infrastructure, self-service] +spec: + type: dcm-service + lifecycle: production + owner: group:platform-team + providedBy: + providerHandle: k8s-operator-prod + providerType: service_provider + fieldSchema: + # JSON Schema for the request form — auto-populated from DCM catalog item + $ref: "dcm-api://catalog//schema" + costEstimate: + currency: USD + estimatedMonthly: 240 + billingDimensions: [cpu_count, memory_gb] + availability: + quotaRemaining: 4 # computed at sync time + sla: "99.9%" +``` + +#### DCMResource (realized entity) + +Represents a realized DCM resource — something that exists. + +```yaml +apiVersion: dcm.io/v1alpha1 +kind: DCMResource +metadata: + name: payments-api-server-01 + namespace: payments-team # namespace = DCM tenant + annotations: + dcm.io/entity-uuid: "" + dcm.io/resource-type-fqn: "Compute.VirtualMachine" + dcm.io/provider-uuid: "" + dcm.io/request-uuid: "" # the request that created this + backstage.io/techdocs-ref: url: +spec: + type: Compute.VirtualMachine + lifecycle: production + owner: group:payments-team + system: payments-platform + realizedFields: # key fields from Realized State + primary_ip: "10.42.0.105" + hostname: "payments-api-server-01.corp.internal" + cpu_count: 4 + memory_gb: 16 + os_family: rhel + lifecycleState: OPERATIONAL # DCM lifecycle state + ttlExpiresAt: "2026-09-15" + driftStatus: none # none | minor | moderate | significant | critical + providerHandle: k8s-operator-prod + dependsOn: + - dcmresource:payments-team/payments-db-01 +``` + +### 3.2 Entity Provider + +`@dcm/backstage-plugin-catalog-backend` implements an `EntityProvider` that: + +1. 
On startup: fetches all DCM catalog items → emits `DCMService` entities +2. On startup: fetches all realized entities for each tenant → emits `DCMResource` entities +3. On schedule (default: every PT5M): polls for changes, emits delta mutations +4. On `dcm:catalog:refresh` scaffolder action: triggers immediate refresh for specific entity + +```typescript +class DcmEntityProvider implements EntityProvider { + async refresh(logger: Logger): Promise<void> { + // Fetch catalog items + const catalogItems = await this.dcmApi.getCatalogItems(); + const serviceEntities = catalogItems.map(toDCMServiceEntity); + + // Fetch realized resources per tenant + const tenants = await this.dcmApi.getTenants(); // admin service account + const resourceEntities = (await Promise.all( + tenants.map(t => this.dcmApi.getResources(t.uuid)) + )).flat().map(toDCMResourceEntity); + + // applyMutation takes deferred entities, so each entity is wrapped as { entity, locationKey } + await this.connection.applyMutation({ + type: 'full', + entities: [...serviceEntities, ...resourceEntities].map(entity => ({ + entity, + locationKey: 'dcm-entity-provider', // illustrative key, not mandated by this spec + })), + }); + } +} +``` + +### 3.3 Catalog Processor + +Handles entity validation, relationship resolution, and annotation enrichment for `DCMService` and `DCMResource` kinds. + +--- + +## 4. Software Template Auto-Generation + +### 4.1 Generation Model + +`@dcm/backstage-plugin-catalog-backend` automatically generates Backstage Software Templates from DCM catalog items. No manual template authoring is needed when new resource types appear. 
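To make the idea concrete, here is a minimal sketch of turning one catalog item into a Backstage Template entity. It assumes a simplified, hypothetical catalog item shape (`CatalogItem`, `CatalogField`) and a hypothetical `toTemplate` helper; the real DCM catalog schema and the plugin's actual generator may differ. The `scaffolder.backstage.io/v1beta3` API version and the `dcm:request:submit` action name are the only identifiers taken from Backstage and this specification.

```typescript
// Hypothetical, simplified view of a DCM catalog item.
interface CatalogField {
  name: string;
  type: "string" | "integer" | "boolean";
  required: boolean;
  enumValues?: string[];
}

interface CatalogItem {
  uuid: string;
  handle: string;
  displayName: string;
  fields: CatalogField[];
}

// Builds a single-step Template entity whose form is derived from the item's fields.
function toTemplate(item: CatalogItem) {
  const properties: Record<string, object> = {};
  for (const f of item.fields) {
    // Enumerated fields carry their options into the JSON Schema.
    properties[f.name] = f.enumValues
      ? { type: f.type, enum: f.enumValues }
      : { type: f.type };
  }
  return {
    apiVersion: "scaffolder.backstage.io/v1beta3",
    kind: "Template",
    metadata: { name: item.handle, title: item.displayName },
    spec: {
      type: "dcm-service",
      parameters: [
        {
          title: `Configure ${item.displayName}`,
          required: item.fields.filter(f => f.required).map(f => f.name),
          properties,
        },
      ],
      steps: [
        {
          id: "submit",
          name: "Submit DCM request",
          action: "dcm:request:submit",
          // Scaffolder expression, resolved at run time to the collected form values.
          input: { catalogItemUuid: item.uuid, fields: "${{ parameters }}" },
        },
      ],
    },
  };
}
```

A generator along these lines would run once per catalog item during entity sync, so a new resource type shows up as a template without anyone writing YAML by hand.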
+ +Generation pipeline: +``` +GET /api/v1/catalog → DCM catalog items + │ + For each catalog item: + │ GET /api/v1/catalog/{uuid} → field schema + │ + Transform: + │ field schema → Backstage template parameters (JSON Schema compatible) + │ catalog item metadata → template metadata + │ provider info → template tags + │ + Emit as Backstage Template entity +``` + +### 4.2 Schema Transformation Rules + +| DCM field type | Backstage ui:widget | Notes | +|---------------|---------------------|-------| +| `enum` list | `select` | Options from DCM enum | +| `string` with pattern | `text` + pattern validation | Pattern in JSON Schema | +| `integer` range | `number` or `select` | Select if < 10 options | +| `boolean` | `checkbox` | — | +| `uuid` reference | `dcm:EntityPicker` | Custom picker component | +| `duration` (ISO 8601) | `dcm:DurationPicker` | Custom picker | +| `datetime` | `datetime` | Standard Backstage widget | +| Injected field (read-only) | `readonly` | Shows source layer in tooltip | + +### 4.3 Multi-Step Template Structure + +All generated templates follow a consistent multi-step structure: + +``` +Step 1: "Configure [Service Name]" ← DCM required fields +Step 2: "Options" ← Optional DCM fields + scheduling +Step 3: "Scheduling (Optional)" ← dispatch: immediate/at/window/recurring +Step 4: "Review" ← cost estimate + pre-flight check + ← Submit ← +Step 5: "Provisioning..." ← dcm:request:submit + dcm:request:wait (live log) +Step 6: "Complete" ← link to entity in catalog + resource URL +``` + +--- + +## 5. 
Scaffolder Actions Reference + +All actions in `@dcm/backstage-plugin-scaffolder-backend`: + +### `dcm:request:estimate` + +```typescript +input: + catalogItemUuid: string // DCM catalog item UUID + fields: object // field values from template parameters + tenantUuid?: string // defaults to RHDH group context + +output: + estimatedMonthlyCost: number + currency: string + breakdown: Array<{dimension: string, cost: number}> + quotaCheck: {passes: boolean, remaining: number} + policyPreCheck: {passes: boolean, warnings: string[]} +``` + +**Purpose:** Called during the Review step. Provides cost estimate, quota check, and policy pre-flight. Does not submit the request. + +### `dcm:request:submit` + +```typescript +input: + catalogItemUuid: string + fields: object + schedule?: {dispatch: 'immediate'|'at'|'window'|'recurring', notBefore?: string, notAfter?: string, windowId?: string} + dependsOn?: Array<{requestUuid: string, waitFor: string, injectFields?: ...}> + +output: + requestUuid: string + entityUuid: string // UUID the resource will have when realized + status: string // typically ACKNOWLEDGED + requestUrl: string // link to request in DCM consumer portal +``` + +### `dcm:request:wait` + +```typescript +input: + requestUuid: string + timeoutMinutes?: number // default: 30 + pollIntervalSeconds?: number // default: 5; uses SSE if available + +output: + status: 'REALIZED'|'FAILED'|'CANCELLED' + entityUuid: string + entityUrl: string // link to entity in RHDH catalog + realizedFields: object // key fields from provider (IP, hostname, etc.) 
+ failureReason?: string +``` + +Streams status updates to the Scaffolder log panel: +``` +[LOG] 09:01:05 Status: PROVISIONING — Step 3/7: Configuring network interfaces +[LOG] 09:03:12 ✅ REALIZED — IP: 10.42.0.105, Hostname: payments-api-server-01.corp.internal +``` + +### `dcm:request:group` + +```typescript +input: + groupHandle?: string + onFailure?: 'cancel_remaining'|'continue' + timeout?: string // ISO 8601 duration + requests: Array<{ + ref: string, // local reference within this submission + catalogItemUuid: string, + fields: object, + dependsOn?: Array<{ref: string, waitFor: string, injectFields?: ...}> + }> + +output: + groupUuid: string + requests: Array<{ref: string, requestUuid: string, entityUuid: string}> + groupUrl: string +``` + +### `dcm:catalog:refresh` + +```typescript +input: + entityUuid: string // DCM entity UUID to refresh in RHDH catalog + +output: + entityRef: string // Backstage entity ref: dcmresource:/ + entityUrl: string // URL to entity page in RHDH +``` + +Triggers immediate re-poll of the entity provider for the specified entity. The entity appears in RHDH catalog within PT30S of REALIZED status. + +--- + +## 6. 
Permission Framework Integration + +### 6.1 DCM Permissions in Backstage + +`@dcm/backstage-permission-policy` defines DCM permissions in Backstage permission framework terms: + +```typescript +// DCM permission definitions +export const dcmPermissions = { + // Resource permissions + resourceRead: createPermission({name: 'dcm.resource.read', attributes: {action: 'read'}}), + resourceUpdate: createPermission({name: 'dcm.resource.update', attributes: {action: 'update'}}), + resourceDelete: createPermission({name: 'dcm.resource.delete', attributes: {action: 'delete'}}), + + // Catalog permissions + catalogRequest: createPermission({name: 'dcm.catalog.request', attributes: {action: 'create'}}), + + // Approval permissions + approvalVote: createPermission({name: 'dcm.approval.vote', attributes: {action: 'update'}}), + + // Admin permissions + tenantManage: createPermission({name: 'dcm.tenant.manage', attributes: {action: 'update'}}), + providerManage: createPermission({name: 'dcm.provider.manage', attributes: {action: 'update'}}), +}; +``` + +### 6.2 Role Mapping + +The permission policy maps Backstage group membership to DCM permission grants: + +```typescript +class DcmPermissionPolicy implements PermissionPolicy { + async handle(request: PolicyQuery, user?: BackstageIdentityResponse) { + const groups = user?.identity.ownershipEntityRefs ?? 
[]; + + // Basic consumer permissions — all authenticated users + if (isAuthenticated(user)) { + if (DCM_READ_PERMISSIONS.includes(request.permission.name)) { + return { result: AuthorizeResult.ALLOW }; + } + } + + // Role-based grants + // Ownership refs are full entity refs (kind:namespace/name); the `default` namespace is assumed here + if (groups.includes('group:default/dcm-approvers')) { + if (request.permission.name === 'dcm.approval.vote') { + return { result: AuthorizeResult.ALLOW }; + } + } + + if (groups.includes('group:default/dcm-platform-admins')) { + return { result: AuthorizeResult.ALLOW }; // all permissions + } + + return { result: AuthorizeResult.DENY }; + } +} +``` + +### 6.3 RHDH RBAC Plugin Integration + +The RHDH RBAC plugin provides a no-code UI for managing role assignments. DCM roles are represented as RHDH group memberships: + +``` +RHDH RBAC UI: + Role: dcm-consumers → Group: all-authenticated-users + Role: dcm-approvers → Groups: [payments-leads, platform-approvers] + Role: dcm-platform-admins → Groups: [platform-team] + Role: dcm-contributors → Groups: [policy-authors, power-users] +``` + +Changes to group membership propagate to DCM via SCIM 2.0 (if configured) or OIDC group claims on next login. + +--- + +## 7. Deployment + +### 7.1 Dynamic Plugin Loading + +All DCM plugins are deployed as RHDH Dynamic Plugins — no RHDH image rebuild required: + +```yaml +# RHDH app-config.yaml additions +dynamicPlugins: + frontend: + dcm.backstage-plugin: + disabled: false + backend: + dcm.backstage-plugin-backend: + disabled: false + dcm.backstage-plugin-catalog-backend: + disabled: false + dcm.backstage-plugin-scaffolder-backend: + disabled: false + dcm.backstage-permission-policy: + disabled: false +``` + +Plugins are loaded from an OCI registry or npm. A new plugin version is deployed by updating its tag, with no RHDH pod rebuild. 
+ +### 7.2 RHDH Configuration + +```yaml +# app-config.yaml — DCM integration configuration +dcm: + baseUrl: https://dcm.corp.internal + apiPath: /api/v1 + + # Service account for catalog entity provider + serviceAccount: + apiKey: + $env: DCM_SERVICE_ACCOUNT_API_KEY + + # Catalog entity sync configuration + catalog: + syncIntervalSeconds: 300 # poll DCM API every 5 minutes + refreshOnScaffolderComplete: true + entityNamespace: dcm-catalog # for DCMService entities + + # Tenant resolution + tenancy: + groupNamespacePrefix: "dcm-tenant-" # RHDH group dcm-tenant-{uuid} → tenant uuid + fallbackTenantUuid: null # null = require explicit group context + + # Auth delegation + auth: + oidcIssuer: https://keycloak.corp.com/realms/corporate + clientId: dcm-rhdh-bridge + clientSecret: + $env: DCM_OIDC_CLIENT_SECRET + + # Feature flags + features: + liveStatusSse: true # use SSE for request status (fallback to polling if false) + costEstimateInCatalog: true # show cost estimate on catalog cards + quotaCheckOnBrowse: true # show quota availability in catalog + autoGenerateTemplates: true # auto-generate Scaffolder templates from catalog items +``` + +### 7.3 Kubernetes Deployment Pattern + +```yaml +# RHDH configuration in OpenShift/Kubernetes +apiVersion: v1 +kind: ConfigMap +metadata: + name: rhdh-app-config + namespace: rhdh +data: + app-config.dcm.yaml: | + dcm: + baseUrl: https://dcm-api.dcm-system.svc.cluster.local + # ... 
(internal cluster DNS for in-cluster communication) + +--- +apiVersion: v1 +kind: Secret +metadata: + name: dcm-integration-secrets + namespace: rhdh +stringData: + DCM_SERVICE_ACCOUNT_API_KEY: "" + DCM_OIDC_CLIENT_SECRET: "" +``` + +### 7.4 Zero-Trust in Cluster + +RHDH backend → DCM Consumer API communication: +- Both running in same Kubernetes cluster (typically) +- mTLS enforced by Istio service mesh (ICOM model applies to RHDH as a client) +- RHDH is not a DCM internal component — it is an external client that uses the Consumer API +- Auth: OIDC token exchange (Section 2.1) — RHDH backend presents OIDC access token; DCM issues session token + +--- + +## 8. RHDH Pre-Built Capabilities Leveraged + +### 8.1 No-Build Integrations (Immediate Value) + +These work before writing any DCM-specific code: + +| RHDH Feature | DCM Benefit | Config needed | +|-------------|-------------|---------------| +| Keycloak/RHSSO auth | SSO into DCM portal — same login as everything else | Configure OIDC provider | +| RBAC Plugin | No-code role management | Define DCM groups | +| TechDocs | DCM docs rendered in-portal | Add `techdocs-ref` annotations | +| Search | DCM entities searchable | Provided by catalog backend plugin | +| Kubernetes plugin | See DCM pods alongside resources | Standard RHDH Kubernetes plugin config | +| ArgoCD plugin | Layer store GitOps visibility | Standard RHDH ArgoCD plugin config | +| Tekton plugin | DCM scaffolding pipeline visibility | Standard RHDH Tekton plugin config | + +### 8.2 Ansible Automation Platform Plugin + +RHDH ships an existing AAP (Ansible Automation Platform) plugin. 
DCM Service Providers that use Ansible Automation Platform can surface AAP job status directly in RHDH: + +``` +DCMResource entity page +└── Additional tab contributed by AAP plugin: + "Automation" ← shows AAP job runs for this resource's provisioning +``` + +This requires no DCM code — it emerges from RHDH's existing AAP plugin + `dcm.io/aap-job-id` annotation on DCMResource entities. + +### 8.3 OCM (Open Cluster Management) Plugin + +Organizations using OCM for cluster lifecycle management get cluster management alongside DCM service catalog in the same portal — genuinely one pane of glass for sovereign cloud operations. + +--- + +## 9. Deployment Options + +DCM supports two frontend deployment modes that can be selected at initial deployment: + +**Standalone SPA** — DCM deploys its own React-based consumer portal. No RHDH dependency. Suitable for environments where RHDH is not present. + +**RHDH Mode** — DCM plugins are loaded into an existing RHDH instance. The RHDH Developer Hub becomes the consumer portal surface. Recommended for organizations already running RHDH. + +Both modes use the same DCM APIs and the same authentication model. The choice is a deployment configuration, not an architectural difference. + +```yaml +# dcm-config.yaml +frontend: + mode: standalone_spa | rhdh + rhdh_base_url: https://rhdh.internal # only required for rhdh mode +``` + +--- diff --git a/docs/specifications/kubernetes-compatibility.md b/docs/specifications/kubernetes-compatibility.md new file mode 100644 index 0000000..3036744 --- /dev/null +++ b/docs/specifications/kubernetes-compatibility.md @@ -0,0 +1,443 @@ +# DCM — Kubernetes Compatibility and Concept Mappings + + +> ## 📋 Draft — Promoted from Work in Progress +> +> All questions resolved. Cluster-as-a-Service model defined. Namespace-to-Tenant mapping, admission webhook model, and managed K8s integration all specified. 
+> +> **This section is explicitly a work in progress and is less mature than the core DCM data model and architecture documentation.** +> +> The Kubernetes operator integration layer — including the Operator Interface Specification, Operator SDK API, and Kubernetes compatibility mappings — represents design intent that has not yet been validated against implementation. Specific interface contracts, API signatures, SDK method names, and CRD structures **will change** as implementation work begins. +> +> **Do not build against these specifications yet.** They are published to share design direction and invite feedback, not as stable contracts. +> +> Known gaps and open items for this section: +> - Operator Interface Specification: reconciliation hook signatures are provisional +> - Operator SDK API: Go module structure and dependency model not yet finalized +> - Kubernetes Compatibility Mappings: some concept mappings remain under discussion +> - SDK code examples are illustrative only — not yet tested against a real implementation +> +> Feedback and contributions welcome via [GitHub Issues](https://github.com/dcm-project/issues). + + + +**Document Status:** ✅ Complete +**Document Type:** Architecture Reference +**Related Documents:** [Foundational Abstractions](https://github.com/croadfeldt/udlm/blob/main/foundations/foundations.md) | [Entity Relationships](https://github.com/croadfeldt/udlm/blob/main/entities/entity-relationships.md) | [Resource Type Hierarchy](https://github.com/croadfeldt/udlm/blob/main/entities/resource-type-hierarchy.md) | [Resource/Service Entities](https://github.com/croadfeldt/udlm/blob/main/entities/resource-service-entities.md) | [DCM Operator Interface Specification](dcm-operator-interface-spec.md) + +--- + +## 1. Purpose + +> **AEP Alignment:** API endpoint references in this document follow [AEP](https://aep.dev) conventions +> (custom methods use colon syntax). 
See `schemas/openapi/dcm-consumer-api.yaml` for the +> normative OpenAPI specification. + + +DCM is designed as a **superset of Kubernetes** — extending Kubernetes' declarative, controller-based model upward to provide unified management across multiple clusters, infrastructure types, and organizational boundaries that Kubernetes alone cannot address. + +This document serves three purposes: + +1. **Defines the formal mapping** between Kubernetes concepts and DCM concepts — enabling implementors to understand how the two models relate and where DCM extends beyond Kubernetes +2. **Establishes DCM Resource Types** for standard Kubernetes resources — so that Kubernetes-managed resources participate in the DCM registry alongside non-Kubernetes resources +3. **Documents the boundary** between what Kubernetes governs and what DCM governs — making clear that DCM extends Kubernetes rather than replacing it + +--- + +## 2. The Superset Relationship + +DCM is a superset of Kubernetes in the sense that it provides all the capabilities Kubernetes provides — and more. An organization running Kubernetes exclusively is using a subset of what DCM can manage. DCM does not replace Kubernetes; it manages the lifecycle of Kubernetes clusters and the resources running on them. + +The superset relationship means DCM can manage Kubernetes-native resources (Deployments, Services, PersistentVolumes) through conformant operators, and it can manage the clusters themselves as catalog items. It also means DCM manages resources that have no Kubernetes equivalent — bare metal, VMs, VLANs, IP allocations, and organizational data entities. 
+ +### 2.1 What Kubernetes Provides + +Kubernetes is a container orchestration platform that provides: +- Declarative desired-state management within a single cluster +- A controller/operator pattern for extending resource management +- Namespace-based isolation within a cluster +- RBAC for access control within a cluster +- A rich ecosystem of operators for managing complex stateful resources + +### 2.2 What DCM Adds + +DCM extends Kubernetes upward by providing: + +| Capability | Kubernetes | DCM | +|------------|-----------|-----| +| Scope | Single cluster | Multi-cluster, multi-infrastructure | +| Tenancy | Namespace isolation | First-class Tenant model with ownership | +| Policy | RBAC + admission webhooks | Full Policy Engine with Validation/Transformation/GateKeeper | +| Data lineage | Not provided | Field-level provenance on all data | +| Cost attribution | Not provided | Full lifecycle cost analysis | +| Drift detection | Basic — controller reconciles | Full four-state model with Intent/Requested/Realized/Discovered | +| Service catalog | Not provided | Full self-service catalog with RBAC-governed presentation | +| Sovereignty | Not provided | Sovereignty declarations, placement constraints, compliance evidence | +| Information context | Labels/annotations | First-class Information Provider relationships | +| Non-Kubernetes resources | Not provided | VMware, bare metal, OpenStack, etc. all managed through same model | + +### 2.3 What DCM Does Not Replace + +DCM does not replace Kubernetes at the runtime level. Kubernetes continues to: +- Schedule and run containers +- Manage Pod lifecycle within a cluster +- Enforce network policies within a cluster +- Provide the Kubernetes API for cluster-native tooling +- Run operators that manage complex stateful resources + +DCM manages the management plane — the lifecycle of what gets requested, provisioned, owned, governed, and decommissioned. 
Kubernetes manages the execution plane — the runtime behavior of what is running. + +--- + +## 3. Core Concept Mappings + +### 3.1 Resource Model + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| Custom Resource Definition (CRD) | Resource Type Specification | CRD schema → DCM Resource Type fields | DCM Resource Type is the portable, provider-agnostic equivalent. CRD is the Kubernetes-specific implementation schema. | +| Custom Resource (CR) | Requested State payload → Realized State entity | CR is the naturalized form of the DCM payload | The operator translates DCM Requested State into a CR (Naturalization) and translates CR status back to DCM Realized State (Denaturalization). | +| Built-in resource (Pod, Service, PV) | DCM Resource Type in Compute.*, Network.*, Storage.* | Kubernetes built-ins are valid DCM Resource Types | See Section 5 for standard Kubernetes resource type mappings. | +| Kubernetes object | Resource/Service Entity | Every Kubernetes object managed by DCM has a corresponding DCM entity with UUID and provenance | | + +### 3.2 Control Loop + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| Operator reconciliation loop | Realization + Drift Detection combined | Reconciliation IS the realization process — the operator drives actual state toward desired state | DCM's Drift Detection compares Discovered State against Realized State. The operator's reconciliation loop is the mechanism that corrects drift. | +| Desired state (CR spec) | Requested State | CR spec is the naturalized form of the DCM Requested State | DCM stores the Requested State in DCM format. The operator translates it to CR spec format. 
| +| Actual state (CR status) | Realized State | CR status is the Kubernetes-native form of the DCM Realized State | The operator must denaturalize CR status back to DCM Realized State format and report it to DCM. | +| Watch/Inform pattern | DCM Discovered State polling | Kubernetes watch events are the mechanism for keeping DCM Discovered State current | | + +### 3.3 Isolation and Multi-tenancy + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| Namespace | DCM Tenant boundary | One namespace per DCM Tenant (per_tenant strategy) | Kubernetes namespace provides the physical isolation enforcement. DCM Tenant provides the ownership and governance model. A single DCM Tenant maps to exactly one namespace per cluster. | +| Namespace | DCM Resource Group | In shared namespace strategies, Resource Group labels replace namespace isolation | When multiple Tenants share a namespace, DCM Resource Group labels provide logical separation. | +| Kubernetes RBAC | DCM IDM/IAM + Policy Engine | Kubernetes RBAC is the runtime enforcement mechanism. DCM Policy Engine governs who can request what via the service catalog. | DCM policies determine what a user can request. Kubernetes RBAC determines what a running workload can do. These are complementary, not duplicative. | +| ServiceAccount | DCM Identity.ServiceAccount Information Type | Kubernetes ServiceAccounts that DCM provisions or references are modeled as DCM Information Type entities | | + +### 3.4 Relationships and Dependencies + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| ownerReference | Entity Relationship (`contains`/`contained_by`) | Kubernetes ownerReferences are a subset of DCM entity relationships — ownership only | DCM relationships are richer — supporting `requires`, `depends_on`, `references`, `peer`, `manages` in addition to ownership. 
During Denaturalization, ownerReferences are translated to DCM `contains` relationships. | +| Finalizers | Lifecycle policy (`retain`, `detach`) | Kubernetes finalizers implement DCM lifecycle policies at the Kubernetes level | When DCM declares `on_parent_destroy: retain` for a storage entity, the operator implements this using Kubernetes finalizers to prevent deletion until DCM confirms the lifecycle policy has been applied. | +| Label selectors | Resource Group membership | Kubernetes label selectors used for DCM Resource Group filtering | DCM mandatory labels (`dcm-tenant-id`, `dcm-entity-id`) are used as label selectors for Resource Group queries. | + +### 3.5 Data Model + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| Labels | DCM entity metadata + relationships | DCM-mandatory labels (`dcm-managed`, `dcm-tenant-id`, `dcm-entity-id`, etc.) carry core DCM identity data. Custom labels may map to DCM Information Type relationships. | | +| Annotations | DCM field-level provenance + metadata | Annotations used by DCM to carry request correlation data during the request lifecycle | `dcm-request-id` annotation on a CR identifies the DCM request that created or last modified it — enabling unsanctioned change detection. | +| Resource version | Entity version (Revision component) | Kubernetes resource versions map to DCM entity Revision increments | Major and Minor versions are managed by DCM based on breaking/non-breaking changes. Kubernetes resource version increments map to DCM Revision increments. | +| Generation | Requested State version | CR generation increments correspond to new DCM Requested State records | Each new generation of a CR corresponds to a new intent/request cycle in DCM. 
| + +### 3.6 Lifecycle + +| Kubernetes Concept | DCM Concept | Relationship | Notes | +|-------------------|-------------|--------------|-------| +| Pod phases (Pending, Running, Succeeded, Failed, Unknown) | DCM lifecycle states | Pod phases map to DCM lifecycle states via condition_mappings declaration | | +| CRD conditions | DCM lifecycle states and events | Standard conditions (Ready, Degraded, Progressing) map to DCM states and events via the field mapping specification | | +| Kubernetes events | DCM lifecycle events | Kubernetes watch events trigger DCM lifecycle event reports | The operator translates Kubernetes events into DCM lifecycle event types (ENTITY_HEALTH_CHANGE, DEGRADATION, UNSANCTIONED_CHANGE, etc.) | +| Cluster deletion | DCM decommission workflow | Cluster deletion triggers DCM's full decommission lifecycle — lifecycle policies applied to all related entities | | + +--- + + +## 3a. Cluster as a Service — The Primary Model + +A Kubernetes cluster is a first-class catalog item in DCM. Any authorized Tenant can request and own a cluster through the service catalog, the same way they request a VM or a network. This is not a special case — it is the expected primary consumption model for Kubernetes infrastructure in DCM. + +**How it works:** + +```yaml +catalog_item: Platform.KubernetesCluster +provider: CAPI-based Service Provider (or managed K8s Service Provider) +tenant_uuid: + +entity: + resource_type: Platform.KubernetesCluster + tenant_uuid: # Tenant owns the cluster + lifecycle_state: OPERATIONAL + fields: + kubernetes_version: "1.29" + node_count: 3 + api_endpoint: "https://cluster-01.eu-west.example.com" + kubeconfig_ref: # via credential management service +``` + +**Ownership scope:** When a Tenant owns a `Platform.KubernetesCluster` entity, that Tenant owns everything within the cluster boundary — including cluster-scoped resources (ClusterRoles, StorageClasses, PersistentVolumes, CRDs registered for that cluster). 
The cluster entity is the ownership boundary. DCM treats the cluster as an opaque resource from a Tenant ownership perspective — the Tenant gets the cluster; what's inside it belongs to them. + +**The composite service definition pattern:** A Cluster-as-a-Service catalog item typically composes multiple constituent resources: +```yaml +Platform.KubernetesCluster → constituent providers: + - Compute resources (control plane + worker nodes) + - Network resources (load balancer, ingress) + - Storage resources (CSI driver + storage class) + - DNS records (cluster API endpoint) + - Credential issuance (kubeconfig via credential management service) +``` + +This is a composite service definition — the cluster catalog item orchestrates all constituents and presents a single entity to the Tenant. + +**Sovereignty and accreditation:** Cluster placement follows the standard Placement Engine model. Sovereignty constraints declared by the Tenant apply to cluster placement — a GDPR-scoped Tenant requesting a cluster gets a cluster placed in an EU sovereignty zone. The CAPI provider (or managed K8s Service Provider) must hold appropriate accreditations. + +**Post-provision:** Once the cluster is OPERATIONAL, it can optionally register with DCM as a nested Service Provider for workload resources. The Tenant can then request workload resources (Deployments, Services, PersistentVolumes) against their cluster through the same DCM service catalog. This creates the superset model: DCM provisions the cluster → cluster becomes a workload Service Provider → Tenant uses DCM to manage workloads on their cluster. + + +## 4. Where DCM Extends Beyond Kubernetes + +These are capabilities that exist in DCM but have no Kubernetes equivalent. None of these require Kubernetes to be present — they operate across all provider types. For organizations running pure Kubernetes estates, these are the capabilities DCM brings that Kubernetes tooling alone cannot provide. 
+ +**Summary of extensions:** + +| DCM Capability | Kubernetes Gap | +|---------------|---------------| +| Intent State | No concept of original consumer intent separate from desired state | +| Field-Level Provenance | No field lineage — a field is a field | +| Data Layers and Assembly | No layering model — manifests are flat declarations | +| Policy Engine | Admission webhooks are cluster-scoped, admission-time only | +| Cost Analysis | No native cost attribution in the request lifecycle | +| Information Providers | No structured external organizational data relationships | +| Cross-Cluster Lifecycle | Single-cluster scope — multi-cluster requires external tooling | + +The subsections below describe each extension; together they justify the superset positioning. + +### 4.1 Intent State + +Kubernetes has no concept of a consumer's original intent separate from the desired state. Once you apply a manifest, Kubernetes only knows the current desired state — not what the consumer originally asked for or why. + +DCM's Intent State is the immutable record of what the consumer asked for, stored before any policy processing or layer enrichment. This enables: +- Rehydration — replaying the original intent through current policies to produce a new request +- Intent portability — the same intent applied to a different provider +- Audit — answering "what did the consumer originally ask for?" independently of what was realized + +### 4.2 Field-Level Provenance + +Kubernetes has no concept of where a field value came from or why it was set. A field in a CR spec is a field — there is no lineage. + +DCM's field-level provenance carries the full lineage of every field value through the entire lifecycle — which layer set it, which policy modified it, which provider realized it, and why each change was made. This enables complete audit trails and sovereignty evidence. 
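As an illustration of the lineage described for field-level provenance, the following TypeScript sketch models one possible shape for a field's provenance trail. The names `ProvenanceEntry`, `recordChange`, `currentValue`, and `explainField`, the `source` categories, and the sample layer, policy, and provider identifiers are all hypothetical and chosen for this example; they are not taken from the DCM data model.

```typescript
// Hypothetical shape for field-level provenance; not the DCM schema.
type ProvenanceSource = "layer" | "policy" | "provider";

interface ProvenanceEntry {
  source: ProvenanceSource; // which kind of actor set or changed the value
  sourceId: string;         // assumed identifier, e.g. a layer or policy name
  value: unknown;           // field value after this change
  reason: string;           // why the change was made
}

// Field path -> ordered lineage, oldest entry first.
type ProvenanceTrail = Record<string, ProvenanceEntry[]>;

// Append one change to the lineage of a field.
function recordChange(trail: ProvenanceTrail, field: string, entry: ProvenanceEntry): void {
  const history = trail[field] ?? [];
  history.push(entry);
  trail[field] = history;
}

// The current value is the value of the most recent lineage entry.
function currentValue(trail: ProvenanceTrail, field: string): unknown {
  const history = trail[field] ?? [];
  return history.length > 0 ? history[history.length - 1].value : undefined;
}

// Render the lineage as human-readable audit lines.
function explainField(trail: ProvenanceTrail, field: string): string[] {
  return (trail[field] ?? []).map(
    (e, i) => `${i + 1}. ${e.source}:${e.sourceId} set ${JSON.stringify(e.value)} (${e.reason})`,
  );
}

const trail: ProvenanceTrail = {};
recordChange(trail, "cpu_count", { source: "layer", sourceId: "org-defaults", value: 2, reason: "organizational default" });
recordChange(trail, "cpu_count", { source: "policy", sourceId: "min-cpu-prod", value: 4, reason: "production minimum" });
recordChange(trail, "cpu_count", { source: "provider", sourceId: "example-provider", value: 4, reason: "realized as requested" });
console.log(explainField(trail, "cpu_count").join("\n"));
```

Keeping the lineage as an append-only list per field means the audit question "who changed this and why" is answered by reading the list in order, while the effective value is simply the last entry.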
+ +### 4.3 Data Layers and Assembly + +Kubernetes has no equivalent to DCM's layering model. A Kubernetes manifest is a flat declaration — there is no concept of organizational standards, site-specific configuration, and service-specific configuration being separate layers that compose into a final manifest. + +DCM's layering model enables 36 layer definitions to govern 40,000 VMs without duplication — impossible in the Kubernetes model. + +### 4.4 Policy Engine + +Kubernetes admission webhooks provide some policy capability (validation, mutation) but are cluster-scoped, apply at admission time only, and have no concept of hierarchy (Global → Tenant → User policy levels) or field-level override control. + +DCM's Policy Engine operates at the management plane level, applies across all clusters and providers, enforces a three-level hierarchy with field-level override control (allow/constrained/immutable), and carries policy decisions as provenance metadata in the payload. + +### 4.5 Cost Analysis + +Kubernetes has no native cost attribution model. Tools like Kubecost exist but are add-ons with no integration into the request lifecycle. + +DCM's cost analysis is built into the lifecycle model — cost attribution is tracked from request time through realization, operation, and decommission for every entity. + +### 4.6 Information Providers + +Kubernetes has no concept of structured relationships to external organizational data (Business Units, Cost Centers, Product Owners). Labels and annotations are unstructured key-value pairs with no type safety, no external system integration, and no verification model. + +DCM's Information Provider model gives every entity structured, verified, versioned relationships to external organizational data with a stable external key model. + +### 4.7 Cross-Cluster Lifecycle + +Kubernetes manages resources within a single cluster. 
Multi-cluster management requires additional tools (ACM, Argo CD, Fleet) that are not part of the core Kubernetes model. + +DCM manages the lifecycle of resources across multiple clusters as a first-class capability — the same Resource Type can be instantiated on any cluster that has a conformant Service Provider registered. + +--- + +## 5. Standard Kubernetes Resource Type Mappings + +These are the DCM Resource Type registry entries for standard Kubernetes resource types. Operators implementing these types should use these registry UUIDs and field definitions. + +### 5.1 Compute + +| DCM Resource Type | Kubernetes Equivalent | Notes | +|------------------|----------------------|-------| +| `Compute.Pod` | Pod | Lowest-level compute unit | +| `Compute.Container` | Container (within a Pod) | Sub-entity of Pod — expanded via bundled declaration | +| `Compute.Deployment` | Deployment | Managed set of Pods | +| `Compute.StatefulSet` | StatefulSet | Stateful managed set of Pods | +| `Compute.Job` | Job | One-time execution workload | +| `Compute.CronJob` | CronJob | Scheduled execution workload | + +### 5.2 Network + +| DCM Resource Type | Kubernetes Equivalent | Notes | +|------------------|----------------------|-------| +| `Network.Service` | Service | In-cluster service discovery and load balancing | +| `Network.Ingress` | Ingress | External HTTP/HTTPS routing | +| `Network.NetworkPolicy` | NetworkPolicy | In-cluster network isolation | + +### 5.3 Storage + +| DCM Resource Type | Kubernetes Equivalent | Notes | +|------------------|----------------------|-------| +| `Storage.PersistentVolume` | PersistentVolume | Cluster-level storage resource | +| `Storage.PersistentVolumeClaim` | PersistentVolumeClaim | Consumer's storage declaration — expanded into Storage.PersistentVolume relationship | +| `Storage.StorageClass` | StorageClass | Storage type definition — maps to DCM Provider Catalog Item | +| `Storage.ConfigMap` | ConfigMap | Configuration data storage | +| 
`Storage.Secret` | Secret | Sensitive data storage | + +### 5.4 Platform + +| DCM Resource Type | Kubernetes Equivalent | Notes | +|------------------|----------------------|-------| +| `Platform.KubernetesCluster` | Kubernetes Cluster (via CAPI or managed service) | The cluster itself is a DCM-managed resource | +| `Platform.Namespace` | Namespace | Maps to DCM Tenant boundary in per_tenant strategy | +| `Platform.CustomResourceDefinition` | CRD | CRD registration maps to DCM Resource Type registration | + +### 5.5 Identity + +| DCM Resource Type | Kubernetes Equivalent | Notes | +|------------------|----------------------|-------| +| `Security.ServiceAccount` | ServiceAccount | Kubernetes identity for workloads | +| `Security.Role` | Role / ClusterRole | Kubernetes RBAC role | +| `Security.RoleBinding` | RoleBinding / ClusterRoleBinding | Kubernetes RBAC binding | + +--- + +## 6. The Kubernetes Information Provider + +Kubernetes clusters function as both Service Providers (for provisioning resources) and Information Providers (for querying existing state). As an Information Provider, a Kubernetes cluster exposes its current resource state to DCM for: + +- **Brownfield ingestion** — discovering existing resources and bringing them under DCM lifecycle management +- **Discovered State** — DCM's Discovered State for Kubernetes resources comes from querying the Kubernetes API +- **Drift detection** — comparing DCM Realized State against what Kubernetes actually has + +### 6.1 Kubernetes as Information Provider Registration + +```yaml +information_provider_registration: + name: kubernetes-cluster-01 + implements: + - information_type: Platform.KubernetesCluster + - information_type: Compute.Pod + - information_type: Storage.PersistentVolume + # ... 
all resource types the cluster contains + endpoint: + kubernetes_credentials: + auth_method: + discovery_capabilities: + label_selector: "dcm-managed=true" + # Only returns DCM-managed resources by default + full_discovery: true + # Can also return all resources for brownfield ingestion +``` + +### 6.2 Discovered State from Kubernetes + +DCM queries the Kubernetes API using the Kubernetes Information Provider to populate Discovered State: + +``` +DCM Drift Detection + │ + ▼ +Kubernetes Information Provider + │ GET /apis/{group}/{version}/namespaces/{ns}/{kind} + │ Filter: label dcm-entity-id = {entity_uuid} + ▼ +Discovered State payload (DCM format) + │ Kubernetes object denaturalized to DCM format + ▼ +Compare against Realized State + │ Field-by-field comparison + ▼ +UNSANCTIONED_CHANGE if differences found + │ Reported to Policy Engine for response determination +``` + +--- + +## 7. Kubernetes-Native Patterns and DCM Equivalents + +### 7.1 GitOps + +Kubernetes GitOps (Argo CD, Flux) manages Kubernetes manifests in Git and synchronizes them to clusters. DCM's data model is also Git-based — all layers, Resource Type definitions, and policy definitions are stored in Git. + +The relationship: DCM manages the **request lifecycle** (what gets asked for, approved, and provisioned). GitOps manages the **deployment lifecycle** (what gets deployed to a cluster from a Git repository). These are complementary: + +- DCM governs the provisioning request — "is this consumer allowed to provision this resource?" +- GitOps deploys application code to the provisioned resource +- DCM and GitOps together form a complete lifecycle: DCM provisions the cluster, GitOps deploys applications to it + +### 7.2 Helm + +Helm charts are packages of Kubernetes manifests that can be parameterized. In DCM terms, a Helm chart is a form of Catalog Item — a curated, parameterized offering of a set of Kubernetes resources. 
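To make the Catalog Item analogy concrete, the sketch below maps flat catalog item field values onto a nested Helm values object. The field names, the dotted-path mapping table `fieldToValuePath`, and the functions `setPath` and `toHelmValues` are assumptions made for illustration; neither DCM nor Helm defines them.

```typescript
// Hypothetical mapping from catalog item fields to Helm value paths.
type HelmValues = { [key: string]: unknown };

const fieldToValuePath: Record<string, string> = {
  replica_count: "replicaCount",
  image_tag: "image.tag",
  storage_gb: "persistence.sizeGi",
};

// Set a value at a dotted path, creating intermediate objects as needed.
function setPath(target: HelmValues, path: string, value: unknown): void {
  const keys = path.split(".");
  let node = target;
  for (const key of keys.slice(0, -1)) {
    const next = node[key];
    if (typeof next !== "object" || next === null) {
      node[key] = {};
    }
    node = node[key] as HelmValues;
  }
  node[keys[keys.length - 1]] = value;
}

// Translate field values into Helm values; unmapped fields are rejected
// rather than silently dropped.
function toHelmValues(fields: Record<string, unknown>): HelmValues {
  const values: HelmValues = {};
  for (const field of Object.keys(fields)) {
    const path = fieldToValuePath[field];
    if (path === undefined) {
      throw new Error(`No Helm value mapping for field "${field}"`);
    }
    setPath(values, path, fields[field]);
  }
  return values;
}

const values = toHelmValues({ replica_count: 3, image_tag: "1.4.2", storage_gb: 20 });
console.log(JSON.stringify(values, null, 2));
```

With the sample input, `image_tag` ends up under a nested `image` object and `storage_gb` under `persistence`, which is the structure a chart's `values.yaml` would typically expect. Rejecting unmapped fields is a design choice of this sketch, so that a request cannot carry parameters the chart never sees.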
+
+DCM does not replace Helm — it can use Helm as a delivery mechanism inside a Service Provider. The Service Provider receives the DCM Requested State, translates it to Helm values, and uses Helm to deploy the resources. The operator pattern is preferred for Day 2 management (Helm has limited reconciliation), but Helm remains valid for initial provisioning.
+
+### 7.3 Cluster API (CAPI)
+
+CAPI is the Kubernetes sub-project for managing Kubernetes clusters themselves using the Kubernetes API and operator pattern. CAPI clusters are a natural fit for DCM's `Platform.KubernetesCluster` Resource Type — a CAPI-based operator would be the Service Provider for provisioning new Kubernetes clusters as DCM-managed resources.
+
+This is particularly significant: DCM managing the lifecycle of Kubernetes clusters through CAPI means DCM can provision the very infrastructure that operators run on. The superset relationship becomes concrete — DCM provisions the cluster, the cluster runs the operators, the operators provision the resources that DCM manages.
+
+---
+
+## 8. Incremental Adoption — Kubernetes-Native to DCM-Managed
+
+Organizations running Kubernetes can adopt DCM incrementally across these phases:
+
+### Phase 1 — Observation (no operator changes)
+Deploy DCM with the Kubernetes Information Provider. DCM observes existing resources via the Kubernetes API and builds a Discovered State inventory. No changes to existing operators or workloads.
+
+### Phase 2 — Brownfield Ingestion (no operator changes)
+DCM promotes Discovered State records to Realized State — assuming lifecycle management of existing resources. Resources get DCM UUIDs, Tenant assignments, and provenance records. Existing resources are now DCM-managed without any operator changes.
+
+### Phase 3 — Level 1 Conformance (minimal operator changes)
+Operators implement Level 1 of this specification via the DCM Operator SDK. New resources are provisioned through DCM's service catalog. Existing resources managed via brownfield ingestion continue as-is.
+
+### Phase 4 — Level 2 Conformance (moderate operator changes)
+Operators implement Level 2 — full field mappings, capacity reporting, lifecycle events. DCM gains placement intelligence, drift detection, and cross-cluster management capabilities.
+
+### Phase 5 — Level 3 Conformance (complete integration)
+Operators implement Level 3 — sovereignty declarations, provenance, discovery endpoint. Full DCM capabilities available.
+
+---
+
+## 9. Open Questions
+
+| # | Question | Impact | Status |
+|---|----------|--------|--------|
+| 1 | How does the Namespace-to-Tenant mapping work when a cluster has existing namespaces that predate DCM adoption? | Brownfield migration | ✅ Resolved |
+| 2 | Should `Platform.KubernetesCluster` be the boundary for a DCM deployment, or can DCM manage resources across clusters without treating the cluster as a DCM entity? | Architecture scope | ✅ Resolved |
+| 3 | How does DCM interact with Kubernetes admission webhooks — do they duplicate Policy Engine functions or complement them? | Policy model | ✅ Resolved |
+| 4 | Should the Kubernetes Information Provider be a built-in DCM component or a separately deployed provider? | Deployment architecture | ✅ Resolved |
+| 5 | How does the DCM superset model interact with managed Kubernetes services (EKS, GKE, AKS) where cluster management is outside the user's control? | Cloud provider integration | ✅ Resolved |
+
+---
+
+## 10. Related Concepts
+
+- **DCM Operator Interface Specification** — the technical contract for operators integrating with DCM
+- **DCM Operator SDK** — Go library implementing this specification for operator developers
+- **Entity Relationships** — DCM's universal relationship model, of which Kubernetes ownerReferences are a subset
+- **Resource Type Hierarchy** — the DCM registry where Kubernetes Resource Types are registered
+- **Information Providers** — the DCM model for the Kubernetes API as a discoverable information source
+- **Four States** — DCM's Intent/Requested/Realized/Discovered model, which extends Kubernetes' desired/actual model
+
+---
+
+## Resolution Notes
+
+**Q1:** Pre-existing namespaces are handled by the brownfield ingestion model. Each namespace maps to one DCM Tenant. Resources without clear ownership land in the `__transitional__` Tenant and are promoted by a platform admin. Same flow as brownfield VM ingestion — no special handling required.
+
+**Q2:** DCM manages resources across multiple clusters simultaneously. `Platform.KubernetesCluster` is a DCM-managed resource type — both something DCM provisions as a catalog item (Cluster as a Service) and something DCM tracks when externally provisioned. A Tenant can own a full cluster as a catalog item; the cluster is not the boundary of a DCM deployment. DCM's organizational boundary is the Tenant. A single DCM deployment routes requests to Service Providers across many clusters, and can provision new clusters as service catalog items.
+
+**Q3:** Admission webhooks and the DCM Policy Engine are complementary layers, not duplicates. Admission webhooks enforce cluster-native policy (security contexts, image policies, resource quotas). The DCM Policy Engine enforces DCM request policy (business rules, data governance, sovereignty). A DCM-managed workload resource is validated by both — DCM Policy Engine before dispatch, admission webhook at the cluster. This is defense in depth.
+
+**Q4:** The Kubernetes Information Provider is a separately deployed provider that registers with DCM as a standard Information Provider. It serves cluster state, namespace inventory, and workload status. There are no built-in Information Providers in DCM's architecture — all Information Providers follow the unified base contract and are independently deployable.
+
+**Q5:** Managed Kubernetes services (EKS, GKE, AKS) register as Service Providers of resource type `Platform.ManagedKubernetesCluster`. DCM manages workload resources within the cluster (Deployments, Services, PersistentVolumes) but explicitly does not manage the cluster control plane. Sovereignty enforcement applies at cluster selection — DCM places workloads on clusters satisfying sovereignty constraints. The cloud provider manages cluster infrastructure.
+
+*Document maintained by the DCM Project. For questions or contributions see [GitHub](https://github.com/dcm-project).*