From c7b0cb0fe8752adf4acb2cbea1085a01aa4f8e17 Mon Sep 17 00:00:00 2001
From: Chris Roadfeldt
Date: Tue, 16 Jun 2026 16:43:51 -0500
Subject: [PATCH] architecture: integrations ITSM integration and Kessel evaluation.

Signed-off-by: Chris Roadfeldt
---
 architecture/integrations/itsm.md             | 666 ++++++++++++++++++
 .../integrations/kessel-evaluation.md         | 470 ++++++++++++
 2 files changed, 1136 insertions(+)
 create mode 100644 architecture/integrations/itsm.md
 create mode 100644 architecture/integrations/kessel-evaluation.md

diff --git a/architecture/integrations/itsm.md b/architecture/integrations/itsm.md
new file mode 100644
index 0000000..ea2396f
--- /dev/null
+++ b/architecture/integrations/itsm.md
@@ -0,0 +1,666 @@
+# DCM Data Model — ITSM Integration
+
+**Document Status:** 🔄 In Progress
+**Document Type:** Architecture Reference — ITSM integration Type and ITSM Policy Type
+**Related Documents:** [Provider Contract](https://github.com/croadfeldt/udlm/blob/main/contracts/provider-contract.md) | [Policy Contract](https://github.com/croadfeldt/udlm/blob/main/contracts/policy-contract.md) | [Notification Model](../runtime-features/notifications.md) | [Event Catalog](https://github.com/croadfeldt/udlm/blob/main/contracts/event-catalog.md) | [Authority Tier Model](https://github.com/croadfeldt/udlm/blob/main/governance/authority-tier-model.md) | [Consumer API Specification](../../docs/specifications/consumer-api-spec.md)
+
+> **Design principle:** DCM is built to *replace* the infrastructure ticket as the primary provisioning mechanism. ITSM integration is additive — it enriches DCM entities with ITSM metadata, enables ITSM-initiated requests, and provides bidirectional lifecycle traceability for organizations that need it for compliance. **DCM never requires an ITSM system to function.**
+>
+> Two new additions to the DCM architecture:
+> 1. **ITSM integration** — a new Provider type (12th) that speaks ITSM system APIs bidirectionally
+> 2. 
**ITSM Policy** — a new Policy output type (8th) that triggers ITSM actions as a side-effect of DCM pipeline events
+
+---
+
+## 1. ITSM Integration
+
+### 1.1 What ITSM Integration Is
+
+ITSM integration connects DCM to an external IT Service Management system. It handles:
+
+- **Outbound**: DCM lifecycle events → ITSM records (create change requests, update CMDB CIs, close incidents, link tickets to entities)
+- **Inbound**: ITSM approvals and decisions → DCM (change approval recorded via approval vote API, request initiation from ITSM workflow)
+- **Sync**: ITSM record references stored on DCM entities as business data (bidirectional link)
+
+ITSM integration is **not** a Service Provider (it doesn't realize resources), **not** a notification service (though it may create notification-like records), and **not** an External Policy Evaluator (though ITSM approval status may inform DCM policies). It is its own provider type because it has a bidirectional contract, manages external record lifecycle, and requires specific capability declarations around ITSM system connectivity.
+
+### 1.2 Data Flow
+
+```
+DCM lifecycle event fires (e.g. request.dispatched)
+  │
+  ▼ ITSM Policy evaluates (see Section 3)
+  │   Determines: should an ITSM action fire? Which action?
+  │
+  ▼ ITSM integration receives action request
+  │   Translates to target system's API format
+  │   Calls ITSM system (ServiceNow, Jira, etc.)
+  │
+  ▼ ITSM system creates/updates record
+  │   Returns record ID (CHG0012345, INC-4821, etc.) 
+  │
+  ▼ ITSM integration stores reference on DCM entity
+  │   entity.business_data.itsm_references[] updated
+  │
+  ▼ ITSM integration reports back to DCM
+      itsm_reference_created event published
+      External record ID in audit record
+
+─────────────────────────────────────────────
+
+ITSM system approves a change record
+  │
+  ▼ ITSM system calls DCM API (via webhook or polling)
+  │   POST /api/v1/admin/approvals/{uuid}:vote
+  │   { decision: "approve", recorded_via: "servicenow",
+  │     external_reference: "CHG0012345" }
+  │
+  ▼ DCM records approval vote
+  │   approval.decision_recorded event
+  │
+  ▼ Pipeline resumes if quorum/tier satisfied
+```
+
+### 1.3 Capability Declaration
+
+```yaml
+itsm_provider_capabilities:
+  itsm_system: servicenow | jira_service_management | bmc_remedy | bmc_helix |
+               freshservice | zendesk | pagerduty | opsgenie | manageengine |
+               cherwell | topdesk | generic_rest
+
+  # What this provider can do
+  supported_actions:
+    - create_change_request       # create a change record for DCM provisioning events
+    - update_change_request       # update change record on state transitions
+    - close_change_request        # close change record on realization/failure
+    - create_incident             # create incident for failures, drift, security events
+    - update_incident             # update incident on resolution
+    - close_incident              # close incident on recovery
+    - update_cmdb_ci              # update CMDB configuration item record
+    - create_cmdb_ci              # create new CMDB CI for realized entities
+    - retire_cmdb_ci              # retire CMDB CI on decommission
+    - create_service_request      # create service request record
+    - link_parent_record          # link DCM entity to existing ITSM record
+    - inbound_approval            # accept approval decisions from ITSM system
+    - inbound_request_initiation  # allow ITSM workflows to submit DCM requests
+
+  # System connectivity
+  endpoint_url:                   # ITSM system API base URL
+  api_version:                    # system-specific API version
+  auth_credential_uuid:           # references credential management service
+
+  
# Bidirectional webhook (for inbound)
+  inbound_webhook:
+    enabled:
+    secret_credential_uuid:       # HMAC secret for webhook verification
+
+  # Field mappings (system-specific)
+  field_mapping_ref:              # path to field mapping YAML in Layer Store
+
+  # CMDB CI type mapping
+  cmdb_ci_type_map:
+    - dcm_resource_type: Compute.VirtualMachine
+      itsm_ci_type: cmdb_ci_server          # ServiceNow CI class
+    - dcm_resource_type: Network.VLAN
+      itsm_ci_type: cmdb_ci_netgear
+    - dcm_resource_type: Storage.Volume
+      itsm_ci_type: cmdb_ci_storage_device
+```
+
+### 1.4 Required API Endpoints (ITSM integration implements)
+
+```
+POST {provider_base}/actions              # DCM submits action requests
+GET  {provider_base}/actions/{action_id}  # DCM checks action status
+GET  {provider_base}/records/{record_id}  # DCM retrieves record status
+POST {provider_base}/inbound              # ITSM system sends inbound events
+GET  /health                              # standard OIS health check
+```
+
+### 1.5 DCM Entity ITSM References
+
+Realized entities gain an `itsm_references` block in business data:
+
+```yaml
+itsm_references:
+  - system: servicenow
+    provider_uuid:
+    record_type: change_request
+    record_id: "CHG0012345"
+    record_url: "https://corp.service-now.com/nav_to.do?uri=change_request.do?sys_id=..."
+    created_at:
+    status: approved              # DCM's view of the record status
+    last_synced_at:
+
+  - system: jira_service_management
+    provider_uuid:
+    record_type: incident
+    record_id: "INC-4821"
+    record_url: "https://corp.atlassian.net/browse/INC-4821"
+    created_at:
+    status: open
+    last_synced_at:
+```
+
+---
+
+## 2. 
Supported ITSM Systems
+
+### 2.1 ServiceNow
+
+**API:** REST Table API (`/api/now/table/`), Business Rule webhooks, Flow Designer
+
+```yaml
+# ServiceNow ITSM integration registration
+itsm_provider_registration:
+  provider_handle: "servicenow-prod"
+  itsm_system: servicenow
+  endpoint_url: "https://corp.service-now.com"
+  api_version: "v2"
+  auth_credential_uuid:           # api_key or oauth2 credential
+
+  supported_actions:
+    - create_change_request       # → change_request table
+    - update_change_request
+    - close_change_request
+    - create_incident             # → incident table
+    - update_cmdb_ci              # → cmdb_ci_server (or mapped class)
+    - create_cmdb_ci
+    - retire_cmdb_ci
+    - inbound_approval            # Change Advisory Board approval → DCM vote
+
+  # ServiceNow-specific field mapping
+  change_request_template:
+    assignment_group: "Infrastructure Automation"
+    category: "Software"
+    risk: "2"                     # Low
+    impact: "3"                   # Low
+    # DCM fields injected at runtime:
+    short_description: "DCM: Provision {resource_type} '{entity_handle}'"
+    description: "Requested by: {actor_handle}\nTenant: {tenant_handle}\nDCM Request: {request_uuid}"
+
+  # CAB approval → DCM vote mapping
+  inbound_approval:
+    webhook_url: "https://dcm.corp/api/v1/admin/approvals/{approval_uuid}:vote"
+    trigger_on: "change_request.state → 'Approved'"
+    decision_field: "state"
+    decision_map:
+      "Approved": "approve"
+      "Rejected": "reject"
+      "Cancelled": "reject"
+    external_reference_field: "number"   # → CHG0012345
+
+  cmdb_ci_type_map:
+    - dcm_resource_type: Compute.VirtualMachine
+      itsm_ci_type: cmdb_ci_server
+    - dcm_resource_type: Network.VLAN
+      itsm_ci_type: cmdb_ci_netgear
+    - dcm_resource_type: Storage.Volume
+      itsm_ci_type: cmdb_ci_disk
+    - dcm_resource_type: Kubernetes.Cluster
+      itsm_ci_type: cmdb_ci_kubernetes_cluster
+```
+
+**Inbound: CAB approval flow**
+
+```
+ServiceNow Change Advisory Board approves CHG0012345
+  │
+  ▼ ServiceNow Business Rule fires on state change → "Approved"
+  │   Calls DCM webhook: POST 
/api/v1/admin/approvals/{uuid}:vote
+  │   Headers: X-ServiceNow-Signature:
+  │   Body: { decision: "approve", recorded_via: "servicenow",
+  │           external_reference: "CHG0012345" }
+  │
+  ▼ DCM verifies HMAC signature against secret_credential_uuid
+  │   Records approval vote
+  │   Pipeline resumes if tier satisfied
+```
+
+### 2.2 Jira Service Management (Atlassian)
+
+**API:** REST API v3, Atlassian Connect webhooks, Automation rules
+
+```yaml
+itsm_provider_registration:
+  provider_handle: "jira-service-mgmt-prod"
+  itsm_system: jira_service_management
+  endpoint_url: "https://corp.atlassian.net"
+  api_version: "3"
+  auth_credential_uuid:           # API token or OAuth2
+
+  supported_actions:
+    - create_change_request       # → Jira issue (Change type)
+    - update_change_request
+    - close_change_request
+    - create_incident             # → Jira issue (Incident type)
+    - create_service_request      # → Jira issue (Service Request type)
+    - inbound_approval            # Jira Change approval → DCM vote
+
+  change_request_template:
+    project_key: "OPS"
+    issue_type: "Change"
+    summary: "DCM: {resource_type} '{entity_handle}'"
+    description: |
+      *Requested by:* {actor_handle}
+      *Tenant:* {tenant_handle}
+      *DCM Request UUID:* {request_uuid}
+      *Catalog Item:* {catalog_item_handle}
+    priority: "Medium"
+    labels: ["dcm-automated", "{tenant_handle}"]
+
+  inbound_approval:
+    webhook_url: "https://dcm.corp/api/v1/admin/approvals/{approval_uuid}:vote"
+    trigger_on: "issue.status → 'Approved'"
+    decision_map:
+      "Approved": "approve"
+      "Declined": "reject"
+    external_reference_field: "key"      # → OPS-4821
+```
+
+### 2.3 BMC Remedy / Helix ITSM
+
+**API:** REST API (Remedy AR System REST), webhook callbacks
+
+```yaml
+itsm_provider_registration:
+  provider_handle: "bmc-helix-prod"
+  itsm_system: bmc_helix
+  endpoint_url: "https://remedy.corp.example.com/api/arsys/v1"
+  api_version: "v1"
+
+  supported_actions:
+    - create_change_request       # → CHG:Infrastructure Change
+    - update_change_request
+    - 
close_change_request
+    - create_incident             # → HPD:Help Desk
+    - update_cmdb_ci              # → AST:Config Item
+    - inbound_approval
+
+  change_request_template:
+    form: "CHG:Infrastructure Change"
+    Location_Company: "{tenant_handle}"
+    Summary: "DCM: Provision {resource_type} '{entity_handle}'"
+    Categorization_Tier_1: "Infrastructure"
+    Categorization_Tier_2: "Provisioning"
+    Change_Type: "Normal"
+```
+
+### 2.4 Freshservice
+
+```yaml
+itsm_provider_registration:
+  provider_handle: "freshservice-prod"
+  itsm_system: freshservice
+  endpoint_url: "https://corp.freshservice.com/api/v2"
+
+  supported_actions:
+    - create_change_request
+    - update_change_request
+    - close_change_request
+    - create_incident
+    - create_service_request
+
+  change_request_template:
+    type: "Normal"
+    risk: "Low"
+    impact: "Low"
+    subject: "DCM: {resource_type} '{entity_handle}'"
+    description: "Tenant: {tenant_handle} | Actor: {actor_handle} | Request: {request_uuid}"
+    group_id:
+```
+
+### 2.5 PagerDuty (Incident Management)
+
+```yaml
+itsm_provider_registration:
+  provider_handle: "pagerduty-prod"
+  itsm_system: pagerduty
+  endpoint_url: "https://api.pagerduty.com"
+
+  supported_actions:
+    - create_incident             # for DCM failures, drift, security events
+    - update_incident
+    - close_incident
+
+  # PagerDuty Events API v2
+  incident_template:
+    service_id:
+    escalation_policy_id:
+    payload:
+      summary: "DCM {event_type}: {entity_handle}"
+      severity: "{{ drift_severity | map: critical→critical, significant→error, moderate→warning, minor→info }}"
+      source: "dcm"
+      custom_details:
+        entity_uuid: "{entity_uuid}"
+        tenant: "{tenant_handle}"
+        dcm_event: "{event_type}"
+```
+
+### 2.6 Generic REST (Custom ITSM)
+
+For ITSM systems not natively supported, the `generic_rest` type allows template-based HTTP calls:
+
+```yaml
+itsm_provider_registration:
+  provider_handle: "custom-itsm-prod"
+  itsm_system: generic_rest
+  endpoint_url: "https://itsm.corp.example.com/api"
+
+  action_templates:
+    - 
action: create_change_request
+      method: POST
+      path: "/changes"
+      headers:
+        Content-Type: "application/json"
+        X-API-Key: "{{ credential_value }}"
+      body_template: |
+        {
+          "title": "DCM: {{ resource_type }} '{{ entity_handle }}'",
+          "requested_by": "{{ actor_handle }}",
+          "category": "Infrastructure",
+          "external_id": "{{ request_uuid }}"
+        }
+      response_id_path: "$.id"    # JSONPath to extract record ID from response
+
+    - action: inbound_approval
+      inbound_field: "status"
+      decision_map:
+        "approved": "approve"
+        "rejected": "reject"
+```
+
+---
+
+## 3. ITSM Policy Type
+
+### 3.1 What an ITSM Policy Is
+
+An **ITSM Policy** is a new DCM Policy output type (8th, alongside GateKeeper, Validation, Transformation, Recovery, Orchestration Flow, Governance Matrix Rule, and Lifecycle Policy).
+
+It fires as a **side-effect policy** — it does not block pipeline execution (it is not a GateKeeper) and does not transform the payload. It fires on a DCM event and triggers an ITSM action via a registered ITSM integration. The pipeline continues whether or not the ITSM action succeeds; ITSM failures are logged and alerted but do not block DCM operations.
+
+**Key distinction:** An ITSM Policy is about *record-keeping and integration* with external governance systems. A GateKeeper Policy is about *allowing or blocking* operations. These are complementary, not competing. 
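
The side-effect behaviour described in Section 3.1 can be sketched in Python. This is a minimal illustration under stated assumptions, not DCM code: `fire_itsm_action`, `run_pipeline_step`, the `send` callable, and the in-memory `alerts` list are hypothetical names. Only the `on_failure` values `log_and_continue` and `alert_and_continue` are taken from this document.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("itsm_policy_sketch")


def fire_itsm_action(
    send: Callable[[dict], str],
    action: dict,
    on_failure: str = "alert_and_continue",
    alerts: Optional[list] = None,
) -> Optional[str]:
    """Call a (hypothetical) ITSM integration without ever raising to the caller.

    Returns the external record ID on success, or None on failure.
    """
    try:
        return send(action)
    except Exception as exc:  # any ITSM failure is contained here
        log.warning("ITSM action %s failed: %s", action.get("action"), exc)
        # Assumption: "alert" means appending to a list the caller inspects.
        if on_failure == "alert_and_continue" and alerts is not None:
            alerts.append({"action": action.get("action"), "error": str(exc)})
        return None


def run_pipeline_step(send, action, on_failure="alert_and_continue") -> dict:
    """Dispatch proceeds whether or not the ITSM side effect succeeded."""
    alerts: list = []
    record_id = fire_itsm_action(send, action, on_failure, alerts)
    return {"dispatched": True, "itsm_record_id": record_id, "alerts": alerts}
```

The point of the design is that the ITSM call sits behind a boundary that converts every failure into a log entry or an alert, so the calling pipeline never has to handle an ITSM exception.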
+
+### 3.2 Output Schema
+
+```yaml
+# ITSM Policy output schema
+itsm_policy_output:
+  type: itsm_action               # new output type identifier
+
+  # Required
+  itsm_provider_uuid:             # which ITSM integration to call
+  action: create_change_request | update_change_request | close_change_request |
+          create_incident | update_incident | close_incident |
+          update_cmdb_ci | create_cmdb_ci | retire_cmdb_ci |
+          create_service_request | link_parent_record
+
+  # Payload — fields to pass to ITSM integration
+  # Supports template variables from the triggering event payload
+  action_payload:
+    :
+
+  # How to handle ITSM failure
+  on_failure: log_and_continue | alert_and_continue | alert_only
+
+  # Store the ITSM record reference on the DCM entity (optional)
+  store_reference_on_entity:
+  reference_label:                # human-readable label for the reference
+
+  # Require ITSM record creation before dispatch (optional — see note)
+  block_until_created:            # default: false
+  block_timeout:                  # max wait if block_until_created: true
+```
+
+> **`block_until_created`:** When `true`, the ITSM Policy behaves like a pre-dispatch gate — DCM waits for the ITSM record to be created before dispatching to the Service Provider. This is used when organizational policy requires a change record to exist before any provisioning begins. When `false` (default), the ITSM record is created in parallel with or after dispatch — suitable for notification-only use cases. 
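
The `block_until_created` gate and its timeout release (see ITSM-005) might look roughly like the following Python sketch. The function names, the `poll` callable, and the use of seconds instead of an ISO 8601 duration such as `PT30M` are all assumptions made for illustration; the injectable `clock` and `sleep` exist only so the behaviour can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


def wait_for_itsm_record(
    poll: Callable[[], Optional[str]],
    block_timeout_s: float,
    poll_interval_s: float = 0.01,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the ITSM record ID is known or the timeout elapses."""
    deadline = clock() + block_timeout_s
    while True:
        record_id = poll()
        if record_id is not None:
            return record_id
        if clock() >= deadline:
            return None  # timeout: the caller releases the block
        sleep(poll_interval_s)


def dispatch_with_gate(
    poll: Callable[[], Optional[str]],
    block_until_created: bool,
    block_timeout_s: float = 0.0,
    **wait_kwargs,
) -> dict:
    """Dispatch always happens; the gate only decides how long to wait first."""
    if not block_until_created:
        return {"dispatched": True, "itsm_record_id": None, "timed_out": False}
    record_id = wait_for_itsm_record(poll, block_timeout_s, **wait_kwargs)
    return {
        "dispatched": True,
        "itsm_record_id": record_id,
        "timed_out": record_id is None,
    }
```

In this sketch a timeout is reported through `timed_out` rather than by raising, which mirrors the rule that a blocked pipeline is never permanently stalled by ITSM unavailability.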
+
+### 3.3 Example Policies
+
+#### Policy 1: Create Change Request on Dispatch (ServiceNow)
+
+```yaml
+policy_handle: "create-change-on-dispatch"
+policy_type: itsm_action
+enforcement_level: soft
+status: active
+
+match:
+  payload_type: request.dispatched
+  conditions:
+    - field: resource_type
+      operator: in
+      value: [Compute.VirtualMachine, Storage.Volume, Network.VLAN]
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: create_change_request
+  action_payload:
+    short_description: "DCM: Provision {{ resource_type }} '{{ entity_handle }}'"
+    description: |
+      Automated provisioning via DCM.
+      Request UUID: {{ request_uuid }}
+      Actor: {{ actor_handle }}
+      Tenant: {{ tenant_handle }}
+      Catalog Item: {{ catalog_item_handle }}
+    risk: "{{ risk_score | map: <25→'Low', <60→'Medium', else→'High' }}"
+  store_reference_on_entity: true
+  reference_label: "Change Request"
+  on_failure: alert_and_continue
+```
+
+#### Policy 2: Block Dispatch Until Change Record Exists (Compliance Gate)
+
+```yaml
+policy_handle: "require-change-record-before-dispatch"
+policy_type: itsm_action
+enforcement_level: hard
+status: active
+
+match:
+  payload_type: request.layers_assembled
+  conditions:
+    - field: tenant_handle
+      operator: in
+      value: [payments-team, pci-scope-team]
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: create_change_request
+  action_payload:
+    short_description: "DCM: {{ resource_type }} provision — {{ tenant_handle }}"
+    change_type: "Normal"
+    assignment_group: "Change Advisory Board"
+  store_reference_on_entity: true
+  reference_label: "Change Request (PCI Scope)"
+  block_until_created: true
+  block_timeout: PT30M
+  on_failure: alert_and_continue
+```
+
+#### Policy 3: Update CMDB on Realization
+
+```yaml
+policy_handle: "sync-cmdb-on-realization"
+policy_type: itsm_action
+status: active
+
+match:
+  payload_type: entity.realized
+  conditions:
+    - field: resource_type
+      operator: in
+      value: [Compute.VirtualMachine, 
Compute.BareMetalServer]
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: create_cmdb_ci
+  action_payload:
+    name: "{{ entity_handle }}"
+    ip_address: "{{ realized_fields.primary_ip }}"
+    os: "{{ realized_fields.os_family }}"
+    managed_by: "DCM"
+    environment: "{{ tenant_handle }}"
+    correlation_id: "{{ entity_uuid }}"
+  store_reference_on_entity: true
+  reference_label: "CMDB CI"
+  on_failure: alert_and_continue
+```
+
+#### Policy 4: Create Incident on Drift (Jira)
+
+```yaml
+policy_handle: "create-incident-on-critical-drift"
+policy_type: itsm_action
+status: active
+
+match:
+  payload_type: drift.detected
+  conditions:
+    - field: drift_severity
+      operator: in
+      value: [significant, critical]
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: create_incident
+  action_payload:
+    summary: "DCM Drift: {{ entity_handle }} — {{ drift_severity }}"
+    description: |
+      DCM has detected significant configuration drift.
+      Entity: {{ entity_handle }} ({{ entity_uuid }})
+      Severity: {{ drift_severity }}
+      Drifted fields: {{ drifted_fields | count }} fields
+      Detected at: {{ discovered_at }}
+      View in DCM: https://dcm.corp/resources/{{ entity_uuid }}/drift
+    priority: "{{ drift_severity | map: critical→'Highest', significant→'High' }}"
+    labels: ["dcm-drift", "{{ resource_type | slugify }}"]
+  store_reference_on_entity: true
+  reference_label: "Drift Incident"
+  on_failure: log_and_continue
+```
+
+#### Policy 5: Retire CMDB CI on Decommission
+
+```yaml
+policy_handle: "retire-cmdb-on-decommission"
+policy_type: itsm_action
+status: active
+
+match:
+  payload_type: entity.decommissioned
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: retire_cmdb_ci
+  action_payload:
+    correlation_id: "{{ entity_uuid }}"   # find CI by DCM entity UUID
+    install_status: "7"                   # ServiceNow: Retired
+    retired_at: "{{ event_timestamp }}"
+    decommission_reason: "DCM decommission — {{ actor_handle }}"
+  on_failure: alert_and_continue
+```
+
+#### Policy 
6: Close Change Record on Completion
+
+```yaml
+policy_handle: "close-change-on-completion"
+policy_type: itsm_action
+status: active
+
+match:
+  payload_type: request.realized
+  conditions:
+    - field: entity.itsm_references[?(@.record_type=='change_request')].record_id
+      operator: exists
+
+output:
+  type: itsm_action
+  itsm_provider_uuid:
+  action: close_change_request
+  action_payload:
+    state: "3"                    # ServiceNow: Closed
+    close_code: "Successful"
+    close_notes: "Provisioning completed successfully by DCM. Entity: {{ entity_uuid }}"
+  on_failure: log_and_continue
+```
+
+---
+
+## 4. ITSM integration System Policies
+
+| Policy | Rule |
+|--------|------|
+| `ITSM-001` | ITSM integrations implement the base Provider contract (PRV-001) including registration, health check, sovereignty declaration, and zero trust authentication. ITSM system connectivity credentials must reference a registered credential management service — no plaintext credentials in provider registration. |
+| `ITSM-002` | DCM does not require ITSM integration to function. ITSM Policies with `on_failure: alert_and_continue` (the default) never block DCM pipeline execution. Organizations must explicitly set `block_until_created: true` to gate pipeline on ITSM record creation. |
+| `ITSM-003` | Inbound events from ITSM systems must be authenticated. ITSM integrations must verify HMAC signatures or OAuth tokens on all inbound webhooks before forwarding to DCM. Unauthenticated inbound events are rejected and logged. |
+| `ITSM-004` | ITSM record references stored on DCM entities follow entity lifecycle — they are included in the Realized State record, preserved through updates, and retained in the decommissioned entity record for audit purposes. |
+| `ITSM-005` | ITSM Policies that use `block_until_created: true` must declare a `block_timeout`. 
If the ITSM system does not confirm record creation within the timeout, the policy fires `on_failure` behavior and the block is released — the pipeline continues. A blocked pipeline is never permanently stalled by ITSM unavailability. |
+| `ITSM-006` | Field mappings between DCM entities and ITSM CI types must be declared in the ITSM integration capability registration. Unmapped resource types are silently skipped for CMDB sync actions. |
+| `ITSM-007` | Template expressions in ITSM Policy `action_payload` fields must resolve using values from the triggering event payload. Template expressions that reference unavailable fields produce a warning in the audit record and substitute an empty string. They do not block ITSM action execution. |
+
+---
+
+## 5. ITSM Policy System Policies
+
+| Policy | Rule |
+|--------|------|
+| `ITSM-POL-001` | ITSM Policies follow the full Policy base contract (B-policy-contract.md): lifecycle (developing → proposed → active), shadow mode validation, audit obligation on every evaluation, domain precedence. |
+| `ITSM-POL-002` | ITSM Policies are side-effect policies — they do not produce pipeline decisions (allow/deny/transform). They may not be used as GateKeeper substitutes except through the explicit `block_until_created: true` mechanism, which has its own timeout guarantee (ITSM-005). |
+| `ITSM-POL-003` | ITSM Policy evaluation is recorded in the audit trail. The audit record includes: policy handle, matched event, ITSM provider UUID, action requested, ITSM record ID returned, and outcome (success/failure/timeout). |
+| `ITSM-POL-004` | Multiple ITSM Policies may fire on the same event. All fire independently — one policy's failure does not prevent other ITSM Policies from executing. |
+
+---
+
+## 6. 
Additions to the Foundations Document
+
+The foundations document provider type table gains a 12th row:
+
+| Provider Type | Capability | Data direction |
+|--------------|-----------|----------------|
+| **ITSM integration** | Bidirectional integration with ITSM systems; creates/updates ITSM records from DCM events; routes ITSM approvals back to DCM | DCM → ITSM (outbound) / ITSM → DCM (inbound) |
+
+The foundations document policy type table gains an 8th entry:
+
+| Policy Type | Output | Pipeline role |
+|------------|--------|---------------|
+| **ITSM Action** | Triggers action in connected ITSM system; optionally stores record reference on entity; optionally gates pipeline on record creation | Side-effect (non-blocking by default) |
+
+---
+
+## 7. Event Catalog Additions
+
+Two new events for the Event Catalog (doc 33):
+
+| Event Type | Urgency | Trigger |
+|-----------|---------|---------|
+| `itsm.record_created` | info | ITSM integration successfully created a record in external system |
+| `itsm.record_failed` | medium | ITSM integration failed to create/update record; `block_until_created` timeout reached |
+
+These extend the existing event catalog with a new `itsm.*` domain prefix.
+
+---
+
+## 8. Standards Catalog Addition
+
+ITSM integration standards and protocols used:
+
+| Standard | Use in DCM ITSM |
+|----------|----------------|
+| ServiceNow REST Table API | Primary integration for ServiceNow create/update/query |
+| Jira REST API v3 | Primary integration for Atlassian Jira Service Management |
+| BMC AR REST API v1 | Primary integration for BMC Remedy/Helix |
+| PagerDuty Events API v2 | Incident creation for alert-type ITSM integrations |
+| ITIL v4 Change Management | Conceptual framework for DCM change record lifecycle mapping |
+| JSON:API | Standard used by several ITSM REST APIs |
+| HMAC-SHA256 | Inbound webhook signature verification for all ITSM systems |
+
+---
+
+*Document maintained by the DCM Project. 
For questions or contributions see [GitHub](https://github.com/dcm-project).*
diff --git a/architecture/integrations/kessel-evaluation.md b/architecture/integrations/kessel-evaluation.md
new file mode 100644
index 0000000..6e02652
--- /dev/null
+++ b/architecture/integrations/kessel-evaluation.md
@@ -0,0 +1,470 @@
+# DCM — Kessel Integration Evaluation
+
+**Document Status:** 📋 Draft — For Discussion
+**Document Type:** Integration Evaluation — Pre-Implementation
+**Purpose:** This document evaluates the potential integration of DCM with the [Kessel project](https://github.com/project-kessel) for identity/access management and resource inventory. It is intended as a basis for discussion with the Kessel development team. **No architectural changes should be made to DCM based on this document until alignment with the Kessel team is confirmed.**
+
+**Related Documents:** [Auth Providers](https://github.com/croadfeldt/udlm/blob/main/governance/auth-providers.md) | [Universal Group Model](https://github.com/croadfeldt/udlm/blob/main/observability/universal-groups.md) | [Entity Relationships](https://github.com/croadfeldt/udlm/blob/main/entities/entity-relationships.md) | [Four States](https://github.com/croadfeldt/udlm/blob/main/foundations/four-states.md) | [Accreditation and Zero Trust](https://github.com/croadfeldt/udlm/blob/main/governance/accreditation-and-authorization-matrix.md) | [Control Plane Components](../control-plane/components.md) | [Provider Callback Authentication](https://github.com/croadfeldt/udlm/blob/main/contracts/provider-callback-auth.md)
+
+**Related Projects:** [project-kessel](https://github.com/project-kessel) | [SpiceDB](https://github.com/authzed/spicedb) | [Google Zanzibar](https://research.google/pubs/zanzibar-googles-consistent-global-authorization-system/)
+
+---
+
+## 1. 
Executive Summary
+
+Kessel is a Red Hat project providing two capabilities: **Kessel Relations** (Relationship-Based Access Control built on SpiceDB, a Google Zanzibar implementation) and **Kessel Asset Inventory** (a hybrid cloud resource state tracking service with a common Protobuf/gRPC API).
+
+DCM has architecturally similar needs in both areas. The evaluation concludes:
+
+- **Kessel Relations** has strong alignment with DCM's access control requirements. The permission model maps cleanly, and the operational benefits — Zanzibar-style consistency, scalable graph traversal, shared source of truth across Red Hat products — are meaningful. Integration path exists via DCM's Auth Provider abstraction.
+
+- **Kessel Inventory** has partial alignment with DCM's Discovered State store. The fit is real but narrower than it might appear: Kessel Inventory is a current-state snapshot system; DCM's inventory is a four-state lifecycle model with field-level provenance, drift detection, and append-only audit. Integration path exists via DCM's data store abstraction.
+
+**Recommended next step:** Discussion with the Kessel development team to validate assumptions, confirm schema extensibility for DCM-specific resource types, and understand the Kessel Relations API stability and sovereign/air-gapped deployment model.
+
+---
+
+## 2. What Kessel Provides
+
+### 2.1 Kessel Relations
+
+Kessel Relations is an authorization service built on [SpiceDB](https://github.com/authzed/spicedb), which implements the [Google Zanzibar](https://research.google/pubs/zanzibar-googles-consistent-global-authorization-system/) consistent global authorization model. 
+
+**Core model — Relationship-Based Access Control (ReBAC):**
+- Resources and subjects are defined in a typed schema
+- Relationships between subjects and resources are stored as tuples: `subject:X relation:Y object:Z`
+- Permissions are computed by evaluating the relationship graph: "does user X have permission `submit_request` on tenant T?" traverses all paths from X to T through groups, roles, and other relationships
+- Transitive relationships are handled natively: if user X is a member of group G, and group G has `admin` on tenant T, X inherits `admin` on T
+
+**Zanzibar consistency model:**
+- Snapshot reads: consistent reads at a point in time
+- Zookie tokens: causality tokens that guarantee "read your own writes" without requiring full global linearizability — after writing a relationship tuple, the response includes a zookie; subsequent reads with that zookie are guaranteed to observe the write
+
+**gRPC API surface (from the Kessel project):**
+- `CheckPermission(subject, permission, resource)` → allow/deny
+- `LookupResources(subject, permission, resource_type)` → list of resources subject has permission on
+- `LookupSubjects(resource, permission, subject_type)` → list of subjects that have permission on resource
+- `WriteRelationships(tuples)` → write relationship tuples
+- `DeleteRelationships(filter)` → remove relationship tuples
+
+### 2.2 Kessel Asset Inventory
+
+Kessel Asset Inventory is a resource tracking service designed to provide a unified inventory view across hybrid cloud infrastructure — OpenShift clusters, RHEL systems, edge devices, and other Red Hat-managed resources. 
+
+**Core model:**
+- Resources are described using a common Protobuf schema with a typed `ResourceType` and a `Spec` for type-specific fields
+- Current state is tracked as an upsertable snapshot — last-write wins
+- gRPC streaming API for push (providers send state updates) and pull (consumers query current state)
+- Integration with Kessel Relations for auth-filtered inventory queries: "what resources of type X does subject Y have access to?"
+
+**Intended use case:** Giving tools like ACM (Advanced Cluster Management), Insights, and the Hybrid Cloud Console a single query surface for "what exists across my estate?"
+
+---
+
+## 3. DCM's Current Model — What Needs to Be Understood
+
+Before evaluating integration, it is important to characterize what DCM already has in both areas.
+
+### 3.1 DCM's Authorization Model
+
+DCM's current authorization model has five components working together:
+
+**Auth Providers** (doc 19) — DCM delegates authentication to registered Auth Providers (LDAP, OIDC, FreeIPA, Active Directory, mTLS). Auth Providers are registered through the standard Provider contract. Multiple Auth Providers can be active simultaneously. Auth Providers return: authenticated actor identity, group memberships, roles.
+
+**Universal Group Model** (doc 15) — DCM groups (`DCMGroup`) are typed by `group_class`. The classes relevant to authorization:
+- `tenant_boundary` — the ownership and isolation boundary; every resource entity belongs to exactly one tenant
+- `cross_tenant_authorization` — the formal mechanism for Tenant A to grant Tenant B access to a specific resource
+- `policy_collection` — groups that activate policy sets
+- DCMGroup membership is the basis for role resolution and policy application
+
+**RBAC via role mapping** — Auth Providers map external groups to DCM roles (`consumer`, `platform_admin`, `sre`, etc.). The Policy Engine uses roles + group membership to evaluate access. 
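
The role mapping described above can be illustrated with a short Python sketch. Everything here is hypothetical: the `ldap:` group names, the `ROLE_MAP` table, and the `may_act` helper are invented for illustration, and the real Policy Engine evaluation is not specified in this document. Only the role names `consumer`, `sre`, and `platform_admin` come from the text.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from Auth Provider group names to DCM roles.
ROLE_MAP: Dict[str, str] = {
    "ldap:infra-users": "consumer",
    "ldap:infra-sre": "sre",
    "ldap:platform-admins": "platform_admin",
}


def resolve_roles(external_groups: Iterable[str], role_map: Dict[str, str]) -> Set[str]:
    """Translate external group names into DCM roles; unmapped groups grant nothing."""
    return {role_map[g] for g in external_groups if g in role_map}


def may_act(
    external_groups: Iterable[str],
    tenant_memberships: Set[str],
    tenant: str,
    required_role: str,
    role_map: Dict[str, str] = ROLE_MAP,
) -> bool:
    """Allow only when the actor holds the role AND belongs to the tenant."""
    roles = resolve_roles(external_groups, role_map)
    return required_role in roles and tenant in tenant_memberships
```

For example, `may_act(["ldap:infra-users"], {"payments-team"}, "payments-team", "consumer")` evaluates both conditions: the mapped role and the tenant membership.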
+
+**Five-check boundary model** (doc 26): Every interaction crosses five checks in sequence: identity verification → authorization → accreditation → data/capability matrix → sovereignty. Checks 1 and 2 cover identity and role-based authorization. Checks 3–5 are DCM-specific and involve accreditation records, data classification, and sovereignty zones.
+
+**Cross-tenant authorization records**: When Tenant A grants Tenant B access to a resource, a `cross_tenant_authorization` DCMGroup is created. The Policy Engine checks for the existence of this record when evaluating cross-tenant requests.
+
+**What DCM asks for in authorization decisions:**
+1. Does actor X have role Y within tenant T?
+2. What catalog items is actor X allowed to see? (RBAC-filtered list)
+3. Can actor X perform operation O on resource R? (role + tenant ownership)
+4. Does tenant T have a cross-tenant authorization to use resource R owned by tenant T2?
+5. Is actor X a member of DCMGroup G with the required quorum? (approval gates, `authorized` tier)
+
+### 3.2 DCM's Inventory Model
+
+DCM's inventory is the **Four States model** (doc 02). It differs from a general-purpose resource inventory in that each state has its own mutability and retention semantics.
+
+**Intent State**: The consumer's declared desired state. Stored as a GitOps artifact (PR-based workflow). Immutable after creation. It is not a snapshot; it is the authoritative record of what was requested and why.
+
+**Requested State**: The assembled payload after layer enrichment, policy evaluation, and placement resolution. Write-once. Contains the full data model payload that was dispatched to the provider, including field-level provenance tracing every value back to its source.
+
+**Realized State**: An append-only event stream of what the provider actually built. Every realization event is a new record, not an upsert. Contains field-level provenance from the provider. The relationship between a Realized State record and its corresponding Requested State record is explicit and mandatory.
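The append-only semantics of Realized State can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not DCM's store: the names `RealizedEvent`, `RealizedStateStore`, and `requested_state_id` are hypothetical, and a real store would also need persistence and integrity protection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RealizedEvent:
    entity_id: str
    requested_state_id: str      # mandatory link back to the Requested State record
    fields: Dict[str, str]       # what the provider reports it built
    provenance: Dict[str, str]   # field name -> source of that value


class RealizedStateStore:
    """Append-only: events are added, never updated or removed."""

    def __init__(self) -> None:
        self._events: List[RealizedEvent] = []

    def append(self, event: RealizedEvent) -> None:
        # Reject events that do not reference a Requested State record.
        if not event.requested_state_id:
            raise ValueError("realized event must reference a Requested State record")
        self._events.append(event)

    def history(self, entity_id: str) -> List[RealizedEvent]:
        # Full lifecycle of one entity, oldest first.
        return [e for e in self._events if e.entity_id == entity_id]

    def current(self, entity_id: str) -> Optional[RealizedEvent]:
        # The latest event is the current realized state; older ones are kept.
        events = self.history(entity_id)
        return events[-1] if events else None
```

The contrast with an upsert model is that `current` is derived from the history rather than overwriting it, so every earlier realization remains queryable.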
+
+**Discovered State**: An ephemeral snapshot of what the provider currently reports as existing, obtained through active discovery polling. Used by the Drift Reconciliation Component to compare against Realized State.
+
+**What DCM asks for in inventory decisions:**
+1. What is the current lifecycle state of entity UUID X? (Realized State read)
+2. What resources does tenant T own? (indexed query over Realized State)
+3. What entities have relationship R to entity X? (Entity Relationship Graph, doc 09)
+4. What is the field-level provenance of field F on entity X? (Realized State metadata)
+5. What entities are currently drifted? (Drift Record Store, DRC component output)
+6. What was discovered versus what is recorded as realized? (Drift comparison)
+7. What happened to entity X over its full lifecycle? (Audit Store, time-indexed)
+
+---
+
+## 4. Integration Analysis
+
+### 4.1 Kessel Relations: Authorization Backend
+
+#### 4.1.1 Mapping DCM's Permission Model to SpiceDB
+
+DCM's five authorization questions map to SpiceDB as follows:
+
+```
+// Proposed SpiceDB schema for DCM
+definition user {}
+
+definition group {
+  relation member: user | group#member
+  relation parent_group: group
+  // Named `membership` because a permission cannot share a name
+  // with a relation in the same definition.
+  permission membership = member + parent_group->membership
+}
+
+definition tenant {
+  relation member: user | group#member
+  relation admin: user | group#member
+  relation platform_admin: user | group#member
+  permission submit_request = member + admin + platform_admin
+  permission manage_resources = admin + platform_admin
+  permission administer = platform_admin
+}
+
+definition resource {
+  relation owner_tenant: tenant
+  relation authorized_tenant: tenant  // cross-tenant authorization
+  relation viewer: user | group#member
+  permission read = owner_tenant->member + authorized_tenant->member + viewer
+  permission modify = owner_tenant->admin
+  permission decommission = owner_tenant->admin
+}
+
+definition dcm_group {
+  relation member: user | group#member
+  // relation quorum_threshold: integer
+  // NOTE: not expressible. Relations reference object types, not scalar
+  // values. See Section 4.1.2.
+}
+```
+
+**Question 1** (does actor X have role Y in tenant T?) → `CheckPermission(user:X, permission:submit_request, tenant:T)`
+
+**Question 2** (what catalog items can actor X see?) → `LookupResources(user:X, permission:read, resource_type:catalog_item)` (requires a `catalog_item` definition, which the schema above does not yet include)
+
+**Question 3** (can actor X do operation O on resource R?) → `CheckPermission(user:X, permission:modify, resource:R)`
+
+**Question 4** (does tenant T have cross-tenant authorization on resource R?) → `CheckPermission(tenant:T#member, permission:read, resource:R)`, satisfied if an `authorized_tenant` relationship exists
+
+**Question 5** (approval gate quorum): **Does not map cleanly to SpiceDB.** See Section 4.1.2.
+
+#### 4.1.2 Approval Gate Quorum: The Gap
+
+DCM's `authorized` tier approval requires N of M members of a declared DCMGroup to record decisions before an operation proceeds. SpiceDB is a membership and permission graph: it answers "does this subject have this permission?" but it does not count decisions or track quorum state across time.
+
+**Resolution:** The approval gate workflow stays in DCM's Policy Engine regardless of Kessel integration. Kessel Relations handles who *can* approve (membership in the DCMGroup); DCM's Approval Store tracks who *has* approved and whether quorum is reached.
+
+This is a clean boundary: Kessel answers the structural question ("is this actor authorized to vote?"); DCM answers the state question ("how many valid votes have been recorded?").
+
+#### 4.1.3 DCM's Entity Relationship Graph is NOT an Authorization Graph
+
+This is a critical distinction. DCM's entity relationships (`requires`, `constituent`, `shareable`, `allocated_from`, `peer`) are **operational relationships between infrastructure resources**, not access control relationships. They express: "VM X requires Storage Y", "Composite C has constituent VM X."
+
+These must **not** be stored in Kessel Relations.
+They are:
+- Semantically different from access control (lifecycle implications, not permissions)
+- DCM-specific (not meaningful to any other system consuming Kessel)
+- Owned by DCM's entity lifecycle model
+
+DCM's Entity Relationship Graph (doc 09) remains entirely in DCM regardless of Kessel integration.
+
+#### 4.1.4 Checks 3–5 of the Five-Check Boundary Model
+
+DCM's five-check boundary model (identity → authorization → accreditation → data matrix → sovereignty) maps to Kessel Relations only for checks 1 and 2. Checks 3–5 are DCM-specific:
+
+- **Accreditation** (check 3): Does the target provider hold the required accreditation for the data classification present? This involves DCM's Accreditation Registry and is not a subject/permission/resource question.
+- **Data/Capability Matrix** (check 4): Is each field permitted to cross this boundary given its classification? This involves DCM's Governance Matrix policies.
+- **Sovereignty** (check 5): Is the target endpoint within the sovereignty boundary? This involves DCM's Sovereignty Zone declarations.
+
+None of checks 3–5 can be delegated to Kessel Relations. They remain in DCM's Policy Engine.
+
+#### 4.1.5 Integration Path via Auth Provider Abstraction
+
+DCM's Auth Provider abstraction (doc 19) is the natural integration point.
+Kessel Relations would register as a DCM Auth Provider or External Policy Evaluator:
+
+```yaml
+kessel_relations_auth_provider:
+  provider_type: auth_provider
+  auth_mode: kessel_rebac
+  endpoint: https://kessel-relations.internal:9000
+  schema_ref:
+
+  # What this provider handles:
+  handles:
+    - check_permission    # CheckPermission calls
+    - lookup_resources    # LookupResources calls
+    - lookup_subjects     # LookupSubjects calls
+
+  # What stays in DCM's Policy Engine:
+  does_not_handle:
+    - accreditation_checks
+    - data_classification_matrix
+    - sovereignty_checks
+    - approval_gate_quorum
+```
+
+DCM's Policy Engine calls the Kessel Relations provider for authorization questions (checks 1 and 2) and evaluates checks 3–5 internally. The five-check sequence is preserved; only the implementation of checks 1–2 changes.
+
+**Zookie handling:** DCM's API Gateway must thread zookie tokens through the request lifecycle. When a relationship is written (e.g., a new cross-tenant authorization is created), the resulting zookie is stored and passed to subsequent permission checks in the same request context, so that those checks are guaranteed to observe the write.
+
+---
+
+### 4.2 Kessel Inventory: Discovered State Store
+
+#### 4.2.1 The Fit
+
+Of DCM's four stores, **Discovered State** is the only one Kessel Inventory could plausibly replace.
+The reasons:
+
+- Discovered State is the most ephemeral store: it is overwritten on each discovery cycle
+- Discovered State does not require immutability or append-only semantics: it represents "what the provider reports right now"
+- Discovered State is the "current state of infrastructure", which is exactly what Kessel Inventory is designed to track
+- Other Red Hat tools consuming Kessel Inventory would benefit from seeing the same discovered state that DCM uses for drift detection
+
+The other three stores (Intent, Requested, and Realized) **cannot** be replaced by Kessel Inventory:
+- Intent and Requested State require GitOps semantics (PR workflow, immutability, version history)
+- Realized State requires append-only event stream semantics with field-level provenance and hash chain integrity
+- None of DCM's lifecycle or audit requirements are in scope for Kessel Inventory
+
+#### 4.2.2 The Schema Alignment Question
+
+DCM's Discovered State uses the same unified data model format as Realized State: the DCM Resource Type Spec schema. Kessel Inventory uses a Protobuf-defined common resource schema.
+
+For standard resource types (Compute, Network, and Storage, which map to well-known infrastructure concepts), the alignment is likely achievable. For DCM-specific resource types (Automation.AnsiblePlaybook, Platform.KubernetesCluster, custom org-defined types), schema extension or mapping is required.
+
+**Open question for Kessel team:** How extensible is the Kessel Inventory resource type schema? Can DCM register custom resource types? Is there a type registry mechanism analogous to DCM's Resource Type Registry?
+
+#### 4.2.3 Drift Detection Logic Stays in DCM
+
+Kessel Inventory is a state store, not a drift detection system.
+Even if DCM uses Kessel Inventory as the Discovered State store, the Drift Reconciliation Component (doc 25, DRC domain) remains entirely in DCM:
+
+- DRC queries Kessel Inventory for current discovered state
+- DRC compares discovered state against DCM's Realized State
+- DRC classifies differences by field criticality and change magnitude
+- DRC produces Drift Records with SECURITY_DEGRADATION, BROKEN_REFERENCE, UNSANCTIONED_CHANGE classifications
+- DRC writes Drift Records to DCM's Drift Record Store
+
+Kessel Inventory's role is purely as the data source for the "what currently exists" side of the comparison. The intelligence stays in DCM.
+
+#### 4.2.4 Integration Path via Data Store Abstraction
+
+DCM's data store abstraction (doc 11) is the natural integration point. The Discovered Store would be implemented as a `storage_sub_type: snapshot_store` data store backed by Kessel Inventory:
+
+```yaml
+kessel_inventory_(prescribed infrastructure):
+  provider_type: (prescribed infrastructure)
+  storage_sub_type: snapshot_store
+  backend: kessel_inventory
+  endpoint: https://kessel-inventory.internal:9001
+
+  # DCM uses this provider for:
+  used_for: discovered_state
+
+  # Write contract: provider calls POST /api/v1/instances/{id}/status
+  # which DCM translates to Kessel Inventory upsert
+  write_model: upsert_current_state
+
+  # Read contract: DRC queries Kessel for discovered state
+  read_model: streaming_query_by_type_and_tenant
+```
+
+This means the Kessel Inventory integration requires **no changes to DCM's data model**, only a new data store implementation. The Drift Reconciliation Component calls the same Discovered State Store interface; the underlying implementation happens to be Kessel Inventory.
+
+---
+
+## 5. Deployment and Sovereignty Considerations
+
+### 5.1 Air-Gapped and Sovereign Deployments
+
+DCM's `sovereign` profile requires air-gapped operation with no external dependencies.
+Any Kessel integration must support:
+
+- Local/on-premises Kessel deployment (not cloud-hosted)
+- Offline operation when Kessel is temporarily unavailable (cached authorization decisions for read-only operations)
+- mTLS between DCM and Kessel instances
+
+**Open question for Kessel team:** What is Kessel's deployment model for sovereign/air-gapped environments? Is there a supported on-premises deployment path? What is the operational footprint?
+
+### 5.2 Multi-Instance Federation
+
+DCM supports federation between multiple DCM instances (doc 22). A federated deployment may have multiple Kessel Relations instances (one per region or sovereignty zone) or a single shared instance.
+
+**Open question for Kessel team:** How does Kessel Relations handle multi-region replication? Can SpiceDB schema and relationship data be replicated across sovereignty boundaries? What are the consistency guarantees in a federated topology?
+
+### 5.3 Failure Mode Analysis
+
+If Kessel Relations is unavailable, DCM cannot evaluate authorization checks 1–2 of the five-check model, which means DCM cannot process any requests. This is a critical dependency.
+
+**Required mitigation strategies:**
+- Read-through cache for CheckPermission results (short TTL, profile-governed)
+- Circuit breaker: if Kessel is unavailable for >N consecutive checks, DCM enters a safe-deny mode (no new requests accepted) rather than a fail-open mode
+- Kessel Relations HA deployment is a prerequisite, not optional
+
+**Open question for Kessel team:** What HA and disaster recovery patterns are recommended for production Kessel Relations deployments?
+
+---
+
+## 6. Questions for the Kessel Team
+
+The following questions should be addressed before any integration work begins:
+
+### 6.1 Kessel Relations
+
+| # | Question | Why It Matters |
+|---|----------|----------------|
+| 1 | What is the current API stability level of the Kessel Relations gRPC API? Are breaking changes expected? | DCM needs a stable contract to build against |
+| 2 | Does Kessel Relations support on-premises / air-gapped deployment? What is the operational footprint? | Required for DCM's `sovereign` profile |
+| 3 | How does the SpiceDB schema evolve? Is there a migration path when the DCM permission model changes? | Schema evolution is a production concern |
+| 4 | Can Kessel Relations store relationships at the scale DCM requires? How many relationship tuples per tenant at what query latency? | DCM may have thousands of cross-tenant authorization records per deployment |
+| 5 | How does Kessel handle the zookie (consistency token) lifecycle? Are zookies scoped to a namespace/tenant, or global? | Relevant to DCM's multi-tenant isolation model |
+| 6 | Is Kessel Relations multi-tenant natively, or does DCM need to namespace its SpiceDB schema? | Critical for DCM's tenant isolation requirements |
+| 7 | What is the intended integration pattern for other Red Hat products (ACM, Insights)? How would DCM's usage interoperate? | Kessel's value to DCM is partly the shared source of truth across Red Hat products |
+| 8 | Does Kessel Relations have a concept equivalent to DCM's "cross-tenant authorization"? How are trust grants between tenants modeled? | Core to DCM's resource sharing model |
+
+### 6.2 Kessel Inventory
+
+| # | Question | Why It Matters |
+|---|----------|----------------|
+| 9 | How extensible is Kessel Inventory's resource type schema? Can DCM register custom resource types? | DCM has domain-specific resource types not in Kessel's default schema |
+| 10 | What is the write model? Last-write-wins upsert, or versioned? Does Kessel Inventory support the discovered state pattern (full overwrite on each discovery cycle)? | DCM's Discovered Store is a full-replacement snapshot per discovery cycle |
+| 11 | How does Kessel Inventory integrate with Kessel Relations for auth-filtered queries? Is the integration already built, or planned? | Core to the value of using Kessel Inventory |
+| 12 | What is the data retention model? Does Kessel Inventory keep history or only current state? | DCM needs "current state" only for Discovered State; history is in DCM's Audit Store |
+| 13 | What is the API stability level for Kessel Inventory? | Same concern as #1 for Relations |
+| 14 | Is there a reference implementation of a Kessel Inventory provider for a Kubernetes/OpenShift resource type? | DCM would follow this pattern for its Service Providers |
+
+### 6.3 Joint Architecture Questions
+
+| # | Question | Why It Matters |
+|---|----------|----------------|
+| 15 | Is the Kessel project open to DCM contributing Resource Type definitions and SpiceDB schema extensions to the upstream? | Reduces divergence risk; benefits broader community |
+| 16 | How does Kessel handle sovereign data (data that must not cross jurisdictional boundaries)? | Critical for DCM's sovereignty model |
+| 17 | What is the recommended pattern for bootstrapping the Kessel-DCM trust relationship? (mTLS? OIDC? Service account?) | Required for DCM's zero-trust model |
+| 18 | Does Kessel have a compatibility matrix for Red Hat platform versions (OpenShift, RHEL)? | DCM targets the same platforms |
+
+---
+
+## 7. Proposed Integration Architecture (Pending Kessel Alignment)
+
+This section describes the target architecture **conditional on positive answers to the questions in Section 6**. It should not be implemented until validated with the Kessel team.
+
+### 7.1 Kessel Relations as DCM Auth Provider
+
+```
+DCM Request Pipeline:
+  │
+  ▼  Auth Provider (Kessel Relations):
+  │    Check 1: identity verification via mTLS certificate
+  │    Check 2: CheckPermission(actor, operation, tenant/resource) via Kessel Relations gRPC
+  │      ← returns allow/deny + zookie token
+  │
+  ▼  DCM Policy Engine (internal):
+  │    Check 3: Accreditation check (DCM Accreditation Registry)
+  │    Check 4: Data/Capability Matrix (DCM Governance Matrix)
+  │    Check 5: Sovereignty check (DCM Sovereignty Zone registry)
+  │
+  ▼  All five checks pass → request proceeds to layer assembly
+```
+
+**Impact on DCM architecture:**
+- Auth Provider registration: new `auth_mode: kessel_rebac` in doc 19
+- Cross-tenant authorization DCMGroup: writes to both DCM Group Registry AND Kessel Relations tuple store
+- RBAC evaluation: replaced by Kessel Relations CheckPermission call for checks 1–2
+- Group membership sync: DCM Auth Providers (LDAP, OIDC) continue to manage authentication; group memberships are mirrored to Kessel Relations for use in permission evaluation
+
+### 7.2 Kessel Inventory as DCM Discovered State Store
+
+```
+Discovery Cycle:
+  │
+  ▼  Discovery Scheduler triggers provider discovery
+  │
+  ▼  Service Provider returns RealizedStatePayload stream
+  │    (current state in DCM Unified Data Model format)
+  │
+  ▼  Kessel Inventory data store:
+  │    Translates DCM format → Kessel Inventory Protobuf schema
+  │    Upserts to Kessel Inventory (replaces prior discovered state)
+  │
+  ▼  Drift Reconciliation Component (unchanged):
+  │    Queries Kessel Inventory for discovered state
+  │    Compares against DCM Realized State
+  │    Produces Drift Records (classification, severity, field detail)
+  │    Writes Drift Records to DCM Drift Record Store
+```
+
+**Impact on DCM architecture:**
+- Discovered State Store: implement as data store backed by Kessel Inventory
+- No changes to data model, drift detection logic, or Drift Reconciliation Component
+- Resource type mapping: DCM Resource Type Specs → Kessel Inventory resource types (new tooling required)
+
+---
+
+## 8. What Does Not Change Regardless of Integration
+
+The following DCM capabilities remain entirely in DCM regardless of how the Kessel integration develops:
+
+| Capability | Why it stays in DCM |
+|-----------|---------------------|
+| Intent State Store (GitOps) | GitOps semantics, PR workflow, immutability; not in scope for Kessel |
+| Requested State Store (write-once) | Assembled payload with full provenance; DCM-specific |
+| Realized State Store (append-only event stream) | Hash-chained, tamper-evident, field-level provenance; DCM-specific |
+| Approval gate quorum tracking | State-tracking across time; Kessel Relations answers membership, not quorum |
+| Five-check boundary model (checks 3–5) | Accreditation, data classification, sovereignty; DCM-specific |
+| Entity Relationship Graph | Operational relationships between resources, not access control |
+| Field-level provenance | Source tracking per field; not in scope for Kessel |
+| Drift detection logic and classification | DRC component; Kessel Inventory is a data source, not a drift engine |
+| Audit trail (hash chain) | Tamper-evident audit; DCM-specific requirement |
+| Resource lifecycle state machine | REQUESTED → OPERATIONAL → DECOMMISSIONED; DCM-specific |
+| Policy Engine | GateKeeper, Transformation, Recovery, Orchestration Flow policies; DCM-specific |
+| Authority Tier model | Approval routing; DCM-specific governance model |
+
+---
+
+## 9. System Policies (Proposed, Pending Validation)
+
+These policies should be reviewed and confirmed after Kessel team alignment:
+
+| Policy | Rule |
+|--------|------|
+| `KESSEL-001` | Kessel Relations, if registered as a DCM Auth Provider, handles authorization checks 1 and 2 of the five-check boundary model only. Checks 3–5 remain in DCM's Policy Engine and cannot be delegated. |
+| `KESSEL-002` | DCM's entity relationship graph (operational relationships between infrastructure resources) must never be stored in Kessel Relations. Only access-control relationships (actor→group→tenant→resource permissions) are stored in Kessel Relations. |
+| `KESSEL-003` | Kessel Inventory, if registered as a DCM data store for Discovered State, holds only ephemeral current-state snapshots. Intent, Requested, and Realized State stores remain in DCM-managed data stores. |
+| `KESSEL-004` | If Kessel Relations is unavailable, DCM enters safe-deny mode: no new requests are accepted. Fail-open behavior is not permitted under any profile. |
+| `KESSEL-005` | Zookie tokens from Kessel Relations CheckPermission responses must be threaded through the DCM request context to guarantee consistency across authorization checks within the same request. |
+| `KESSEL-006` | Cross-tenant authorization DCMGroups that are backed by Kessel Relations must be written atomically: the DCM Group Registry record and the Kessel Relations tuple must both succeed or both fail. Partial writes are treated as failures. |
+| `KESSEL-007` | DCM sovereign profile deployments require a locally-deployed Kessel instance. Cloud-hosted Kessel is not permitted for sovereign deployments. This requirement must be confirmed as feasible with the Kessel team. |
+
+---
+
+## 10. Open Items Before Integration Can Begin
+
+| # | Item | Owner | Blocking? |
+|---|------|-------|-----------|
+| 1 | Kessel team review of Section 6 questions | Kessel team | Yes |
+| 2 | Kessel Relations API stability confirmation | Kessel team | Yes |
+| 3 | Sovereign/air-gapped deployment validation | Kessel team | Yes (for sovereign profile) |
+| 4 | SpiceDB schema design review for DCM permission model | DCM + Kessel | Yes |
+| 5 | Kessel Inventory resource type extensibility confirmation | Kessel team | Yes (for inventory integration) |
+| 6 | HA/DR pattern review for production Kessel deployment | Kessel team | Yes |
+| 7 | DCM Auth Provider interface extension for `kessel_rebac` mode | DCM team | No (can design in parallel) |
+| 8 | DCM data store implementation for Kessel Inventory | DCM team | No (can design in parallel) |
+| 9 | Zookie lifecycle management design in DCM request pipeline | DCM team | No (can design in parallel) |
+| 10 | Resource type mapping: DCM Resource Type Specs → Kessel Inventory schema | DCM + Kessel | No (can design in parallel) |
+
+---
+
+*Document maintained by the DCM Project. For questions, contributions, or to schedule the Kessel alignment session, see [GitHub](https://github.com/dcm-project).*