
RBAC Spec

Gareth Price edited this page Mar 10, 2026 · 1 revision

Spec 0017: RBAC Core Services Enforcement And Policy Sync

Status

Proposed

Summary

Spec 0011 defines authorization_policy_backend as Phlo's runtime policy decision point (PDP). That only solves half of authorization.

Phlo also runs core services with native enforcement models:

  • Trino grants
  • PostgreSQL roles and policies
  • Hasura metadata permissions
  • MinIO IAM policies
  • Nessie authorization rules

This spec defines how one canonical RBAC model drives both:

  • runtime decisions through authorization_policy_backend
  • backend-native enforcement through governance_backend

This is platform-wide RBAC, not API-only RBAC. A user should get the same authorization outcome whether they use:

  • Observatory
  • Phlo API
  • a directly exposed core service such as Trino, Hasura, MinIO, or Nessie

Relationship To Existing Specs

  • Builds on Spec 0010: authentication provider capability.
  • Builds on Spec 0011: authorization policy backend capability.
  • Aligns with Spec 0012: capability-native API surface.

This spec does not replace Spec 0011. It defines how runtime authorization and service enforcement stay aligned.

Problem

Phlo currently has fragmented authorization behavior:

  • some API and service-layer paths ask authorization_policy_backend
  • some services enforce access through package-local configuration
  • some backend roles and grants drift outside Phlo control
  • operators do not have one reliable way to explain a deny

This creates predictable failures:

  • API allows an action the backend later denies
  • API denies an action backend state would allow
  • package-local role aliases drift from the product vocabulary
  • incident response is slow because logs and policies do not line up

Goals

  • Keep one canonical RBAC model for Phlo product actions.
  • Keep runtime decisions centralized in authorization_policy_backend.
  • Push backend policy through governance_backend adapters.
  • Support direct user access to core services with the same RBAC intent as API and Observatory.
  • Make compilation, apply, verify, and revert deterministic.
  • Detect and report drift between desired and actual backend state.
  • Make ownership, lifecycle, and failure rules explicit enough for junior implementation.

Non-Goals

  • Do not redesign authentication or identity-provider login flows.
  • Do not replace backend-native security models with one fake universal layer.
  • Do not promise row-level or column-level parity across every backend in phase one.
  • Do not expose provider-specific policy internals directly to frontend code.
  • Do not auto-install security-sensitive default providers as unrelated startup side effects.

Design Principles

  • Separate identity, decision, and enforcement concerns.
  • Fail closed for security-sensitive runtime paths.
  • Keep canonical action names stable and derive backend verbs from them.
  • Compile policy from canonical source into backend artifacts.
  • Prefer diff-first sync and explicit verify over implicit mutation.
  • Every deny and every sync failure leaves an audit breadcrumb.

Capability Boundary

This section is the main guardrail against implementation drift.

authentication_provider

Responsible for:

  • resolving caller identity
  • normalizing claims and subject information
  • producing the Principal consumed by authz checks

Not responsible for:

  • authorization decisions
  • backend grants or service-native policies

authorization_policy_backend

Responsible for:

  • answering allow or deny for canonical action plus resource
  • using canonical roles and Spec 0011 policy semantics
  • producing normalized reason codes and policy identifiers

Not responsible for:

  • mutating Trino, Postgres, Hasura, MinIO, or Nessie state
  • inventing package-local role models
  • bypassing backend-native enforcement

governance_backend

Responsible for:

  • compiling canonical RBAC intent into backend-native artifacts
  • applying those artifacts to concrete backends
  • verifying actual backend state against desired state
  • reverting known applied changes when possible

Not responsible for:

  • answering request-time allow or deny decisions for product code
  • replacing the PDP contract

Control Plane vs Runtime Plane

Keep the planes separate:

Runtime plane:
  caller via Observatory/API
    -> authentication_provider
        -> authorization_policy_backend
            -> allow/deny
                -> protected operation
                    -> backend-native enforcement still applies

  caller direct to Trino/Hasura/MinIO/Nessie
    -> backend-native identity
        -> backend-native enforcement derived from canonical RBAC

Control plane:
  canonical RBAC files
    -> authz sync controller
        -> governance backends
            -> backend-native grants / policies / roles

Important rules:

  • a successful PDP check is necessary but not sufficient; backend enforcement must also agree
  • direct access to a core service is in scope for this spec
  • shared superuser credentials are not sufficient for user-level RBAC

Core Model

Canonical RBAC lives in project-scoped configuration.

Suggested location:

  • .phlo/authorization/roles.yaml
  • .phlo/authorization/policies.yaml

These files are the source of truth for:

  • canonical roles
  • subject-to-role mapping
  • canonical allow and deny rules
  • policy version hash used by sync and audit logs
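The policy version hash should be deterministic for the same logical content. One way to derive it, sketched below, is a SHA-256 digest over a canonical serialization of both files; the function name and document shapes are illustrative, not part of the spec:

```python
import hashlib
import json

def policy_version_hash(roles_doc: dict, policies_doc: dict) -> str:
    """Deterministic hash over canonical RBAC source.

    Serializes both documents with sorted keys and stable separators
    so the same logical content always yields the same hash.
    """
    canonical = json.dumps(
        {"roles": roles_doc, "policies": policies_doc},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Key order in the parsed documents must not change the hash.
a = policy_version_hash({"version": 1, "roles": {"viewer": {}}}, {"version": 1})
b = policy_version_hash({"roles": {"viewer": {}}, "version": 1}, {"version": 1})
assert a == b
```

The same hash value then appears in sync reports and audit logs, so a deny can be traced back to the exact policy revision that produced it.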

roles.yaml

Defines role hierarchy and optional Phlo-managed subject assignment.

Example:

version: 1
roles:
  viewer: {inherits: []}
  analyst: {inherits: [viewer]}
  operator: {inherits: [analyst]}
  admin: {inherits: [operator]}
subjects:
  services: {phlo-api: [viewer], dagster-webserver: [operator]}

Rules:

  • role names are lowercase snake_case
  • inheritance is additive
  • inheritance cycles are invalid
  • unknown role references are invalid
  • package-local role aliases are not allowed without central registration

Authority rules:

  • role hierarchy is always defined centrally in roles.yaml
  • service principal assignment may be managed directly in roles.yaml
  • human user assignment in production should normally come from IdP claim or group mapping into canonical roles, not static per-user entries
  • static subjects.users entries are allowed only for development, bootstrap, or explicit break-glass profiles documented by the operator
  • if both IdP mapping and static user assignment are enabled for the same user, the environment profile must define precedence explicitly; silent merge rules are not allowed
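The additive-inheritance, no-cycles, no-unknown-references rules above can be validated in one pass. A minimal sketch (the function shape is illustrative; the real validator would consume the parsed roles.yaml):

```python
def expand_roles(roles: dict[str, list[str]]) -> dict[str, set[str]]:
    """Expand additive inheritance; reject cycles and unknown role references."""
    expanded: dict[str, set[str]] = {}

    def visit(role: str, stack: tuple[str, ...]) -> set[str]:
        if role not in roles:
            raise ValueError(f"unknown role reference: {role}")
        if role in stack:
            raise ValueError(f"inheritance cycle: {' -> '.join(stack + (role,))}")
        if role in expanded:  # already fully expanded
            return expanded[role]
        acc = {role}
        for parent in roles[role]:
            acc |= visit(parent, stack + (role,))
        expanded[role] = acc
        return acc

    for r in roles:
        visit(r, ())
    return expanded

hierarchy = {"viewer": [], "analyst": ["viewer"],
             "operator": ["analyst"], "admin": ["operator"]}
assert expand_roles(hierarchy)["admin"] == {"admin", "operator", "analyst", "viewer"}
```

Running this at validate time means the PDP and every compiler can assume a fully expanded, acyclic role set.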

policies.yaml

Defines canonical product-level permissions.

Example:

version: 1
policies:
  - policy_id: allow_analyst_dataset_read
    effect: allow
    principal: {roles: [analyst]}
    action: dataset.read
    resource: {type: dataset, id_pattern: analytics.*}

  - policy_id: deny_non_admin_service_manage
    effect: deny
    principal: {roles: [viewer, analyst, operator]}
    action: service.manage
    resource: {type: service, id_pattern: "*"}

Rules:

  • policy_id is unique and stable
  • explicit deny overrides allow
  • no match means deny
  • wildcards are allowed only for stable resource IDs or explicit low-cardinality attributes
  • policy files must be deterministic after parse and validation
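The deny-overrides and default-deny rules above can be captured in a small evaluation sketch. This assumes fnmatch-style wildcard matching for id_pattern and a flattened Policy shape; both are illustrative choices, not the Spec 0011 contract:

```python
import fnmatch
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    policy_id: str
    effect: str            # "allow" or "deny"
    roles: frozenset[str]
    action: str
    resource_type: str
    id_pattern: str

def decide(policies, principal_roles, action, resource_type, resource_id):
    """Deny-overrides evaluation: explicit deny wins, no match means deny."""
    matched = [
        p for p in policies
        if p.action == action
        and p.resource_type == resource_type
        and p.roles & principal_roles
        and fnmatch.fnmatch(resource_id, p.id_pattern)
    ]
    if any(p.effect == "deny" for p in matched):
        return "deny"          # explicit deny overrides allow
    if any(p.effect == "allow" for p in matched):
        return "allow"
    return "deny"              # no match means deny

pols = [Policy("allow_analyst_dataset_read", "allow", frozenset({"analyst"}),
               "dataset.read", "dataset", "analytics.*")]
assert decide(pols, {"analyst"}, "dataset.read", "dataset", "analytics.orders") == "allow"
assert decide(pols, {"analyst"}, "dataset.read", "dataset", "finance.payroll") == "deny"
```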

Canonical Actions And Resources

Start with the Spec 0011 vocabulary and extend only when required.

Baseline actions:

  • dataset.read
  • dataset.query
  • asset.read
  • asset.execute
  • service.read
  • service.manage
  • admin.read
  • admin.manage

Candidate future actions:

  • object.read
  • object.write
  • catalog.read
  • catalog.manage

New actions require:

  • docs update
  • PDP tests
  • compiler mapping tests
  • explicit backend ownership

Action semantics:

  • dataset.read: permission to discover and read the contents of a specific dataset resource through product surfaces or backend-native SELECT-equivalent access scoped to that dataset
  • dataset.query: permission to use an interactive or programmable query surface that can execute arbitrary queries over one or more datasets; this is broader than opening a known dataset for read
  • asset.read: permission to view asset metadata, health, and lineage
  • asset.execute: permission to materialize, run, or trigger asset work
  • service.read: permission to inspect service status or configuration
  • service.manage: permission to mutate service configuration or lifecycle
  • admin.read: permission to inspect administrative state
  • admin.manage: permission to mutate administrative state

Compiler rules for action semantics:

  • compilers must map every supported canonical action explicitly
  • if a backend exposes only one primitive for both dataset.read and dataset.query, the compiler must document that collapse and tests must prove the behavior
  • if the environment requires the distinction and the backend cannot express it, plan or validation must fail; the compiler must not silently broaden access
  • product surfaces that expose a free-form query editor should check dataset.query, not only dataset.read

Stable resource identifiers:

  • dataset: catalog.schema.table
  • asset: Phlo or Dagster asset key
  • service: service name such as trino
  • admin: admin domain such as service_settings
  • object: bucket and object prefix when object storage policies are in scope

Use tenant or attributes only for stable, low-cardinality facts relevant to authorization. Do not encode request payloads, secrets, or transient blobs.

Request Lifecycle

Required runtime sequence for any protected surface.

API Endpoints

  1. Resolve Principal through authentication_provider.
  2. Build canonical action, ResourceRef, and DecisionContext.
  3. Call authorization_policy_backend.
  4. On deny, return 403 and log request_id, reason_code, and policy_id if present.
  5. On allow, execute the operation.
  6. Let backend-native enforcement apply as the final guardrail.
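The six steps above can be sketched as one guard function. The provider and PDP interfaces here (resolve_principal, pdp.check, Decision) are assumptions for illustration, not real Phlo APIs:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    reason_code: str = ""
    policy_id: str = ""

def handle_dataset_read(request, auth_provider, pdp, log, run_query):
    principal = auth_provider.resolve_principal(request)             # step 1
    action, resource = "dataset.read", ("dataset", request.dataset_id)  # step 2
    decision: Decision = pdp.check(principal, action, resource,      # step 3
                                   context={"request_id": request.request_id})
    if not decision.allowed:                                         # step 4
        log({"request_id": request.request_id,
             "reason_code": decision.reason_code,
             "policy_id": decision.policy_id})
        return 403, {"error": "forbidden"}
    # Step 5: execute; step 6: backend-native enforcement still applies
    # inside run_query and may itself reject the operation.
    return 200, run_query(request.dataset_id)
```

The key property is that the PDP check happens before any side effect, and every deny leaves a correlated log entry.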

CLI Commands

  1. Resolve principal from local auth or service session context.
  2. Build canonical action and resource.
  3. Evaluate through the PDP before side effects.
  4. On deny, fail with a normalized error code.

Scheduled And Automation Workloads

  1. Use explicit service principals.
  2. Attach runtime context such as run_id and environment.
  3. Evaluate through the PDP before triggering side effects.
  4. Preserve decision metadata in structured logs.

Direct Core-Service Access

  1. The user authenticates to the core service through that service's supported identity path.
  2. The service resolves the user's identity or mapped groups or claims.
  3. Backend-native grants or policies, previously synced from canonical RBAC, enforce the request.
  4. The resulting allow or deny should match the same RBAC intent Phlo would apply through API or Observatory.

This means Phlo is the policy source of truth, while the core service remains the final enforcement point for direct access.

Enforcement Strategy

Phlo must not rely on PDP checks alone. Core services need matching native enforcement.

Each supported backend gets a compiler owned by a governance_backend implementation. The compiler turns canonical actions and resources into concrete backend artifacts.

Examples:

  • canonical dataset.read -> Trino GRANT SELECT
  • canonical dataset.query -> Trino schema or table privileges
  • canonical API-facing dataset access -> PostgreSQL role grants and optional RLS
  • canonical CRUD-style table permissions -> Hasura metadata permissions
  • canonical object access -> MinIO IAM policy documents
  • canonical catalog access -> Nessie authz rules

Compiler design rules:

  • compile from canonical roles and actions, not package-local shortcuts
  • validate generated identifiers before apply
  • emit stable object names where the backend allows it
  • support dry-run planning before mutation
  • support read-back verification after mutation

Backend Ownership

  • Trino: canonical dataset actions and role grants -> GRANT / REVOKE statements. Existing base: phlo-trino governance backend.
  • PostgreSQL / PostgREST: canonical dataset and API-surface access -> PostgreSQL roles, grants, view permissions, optional RLS.
  • Hasura: canonical role-to-table permissions -> Hasura metadata permission documents.
  • MinIO: canonical object and bucket actions -> IAM policy documents and claim mapping.
  • Nessie: canonical catalog read and manage actions -> Nessie authorization rules.

Per-Backend Identity And Enforcement Model

This section defines what "platform-wide RBAC" means per service.

Trino

  • User identity path: direct user auth through configured Trino auth, typically OAuth2/OIDC or LDAP.
  • Native enforcement: Trino access-control rules or grants, potentially via the existing governance backend.
  • Phlo ownership: map canonical roles to Trino roles or grants and keep the grant set in sync.
  • Minimum requirement: Trino must be able to distinguish Bob from Alice, or at least distinguish stable groups that map to canonical roles.
  • Non-goal for phase 1: perfect query-shape control beyond what Trino can represent natively.

PostgreSQL / PostgREST

  • User identity path: JWT or database role mapping for PostgREST; direct DB auth or mapped roles for PostgreSQL where exposed.
  • Native enforcement: PostgreSQL roles, grants, views, and optional row-level security.
  • Phlo ownership: compile canonical roles into database roles and API-facing permissions, then verify resulting grants or RLS state.
  • Minimum requirement: the JWT role or DB role seen by PostgreSQL must map back to canonical roles.

Hasura

  • User identity path: JWT claims forwarded to Hasura.
  • Native enforcement: Hasura roles and permission metadata, including row or column filters when configured.
  • Phlo ownership: map canonical roles into x-hasura-* role expectations and synced metadata permissions.
  • Minimum requirement: user tokens used with Hasura must carry stable role or claim information compatible with canonical roles.

MinIO

  • User identity path: OIDC or LDAP, with claim or group mapping into MinIO policies.
  • Native enforcement: MinIO IAM policies.
  • Phlo ownership: compile canonical object actions into IAM policy documents and keep claim-to-policy mapping aligned with canonical roles.
  • Minimum requirement: MinIO must receive user or group identity, not one shared human credential.

Nessie

  • User identity path: OIDC where enabled.
  • Native enforcement: Nessie authorization rules.
  • Phlo ownership: compile canonical catalog actions into Nessie-compatible authz rules and verify them after apply.
  • Minimum requirement: Nessie authn and authz must be enabled in environments where direct user access is expected.

Policy Compilation Limits

Canonical RBAC is the source of truth, but not every backend can represent every rule with the same fidelity.

Required compiler behavior:

  • compile exact equivalents where possible
  • fail validation for unsupported security-critical rules in protected environments
  • emit explicit warnings for lossy mappings only where that environment profile allows them
  • never silently broaden access because a backend cannot express a rule

Examples of potential gaps:

  • row-level conditions available in PostgreSQL or Hasura but not in Trino grant models
  • object-prefix policies in MinIO that do not align cleanly with dataset-style resources
  • backend-specific admin operations with no direct canonical action yet

Rule: when canonical policy is richer than a backend can express, the safer path wins. That means deny, validation failure, or an environment-blocking plan error, not a permissive approximation.

Managed Object Ownership

Drift, sync, and revert are safe only if Phlo is explicit about which backend objects it owns.

Required ownership rules:

  • every compiler must define a managed-object boundary for its backend
  • managed objects must be discoverable by stable names, tags, comments, or other backend-native markers, not by heuristics alone
  • verify and drift detection must compare desired vs actual state only within that managed boundary
  • "extra object" drift means extra relative to Phlo-managed objects, not every object present in a shared backend
  • revert may mutate only objects recorded in the applied change set and still inside the managed boundary
  • compilers must not delete or rewrite unmanaged roles, grants, policies, or metadata owned by operators or third-party tools

Examples:

  • Trino roles or grants created under a Phlo-owned naming convention
  • PostgreSQL roles, grants, or policies tagged as Phlo-managed
  • Hasura metadata blocks generated under a Phlo-owned permission section
  • MinIO policies with a Phlo-managed policy name prefix
  • Nessie rules emitted into a Phlo-managed ruleset or marked scope

If a backend cannot express a reliable managed-object boundary, that compiler is not production-ready.

Policy Sync Controller

Phlo needs one control-plane component to manage sync.

Suggested CLI surface:

  • phlo authz validate
  • phlo authz plan
  • phlo authz sync
  • phlo authz verify
  • phlo authz revert

Controller responsibilities:

  • load canonical RBAC files
  • validate schemas and cross-file references
  • resolve installed governance backends
  • compile desired backend artifacts
  • compute desired vs actual diffs
  • apply changes in deterministic order
  • verify actual state after apply
  • emit a structured sync report with a policy version hash

Sync Phases

Required execution order:

  1. validate canonical source
  2. resolve target backends
  3. read current backend state
  4. compile desired backend state
  5. compute diffs
  6. apply prerequisite roles or containers with no effective privileges
  7. apply restrictive changes, membership removals, revokes, and deny policies
  8. apply additive grants and allow policies required by the final state
  9. remove obsolete managed objects only after replacement state is in place
  10. verify read-back state
  11. emit final report

Sync Safety Rule

The controller and every compiler must avoid temporary privilege broadening during sync.

Required behavior:

  • if the backend supports transactional apply, use it for privilege-changing operations where practical
  • if the backend does not support transactional apply, order operations so the intermediate state is equal to or more restrictive than the final desired state
  • if safe ordering is impossible for a change set, plan must surface that fact and sync must fail unless the operator uses an explicit environment mechanism for coordinated downtime or a maintenance window
  • a successful sync means both final-state correctness and acceptable intermediate-state safety

Sync Contract

Every compiler must expose the same lifecycle:

  • plan: return desired changes without mutating state
  • apply: mutate backend state for the planned change set
  • verify: compare desired and actual backend state
  • revert: reverse a known applied change set when supported

revert is best-effort and must be explicit about limits. It is not permission to perform uncontrolled destructive cleanup.
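The shared lifecycle can be expressed as one interface that every compiler implements. A minimal sketch; the class and field names are assumptions for illustration, not the real governance_backend API:

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class ChangeSet:
    backend: str
    policy_version_hash: str
    changes: list[dict] = field(default_factory=list)

@dataclass
class VerifyResult:
    in_sync: bool
    missing: list[str] = field(default_factory=list)
    extra: list[str] = field(default_factory=list)
    mismatched: list[str] = field(default_factory=list)

class GovernanceCompiler(Protocol):
    def plan(self, desired: dict, actual: dict) -> ChangeSet:
        """Return desired changes without mutating backend state."""
    def apply(self, change_set: ChangeSet) -> None:
        """Mutate backend state for the planned change set."""
    def verify(self, desired: dict) -> VerifyResult:
        """Compare desired and actual backend state."""
    def revert(self, change_set: ChangeSet) -> None:
        """Best-effort reversal of a known applied change set."""
```

Sharing one interface is what makes the contract tests in the testing strategy possible: the same fixtures can exercise every compiler.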

Sync Report

Minimum report fields:

  • policy version hash
  • backend name
  • environment
  • planned object count
  • applied object count
  • verification result
  • drift summary
  • request or operation identifier
  • error details on failure

Failure Semantics

Authorization work is security-sensitive. Failure behavior must be explicit.

Runtime Failures

  • PDP backend unavailable: deny
  • malformed policy state: deny
  • invalid request model: deny
  • missing authz provider in a protected surface: deny in non-dev profiles

Sync Failures

  • validation failure: no backend mutation
  • backend unavailable before apply: fail command, leave existing state in place
  • apply failure mid-run: mark sync failed, emit partial-apply report, do not report success
  • verify failure after apply: mark sync failed even if writes succeeded

Partial Apply Rules

The controller must not claim convergence after a partial apply.

Required behavior:

  • identify which backend and objects succeeded
  • identify which backend and objects failed
  • emit a non-zero exit code
  • preserve enough metadata for explicit revert or follow-up repair

Drift Detection

Drift means actual backend state no longer matches canonical desired state.

Minimum requirements:

  • phlo authz verify compares desired and actual state
  • drift returns a non-zero exit code
  • structured logs include backend and object identifiers
  • drift output distinguishes missing, extra, and mismatched objects
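Within the managed boundary, the missing/extra/mismatched classification is a straightforward dictionary diff. A sketch, with hypothetical object keys and shapes:

```python
def diff_state(desired: dict[str, dict], actual: dict[str, dict]) -> dict[str, list[str]]:
    """Classify drift inside the Phlo-managed boundary.

    missing: desired objects absent from the backend
    extra: managed objects present but no longer desired
    mismatched: objects present on both sides with differing content
    """
    return {
        "missing": sorted(k for k in desired if k not in actual),
        "extra": sorted(k for k in actual if k not in desired),
        "mismatched": sorted(k for k in desired
                             if k in actual and desired[k] != actual[k]),
    }

desired = {"role:analyst": {"grants": ["SELECT analytics.orders"]},
           "role:viewer": {"grants": []}}
actual = {"role:analyst": {"grants": []},            # grant revoked by hand
          "role:legacy_etl": {"grants": ["ALL"]}}    # unexpected managed object
drift = diff_state(desired, actual)
assert drift == {"missing": ["role:viewer"],
                 "extra": ["role:legacy_etl"],
                 "mismatched": ["role:analyst"]}
```

phlo authz verify would then exit non-zero whenever any of the three lists is non-empty.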

Future work:

  • scheduled verify jobs
  • CI gates for protected environments
  • signed policy bundles for production sync

Identity And Role Mapping

Runtime PDP decisions and backend sync both depend on the same role vocabulary.

Requirements:

  • one canonical role vocabulary across API, CLI, and services
  • explicit claim-to-role mapping from authentication_provider
  • explicit service principal mapping for automation workloads
  • explicit user or group mapping into directly accessed core services
  • no hidden package-local role translation

Recommended path:

  • identity claims map to canonical roles through explicit configuration that references the role vocabulary defined in roles.yaml
  • resolved Principal.roles uses only canonical role names
  • backend compilers derive native roles from canonical roles, not the other way around
  • direct-service access must preserve user identity or group membership closely enough for user-level backend enforcement

Source-of-truth split:

  • roles.yaml defines the canonical role vocabulary and any Phlo-managed service-principal assignments
  • human user to role mapping in production should come from the authentication provider's configured claim or group mapping
  • subjects.users is for non-production bootstrap or narrowly controlled break-glass cases unless an environment profile explicitly blesses it
  • the same human user must not receive different canonical roles depending on whether they enter through API, UI, or direct-service access

Non-example:

  • a single shared Trino, Hasura, or MinIO credential used for all human users is not sufficient for platform-wide RBAC

User Experience Model

From a user perspective, the same role should produce the same outcome across surfaces.

Example:

  • bob@company.com has role analyst
  • policy allows dataset.read and dataset.query on analytics.*
  • Bob can open analytics.orders in Observatory
  • Bob can call the matching API successfully
  • Bob can query the same dataset directly in Trino
  • Bob cannot read finance.payroll anywhere

The intended property is consistency, not just API protection.

Example Mapping Appendix

This appendix shows one concrete end-to-end example from canonical policy to surface behavior and backend-native enforcement.

Example Canonical Config

roles.yaml

version: 1
roles:
  viewer: {inherits: []}
  analyst: {inherits: [viewer]}
  admin: {inherits: [analyst]}

policies.yaml

version: 1
policies:
  - policy_id: analyst_read_analytics
    effect: allow
    principal: {roles: [analyst]}
    action: dataset.read
    resource: {type: dataset, id_pattern: analytics.*}

  - policy_id: analyst_query_analytics
    effect: allow
    principal: {roles: [analyst]}
    action: dataset.query
    resource: {type: dataset, id_pattern: analytics.*}

  - policy_id: admin_manage_services
    effect: allow
    principal: {roles: [admin]}
    action: service.manage
    resource: {type: service, id_pattern: "*"}

Desired outcomes:

  • Bob can read analytics.orders
  • Bob cannot read finance.payroll
  • Alice can manage services

Example Surface Behavior

  • Observatory: Bob can browse analytics.*, not finance.*
  • API: dataset.read on analytics.orders is allowed for Bob
  • API: service.manage is denied for Bob and allowed for Alice
  • Direct Trino: Bob can query analytics.orders, not finance.payroll

Example IdP Claim Mapping

One practical pattern is to have the IdP emit either canonical role names directly or groups that map one-to-one onto canonical roles.

Example conceptual JWT claims:

{
  "sub": "bob@company.com",
  "email": "bob@company.com",
  "groups": ["analyst"]
}

Phlo-side mapping:

  • groups=analyst -> canonical role analyst
  • groups=admin -> canonical role admin
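The Phlo-side mapping above can be sketched as a small lookup; GROUP_TO_ROLE and the claim shape are illustrative, with the production mapping living in explicit configuration rather than code:

```python
# Hypothetical group-to-role table; in production this comes from
# explicit configuration referencing the roles.yaml vocabulary.
GROUP_TO_ROLE = {"analyst": "analyst", "admin": "admin"}

def canonical_roles(claims: dict) -> set[str]:
    """Map IdP group claims into canonical roles; unknown groups grant nothing."""
    return {GROUP_TO_ROLE[g] for g in claims.get("groups", []) if g in GROUP_TO_ROLE}

claims = {"sub": "bob@company.com", "email": "bob@company.com", "groups": ["analyst"]}
assert canonical_roles(claims) == {"analyst"}
assert canonical_roles({"groups": ["contractors"]}) == set()
```

Dropping unknown groups rather than passing them through keeps the resolved Principal.roles inside the canonical vocabulary.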

Backend-side mapping:

  • Trino receives the same user identity and maps analyst to Trino grants
  • Hasura receives JWT role claims derived from the same canonical role
  • MinIO maps group or claim membership to IAM policies
  • Nessie consumes OIDC identity and matching authz rules

Production note:

  • this claim or group mapping is the normal source of truth for human users
  • static per-user entries in roles.yaml are for development or explicit break-glass operation, not the default production model

Example Trino Compilation

Canonical intent:

  • role analyst may read analytics.*
  • role analyst may use dataset query surfaces for analytics.*
  • role admin may manage broadly

Conceptual compiled state:

CREATE ROLE analyst;
CREATE ROLE admin;

GRANT SELECT ON TABLE analytics.orders TO ROLE analyst;
GRANT SELECT ON TABLE analytics.customers TO ROLE analyst;

GRANT ALL PRIVILEGES ON SCHEMA analytics TO ROLE admin;
GRANT ALL PRIVILEGES ON SCHEMA finance TO ROLE admin;

Identity requirement:

  • Bob must connect to Trino as Bob, or through stable groups that Trino can map to analyst
  • one shared human-facing Trino user is not sufficient

Operational note:

  • the existing phlo-trino governance backend already works in SQL-grant terms, so the reference compiler should normalize canonical actions into SQL privileges and validate identifiers before apply
  • in phase one, Trino may collapse dataset.read and dataset.query onto the same SELECT-style privilege model; if so, that collapse must be documented and covered by compiler tests
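A reference-compiler fragment for the normalize-and-validate step might look like the sketch below. It assumes three-part catalog.schema.table identifiers per the stable resource id rule (the shorter two-part examples above elide the catalog) and a deliberately strict identifier pattern; both are assumptions, not the phlo-trino backend's actual rules:

```python
import re

# Hypothetical strict identifier rule; real backends may allow more.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")

def compile_dataset_read(role: str, dataset: str) -> str:
    """Compile canonical dataset.read into a Trino GRANT, validating identifiers first."""
    parts = dataset.split(".")
    if len(parts) != 3:
        raise ValueError(f"dataset id must be catalog.schema.table: {dataset}")
    for ident in [role, *parts]:
        if not _IDENT.match(ident):
            raise ValueError(f"invalid identifier: {ident}")
    catalog, schema, table = parts
    return f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO ROLE {role}"

assert compile_dataset_read("analyst", "lakehouse.analytics.orders") == \
    "GRANT SELECT ON TABLE lakehouse.analytics.orders TO ROLE analyst"
```

Rejecting anything outside the identifier pattern before emitting SQL is what keeps the compiler safe against injection through resource ids.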

Example Hasura Compilation

Canonical intent:

  • analyst may read selected datasets exposed through GraphQL
  • admin may manage all GraphQL data paths

Conceptual JWT claims:

{
  "sub": "bob@company.com",
  "https://hasura.io/jwt/claims": {
    "x-hasura-default-role": "analyst",
    "x-hasura-allowed-roles": ["analyst"]
  }
}

Conceptual permission metadata:

tables:
  api.orders:
    select:
      analyst:
        filter: {}
        columns: ["id", "customer_id", "amount"]
      admin:
        filter: {}
        columns: ["*"]

Operational note:

  • the existing Hasura permission manager already syncs table permissions from config, so the compiler should emit that shape from canonical RBAC rather than inventing another package-local role format

Example PostgreSQL / PostgREST Compilation

Canonical intent:

  • analyst may read API-facing analytics data
  • direct SQL access or PostgREST access should resolve to a PostgreSQL role that carries the right grants

Conceptual compiled state:

CREATE ROLE analyst NOINHERIT;
GRANT USAGE ON SCHEMA analytics TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO analyst;

If row-level constraints are in scope, the compiler may also emit CREATE POLICY statements. If the environment requires row-level protection and the target backend path cannot express it, plan or validation must fail.

Example MinIO Compilation

Canonical intent:

  • analyst may read objects under analytics/
  • admin may manage buckets and objects

Conceptual IAM policy document for analyst:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::lakehouse/analytics/*"]
    }
  ]
}

Identity requirement:

  • OIDC or LDAP must map Bob to the policy set associated with canonical role analyst

Example Nessie Compilation

Canonical intent:

  • analyst may read catalog objects
  • admin may manage branches or catalog state

Conceptual compiled state:

  • OIDC enabled for direct user auth
  • Nessie authorization enabled
  • rules granting read operations to the role set derived from analyst
  • broader management rules for admin

Exact rule syntax depends on the deployed Nessie authz mode. The compiler must target the active mode and fail validation if the environment expects direct user access but Nessie authn or authz is disabled.

Audit And Observability

Minimum structured events:

  • deny decisions
  • policy validation failures
  • sync start
  • sync success
  • sync failure
  • verify success
  • drift detected

Required fields where available:

  • request_id
  • run_id
  • policy_id
  • reason_code
  • policy_version_hash
  • backend
  • resource_type
  • resource_id

Do not log:

  • secrets
  • raw tokens
  • full policy payloads when they contain sensitive values

Operational Constraints

  • policy sync should be the normal mutation path for managed authz state
  • manual backend edits are drift and must be visible
  • default authorization providers must be registered explicitly, not as side effects of unrelated capability defaults
  • no package should bypass the PDP by checking principal.roles directly for a protected runtime decision

Junior Developer Implementation Guide

Add A Protected Endpoint

  1. Choose an existing canonical action.
  2. Build a stable ResourceRef.
  3. Build DecisionContext with request correlation.
  4. Call authorization_policy_backend before side effects.
  5. Return 403 on deny.
  6. Add allow and deny tests.
  7. Update docs if a new action was required.

Add A New Backend Compiler

  1. Define the canonical action-to-backend mapping table.
  2. Implement plan, apply, verify, and revert.
  3. Validate generated identifiers and policy object names.
  4. Add contract tests shared with other compilers.
  5. Add integration tests against the real backend service.
  6. Document configuration, limits, and rollout caveats.

Change The Role Vocabulary

  1. Update roles.yaml schema and examples.
  2. Update claim-to-role mapping.
  3. Update compiler mappings.
  4. Run phlo authz validate.
  5. Run phlo authz plan.
  6. Review diffs before phlo authz sync.
  7. Document rollout notes.

Testing Strategy

Unit Tests

  • role hierarchy expansion
  • policy parsing and validation
  • explicit deny precedence
  • no-match deny behavior
  • policy version hashing
  • compiler output generation

Contract Tests

  • shared fixtures executed against every compiler
  • mapping completeness for supported actions
  • verify parity between plan output and read-back normalization
  • reject invalid identifiers and dangerous input

Integration Tests

  • API route enforcement through the PDP
  • CLI enforcement through the PDP
  • sync and verify against Trino
  • sync and verify against PostgreSQL or PostgREST
  • sync and verify against Hasura
  • sync and verify against MinIO
  • sync and verify against Nessie
  • partial-apply and drift paths return non-zero exit status
  • each migrated package adds a regression test proving ad hoc role checks were removed

Rollout Plan

Phase 1: Canonical Policy Source

  • define schemas for roles.yaml and policies.yaml
  • add canonical fixtures for tests
  • add phlo authz validate

Phase 2: Runtime Hardening

  • complete fail-closed PDP behavior for protected surfaces
  • standardize deny logging fields
  • remove route-local and package-local role checks

Phase 3: Sync Foundation

  • add the sync controller
  • add diff and sync report models
  • add plan, sync, verify, and revert CLI commands

Phase 4: Backend Compilers

  • implement Trino compiler as the reference path
  • implement PostgreSQL or PostgREST compiler
  • implement Hasura compiler
  • implement MinIO compiler
  • implement Nessie compiler

Phase 5: Direct User Access

  • enable per-backend identity paths for direct user access
  • map IdP claims or groups into backend-native roles or policies
  • verify same-user same-resource parity across API and direct backend access

Phase 6: Operations

  • add drift-focused operator runbook
  • wire sync and verify into CI or deployment flow
  • define production policy bundle and promotion process

Environment Profiles

Different environments can enforce different strictness, but the profile must be explicit.

Development

  • direct backend access may be incomplete
  • lossy mappings may be warned instead of blocked
  • fail-open behavior is still not allowed for protected runtime decisions

Staging

  • all protected surfaces should use the PDP
  • backend sync should run for target services
  • lossy compiler mappings should fail unless explicitly waived

Production

  • protected surfaces fail closed
  • direct-access services expected by users must have real identity propagation
  • unsupported or lossy policy mappings are blocking errors
  • verify and drift detection are mandatory gates

Acceptance Criteria

  • one canonical RBAC source drives both PDP decisions and backend sync
  • protected product surfaces ask only authorization_policy_backend for runtime decisions
  • direct user access to supported core services enforces the same canonical RBAC intent through backend-native permissions
  • core services receive backend-native policy artifacts through governance_backend compilers
  • policy diffs are reviewable before apply
  • sync failures and drift produce clear non-zero outcomes
  • a junior engineer can add a protected endpoint or backend compiler from this spec without inventing new authz behavior

Operational Runbook

  • Day 0: configure claim-to-role mapping, create policy files, run validate, plan, sync, then verify.
  • Policy change: update policy files in version control, run validate and plan in CI, review diffs, sync, then verify.
  • Incident triage: collect request_id, policy_id, and reason_code; check PDP logs; run phlo authz verify; compare desired vs actual state; revert or repair explicitly after partial apply.

Documentation Follow-Up

When implementation starts, update configuration docs, CLI docs, security setup docs, backend package docs under docs/packages/, and common error docs if authz codes are added.

Open Questions

  • should production sync be CI or deploy driven only, or may operators run it directly?
  • which environments allow partial backend coverage during migration?
  • what is the minimum row-level security scope for PostgreSQL phase one?
  • do production environments require signed policy bundles?
