feat: add credential drivers for provider secret storage #1931

Description

@johntmyers

Problem Statement

With OpenShell Providers v2, provider credentials such as API keys are submitted
through the normal gateway API. Historically, provider records could contain
inline plaintext credential values in gateway persistence. That is acceptable
only as an upgrade-compatibility read path, not as the long-term storage model
for new provider writes.

Credential storage should be gateway-owned. Users submit provider credentials
through the normal Providers v2 create/update flow. The gateway owner decides
whether those submitted secrets are stored in OpenShell's default encrypted
database credential store or in an external backend such as Kubernetes Secrets or
OpenBao. Provider configuration must not point at arbitrary pre-existing backend
secrets.

The first concrete use cases are:

  • Default gateway deployments storing provider API keys in encrypted database
    credential objects.
  • Kubernetes deployments storing provider API keys in OpenShell-managed
    Kubernetes Secrets.
  • OpenBao deployments storing provider API keys in OpenShell-managed KV paths.
  • Single-player/local environments later using OS-native stores such as macOS
    Keychain.

This issue only targets Providers v2. Providers v1 should not be extended.

Proposed Design

Add a Credential Driver interface, modeled after the compute driver boundary,
that lets the gateway store, resolve, and delete provider credentials through a
gateway-owned storage backend.

Credential storage should support:

  • Default encrypted database credential storage when no external credential
    driver is configured.
  • In-tree gateway drivers for common external backends.
  • Out-of-process drivers connected over a Unix domain socket.
  • At most one enabled external credential driver at a time, matching the current
    compute-driver model.
  • Legacy inline database credentials as a read-compatible upgrade path for
    provider records created before credential storage handles.
  • Opaque gateway-managed credential handles persisted in provider records.

The public provider API remains provider.credentials (openshell provider create
--credential on the CLI). The provider.credential_handles field is internal
gateway state and must be rejected when supplied by clients.

Example provider create flow:

openshell provider create \
  --name nvidia-prod \
  --type nvidia \
  --credential NVIDIA_API_KEY

When no external credential driver is enabled, the gateway stores the submitted
credential in the default encrypted database credential store, clears inline
credential values before persisting the provider record, and stores only opaque
handles in the provider record. Existing provider records that already contain
inline plaintext credentials remain readable for upgrade compatibility.
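
The create path described above can be sketched roughly as follows. This is
only an illustration, not the gateway's implementation: the names
InMemoryCredentialStore and create_provider, the dict-shaped request and
record, and the cred_ handle format are all assumptions made for the example.

```python
import secrets


class InMemoryCredentialStore:
    """Hypothetical stand-in for the gateway's credential storage path."""

    def __init__(self):
        self._values = {}

    def store(self, provider_name, key, value):
        # The handle carries no secret material and is opaque to callers.
        handle = "cred_" + secrets.token_hex(8)
        self._values[handle] = value
        return handle


def create_provider(request, store):
    """Build the provider record to persist from a client create request."""
    if request.get("credential_handles"):
        # Handles are internal gateway state; clients may not supply them.
        raise ValueError("credential_handles must not be supplied by clients")

    handles = {
        key: store.store(request["name"], key, value)
        for key, value in request.get("credentials", {}).items()
    }
    return {
        "name": request["name"],
        "type": request["type"],
        "credentials": {},  # inline values are cleared before persisting
        "credential_handles": handles,
    }


store = InMemoryCredentialStore()
record = create_provider(
    {
        "name": "nvidia-prod",
        "type": "nvidia",
        "credentials": {"NVIDIA_API_KEY": "example-secret"},
    },
    store,
)
```

In this sketch the persisted record ends up with an empty credentials map and
one handle per submitted credential key, so the secret value never reaches the
provider record.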

Example gateway configuration for the default encrypted database credential
store:

[openshell.gateway.credential_storage]
key_encryption_key_path = "/var/lib/openshell/credentials/key-encryption-key.bin"

Or, for multi-replica deployments where all gateway replicas must share the same
key-encryption key:

[openshell.gateway.credential_storage]
key_encryption_key_env = "OPENSHELL_GATEWAY_CREDENTIAL_KEY_ENCRYPTION_KEY"

Example gateway configuration for Kubernetes Secrets:

[openshell.gateway]
credential_drivers = ["kubernetes-secrets"]

[openshell.credential_drivers.kubernetes-secrets]
namespace = "openshell"

Example gateway configuration for OpenBao:

[openshell.gateway]
credential_drivers = ["openbao"]

[openshell.credential_drivers.openbao]
address = "http://openbao.openbao.svc.cluster.local:8200"
mount = "secret"
kv_version = "2"
auth_method = "kubernetes"
role = "openshell-gateway"

External drivers may use the advanced UDS transport:

[openshell.gateway]
credential_drivers = ["enterprise-secrets"]

[openshell.credential_drivers.enterprise-secrets]
transport = "uds"
socket_path = "/run/openshell/credential-driver.sock"

Built-in external drivers default to in-tree transport and do not require a
transport field.

Credential driver contract:

  • GetCapabilities: report driver identity and feature support.
  • StoreCredential: store or overwrite one provider credential and return an
    opaque handle.
  • ResolveCredentials: resolve opaque handles into string secret values when
    the gateway materializes a sandbox provider environment.
  • DeleteCredential: delete one stored provider credential handle.
  • ListCredentials: optional driver-owned inspection surface.
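
To make the contract above concrete, here is a minimal sketch of the five
operations as an abstract interface with a toy in-memory driver. The method
signatures, the capability fields, and the mem: handle format are assumptions
for illustration; the issue does not specify request or response shapes, and
the real drivers would sit behind the in-tree or UDS transport.

```python
from abc import ABC, abstractmethod


class CredentialDriver(ABC):
    """Hypothetical shape of the credential driver contract."""

    @abstractmethod
    def get_capabilities(self) -> dict: ...

    @abstractmethod
    def store_credential(self, provider: str, key: str, value: str) -> str: ...

    @abstractmethod
    def resolve_credentials(self, handles: list) -> dict: ...

    @abstractmethod
    def delete_credential(self, handle: str) -> None: ...

    def list_credentials(self) -> list:
        # Optional inspection surface; drivers may leave it unimplemented.
        raise NotImplementedError("listing is optional")


class InMemoryDriver(CredentialDriver):
    """Toy driver that keeps values in a dict, for illustration only."""

    def __init__(self):
        self._by_handle = {}

    def get_capabilities(self):
        return {"name": "in-memory", "supports_list": True}

    def store_credential(self, provider, key, value):
        # Deterministic per (provider, key), so a second store overwrites.
        # The gateway would still treat the returned handle as opaque.
        handle = f"mem:{provider}:{key}"
        self._by_handle[handle] = value
        return handle

    def resolve_credentials(self, handles):
        return {handle: self._by_handle[handle] for handle in handles}

    def delete_credential(self, handle):
        del self._by_handle[handle]

    def list_credentials(self):
        return sorted(self._by_handle)
```

Keeping ListCredentials as a non-abstract method mirrors its optional status in
the contract: a driver that cannot enumerate its backend simply inherits the
default.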

Backend layout is OpenShell-managed:

  • The default encrypted database store stores one encrypted credential object per
    provider credential outside the provider record. Each object contains an
    encrypted value envelope, and the provider record stores only the handle.
  • Kubernetes Secrets stores one OpenShell-managed Secret per provider credential
    using deterministic names derived from provider name and credential key.
  • OpenBao stores one OpenShell-managed KV entry per provider credential under a
    managed prefix such as openshell/provider-credentials/.
  • Users do not provide backend Secret names, paths, keys, versions, or
    namespaces in provider configuration.
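
One way the deterministic, gateway-owned naming could work is sketched below.
The hashing scheme, the openshell-provider-credential- name prefix, and the
layout beneath the managed OpenBao prefix are assumptions, not the proposed
implementation. Hashing is used here because Kubernetes object names must be
lowercase DNS-style names, while credential keys such as NVIDIA_API_KEY
contain uppercase letters and underscores.

```python
import hashlib

# Managed prefix taken from the issue text; the layout beneath it is assumed.
OPENBAO_PREFIX = "openshell/provider-credentials/"


def _credential_digest(provider_name: str, credential_key: str) -> str:
    # A NUL separator keeps ("a-b", "c") and ("a", "b-c") from colliding.
    material = f"{provider_name}\0{credential_key}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:32]


def kubernetes_secret_name(provider_name: str, credential_key: str) -> str:
    """Derive a stable Secret name that is valid as a Kubernetes object name."""
    digest = _credential_digest(provider_name, credential_key)
    return f"openshell-provider-credential-{digest}"


def openbao_kv_path(provider_name: str, credential_key: str) -> str:
    """Derive a stable KV path under the OpenShell-managed prefix."""
    return OPENBAO_PREFIX + _credential_digest(provider_name, credential_key)
```

Because both functions depend only on the provider name and credential key, a
later update or delete can recompute the same destination without the user
ever supplying a backend name or path.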

Provider delete and credential removal should delete backend credential objects
before completing the provider database write/delete. Cleanup failures fail
closed: if a backend object cannot be deleted, the provider write/delete is not
completed.
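
The delete ordering can be illustrated with a short sketch. The names
delete_provider and BackendCleanupError, the dict used as the provider
database, and the delete_credential callback are hypothetical; the point is
only that backend cleanup runs first and any failure stops the database
delete.

```python
class BackendCleanupError(Exception):
    """Raised when a backend credential object could not be deleted."""


def delete_provider(name, records, delete_credential):
    """Delete backend credential objects first, then the provider record."""
    record = records[name]
    for key, handle in record["credential_handles"].items():
        try:
            delete_credential(handle)
        except Exception as exc:
            # Fail closed: keep the provider record so nothing is orphaned
            # silently. Only the key name is reported, never a secret value.
            raise BackendCleanupError(
                f"could not delete credential {key!r} for provider {name!r}"
            ) from exc
    # Reached only when every backend object was removed.
    del records[name]
```

With this ordering a backend outage surfaces as a failed delete rather than a
provider record that disappears while its secrets remain in the backend.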

Acceptance Criteria

  • Providers v2 credentials can be stored through the active gateway credential
    storage path.
  • Providers v1 remain unchanged.
  • When no external credential driver is configured, new provider credentials are
    stored in the default encrypted database credential store.
  • Existing provider records with inline plaintext database credentials remain
    readable for upgrade compatibility.
  • Gateway configuration rejects more than one enabled external credential
    driver.
  • Gateway configuration rejects explicit credential_drivers = []; omitting the
    field selects default encrypted database credential storage.
  • User-supplied credential_handles are rejected by provider create/update.
  • Stored provider records contain opaque handles, not secret values, for new
    provider credential writes.
  • Provider API responses redact credential values and do not expose internal
    handles.
  • Secret values are resolved only at runtime and are never logged.
  • Provider update overwrites the credential behind the existing backend handle
    when possible.
  • Provider update of a legacy inline provider preserves untouched inline
    credentials while moving newly submitted credential values into credential
    storage.
  • Provider delete and credential removal delete backend objects and fail closed
    on backend cleanup errors.
  • The default encrypted database credential store encrypts provider credential
    values with gateway-owned key material and can reuse that key material across
    gateway restarts.
  • The Helm chart creates and retains a shared key-encryption-key Secret for the
    default encrypted database store when no external credential driver is enabled.
  • The Kubernetes Secrets driver stores, resolves, and deletes provider API keys
    using gateway Kubernetes RBAC.
  • The OpenBao driver stores, resolves, and deletes provider API keys from a
    deployed OpenBao instance.
  • Targeted Kubernetes e2e coverage validates Kubernetes Secrets and OpenBao
    storage backends one at a time using the local Skaffold/Helm development
    environment.
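
The two configuration criteria above (at most one external driver, and no
explicit empty list) could be checked along these lines. This is a sketch
under assumptions: select_credential_backend and the encrypted-database label
are hypothetical, and the function takes the already-parsed
[openshell.gateway] table as a plain dict.

```python
DEFAULT_STORE = "encrypted-database"  # hypothetical label for the default store


def select_credential_backend(gateway_config: dict) -> str:
    """Pick the credential backend from the parsed [openshell.gateway] table."""
    if "credential_drivers" not in gateway_config:
        # Omitting the field selects the default encrypted database store.
        return DEFAULT_STORE

    drivers = gateway_config["credential_drivers"]
    if len(drivers) == 0:
        raise ValueError(
            "credential_drivers must not be empty; "
            "omit the field to use the default store"
        )
    if len(drivers) > 1:
        raise ValueError("at most one external credential driver may be enabled")
    return drivers[0]
```

For example, an omitted field yields the default store, ["openbao"] selects
the OpenBao driver, and both [] and ["kubernetes-secrets", "openbao"] are
rejected at configuration load time.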

Alternatives Considered

Keeping only inline plaintext credentials was rejected because it forces provider
records to own secret material directly and does not provide defense in depth for
default gateway deployments. Inline credentials remain supported only as a
legacy read path for provider records created before credential handles.

Using provider-authored references to existing backend secrets was rejected after
design review. It expands provider configuration into backend addressing and
policy, while the intended ownership model is that OpenShell internal driver
logic manages backend destinations.

Using environment variables as the primary credential driver was rejected because
environment variables are an input/discovery mechanism, not a durable backend
storage contract. They may remain useful for local provider discovery.

Resolving credentials inside sandbox runtimes was rejected because backend
access should remain controlled by the gateway owner and should not expand
sandbox authority.

Agent Investigation

  • Existing compute drivers provide a suitable model for in-tree and
    UDS-connected driver implementations.
  • Provider credential storage and resolution belong in the gateway because the
    gateway owner controls backend access and policy.
  • Providers v2 is the right extension point; Providers v1 should remain
    unchanged.
  • Upgrade compatibility is preserved by reading existing inline database
    credentials while routing new writes through credential storage handles.
  • Supporting only one enabled external credential driver simplifies routing and
    matches the current compute-driver model.
  • Default encrypted database credential storage improves the baseline deployment
    posture without requiring Kubernetes Secrets or OpenBao.
  • Kubernetes Secrets and OpenBao should be implemented first because they cover
    the primary production deployment paths.
  • OpenBao validation should use the existing Skaffold and Helm development
    environment so behavior is tested against a real Kubernetes deployment.
