-
Notifications
You must be signed in to change notification settings - Fork 56
docs: ADR 0046 — host-side API server design for sandboxed agents #1814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,175 @@ | ||
| --- | ||
| title: "46. Host-side API server design for sandboxed agents" | ||
| status: Accepted | ||
| relates_to: | ||
| - agent-infrastructure | ||
| - security-threat-model | ||
| topics: | ||
| - sandbox | ||
| - api-server | ||
| - credential-isolation | ||
| --- | ||
|
|
||
| # 46. Host-side API server design for sandboxed agents | ||
|
|
||
| Date: 2026-06-02 | ||
|
|
||
| ## Status | ||
|
|
||
| Accepted | ||
|
|
||
| <!-- Once this ADR is Accepted, its content is frozen. Do not edit the Context, | ||
| Decision, or Consequences sections. If circumstances change, write a new | ||
| ADR that supersedes this one. Only status changes and links to superseding | ||
| ADRs should be added after acceptance. --> | ||
|
|
||
| ## Context | ||
|
|
||
| [ADR 0024](0024-harness-definitions.md) introduced the `api_servers` harness field as planned but not | ||
| implemented. [ADR 0017](0017-credential-isolation-for-sandboxed-agents.md)/[ADR 0025](0025-provider-credential-delivery-for-sandboxed-agents.md) established the host-side REST server as Tier 3 | ||
| of the credential delivery model — for cases where providers (Tier 2) cannot | ||
| handle, originally scoped to credentials in request bodies and response | ||
| transformation. Practice revealed additional cases beyond provider reach: | ||
| long-running operations exceeding MCP timeouts, operations the sandbox | ||
| deliberately blocks (container builds, see | ||
| [NVIDIA/OpenShell#113](https://github.com/NVIDIA/OpenShell/issues/113)), | ||
| and multi-step atomic operations. | ||
|
|
||
| The `host-side-api-server` experiment | ||
| ([fullsend-ai/experiments#28](https://github.com/fullsend-ai/experiments/pull/28)) | ||
| validated the end-to-end flow with two servers (Go container builder, Python | ||
| repo provisioner), testing lifecycle management, API discoverability, L7 policy | ||
| tuning, per-run auth, and file transfer. This ADR records design decisions | ||
| informed by that experiment and by subsequent review discussion on the | ||
| implementation model. | ||
|
|
||
| ## Options | ||
|
|
||
| ### API discoverability | ||
|
|
||
| Three approaches were tested. `/tools.json` (structured tool-use schema) was | ||
| the most token-efficient under full access (92k tokens vs 107k for OpenAPI, | ||
| 100k for baked instructions). Both discovery-based methods fail under | ||
| restricted policies where the endpoint is blocked; baked instructions succeed | ||
| (84k) but can drift from the actual API. OpenAPI's verbose structure adds | ||
| context tokens without proportional benefit for LLM agents. | ||
|
|
||
| ### Per-run authentication | ||
|
|
||
| **UUID bearer token via provider placeholders:** simple, proven, no key | ||
| management. The proxy resolves the placeholder — the real token never enters | ||
| the sandbox. No claims or expiry. | ||
|
|
||
| **Short-lived JWTs with claims:** per-operation authorization and audit trail, | ||
| but adds signing key management and JWT library dependencies in every server. | ||
| The L7 policy already restricts reachable endpoints, making JWT claims a | ||
| redundant second layer. | ||
|
|
||
| ### File transfer between server and sandbox | ||
|
|
||
| **`openshell sandbox upload/download` from the server:** works today, validated | ||
| by experiment, handles real-time exchange during request handling. Couples | ||
| server to OpenShell CLI. | ||
|
|
||
| **Shared host mount:** transparent POSIX access, no CLI coupling. Depends on | ||
| OpenShell mount support that is not yet universally available | ||
| ([NVIDIA/OpenShell#1509](https://github.com/NVIDIA/OpenShell/issues/1509)). | ||
|
|
||
| **HTTP multipart via the API:** fully portable, but large files through the L7 | ||
| proxy add overhead. | ||
|
|
||
| ### Implementation model | ||
|
|
||
| **Language-agnostic process contract:** each server is an independent binary | ||
| in any language. The runner manages it via CLI flags (`--port`, `--token`, | ||
| `--bind-address`) and expects `/healthz`, `/tools.json`, bearer auth, and | ||
| clean `SIGTERM` shutdown. The experiment validated this with Go and Python | ||
| servers. Maximally flexible, but every server re-implements boilerplate (CLI | ||
| parsing, health endpoints, auth middleware, security hardening). | ||
|
|
||
| **Go interface:** fullsend-maintained servers implement an internal Go | ||
| interface, compiled to a single binary. Simplifies deployment (one binary, | ||
| no runtime dependencies) and enables stub-based testing. The process | ||
| contract remains the runner's enforcement boundary — the interface is an | ||
| internal pattern, not a runner requirement. Provides a seam for adapting | ||
| deployment topology across workflow platforms (sidecars, pods). | ||
|
|
||
| ## Decision | ||
|
|
||
| Adopt the host-side API server design with the following implementation model, | ||
| policy model, and security requirements. The | ||
| [`host-side-api-server` experiment](https://github.com/fullsend-ai/experiments/pull/28) | ||
| validated the end-to-end flow; the implementation model was refined during | ||
| review. | ||
|
|
||
|
ifireball marked this conversation as resolved.
|
||
| **Process contract.** Every compiled API server binary accepts `--port`, | ||
| `--token`, and `--bind-address` flags; serves `GET /healthz` | ||
| (unauthenticated) and `GET /tools.json` (structured tool-use schema) for agent | ||
| discovery; validates bearer tokens on all other endpoints; and shuts down | ||
| cleanly on `SIGTERM`. Servers write logs to stderr; the runner collects and | ||
| bundles logs from all API servers so they are available for inspection after | ||
|
maruiz93 marked this conversation as resolved.
|
||
| the run completes. The runner starts servers after pre-script, health-checks | ||
| before sandbox creation, and tears down after sandbox destruction. If a server | ||
| crashes mid-run, the run fails. | ||
|
|
||
| **Implementation model.** The runner enforces the process contract above — any | ||
| binary that satisfies it is a valid API server. Internally, fullsend-maintained | ||
| servers use a Go interface as the recommended pattern: each server provides | ||
| HTTP handlers and a `/tools.json` schema, and the `main()` wires the | ||
| implementation to the process contract (CLI flags, health endpoints, auth | ||
| middleware, signal handling). Harnesses reference a compiled Go binary, the | ||
| workflow runner downloads it to the host, and `fullsend run` manages its | ||
| lifecycle. The interface makes servers testable via stub implementations and | ||
| provides the seam for adapting deployment topology across workflow platforms | ||
| (e.g., sidecars or separate pods on Kubernetes/OpenShift). Implementors who | ||
| need more control can write their own `main()` against the process contract | ||
| directly. | ||
|
|
||
| **Network policy via composable provider profiles.** Each API server ships | ||
| atomic capability profiles — one per logical group of endpoints (e.g., | ||
| `builder-build`, `builder-push`, `builder-read`). Harnesses list which profiles | ||
| to attach. Composition is additive per OpenShell's provider-backed policy | ||
| composition | ||
| ([NVIDIA/OpenShell#1037](https://github.com/NVIDIA/OpenShell/pull/1037)). | ||
| Different agent roles compose different capability sets for the same server. | ||
| Requires OpenShell >= v0.0.37 and | ||
| [#776](https://github.com/fullsend-ai/fullsend/issues/776). | ||
|
|
||
| **Per-run auth:** UUID bearer token via OpenShell provider placeholders. JWTs | ||
| are a future enhancement when per-operation claims become necessary. | ||
|
|
||
| **File transfer:** `openshell sandbox upload/download` from the server during | ||
|
maruiz93 marked this conversation as resolved.
maruiz93 marked this conversation as resolved.
|
||
| request handling. Shared host mount | ||
| ([NVIDIA/OpenShell#1509](https://github.com/NVIDIA/OpenShell/issues/1509)) | ||
| will be evaluated as an alternative when available. | ||
|
|
||
| **Bind address:** servers default to `127.0.0.1`, runner overrides to | ||
| `0.0.0.0` for sandboxed agents. | ||
| [NVIDIA/OpenShell#1633](https://github.com/NVIDIA/OpenShell/issues/1633) | ||
| (supervisor-proxied host-local endpoints) would eliminate this requirement. | ||
|
|
||
| **Security hardening:** timing-safe token comparison, 1 MB request body limits, | ||
| rate limiting on unauthenticated endpoints, credential scrubbing in error | ||
| messages, bounded in-memory state. | ||
|
|
||
| ## Consequences | ||
|
|
||
| - The `api_servers` harness field ([ADR 0024](0024-harness-definitions.md)) will gain a `providers` sub-field and | ||
| defined runtime behavior. The initial implementation targets Go servers | ||
| behind an internal interface; the process contract keeps the door open for | ||
| other languages. | ||
| - Implementing this design requires | ||
| [#776](https://github.com/fullsend-ai/fullsend/issues/776) (provider-backed | ||
| policy composition) as a prerequisite. | ||
| - Servers are coupled to the OpenShell CLI for file transfer until shared host | ||
| mounts are universally available. | ||
| - Servers must bind to `0.0.0.0` on shared hosts, widening the attack surface | ||
| until [NVIDIA/OpenShell#1633](https://github.com/NVIDIA/OpenShell/issues/1633) | ||
| ships. | ||
| - API servers (Tier 3) are now clearly scoped to cases providers cannot | ||
| handle: long-running operations, sandbox capability gaps, credentials in | ||
| request bodies, response transformation, and multi-step atomic operations. | ||
| - Fullsend-maintained servers follow a Go interface pattern (testable, | ||
| platform-portable). The process contract remains the enforcement boundary | ||
| — servers in other languages or with custom `main()` implementations are | ||
| valid as long as they satisfy it. | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.