feat(experimental): agent-sandbox Kubernetes backend #3

Open
christopherwxyz wants to merge 1 commit into upstream-pr/structured-subagent-args from upstream-pr/agent-sandbox-backend

christopherwxyz commented May 24, 2026

Summary

New experimental backend alongside the existing ATE/Substrate implementation in internal/experimental/k8s/. Targets kubernetes-sigs/agent-sandbox — the GA-stable SandboxClaim/SandboxTemplate/SandboxWarmPool API for gVisor-isolated agent pods on Kubernetes.

agent-sandbox is the only viable path on GKE Autopilot, where Substrate's atelet DaemonSet (privileged + hostPort + hostPath) is blocked by Warden.

Depends on PR #1

This PR is rebased onto upstream-pr/structured-subagent-args (PR #1). The bundled examples/python_sandbox_agent reference impl reads start.GetSubagentPrompt() — that field is added by PR #1. Land PR #1 first.

ax.yaml

registry:
  agent_sandbox_agents:
    - id: py
      name: Python Sandbox Agent
      description: Executes Python in a gVisor sandbox
      namespace: agent-platform
      template: python-sandbox-template
      port: 8494
      protocol: axp
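
For reviewers who want to see how one of these entries might be handled in Go, here is a minimal sketch. The struct name, the `Normalize` method, and the choice of which fields are required are all assumptions made for illustration; only the field names and the 8494 default port come from this PR's description.

```go
package main

import (
	"errors"
	"fmt"
)

// AgentSandboxAgentConfig is a hypothetical Go mirror of one
// registry.agent_sandbox_agents entry. The real type in the PR may differ.
type AgentSandboxAgentConfig struct {
	ID          string
	Name        string
	Description string
	Namespace   string
	Template    string
	Port        int
	Protocol    string
}

// defaultPort matches the 8494 default mentioned in the PR description.
const defaultPort = 8494

// Normalize applies the default port and rejects entries that lack the
// fields needed to reference a SandboxTemplate. Treating id, namespace,
// and template as required is an assumption of this sketch.
func (c AgentSandboxAgentConfig) Normalize() (AgentSandboxAgentConfig, error) {
	if c.ID == "" {
		return c, errors.New("agent_sandbox_agents: id is required")
	}
	if c.Namespace == "" {
		return c, errors.New("agent_sandbox_agents: namespace is required")
	}
	if c.Template == "" {
		return c, errors.New("agent_sandbox_agents: template is required")
	}
	if c.Port == 0 {
		c.Port = defaultPort
	}
	if c.Port < 1 || c.Port > 65535 {
		return c, fmt.Errorf("agent_sandbox_agents: port %d out of range", c.Port)
	}
	return c, nil
}

func main() {
	// Port is omitted on purpose so the default is applied.
	cfg, err := AgentSandboxAgentConfig{
		ID:        "py",
		Namespace: "agent-platform",
		Template:  "python-sandbox-template",
		Protocol:  "axp",
	}.Normalize()
	if err != nil {
		fmt.Println("invalid entry:", err)
		return
	}
	fmt.Printf("%s -> %s/%s on port %d\n", cfg.ID, cfg.Namespace, cfg.Template, cfg.Port)
}
```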

Design notes

  • SandboxClaim, not raw Sandbox CR. Raw Sandbox creation with an empty PodSpec fails admission (spec.podTemplate.spec.containers: Required value) because the agent-sandbox controller expects the SandboxTemplate to drive PodSpec defaulting. SandboxClaim (extensions.agents.x-k8s.io/v1alpha1) references the template by name and is the documented path.

  • Conversation-scoped lifecycle. One SandboxClaim per AX conversation. If the claim already exists, the backend adopts it instead of failing, which is useful when an ax-server restart reconnects to a still-running pod.

  • Readiness signal. Poll status.sandbox.podIPs[0]. Default 2-minute timeout covers cold image pulls; tunable via WithReadyTimeout.

  • Reference impl. examples/python_sandbox_agent is a minimal Python-exec agent that demonstrates the AgentService contract. It reads the planner's typed prompt via start.GetSubagentPrompt(), the field added by PR #1 ("proto: subagent prompt/history as structured AgentStart fields").

Test coverage

  • Controller-runtime fake client with WithStatusSubresource so Status().Update() actually round-trips.
  • Harness integration test that exercises create → adopt → delete.
  • Reference-impl tests for python_sandbox_agent including both the structured-args path and the legacy direct-caller path.

go test -race ./... passes cleanly.

Production context

Two months of production runtime in our deployment.

Scrubs from internal staging

  • Test-fixture team: uno → team: example
  • examples/python_sandbox_agent/main.go + README image paths → <registry>/<project>/... substitution markers
  • cloudbuild.yaml omitted (project-specific build script)
  • Design doc "For uno scale" → "For modest scale"

Adds a new experimental backend alongside the existing
internal/experimental/k8s/ate (Substrate) implementation. The new
backend targets [kubernetes-sigs/agent-sandbox] — the GA-stable
SandboxClaim/SandboxTemplate/SandboxWarmPool API for gVisor-isolated
agent pods on Kubernetes. agent-sandbox is the only viable path on
GKE Autopilot, where Substrate's atelet DaemonSet (privileged +
hostPort + hostPath) is blocked by Warden.

[kubernetes-sigs/agent-sandbox]: https://github.com/kubernetes-sigs/agent-sandbox

Configuration in ax.yaml:

    registry:
      agent_sandbox_agents:
        - id: py
          name: Python Sandbox Agent
          description: Executes Python in a gVisor sandbox
          namespace: agent-platform
          template: python-sandbox-template
          port: 8494
          protocol: axp

Design notes:

  - Creates SandboxClaim (extensions.agents.x-k8s.io/v1alpha1) rather
    than raw Sandbox CR. Raw Sandbox creation with an empty PodSpec
    fails admission (spec.podTemplate.spec.containers: Required
    value) because the agent-sandbox controller expects the
    SandboxTemplate to drive PodSpec defaulting.

  - Adopts existing SandboxClaims on conflict — useful when an
    ax-server restart reconnects to a still-running conversation pod.

  - status.sandbox.podIPs is the readiness signal we poll on. A 2-minute
    default timeout covers cold image pulls without a warm pool;
    callers can tune via WithReadyTimeout.

  - The new AgentSandboxAgent type implements agent.Agent and mirrors
    the shape of SubstrateAgent in internal/experimental/agent. On
    Connect it creates a per-conversation SandboxClaim, builds a
    RemoteAgent at podIP:port (defaults to 8494, configurable), and
    defers DeleteSandbox.

  - RegisterAgentSandbox lands in both internal/controller and
    internal/controller2 (which exist in parallel during the
    controller2 refactor).

Tests use controller-runtime's fake client with
WithStatusSubresource so Status().Update() actually round-trips —
the default fake silently drops status mutations otherwise.

go test -race ./... passes cleanly against origin/main with this change.

christopherwxyz force-pushed the upstream-pr/agent-sandbox-backend branch from 9d0e192 to 88c443a on May 24, 2026 15:15
christopherwxyz changed the base branch from main to upstream-pr/structured-subagent-args on May 24, 2026 15:15
christopherwxyz added a commit that referenced this pull request May 25, 2026
…earch_agent)

These two examples were Slack-specific / Vertex-specific application
code that has been extracted into the uno-infra monorepo where they
belong (deployed alongside their K8s manifests).

Keeping them in the AX fork's examples/ created merge conflicts on
every upstream pull from google/ax. With them gone, the fork's
divergence from upstream is bounded to:
  - the three PR branches (#1 #2 #3) of generic AX improvements
  - feat/agent-sandbox-backend's tracking branch

examples/python_sandbox_agent/ stays — it's a generic reference
implementation of the agent-sandbox backend pattern, useful upstream.