Skip to content

mitm: support plain-HTTP forward-proxy requests on the MITM ingress#150

Merged
dangtony98 merged 2 commits intomainfrom
mitm/plain-http-forward-proxy
May 6, 2026
Merged

mitm: support plain-HTTP forward-proxy requests on the MITM ingress#150
dangtony98 merged 2 commits intomainfrom
mitm/plain-http-forward-proxy

Conversation

@dangtony98
Copy link
Copy Markdown
Contributor

@dangtony98 dangtony98 commented May 6, 2026

Closes #132.

Summary

The MITM listener previously accepted only CONNECT, returning 405 for any other method. Agents pointed at plain-HTTP upstreams (e.g. Tailscale Aperture's http://ai/v1/chat/completions) lost broker coverage — request logging, host policy, substitutions, credential injection — unless operators bypassed the proxy entirely via NO_PROXY.

This PR adds a second dispatch branch for absolute-form forward-proxy requests (RFC 7230 §5.3.2: POST http://upstream/path HTTP/1.1) on the same TLS-wrapped listener. Proxy-Authorization stays confidential because it travels inside the existing TLS-to-proxy tunnel — no new port, no new listener, no token-confidentiality regression.

vault run now sets HTTP_PROXY alongside HTTPS_PROXY, both pointing at the same TLS-wrapped proxy URL. Plain-HTTP traffic that previously bypassed the broker entirely is now intercepted, audited, and (for matched hosts) credential-injected.

Implementation

  • Refactor: forwardHandler's closure body extracted into forwardRequest(useTLSUpstream bool), shared by the CONNECT-tunnelled HTTPS path and the new plain-HTTP path.
  • Dispatch: new isAbsoluteForwardProxyRequest validator rejects origin-form, https://, ws://, and other non-http schemes; case-insensitive on scheme; rejects empty host and fragments.
  • handleForward: per-IP flood gate (loopback exempt, shared with CONNECT under mitmIPKey + TierAuth) → host:port canonicalisation (default :80) → IsValidHostParseProxyAuthResolveForProxyforwardRequest. Reuses writeProxyAuthChallenge / writeAuthError from connect.go.
  • WebSocket: dialWebSocketUpstream gains a outReq.URL.Scheme == "http" branch that skips TLS for ws:// upstreams.
  • Env: BuildProxyEnv emits HTTP_PROXY=<same URL as HTTPS_PROXY>; ProxyEnvKeys adds HTTP_PROXY so stale parent values are stripped. The TypeScript SDK's ContainerConfig.env, session creation, and buildProxyEnv() all mirror the change.

SSRF protection (netguard.SafeDialContext on p.upstream), credential injection (brokercore.ApplyInjection — strips Proxy-Authorization, X-Vault, hop-by-hop, and Connection-listed headers per RFC 7230 §6.1), per-vault deny passthrough (commit 7a107de), and request logging (IngressMITM) all apply identically to the new path.

Breaking-change note for operators

After upgrading, agents' plain-http:// requests are now intercepted by Agent Vault. Under the default vault policy (passthrough), traffic still works as before. Vaults with unmatched_host_policy=deny will start rejecting plain-HTTP traffic to unconfigured hosts (same as they already do for HTTPS). Operators who relied on HTTP_PROXY being unset to pin specific clients elsewhere can opt out via NO_PROXY.

Three previously-asserted "HTTP_PROXY must NOT be set" tests have been flipped — the historical rationale was functional (the listener would 405), not a security boundary.

Test plan

  • make test (full Go suite, race detector clean) — ok across all packages
  • make build (web + binary)
  • vitest run for the TypeScript SDK — 91/91 tests pass
  • 9 new tests in internal/mitm/forward_test.go covering:
    • flagship plain-HTTP forward + injection + Proxy-Authorization strip
    • https:// rejection (with CONNECT hint)
    • other-scheme allowlist (file, ftp, gopher, ws → 400)
    • missing / invalid Proxy-Authorization → 407 with Proxy-Authenticate
    • vault-hint mismatch → 403
    • hop-by-hop + X-Vault stripping (RFC 7230 §6.1)
    • ws:// WebSocket upgrade + frame piping
    • netguard SSRF blocking loopback at dial time → 502
    • request log row shape (Ingress, Method, Host, Path, VaultID, ActorType/ID, Status)
    • HTTP/1.1 keepalive across two back-to-back forward requests
  • Three HTTP_PROXY-unset assertions flipped to HTTP_PROXY-equals-HTTPS_PROXY
  • Self-review pass via /simplify (deleted dead if p.logSink != nil, inlined upstreamScheme, plumbed LogSink through setupProxy to avoid post-construction race, tightened Host-header assertion)
  • Self-review pass via /security-review (no findings)

Files

  • internal/mitm/{proxy,forward,connect,websocket}.go — dispatch + forward + ws scheme branch
  • internal/mitm/{proxy_test,forward_test}.go — 9 new tests + setupProxy option modifier
  • internal/isolation/env.go, cmd/run.go, sdks/sdk-typescript/src/resources/sessions.tsHTTP_PROXY env injection
  • Three flipped tests: cmd/run_test.go, internal/isolation/{env,integration}_test.go
  • Docs: cmd/skill_{cli,http}.md, README.md, CLAUDE.md, docs/{agents/protocol,self-hosting/environment-variables,reference/cli,installation,guides/connect-coding-agent,guides/container-isolation}.md{x,}

The MITM listener previously accepted only CONNECT, returning 405 for
any other method. Agents pointed at plain-HTTP upstreams (e.g.
Tailscale Aperture's `http://ai/v1/chat/completions`) lost broker
coverage — request logging, host policy, substitutions, and
credential injection — unless operators bypassed the proxy entirely
via NO_PROXY.

This change adds a second dispatch branch for absolute-form
forward-proxy requests (RFC 7230 §5.3.2) on the same TLS-wrapped
listener. Proxy-Authorization stays confidential because it travels
inside the existing TLS-to-proxy tunnel — no new port, no new
listener, no token-confidentiality regression.

Refactor:
- forwardHandler's closure body extracted into forwardRequest, shared
  by the CONNECT-tunnelled HTTPS path and the new plain-HTTP path. A
  useTLSUpstream bool selects http vs. https for the outbound URL.
- mitmConnectIPKey renamed to mitmIPKey; CONNECT and forward share
  the per-IP TierAuth flood-gate budget.

New code:
- isAbsoluteForwardProxyRequest validator rejects origin-form,
  https://, ws://, and other non-http schemes; case-insensitive on
  scheme; rejects empty host and fragments.
- handleForward: per-IP flood gate (loopback exempt, same as CONNECT)
  → host:port canonicalisation (default :80) → IsValidHost →
  ParseProxyAuth → ResolveForProxy → forwardRequest. Reuses
  writeProxyAuthChallenge / writeAuthError verbatim from connect.go.
- dialWebSocketUpstream gains an outReq.URL.Scheme == "http" branch
  that skips TLS for ws:// upstreams.

Env injection:
- BuildProxyEnv now emits HTTP_PROXY=<same TLS-wrapped URL as
  HTTPS_PROXY>; ProxyEnvKeys includes HTTP_PROXY so stale parent
  values are stripped. The TS SDK's ContainerConfig.env, session
  creation, and buildProxyEnv mirror the change.

SSRF protection (netguard.SafeDialContext on p.upstream), credential
injection (brokercore.ApplyInjection — strips Proxy-Authorization,
X-Vault, hop-by-hop, and Connection-listed headers), per-vault deny
passthrough, and request logging (IngressMITM) all apply identically
to the new path. Verified by 9 new tests in internal/mitm/forward_test.go
covering injection, scheme allowlist, auth challenges, hop-by-hop
stripping, ws:// upgrade, SSRF blocking, log row shape, and HTTP/1.1
keepalive.

Three pre-existing assertions that the omission of HTTP_PROXY was
load-bearing have been flipped — the historical rationale was
functional ("would 405 any plain http:// request"), not a security
boundary.

Docs: skill_cli.md and skill_http.md updated together per CLAUDE.md
conventions; README, package doc, env-var reference, CLI reference,
installation guide, and the agents/protocol page all reflect the new
contract.
@infisical-review-police
Copy link
Copy Markdown

💬 Discussion in Slack: #pr-review-agent-vault-150-mitm-support-plain-http-forward-proxy-requests-on-the

Posted by Review Police — reviews, comments, new commits, and CI failures will stream into this channel.

@mintlify
Copy link
Copy Markdown

mintlify Bot commented May 6, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
agent-vault 🟢 Ready View Preview May 6, 2026, 1:49 AM

💡 Tip: Enable Workflows to automatically generate PRs for you.

bodyclose linter caught the missing close on the 101 Switching
Protocols response read in TestMITMForwardWebSocketPlainHTTP.
Mirrors the existing pattern in proxy_test.go's WebSocket tests.
@dangtony98 dangtony98 merged commit f25703e into main May 6, 2026
9 checks passed
@dangtony98 dangtony98 deleted the mitm/plain-http-forward-proxy branch May 6, 2026 02:03
Comment thread internal/mitm/forward.go
Comment on lines +84 to +90
urlHost := r.URL.Host
host, port, err := net.SplitHostPort(urlHost)
if err != nil {
host = urlHost
port = "80"
}
target := net.JoinHostPort(host, port)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 When handleForward parses an absolute-form forward-proxy request whose URL contains an IPv6 literal without an explicit port (e.g. POST http://[::1]/path HTTP/1.1), it produces a malformed [[::1]]:80 target. The fix is to use r.URL.Hostname() (which strips brackets) and r.URL.Port() defaulting to "80" when empty — Go's idiomatic pattern that handles IPv4, named hosts, and bracketed IPv6 uniformly.

Extended reasoning...

What goes wrong

For an absolute-form forward request POST http://[::1]/path HTTP/1.1, Go's net/url preserves the brackets in r.URL.Host (it stores the literal-with-brackets form so that round-tripping the URL yields the original). net/url exposes Hostname() to get the unbracketed form when callers want it.

The current code at internal/mitm/forward.go:84-90 does not use that accessor:

urlHost := r.URL.Host
host, port, err := net.SplitHostPort(urlHost)
if err != nil {
    host = urlHost
    port = "80"
}
target := net.JoinHostPort(host, port)

Walking through the IPv6-no-port case step by step:

  1. r.URL.Host == "[::1]" (brackets preserved by url.Parse).
  2. net.SplitHostPort("[::1]") fails with address [::1]: missing port in address — the bracketed form requires an explicit port to be parseable here.
  3. The fallback runs: host = "[::1]" (still bracketed!), port = "80".
  4. net.JoinHostPort("[::1]", "80") sees a : inside the host string, applies its "wrap in brackets" rule, and returns "[[::1]]:80" — double-bracketed and malformed.

I confirmed each of those steps empirically against net.SplitHostPort / net.JoinHostPort, matching the verifier reproductions.

Why validation does not catch it

isValidHost is just brokercore.IsValidHost (internal/brokercore/brokercore.go:53), which rejects only @ ? # / \\ space, %, control chars, and leading/trailing dots. [, ], and : all pass through, so isValidHost("[::1]") returns true and the malformed target reaches forwardRequest.

What the user sees

Inside forwardRequest, the malformed target becomes outURL.Host, then the request URL string http://[[::1]]:80/path. Go's http.NewRequestWithContext is permissive and accepts that, but the upstream dial fails (net.Dial cannot resolve [[::1]]:80), surfacing as 502 Bad Gateway with upstream_error. Concretely an agent calling fetch("http://[::1]/health") or fetch("http://[fd00::1]/api") through HTTP_PROXY against an IPv6 internal service is broken on the path this PR explicitly introduces.

The CONNECT path is not affected: per RFC 7230 §4.3.6 a CONNECT request line is required to be host:port, so net.SplitHostPort always succeeds there and the bracket-stripping happens automatically. This is unique to the new handleForward path.

Step-by-step proof for the fix

Replacing the host/port logic with the standard url.URL accessors handles every case:

host := r.URL.Hostname()
port := r.URL.Port()
if port == "" {
    port = "80"
}
target := net.JoinHostPort(host, port)

Trace for http://[::1]/path: Hostname() == "::1" (brackets stripped), Port() == "" so port = "80", JoinHostPort("::1", "80") == "[::1]:80" (correctly bracketed once). isValidHost("::1") is true. outURL.Host == "[::1]:80", the dial succeeds, the upstream sees a normal request — same shape http://[::1]:8080/path already produces today.

Same trace for IPv4 http://10.0.0.1/path: Hostname() == "10.0.0.1", Port() == "", port = "80", target "10.0.0.1:80" — identical to today's behaviour.

Same trace for named host http://api.example.com/path: Hostname() == "api.example.com", target "api.example.com:80" — identical to today.

Severity

Normal, not pre-existing: the entire handleForward codepath is new in this PR, so any malformed-target failure on it is a regression introduced here. The trigger is narrow (IPv6-literal HTTP upstream without an explicit port) but lives squarely inside the use case this PR markets — "plain-HTTP forward-proxy support for internal services." The fix is a one-liner with no test changes required (existing tests don't cover IPv6 literals; adding one would be a nice follow-up but not strictly necessary).

Comment thread internal/mitm/forward.go
emit(http.StatusBadGateway, "internal")
return
}
outReq.Host = host
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 forwardRequest sets outReq.Host = host (port-stripped) on line 194, so the wire Host header omits the port for non-default-port upstreams (e.g. http://internal.example:8080/api becomes Host: internal.example). This violates RFC 7230 §5.4, which requires the Host field-value to match the URI authority including non-default ports — strict upstreams reject it, vhost routing keyed on host:port breaks, and self-referential URLs (redirects, link generation) come back wrong. The pattern existed in the CONNECT path but rarely bit there because HTTPS:443 is the canonical port; the new plain-HTTP forward-proxy path makes non-default ports the common case (k8s ClusterIP, dev servers, microservices). Fix: outReq.Host = target (or drop the line — Go will use outURL.Host which retains the port). Worth noting, the new TestMITMForwardPlainHTTPInjectsCredentials asserts sawHost == upstreamHost (port-stripped), which bakes the buggy behavior in rather than catching it.

Extended reasoning...

What goes wrong

internal/mitm/forward.go:194 sets outReq.Host = host where host is the port-stripped form of the upstream authority. Go's net/http uses Request.Host verbatim as the wire Host header when it is non-empty (it does not add the port back from URL.Host). For non-default-port upstreams this produces a Host header that does not match the URI authority, in violation of RFC 7230 §5.4:

If the target URI includes an authority component, then a client MUST send a field-value for Host that is identical to that authority component, excluding any userinfo subcomponent and its "@" delimiter.

Default ports (80 for HTTP, 443 for HTTPS) are tolerated everywhere because the spec allows omitting them. Non-default ports — :8080, :3000, :5000, :8443 etc. — are the dominant case for internal HTTP services, and stripping the port there breaks:

  • Virtual-host routing on upstreams that key on host:port (nginx, Envoy, k8s ingress with port-aware vhosting)
  • Self-referential URL generation (redirects, link generation, OAuth callback validation that includes Host in signed state)
  • Strict HTTP/1.1 servers that validate Host per the RFC

Step-by-step proof

For an agent issuing POST http://internal.example:8080/api/v1/items HTTP/1.1 through the new forward-proxy:

  1. handleForward parses r.URL.Host = "internal.example:8080" via net.SplitHostPorthost="internal.example", port="8080", target="internal.example:8080"
  2. Calls p.forwardRequest(w, r, target, host, false, scope)
  3. forwardRequest builds outURL.Host = target ("internal.example:8080") — used for TCP routing
  4. http.NewRequestWithContext(...) initially sets outReq.Host to "internal.example:8080" (from outURL.Host)
  5. Line 194 overrides it: outReq.Host = host → "internal.example" (no port)
  6. http.Transport.RoundTrip serialises outReq.Host literally, so the wire frame becomes:
    POST /api/v1/items HTTP/1.1
    Host: internal.example
    ...
    

The upstream receives a Host header missing the port that was in the URI authority — non-canonical per RFC 7230 §5.4.

Why the PR amplifies a previously latent bug

The same line existed in the old forwardHandler closure, but the only caller was the CONNECT path with useTLSUpstream=true — i.e. HTTPS:443, the canonical port. Stripping :443 from Host: api.example.com:443 to Host: api.example.com is RFC-permitted, so the bug was effectively dead code. The PR refactors that line into the shared forwardRequest and adds handleForward as a new caller specifically targeted at plain-HTTP internal services (the PR description's example: http://ai/v1/chat/completions). Internal HTTP traffic is overwhelmingly on non-default ports, so the latent line now bites in production.

What the new test asserts

TestMITMForwardPlainHTTPInjectsCredentials computes upstreamHost via net.SplitHostPort(strings.TrimPrefix(upstream.URL, "http://")) — the port-stripped form — then asserts sawHost != upstreamHost is an error. Since httptest.NewServer binds a random non-default port, this enforces the buggy port-stripping behavior rather than catching it. The test comment even acknowledges "handleForward sets outReq.Host to the port-stripped form … upstream sees that exact value", which documents the wrong behavior.

Fix

One-line change at internal/mitm/forward.go:194:

outReq.Host = target  // was: outReq.Host = host

Or simply drop the line — http.NewRequestWithContext already populates outReq.Host from outURL.Host, which has the port. The host argument is still needed for credential lookup (p.creds.Inject(..., host)), so its parameter remains; only the assignment to outReq.Host should be retargeted.

The test assertion should change from sawHost != upstreamHost to sawHost != upstreamHost+":"+upstreamPort (or sawHost != strings.TrimPrefix(upstream.URL, "http://")), so it actively verifies the canonical Host header.

Comment thread cmd/run.go
Comment on lines 42 to 57
AGENT_VAULT_ADDR — base URL of the Agent Vault HTTP control server
AGENT_VAULT_VAULT — vault the session is scoped to

The child also inherits HTTPS_PROXY / NO_PROXY / NODE_USE_ENV_PROXY plus
the root CA trust variables (SSL_CERT_FILE, NODE_EXTRA_CA_CERTS,
REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, GIT_SSL_CAINFO, DENO_CERT) so standard
HTTPS clients transparently route through the broker. NODE_USE_ENV_PROXY=1
enables Node.js built-in proxy support (v22.21.0+) so fetch() and
https.get() honor HTTPS_PROXY natively. HTTP_PROXY is intentionally not
set — the MITM proxy only handles HTTPS (CONNECT) and would 405 any plain
http:// request. The root CA PEM is written to ~/.agent-vault/mitm-ca.pem.
The child also inherits HTTPS_PROXY / HTTP_PROXY / NO_PROXY /
NODE_USE_ENV_PROXY plus the root CA trust variables (SSL_CERT_FILE,
NODE_EXTRA_CA_CERTS, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, GIT_SSL_CAINFO,
DENO_CERT) so both HTTPS and plain-HTTP clients transparently route
through the broker. NODE_USE_ENV_PROXY=1 enables Node.js built-in proxy
support (v22.21.0+) so fetch() and http.get()/https.get() honor the
proxy env natively. HTTPS_PROXY and HTTP_PROXY both point at the same
TLS-wrapped proxy URL — the listener accepts CONNECT for https://
upstreams and absolute-form forward-proxy requests for http:// on the
same port. The root CA PEM is written to ~/.agent-vault/mitm-ca.pem.

Example:
` + examplePrefix + ` -- claude
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The runtime startup banner at cmd/run.go:149 still prints routing HTTPS through MITM proxy and the surrounding step comment at line 141 still says "Route the child's HTTPS traffic", even though this PR's whole point is that plain-HTTP traffic now also routes through the MITM ingress. Operators see this banner first thing on every successful vault run launch, immediately after reading the new help text that promises HTTP/HTTPS coverage. Trivial fix: s/routing HTTPS/routing HTTP\/HTTPS/ and update the step comment to match.

Extended reasoning...

What the bug is. The PR rewrote every text surface that mentioned HTTPS-only routing — the cobra Long help (lines 42–57 of the diff), the docstring on augmentEnvWithMITM (lines 386–395), the assertions in run_test.go, README.md, CLAUDE.md, skill_cli.md, skill_http.md, docs/agents/protocol.mdx, docs/installation.mdx, docs/reference/cli.mdx, docs/self-hosting/environment-variables.mdx, docs/guides/connect-coding-agent.mdx, docs/guides/container-isolation.mdx, even the TypeScript SDK's ContainerConfig JSDoc — but the one user-visible runtime confirmation message was missed.\n\nWhere it manifests. cmd/run.go:149:\n\ngo\nfmt.Fprintf(os.Stderr, "%s routing HTTPS through MITM proxy (127.0.0.1:%d)\n", successText("agent-vault:"), mitmPort)\n\n\nand the surrounding step comment on line 141:\n\ngo\n// 6. Route the child's HTTPS traffic through the transparent MITM\n// proxy. The MITM ingress is the only credential-injection path,\n// so a failure here is fatal.\n\n\nBoth predate this PR (the line itself isn't in the diff) but are now stale because the headline behavioral change of this PR is that HTTP_PROXY is also injected and plain-HTTP traffic is also intercepted.\n\nWhy this slipped past existing checks. The line is a literal fmt.Fprintf to stderr — no tests assert on the banner string, and grep HTTPS on the diff would surface only deleted/changed strings, not unchanged ones. The text-sweep otherwise was thorough; this is the single drift.\n\nStep-by-step proof.\n1. Build the new binary: make build.\n2. Start a server: ./agent-vault server -d.\n3. Run ./agent-vault run -- printenv | grep -E '^(HTTP|HTTPS)_PROXY'.\n4. Stderr emits: agent-vault: routing HTTPS through MITM proxy (127.0.0.1:14322).\n5. Stdout shows both HTTPS_PROXY AND HTTP_PROXY set to the same TLS-wrapped MITM URL — exactly what augmentEnvWithMITM now does per the updated docstring at run.go:386–395 and per isolation.BuildProxyEnv at internal/isolation/env.go:52–53.\n6. Result: the operator-facing message contradicts what was actually injected, and contradicts every doc the operator might cross-reference (Long help directly above the call site, README, docs/reference/cli.mdx, etc.).\n\nImpact. Cosmetic — no functional defect. But the PR description explicitly highlights plain-HTTP coverage as the breaking change ("After upgrading, agents' plain-http:// requests are now intercepted by Agent Vault"). Leaving the only observable runtime confirmation as "HTTPS"-only undersells exactly the breaking change operators most need to notice.\n\nHow to fix. One-line edit at cmd/run.go:149 — change the format string to "%s routing HTTP/HTTPS through MITM proxy (127.0.0.1:%d)\n". Optionally also update the step comment at line 141 from "Route the child's HTTPS traffic" to "Route the child's HTTP and HTTPS traffic" to mirror the augmentEnvWithMITM docstring's wording ("transparently routes HTTP and HTTPS").

Comment thread CLAUDE.md
Comment on lines +25 to +32
- **Single ingress into the broker — transparent MITM** (on by default, port 14322, disable with `--mitm-port 0`): TLS-encrypted HTTPS_PROXY/HTTP_PROXY-compatible ingress backed by [internal/mitm](internal/mitm/) + [internal/ca](internal/ca/) (software CA, root key encrypted with the master key). The listener is TLS-wrapped (cert signed by the MITM CA) and accepts both `CONNECT host:port` (HTTPS upstreams) and absolute-form forward-proxy requests (`POST http://host/path HTTP/1.1`, RFC 7230 §5.3.2) for plain-HTTP upstreams on the same port. Clients use `HTTPS_PROXY=https://...` and `HTTP_PROXY=https://...` — both point at the same TLS-wrapped proxy URL. Credential injection lives in `brokercore`. HTTP/1.1 at the ingress, with transparent WebSocket upgrade support (HTTP/2 not yet). Bind failures are non-fatal — the core HTTP server keeps running.
- **Proposals = GitHub-PR-style change requests.** Agents cannot edit services or credentials directly; they create proposals, a human approves in CLI or browser, and apply merges atomically. Per-vault sequential IDs. 7-day TTL.
- **Two independent permission axes**:
- Instance role: `owner` vs `member` (applies to both users and agents).
- Vault role: `proxy` < `member` < `admin`. Proxy can use the proxy and raise proposals; member can manage credentials/services; admin can invite humans.
- **KEK/DEK key wrapping**: A random DEK (Data Encryption Key) encrypts credentials and the CA key at rest (AES-256-GCM). If a master password is set, Argon2id derives a KEK (Key Encryption Key) that wraps the DEK; changing the password re-wraps the DEK without re-encrypting credentials. If no password is set (passwordless mode), the DEK is stored in plaintext — suitable for PaaS deploys where volume security is the trust boundary. Login uses email+password. The first user to register becomes the instance owner and is auto-granted vault admin on `default`.
- **Agent skills are the agent-facing contract.** [cmd/skill_cli.md](cmd/skill_cli.md) and [cmd/skill_http.md](cmd/skill_http.md) are embedded into the binary, installed by `vault run`, and served publicly at `/v1/skills/{cli,http}`. They are the authoritative reference for what agents can do.
- **Two isolation modes for `vault run`** (selected via `--isolation` or `AGENT_VAULT_ISOLATION`): `host` (default, cooperative — fork+exec on the host with `HTTPS_PROXY` envvars) and `container` (non-cooperative — Docker container with iptables egress locked to the Agent Vault proxy). Container mode lives in [internal/isolation/](internal/isolation/) with an embedded Dockerfile + init-firewall.sh + entrypoint.sh, built on first use and cached by content hash.
- **Two isolation modes for `vault run`** (selected via `--isolation` or `AGENT_VAULT_ISOLATION`): `host` (default, cooperative — fork+exec on the host with `HTTPS_PROXY`/`HTTP_PROXY` envvars) and `container` (non-cooperative — Docker container with iptables egress locked to the Agent Vault proxy). Container mode lives in [internal/isolation/](internal/isolation/) with an embedded Dockerfile + init-firewall.sh + entrypoint.sh, built on first use and cached by content hash.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Doc cross-sync miss: this PR adds HTTP_PROXY alongside HTTPS_PROXY in vault run and updates ~8 pages to match, but ~14 other docs still describe HTTPS_PROXY-only behavior — quickstarts (claude-code, codex, cursor, hermes-agent, opencode, nanoclaw, openclaw, custom-agent), docs/index.mdx:24, docs/agents/overview.mdx, docs/learn/security.mdx:118 (heading), docs/learn/services.mdx:187, docs/guides/connect-custom-agent.mdx, and docs/self-hosting/{local,docker,fly-io}.mdx. The agents/protocol.mdx body table on line 90 was correctly updated, but the section heading "Route requests through HTTPS_PROXY" (and the frontmatter description) wasn't. Nothing functionally breaks — this is purely the cross-doc-sync rule the PR's own CLAUDE.md line ~32/50 codifies. Mechanical fix: the same wording flip the PR already applied in the eight updated pages, repeated across the listed paths.

Extended reasoning...

What is wrong. This PR's own CLAUDE.md change (the file's own conventions section, line ~32/50) explicitly says: "Whenever a new feature lands, scan docs/ for pages that need a matching update — quickstart, guides, self-hosting, learn, and reference all mirror runtime behavior and drift fast." The PR does this for eight pages (CLAUDE.md, README.md, docs/agents/protocol.mdx body table, docs/guides/{connect-coding-agent,container-isolation}.mdx, docs/installation.mdx, docs/reference/cli.mdx, docs/self-hosting/environment-variables.mdx) and the two embedded skill files. It misses ~14 others that still describe HTTPS_PROXY-only behavior, contradicting the new runtime behavior the PR ships.

Verified by grep against the working tree. Each of the following lines was confirmed:

  • docs/quickstart/claude-code.mdx:13, cursor.mdx:13, codex.mdx:13, hermes-agent.mdx:13, opencode.mdx:13: "pre-configures HTTPS_PROXY and CA trust so outbound HTTPS routes through Agent Vault transparently". Same five files at line ~39: "HTTPS_PROXY routes the call through Agent Vault".
  • docs/quickstart/custom-agent.mdx:3,13,27,62,89 — frontmatter description and body all HTTPS_PROXY-only.
  • docs/quickstart/nanoclaw.mdx:43, openclaw.mdx:43: "Invited agents like NanoClaw build their own HTTPS_PROXY…".
  • docs/index.mdx:24: "You point the agent at Agent Vault by setting HTTPS_PROXY and trusting the Agent Vault CA certificate.".
  • docs/agents/overview.mdx:12,22,162 — three HTTPS_PROXY-only mentions on the canonical agents overview page.
  • docs/learn/security.mdx:118 — section heading "Transparent proxy (HTTPS_PROXY) transport" plus body.
  • docs/learn/services.mdx:187 — Twilio walkthrough still says "via HTTPS_PROXY".
  • docs/guides/connect-custom-agent.mdx:3,27,41,62,69,91,118,127 — frontmatter description, body, env-var table row, and the heading "Set HTTPS_PROXY manually" all HTTPS_PROXY-only.
  • docs/self-hosting/local.mdx:81: "transparent HTTPS_PROXY listener on port 14322".
  • docs/self-hosting/docker.mdx:24: "transparent HTTPS proxy (14322) so agents' HTTPS_PROXY can reach the broker".
  • docs/self-hosting/fly-io.mdx:76: "Agent HTTPS_PROXY".
  • docs/agents/protocol.mdx: the body env-var table on line ~88-90 correctly added an HTTP_PROXY row, but the section heading "Route requests through HTTPS_PROXY" and the frontmatter description still describe HTTPS_PROXY-only behavior.

Step-by-step proof of impact.

  1. Operator follows docs/installation.mdx (updated in this PR) and stands up a server. They read line 121 — "the transparent HTTP/HTTPS proxy that agents' HTTPS_PROXY and HTTP_PROXY point at". Good.
  2. They click into the Claude Code quickstart at docs/quickstart/claude-code.mdx. Line 13 says: "pre-configures HTTPS_PROXY and CA trust so outbound HTTPS routes through Agent Vault transparently". They reasonably conclude that plain-http:// upstreams are NOT brokered.
  3. The agent makes a plain-http:// call to an internal service. Under the new code path it IS brokered (audited, credential-injected if matched, denied under unmatched_host_policy=deny). The operator is surprised — the docs told them otherwise.
  4. Now flip the scenario: a vault has unmatched_host_policy=deny and the operator hadn't added the internal service. Under the old behavior, a plain-http:// call would never have hit the broker. Under the new behavior it returns 403 from Agent Vault. The operator goes to docs/learn/security.mdx to debug, sees the heading "Transparent proxy (HTTPS_PROXY) transport" and the HTTPS-only body below it, and now genuinely cannot tell whether plain-HTTP traffic is supposed to go through the broker or not.

That's the operator-facing harm: not a runtime bug, but documentation that actively misleads someone trying to reason about traffic flow with the new code in place.

Why existing code does not prevent this. This is documentation, not code — there's no test that fails on doc drift, and the PR's own self-review checklist (run /simplify, run /security-review, make test) does not cover scanning unchanged docs for stale references. The only safeguard is the CLAUDE.md convention that this PR violates.

Impact. Severity nit. No functional regression — every line cited is factually correct for HTTPS, just incomplete with respect to HTTP in the new world. Operators reading these pages will conclude HTTP_PROXY is best-effort or only-some-pages, which is the opposite of true. Cosmetic, but pervasive (~14 pages on a docs surface this PR explicitly touched).

How to fix. Mechanical sed pass over the listed paths replacing the same wording the PR already established in the updated pages: HTTPS_PROXYHTTPS_PROXY/HTTP_PROXY, HTTPS trafficHTTP and HTTPS traffic, the section headings updated to drop the HTTPS_PROXY-specific framing, and the agents/protocol.mdx frontmatter description + section heading updated to match the body table that was already corrected.

dangtony98 added a commit that referenced this pull request May 6, 2026
Wider sweep after the post-merge review surfaced more stale references
the four-comment review didn't enumerate. Highest-impact missing
surface was the five embedded server instruction files served to
invited agents:

- internal/server/instructions.txt
- internal/server/persistent_agent_instructions.txt
- internal/server/persistent_instructions_{admin,member,proxy}.txt

These are normative protocol guidance baked into the binary; without
this update, invited agents would set HTTPS_PROXY only and continue
bypassing the broker for plain-http:// upstreams even after #150
shipped.

Also updated:

- examples/daytona-openai-realtime/run.mjs — `delete env.HTTP_PROXY` /
  `http_proxy` in the egress-block test (parity with HTTPS_PROXY) and
  added `-e HTTP_PROXY=...` to the example docker run.
- web/src/pages/home/AllAgentsTab.tsx — manual proxy-setup snippet now
  exports both HTTPS_PROXY and HTTP_PROXY.
- cmd/skill_{cli,http}.md — "transparent HTTPS proxy" → "transparent
  HTTP/HTTPS proxy" in the canonical agent-facing skill blurb;
  skill_http frontmatter description matched.
- README.md — server-startup line now describes the same.
- docs/self-hosting/environment-variables.mdx — AGENT_VAULT_ISOLATION
  row and trailing CA-PEM paragraph mention HTTP_PROXY too.
- docs/quickstart/{nanoclaw,openclaw}.mdx — example bullets reference
  both env vars.
- internal/brokercore/proxyauth.go — `Basic` form doc explains both
  env vars take the same proxy URL.
- internal/brokercore/brokercore.go — broker-scoped header doc says
  Proxy-Authorization arrives via either ingress shape.
- internal/server/handle_mitm.go — CA-PEM-not-listening guard mentions
  both env vars.

No code-path or behavioural change beyond the original #151 fixes;
this is doc/instruction parity.
dangtony98 added a commit that referenced this pull request May 6, 2026
…c) (#151)

Follow-up to #150 — addresses the four findings from the post-merge
automated review.

## Two correctness bugs

### 1. IPv6 literal target double-bracketed (`internal/mitm/forward.go`)

For `POST http://[::1]/path HTTP/1.1`:
- `r.URL.Host == "[::1]"` (Go preserves brackets so the URL
round-trips).
- `net.SplitHostPort("[::1]")` fails with `missing port in address`.
- Old fallback set `host = "[::1]"`, `port = "80"`.
- `net.JoinHostPort("[::1]", "80")` saw `:` inside the host string,
applied its bracket rule, and produced `[[::1]]:80` — double-bracketed
and unreachable.

CONNECT path was unaffected because RFC 7230 §4.3.6 requires CONNECT
request lines to be `host:port`, so `SplitHostPort` always succeeded
there.

**Fix**: switch to `r.URL.Hostname()` (strips brackets) and
`r.URL.Port()` (returns `""` when omitted). Idiomatic Go pattern,
handles IPv4, named hosts, and bracketed IPv6 uniformly.

**Test**: new `TestMITMForwardIPv6LiteralCanonicalises` binds a real
listener on `[::1]:0`, sends an absolute-form forward request, and
asserts the upstream sees `Host: [::1]:<port>` (correctly bracketed
exactly once). Skips on hosts without IPv6 loopback.

### 2. Wire Host header strips port for non-default-port upstreams

Old code set `outReq.Host = host` (port-stripped), so
`http://internal.example:8080/api` forwarded as `Host: internal.example`
— RFC 7230 §5.4 violation. The legacy CONNECT path tolerated this
because HTTPS:443 is canonical and the port can be omitted, but the new
plain-HTTP forward path makes non-default ports the dominant case (k8s
ClusterIP, dev servers, microservices, the Tailscale Aperture / Hermes
scenario in #132).

**Fix**: `outReq.Host = target` (port-included), with a comment
explaining the RFC and why the legacy behaviour was effectively dead
code.

**Test**: `TestMITMForwardPlainHTTPInjectsCredentials` updated. The old
assertion `sawHost == upstreamHost` (port-stripped) actively baked in
the buggy behaviour; new assertion checks against `upstreamAuthority`
(host:port).

## Two cosmetic misses

### 3. Startup banner still said "routing HTTPS"

[cmd/run.go:149](cmd/run.go#L149) emits the user's first signal after
`vault run` succeeds. Updated to `routing HTTP/HTTPS through MITM proxy
(...)` and the surrounding step comment to match.

### 4. Doc cross-sync — 14 pages still described HTTPS_PROXY-only

The original PR updated 8 pages but missed 14 others. Updated:

- Quickstarts: `claude-code`, `codex`, `cursor`, `hermes-agent`,
`opencode`, `nanoclaw`, `openclaw`, `custom-agent`
- `docs/index.mdx`
- `docs/agents/overview.mdx`
- `docs/agents/protocol.mdx` — frontmatter description and section
heading
- `docs/learn/security.mdx` — heading + body
- `docs/learn/services.mdx` — Twilio walkthrough
- `docs/guides/connect-custom-agent.mdx` — frontmatter, body, env-var
table (split into HTTPS_PROXY + HTTP_PROXY rows), heading, manual-setup
snippet (`export HTTP_PROXY="$HTTPS_PROXY"`)
- `docs/self-hosting/{local,docker,fly-io}.mdx`

## Verification

- [x] `make test` (full Go suite, race detector clean) — `ok` across all
packages
- [x] `make build` — succeeds
- [x] `golangci-lint run ./...` — `0 issues`
- [x] New IPv6 test passes locally
- [x] Existing CONNECT-path tests still pass under the `outReq.Host =
target` change

## Files

- `internal/mitm/forward.go` — IPv6 fix + Host header fix + clarifying
comments
- `internal/mitm/forward_test.go` — IPv6 regression test + tightened
Host assertion
- `cmd/run.go` — banner + step comment
- 14 doc pages — `HTTP_PROXY`/`HTTPS_PROXY` parity
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Agent Vault breaks Tailscale Aperture / Hermes

1 participant