`git clone` over HTTPS fails inside a sandbox when an SNI-matching secret is configured #756

## Summary

When a sandbox is created with `Secret.env(..., allow_hosts=["github.com"])`,
HTTPS traffic to `github.com` is intercepted by microsandbox. On a clean
public image (e.g. `debian`, `python:3.12-slim`) this manifests in two layers:

1. **TLS cert verification fails by default.** `git clone` (or `curl`) errors
   with `server verification failed: certificate signer not trusted` —
   microsandbox appears to be MITM-ing the connection and presenting a cert
   the system CA store doesn't trust.
2. **If verification is bypassed** (`GIT_SSL_NO_VERIFY=1`, or in an image
   that has microsandbox's CA in its trust store), the connection proceeds
   but the gzipped `git-upload-pack` POST body is corrupted, and the server
   responds:
   ```
   error: RPC failed; HTTP 400 curl 22 The requested URL returned error: 400
   fatal: expected 'packfile'
   ```

Both behaviors are triggered by the **mere presence** of an
`allow_hosts=["github.com"]` secret. The secret value isn't required to be
real — a dummy string reproduces the issue. An unauthenticated public clone
of a different repo on the same host fails identically.

## Environment

- `msb 0.4.6`
- macOS host (darwin, aarch64)
- Linux aarch64 guest, `debian` image, git 2.43.0
- Python SDK: `microsandbox==0.4.6`

## Reproduction

```python
# repro_microsandbox_issue.py
import asyncio
from microsandbox import Action, Network, NetworkPolicy, Sandbox, Secret

IMAGE = "debian"  # also reproduces on python:3.12-slim, ubuntu:noble, etc.
CLONE_TARGET = "https://github.com/zsh-users/zsh-completions.git"
SETUP = "apt-get update -qq && apt-get install -y -qq git ca-certificates >/dev/null"
NETWORK = Network(policy=NetworkPolicy(
    default_egress=Action.ALLOW, default_ingress=Action.ALLOW,
))

async def attempt_clone(name, with_secret):
    secrets = [Secret.env(
        "GITHUB_TOKEN", value="ghp_dummy", allow_hosts=["github.com"],
    )] if with_secret else []
    sb = await Sandbox.create(
        name, image=IMAGE, user="root", memory=2048, detached=False,
        secrets=secrets, network=NETWORK, replace=True,
    )
    try:
        await sb.shell(SETUP)
        # Use GIT_SSL_NO_VERIFY=1 to see the deeper "expected 'packfile'" failure;
        # without it the failure is "certificate signer not trusted" on default images.
        out = await sb.shell(
            f"rm -rf /tmp/x && GIT_SSL_NO_VERIFY=1 git clone {CLONE_TARGET} /tmp/x"
        )
        print(f"[{name}] with_secret={with_secret} exit={out.exit_code}")
        for line in (out.stdout_text + out.stderr_text).strip().splitlines():
            print(f"  {line}")
    finally:
        await sb.stop_and_wait()

async def main():
    await attempt_clone("repro-with-secret", with_secret=True)
    print("---")
    await attempt_clone("repro-no-secret", with_secret=False)

asyncio.run(main())
```

Observed output:

```
[repro-with-secret] with_secret=True exit=128
  Cloning into '/tmp/x'...
  error: RPC failed; HTTP 400 curl 22 The requested URL returned error: 400
  fatal: expected 'packfile'
---
[repro-no-secret] with_secret=False exit=0
  Cloning into '/tmp/x'...
```

With `GIT_SSL_NO_VERIFY=1` **removed**, the `with_secret=True` run instead
reports:

```
fatal: unable to access 'https://github.com/zsh-users/zsh-completions.git/':
server verification failed: certificate signer not trusted.
(CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none)
```

Equivalent CLI repro:

```
msb run debian \
  --name repro \
  --detach \
  --secret "GITHUB_TOKEN=ghp_dummy@github.com" \
  --net-default-egress allow

msb exec repro -- bash -c '
  apt-get update -qq && apt-get install -y -qq git ca-certificates >/dev/null &&
  GIT_SSL_NO_VERIFY=1 git clone https://github.com/zsh-users/zsh-completions.git /tmp/x
'
```

## Expected

Clone succeeds, identical to running the same command outside a sandbox or in
a sandbox with no host-scoped secret.

## Actual

Two-layer failure depending on whether system CAs trust microsandbox's
interceptor:

- TLS verification failure (default), or
- TLS verification passes or is bypassed → HTTP 400 / `fatal: expected 'packfile'` on the
  `git-upload-pack` POST.

With `GIT_TRACE=1 GIT_CURL_VERBOSE=1` the trace shows:

1. `GET /<repo>/info/refs?service=git-upload-pack` → `200 OK`
2. `POST /<repo>/git-upload-pack` with
   - `Content-Type: application/x-git-upload-pack-request`
   - `Content-Encoding: gzip`
   - `Content-Length: ~72 KiB`
3. `curl` reports "We are completely uploaded and fine"
4. Server responds `HTTP/1.1 400 Bad Request`, `Content-Length: 0`, headers
   look like a normal GitHub response (`Server: GitHub-Babel/3.0`,
   `X-GitHub-Request-Id: …`).

## What works (rules things out)

- Same image, no `github.com`-scoped secret — clone succeeds.
- `git ls-remote https://github.com/<repo>` — succeeds (GET only, no POST
  body).
- `git clone --depth=1 https://github.com/<repo>` — **succeeds** with the
  secret enabled (presumably because the POST body is small enough to avoid
  corruption).
- Secrets scoped to *other* hosts (`gitlab.example.com`,
  `api.anthropic.com`) — do not affect github.com clones.
- `--net-default-egress allow` and explicit `allow` rules to github.com —
  no effect; the issue is independent of the network policy.

## What doesn't help

- URL auth form: `https://${TOKEN}@github.com/...`,
  `https://x-access-token:${TOKEN}@github.com/...`,
  `https://user:${TOKEN}@github.com/...` — all fail identically.
- HTTP/1.1 instead of HTTP/2 (`-c http.version=HTTP/1.1`).
- Adding a second `github.com`-scoped secret makes no difference; a single
  one is enough to trigger the failure.

## Hypothesis

It looks like microsandbox is doing **active TLS interception** (i.e.
terminating TLS and re-establishing it, MITM-style) for any outbound
connection whose SNI matches an `allow_hosts` entry on any configured
secret. Two observable consequences:

1. The presented cert is signed by an internal CA that isn't in the standard
   `ca-certificates` bundle, so verification fails on stock images.
2. Once decrypted, the request body is byte-rewritten (presumably to scan or
   substitute placeholder strings). On the `git-upload-pack` POST the body
   is **gzipped** — any byte mutation breaks the gzip frame, and the server
   correctly responds with 400.

Two signals point at this rather than at a passive TLS sniffer:

- The cert-trust failure on stock images would not occur with passive
  observation.
- The failure happens with a **dummy** secret value: the substitution
  process touches bytes even when there is no real placeholder to match.
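
The first signal can be checked directly by comparing the leaf certificate that `github.com` presents inside the sandbox with the one seen from the host. The following is a minimal sketch using only the Python standard library; the helper names `fingerprint` and `fetch_leaf_der` are illustrative and not part of microsandbox. Verification is disabled deliberately so the certificate can be read even when the trust store rejects it.

```python
import hashlib
import socket
import ssl


def fingerprint(der: bytes) -> str:
    """Return the SHA-256 digest of a DER-encoded certificate as lowercase hex."""
    return hashlib.sha256(der).hexdigest()


def fetch_leaf_der(host: str, port: int = 443, timeout: float = 10.0) -> bytes:
    """Fetch the leaf certificate the peer presents, without verifying it."""
    ctx = ssl.create_default_context()
    # Verification is switched off on purpose: the goal is to see *which*
    # certificate is presented, even if the local trust store rejects it.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    if der is None:
        raise RuntimeError("peer presented no certificate")
    return der


# Usage (needs network access), run once on the host and once in the sandbox:
#   print(fingerprint(fetch_leaf_der("github.com")))
```

If the two fingerprints differ only when a `github.com`-scoped secret is configured, that would be consistent with the connection being terminated and re-signed rather than passively observed.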

The fact that **the mere presence** of a host-scoped secret toggles the
behavior — not the *use* of the secret in the request — suggests the
interception is always-on for any TLS stream whose SNI matches an
`allow_hosts` entry.
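
To illustrate why a raw byte mutation would be fatal for a gzipped body, here is a small standalone sketch. It does not reproduce microsandbox's behavior; it only shows that changing a single byte of a gzip stream makes it undecodable. The payload is a made-up stand-in shaped like pkt-line `want` lines, and `try_decode` is a hypothetical helper.

```python
import gzip
import zlib

# Stand-in payload: synthetic pkt-line style "want" lines (illustrative only).
payload = b"".join(b"0032want %040x\n" % i for i in range(2000))
body = gzip.compress(payload)


def try_decode(data: bytes) -> str:
    """Return "ok" if data is a valid gzip stream, else the exception name."""
    try:
        gzip.decompress(data)
        return "ok"
    except (OSError, EOFError, zlib.error) as exc:
        return type(exc).__name__


# Flip a single byte in the middle of the compressed stream.
corrupted = bytearray(body)
corrupted[len(corrupted) // 2] ^= 0xFF

print("original: ", try_decode(body))
print("corrupted:", try_decode(bytes(corrupted)))
```

A server that receives such a body cannot inflate it, which would match the `400 Bad Request` with an empty response body seen in the trace.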

## Impact

This breaks a substantial class of real workflows in sandboxes that have any
github.com-scoped secret:

- `git clone` of moderately-sized public or private repos
- Zsh framework bootstraps that `git clone` plugins on first interactive
  shell (zim, oh-my-zsh, antigen, zinit, …) — entire framework fails to
  install
- Any package manager or build system that fetches sources via HTTPS git
- Anything that fails closed on TLS verification (most production tooling)
  hits the issue even *before* reaching the byte-corruption layer

## Suggested fix directions

- Ship microsandbox's interceptor CA in the standard sandbox runtime in a
  way the guest's system trust store picks up, or document a clear path to
  install it.
- Skip body inspection / mutation when the outgoing request carries
  `Content-Encoding: gzip` (or any other content or transfer encoding).
- Only run substitution when the placeholder is actually present in the
  outbound bytes (scan instead of always rewrite).
- For gzipped bodies, decode → substitute → re-encode rather than mutate
  raw bytes.
- At minimum, document this limitation and emit a clear warning at sandbox
  creation when a secret is scoped to a host known to be used with git
  smart-HTTP.
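
The scan-first and decode → substitute → re-encode directions could be combined roughly as follows. This is a sketch under assumptions, not microsandbox's implementation: the function name `substitute_body` and the `$GITHUB_TOKEN` placeholder format in the usage lines are hypothetical, and a real proxy would also need to update `Content-Length` whenever the body changes.

```python
import gzip
from typing import Optional


def substitute_body(
    body: bytes,
    content_encoding: Optional[str],
    placeholder: bytes,
    secret: bytes,
) -> bytes:
    """Replace placeholder with secret, leaving the body untouched when possible."""
    encoding = (content_encoding or "").strip().lower()

    if encoding in ("", "identity"):
        # Plain body: only rewrite when the placeholder is actually present.
        if placeholder not in body:
            return body
        return body.replace(placeholder, secret)

    if encoding == "gzip":
        plain = gzip.decompress(body)
        if placeholder not in plain:
            # Forward the original compressed bytes unchanged.
            return body
        # Substitute on the decoded bytes, then re-encode as a valid gzip stream.
        return gzip.compress(plain.replace(placeholder, secret))

    # Unknown encoding: pass through rather than risk corrupting the stream.
    return body


# Usage with a hypothetical placeholder format:
request = gzip.compress(b"0032want " + b"a" * 40 + b"\n")
forwarded = substitute_body(request, "gzip", b"$GITHUB_TOKEN", b"s3cret")
print(forwarded == request)
```

With this shape, a `git-upload-pack` POST that never contains the placeholder would be forwarded unmodified, regardless of its size.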

## Workarounds (current)

- Use `git clone --depth=1` (with `git fetch --unshallow` if full history is
  needed; the unshallow may still hit the same issue on large fetches —
  unconfirmed).
- Pre-bake any needed git clones into the image so they don't run after
  secrets are attached.
- Don't scope secrets to hosts you also need to do full `git clone` against
  — inject those tokens by other means inside the sandbox.
- For framework bootstraps (zim, etc.) that target github.com, run the
  bootstrap before attaching the github.com-scoped secret, or persist the
  bootstrap output in a volume so it doesn't need to re-clone on each boot.
