fix(docker): wrap docker CLI invocations in user shell to inherit PATH #378

Merged
Dumbris merged 5 commits into main from fix/docker-path-execution on Apr 11, 2026

Dumbris commented Apr 11, 2026

Summary

  • Wraps all docker CLI invocations in the user's login shell (sh -l -c on Unix, cmd /c on Windows) so Docker-related subprocesses inherit the same PATH as the user's interactive shell.
  • Fixes the case where mcpproxy (launched from the macOS tray / LaunchAgent) cannot find docker because Docker Desktop adds itself to PATH only via the user's shell rc files, not the GUI environment.
  • Covers both the upstream Docker runtime (internal/upstream/core/docker.go) and the security scanner Docker runner (internal/security/scanner/docker.go). Upstream Client changes reuse the existing wrapWithUserShell helper and apply envManager.BuildSecureEnvironment(); scanner introduces a local getDockerCmd helper with inline POSIX/Windows quoting.

Test plan

  • go build ./... passes (currently FAILS — see review comment about unused os/exec import in internal/upstream/core/monitoring.go)
  • go test ./internal/upstream/core/... -race
  • go test ./internal/security/scanner/... -race
  • Manual: launch mcpproxy from the macOS tray (no shell PATH) and confirm Docker-isolated upstream servers start and the supply-chain scanner runs
  • Manual: run on Linux / Windows to confirm shell wrapping does not regress startup
  • ./scripts/test-api-e2e.sh

Dumbris left a comment

Code review: fix/docker-path-execution

Thanks for tackling this — the macOS-tray PATH problem has been a real pain. Overall the approach (shell-wrapping all docker invocations so the login shell populates PATH) is sound and consistent with how connection_stdio.go already wraps stdio upstream commands. A few issues though, one of which is a hard blocker.

Blocker: build is broken

internal/upstream/core/monitoring.go still imports os/exec but no longer references it after the two call sites were migrated to c.newDockerCmd. go build ./... fails with:

internal/upstream/core/monitoring.go:8:2: "os/exec" imported and not used

Fix: drop the os/exec import. This alone will block CI and merge.

Correctness / design issues

  1. Two parallel shell-wrap implementations. internal/upstream/core/docker.go correctly reuses the existing c.wrapWithUserShell(...) + shellescape(...) helpers from connection_stdio.go. But internal/security/scanner/docker.go reinvents the wheel with an inline getDockerCmd that:

    • Hardcodes sh -l -c (ignores $SHELL, unlike wrapWithUserShell which respects it and falls back sensibly).
    • Hardcodes cmd /c on Windows even when the user runs Git Bash (wrapWithUserShell has explicit logic for this via isBash).
    • Uses a hand-rolled POSIX single-quote escape ('"'"') and a naive Windows strings.Trim(arg, "\"") that silently destroys any legitimately-quoted argument.

    This is a maintainability and correctness liability. Either extract wrapWithUserShell / shellescape to a small shared package (e.g. internal/shellwrap) and use it from both call sites, or at minimum document why the scanner needs its own variant. Right now the two implementations WILL drift.

  2. Security: shell metacharacters in scanner args. The inline escape in the scanner is correct for fully-controlled arguments (container names, image IDs), but RunScanner passes cfg-derived args that can include user-provided image names, volume mounts, and env vars. If any of those ever flow from remote config / registry data, the hand-rolled quoting is the only thing standing between an attacker and arbitrary shell execution on the host. The existing shellescape in connection_stdio.go has been battle-tested; please reuse it.

  3. Windows branch in scanner is broken for args containing spaces or quotes. strings.Trim(arg, "\"") only strips leading/trailing quotes: an arg like C:\Program Files\Docker\docker.exe will be wrapped as "C:\Program Files\Docker\docker.exe" and passed through cmd.exe unchanged, but an arg containing an embedded " will yield a malformed command line. cmd.exe quoting rules are notoriously bad; again, reuse the existing Windows path from shellescape.

  4. Performance / latency. Spawning sh -l -c for every health check (checkDockerContainerHealth, GetConnectionDiagnostics — both run on a 2s interval) reads the user's full login rc files every time. On a machine with a heavy .zshrc / .bashrc, this can add 100–500ms per call and run CPU hot in the background. Consider caching the resolved Docker binary path once at startup (via one login-shell call) and then calling exec.CommandContext(ctx, cachedDockerPath, args...) directly. This is how most GUI apps on macOS solve this problem (it's essentially what the Docker Desktop helper does).

  5. No tests. Given this changes how every Docker command is launched, there should be at least:

    • A unit test for newDockerCmd / getDockerCmd asserting the resulting cmd.Path and cmd.Args for both OSes (use runtime.GOOS mocking or build tags).
    • A regression test that arguments containing spaces, single quotes, and double quotes survive round-tripping through the wrapper.

    The existing internal/upstream/core/shell_test.go already tests wrapWithUserShell; if the scanner reuses that helper, you get the tests for free.
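
To make items 1 to 3 concrete, here is a minimal sketch of the POSIX single-quote escape discussed above. The names posixQuote and joinForShell are hypothetical and chosen for illustration; this is not the project's shellescape or getDockerCmd, whose source is not shown in this thread. It covers POSIX shells only and says nothing about cmd.exe.

```go
package main

import (
	"fmt"
	"strings"
)

// posixQuote wraps s in single quotes for a POSIX shell. An embedded single
// quote is emitted as '"'"' (close the quote, add a double-quoted single
// quote, reopen the quote). Hypothetical helper, for illustration only.
func posixQuote(s string) string {
	if s == "" {
		return "''"
	}
	return "'" + strings.ReplaceAll(s, "'", `'"'"'`) + "'"
}

// joinForShell builds the single command string that would be passed to
// `sh -c`, quoting every argument so metacharacters stay literal.
func joinForShell(argv []string) string {
	quoted := make([]string, len(argv))
	for i, a := range argv {
		quoted[i] = posixQuote(a)
	}
	return strings.Join(quoted, " ")
}

func main() {
	// Spaces, a single quote, and a ';' all survive as literal text.
	fmt.Println(joinForShell([]string{"docker", "run", "--name", "it's mine", "img; rm -rf /"}))
	// prints: 'docker' 'run' '--name' 'it'"'"'s mine' 'img; rm -rf /'

	// By contrast, strings.Trim removes quotes the caller put there on purpose.
	fmt.Println(strings.Trim(`"hello world"`, "\""))
	// prints: hello world
}
```

A round-trip test along the lines of item 5 could feed such strings through a real shell and compare the arguments the child process receives; that part is omitted here because it depends on the host shell.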

Minor / style

  • getDockerCmd uses } else { after a return. gofmt doesn't flag it, but golangci-lint's golint/revive will. Drop the else.
  • The new helper in upstream/core/docker.go tolerates a nil c.envManager (the nil-check is present, good), but wrapWithUserShell itself dereferences c.envManager.GetSystemEnvVar unconditionally. If there's any code path where envManager is nil, this will panic. Worth an assertion or nil-guard at construction.
  • newDockerCmd does not log the wrapped command itself. wrapWithUserShell logs at debug level, so the upstream path gets logging for free, but the scanner's getDockerCmd is silent. Add a d.logger.Debug(...) for parity; it's invaluable when this breaks on a user's machine.

Regression risk

  • The scanner currently inherits the parent process environment via exec.CommandContext. Switching to sh -l -c means scanner subprocesses now get a full login environment, which could expose env vars (AWS creds, GitHub tokens from ~/.zshrc) to Docker images being scanned. This probably isn't what you want for a security scanner. Consider whether DockerRunner should run docker directly with a resolved absolute path (item 4 above) rather than with a full login shell.

Verdict

Do not merge as-is — the build is broken. After fixing the import, the rest is reviewable-but-fixable: I'd strongly prefer reusing wrapWithUserShell / shellescape in the scanner rather than maintaining two escape implementations, and I think the "resolve docker path once, then exec directly" approach would be a meaningfully better fix long-term, especially for the health-check hot path.

Happy to pair on any of this.

claude added 4 commits April 11, 2026 17:16
After migrating monitoring.go to use c.newDockerCmd, the os/exec import
was left dangling and broke the build. Remove it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces internal/shellwrap to centralize the login-shell wrapping
logic and docker binary resolution that were previously duplicated
(or reinvented) across internal/upstream/core and
internal/security/scanner.

Key helpers:
  * Shellescape      — POSIX single-quote / cmd.exe quoting.
  * WrapWithUserShell — wraps a command in $SHELL -l -c for Unix
    (and /c for Windows cmd), honoring $SHELL instead of hardcoding
    sh so zsh/fish users get their real interactive PATH.
  * ResolveDockerPath — sync.Once-cached absolute path for the docker
    binary. Tries exec.LookPath first and only falls back to a
    one-shot login-shell probe when the fast path fails. Callers can
    then exec docker directly on every subsequent call, avoiding the
    cost of respawning a login shell on hot paths (health check,
    diagnostics, ~every 2-5s).
  * MinimalEnv       — returns an allow-listed PATH+HOME environment
    for subprocesses that must not inherit ambient secrets.

Includes unit tests covering:
  * Shellescape round-trips for spaces, single quotes, $, backticks,
    and glob stars.
  * WrapWithUserShell honors $SHELL overrides and falls back to
    /bin/bash when unset.
  * ResolveDockerPath is cached across calls (second call returns
    the same value even after PATH is sabotaged).
  * MinimalEnv strips AWS/GitHub/Anthropic/OpenAI credentials while
    retaining PATH.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous implementation of the scanner's getDockerCmd had three
problems flagged in PR review:

1. It hardcoded `sh -l -c` on Unix, ignoring $SHELL. Users whose login
   shell is zsh/fish would get a different PATH than the one they see
   in Terminal.
2. It hardcoded `cmd /c` on Windows, breaking Git Bash / WSL users,
   and used strings.Trim(arg, "\"") which silently mangles legitimately
   quoted arguments.
3. It inherited the full user environment via sh -l -c. Because the
   scanner runs untrusted container images (vulnerability scanners,
   SBOM generators, …), ambient secrets such as AWS_ACCESS_KEY_ID,
   GITHUB_TOKEN, or ANTHROPIC_API_KEY could leak into scan sandboxes.

Fix:
  * Use shellwrap.ResolveDockerPath to locate the docker binary once
    (cached process-wide) and exec it directly — no shell wrapping.
  * Set cmd.Env to shellwrap.MinimalEnv(), which retains only the
    PATH + HOME (+ Windows equivalents) needed for docker to run.
  * Drop the custom quoting logic entirely; exec.Command already
    passes arguments verbatim when we skip the shell.

Added TestDockerRunnerDoesNotLeakAmbientSecrets which pollutes the
process env with fake AWS/GitHub/Anthropic credentials and then
asserts cmd.Env is non-nil and does NOT contain any of those keys
while still carrying PATH.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lwrap

The Docker PATH fix in this PR made every newDockerCmd call re-spawn
sh -l -c so that docker would be discoverable when mcpproxy is
launched from Launchpad / a LaunchAgent. That works, but
checkDockerContainerHealth and GetConnectionDiagnostics fire every
few seconds each, which means we were re-reading .zshrc / .bashrc
dozens of times a minute per Docker upstream.

Fix:
  * Resolve the docker binary once via shellwrap.ResolveDockerPath
    (sync.Once) and exec it directly on subsequent calls. Retain a
    fallback to the existing c.wrapWithUserShell path in case the
    cache resolution fails (e.g. uncommon install layouts), so the
    original "works when launched from Launchpad" guarantee stands.

Also deduplicate the shell wrapping implementation:
  * c.wrapWithUserShell is now a thin wrapper around
    shellwrap.WrapWithUserShell. It still emits the server-scoped
    debug log line that existed before for log continuity.
  * The package-local shellescape helper is kept as an alias of
    shellwrap.Shellescape for backward compatibility with the
    existing TestShellescape suite in shell_test.go.

No behavior change for the personal edition binary beyond the perf
improvement — docker commands are still launched with the correct
PATH and the secure environment built by c.envManager.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages bot commented Apr 11, 2026

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: 1c7dea3
Status: ✅  Deploy successful!
Preview URL: https://ccf94409.mcpproxy-docs.pages.dev
Branch Preview URL: https://fix-docker-path-execution.mcpproxy-docs.pages.dev

Dumbris commented Apr 11, 2026

Pushed review fixes (4 new commits on top of d0e391d):

Blocker

  • e5bd658 — removed the stale os/exec import from internal/upstream/core/monitoring.go; go build ./... is green again.

Unified shell wrapping (new shared package)

  • 62a4622 — added internal/shellwrap with Shellescape, WrapWithUserShell, ResolveDockerPath, and MinimalEnv. Covered by internal/shellwrap/shellwrap_test.go (shellescape tricky-arg round-trips, $SHELL override honored, $SHELL empty fallback, docker-path cache survives PATH sabotage, minimal env drops AWS/GitHub/Anthropic/OpenAI tokens while retaining PATH).
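
As an illustration of the "$SHELL override honored, $SHELL empty fallback" behaviour listed above, here is a minimal Unix-only sketch. The function wrapWithShell and its signature are assumptions made for this example; the real shellwrap.WrapWithUserShell is not shown in this thread and may differ, in particular on Windows.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// quote single-quotes s for a POSIX shell (illustrative helper).
func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'"'"'`) + "'"
}

// wrapWithShell returns the program and arguments needed to run
// `name args...` inside a login shell. The shell value would normally come
// from $SHELL; an empty value falls back to /bin/bash, matching the fallback
// described in the commit message. Hypothetical signature.
func wrapWithShell(shell, name string, args []string) (string, []string) {
	if shell == "" {
		shell = "/bin/bash"
	}
	parts := []string{quote(name)}
	for _, a := range args {
		parts = append(parts, quote(a))
	}
	// -l asks for a login shell so profile files can populate PATH;
	// -c runs the quoted command string.
	return shell, []string{"-l", "-c", strings.Join(parts, " ")}
}

func main() {
	prog, argv := wrapWithShell(os.Getenv("SHELL"), "docker", []string{"ps", "--filter", "name=my app"})
	fmt.Println(prog, argv)
}
```

Passing the shell in as a parameter, rather than reading $SHELL inside the function, keeps the override and fallback cases easy to test without mutating the process environment.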

Scanner fixes (unify + security)

  • c74af5b: internal/security/scanner/docker.go now resolves docker once via shellwrap.ResolveDockerPath and execs it directly with cmd.Env = shellwrap.MinimalEnv(). No more hardcoded sh -l -c or cmd /c, no more strings.Trim(arg, "\"") mangling, no more leaking ambient AWS/GitHub/Anthropic tokens into scanner containers. Added TestDockerRunnerDoesNotLeakAmbientSecrets which pollutes the env with fake credentials and asserts cmd.Env is non-nil and contains none of them while still carrying PATH. Unreachable else after return is gone.
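
The allow-list idea behind the minimal environment can be sketched as follows. The function filterEnv and the contents of the allowed map are assumptions for illustration; the actual list used by shellwrap.MinimalEnv (including the Windows equivalents mentioned in the commit message) is not shown in this thread.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// allowed is an assumed allow-list. Anything not named here is dropped.
// Note: this simple lookup is case-sensitive, so it is a Unix-oriented sketch.
var allowed = map[string]bool{"PATH": true, "HOME": true}

// filterEnv keeps only allow-listed KEY=VALUE entries. It always returns a
// non-nil slice, because a nil cmd.Env means "inherit the parent environment".
func filterEnv(environ []string) []string {
	out := []string{}
	for _, kv := range environ {
		key, _, ok := strings.Cut(kv, "=")
		if ok && allowed[key] {
			out = append(out, kv)
		}
	}
	return out
}

func main() {
	// Simulate an ambient secret in the parent process.
	os.Setenv("AWS_ACCESS_KEY_ID", "fake-key")

	cmd := exec.Command("docker", "version") // constructed only, never started here
	cmd.Env = filterEnv(os.Environ())

	for _, kv := range cmd.Env {
		if strings.HasPrefix(kv, "AWS_") {
			fmt.Println("leak:", kv)
		}
	}
	fmt.Println("variables passed to child:", len(cmd.Env))
}
```

An allow-list fails closed: a secret added to the parent environment later is excluded by default, whereas a deny-list would need to be updated for every new credential name.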

Upstream perf fix

  • 1c7dea3: internal/upstream/core/docker.go's newDockerCmd now uses the cached docker path on the fast path, so checkDockerContainerHealth / GetConnectionDiagnostics (firing every ~2-5s) no longer respawn sh -l -c to re-read .zshrc / .bashrc on every invocation. Login-shell wrap is retained as a fallback when the cache cannot resolve docker, preserving the original Launchpad fix. Also deduplicated: c.wrapWithUserShell and shellescape are now thin delegations to the shellwrap package, so existing shell_test.go keeps passing unchanged.
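
The "resolve once, exec directly, fall back to the login shell" pattern can be sketched as below. All names here (pathCache, dockerArgv, notFound) are hypothetical, and the lookup is injectable so the behaviour can be shown without Docker installed. The real ResolveDockerPath also performs a one-shot login-shell probe, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"sync"
)

// notFound is a small error type used to simulate a failed lookup.
type notFound string

func (e notFound) Error() string { return string(e) + ": not found" }

// pathCache resolves a binary at most once per process. sync.Once also
// caches a failure, so a failed lookup is not retried in this sketch.
type pathCache struct {
	once   sync.Once
	path   string
	err    error
	lookup func(name string) (string, error)
}

// newPathCache uses exec.LookPath, the fast path named in the commit message.
func newPathCache() *pathCache { return &pathCache{lookup: exec.LookPath} }

func (c *pathCache) resolve(name string) (string, error) {
	c.once.Do(func() { c.path, c.err = c.lookup(name) })
	return c.path, c.err
}

func quote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'"'"'`) + "'"
}

// dockerArgv returns the argv to execute: the cached absolute path with the
// arguments passed verbatim, or a login-shell wrap if resolution failed.
func dockerArgv(c *pathCache, shell string, args ...string) []string {
	if p, err := c.resolve("docker"); err == nil {
		return append([]string{p}, args...)
	}
	parts := []string{"docker"}
	for _, a := range args {
		parts = append(parts, quote(a))
	}
	return []string{shell, "-l", "-c", strings.Join(parts, " ")}
}

func main() {
	found := &pathCache{lookup: func(string) (string, error) { return "/usr/local/bin/docker", nil }}
	fmt.Println(dockerArgv(found, "/bin/sh", "ps")) // [/usr/local/bin/docker ps]

	missing := &pathCache{lookup: func(n string) (string, error) { return "", notFound(n) }}
	fmt.Println(dockerArgv(missing, "/bin/sh", "ps")) // [/bin/sh -l -c docker 'ps']
}
```

On the fast path no shell is involved, so arguments need no quoting and no rc files are read. Only the fallback pays the login-shell cost.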

Verification on macOS arm64

  • go build ./... — OK
  • go build -tags server ./... — OK
  • ./mcpproxy --version: MCPProxy v0.1.0 (personal) darwin/arm64
  • go test ./internal/upstream/core/... ./internal/security/scanner/... ./internal/shellwrap/... -race -count=1 — all green
  • go test ./internal/... -count=1 — all green
  • ./scripts/run-linter.sh: 0 issues

Personal edition binary behavior is unchanged beyond the perf improvement. The scanner now runs with a minimal, allow-listed env that does not inherit ambient credentials.

@github-actions

📦 Build Artifacts

Workflow Run: View Run
Branch: fix/docker-path-execution

Available Artifacts

  • archive-darwin-amd64 (26 MB)
  • archive-darwin-arm64 (23 MB)
  • archive-linux-amd64 (15 MB)
  • archive-linux-arm64 (13 MB)
  • archive-windows-amd64 (25 MB)
  • archive-windows-arm64 (23 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (19 MB)
  • installer-dmg-darwin-arm64 (17 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 24284567542 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

Dumbris merged commit b6dc5ad into main Apr 11, 2026
23 checks passed
Dumbris deleted the fix/docker-path-execution branch April 11, 2026 14:45