Skip to content

feat(sandbox): Docker sandbox execution with integration tests and examples#337

Open
buger wants to merge 18 commits intomainfrom
feat/sandbox-integration-tests
Open

feat(sandbox): Docker sandbox execution with integration tests and examples#337
buger wants to merge 18 commits intomainfrom
feat/sandbox-integration-tests

Conversation

@buger
Copy link
Contributor

@buger buger commented Feb 6, 2026

Summary

  • Docker sandbox execution: Run checks inside isolated Docker containers (image, Dockerfile, docker-compose modes)
  • Sandbox infrastructure: src/sandbox/ with types, DockerImageSandbox, DockerComposeSandbox, CacheVolumeManager, SandboxManager, CheckRunner, EnvFilter
  • Engine integration: Sandbox lifecycle in executeGroupedChecks, CheckRunner dispatch for sandboxed checks, --run-check CLI mode for child visor inside containers
  • Config validation: Mode exclusivity, compose requires service, cache path validation, resource limits
  • 8 unit test suites (53 tests) covering all sandbox components
  • 10 Docker integration tests (auto-skip without Docker): image mode, workspace defaults, per-check overrides, env passthrough, read-only mounts, network isolation, inline Dockerfile, --run-check payload round-trip, container reuse, cache volumes
  • 6 example YAML configs: sandbox-basic, sandbox-multi-env, sandbox-env-passthrough, sandbox-read-only, sandbox-dockerfile-inline, sandbox-cache

Test plan

  • npx jest --testPathPatterns="sandbox-docker" --no-coverage — 10 tests pass with Docker, skip cleanly without
  • npx jest --testPathPatterns="sandbox-" --no-coverage — all 9 sandbox test suites pass (63 total tests)
  • npx jest --no-coverage — full suite passes (103/104 suites, 1193 tests)
  • npm run build succeeds
  • Example YAMLs are valid (tested via inline config objects in integration tests)

Known limitation

filterEnvForSandbox forwards host PATH into containers, which can break sh on systems where /bin is not in PATH (e.g., Arch Linux). The --run-check integration test uses DockerImageSandbox directly with explicit env to work around this.

🤖 Generated with Claude Code

Visor Bot and others added 12 commits October 3, 2025 11:17
… caps; forEach-scoped counters; deterministic backoff; tests + RFC

- Engine: executeWithRouting for single and forEach items
- Inline remediation execution with per-item outputs
- Dependency validation then expansion; robust inline dep resolution
- RFC updated with goto_js/run_js and on_success semantics
- Integration tests for retry+goto, on_success goto, and forEach remediation (robust assertions)
…l superseded runs

- Job name: includes matrix.ai_provider.name and event\n- Concurrency: cancel in-progress runs per branch/workflow
…amples

Add Docker-based sandbox environments for isolated check execution:

- src/sandbox/: types, DockerImageSandbox, DockerComposeSandbox,
  CacheVolumeManager, SandboxManager, CheckRunner, EnvFilter
- Engine integration: sandbox lifecycle in executeGroupedChecks,
  CheckRunner dispatch for sandboxed checks
- CLI: --run-check mode for child visor execution inside containers
- Config: sandbox validation (mode exclusivity, compose requires service,
  cache path validation, resource limits)
- 8 unit test suites (53 tests) covering all sandbox components
- 10 Docker integration tests (auto-skip without Docker):
  image mode, workspace defaults, per-check overrides, env passthrough,
  read-only mounts, network isolation, inline Dockerfile, --run-check
  payload round-trip, container reuse, cache volumes
- 6 example YAML configs demonstrating each sandbox capability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@probelabs
Copy link
Contributor

probelabs bot commented Feb 6, 2026

🔍 Code Analysis Results

ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.

🐛 Debug Information

Provider: google
Model: gemini-2.5-pro-preview-06-05
API Key Source: GOOGLE_API_KEY
Processing Time: 856ms
Timestamp: 2026-02-06T16:17:58.319Z
Prompt Length: 354561 characters
Response Length: 0 characters
JSON Parse Success:

Errors

  • [overview] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.
  • [performance] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.
  • [style] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.

Debug Details

⚠️ Debug information is too large for GitHub comments.
📁 Full debug information saved to artifact: visor-debug-2026-02-06T16-18-00-589Z.md

🔗 Download Link: visor-debug-2315
💡 Go to the GitHub Action run above and download the debug artifact to view complete prompts and responses.


Powered by Visor from Probelabs

Last updated: 2026-02-06T16:18:00.741Z | Triggered by: opened | Commit: 51f3646

@probelabs
Copy link
Contributor

probelabs bot commented Feb 6, 2026

🔍 Code Analysis Results

✅ Security Check Passed

No security issues found – changes LGTM.

Performance Issues (1)

Severity Location Issue
🟠 Error system:0
ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.

✅ Quality Check Passed

No quality issues found – changes LGTM.

Style Issues (1)

Severity Location Issue
🟠 Error system:0
ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.
🐛 Debug Information

Provider: google
Model: gemini-2.5-pro-preview-06-05
API Key Source: GOOGLE_API_KEY
Processing Time: 856ms
Timestamp: 2026-02-06T16:17:58.319Z
Prompt Length: 354561 characters
Response Length: 0 characters
JSON Parse Success:

Errors

  • [overview] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.
  • [performance] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.
  • [style] ProbeAgent execution failed: Error: Failed to get response from AI model during iteration 1. No output generated. Check the stream for errors.

Debug Details

⚠️ Debug information is too large for GitHub comments.
📁 Full debug information saved to artifact: visor-debug-2026-02-06T16-18-01-734Z.md

🔗 Download Link: visor-debug-2315
💡 Go to the GitHub Action run above and download the debug artifact to view complete prompts and responses.


Powered by Visor from Probelabs

Last updated: 2026-02-06T16:18:01.897Z | Triggered by: opened | Commit: 51f3646

buger and others added 3 commits February 6, 2026 17:04
…ble env allowlist

- Add SandboxDefaults type with configurable env_passthrough that replaces
  hardcoded defaults at workspace level
- Add withActiveSpan/addEvent/setSpanError to sandbox lifecycle methods
  (sandbox-manager, docker-image-sandbox, check-runner)
- Add trace-ingester.ts for NDJSON child trace relay via workspace mount
- Add sandbox-telemetry.ts wrapper with graceful fallback when telemetry
  module is unavailable

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ngine

Merge origin/main into feat/sandbox-integration-tests to bring in:
- src/telemetry/ (OTel instrumentation framework)
- State machine execution engine refactor
- All upstream fixes and features

Semantic merge applied to:
- src/types/config.ts: added SandboxDefaults, sandbox_defaults to VisorConfig
- src/check-execution-engine.ts: compatibility layer (engine moved to state-machine)
- src/state-machine-execution-engine.ts: SandboxManager lifecycle integration
- src/state-machine/dispatch/sandbox-routing.ts: new sandbox execution routing
- src/cli-main.ts: re-added --run-check mode
- src/config.ts: re-added sandbox validation logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix visorDistPath resolution: detect ncc bundle context via
  existsSync(join(__dirname, 'index.js')) instead of always using
  dirname(__dirname), which resolves to project root not dist/
- Base64-encode sandbox payload to avoid double shell-escaping
  through two sh -c layers (check-runner → docker exec)
- Initialize OTel telemetry in --run-check mode when
  VISOR_TELEMETRY_ENABLED=true, wrapping provider.execute() in
  withActiveSpan for proper child span emission
- Set NODE_PATH=/workspace/node_modules in sandbox env so child
  visor can resolve externalized @opentelemetry/* packages
- Rewrite trace-ingester to create proper OTel child spans with
  original timing instead of flat events, filtering fallback-ndjson
  duplicates
- Fix prettier formatting in merge files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@probelabs
Copy link
Contributor

probelabs bot commented Feb 6, 2026

PR Overview: Docker Sandbox Execution with Integration Tests and Examples

Summary

This PR introduces Docker sandbox execution capabilities for Visor, enabling checks to run in isolated container environments. The implementation includes comprehensive infrastructure, testing, and example configurations.

Files Changed Analysis

24 files changed with 2,797 additions and 1 deletion:

Core Infrastructure (New src/sandbox/ module)

  • types.ts - Type definitions for sandbox configuration, execution, and payload/result serialization
  • docker-image-sandbox.ts (216 lines) - Docker image-based sandbox with support for pre-built images, Dockerfiles, and inline Dockerfiles
  • docker-compose-sandbox.ts (122 lines) - Docker Compose-based sandbox for multi-service environments
  • sandbox-manager.ts (172 lines) - Lifecycle management, lazy container startup, and routing
  • check-runner.ts (240 lines) - Serializes check execution into sandbox containers via child visor processes
  • cache-volume-manager.ts (230 lines) - Docker named volume management with TTL-based eviction
  • env-filter.ts (61 lines) - Environment variable filtering for sandbox forwarding
  • sandbox-telemetry.ts (57 lines) - Telemetry helpers with graceful fallback
  • index.ts (20 lines) - Module exports

Engine Integration

  • src/cli-main.ts (+188 lines) - New --run-check CLI mode for child visor execution inside containers
  • src/config.ts (+124 lines) - Sandbox configuration validation (mode exclusivity, compose requirements, cache paths, resource limits)

Configuration & Examples

  • examples/sandbox-*.yaml (6 files) - Example configurations demonstrating:
    • Basic image mode
    • Multi-environment setups
    • Environment passthrough
    • Read-only mounts with network isolation
    • Inline Dockerfiles
    • Cache volume management

Documentation

  • DEPLOYMENT.md (118 lines) - Cloudflare Pages deployment guide
  • GITHUB_CHECKS.md (195 lines) - GitHub Checks API integration documentation
  • defaults/.visor.yaml (420 lines) - Updated default configuration

CI/CD

  • .github/workflows/test.yml (75 lines) - GitHub Actions test workflow

Developer Tools

  • demo-suppression*.js (2 files) - Suppression feature demos
  • minimal-probe-test.{js,ts} (2 files) - ProbeAgent testing utilities

Architecture & Impact Assessment

What This PR Accomplishes

  1. Docker Sandbox Execution - Run checks in isolated Docker containers with:

    • Pre-built image mode
    • Dockerfile mode (file or inline)
    • Docker Compose mode for multi-service environments
    • Cache volume management with TTL eviction
    • Environment variable filtering
    • Resource limits (CPU, memory)
    • Network isolation and read-only mounts
  2. Comprehensive Testing - 8 unit test suites (53 tests) + 10 Docker integration tests

  3. Engine Integration - Seamless integration with existing check execution via --run-check mode

Key Technical Changes

Sandbox Infrastructure:

  • SandboxInstance interface with start(), exec(), and stop() methods
  • DockerImageSandbox for single-container scenarios
  • DockerComposeSandbox for multi-service setups
  • CacheVolumeManager for efficient volume reuse and cleanup
  • CheckRunner for payload serialization and child process execution

Execution Flow:

graph TD
    A[Check Execution Request] --> B{Sandbox Configured?}
    B -->|No| C[Run on Host]
    B -->|Yes| D[SandboxManager.resolveSandbox]
    D --> E[SandboxManager.getOrStart]
    E --> F{Sandbox Type}
    F -->|Image| G[DockerImageSandbox]
    F -->|Compose| H[DockerComposeSandbox]
    G --> I[CheckRunner.runCheck]
    H --> I
    I --> J[Build CheckRunPayload]
    J --> K[Filter Environment]
    K --> L[Exec: node index.js --run-check]
    L --> M[Parse CheckRunResult]
    M --> N[Return ReviewSummary]
Loading

Configuration Validation:

  • Exactly one mode required (image, dockerfile, dockerfile_inline, or compose)
  • Compose mode requires service field
  • Cache paths must be absolute
  • Resource limits validated (CPU must be positive number)

Affected System Components

graph LR
    A[Config Manager] -->|Validates| B[Sandbox Configs]
    C[Execution Engine] -->|Resolves| D[SandboxManager]
    D -->|Creates| E[DockerImageSandbox]
    D -->|Creates| F[DockerComposeSandbox]
    E -->|Mounts| G[CacheVolumeManager]
    F -->|Uses| H[CheckRunner]
    H -->|Executes| I[Child Visor Process]
    I -->|Returns| J[CheckRunResult]
Loading

Scope Discovery & Context Expansion

Immediate Impact:

  • Core engine: Check execution routing and sandbox lifecycle
  • Configuration: New sandboxes section with validation
  • CLI: --run-check mode for container execution
  • Testing: New integration test framework with Docker

Related Modules to Review:

  • src/state-machine/dispatch/execution-invoker.ts - Check orchestration integration
  • src/state-machine/dispatch/sandbox-routing.ts - Sandbox vs host routing logic
  • src/providers/ - Provider compatibility with sandbox execution
  • tests/integration/sandbox-docker-integration.test.ts - Integration test coverage

Configuration Changes:

sandboxes:
  node-env:
    image: "node:20-alpine"
    cache:
      paths: ["/root/.npm"]
      ttl: "7d"
    env_passthrough: ["CI", "GITHUB_*"]

sandbox: node-env  # Default for all checks

checks:
  my-check:
    type: command
    sandbox: alpine-env  # Override default

Testing Coverage:

  • Unit tests: All sandbox components with mocked Docker exec
  • Integration tests: Real Docker scenarios (auto-skip without Docker)
  • Test scenarios: Image mode, workspace defaults, per-check overrides, env passthrough, read-only mounts, network isolation, inline Dockerfiles, container reuse, cache volumes

Known Limitations:

  • filterEnvForSandbox forwards host PATH which can break sh on systems where /bin is not in PATH (e.g., Arch Linux)
  • Integration test uses DockerImageSandbox directly with explicit env as workaround
Metadata
  • Review Effort: 4 / 5
  • Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2026-02-07T07:19:33.374Z | Triggered by: pr_updated | Commit: d91bbd1

💡 TIP: You can chat with Visor using /visor ask <your question>

@probelabs
Copy link
Contributor

probelabs bot commented Feb 6, 2026

Security Issues (22)

Severity Location Issue
🔴 Critical src/sandbox/cache-volume-manager.ts:105
Docker volume name is interpolated directly into shell command without proper escaping. An attacker who can control the volume name (via cache prefix or sandbox name) could inject arbitrary commands.
💡 SuggestionUse proper shell argument escaping or pass volume names as separate arguments to docker commands. Consider using a whitelist for allowed characters in volume names.
🔧 Suggested Fix
await execAsync(`docker volume inspect ${shellescape(name)}`, {
🔴 Critical src/sandbox/cache-volume-manager.ts:120
Docker volume name and label value are interpolated into shell command without proper escaping. The timestamp value could contain malicious characters.
💡 SuggestionEscape the volume name and label value properly, or use argument-based docker CLI invocation.
🔧 Suggested Fix
await execAsync(`docker volume create --label visor.last-used=${shellescape(now)} ${shellescape(name)}`, {
🔴 Critical src/sandbox/cache-volume-manager.ts:132
Source and target volume names are interpolated into docker run command without proper escaping. An attacker controlling cache configuration could inject arbitrary commands.
💡 SuggestionProperly escape volume names or use argument-based invocation.
🔧 Suggested Fix
await execAsync(`docker run --rm -v ${shellescape(source)}:/src:ro -v ${shellescape(target)}:/dst alpine sh -c "cp -a /src/. /dst/"`,
🔴 Critical src/sandbox/cache-volume-manager.ts:163
Docker volume filter pattern is interpolated into shell command without proper escaping. The sandboxName parameter could contain malicious characters.
💡 SuggestionEscape the sandbox name before interpolating into the command.
🔧 Suggested Fix
`docker volume ls --filter "name=visor-cache-" --format "{{.Name}}"`
🔴 Critical src/sandbox/cache-volume-manager.ts:189
Volume name is interpolated into docker volume inspect command without proper escaping.
💡 SuggestionEscape the volume name before use in shell command.
🔧 Suggested Fix
`docker volume inspect ${shellescape(vol)} --format "{{.CreatedAt}}"`
🔴 Critical src/sandbox/cache-volume-manager.ts:196
Volume name is interpolated into docker volume rm command without proper escaping.
💡 SuggestionEscape the volume name before use in shell command.
🔧 Suggested Fix
await execAsync(`docker volume rm ${shellescape(vol)}`, {
🔴 Critical src/sandbox/cache-volume-manager.ts:216
Volume name is interpolated into docker volume rm command without proper escaping.
💡 SuggestionEscape the volume name before use in shell command.
🔧 Suggested Fix
await execAsync(`docker volume rm ${shellescape(vol)}`, {
🔴 Critical src/sandbox/docker-compose-sandbox.ts:41
Compose file path and project name are interpolated into shell command without proper escaping. An attacker controlling sandbox configuration could inject arbitrary commands.
💡 SuggestionEscape the compose file path and project name, or use argument-based invocation.
🔧 Suggested Fix
await execAsync(`docker compose -f ${shellescape(this.config.compose)} -p ${shellescape(this.projectName)} up -d`, {
🔴 Critical src/sandbox/docker-compose-sandbox.ts:91
The entire command string is built by joining arguments with spaces and simple quotes. Environment variable values could contain single quotes or newlines, breaking out of the quoting and enabling command injection.
💡 SuggestionUse proper shell escaping for all arguments, especially environment variable values. Consider using a library like shell-escape or shlex.
🔧 Suggested Fix
const cmd = args.map(a => shellescape(a)).join(' ');
🔴 Critical src/sandbox/docker-compose-sandbox.ts:112
Compose file path and project name are interpolated into shell command without proper escaping.
💡 SuggestionEscape the compose file path and project name.
🔧 Suggested Fix
await execAsync(`docker compose -f ${shellescape(this.config.compose)} -p ${shellescape(this.projectName)} down`, {
🔴 Critical src/sandbox/docker-image-sandbox.ts:88
Dockerfile path and repo path are interpolated into shell command without proper escaping. An attacker controlling sandbox configuration could inject arbitrary commands.
💡 SuggestionEscape the dockerfile path and repo path before interpolating into the command.
🔧 Suggested Fix
await execAsync(`docker build -t ${shellescape(imageName)} -f ${shellescape(dockerfilePath)} ${shellescape(this.repoPath)}`, {
🔴 Critical src/sandbox/docker-image-sandbox.ts:108
Repo path is interpolated into docker build command without proper escaping.
💡 SuggestionEscape the dockerfile path and repo path.
🔧 Suggested Fix
await execAsync(`docker build -t ${shellescape(imageName)} -f ${shellescape(this.config.dockerfile)} ${shellescape(this.repoPath)}`,
🔴 Critical src/sandbox/docker-image-sandbox.ts:143
The docker run command is built by joining arguments with spaces and simple quotes. Path arguments (repoPath, visorDistPath) and resource values could contain spaces or special characters, enabling command injection.
💡 SuggestionUse proper shell escaping for all arguments. The current quoting only handles spaces, not quotes, backticks, or other shell metacharacters.
🔧 Suggested Fix
const cmd = args.map(a => shellescape(a)).join(' ');
🔴 Critical src/sandbox/docker-image-sandbox.ts:182
The docker exec command is built with simple single-quote escaping. Environment variable values could contain single quotes, enabling command injection.
💡 SuggestionUse proper shell escaping that handles all shell metacharacters, not just single quotes.
🔧 Suggested Fix
const cmd = args.map(a => shellescape(a)).join(' ');
🔴 Critical src/sandbox/docker-image-sandbox.ts:203
Container name is interpolated into docker rm command without proper escaping.
💡 SuggestionEscape the container name before use in shell command.
🔧 Suggested Fix
await execAsync(`docker rm -f ${shellescape(this.containerName)}`, {
🟠 Error src/sandbox/check-runner.ts:135
The base64-encoded payload and visor path are interpolated into a shell command without proper escaping. While base64 is safe, the visorPath could contain malicious characters.
💡 SuggestionEscape the visorPath before interpolating into the command.
🔧 Suggested Fix
const command = `echo ${b64Payload} | base64 -d | node ${shellescape(visorPath)}/index.js --run-check -`;
🟠 Error src/sandbox/docker-image-sandbox.ts:70
Inline Dockerfile content is written to a temp file and executed. If an attacker can control the dockerfile_inline configuration, they can execute arbitrary code on the host during docker build.
💡 SuggestionValidate and sanitize inline Dockerfile content. Restrict allowed Dockerfile instructions or use a whitelist approach. Consider disallowing inline Dockerfiles in untrusted configurations.
🔧 Suggested Fix
// Add validation before writing Dockerfile
if (this.config.dockerfile_inline) {
  const sanitized = this.validateDockerfile(this.config.dockerfile_inline);
  writeFileSync(dockerfilePath, sanitized, 'utf8');
}
🟡 Warning src/sandbox/env-filter.ts:8
PATH is forwarded by default into sandbox containers. This could expose host system information and potentially be used for path manipulation attacks.
💡 SuggestionConsider making PATH forwarding opt-in rather than default, or sanitize PATH to only include essential directories.
🔧 Suggested Fix
export const BUILTIN_PASSTHROUGH = ['HOME', 'USER', 'CI', 'NODE_ENV', 'LANG'];
🟡 Warning src/sandbox/check-runner.ts:106
Trace file path is constructed using sandboxManager.getRepoPath() and a random UUID, but the repoPath comes from user configuration and could potentially point outside the intended workspace.
💡 SuggestionValidate that repoPath is within the expected workspace directory before writing files.
🔧 Suggested Fix
const resolvedRepoPath = resolve(sandboxManager.getRepoPath());
// Add validation to ensure resolvedRepoPath is within allowed workspace
hostTracePath = join(resolvedRepoPath, traceFileName);
🟡 Warning src/sandbox/docker-image-sandbox.ts:115
The repoPath and visorDistPath are directly interpolated into docker volume mount specifications. If these paths contain special characters or attempt path traversal, they could mount unintended directories.
💡 SuggestionValidate that repoPath and visorDistPath are absolute paths and don't contain path traversal sequences before using them in mount specs.
🔧 Suggested Fix
// Validate paths are absolute and don't contain traversal
if (!this.repoPath.startsWith('/') || this.repoPath.includes('..')) {
  throw new Error('Invalid repoPath');
}
🟡 Warning src/sandbox/cache-volume-manager.ts:64
Container paths from cache configuration are used directly in volume mount specs without validation. Malicious paths could cause unexpected behavior.
💡 SuggestionValidate that container paths are absolute paths and don't contain shell metacharacters before using them.
🔧 Suggested Fix
if (!containerPath.startsWith('/') || /[^a-zA-Z0-9/_-]/.test(containerPath)) {
  throw new Error(`Invalid cache path: ${containerPath}`);
}
🟡 Warning src/sandbox/check-runner.ts:122
The entire check configuration including potentially sensitive data (API keys, tokens) is serialized to JSON and passed as a command-line argument. This could expose secrets in process listings.
💡 SuggestionConsider passing sensitive data via a temporary file or environment variables instead of command-line arguments. Use base64 encoding as done, but also ensure the payload is not logged.
🔧 Suggested Fix
// Add: Ensure payload is not logged in debug output
// Consider: Use a temporary file for very large payloads or those with secrets

Architecture Issues (14)

Severity Location Issue
🟠 Error src/sandbox/check-runner.ts:165
CheckRunner searches stdout for JSON by scanning backwards from the last line looking for lines starting with '{'. This is fragile and will break if any debug output contains JSON-like objects.
💡 SuggestionUse a more robust protocol: write result to a dedicated file descriptor, use a delimiter marker, or prefix output with a known header. Consider NDJSON with a specific result record type.
🟠 Error src/sandbox/check-runner.ts:51
CheckRunner.runCheck is a static method with 8 parameters, making it difficult to extend or mock. The method mixes payload building, env filtering, trace setup, command execution, and result parsing.
💡 SuggestionRefactor into a class with instance methods. Extract payload building, env filtering, and result parsing into separate, testable components. Use a builder pattern for the complex parameter set.
🟢 Info src/sandbox/check-runner.ts:119
CheckRunner sets NODE_PATH to allow child visor to resolve externalized npm packages from workspace node_modules. This is a workaround for ncc bundling limitations.
💡 SuggestionDocument this limitation clearly. Consider alternative approaches: bundle necessary dependencies, use a different bundler, or make the workspace node_modules available via a dedicated mount.
🟡 Warning src/sandbox/sandbox-manager.ts:15
SandboxManager directly constructs concrete sandbox implementations (DockerImageSandbox, DockerComposeSandbox) instead of using a factory pattern or dependency injection. This creates tight coupling and makes it difficult to extend with new sandbox types (e.g., podman, Kubernetes) without modifying the manager.
💡 SuggestionIntroduce a SandboxFactory interface and registry pattern. Allow sandbox types to register themselves, making the system extensible without modifying core logic.
🟡 Warning src/sandbox/check-runner.ts:51
CheckRunner contains special-case logic for trace file relay that only applies to non-read-only sandboxes. This creates a conditional path that could be eliminated through better design.
💡 SuggestionConsider using a strategy pattern for trace collection with separate implementations for read-only vs writable sandboxes, or make trace relay a configurable capability of the sandbox interface.
🟡 Warning src/sandbox/cache-volume-manager.ts:45
CacheVolumeManager implements complex volume naming, fallback copying, TTL-based eviction, and scope limiting. This appears over-engineered for the initial use case of simple cache persistence.
💡 SuggestionStart with simpler volume management (just create and mount). Add fallback and eviction features only when proven necessary through real-world usage patterns.
🟡 Warning src/sandbox/docker-image-sandbox.ts:20
DockerImageSandbox handles both container lifecycle (start/stop) and image building (buildImageIfNeeded). This mixes two separate concerns - container management vs image building.
💡 SuggestionExtract image building into a separate ImageBuilder service. The sandbox should only handle container lifecycle, delegating image construction to the builder.
🟡 Warning src/sandbox/docker-compose-sandbox.ts:16
DockerComposeSandbox duplicates command-building logic from DockerImageSandbox (args array construction, shell quoting). This violates DRY and creates maintenance burden.
💡 SuggestionExtract common Docker command building logic into a shared utility. Both sandbox implementations should use the same command construction helper.
🟡 Warning src/sandbox/env-filter.ts:28
EnvFilter implements custom glob pattern matching with only trailing wildcard support. This is a special case implementation that doesn't use standard glob libraries.
💡 SuggestionUse a proper glob/minimatch library for pattern matching. If custom behavior is needed, document why standard glob patterns are insufficient.
🟡 Warning src/sandbox/sandbox-manager.ts:58
SandboxManager.resolveSandbox returns null for host execution, creating an implicit convention that callers must check for null. This is error-prone and not clearly documented.
💡 SuggestionReturn a special HostSandbox instance or use a Result type that makes the host vs sandbox decision explicit and type-safe.
🟡 Warning src/sandbox/cache-volume-manager.ts:145
CacheVolumeManager.touchVolume is documented as a no-op because Docker labels are immutable, but the method still exists and is called. This creates dead code that confuses readers.
💡 SuggestionRemove the touchVolume method entirely if it's truly a no-op. If last-used tracking is needed, implement it using a separate mechanism (e.g., a metadata file in the volume).
🟡 Warning src/sandbox/docker-image-sandbox.ts:95
DockerImageSandbox.buildImageIfNeeded creates temp files for inline Dockerfiles but cleanup is in a finally block that could fail silently. Temp files may leak on errors.
💡 SuggestionUse a try-with-resources pattern or RAII wrapper for temp file lifecycle. Consider using a temp directory that gets cleaned up as a whole rather than individual files.
🟡 Warning src/sandbox/sandbox-manager.ts:145
SandboxManager.stopAll swallows errors when stopping sandboxes or evicting cache, only logging warnings. This may hide serious problems (e.g., containers that failed to stop).
💡 SuggestionCollect errors during cleanup and return them or throw an aggregate exception. Allow callers to decide if cleanup failures should be fatal.
🟡 Warning src/sandbox/cache-volume-manager.ts:189
CacheVolumeManager.evictExpired uses regex to parse volume names and extract prefix/sandbox/hash. This parsing is fragile and will break if volume naming convention changes.
💡 SuggestionStore metadata as Docker labels on volumes (e.g., visor-sandbox, visor-prefix, visor-path). Use docker inspect to read labels instead of parsing names.

Performance Issues (6)

Severity Location Issue
🟠 Error src/sandbox/cache-volume-manager.ts:180-193
N+1 query problem in cache volume eviction. The code sequentially inspects each volume with individual `docker volume inspect` commands in a for loop. For large numbers of cache volumes, this creates O(n) sequential Docker CLI calls, each with a 10-second timeout. This can cause significant delays during cleanup.
💡 SuggestionUse `docker volume inspect` with multiple volume names in a single command to batch inspect all volumes at once, then parse the JSON output. This reduces O(n) sequential calls to O(1) batched call.
🔧 Suggested Fix
Replace the sequential for loop with a single batched inspect command:
// Batch inspect all volumes at once
const volumesToInspect = sandboxVolumes.join(&#39; &#39;);
const { stdout: inspectOut } = await execAsync(
  `docker volume inspect ${volumesToInspect} --format &#39;{{.Name}}:{{.CreatedAt}}&#39;`,
  { maxBuffer: EXEC_MAX_BUFFER, timeout: 10000 * sandboxVolumes.length }
);

// Parse results and evict in parallel
const evictionPromises = [];
for (const line of inspectOut.trim().split(&#39;\n&#39;)) {
  const [volName, createdAtStr] = line.split(&#39;:&#39;);
  const createdAt = new Date(createdAtStr).getTime();
  if (now - createdAt &gt; ttlMs) {
    evictionPromises.push(execAsync(`docker volume rm ${volName}`, { maxBuffer: EXEC_MAX_BUFFER, timeout: 10000 }));
  }
}
await Promise.allSettled(evictionPromises);
```</code></pre>
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>src/sandbox/docker-image-sandbox.ts:157-195</code></td>
      <td><div>Synchronous file operations in async context. The `exec` method uses `execAsync` but the command construction involves synchronous string operations. While not critical, the pattern of building complex shell commands with string concatenation is error-prone and less efficient than using argument arrays directly with spawn/exec.        <details><summary>💡 <strong>Suggestion</strong></summary>Consider using the `args` array format directly with `spawn` or `execFile` instead of building shell command strings. This avoids shell parsing overhead and reduces security risks from shell injection.
</details>
        <details><summary>🔧 <strong>Suggested Fix</strong></summary><pre><code>Use direct argument passing:

```typescript
import { spawn } from &#39;child_process&#39;;

// In exec method:
return new Promise((resolve, reject) =&gt; {
  const args = [&#39;exec&#39;];
  
  for (const [key, value] of Object.entries(options.env)) {
    args.push(&#39;-e&#39;, `${key}=${value}`);
  }
  
  args.push(this.containerId, &#39;sh&#39;, &#39;-c&#39;, options.command);
  
  const proc = spawn(&#39;docker&#39;, args, {
    timeout: options.timeoutMs || 600000,
    maxBuffer: options.maxBuffer || EXEC_MAX_BUFFER,
  });
  
  let stdout = &#39;&#39;;
  let stderr = &#39;&#39;;
  
  proc.stdout?.on(&#39;data&#39;, (data) =&gt; stdout += data);
  proc.stderr?.on(&#39;data&#39;, (data) =&gt; stderr += data);
  
  proc.on(&#39;close&#39;, (code) =&gt; {
    resolve({ stdout, stderr, exitCode: code || 0 });
  });
  
  proc.on(&#39;error&#39;, (err) =&gt; {
    resolve({ stdout: &#39;&#39;, stderr: err.message, exitCode: 1 });
  });
});
```</code></pre>
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>src/sandbox/docker-compose-sandbox.ts:52-104</code></td>
      <td><div>Same issue as docker-image-sandbox.ts - building shell command strings instead of using direct argument arrays. This adds shell parsing overhead and potential security issues.        <details><summary>💡 <strong>Suggestion</strong></summary>Refactor to use direct argument arrays with spawn/execFile instead of building shell command strings.
</details>
        <details><summary>🔧 <strong>Suggested Fix</strong></summary><pre><code>See suggestion for docker-image-sandbox.ts - use spawn with argument arrays instead of shell command construction.</code></pre>
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>src/sandbox/check-runner.ts:73-227</code></td>
      <td><div>Large buffer allocation (50MB) for every sandboxed check execution. While this prevents truncation issues, it allocates significant memory for each check. For checks with many parallel executions, this could cause memory pressure.        <details><summary>💡 <strong>Suggestion</strong></summary>Consider making the maxBuffer configurable per-check or using streaming for very large outputs. Most checks won&#39;t need 50MB of output.
</details>
        <details><summary>🔧 <strong>Suggested Fix</strong></summary><pre><code>Add configurable buffer size:

```typescript
const result = await sandboxManager.exec(sandboxName, {
  command,
  env,
  timeoutMs: timeoutMs || 600000,
  maxBuffer: checkConfig.maxBuffer || 10 * 1024 * 1024, // Default 10MB, configurable
});
```</code></pre>
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/check-runner.ts:86-88</code></td>
      <td><div>Inefficient serialization of PRInfo for every sandboxed check. The `serializePRInfo` function creates a deep copy of the entire PRInfo object including all file patches, which can be very large (megabytes) for substantial PRs. This data is then JSON.stringify&#39;d and base64-encoded, creating memory pressure and slow IPC.        <details><summary>💡 <strong>Suggestion</strong></summary>Consider selective serialization - only serialize fields that are actually needed by the child process. For large patches, consider passing them via a shared file or named pipe instead of embedding in the JSON payload. Alternatively, implement field-level filtering to exclude large fields like `fullDiff` or `patch` when not needed.
</details>
        <details><summary>🔧 <strong>Suggested Fix</strong></summary><pre><code>Add selective serialization option:

```typescript
function serializePRInfo(prInfo: PRInfo, options?: { includePatches?: boolean }): SerializedPRInfo {
  const includePatches = options?.includePatches !== false;
  return {
    // ... other fields ...
    files: (prInfo.files || []).map(f =&gt; ({
      filename: f.filename,
      status: f.status,
      additions: f.additions,
      deletions: f.deletions,
      changes: f.changes,
      patch: includePatches ? f.patch : undefined, // Optional
    })),
    fullDiff: includePatches ? prInfo.fullDiff : undefined,
    commitDiff: includePatches ? prInfo.commitDiff : undefined,
  };
}
```</code></pre>
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/cache-volume-manager.ts:56-98</code></td>
      <td><div>Sequential async operations in `resolveVolumes` method. Each cache volume is checked and created sequentially in a for loop instead of in parallel. For sandboxes with multiple cache paths, this delays container startup unnecessarily.        <details><summary>💡 <strong>Suggestion</strong></summary>Use `Promise.all` or `Promise.allSettled` to check volume existence and create missing volumes in parallel. Only the fallback copy operation needs to be sequential.
</details>
        <details><summary>🔧 <strong>Suggested Fix</strong></summary><pre><code>Replace the for loop with parallel operations:

```typescript
const volumePromises = cacheConfig.paths.map(async (containerPath) =&gt; {
  const hash = pathHash(containerPath);
  const volumeName = `visor-cache-${prefix}-${sandboxName}-${hash}`;
  
  const exists = await this.volumeExists(volumeName);
  
  if (!exists &amp;&amp; cacheConfig.fallback_prefix) {
    const fallbackPrefix = cacheConfig.fallback_prefix.replace(/[^a-zA-Z0-9._-]/g, &#39;-&#39;);
    const fallbackVolume = `visor-cache-${fallbackPrefix}-${sandboxName}-${hash}`;
    const fallbackExists = await this.volumeExists(fallbackVolume);
    
    if (fallbackExists) {
      logger.info(`Cache miss for &#39;${volumeName}&#39;, copying from fallback &#39;${fallbackVolume}&#39;`);
      await this.copyVolume(fallbackVolume, volumeName);
    } else {
      await this.createVolume(volumeName);
    }
  } else if (!exists) {
    await this.createVolume(volumeName);
  }
  
  await this.touchVolume(volumeName);
  
  return {
    volumeName,
    mountSpec: `${volumeName}:${containerPath}`,
  };
});

const volumes = await Promise.all(volumePromises);
return volumes;
```</code></pre>
</details>
</div></td>
    </tr>
  </tbody>
</table>
<!-- visor:section-end id="performance" -->

<!-- visor:section={"id":"quality","revision":5} -->
### Quality Issues (28)
<table>
  <thead>
    <tr>
      <th>Severity</th>
      <th>Location</th>
      <th>Issue</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>🟠 Error</td>
      <td><code>src/sandbox/cache-volume-manager.ts:145</code></td>
      <td><div>Cache volume copy failures are silently caught and only logged as warnings. This can lead to cache inconsistencies where a volume is marked as existing but is empty or partially populated.        <details><summary>💡 <strong>Suggestion</strong></summary>Propagate the error or return a status indicating whether the copy succeeded, so the caller can decide whether to proceed with the potentially corrupted cache volume.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟠 Error</td>
      <td><code>src/sandbox/docker-compose-sandbox.ts:95</code></td>
      <td><div>The exec command is constructed by manually joining and quoting arguments. While quoting is applied, the logic is complex and error-prone. A bug in the quoting logic could lead to shell injection.        <details><summary>💡 <strong>Suggestion</strong></summary>Use a proper shell-escaping library or the &#39;shell-quote&#39; npm package to safely escape command arguments, or pass arguments as an array to avoid shell parsing entirely.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟠 Error</td>
      <td><code>src/sandbox/docker-image-sandbox.ts:145</code></td>
      <td><div>The exec command is constructed by manually joining and quoting arguments, similar to the compose sandbox. This complex quoting logic is fragile and could lead to shell injection vulnerabilities.        <details><summary>💡 <strong>Suggestion</strong></summary>Use a proper shell-escaping library or the &#39;shell-quote&#39; npm package to safely escape command arguments, or pass arguments as an array to avoid shell parsing entirely.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>src/sandbox/env-filter.ts:14</code></td>
      <td><div>The matchesPattern function only supports trailing wildcards (e.g., &#39;GITHUB_*&#39;). It does not support leading wildcards (&#39;*_TOKEN&#39;) or wildcards in the middle of patterns, which limits filtering flexibility.        <details><summary>💡 <strong>Suggestion</strong></summary>Document this limitation clearly or implement full glob pattern matching using a library like &#39;minimatch&#39; if more complex patterns are needed.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/unit/sandbox-check-runner.test.ts:35</code></td>
      <td><div>Test uses magic number 42 for PR number without semantic meaning. While this is a common convention, it would be clearer to use a named constant like TEST_PR_NUMBER.        <details><summary>💡 <strong>Suggestion</strong></summary>Use a named constant at the top of the test file: const TEST_PR_NUMBER = 42; to make the test intent clearer.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/unit/sandbox-check-runner.test.ts:36</code></td>
      <td><div>Test uses magic numbers 10, 5, 15 for file additions/deletions/changes without clear semantic meaning. These should be named constants or derived from test data.        <details><summary>💡 <strong>Suggestion</strong></summary>Define constants like TEST_ADDITIONS = 10, TEST_DELETIONS = 5, TEST_CHANGES = 15 to make test assertions more readable.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/integration/sandbox-docker-integration.test.ts:95</code></td>
      <td><div>Test uses magic timeout value 60000 without explanation. This should be a named constant like DEFAULT_TEST_TIMEOUT.        <details><summary>💡 <strong>Suggestion</strong></summary>Define const DEFAULT_TEST_TIMEOUT = 60000; at the top of the test file to make timeout values consistent and meaningful.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/integration/sandbox-docker-integration.test.ts:119</code></td>
      <td><div>Test uses magic timeout value 30000 without explanation. This should be a named constant for consistency.        <details><summary>💡 <strong>Suggestion</strong></summary>Use a named constant for timeout values throughout the test file to improve maintainability.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/unit/sandbox-cache-volume.test.ts:28</code></td>
      <td><div>Test uses magic string &#39;main&#39; for git branch without semantic meaning. Should use a named constant like TEST_BRANCH.        <details><summary>💡 <strong>Suggestion</strong></summary>Define const TEST_BRANCH = &#39;main&#39;; to make test intent clearer and avoid hardcoded strings.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟢 Info</td>
      <td><code>tests/unit/sandbox-env-filter.test.ts:12</code></td>
      <td><div>Test uses hardcoded environment variable names and values throughout. Should use named constants for test data.        <details><summary>💡 <strong>Suggestion</strong></summary>Define test fixtures for common environment variables at the top of the test file to improve readability and maintainability.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/cache-volume-manager.ts:156</code></td>
      <td><div>The touchVolume method is documented as a no-op but is called in the critical path of resolveVolumes. This means last-used tracking is not actually implemented, making TTL-based eviction less accurate.        <details><summary>💡 <strong>Suggestion</strong></summary>Either implement proper last-used tracking using a separate metadata file or remove the touchVolume call and document that eviction relies solely on volume creation time.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/cache-volume-manager.ts:189</code></td>
      <td><div>Docker volume listing failures are silently caught, skipping all eviction. This can lead to unbounded growth of cache volumes if Docker commands fail intermittently.        <details><summary>💡 <strong>Suggestion</strong></summary>Log the error with severity and consider implementing a retry mechanism or exponential backoff for transient Docker failures.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/check-runner.ts:119</code></td>
      <td><div>Trace file creation failures are silently caught and result in disabling trace relay. This makes debugging difficult when trace files fail to create due to permissions or disk space issues.        <details><summary>💡 <strong>Suggestion</strong></summary>Log a warning when trace file creation fails so users understand why telemetry is missing from sandboxed checks.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/check-runner.ts:145</code></td>
      <td><div>Child trace ingestion failures are silently caught in nested try-catch blocks. This can mask issues with trace file parsing or OTel integration.        <details><summary>💡 <strong>Suggestion</strong></summary>Log debug or warning messages when trace ingestion fails, including the error message, to aid in troubleshooting telemetry issues.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/check-runner.ts:167</code></td>
      <td><div>When sandbox execution fails with no JSON output, only the first 500 characters of stderr are included in the error message. For verbose command output, this may truncate critical diagnostic information.        <details><summary>💡 <strong>Suggestion</strong></summary>Increase the truncation limit or include a note indicating that stderr was truncated, so users know to check logs for the full output.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/docker-image-sandbox.ts:88</code></td>
      <td><div>When building from inline Dockerfile, the temp file is deleted in a finally block, but if mkdtempSync succeeds and writeFileSync fails, the temp directory is never cleaned up.        <details><summary>💡 <strong>Suggestion</strong></summary>Wrap the entire temp directory lifecycle in a try-finally block to ensure cleanup even if intermediate steps fail.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/sandbox-manager.ts:165</code></td>
      <td><div>Cache eviction failures during stopAll are silently caught. While non-fatal, repeated failures can lead to disk space exhaustion from accumulated cache volumes.        <details><summary>💡 <strong>Suggestion</strong></summary>Log a warning when cache eviction fails, including the error message, so operators are aware of growing disk usage.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/env-filter.ts:8</code></td>
      <td><div>BUILTIN_PASSTHROUGH includes PATH, which the PR description notes can break sh on systems where /bin is not in PATH (e.g., Arch Linux). This is a known limitation.        <details><summary>💡 <strong>Suggestion</strong></summary>Consider making PATH passthrough configurable or detecting whether /bin is in PATH before forwarding it. Alternatively, document this limitation prominently.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/cli-main.ts:42</code></td>
      <td><div>When JSON parsing fails in runCheckMode, the error message is &#39;Invalid JSON payload&#39; which doesn&#39;t help users understand what was wrong with the payload.        <details><summary>💡 <strong>Suggestion</strong></summary>Include the parsing error message and a preview of the invalid JSON to help users diagnose payload formatting issues.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/cli-main.ts:48</code></td>
      <td><div>When required fields are missing from the payload, the error message doesn&#39;t specify which fields are missing or what the expected structure is.        <details><summary>💡 <strong>Suggestion</strong></summary>Include the specific missing field names and a reference to the expected payload structure in the error message.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/config.ts:897</code></td>
      <td><div>The cache path validation only checks if paths start with &#39;/&#39;, but doesn&#39;t validate that paths are well-formed (e.g., no &#39;..&#39; segments, no null bytes) which could lead to security issues or unexpected behavior.        <details><summary>💡 <strong>Suggestion</strong></summary>Add additional validation to reject paths containing &#39;..&#39;, null bytes, or other suspicious patterns, and normalize paths before validation.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>tests/integration/sandbox-docker-integration.test.ts:1</code></td>
      <td><div>Integration tests lack negative test cases for error scenarios: what happens when Docker daemon is not running, when image pull fails, when container crashes during exec, or when disk space is exhausted?        <details><summary>💡 <strong>Suggestion</strong></summary>Add tests that verify error handling paths: simulate Docker failures, invalid images, resource exhaustion scenarios, and ensure proper error messages are returned.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>tests/unit/sandbox-check-runner.test.ts:1</code></td>
      <td><div>Unit tests for CheckRunner don&#39;t test edge cases: what happens when stdout contains multiple JSON objects, when JSON is malformed but starts with &#39;{&#39;, when timeout is exceeded, or when dependencyOutputs contain circular references?        <details><summary>💡 <strong>Suggestion</strong></summary>Add tests for malformed JSON output, timeout scenarios, circular reference handling in dependencyOutputs, and stdout with mixed JSON/non-JSON content.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>tests/unit/sandbox-cache-volume.test.ts:1</code></td>
      <td><div>Cache volume manager tests don&#39;t cover failure scenarios: what happens when Docker commands fail intermittently, when volume creation races with concurrent processes, or when TTL parsing fails?        <details><summary>💡 <strong>Suggestion</strong></summary>Add tests for Docker command failures, concurrent volume creation, invalid TTL formats, and eviction logic with edge cases (empty volumes, very old volumes, etc.).
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>tests/unit/sandbox-env-filter.test.ts:1</code></td>
      <td><div>Env filter tests don&#39;t cover edge cases: empty patterns, patterns with only wildcards, special characters in env var names, or very long env var values.        <details><summary>💡 <strong>Suggestion</strong></summary>Add tests for edge cases like pattern &#39;*&#39;, patterns with special regex characters, env vars with newlines or null bytes, and extremely long values.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/check-runner.ts:1</code></td>
      <td><div>CheckRunner.runCheck doesn&#39;t validate that prInfo.files array doesn&#39;t contain circular references before JSON.stringify, which could cause infinite loops or stack overflow during serialization.        <details><summary>💡 <strong>Suggestion</strong></summary>Add validation to detect circular references in prInfo before serialization, or use a serialization library that handles circular references gracefully.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/docker-image-sandbox.ts:1</code></td>
      <td><div>DockerImageSandbox doesn&#39;t implement cleanup of temporary build directories if the process is killed abruptly (SIGKILL). Temp directories may accumulate in /tmp.        <details><summary>💡 <strong>Suggestion</strong></summary>Register a process signal handler (SIGINT, SIGTERM) to clean up temp directories, or use a temp directory that respects the system&#39;s temp directory cleanup policies.
</details>
</div></td>
    </tr>
    <tr>
      <td>🟡 Warning</td>
      <td><code>src/sandbox/sandbox-manager.ts:1</code></td>
      <td><div>SandboxManager.getOrStart doesn&#39;t handle concurrent requests to start the same sandbox. Multiple parallel checks could race to start the same container, leading to multiple instances or errors.        <details><summary>💡 <strong>Suggestion</strong></summary>Implement a promise-based locking mechanism or use a Map of pending start promises to ensure only one start operation proceeds per sandbox name.
</details>
</div></td>
    </tr>
  </tbody>
</table>
<!-- visor:section-end id="quality" -->

<!-- visor:thread-end key="probelabs/visor#337@d91bbd1" -->

---

*Powered by [Visor](https://probelabs.com/visor) from [Probelabs](https://probelabs.com)*

*Last updated: 2026-02-07T07:19:37.847Z | Triggered by: pr_updated | Commit: d91bbd1*

💡 **TIP:** You can chat with Visor using `/visor ask <your question>`
<!-- /visor-comment-id:visor-thread-review-probelabs/visor#337 -->

…tion

- reviewer.test.ts: fix AIReviewService mock message to include "security"
- action-cli-bridge.test.ts: remove tests for deleted createTempConfigFromInputs method, update check filtering assertion
- github-workflow.test.ts: fix group key from 'default' to 'security-review', update content assertion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@buger buger changed the base branch from feat/failure-routing to main February 7, 2026 06:47
buger and others added 2 commits February 7, 2026 06:59
The bundled/ directory was accidentally committed and is gitignored.
It should not be part of this PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert dist/ files to match origin/main — build artifacts
should not diverge in feature branches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant