PentAGI Security Research — Case Study #1
This interactive report consolidates the results of a three-phase security research engagement against PentAGI, an open-source autonomous AI pentesting agent (Go, 1001 files, 9 Docker configs). The research spans static source analysis, sandbox behavioral testing, and a fully instrumented end-to-end execution with a custom Mock LLM and AI Security Gateway, all conducted in an isolated, air-gapped environment. Dates: April 19–21, 2026.
1. Executive Summary & Metrics
Quantitative overview of all three research phases. Overall security posture: DANGEROUS BY DESIGN. PentAGI requires host-level Docker socket access to operate, so isolation is a prerequisite, not an option.
Risk Score
Weighted security posture score for PentAGI based on static analysis, behavioral observation, and architectural risk.
Static Code Patterns
Frequency of security-sensitive code patterns identified across 1001 source files (518 Go).
Phase 2.2 Gateway Detections
Threat categories detected by the AI Security Gateway across 274 intercepted agent-to-LLM requests.
2. Research Phase Timeline
Three-phase progression from static analysis through iterative sandbox testing to successful end-to-end instrumented execution.
Phase 1: Static Analysis
April 19, 2026
Source code audit of the full repository. Tools: custom auditor script, trufflehog, grep pattern analysis.
Phase 2.0: Basic Sandbox
April 20, 2026
Isolated Docker sandbox: internal: true, no docker.sock mount, read-only filesystem, fake API keys.
Phase 2.1: Docker-in-Docker + Ollama
April 20, 2026
Added a DinD container and an Ollama local LLM backend. Tested with the gemma3:4b and qwen3:1.7b models on CPU.
Outcome: the hardcoded 10-minute LLM timeout is non-configurable, which makes CPU inference incompatible. A cloud API or GPU is required.
Phase 2.2: Mock LLM + AI Gateway
April 21, 2026
Custom Mock LLM server plus AI Security Gateway proxy. Full end-to-end execution achieved. 274 requests intercepted, 202 of them containing threat patterns.
3. Static Analysis — Critical Findings
Source code inspection identified four critical or high-severity findings. The dynamic phases confirmed all four, with zero false positives.
4. Phase 2.2 Sandbox Architecture
Fully instrumented, air-gapped sandbox. All agent-to-LLM traffic is routed through the AI Security Gateway for real-time threat detection.
[Architecture diagram not recovered. Surviving node labels:]
- (Generator / Refiner / Primary)
- :11435 (detect mode)
- :11436 (3 modes)
- pentagi-terminal-10 (debian:latest): nmap · /etc/passwd · curl · env
- (inside DinD sandbox): HTTP target for agent recon
- pgvector (sandbox)
- Static: read-only FS
- OpenAI / Anthropic
- 274 reqs · 202 threats
5. Phase 2.2 — Gateway Security Findings
Threat intelligence captured by the AI Security Gateway across all 274 intercepted agent-to-LLM requests during the successful end-to-end execution run.
PentAGI's agent system prompts contain extensive instructions for data collection and exfiltration as part of its legitimate pentest workflow. The Gateway correctly flags all of these.
PentAGI includes bypass security/filter/restriction patterns in its own system prompts. This is a legitimate design choice that enables an LLM to perform offensive actions, but the Gateway's 12-pattern ruleset flags it as prompt injection.
It also shows that, without a Gateway, a compromised or malicious LLM could receive these bypass instructions and act on them by executing arbitrary commands in the DinD container and exfiltrating data.
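The kind of pattern matching described above can be illustrated with a minimal sketch. The two rules below are hypothetical stand-ins: the report does not reproduce the Gateway's actual 12-pattern ruleset, and the names `RULES` and `scan_request` are assumptions made for this example only.

```python
import re

# Hypothetical rules for illustration; NOT the Gateway's real 12-pattern ruleset.
RULES = {
    # "bypass"/"ignore"/"disable" followed within 40 characters by a control word
    "prompt_injection": re.compile(
        r"\b(bypass|ignore|disable)\b.{0,40}\b(security|filter|restriction)s?\b",
        re.IGNORECASE | re.DOTALL,
    ),
    # References to exfiltration or to sensitive files named in this report
    "data_exfiltration": re.compile(
        r"exfiltrat\w*|/etc/passwd|id_ed25519", re.IGNORECASE
    ),
}


def scan_request(body: str) -> list[str]:
    """Return the sorted names of all rule categories that match the request body."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(body))
```

A request is counted as containing a threat pattern when `scan_request` returns a non-empty list. Rules this broad will also match benign pentest instructions, which is consistent with the report's observation that PentAGI's legitimate prompts are flagged.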
Successful Execution Trace (Flow #10)
| Step | Agent | Tool Called | Result |
|---|---|---|---|
| 1 | tool_call_id_detector | get_number ×5 | Template call_{r:24:h} detected |
| 2 | docker_image_selector | (text response) | debian:latest selected |
| 3 | generator | subtask_list (barrier) | 4 subtasks created |
| 4 | primary_agent (S1) | terminal ×4 + done | nmap, /etc/passwd, web enum, env |
| 5 | refiner (S1) | subtask_patch (barrier) | No changes |
| 6–8 | primary_agent (S2–S4) | terminal + done | Remaining subtask cycles (16+ commands total) |
| 9 | refiner (final) | subtask_patch | planned_count=0, task_complete=true |
6. Tools Developed During Research
Two purpose-built tools were created to enable Phase 2.2. Both are reusable for future AI agent security research.
The Mock LLM is a lightweight server that responds instantly (0 latency), eliminating CPU inference timeouts. It implements the Ollama API, including function/tool calling with unique IDs.
- Tool call IDs use the format call_{24hex}.
- subtask_list terminates the generator, done terminates the executor, and the refiner is detected via agent context.
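The ID format and the termination rules above can be sketched as follows. This is an illustration under stated assumptions, not the Mock LLM's actual code: the role names in `TERMINATING_TOOL`, the function names, and the simplified dictionary shape are all hypothetical, and the dictionary is not claimed to match the exact Ollama wire format.

```python
import secrets

# Assumed mapping from agent role to the tool call that ends its loop,
# based on the description above; the role names are illustrative.
TERMINATING_TOOL = {"generator": "subtask_list", "executor": "done"}


def new_tool_call_id() -> str:
    """Return an ID of the form call_ followed by 24 lowercase hex characters."""
    return "call_" + secrets.token_hex(12)  # 12 random bytes -> 24 hex chars


def terminating_tool_call(role: str, arguments: dict) -> dict:
    """Build one simplified tool call that terminates the given role's loop."""
    return {
        "id": new_tool_call_id(),
        "function": {"name": TERMINATING_TOOL[role], "arguments": arguments},
    }
```

Generating a fresh ID per call keeps the IDs unique in practice, which the report lists as a requirement of the tool-calling implementation.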
The AI Security Gateway is a transparent HTTP proxy placed between AI agents and their LLM backend. It intercepts all requests and responses for real-time threat detection and logging, with no agent modification required.
7. Consolidated Key Findings
Cross-phase findings with MITRE ATT&CK mapping and deployment guidance.
Docker Socket Mount is Non-Optional
PentAGI requires /var/run/docker.sock to operate. Any installation grants the agent (and any LLM it contacts) full host-level Docker control: container creation/destruction, volume access, and privilege escalation to root. Confirmed dynamically in Phase 2.0.
Container Runs as root:root
The main PentAGI service container is explicitly configured with user: root:root. Combined with docker.sock access, this provides maximal host privilege from the moment the container starts.
NET_ADMIN Capability Available
The DOCKER_NET_ADMIN flag (default: false) can enable full host network manipulation: ARP spoofing, packet sniffing, routing changes. If activated by a compromised LLM or by misconfiguration, it provides a network-wide attack surface.
1144 Docker API Calls + 46 exec.Command
The codebase makes 1144 Docker API calls and 46 direct exec.Command invocations, giving it extensive programmatic control over the Docker daemon. Any prompt injection reaching the LLM could weaponize these code paths.
Hardcoded 10-Minute LLM Timeout
The non-configurable 10-minute timeout makes PentAGI incompatible with local CPU-based LLMs. It effectively forces a cloud API dependency (OpenAI/Anthropic), breaking air-gap security for any isolated deployment.
Hardening Controls Effective
The read-only filesystem blocked SSH key generation at /root/.ollama/id_ed25519. DinD evasion detected (docker_inside=false). The AI Gateway intercepted 100% of agent traffic with zero false negatives. No hardcoded secrets were found in the source.
8. Deployment Guidance & Remediation Roadmap
Structured guidance for organizations evaluating PentAGI, ordered by criticality and implementation timeline.
Mandatory Prerequisites
Before ANY deployment
- [M1] Deploy ONLY in a dedicated, sacrificial VM with no production data or services.
- [M2] NEVER use real customer API keys or credentials inside the PentAGI environment.
- [M3] Deploy an AI Security Gateway (or equivalent proxy) on all agent-to-LLM traffic.
- [M4] Enable network egress logging and alerting for unexpected external connections.
Architecture Hardening
Short-term (30 days)
- [A1] Consider a Docker socket proxy (e.g., Tecnativa/docker-socket-proxy) to restrict the API surface to required operations only.
- [A2] Run the container as a non-root user where feasible, and submit an upstream patch to the project.
- [A3] Keep DOCKER_NET_ADMIN=false (default). Document this explicitly in ops runbooks.
- [A4] Implement time-boxed sessions with automatic container teardown after each engagement.
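The checks implied by [A2] and [A3], together with the docker.sock finding, could be automated with a small pre-deployment audit. The sketch below is an assumption-laden illustration: `audit_service` is a hypothetical helper, it expects a Compose-style service definition already parsed into a Python dict with `environment` given as a mapping, and it treats a missing `user` key as root.

```python
def audit_service(service: dict) -> list[str]:
    """Return warnings for a Compose-style service definition given as a dict."""
    warnings = []

    # Assumption: a service with no explicit user is treated as running as root.
    user = str(service.get("user", "root"))
    if user.split(":")[0] in ("root", "0"):
        warnings.append("runs as root")

    # Any volume entry that references the host Docker socket is flagged.
    if any("/var/run/docker.sock" in str(v) for v in service.get("volumes", [])):
        warnings.append("mounts docker.sock")

    # Assumption: environment is a mapping, not the list form Compose also allows.
    env = service.get("environment", {})
    if str(env.get("DOCKER_NET_ADMIN", "false")).lower() == "true":
        warnings.append("DOCKER_NET_ADMIN enabled")

    return warnings
```

For the configuration described in this report (user: root:root with the socket mounted and the flag left at its default), the function would report the first two warnings and not the third.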
LLM Backend Hardening
Medium-term (60 days)
- [L1] Extend the AI Gateway ruleset with target-specific sensitive data patterns (hostnames, internal CIDRs, project names).
- [L2] Switch the Gateway to enforce mode once the baseline false-positive rate is acceptable.
- [L3] Integrate Gateway logs with a SIEM for cross-session behavioral analysis.
- [L4] Upstream: request a configurable LLM timeout to enable local/GPU model support.
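The difference between the detect and enforce modes mentioned in [L2] can be sketched as a single decision function. The semantics here are an assumption, since the report does not define them precisely: detect is taken to mean "log and forward", and enforce to mean "log and block when any threat is found". The name `gateway_decision` and the return shape are hypothetical.

```python
def gateway_decision(mode: str, threats: list[str]) -> dict:
    """Decide whether to forward a request, given the mode and detected threats."""
    if mode not in ("detect", "enforce"):
        raise ValueError(f"unknown mode: {mode}")
    # Assumed semantics: only enforce mode blocks, and only when threats exist.
    blocked = mode == "enforce" and bool(threats)
    return {"forward": not blocked, "logged_threats": list(threats)}
```

Under these assumptions, a request flagged in detect mode still reaches the LLM, which is why the false-positive baseline matters before switching: in enforce mode the same flag would stop the request.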
Ongoing Research & Monitoring
Continuous
- [R1] Repeat dynamic analysis with a GPU-accelerated local LLM to observe full agent behavior without cloud dependency.
- [R2] Test the Mock LLM in malicious mode to measure Gateway enforcement efficacy against adversarial inputs.
- [R3] Monitor upstream PentAGI for new versions addressing the docker.sock dependency.
- [R4] Publish the Mock LLM and AI Gateway as standalone open-source tools for the AI agent security community.
9. Research Verdict
BY DESIGN
PentAGI is not malware, but its architecture mandates host-level Docker control. Any installation in a shared environment creates a container escape path available to the LLM backend.
RECOMMENDED
Without a dedicated, isolated VM with no adjacent sensitive workloads, real credentials, or production infrastructure, the risk is unacceptable.
CONTROLS
With DinD isolation, read-only filesystem, fake API keys, network air-gap, and AI Security Gateway, PentAGI is safe for controlled security research and case studies.
Three Critical Design Limitations
- Docker socket mount: grants root access to the host Docker daemon. Non-negotiable for PentAGI to function.
- 10-minute LLM timeout: hardcoded and non-configurable. Incompatible with local CPU inference, so it forces a cloud API dependency.
- If Docker or the LLM is unavailable, PentAGI halts completely, with no graceful fallback.