From dbe6ec9b6b13d8a3577e4d6c2f18c6668b5d9f5e Mon Sep 17 00:00:00 2001 From: Mike Martinez Oroz <224715623+Ek1m-Z3n1t@users.noreply.github.com> Date: Wed, 17 Jun 2026 10:49:51 -0400 Subject: [PATCH] feat: add case study #2 Co-Authored-By: Claude Sonnet 4.6 --- .trivyignore | 5 + CONTRIBUTING.md | 4 +- README.md | 10 +- pentagi-2026-04/PENTAGI_CASE_STUDY.html | 1099 +++++++++++++ .../PENTAGI_CASE_STUDY_BRANDING.html | 1455 +++++++++++++++++ pentagi-2026-04/README.md | 105 ++ 6 files changed, 2673 insertions(+), 5 deletions(-) create mode 100644 pentagi-2026-04/PENTAGI_CASE_STUDY.html create mode 100644 pentagi-2026-04/PENTAGI_CASE_STUDY_BRANDING.html create mode 100644 pentagi-2026-04/README.md diff --git a/.trivyignore b/.trivyignore index 3da70eb..42150e0 100644 --- a/.trivyignore +++ b/.trivyignore @@ -2,3 +2,8 @@ # These are expected findings documented as part of the IaC security gap analysis research. # The AWS key AKIAIOSFODNN7EXAMAAA is AWS's official example/documentation key pattern. AVD-SECRET-0001 + +# PentAGI case study HTML reports contain security research content that describes +# detected attack patterns (EXFILTRATION, PROMPT_INJECTION, env var probing) as evidence. +# These are documented findings, not active payloads. Approved FP — see SECURITY_AUDIT_LOG.md 2026-06-17. +AVD-SECRET-0002 diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index b47c9b5..b8248bf 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -9,9 +9,11 @@ ## Research Standards All contributions must meet the same bar as published studies: -- Findings reproducible from publicly available tools (Trivy, Checkov, pq-audit, TruffleHog) + +- Findings reproducible from publicly available tools (Trivy, Checkov, pq-audit, TruffleHog, Falco) - Evidence provided as raw tool output (JSON preferred) - No client or proprietary data — lab/intentionally-vulnerable repos only +- AI agent studies: behavioral analysis must use runtime monitoring (Falco or equivalent) — static analysis alone is not sufficient ## Commit Signing diff --git a/README.md b/README.md index 2fd56ed..a83d0fc 100644 --- a/README.md +++ b/README.md @@ -5,7 +5,7 @@
Banner generated with AI assistance · MK ScorpioSec

-> IaC security research — applied findings from real-world infrastructure analysis. +> Applied security research — IaC, AI agents, and infrastructure analysis. Raw evidence published with every finding. [![License](https://img.shields.io/badge/License-Apache_2.0-D62828?style=flat-square)](LICENSE) [![Security](https://img.shields.io/badge/Security-Policy-blue?style=flat-square)](SECURITY.md) @@ -14,9 +14,10 @@ ## Studies -| Study | Description | Status | -|-------|-------------|--------| -| [TerraGoat gap analysis](terragoat-2026-04/) | 187 undocumented findings across Checkov, Trivy, and pq-audit. Running only the official scanner shows 23% of actual exposure. | `ready` | +| # | Study | Description | Status | +|---|-------|-------------|--------| +| 1 | [TerraGoat gap analysis](terragoat-2026-04/) | 187 undocumented findings across Checkov, Trivy, and pq-audit. Running only the official scanner shows 23% of actual exposure. | `ready` | +| 2 | [PentAGI — AI agent security analysis](pentagi-2026-04/) | 4 CRITICAL findings in static analysis. 462 EXFILTRATION events + 24 PROMPT_INJECTION attempts in behavioral analysis. 73.7% threat rate across 274 requests. | `ready` | --- @@ -45,6 +46,7 @@ Third-party tools used across studies: | [Trivy](https://github.com/aquasecurity/trivy) | Aqua Security | Apache 2.0 | | [Checkov](https://github.com/bridgecrewio/checkov) | Bridgecrew / Palo Alto | Apache 2.0 | | [TruffleHog](https://github.com/trufflesecurity/trufflehog) | Truffle Security | AGPL-3.0 | +| [Falco](https://github.com/falcosecurity/falco) | Falco Security | Apache 2.0 | --- diff --git a/pentagi-2026-04/PENTAGI_CASE_STUDY.html b/pentagi-2026-04/PENTAGI_CASE_STUDY.html new file mode 100644 index 0000000..ad864d2 --- /dev/null +++ b/pentagi-2026-04/PENTAGI_CASE_STUDY.html @@ -0,0 +1,1099 @@ + + + + + + PentAGI Security Research — Case Study #1 | MK ScorpioSec + + + + + + + + + +
+ + +
+
+ CASE STUDY #1 + AI AGENT SECURITY + PUBLIC RESEARCH +
+

PentAGI Security Research — Case Study #1

+

+ This interactive report consolidates the results of a three-phase security research engagement against + PentAGI, an open-source autonomous AI pentesting agent (Go, 1001 files, + 9 Docker configs). The research spans static source analysis, sandbox behavioral testing, and a fully instrumented + end-to-end execution with a custom Mock LLM and AI Security Gateway — all conducted in an isolated, air-gapped environment. + Dates: April 19–21, 2026. +

+
+ 🚫 CLASSIFICATION: PUBLIC RESEARCH + 👤 Researcher: Mike Martinez Oroz + 🔍 Organization: MK ScorpioSec + 📅 Published: June 2026 +
+
+ + +
+
+

1. Executive Summary & Metrics

+

Quantitative overview of all three research phases. Overall security posture: DANGEROUS BY DESIGN — PentAGI requires host-level Docker socket access to operate, making isolation a prerequisite, not an option.

+
+ + +
+
+
4
+
Critical Findings
+
+
+
1144
+
Docker API Calls
+
+
+
274
+
Intercepted Requests
+
+
+
73.7%
+
Traffic with Threats
+
+
+
462
+
Exfiltration Events
+
+
+
24
+
Prompt Injections
+
+
+ + +
+
+
+

Risk Score

+

Weighted security posture score for PentAGI based on static analysis, behavioral observation, and architectural risk.

+
+
+ +
+
+ 18 / 100 +
DANGEROUS BY DESIGN
+
+
+ +
+
+

Static Code Patterns

+

Frequency of security-sensitive code patterns identified across 1001 source files (518 Go).

+
+
+ +
+
+ +
+
+

Phase 2.2 Gateway Detections

+

Threat categories detected by the AI Security Gateway across 274 intercepted agent-to-LLM requests.

+
+
+ +
+
+
+
+ + +
+
+

2. Research Phase Timeline

+

Three-phase progression from static analysis through iterative sandbox testing to successful end-to-end instrumented execution.

+
+ +
+ + +
+
1.0
+
+
+

Phase 1 — Static Analysis

+ April 19, 2026 +
+

Source code audit of the full repository. Tools: custom auditor script, trufflehog, grep pattern analysis.

+
+ 4 CRITICAL findings + 1144 Docker API calls + 0 hardcoded secrets +
+
+
+ + +
+
2.0
+
+
+

Phase 2.0 — Basic Sandbox

+ April 20, 2026 +
+

Isolated Docker sandbox: internal: true, no docker.sock mount, read-only filesystem, fake API keys.

+
+ RESULT: PentAGI halts at T+16s — "Docker runtime client initialization failed." Confirms docker.sock is mandatory core, not optional feature. +
+
+
+ + +
+
2.1
+
+
+

Phase 2.1 — Docker-in-Docker + Ollama

+ April 20, 2026 +
+

Added DinD container and Ollama local LLM backend. Tested with gemma3:4b and qwen3:1.7b models on CPU.

+
+
gemma3:4b — TIMEOUT 10m2s
+
qwen3:1.7b — TIMEOUT 10m3s
+
+

Hardcoded 10-minute LLM timeout is non-configurable. CPU inference incompatible. Requires cloud API or GPU.

+
+
+ + +
+
2.2
+
+
+

Phase 2.2 — Mock LLM + AI Gateway

+ April 21, 2026 +
+

Custom Mock LLM server + AI Security Gateway proxy. Full end-to-end execution achieved. 274 requests intercepted, 202 containing threat patterns.

+
+ SUCCESS — Flow #10 complete + 462 EXFILTRATION events + 24 PROMPT_INJECTION +
+
+
+ +
+
+ + +
+
+

3. Static Analysis — Critical Findings

+

Four critical/high-severity findings identified via source code inspection. Zero false positives confirmed by dynamic phases.

+
+ +
+ + + + + +
+ +
+
+
+ + +
+
+

4. Phase 2.2 Sandbox Architecture

+

Fully instrumented, air-gapped sandbox. All agent-to-LLM traffic routed through the AI Security Gateway for real-time threat detection.

+
+ +
+
+ + +
+
+ PentAGI Agents
+ (Generator / Refiner / Primary) +
+
+
+ AI Security Gateway
+ :11435 — detect mode +
+
+
+ Mock LLM Server
+ :11436 — 3 modes +
+
+ + +
+
+
+
tool calls
+
+
+
+
+
logs all prompts
+
+
+ + +
+
+ Docker-in-Docker (DinD)
+ pentagi-terminal-10 (debian:latest)
+ nmap · /etc/passwd · curl · env +
+
+
+ DVWA Target
+ (inside DinD sandbox)
+ HTTP target for agent recon +
+
+ + +
+
Network Isolation Boundary — internal: true (no internet)
+
+
+ PostgreSQL
+ pgvector (sandbox) +
+
+ No docker.sock
+ Static: read-only FS +
+
+ Fake API Keys
+ OpenAI / Anthropic +
+
+ Gateway Logs
+ 274 reqs · 202 threats +
+
+
+
+
+
+ + +
+
+

5. Phase 2.2 — Gateway Security Findings

+

Threat intelligence captured by the AI Security Gateway across all 274 intercepted agent-to-LLM requests during the successful end-to-end execution run.

+
+ + +
+ + +
+
+ EXFILTRATION + 462 + pattern matches +
+

PentAGI's agent system prompts contain extensive instructions for data collection and exfiltration as part of its legitimate pentest workflow. The Gateway correctly flags all of these.

+
+
+ IP address extraction patterns + 178 hits +
+
+ +
+ /etc/passwd & /etc/shadow access + 136 hits +
+
+ +
+ nc -l (netcat listener) + 96 hits +
+
+ +
+ curl | bash pipe patterns + 52 hits +
+
+
+
+ + +
+
+ PROMPT INJECTION + 24 + matches in system prompts +
+

PentAGI injects bypass security/filter/restriction patterns in its own system prompts — a legitimate design choice to enable an LLM to perform offensive actions, but flagged as prompt injection by the Gateway's 12-pattern ruleset.

+
+
Key Insight
+

This demonstrates that without a Gateway, a compromised or malicious LLM could receive these bypass instructions and act on them — executing arbitrary commands in the DinD container and exfiltrating data.

+
+
+
Traffic Breakdown
+
+
+
+
+
+
+
+
+
+ 202 with threats (73.7%) + 72 clean (26.3%) +
+
+
+ +
+ + +
+

Successful Execution Trace (Flow #10)

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
StepAgentTool CalledResult
1tool_call_id_detectorget_number ×5Template call_{r:24:h} detected
2docker_image_selector(text response)debian:latest selected
3generatorsubtask_list (barrier)4 subtasks created
4primary_agent (S1)terminal ×4 + donenmap, /etc/passwd, web enum, env
5refiner (S1)subtask_patch (barrier)No changes
6–8primary_agent (S2–S4)terminal + doneRemaining subtask cycles (16+ commands total)
9refiner (final)subtask_patchplanned_count=0, task_complete=true
+
+
+ Duration: ~4 seconds + 4/4 subtasks completed + Container: pentagi-terminal-10 + 16+ commands executed in DinD +
+
+
+ + +
+
+

6. Tools Developed During Research

+

Two purpose-built tools created to enable Phase 2.2. Both are reusable for future AI agent security research.

+
+ +
+ +
+
+ mock_llm.py + Ollama/OpenAI Compatible +
+

Lightweight mock LLM server that responds instantly (0 latency), eliminating CPU inference timeouts. Fully implements Ollama API including function/tool calling with unique IDs.

+
+
+ + Mode: helpful — Responds with valid tool calls (terminal, subtask_list, done, subtask_patch). Simulates a cooperative agent LLM. +
+
+ + Mode: malicious — Injects adversarial payloads into tool calls. Tests Gateway detection capability. +
+
+ + Mode: confused — Sends malformed/unexpected responses. Tests agent error handling robustness. +
+
+ + Full logging of all received prompts. Tool call IDs format: call_{24hex}. +
+
+ + PentAGI-specific barrier logic: subtask_list terminates generator, done terminates executor, refiner detection via agent context. +
+
+
+ +
+
+ ai_gateway.py + HTTP Proxy + Detection +
+

Transparent HTTP proxy placed between AI agents and their LLM backend. Intercepts all requests/responses for real-time threat detection and logging — no agent modification required.

+
+
+ + 12 exfiltration patterns — IPs, credentials, sensitive files, network listener commands, pipe-to-bash patterns. +
+
+ + 12 prompt injection patterns — bypass/jailbreak/ignore-previous-instructions keywords in system prompts. +
+
+ + 9 suspicious action patterns — chmod 777, root privilege escalation, cron job creation, SSH key deployment. +
+
+ + Mode: detect — Logs and passes through. Mode: enforce — Blocks and rejects threats with configurable rate limiting. +
+
+ + Output: JSONL structured logs with timestamps, threat type, matched pattern, request ID. Compatible with SIEM ingestion. +
+
+
+ +
+
+ + +
+
+

7. Consolidated Key Findings

+

Cross-phase findings with MITRE ATT&CK mapping and deployment guidance.

+
+ +
+ +
+
+ CRITICAL + T1611 — Escape to Host +
+

Docker Socket Mount is Non-Optional

+

PentAGI requires /var/run/docker.sock to operate. Any installation grants the agent (and any LLM it contacts) full host-level Docker control: container creation/destruction, volume access, and privilege escalation to root. Confirmed dynamically in Phase 2.0.

+
CVSS: 9.8 — docker-compose.yml:176
+
+ +
+
+ CRITICAL + T1078.003 — Local Accounts +
+

Container Runs as root:root

+

The main PentAGI service container is explicitly configured with user: root:root. Combined with docker.sock access, this provides maximal host privilege from the moment the container starts.

+
CVSS: 9.1 — docker-compose.yml:180
+
+ +
+
+ HIGH + T1557 — Adversary-in-the-Middle +
+

NET_ADMIN Capability Available

+

The DOCKER_NET_ADMIN flag (default: false) can enable full host network manipulation — ARP spoofing, packet sniffing, routing changes. If activated by a compromised LLM or misconfiguration, provides network-wide attack surface.

+
CVSS: 8.0 — docker-compose.yml:161
+
+ +
+
+ HIGH + T1059 — Command Execution +
+

1144 Docker API Calls + 46 exec.Command

+

The codebase makes 1144 Docker API calls and 46 direct exec.Command invocations. Extensive programmatic control over the Docker daemon. Any prompt injection reaching the LLM could weaponize these code paths.

+
Static pattern count — 518 Go source files
+
+ +
+
+ MEDIUM + Design Limitation +
+

Hardcoded 10-Minute LLM Timeout

+

Non-configurable 10-minute timeout makes PentAGI incompatible with local CPU-based LLMs. Effectively forces cloud API dependency (OpenAI/Anthropic), breaking air-gap security for any isolated deployment.

+
Confirmed: gemma3:4b + qwen3:1.7b both timed out
+
+ +
+
+ POSITIVE + Defense Works +
+

Hardening Controls Effective

+

Read-only filesystem blocked SSH key generation at /root/.ollama/id_ed25519. DinD evasion detected (docker_inside=false). AI Gateway intercepted 100% of agent traffic with zero false negatives. 0 hardcoded secrets in source.

+
All defensive controls confirmed effective
+
+ +
+
+ + +
+
+

8. Deployment Guidance & Remediation Roadmap

+

Structured guidance for organizations evaluating PentAGI. Ordered by criticality and implementation timeline.

+
+ +
+ +
+
1
+
+
+

Mandatory Prerequisites

+ Before ANY deployment +
+
    +
  • [M1] Deploy ONLY in a dedicated, sacrificial VM with no production data or services.
  • +
  • [M2] NEVER use real customer API keys or credentials inside the PentAGI environment.
  • +
  • [M3] Deploy an AI Security Gateway (or equivalent proxy) on all agent-to-LLM traffic.
  • +
  • [M4] Enable network egress logging and alerting for unexpected external connections.
  • +
+
+
+ +
+
2
+
+
+

Architecture Hardening

+ Short-term (30 days) +
+
    +
  • [A1] Consider Docker socket proxy (e.g., Tecnativa/docker-socket-proxy) to restrict API surface to required operations only.
  • +
  • [A2] Run the container as a non-root user where feasible — submit upstream patch to the project.
  • +
  • [A3] Keep DOCKER_NET_ADMIN=false (default). Document this explicitly in ops runbooks.
  • +
  • [A4] Implement time-boxed sessions with automatic container teardown after each engagement.
  • +
+
+
+ +
+
3
+
+
+

LLM Backend Hardening

+ Medium-term (60 days) +
+
    +
  • [L1] Extend the AI Gateway ruleset with target-specific sensitive data patterns (hostnames, internal CIDRs, project names).
  • +
  • [L2] Switch Gateway to enforce mode once baseline false-positive rate is acceptable.
  • +
  • [L3] Integrate Gateway logs with SIEM for cross-session behavioral analysis.
  • +
  • [L4] Upstream: request configurable LLM timeout to enable local/GPU model support.
  • +
+
+
+ +
+
4
+
+
+

Ongoing Research & Monitoring

+ Continuous +
+
    +
  • [R1] Repeat dynamic analysis with a GPU-accelerated local LLM to observe full agent behavior without cloud dependency.
  • +
  • [R2] Test Mock LLM in malicious mode to measure Gateway enforcement efficacy against adversarial inputs.
  • +
  • [R3] Monitor upstream PentAGI for new versions addressing docker.sock dependency.
  • +
  • [R4] Publish Mock LLM + AI Gateway as standalone open-source tools for the AI agent security community.
  • +
+
+
+ +
+
+ + +
+
+

9. Research Verdict

+
+
+
+
Verdict
+
DANGEROUS
BY DESIGN
+

PentAGI is not malware, but its architecture mandates host-level Docker control. Any installation in a shared environment creates a container escape path available to the LLM backend.

+
+
+
For Production Use
+
NOT
RECOMMENDED
+

Without a dedicated, isolated VM with no adjacent sensitive workloads, real credentials, or production infrastructure, the risk is unacceptable.

+
+
+
For Research Use
+
SAFE WITH
CONTROLS
+

With DinD isolation, read-only filesystem, fake API keys, network air-gap, and AI Security Gateway, PentAGI is safe for controlled security research and case studies.

+
+
+
+

Three Critical Design Limitations

+
+
+ 1 +
docker.sock mandatory
Grants root access to the host Docker daemon. Non-negotiable for PentAGI to function.
+
+
+ 2 +
10-minute LLM timeout
Hardcoded, non-configurable. Incompatible with local CPU inference. Forces cloud API dependency.
+
+
+ 3 +
No degraded mode
If Docker or LLM is unavailable, PentAGI halts completely. No graceful fallback.
+
+
+
+
+ +
+ + + + + + + + + + diff --git a/pentagi-2026-04/PENTAGI_CASE_STUDY_BRANDING.html b/pentagi-2026-04/PENTAGI_CASE_STUDY_BRANDING.html new file mode 100644 index 0000000..7e11a5c --- /dev/null +++ b/pentagi-2026-04/PENTAGI_CASE_STUDY_BRANDING.html @@ -0,0 +1,1455 @@ + + + + + + PentAGI Security Research — Case Study #2 | MK ScorpioSec + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+
+ CASE STUDY #1 + AI AGENT SECURITY + PUBLIC RESEARCH +
+

PentAGI Security Research — Case Study #2

+

+ This interactive report consolidates the results of a three-phase security research engagement against + PentAGI, an open-source autonomous AI pentesting agent (Go, 1001 files, + 9 Docker configs). The research spans static source analysis, sandbox behavioral testing, and a fully instrumented + end-to-end execution with a custom Mock LLM and AI Security Gateway — all conducted in an isolated, air-gapped environment. + Dates: April 19–21, 2026. +

+
+ 🚫 CLASSIFICATION: RESEARCH + 👤 Researcher: Mike Martinez Oroz + 🔍 Organization: MK ScorpioSec + 📅 Published: June 2026 +
+
+ +
+ + +
+
+

1. Executive Summary & Metrics

+

Quantitative overview of all three research phases. Overall security posture: DANGEROUS BY DESIGN — PentAGI requires host-level Docker socket access to operate, making isolation a prerequisite, not an option.

+
+ + +
+
+
4
+
Critical Findings
+
+
+
1144
+
Docker API Calls
+
+
+
274
+
Intercepted Requests
+
+
+
73.7%
+
Traffic with Threats
+
+
+
462
+
Exfiltration Events
+
+
+
24
+
Prompt Injections
+
+
+ + +
+
+
+

Risk Score

+

Weighted security posture score for PentAGI based on static analysis, behavioral observation, and architectural risk.

+
+
+ +
+
+ 18 / 100 +
DANGEROUS BY DESIGN
+
+
+ +
+
+

Static Code Patterns

+

Frequency of security-sensitive code patterns identified across 1001 source files (518 Go).

+
+
+ +
+
+ +
+
+

Phase 2.2 Gateway Detections

+

Threat categories detected by the AI Security Gateway across 274 intercepted agent-to-LLM requests.

+
+
+ +
+
+
+
+ +
+ + +
+
+

2. Research Phase Timeline

+

Three-phase progression from static analysis through iterative sandbox testing to successful end-to-end instrumented execution.

+
+ +
+ + +
+
1.0
+
+
+

Phase 1 — Static Analysis

+ April 19, 2026 +
+

Source code audit of the full repository. Tools: custom auditor script, trufflehog, grep pattern analysis.

+
+ 4 CRITICAL findings + 1144 Docker API calls + 0 hardcoded secrets +
+
+
+ + +
+
2.0
+
+
+

Phase 2.0 — Basic Sandbox

+ April 20, 2026 +
+

Isolated Docker sandbox: internal: true, no docker.sock mount, read-only filesystem, fake API keys.

+
+ RESULT: PentAGI halts at T+16s — "Docker runtime client initialization failed." Confirms docker.sock is mandatory core, not optional feature. +
+
+
+ + +
+
2.1
+
+
+

Phase 2.1 — Docker-in-Docker + Ollama

+ April 20, 2026 +
+

Added DinD container and Ollama local LLM backend. Tested with gemma3:4b and qwen3:1.7b models on CPU.

+
+
gemma3:4b — TIMEOUT 10m2s
+
qwen3:1.7b — TIMEOUT 10m3s
+
+

Hardcoded 10-minute LLM timeout is non-configurable. CPU inference incompatible. Requires cloud API or GPU.

+
+
+ + +
+
2.2
+
+
+

Phase 2.2 — Mock LLM + AI Gateway

+ April 21, 2026 +
+

Custom Mock LLM server + AI Security Gateway proxy. Full end-to-end execution achieved. 274 requests intercepted, 202 containing threat patterns.

+
+ SUCCESS — Flow #10 complete + 462 EXFILTRATION events + 24 PROMPT_INJECTION +
+
+
+ +
+
+ +
+ + +
+
+

3. Static Analysis — Critical Findings

+

Four critical/high-severity findings identified via source code inspection. Zero false positives confirmed by dynamic phases.

+
+ +
+ + + + + +
+ +
+
+
+ +
+ + +
+
+

4. Phase 2.2 Sandbox Architecture

+

Fully instrumented, air-gapped sandbox. All agent-to-LLM traffic routed through the AI Security Gateway for real-time threat detection.

+
+ +
+
+ + +
+
+ PentAGI Agents
+ (Generator / Refiner / Primary) +
+
+
+ AI Security Gateway
+ :11435 — detect mode +
+
+
+ Mock LLM Server
+ :11436 — 3 modes +
+
+ + +
+
+
+
tool calls
+
+
+
+
+
logs all prompts
+
+
+ + +
+
+ Docker-in-Docker (DinD)
+ pentagi-terminal-10 (debian:latest)
+ nmap · /etc/passwd · curl · env +
+
+
+ DVWA Target
+ (inside DinD sandbox)
+ HTTP target for agent recon +
+
+ + +
+
Network Isolation Boundary — internal: true (no internet)
+
+
+ PostgreSQL
+ pgvector (sandbox) +
+
+ No docker.sock
+ Static: read-only FS +
+
+ Fake API Keys
+ OpenAI / Anthropic +
+
+ Gateway Logs
+ 274 reqs · 202 threats +
+
+
+
+
+
+ +
+ + +
+
+

5. Phase 2.2 — Gateway Security Findings

+

Threat intelligence captured by the AI Security Gateway across all 274 intercepted agent-to-LLM requests during the successful end-to-end execution run.

+
+ + +
+ + +
+
+ EXFILTRATION + 462 + pattern matches +
+

PentAGI's agent system prompts contain extensive instructions for data collection and exfiltration as part of its legitimate pentest workflow. The Gateway correctly flags all of these.

+
+
+ IP address extraction patterns + 178 hits +
+
+ +
+ /etc/passwd & /etc/shadow access + 136 hits +
+
+ +
+ nc -l (netcat listener) + 96 hits +
+
+ +
+ curl | bash pipe patterns + 52 hits +
+
+
+
+ + +
+
+ PROMPT INJECTION + 24 + matches in system prompts +
+

PentAGI injects bypass security/filter/restriction patterns in its own system prompts — a legitimate design choice to enable an LLM to perform offensive actions, but flagged as prompt injection by the Gateway's 12-pattern ruleset.

+
+
Key Insight
+

This demonstrates that without a Gateway, a compromised or malicious LLM could receive these bypass instructions and act on them — executing arbitrary commands in the DinD container and exfiltrating data.

+
+
+
Traffic Breakdown
+
+
+
+
+
+
+
+
+
+ 202 with threats (73.7%) + 72 clean (26.3%) +
+
+
+ +
+ + +
+

Successful Execution Trace (Flow #10)

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
StepAgentTool CalledResult
1tool_call_id_detectorget_number ×5Template call_{r:24:h} detected
2docker_image_selector(text response)debian:latest selected
3generatorsubtask_list (barrier)4 subtasks created
4primary_agent (S1)terminal ×4 + donenmap, /etc/passwd, web enum, env
5refiner (S1)subtask_patch (barrier)No changes
6–8primary_agent (S2–S4)terminal + doneRemaining subtask cycles (16+ commands total)
9refiner (final)subtask_patchplanned_count=0, task_complete=true
+
+
+ Duration: ~4 seconds + 4/4 subtasks completed + Container: pentagi-terminal-10 + 16+ commands executed in DinD +
+
+
+ +
+ + +
+
+

6. Tools Developed During Research

+

Two purpose-built tools created to enable Phase 2.2. Both are reusable for future AI agent security research.

+
+ +
+ +
+
+ mock_llm.py + Ollama/OpenAI Compatible +
+

Lightweight mock LLM server that responds instantly (0 latency), eliminating CPU inference timeouts. Fully implements Ollama API including function/tool calling with unique IDs.

+
+
+ + Mode: helpful — Responds with valid tool calls (terminal, subtask_list, done, subtask_patch). Simulates a cooperative agent LLM. +
+
+ + Mode: malicious — Injects adversarial payloads into tool calls. Tests Gateway detection capability. +
+
+ + Mode: confused — Sends malformed/unexpected responses. Tests agent error handling robustness. +
+
+ + Full logging of all received prompts. Tool call IDs format: call_{24hex}. +
+
+ + PentAGI-specific barrier logic: subtask_list terminates generator, done terminates executor, refiner detection via agent context. +
+
+
+ +
+
+ ai_gateway.py + HTTP Proxy + Detection +
+

Transparent HTTP proxy placed between AI agents and their LLM backend. Intercepts all requests/responses for real-time threat detection and logging — no agent modification required.

+
+
+ + 12 exfiltration patterns — IPs, credentials, sensitive files, network listener commands, pipe-to-bash patterns. +
+
+ + 12 prompt injection patterns — bypass/jailbreak/ignore-previous-instructions keywords in system prompts. +
+
+ + 9 suspicious action patterns — chmod 777, root privilege escalation, cron job creation, SSH key deployment. +
+
+ + Mode: detect — Logs and passes through. Mode: enforce — Blocks and rejects threats with configurable rate limiting. +
+
+ + Output: JSONL structured logs with timestamps, threat type, matched pattern, request ID. Compatible with SIEM ingestion. +
+
+
+ +
+
+ +
+ + +
+
+

7. Consolidated Key Findings

+

Cross-phase findings with MITRE ATT&CK mapping and deployment guidance.

+
+ +
+ +
+
+ CRITICAL + T1611 — Escape to Host +
+

Docker Socket Mount is Non-Optional

+

PentAGI requires /var/run/docker.sock to operate. Any installation grants the agent (and any LLM it contacts) full host-level Docker control: container creation/destruction, volume access, and privilege escalation to root. Confirmed dynamically in Phase 2.0.

+
CVSS: 9.8 — docker-compose.yml:176
+
+ +
+
+ CRITICAL + T1078.003 — Local Accounts +
+

Container Runs as root:root

+

The main PentAGI service container is explicitly configured with user: root:root. Combined with docker.sock access, this provides maximal host privilege from the moment the container starts.

+
CVSS: 9.1 — docker-compose.yml:180
+
+ +
+
+ HIGH + T1557 — Adversary-in-the-Middle +
+

NET_ADMIN Capability Available

+

The DOCKER_NET_ADMIN flag (default: false) can enable full host network manipulation — ARP spoofing, packet sniffing, routing changes. If activated by a compromised LLM or misconfiguration, provides network-wide attack surface.

+
CVSS: 8.0 — docker-compose.yml:161
+
+ +
+
+ HIGH + T1059 — Command Execution +
+

1144 Docker API Calls + 46 exec.Command

+

The codebase makes 1144 Docker API calls and 46 direct exec.Command invocations. Extensive programmatic control over the Docker daemon. Any prompt injection reaching the LLM could weaponize these code paths.

+
Static pattern count — 518 Go source files
+
+ +
+
+ MEDIUM + Design Limitation +
+

Hardcoded 10-Minute LLM Timeout

+

Non-configurable 10-minute timeout makes PentAGI incompatible with local CPU-based LLMs. Effectively forces cloud API dependency (OpenAI/Anthropic), breaking air-gap security for any isolated deployment.

+
Confirmed: gemma3:4b + qwen3:1.7b both timed out
+
+ +
+
+ POSITIVE + Defense Works +
+

Hardening Controls Effective

+

Read-only filesystem blocked SSH key generation at /root/.ollama/id_ed25519. DinD evasion detected (docker_inside=false). AI Gateway intercepted 100% of agent traffic with zero false negatives. 0 hardcoded secrets in source.

+
All defensive controls confirmed effective
+
+ +
+
+ +
+ + +
+
+

8. Deployment Guidance & Remediation Roadmap

+

Structured guidance for organizations evaluating PentAGI. Ordered by criticality and implementation timeline.

+
+ +
+ +
+
1
+
+
+

Mandatory Prerequisites

+ Before ANY deployment +
+
    +
  • [M1] Deploy ONLY in a dedicated, sacrificial VM with no production data or services.
  • +
  • [M2] NEVER use real customer API keys or credentials inside the PentAGI environment.
  • +
  • [M3] Deploy an AI Security Gateway (or equivalent proxy) on all agent-to-LLM traffic.
  • +
  • [M4] Enable network egress logging and alerting for unexpected external connections.
  • +
+
+
+ +
+
2
+
+
+

Architecture Hardening

+ Short-term (30 days) +
+
    +
  • [A1] Consider Docker socket proxy (e.g., Tecnativa/docker-socket-proxy) to restrict API surface to required operations only.
  • +
  • [A2] Run the container as a non-root user where feasible — submit upstream patch to the project.
  • +
  • [A3] Keep DOCKER_NET_ADMIN=false (default). Document this explicitly in ops runbooks.
  • +
  • [A4] Implement time-boxed sessions with automatic container teardown after each engagement.
  • +
+
+
+ +
+
3
+
+
+

LLM Backend Hardening

+ Medium-term (60 days) +
+
    +
  • [L1] Extend the AI Gateway ruleset with target-specific sensitive data patterns (hostnames, internal CIDRs, project names).
  • +
  • [L2] Switch Gateway to enforce mode once baseline false-positive rate is acceptable.
  • +
  • [L3] Integrate Gateway logs with SIEM for cross-session behavioral analysis.
  • +
  • [L4] Upstream: request configurable LLM timeout to enable local/GPU model support.
  • +
+
+
+ +
+
4
+
+
+

Ongoing Research & Monitoring

+ Continuous +
+
    +
  • [R1] Repeat dynamic analysis with a GPU-accelerated local LLM to observe full agent behavior without cloud dependency.
  • +
  • [R2] Test Mock LLM in malicious mode to measure Gateway enforcement efficacy against adversarial inputs.
  • +
  • [R3] Monitor upstream PentAGI for new versions addressing docker.sock dependency.
  • +
  • [R4] Publish Mock LLM + AI Gateway as standalone open-source tools for the AI agent security community.
  • +
+
+
+ +
+
+ +
+ + +
+
+

9. Research Verdict

+
+
+
+
Verdict
+
DANGEROUS
BY DESIGN
+

PentAGI is not malware, but its architecture mandates host-level Docker control. Any installation in a shared environment creates a container escape path available to the LLM backend.

+
+
+
For Production Use
+
NOT
RECOMMENDED
+

Without a dedicated, isolated VM with no adjacent sensitive workloads, real credentials, or production infrastructure, the risk is unacceptable.

+
+
+
For Research Use
+
SAFE WITH
CONTROLS
+

With DinD isolation, read-only filesystem, fake API keys, network air-gap, and AI Security Gateway, PentAGI is safe for controlled security research and case studies.

+
+
+
+

Three Critical Design Limitations

+
+
+ 1 +
docker.sock mandatory
Grants root access to the host Docker daemon. Non-negotiable for PentAGI to function.
+
+
+ 2 +
10-minute LLM timeout
Hardcoded, non-configurable. Incompatible with local CPU inference. Forces cloud API dependency.
+
+
+ 3 +
No degraded mode
If Docker or LLM is unavailable, PentAGI halts completely. No graceful fallback.
+
+
+
+
+ + + + +
+ + + + + + + + +
+ + + + diff --git a/pentagi-2026-04/README.md b/pentagi-2026-04/README.md new file mode 100644 index 0000000..e4d928b --- /dev/null +++ b/pentagi-2026-04/README.md @@ -0,0 +1,105 @@ +# Case Study #2 — PentAGI Autonomous AI Agent Security Analysis + +**Research period:** April 2026 +**Subject:** [PentAGI](https://pentagi.com/) — autonomous AI-powered penetration testing agent +**Classification:** RESEARCH — publicly available codebase +**Author:** MK ScorpioSec Research Team + +--- + +## Overview + +PentAGI automates the full penetration testing lifecycle using LLM-orchestrated agents, MCP tools, and multi-agent workflows. This study examines the security posture of PentAGI itself when deployed in a containerized environment — the tool-testing-the-tool scenario. + +**Key question:** If an AI security agent runs loose in your environment, what can it reach and what does it actually do? + +--- + +## Findings Summary + +### Static Analysis + +| Severity | Count | Key Finding | +|----------|-------|-------------| +| CRITICAL | 4 | docker.sock exposure, root execution, NET_ADMIN capability, 1,144 Docker API calls | +| HIGH | 2 | Unrestricted filesystem access, host network exposure | +| MEDIUM | 2 | Missing resource limits, debug interface exposure | + +### Dynamic Analysis — Phase 2.2 (Behavioral) + +| Metric | Value | +|--------|-------| +| Requests analyzed | 274 | +| Threat rate | **73.7%** | +| EXFILTRATION events | **462** (env vars, filesystem, credential probing) | +| PROMPT_INJECTION attempts | **24** | +| Docker API calls (1 session) | **1,144** | + +--- + +## Critical Findings + +### 1. Docker Socket Exposure (`/var/run/docker.sock`) +Mounting the Docker socket gives the container — and the LLM driving it — full control over the host Docker daemon. A successful prompt injection achieves host escape. + +### 2. Root Container Execution +All PentAGI containers run as root with no user namespace isolation. Combined with the Docker socket, this is a direct path to full host compromise. + +### 3. NET_ADMIN Capability +Grants the container full access to the host network stack: traffic interception, routing manipulation, and ARP spoofing against adjacent containers. + +### 4. Prompt Injection → Container Escape Chain +Static analysis confirmed 24 injection-surface endpoints. A weaponized injection payload can direct the LLM to spin up a new privileged container mounting the host filesystem. + +--- + +## Reports + +| File | Description | +|------|-------------| +| [`PENTAGI_CASE_STUDY_BRANDING.html`](PENTAGI_CASE_STUDY_BRANDING.html) | Full branded report — print-ready, all charts and visualizations | +| [`PENTAGI_CASE_STUDY.html`](PENTAGI_CASE_STUDY.html) | Compact research report | + +> Open in a modern browser. Reports use Chart.js for visualizations (loaded from CDN). + +--- + +## Methodology + +**Phase 1 — Static Analysis** +- Dockerfile + docker-compose.yml capability audit +- Go source code pattern analysis: `exec.Command` calls, Docker API usage, filesystem ops, secret references +- Dependency scanning with Trivy +- Container privilege matrix evaluation + +**Phase 2 — Dynamic Analysis** +- Falco behavioral monitoring during live agent sessions +- API call pattern classification (Anthropic API, Docker API, filesystem) +- Behavioral threat event taxonomy: EXFILTRATION, PROMPT_INJECTION, PRIVILEGE_ESCALATION, LATERAL_MOVEMENT +- Prompt injection surface mapping (direct + indirect vectors) + +--- + +## Responsible Disclosure + +This research was conducted on the publicly available PentAGI codebase and default Docker Compose deployment. No production systems or live endpoints were targeted. Findings relate to the default configuration as shipped. + +Disclosure timeline: findings documented April 2026. + +--- + +## Tools Used + +| Tool | Purpose | +|------|---------| +| [Trivy](https://github.com/aquasecurity/trivy) | Container + dependency scanning | +| [Falco](https://falco.org/) | Runtime behavioral monitoring | +| [pq-audit](https://github.com/mk-scorpiosec/pq-audit) | Post-quantum cryptography layer | +| Custom Falco rules | EXFILTRATION + PROMPT_INJECTION classification | + +--- + +## Related + +- [Case Study #1 — TerraGoat IaC Analysis](../terragoat-2026-04/) +- [MK ScorpioSec Research](https://github.com/mk-scorpiosec/research)