From 16b1dbf7fc0fd61cd5dd687b74fb0848ea28ddd7 Mon Sep 17 00:00:00 2001 From: Ben Alkov Date: Wed, 17 Jun 2026 13:47:40 -0400 Subject: [PATCH 1/2] fix(review): the agent tool invocation is fundamentally wrong In working this PR, I had to *carefully* review the Claude Code docs. *This* commit is an outcome of that review. - There is no such thing as a "subagent_type" - it's an hallucination. The only existing fields are listed in https://code.claude.com/docs/en/sub-agents#supported-frontmatter-fields - Agent definitions *are not composable* - The builtin agents (e.g. Explore) are defined in *exactly* the same way as custom agents - There is no mechanism for "subclassing" agents - The recommended way to restrict subagents is using one or both of `tools`/`disallowedTools` - `permissionMode`: `dontAsk` (unintuitively) forbids Claude Code from executing any action *which would normally require user approval* - There is no such thing as "run_in_background" - another hallucination - `prompt` is under-specified: it *does not* exist in the list of frontmatter fields mentioned above, but it *can* be used when invoking the `--agents` command line flag. `initialPrompt` exists, but its description is a little fuzzy. Going conservative here. Signed-off-by: Ben Alkov --- .../scaffold/fullsend-repo/agents/review.md | 3 +- .../fullsend-repo/skills/pr-review/SKILL.md | 41 +++++++++++-------- 2 files changed, 26 insertions(+), 18 deletions(-) diff --git a/internal/scaffold/fullsend-repo/agents/review.md b/internal/scaffold/fullsend-repo/agents/review.md index 393df4ccb..ae7ba165b 100644 --- a/internal/scaffold/fullsend-repo/agents/review.md +++ b/internal/scaffold/fullsend-repo/agents/review.md @@ -94,7 +94,8 @@ This agent has three skills. Select based on invocation context: paths, scope authorization, PR body injection defense), and produces a structured review result. Sub-agent definitions live in `skills/pr-review/sub-agents/`. Each sub-agent is dispatched with - `model` from its frontmatter and `subagent_type: Explore`. + arguments appropriate to the "Agent" tool, as specified in the + pr-review skill. - **`code-review`** — the prompt is about a local branch diff with no PR, or another skill is delegating code evaluation. This skill evaluates the diff and source files directly across the original diff --git a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md index 7d3226834..e44ad1c00 100644 --- a/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md +++ b/internal/scaffold/fullsend-repo/skills/pr-review/SKILL.md @@ -312,8 +312,8 @@ For each selected sub-agent, assemble a context package containing: For each selected sub-agent: 1. Read the sub-agent definition from `sub-agents/{name}.md` -2. Extract the `model` from frontmatter -3. Compose the spawn prompt from three parts: +2. Extract `model` and `name` from frontmatter +3. Compose the spawn prompt from: **Part 1 — Sub-agent definition:** the full markdown body of the sub-agent file (everything after the frontmatter) @@ -360,11 +360,13 @@ For each selected sub-agent: ``` 4. Spawn via Agent tool with: + - `name`: from the sub-agent frontmatter + - `tools`: `Read, Grep, Glob` - `model`: from the sub-agent frontmatter (any value accepted by - the Agent tool's `model` parameter) - - `subagent_type`: `Explore` (read-only — sub-agents do not write) - - `run_in_background`: `true` - - `prompt`: composed from parts 1–5 + the Agent tool's `model` parameter) + - `permissionMode`: `dontAsk` + - `background`: `true` + - `initialPrompt`: composed from parts 1–5 **All sub-agents MUST be dispatched simultaneously** — include all Agent calls in a single message so they run concurrently. This is the @@ -459,8 +461,9 @@ fresh context. The challenger has not seen the orchestrator's synthesis — it receives only the raw findings and the diff, preserving context isolation. -1. Read `sub-agents/challenger.md` for the sub-agent definition -2. Compose the spawn prompt from: +1. Read the sub-agent definition from `sub-agents/challenger.md` +2. Extract `model` and `name` from frontmatter +3. Compose the spawn prompt from: **Part 1 — Sub-agent definition:** the full markdown body of the challenger sub-agent file (everything after the frontmatter) @@ -494,10 +497,14 @@ isolation. REVIEW_SUB_AGENT_TRUE ``` -3. Spawn via Agent tool with: - - `model`: from the challenger sub-agent frontmatter (`opus`) - - `subagent_type`: `Explore` (read-only) - - `prompt`: composed from parts 1–4 +4. Spawn via Agent tool with: + - `name`: from the challenger sub-agent frontmatter + - `tools`: `Read, Grep, Glob` + - `model`: from the sub-agent frontmatter (any value accepted by + the Agent tool's `model` parameter) + - `permissionMode`: `dontAsk` + - `background`: `true` + - `initialPrompt`: composed from parts 1–4 **Prompt size guard:** If the combined context package (findings JSON + diff + file list + PR metadata) exceeds 80 000 tokens, @@ -510,7 +517,7 @@ isolation. needs their findings as input), so it is dispatched sequentially, not in the parallel batch from step 4. -4. Consume the challenger's output. The challenger returns a **different +5. Consume the challenger's output. The challenger returns a **different format** from dimension sub-agents: an object with `adjudicated_findings` and `removed_findings` arrays (not a flat finding array). Parse accordingly: @@ -522,15 +529,15 @@ isolation. part of the standard finding schema. - If `adjudicated_findings` is empty but the pre-challenger finding set was non-empty, treat this as a challenger failure (fall back - per step 5 below). A legitimate challenger pass that removes all - findings is unlikely — an empty result more likely indicates a - parsing error or context truncation. + per the immediate next step below). A legitimate challenger pass + that removes all findings is unlikely — an empty result more likely + indicates a parsing error or context truncation. - Otherwise, replace the merged finding set with the challenger's `adjudicated_findings`. - Log any `removed_findings` for transparency but do not include them in the final review. -5. If the challenger sub-agent fails (timeout, error, empty +6. If the challenger sub-agent fails (timeout, error, empty response), fall back to using the pre-challenger merged finding set from steps 6a–6c. Record an **info**-level finding: From c1d2b494eb6a1cfd8993d6768e781bed14344cf4 Mon Sep 17 00:00:00 2001 From: Ben Alkov Date: Wed, 17 Jun 2026 09:03:58 -0400 Subject: [PATCH 2/2] feat(review): harden review agent with read-only repo access Review agents should not modify the target repository - add harness field `readonly_repo` (boolean) which enforces `chmod -R a-w` on the cloned repo, after copy - tighten the review agent's sandbox policy to exclude workdir writes Assisted-by: Claude Code (Opus 4.6) Signed-off-by: Ben Alkov --- internal/cli/run.go | 14 +++++++++- internal/harness/harness.go | 1 + internal/harness/harness_test.go | 27 +++++++++++++++++++ .../fullsend-repo/harness/review.yaml | 1 + .../fullsend-repo/policies/review.yaml | 2 +- 5 files changed, 43 insertions(+), 2 deletions(-) diff --git a/internal/cli/run.go b/internal/cli/run.go index e705afc63..fe4d8f259 100644 --- a/internal/cli/run.go +++ b/internal/cli/run.go @@ -684,7 +684,19 @@ func runAgent(ctx context.Context, agentName, fullsendDir, outputBase, targetRep printer.StepDone(fmt.Sprintf("Agent-input files copied (%.1fs)", time.Since(inputStart).Seconds())) } - // 8c. Host-side scan (Path A): scan the target repo's context files + // 8c. Make the target repo read-only if the harness opts in. + // Runs after all repo-directory writes (8a, 8a-2) are complete. + // Excludes .git/ so git operations (index.lock, etc.) still work. + if h.ReadonlyRepo { + chmodCmd := fmt.Sprintf("chmod -R a-w %s && chmod -R u+w %s/.git", remoteRepositoryDir, remoteRepositoryDir) + if _, _, _, err := sandbox.Exec(sandboxName, chmodCmd, 30*time.Second); err != nil { + printer.StepWarn("Could not make repo read-only: " + err.Error()) + } else { + printer.StepDone("Target repo set to read-only") + } + } + + // 8d. Host-side scan (Path A): scan the target repo's context files // (CLAUDE.md, AGENTS.md, SKILL.md, etc.) before the agent processes them. // The target branch may contain attacker-controlled files from a PR. if h.SecurityEnabled() { diff --git a/internal/harness/harness.go b/internal/harness/harness.go index 9c7630bdd..1e45e462a 100644 --- a/internal/harness/harness.go +++ b/internal/harness/harness.go @@ -215,6 +215,7 @@ type Harness struct { ValidationLoop *ValidationLoop `yaml:"validation_loop,omitempty"` RunnerEnv map[string]string `yaml:"runner_env,omitempty"` TimeoutMinutes int `yaml:"timeout_minutes,omitempty"` + ReadonlyRepo bool `yaml:"readonly_repo,omitempty"` SandboxTimeoutSeconds int `yaml:"sandbox_timeout_seconds,omitempty"` Security *SecurityConfig `yaml:"security,omitempty"` AllowedRemoteResources []string `yaml:"allowed_remote_resources,omitempty"` diff --git a/internal/harness/harness_test.go b/internal/harness/harness_test.go index 76b862dfb..718e1df86 100644 --- a/internal/harness/harness_test.go +++ b/internal/harness/harness_test.go @@ -1418,6 +1418,33 @@ agent: agents/code.md assert.Nil(t, h.MaxRuntimeFetches) } +func TestLoad_ReadonlyRepoField(t *testing.T) { + content := ` +agent: agents/review.md +readonly_repo: true +` + dir := t.TempDir() + path := filepath.Join(dir, "test.yaml") + require.NoError(t, os.WriteFile(path, []byte(content), 0o644)) + + h, err := Load(path) + require.NoError(t, err) + assert.True(t, h.ReadonlyRepo) +} + +func TestLoad_ReadonlyRepoFieldOmitted(t *testing.T) { + content := ` +agent: agents/code.md +` + dir := t.TempDir() + path := filepath.Join(dir, "test.yaml") + require.NoError(t, os.WriteFile(path, []byte(content), 0o644)) + + h, err := Load(path) + require.NoError(t, err) + assert.False(t, h.ReadonlyRepo) +} + // --- ValidForgePlatform tests --- func TestValidForgePlatform(t *testing.T) { diff --git a/internal/scaffold/fullsend-repo/harness/review.yaml b/internal/scaffold/fullsend-repo/harness/review.yaml index ebfce5a73..f87f31f81 100644 --- a/internal/scaffold/fullsend-repo/harness/review.yaml +++ b/internal/scaffold/fullsend-repo/harness/review.yaml @@ -4,6 +4,7 @@ doc: docs/agents/review.md model: opus image: ghcr.io/fullsend-ai/fullsend-code:latest policy: policies/review.yaml +readonly_repo: true role: review slug: fullsend-ai-review diff --git a/internal/scaffold/fullsend-repo/policies/review.yaml b/internal/scaffold/fullsend-repo/policies/review.yaml index 00f10b630..3871b6d8b 100644 --- a/internal/scaffold/fullsend-repo/policies/review.yaml +++ b/internal/scaffold/fullsend-repo/policies/review.yaml @@ -8,7 +8,7 @@ version: 1 # handles mutations on the runner with a separate write-scoped token. filesystem_policy: - include_workdir: true + include_workdir: false read_only: [/usr, /lib, /proc, /dev/urandom, /app, /etc, /var/log] read_write: [/sandbox, /tmp, /dev/null] landlock: