From 006fe59977868f541058a7944a92b69177830f18 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jose=20Villase=C3=B1or=20Montfort?=
 <195970+montfort@users.noreply.github.com>
Date: Tue, 16 Jun 2026 12:04:40 -0600
Subject: [PATCH] =?UTF-8?q?fix(framework):=20audit-prompt=20hardening=20?=
 =?UTF-8?q?=E2=80=94=20enforce=20independence=20+=20surface=20output=20con?=
 =?UTF-8?q?tract=20(#261)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A real 5-auditor cross-family cycle exposed two prompt-design gaps that let
auditors drift from the intended discipline (neither is a CLI mechanics bug).

Problem A — auditor independence was not enforced. When audits run
sequentially, each report lands in the shared .straymark/audits/<charter>/
dir, and a later auditor can read the earlier ones. One real auditor produced
a meta-consolidation of the four prior reports instead of an independent
audit — its "convergence" was copied, not independent, which silently inflates
both its rating and the consolidated review's confidence.

- audit-prompt.md (EN + ES): ABSOLUTE RULE now forbids reading/grepping/
  referencing any other report-*.md; "Your role" + "What you must NOT do"
  reworded (a sibling report may already be on disk — do not open it).
- straymark-audit-review (4 runtime copies): contamination guard that detects
  reports referencing siblings and excludes them from convergence/dedup +
  rating. A prompt rule is weak; verify at review time too.

Problem B — the output contract was buried at the end of a ~3,900-line
resolved prompt, and the auditor patterned its frontmatter after the embedded
AILOGs (different schema), producing off-contract reports.

- New "Output contract (read this first)" block right after the ABSOLUTE RULE:
  required frontmatter + the four finding categories + an explicit
  "DELIBERATELY DIFFERENT from the AILOG/AIDEC frontmatter" disambiguation,
  plus a Frontmatter note beside the embedded AILOGs. (Categories were already
  defined before the output format, so no reorder was needed.)

All audit tests green (skill 12, template 9, charter_audit 21, audit 9).

Closes #261.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 CHANGELOG.md                                  | 23 +++++++++++++++++++
 .../workflows/straymark-audit-review.md       |  2 ++
 .../skills/straymark-audit-review/SKILL.md    |  2 ++
 .../skills/straymark-audit-review/SKILL.md    |  2 ++
 .../skills/straymark-audit-review/SKILL.md    |  2 ++
 dist/.straymark/audit-prompts/audit-prompt.md | 20 ++++++++++++++--
 .../audit-prompts/i18n/es/audit-prompt.md     | 20 ++++++++++++++--
 7 files changed, 67 insertions(+), 4 deletions(-)
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 53cf9164..c2b47795 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,29 @@ and this project uses [independent versioning](README.md#versioning) for Framewo
 
 ---
 
+## [Unreleased]
+
+### Changed (Framework — audit prompt hardening, #261)
+
+- **Auditor independence is now enforced in the audit prompt** (`audit-prompts/audit-prompt.md`,
+  EN + ES). The ABSOLUTE RULE forbids reading, grepping, or referencing any other auditor's
+  `report-*.md` under `.straymark/audits/` — for this Charter or any other — because cross-model
+  convergence is signal only when each auditor reached it independently. The "Your role" and
+  "What you must NOT do" sections were reworded to match (a sibling report may already be on
+  disk; do not open it).
+- **The output contract is surfaced near the top** of the prompt (new "Output contract (read
+  this first)" section, right after the ABSOLUTE RULE) instead of only at the end of a long
+  prompt: required report frontmatter, the four finding categories, and an explicit warning that
+  the report frontmatter is **deliberately different** from the embedded AILOG/AIDEC frontmatter
+  the auditor reads (the mimicry that drifted real reports off-schema). A "Frontmatter note" was
+  also added beside the embedded AILOGs.
+- **`straymark-audit-review`** (skill, all runtimes) gained a **contamination guard**: it now
+  scans each report for signs it read its siblings (references to another `report-*.md`, a
+  cross-auditor comparison table, "I verified all N findings from the prior <model> audit") and
+  excludes contaminated reports from the convergence/dedup math and the auditor rating.
+
+---
+
 ## CLI 3.25.0 / Core 0.5.0 — Architecture Plan track A1 (EXPERIMENTAL)
 
 The textual + authoring half of Loom's Architecture Plan view (Spec 002) — the operator's "where are we?" answered from the terminal, ahead of the visual overlay (A2). All surfaces are **EXPERIMENTAL (Loom v0)** and may change without a deprecation cycle.
diff --git a/dist/.agent/workflows/straymark-audit-review.md b/dist/.agent/workflows/straymark-audit-review.md
index 7f52eb03..b415e3a8 100644
--- a/dist/.agent/workflows/straymark-audit-review.md
+++ b/dist/.agent/workflows/straymark-audit-review.md
@@ -46,6 +46,8 @@ For each `.straymark/audits/<CHARTER-ID>/report-*.md`:
 
 Build a **master finding list** — every unique claim across all auditors, deduplicated when two auditors clearly describe the same thing.
 
+**Independence check (contamination guard).** Before you trust any convergence between auditors, scan each report for signs it read the *other* auditors' reports instead of auditing independently: explicit references to another `report-*.md`, another auditor named by model, a "comparison table of auditors", or language like "I independently verified all N findings from the prior <model> audit". A report that consolidates or cross-checks against its siblings is **contaminated** — its agreement is copied, not independent signal. Flag it, exclude it from the convergence/dedup math and from the auditor rating (step 5), and note the contamination in the review. The audit prompt forbids reading sibling reports, but a prompt rule is weak — verify it here.
+
 ### 3. Verify every finding against actual code
 
 This is the substantive step. For EACH finding in the master list:
diff --git a/dist/.claude/skills/straymark-audit-review/SKILL.md b/dist/.claude/skills/straymark-audit-review/SKILL.md
index 813a4f12..276d3d90 100644
--- a/dist/.claude/skills/straymark-audit-review/SKILL.md
+++ b/dist/.claude/skills/straymark-audit-review/SKILL.md
@@ -48,6 +48,8 @@ For each `.straymark/audits/<CHARTER-ID>/report-*.md`:
 
 Build a **master finding list** — every unique claim across all auditors, deduplicated when two auditors clearly describe the same thing.
 
+**Independence check (contamination guard).** Before you trust any convergence between auditors, scan each report for signs it read the *other* auditors' reports instead of auditing independently: explicit references to another `report-*.md`, another auditor named by model, a "comparison table of auditors", or language like "I independently verified all N findings from the prior <model> audit". A report that consolidates or cross-checks against its siblings is **contaminated** — its agreement is copied, not independent signal. Flag it, exclude it from the convergence/dedup math and from the auditor rating (step 5), and note the contamination in the review. The audit prompt forbids reading sibling reports, but a prompt rule is weak — verify it here.
+
 ### 3. Verify every finding against actual code
 
 This is the substantive step. For EACH finding in the master list:
diff --git a/dist/.codex/skills/straymark-audit-review/SKILL.md b/dist/.codex/skills/straymark-audit-review/SKILL.md
index 681f2919..1410e5bb 100644
--- a/dist/.codex/skills/straymark-audit-review/SKILL.md
+++ b/dist/.codex/skills/straymark-audit-review/SKILL.md
@@ -47,6 +47,8 @@ For each `.straymark/audits/<CHARTER-ID>/report-*.md`:
 
 Build a **master finding list** — every unique claim across all auditors, deduplicated when two auditors clearly describe the same thing.
 
+**Independence check (contamination guard).** Before you trust any convergence between auditors, scan each report for signs it read the *other* auditors' reports instead of auditing independently: explicit references to another `report-*.md`, another auditor named by model, a "comparison table of auditors", or language like "I independently verified all N findings from the prior <model> audit". A report that consolidates or cross-checks against its siblings is **contaminated** — its agreement is copied, not independent signal. Flag it, exclude it from the convergence/dedup math and from the auditor rating (step 5), and note the contamination in the review. The audit prompt forbids reading sibling reports, but a prompt rule is weak — verify it here.
+
 ### 3. Verify every finding against actual code
 
 This is the substantive step. For EACH finding in the master list:
diff --git a/dist/.gemini/skills/straymark-audit-review/SKILL.md b/dist/.gemini/skills/straymark-audit-review/SKILL.md
index 681f2919..1410e5bb 100644
--- a/dist/.gemini/skills/straymark-audit-review/SKILL.md
+++ b/dist/.gemini/skills/straymark-audit-review/SKILL.md
@@ -47,6 +47,8 @@ For each `.straymark/audits/<CHARTER-ID>/report-*.md`:
 
 Build a **master finding list** — every unique claim across all auditors, deduplicated when two auditors clearly describe the same thing.
 
+**Independence check (contamination guard).** Before you trust any convergence between auditors, scan each report for signs it read the *other* auditors' reports instead of auditing independently: explicit references to another `report-*.md`, another auditor named by model, a "comparison table of auditors", or language like "I independently verified all N findings from the prior <model> audit". A report that consolidates or cross-checks against its siblings is **contaminated** — its agreement is copied, not independent signal. Flag it, exclude it from the convergence/dedup math and from the auditor rating (step 5), and note the contamination in the review. The audit prompt forbids reading sibling reports, but a prompt rule is weak — verify it here.
+
 ### 3. Verify every finding against actual code
 
 This is the substantive step. For EACH finding in the master list:
diff --git a/dist/.straymark/audit-prompts/audit-prompt.md b/dist/.straymark/audit-prompts/audit-prompt.md
index e20934b3..3d5633aa 100644
--- a/dist/.straymark/audit-prompts/audit-prompt.md
+++ b/dist/.straymark/audit-prompts/audit-prompt.md
@@ -56,6 +56,7 @@ Specifically, you are FORBIDDEN from:
 - Running code generators (`go generate`, `sqlc generate`, `wire`, `cargo build` with filesystem effects, `npm install`, etc.).
 - Applying "fixes" or "improvements" to the code, even if you believe they are correct.
 - Reformatting, renaming, or reorganizing existing files.
+- Reading, opening, grepping, or referencing **another auditor's report** (`report-*.md`, `auditor-*.md`, or any scratch file) under `.straymark/audits/` — for this Charter or any other. Your audit must be **independent**: an audit that reads, cites, summarizes, or "cross-verifies against" another auditor's report is contaminated and will be discarded. Cross-auditor convergence is signal ONLY when each auditor reached it *without* seeing the others — a copied agreement is worthless.
 
 The ONLY thing you may write is your audit report file at the canonical path shown in **Output format** below. That is the ONLY file you have permission to create.
 
@@ -67,11 +68,24 @@ If a test fails, **REPORT IT**. Do NOT repair it.
 
 ---
 
+## Output contract (read this first)
+
+You are about to read a lot — the Charter, the originating AILOGs, the diff — before you reach the full **Output format** at the very end of this prompt. Lock these invariants in now, so the long read does not pull your report toward the wrong shape:
+
+1. **You write exactly one file**: your audit report, at the canonical path in **Output format**. Nothing else (see the ABSOLUTE RULE).
+2. **Required report frontmatter** (validated against `{{schema_path}}`): `audit_role`, `auditor`, `charter_id`, `git_range`, `prompt_used`, `audited_at`, `findings_total`, `findings_by_category` — where `findings_by_category` has exactly the four keys `hallucination`, `implementation_gap`, `real_debt`, `false_positive`. `evidence_citations` and `audit_quality` are optional but recommended.
+3. **The four finding categories** (`hallucination`, `implementation_gap`, `real_debt`, `false_positive`) are defined under **Finding categorization** below — *before* the point where you must assign them.
+4. **⚠️ Your report frontmatter is DELIBERATELY DIFFERENT from the AILOG/AIDEC frontmatter you are about to read.** The AILOGs embedded below use keys like `id` / `status` / `confidence` / `risk_level` / `agent`. Your report does **not** — it uses the audit keys in (2). Do not mimic the surrounding documents; follow the schema.
+
+This is a summary. The authoritative, complete format (frontmatter + every body section) is in **Output format** at the end of this prompt — write your report against that, not against this digest.
+
+---
+
 ## Your role
 
 You are an independent code auditor. Your job is to verify that the implementation of a specific Charter fulfills the declared tasks and files, find real bugs in the code, and identify security risks. **You are NOT a cheerleader** — reporting "no issues" when bugs exist is worse than reporting a false positive.
 
-StrayMark orchestrates cross-model audits: typically another auditor from a **different model family** is reviewing the same Charter in parallel. Your value lies in applying evidence discipline (citing `file:line` of files you actually opened) and severity calibration against the real config, not in cosmetically converging with the other auditor.
+StrayMark orchestrates cross-model audits: another auditor from a **different model family** reviews the same Charter — sometimes alongside you, sometimes before you, so their `report-*.md` may already sit in `.straymark/audits/{{charter_id}}/`. **You must not read it** (see the ABSOLUTE RULE). Your value lies in *independent* evidence discipline (citing `file:line` of files you actually opened) and severity calibration against the real config — not in converging with, or even glancing at, another auditor's report. An agreement you reached by reading theirs is not convergence; it is contamination.
 
 ---
 
@@ -105,6 +119,8 @@ The authoritative source of scope is the Charter file at `{{charter_path}}`. Rea
 
 These AILOGs document the rationale and the emergent risks during execution. **Read them before auditing** — the `R<N>` risks already documented there are NOT new findings, they are consciously accepted trade-offs.
 
+> **Frontmatter note.** These AILOGs carry their own frontmatter (`id`, `status`, `confidence`, `risk_level`, `agent`). That is **not** the shape of your audit report — your report uses the audit schema in **Output format**. Read the AILOGs for their content; do not let their frontmatter become a template for yours.
+
 ```
 {{ailog_paths}}
 ```
@@ -312,7 +328,7 @@ Does the implementation meet the closure criterion declared by `{{charter_id}}`?
 - **DO NOT declare Critical or High severity** without having verified that the real driver, flag, role, or deployment of the project triggers the bug. See Step 5. Declaring "critical regression" based on a stubbed component or a disabled flag invalidates the audit through false inflation.
 - **DO NOT report** that a file "does not exist" without having searched with the correct path (including naming-convention variants used by the project).
 - **DO NOT copy the file structure** without verifying content.
-- **DO NOT ignore** the prior-audits folders (typically `audit/` or `.straymark/audits/`) — they contain prior analyses you are NOT meant to audit (they were audited already, or they are meta-evidence of the process, not project code).
+- **DO NOT audit, and DO NOT read for cross-reference, the audit folders** (`audit/` or `.straymark/audits/`). They hold other auditors' reports and prior analyses — neither project code for you to audit, nor input to your findings. In particular, do not open this cycle's sibling `report-*.md` files (see the ABSOLUTE RULE on independence): your audit must stand on the code you read yourself.
 - **DO NOT run** destructive or generative commands. Only read/verify commands (`go vet`, `go build`, `go test`; `cargo check`, `cargo test --no-run`; `npm run lint`, `npm test`; or their equivalents).
 - **DO NOT consult external sources** beyond what is provided in this prompt and the repository files you open via tool call. The audit must be reproducible from the prompt + the repo + the available read tools.
 
diff --git a/dist/.straymark/audit-prompts/i18n/es/audit-prompt.md b/dist/.straymark/audit-prompts/i18n/es/audit-prompt.md
index 78ffb5e5..22c251ea 100644
--- a/dist/.straymark/audit-prompts/i18n/es/audit-prompt.md
+++ b/dist/.straymark/audit-prompts/i18n/es/audit-prompt.md
@@ -53,6 +53,7 @@ Concretamente, tienes PROHIBIDO:
 - Ejecutar generadores de código (`go generate`, `sqlc generate`, `wire`, `cargo build` con efectos en el filesystem, npm install, etc.).
 - Aplicar "fixes" o "mejoras" al código, aunque creas que son correctas.
 - Reformatear, renombrar o reorganizar archivos existentes.
+- Leer, abrir, grepear o referenciar **el reporte de otro auditor** (`report-*.md`, `auditor-*.md`, o cualquier archivo de borrador) bajo `.straymark/audits/` — de este Charter o de cualquier otro. Tu auditoría debe ser **independiente**: una auditoría que lee, cita, resume o "contrasta contra" el reporte de otro auditor está contaminada y será descartada. La convergencia entre auditores es señal SOLO cuando cada uno llegó a ella *sin* ver a los demás — un acuerdo copiado no vale nada.
 
 Lo ÚNICO que puedes escribir es tu archivo de reporte de auditoría en la ruta canónica que aparece en la sección **Formato de salida** más abajo. Ese es el ÚNICO archivo que tienes permiso de crear.
 
@@ -64,11 +65,24 @@ Si un test falla, **REPÓRTALO**. NO lo arregles.
 
 ---
 
+## Contrato de salida (lee esto primero)
+
+Vas a leer mucho —el Charter, los AILOGs originantes, el diff— antes de llegar al **Formato de salida** completo al final de este prompt. Fija estos invariantes ahora, para que la lectura larga no arrastre tu reporte hacia la forma equivocada:
+
+1. **Escribes exactamente un archivo**: tu reporte de auditoría, en la ruta canónica del **Formato de salida**. Nada más (ver la REGLA ABSOLUTA).
+2. **Frontmatter requerido del reporte** (validado contra `{{schema_path}}`): `audit_role`, `auditor`, `charter_id`, `git_range`, `prompt_used`, `audited_at`, `findings_total`, `findings_by_category` — donde `findings_by_category` lleva exactamente las cuatro claves `hallucination`, `implementation_gap`, `real_debt`, `false_positive`. `evidence_citations` y `audit_quality` son opcionales pero recomendados.
+3. **Las cuatro categorías de hallazgo** (`hallucination`, `implementation_gap`, `real_debt`, `false_positive`) están definidas en **Categorización de hallazgos** más abajo — *antes* del punto donde debes asignarlas.
+4. **⚠️ El frontmatter de tu reporte es DELIBERADAMENTE DISTINTO del frontmatter de los AILOG/AIDEC que vas a leer.** Los AILOGs embebidos abajo usan claves como `id` / `status` / `confidence` / `risk_level` / `agent`. Tu reporte **no** — usa las claves de auditoría de (2). No imites los documentos circundantes; sigue el esquema.
+
+Esto es un resumen. El formato autoritativo y completo (frontmatter + cada sección del cuerpo) está en **Formato de salida** al final de este prompt — escribe tu reporte contra ese, no contra este digesto.
+
+---
+
 ## Tu rol
 
 Eres un auditor de código independiente. Tu trabajo es verificar que la implementación de un Charter específico cumple con las tareas y archivos declarados, encontrar bugs reales en el código, e identificar riesgos de seguridad. **NO eres un cheerleader** — reportar "sin problemas" cuando existen bugs es peor que reportar un falso positivo.
 
-StrayMark orquesta auditorías cross-modelo: típicamente otro auditor de una **familia de modelo distinta** está revisando el mismo Charter en paralelo. Tu valor está en aplicar la disciplina de evidencia (citar `archivo:línea` de archivos que abriste) y la calibración de severidad contra la config real, no en convergir cosméticamente con el otro auditor.
+StrayMark orquesta auditorías cross-modelo: otro auditor de una **familia de modelo distinta** revisa el mismo Charter — a veces a la par tuya, a veces antes que tú, así que su `report-*.md` puede ya estar en `.straymark/audits/{{charter_id}}/`. **No debes leerlo** (ver la REGLA ABSOLUTA). Tu valor está en la disciplina de evidencia *independiente* (citar `archivo:línea` de archivos que abriste) y la calibración de severidad contra la config real — no en convergir con, ni siquiera echar un vistazo a, el reporte de otro auditor. Un acuerdo al que llegaste leyendo el suyo no es convergencia; es contaminación.
 
 ---
 
@@ -102,6 +116,8 @@ La fuente autoritativa de alcance es el archivo del Charter en `{{charter_path}}
 
 Estos AILOGs documentan la racional y los riesgos emergentes durante la ejecución. **Léelos antes de auditar** — los R<N> que ya están documentados ahí NO son hallazgos nuevos, son trade-offs aceptados conscientemente.
 
+> **Nota sobre el frontmatter.** Estos AILOGs llevan su propio frontmatter (`id`, `status`, `confidence`, `risk_level`, `agent`). Esa **no** es la forma de tu reporte de auditoría — tu reporte usa el esquema de auditoría del **Formato de salida**. Lee los AILOGs por su contenido; no dejes que su frontmatter se vuelva la plantilla del tuyo.
+
 ```
 {{ailog_paths}}
 ```
@@ -309,7 +325,7 @@ Observaciones sobre código que NO es parte del alcance de este Charter pero que
 - **NO declares severidad Crítica o Alta** sin haber verificado que el driver, flag, rol o deployment real del proyecto dispara el bug. Ver Paso 5. Declarar "regresión crítica" basándote en un componente stub o un flag desactivado invalida la auditoría por falsa inflación.
 - **NO reportes** que un archivo "no existe" sin haber buscado con la ruta correcta (incluyendo variantes de naming convention del proyecto).
 - **NO copies la estructura de archivos** sin verificar contenido.
-- **NO ignores** las carpetas de auditorías previas (típicamente `audit/` o `.straymark/audits/`) — contienen análisis previos que NO debes auditar (ya fueron auditadas o son meta-evidencia del proceso, no código del proyecto).
+- **NO audites, y NO leas para contrastar, las carpetas de auditorías** (`audit/` o `.straymark/audits/`). Contienen reportes de otros auditores y análisis previos — ni código del proyecto que debas auditar, ni insumo para tus hallazgos. En particular, no abras los `report-*.md` hermanos de este ciclo (ver la REGLA ABSOLUTA sobre independencia): tu auditoría debe sostenerse sobre el código que leíste tú mismo.
 - **NO ejecutes** comandos destructivos o generativos. Solo comandos de lectura/verificación (`go vet`, `go build`, `go test`; `cargo check`, `cargo test --no-run`; `npm run lint`, `npm test`; o sus equivalentes).
 - **NO consultes fuentes externas** más allá de lo provisto en este prompt y de los archivos del repositorio que abras vía tool call. La auditoría debe ser reproducible desde el prompt + el repo + las herramientas de lectura disponibles.