docs: link experiments archive + add analysis charts #14

Merged
wei9072 merged 1 commit into main from docs/experiments-readme-charts on May 7, 2026

Conversation

wei9072 (Owner) commented on May 7, 2026

Summary

Adds a hyperlink to `experiments/` from the main READMEs (EN +
ZH) and turns the experiments archive's own README into an
analysis dashboard with four charts.

Charts added to `experiments/README.md`

  1. Overview — 52 dirs / 26 paired / 11 models, broken down by
    task shape (Plan A / B / C / Round 9 / initial)
  2. Plan B brownfield fix-rate matrix — for every (model,
    variant) cell, how many of the 3 planted SEC bugs remain.
    Computed from the actual archived files against the current
    rule library, so the numbers are reproducible. The headline:
    GPT-5.2 / GPT-5.3-codex / GPT-5.4-mini all show 3 → 0 (perfect
    fix) when paired with Aegis.
  3. Plan C multi-module completion table — surfaces the
    anti-paralysis ROI: the GPT-5.4-mini A variant abandoned the task
    (no `notifications.py`, no `tests.py`, 24k tokens spent on
    design proposals), while the B variant of the same model completed it.
    Cycle introductions: 0/14.
  4. Three Aegis ROI mechanisms as a Mermaid flowchart, plus a
    table marking which mechanisms were designed-in vs emergent.

Plus a direct lineage ASCII chart connecting each experiment
finding to the PR that fixed it (Round 8 codex → PR #9; Round 9
Go/Java → PR #11 / #12).

Repo housekeeping

The previous `experiments` PR accidentally committed
`31flashlite-amb-b` as a gitlink (submodule pointer) because
codex had created a nested `.git/` inside that round directory
during the agent run. This PR removes the nested `.git/` and
re-imports the 6 actual deliverable files as plain content.
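
For anyone who wants to check for similar accidents elsewhere in the archive, here is a minimal sketch of one way to spot gitlink entries. It relies on the fact that `git ls-files -s` prints the index mode first and that gitlinks use mode `160000`. The helper names `parse_gitlinks` and `find_gitlinks` are hypothetical and are not part of this repository.

```python
import subprocess

# Mode git records in the index for a gitlink (submodule pointer).
GITLINK_MODE = "160000"


def parse_gitlinks(ls_files_stage_output: str) -> list[str]:
    """Return the paths that `git ls-files -s` reports as gitlinks.

    Each output line has the form "<mode> <object> <stage>\t<path>".
    """
    gitlinks = []
    for line in ls_files_stage_output.splitlines():
        if not line.strip():
            continue
        meta, _, path = line.partition("\t")
        mode = meta.split()[0]
        if mode == GITLINK_MODE:
            gitlinks.append(path)
    return gitlinks


def find_gitlinks(repo_dir: str = ".", pathspec: str = "experiments") -> list[str]:
    """Run `git ls-files -s` under `pathspec` and return any gitlink paths."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-s", "--", pathspec],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gitlinks(out)
```

Calling `find_gitlinks()` from the repository root should return an empty list once the round directory has been re-imported as plain files. Paths containing unusual characters may appear quoted in git's output, which this sketch does not handle.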

Test plan

  • Main README links resolve to `experiments/README.md`
  • `git ls-files experiments/31flashlite-amb-b` returns 6 files
    (was 1 stale gitlink entry before this PR)
  • Plan B chart numbers reproduced via:
    `for d in *-bf-?; do python3 aegis_validate.py "$d/auth.py" | grep -c security; done`
  • Mermaid renders on GitHub
  • CI green on push
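
The Plan B reproduction loop above can also be sketched in Python, which makes it easier to collect the per-directory counts into one table. This mirrors the shell one-liner and assumes, as that command does, that `aegis_validate.py` prints one line containing the word `security` per finding. The output format is not documented here, so treat that as an assumption. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def count_matching_lines(text: str, needle: str = "security") -> int:
    """Count lines containing `needle`, mirroring `grep -c` (case-sensitive)."""
    return sum(1 for line in text.splitlines() if needle in line)


def plan_b_counts(root: str = ".", validator: str = "aegis_validate.py") -> dict[str, int]:
    """Run the validator on each `*-bf-?` directory's auth.py and count hits."""
    counts = {}
    for round_dir in sorted(Path(root).glob("*-bf-?")):
        if not round_dir.is_dir():
            continue
        # No check=True: the validator may exit non-zero when it reports
        # findings (an assumption), and only stdout is counted, as in the
        # original pipeline.
        result = subprocess.run(
            ["python3", validator, str(round_dir / "auth.py")],
            capture_output=True,
            text=True,
        )
        counts[round_dir.name] = count_matching_lines(result.stdout)
    return counts
```

Run from the `experiments/` directory, `plan_b_counts()` would return a mapping from round directory name to remaining-bug count, which can then be compared against the chart.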

🤖 Generated with Claude Code

- Main README.md and README.zh-TW.md: new "Experiments archive" /
  「實驗資料」 section linking to experiments/README.md
- experiments/README.md: replaces the prose-only intro with four
  analysis charts:
    1. Round structure overview (52 dirs, 26 paired, 11 models)
    2. Plan B brownfield 0/3 → 3/3 fix-rate matrix with per-model
       remaining-bug counts (real numbers from re-validating the
       archive against current rule library)
    3. Plan C multi-module task-completion matrix surfacing the
       anti-paralysis ROI mechanism (g54mini-mc-a abandoned vs
       g54mini-mc-b completed same task)
    4. Mermaid flowchart of the three Aegis ROI mechanisms
       (rule-hit → fix; structural guardrail; anti-paralysis ritual)
- Plus a "Direct lineage" chart connecting each experiment finding
  to the PR that fixed it (Round 8 → PR #9; Round 9 → PR #11/#12).

Also re-imports `experiments/31flashlite-amb-b/` as plain files —
the previous commit accidentally captured it as a gitlink because
codex had created a nested `.git/` inside the round dir during the
agent run. Removed the nested `.git/` and re-added the 6 actual
deliverable files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
wei9072 merged commit 9ce0863 into main on May 7, 2026 (1 check passed)
wei9072 deleted the docs/experiments-readme-charts branch on May 7, 2026 08:16
wei9072 added a commit that referenced this pull request on May 7, 2026
Three changes following user feedback:

1. **Move the experiments link to the top** of both main READMEs.
   Previously it sat between Status and License at the bottom — most
   readers never reached it. Now it appears immediately after the
   "if you are an AI agent" pointer block, with a 📊 emoji marker.
2. **Add `experiments/README.zh-TW.md`** — full Chinese translation
   of the experiments analysis dashboard (overview chart, Plan B
   fix-rate matrix, Plan C task-completion matrix, three ROI
   mechanisms Mermaid, direct lineage). All four charts and all
   tables translated.
3. **Add a language switcher** (`English · 繁體中文`) to both
   experiments README files so each links to the other.

The Chinese main README (README.zh-TW.md) was already updated in
PR #14 with an experiments section; this commit just relocates
that link to the top and points it at the new zh-TW counterpart.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>