Add the agentic benchmark chart (metrics + safety) #160

Merged
DietrichGebert merged 1 commit into main from bench/agentic-chart
Jun 18, 2026

@DietrichGebert

Owner

Adds the chart for the #158 benchmark. #158 was squash-merged before the chart commits landed, so this PR brings the chart onto main.

A grouped bar chart in the README's Numbers section:

  • LOC, tokens, cost and time, each as a % of the no-skill baseline (baseline = 100%, lower is leaner / cheaper / faster). ponytail is lowest on all four (LOC 46%, tokens 78%, cost 80%, time 73%); caveman rises above 100% on tokens, cost and time.
  • A separate safety strip (the 6-task adversarial tier, higher is safer): baseline, caveman and ponytail 100%, yagni-oneliner 95%.
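The "% of the no-skill baseline" normalization used for the four metric groups can be sketched as below. This is a minimal illustration, not the script that produced the chart: the function name `percent_of_baseline` and the raw LOC counts are hypothetical, chosen only so the ratio matches the 46% quoted for ponytail.

```python
def percent_of_baseline(value: float, baseline: float) -> float:
    """Return value as a percentage of baseline (the baseline itself maps to 100.0)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * value / baseline


# Hypothetical raw LOC counts (not from the benchmark); they are picked
# only so that the ratio reproduces the 46% figure quoted for ponytail.
baseline_loc = 200
ponytail_loc = 92

print(round(percent_of_baseline(ponytail_loc, baseline_loc)))  # 46
```

A value below 100 means the run was leaner, cheaper or faster than the baseline on that metric; a value above 100 means it used more.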

Same #8b949e system-gray palette as the existing chart so it reads on both GitHub light and dark themes.

🤖 Generated with Claude Code

Grouped bars of LOC, tokens, cost and time as a % of the no-skill baseline
(lower is leaner/cheaper/faster), plus a separate safety strip (baseline,
caveman and ponytail 100%; yagni-oneliner 95%). System-gray palette so it reads
on both GitHub themes. The chart commits landed after #158 had already
squash-merged, so this brings the chart onto main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
DietrichGebert merged commit 8d5037d into main on Jun 18, 2026
1 check passed