research: improving code quality — abstraction-first agents over vibe-coded slop

## Context

Steve Krouse's essay ["Reports of code's death are greatly exaggerated"](https://blog.val.town/blog/reports-of-codes-death/) (March 2026) nails a problem that directly applies to devswarm's agent output:

> "Vibe coding gives the illusion that your vibes are precise abstractions. They will feel this way right up until they leak."

Our agents generate code. That code needs to not be slop. The essay argues AI should help produce **better** code through better abstractions, not just more code faster.

## The problem with current agent output

Right now our agents (fixer, zig_specialist, test_writer) have a "simple-first" rule but no **abstraction quality** feedback loop. They fix bugs and write code, but nobody checks:

1. **Is the code well-abstracted?** Or is it a pile of inline logic that works today but collapses under the next feature?
2. **Does the fix reduce complexity?** Or does it add another special case to an already-tangled function?
3. **Are the right things compressed into the right abstractions?** (Dijkstra: "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.")

The essay's Slack notification flowchart example illustrates the third point: a complex diagram looks irreducible until someone finds the right abstraction to compress it.

## What "better code quality" means for agents

### 1. Abstraction-awareness in the reviewer role

The reviewer currently checks: correctness, memory safety, patterns, silent failures. Missing:

- **Abstraction quality**: is this function doing one thing at one level of abstraction? Or is it mixing high-level orchestration with low-level byte manipulation?
- **Compression ratio**: could this 50-line function be expressed as 10 lines with the right abstraction? (Not golf — genuine semantic compression)
- **Leaky abstractions**: does this API expose implementation details that will break when internals change?
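To make the "one level of abstraction" check concrete, here is a minimal sketch in Python. The wire format (1-byte kind, 2-byte big-endian length, then payload) and every name in it are hypothetical, chosen only for illustration. The first function mixes routing with byte offsets; the second pair separates them.

```python
import struct
from dataclasses import dataclass


def handle_frame_mixed(raw: bytes) -> str:
    # Mixed levels: wire-layout offsets and routing live in one function.
    kind = raw[0]
    length = struct.unpack(">H", raw[1:3])[0]
    payload = raw[3:3 + length]
    if kind == 1:
        return "ping:" + payload.decode()
    return "data:" + payload.decode()


@dataclass
class Frame:
    """Hypothetical parsed frame; not a devswarm type."""
    kind: int
    payload: bytes


def parse_frame(raw: bytes) -> Frame:
    """Low level: the only place that knows the wire layout."""
    kind, length = struct.unpack(">BH", raw[:3])
    return Frame(kind, raw[3:3 + length])


def handle_frame(raw: bytes) -> str:
    """High level: routes on the parsed frame and never touches offsets."""
    frame = parse_frame(raw)
    label = "ping" if frame.kind == 1 else "data"
    return f"{label}:{frame.payload.decode()}"
```

Both versions behave the same. The difference a reviewer would flag is that a change to the wire layout touches only `parse_frame` in the second version.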

### 2. Refactor-awareness in the fixer role

Current rule: "simple-first, one concern per edit." But sometimes the right fix is a refactor that introduces an abstraction. The fixer needs to know when:

- A fix that adds another `if` branch is worse than a fix that introduces a proper enum
- A fix that copies 3 lines is worse than extracting a helper
- But also: a premature abstraction over 2 use-sites is worse than duplication

The "rule of three" heuristic applies: the first duplicate (two instances) is acceptable; at the second duplicate (three instances), extract.
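As a hedged illustration of the `if`-branch versus enum trade-off, the sketch below contrasts the two styles in Python. The severity levels and exit codes are hypothetical and not taken from devswarm.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical severity levels, for illustration only."""
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


def exit_code_branchy(severity: str) -> int:
    # Each new case means another branch, and a typo silently falls through.
    if severity == "info":
        return 0
    if severity == "warning":
        return 0
    if severity == "error":
        return 1
    return 0


# With an enum, the mapping is data: adding a level is one table entry.
EXIT_CODES = {
    Severity.INFO: 0,
    Severity.WARNING: 0,
    Severity.ERROR: 1,
}


def exit_code(severity: Severity) -> int:
    # A level missing from the table raises KeyError instead of passing silently.
    return EXIT_CODES[severity]
```

A misspelled string such as `"eror"` returns 0 from the branchy version, while `Severity("eror")` raises `ValueError` before the lookup is ever reached.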

### 3. A new `abstraction_reviewer` role

| Field | Value |
|---|---|
| Name | `abstraction_reviewer` |
| Tier | opus |
| Writable | false |
| Focus | Find opportunities to compress complexity through better abstractions |

Prompt sketch:
- "You are an abstraction reviewer. Your job is to find code that is precise but not yet well-compressed."
- "Look for: repeated patterns that could be a single function, data that could be a type, control flow that could be a state machine, configuration that could be a table."
- "Do NOT suggest premature abstractions. Only flag when ≥3 instances of the same pattern exist."
- "For each finding, show the current code and the abstracted version side by side."

### 4. Post-generation quality gate

After any writable agent (fixer, zig_specialist, test_writer) produces code, run a lightweight quality check:

```
Agent produces code
  → Compile check (already exists)
  → Test check (already exists)
  → Abstraction check (NEW): does this diff increase or decrease cyclomatic complexity?
     If complexity increased by >20%, flag for review.
```
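A minimal sketch of the abstraction check, assuming an approximate complexity count is good enough for a first gate. It analyzes Python source with the standard `ast` module only for illustration; the names `complexity` and `needs_review` are hypothetical, and a real gate would need a parser for whatever language the agent emitted.

```python
import ast

# Node types that add a decision point. This approximates cyclomatic
# complexity; it is not a complete implementation of the metric.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                 ast.ExceptHandler, ast.comprehension)


def complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def needs_review(before: str, after: str, threshold: float = 0.20) -> bool:
    """Flag the change when complexity grows by more than `threshold`."""
    old, new = complexity(before), complexity(after)
    # `old` is always at least 1, so the division is safe.
    return (new - old) / old > threshold
```

Comparing whole-file scores before and after the edit keeps the gate cheap. A per-function delta would be more precise but needs function matching across the diff.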

### 5. Precision over vibes in the orchestrator

The essay's core insight: "English specifications intuitively feel precise until you learn better from bitter experience."

This applies directly to how the orchestrator decomposes tasks. When a user says "add live collaboration," the orchestrator needs to:
- Recognize this is deceptively complex
- Break it into precise sub-problems (conflict resolution, operational transform, state sync)
- Not treat "live collaboration" as a single task for one agent

This is the `/office-hours` pattern from gstack — challenge the premise before executing.
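The shape of a decomposed request can be sketched as a small task tree. The `Task` type and `assignable` helper are hypothetical and not devswarm's actual schema; deciding that a request needs decomposition is still left to the orchestrator, and the sketch only shows what it would hand out afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical orchestrator task, for illustration only."""
    title: str
    subtasks: list["Task"] = field(default_factory=list)


def assignable(task: Task) -> list[Task]:
    """Return only leaf tasks, so a decomposed request is never handed out whole."""
    if not task.subtasks:
        return [task]
    leaves: list[Task] = []
    for sub in task.subtasks:
        leaves.extend(assignable(sub))
    return leaves


collab = Task("add live collaboration", [
    Task("conflict resolution"),
    Task("operational transform"),
    Task("state sync"),
])
```

Here `assignable(collab)` yields the three sub-problems rather than the original one-line request.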

## Connects to

- #352 (role prompts — add abstraction review to reviewer checklist)
- #356 (skill ingestion — `abstraction_reviewer` could be a `.md` skill file)
- #354 (eval framework — code quality metrics: cyclomatic complexity delta, abstraction density)

## Key quotes from the essay

> "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise." — Dijkstra

> "AI should help us produce better code. And when we have AGI this will be easy." — Simon Willison (via Krouse)

> "When we get AGI, the very first things we will use it on will be our hardest abstraction problems."

The direction is clear: agents that produce better abstractions, not just more code.

research: improving code quality — abstraction-first agents over vibe-coded slop #357
