Skip to content

research: improving code quality — abstraction-first agents over vibe-coded slop #357

@justrach

Description

@justrach

Context

Steve Krouse's essay "Reports of code's death are greatly exaggerated" (March 2026) nails a problem that directly applies to devswarm's agent output:

"Vibe coding gives the illusion that your vibes are precise abstractions. They will feel this way right up until they leak."

Our agents generate code. That code needs to not be slop. The essay argues AI should help produce better code through better abstractions, not just more code faster.

The problem with current agent output

Right now our agents (fixer, zig_specialist, test_writer) have a "simple-first" rule but no abstraction quality feedback loop. They fix bugs and write code, but nobody checks:

  1. Is the code well-abstracted? Or is it a pile of inline logic that works today but collapses under the next feature?
  2. Does the fix reduce complexity? Or does it add another special case to an already-tangled function?
  3. Are the right things compressed into the right abstractions? (Dijkstra: "The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.")

The Slack notification flowchart example is perfect — a complex diagram that looks irreducible until someone finds the right abstraction to compress it.

What "better code quality" means for agents

1. Abstraction-awareness in the reviewer role

The reviewer currently checks: correctness, memory safety, patterns, silent failures. Missing:

  • Abstraction quality: is this function doing one thing at one level of abstraction? Or is it mixing high-level orchestration with low-level byte manipulation?
  • Compression ratio: could this 50-line function be expressed as 10 lines with the right abstraction? (Not golf — genuine semantic compression)
  • Leaky abstractions: does this API expose implementation details that will break when internals change?

2. Refactor-awareness in the fixer role

Current rule: "simple-first, one concern per edit." But sometimes the right fix is a refactor that introduces an abstraction. The fixer needs to know when:

  • A fix that adds another if branch is worse than a fix that introduces a proper enum
  • A fix that copies 3 lines is worse than extracting a helper
  • But also: a premature abstraction over 2 use-sites is worse than duplication

The "rule of three" heuristic: duplicate once, okay. Duplicate twice, extract.

3. A new abstraction_reviewer role

Field Value
Name abstraction_reviewer
Tier opus
Writable false
Focus Find opportunities to compress complexity through better abstractions

Prompt sketch:

  • "You are an abstraction reviewer. Your job is to find code that is precise but not yet well-compressed."
  • "Look for: repeated patterns that could be a single function, data that could be a type, control flow that could be a state machine, configuration that could be a table."
  • "Do NOT suggest premature abstractions. Only flag when ≥3 instances of the same pattern exist."
  • "For each finding, show the current code and the abstracted version side by side."

4. Post-generation quality gate

After any writable agent (fixer, zig_specialist, test_writer) produces code, run a lightweight quality check:

Agent produces code
  → Compile check (already exists)
  → Test check (already exists)  
  → Abstraction check (NEW): does this diff increase or decrease cyclomatic complexity?
     If complexity increased by >20%, flag for review.

5. Precision over vibes in the orchestrator

The essay's core insight: "English specifications intuitively feel precise until you learn better from bitter experience."

This applies directly to how the orchestrator decomposes tasks. When a user says "add live collaboration," the orchestrator needs to:

  • Recognize this is deceptively complex
  • Break it into precise sub-problems (conflict resolution, operational transform, state sync)
  • Not treat "live collaboration" as a single task for one agent

This is the /office-hours pattern from gstack — challenge the premise before executing.

Connects to

Key quotes from the essay

"The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise." — Dijkstra

"AI should help us produce better code. And when we have AGI this will be easy." — Simon Willison (via Krouse)

"When we get AGI, the very first things we will use it on will be our hardest abstraction problems."

The direction is clear: agents that produce better abstractions, not just more code.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions