Why
A theme that's blowing up in May 2026 writeups: "your AI agent is one prompt update away from breaking three unrelated flows." Concrete examples: a one-line tone tweak that raised the refusal rate by 14 points, and a currency-format fix that broke vendor extraction days later.
The fix is conceptually simple: treat prompt changes like database migrations, gated by a paired before/after eval on the same examples. EvalView already does this. We just don't tell that story anywhere.
What
Add a cookbook entry under docs/ (or wherever cookbook docs live) walking a reader through:
- The pain (with one of the real examples cited above as a hook)
- The mental model: prompts are schema; changes need migrations
- The recipe with EvalView:
  - evalview snapshot before the change
  - make the prompt edit
  - evalview check after
  - what the diff tells you about which flows regressed
- Wiring it into CI so future prompt PRs auto-gate
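For the worked example, it may help to show readers what a paired before/after comparison boils down to. The sketch below is illustrative only and is not EvalView's implementation or output format: the `Result` record, the flow names, and the `gate` helper are all hypothetical. It pairs results by example ID, groups regressions by flow, and turns that into an exit code a CI job could use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One eval outcome. Hypothetical shape, not EvalView's schema."""
    example_id: str  # the same ID must appear in both runs (paired comparison)
    flow: str        # illustrative flow name, e.g. "vendor_extraction"
    passed: bool


def regressions_by_flow(before, after):
    """Return {flow: [example_ids]} for examples that passed before and fail after."""
    after_by_id = {r.example_id: r for r in after}
    regressed = defaultdict(list)
    for b in before:
        a = after_by_id.get(b.example_id)
        # A missing "after" result is treated as a regression: the pair is incomplete.
        if b.passed and (a is None or not a.passed):
            regressed[b.flow].append(b.example_id)
    return dict(regressed)


def gate(before, after):
    """Return a CI-style exit code: 0 if nothing regressed, 1 otherwise."""
    regressed = regressions_by_flow(before, after)
    for flow, ids in sorted(regressed.items()):
        print(f"REGRESSED {flow}: {', '.join(ids)}")
    return 1 if regressed else 0


# Made-up data: the prompt edit fixes ex3 but breaks ex2 in the same flow.
before = [
    Result("ex1", "refunds", True),
    Result("ex2", "vendor_extraction", True),
    Result("ex3", "vendor_extraction", False),
]
after = [
    Result("ex1", "refunds", True),
    Result("ex2", "vendor_extraction", False),
    Result("ex3", "vendor_extraction", True),
]

exit_code = gate(before, after)
```

The point of pairing by example ID rather than comparing aggregate pass rates: in the made-up data above the pass rate is 2/3 both before and after, so an aggregate check would miss that ex2 regressed.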
Acceptance criteria
- New cookbook entry added under docs/ (e.g. docs/cookbook/prompt-as-migration.md)
Hints
- Reference posts for tone/context (don't lift content):
  - AscentCore: "Why Your AI Agents Are One Update Away from Breaking" (May 4, 2026)
  - The GitHub Blog post on validating non-deterministic agent behavior
- Length target: 800-1200 words. Long enough to be useful, short enough to read in one sitting.
Size
~2-3 hours including the worked example.