State Triage

Deterministic reasoning around one capable agent.

A component of QA Veritas — an exploration of how AI agents reason about, verify, and operate complex systems.

Problem

The common pattern for "AI triage" is to wire a model directly to a pile of logs and hope. It fails in two predictable ways. The model miscounts — it reports "3 failures" when there were 5, because it eyeballed a wall of text instead of parsing it. And it chases the loudest error instead of the first one, so the conclusion is a downstream symptom, not the cause. A root-cause analysis you can't reproduce, built on numbers a model guessed, is not analysis. It's a confident narrative.

Core Idea

Split the investigation at the line between facts and judgment. Cheap, deterministic code establishes what happened — exact counts, statuses, the first error signature, the phase. Only then does a single capable agent reason about why. The model never re-derives what a script can compute reliably, and every finding is written into a typed, file-locked state document. The report is generated from that state, so it's reproducible: same state, same brief, every time.

The agent never counts. The counts are already in state.json.
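As a rough illustration of the deterministic half, the sketch below parses a log into exact counts and the first error in file order. The log line format, the `parse_facts` name, and the sample lines are assumptions made for this example, not the project's actual parser.

```python
import re
from collections import Counter

# Assumed line format: "<timestamp> <LEVEL> <message>". Real logs will differ.
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")

def parse_facts(log_text: str) -> dict:
    """Return exact per-level counts and the first ERROR line, in file order."""
    counts = Counter()
    first_error = None
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        m = LINE_RE.match(line)
        if m is None:
            continue  # unparsed lines are skipped, never guessed at
        counts[m["level"]] += 1
        if first_error is None and m["level"] == "ERROR":
            first_error = {"line": lineno, "message": m["msg"]}
    return {"counts": dict(counts), "first_error": first_error}

# Illustrative sample: the first error is the cause, the later ones are fallout.
sample = """\
2024-01-01T00:00:01Z INFO starting suite
2024-01-01T00:00:02Z ERROR write rejected (index read-only)
2024-01-01T00:00:03Z ERROR assertion failed: expected 200, got 403
2024-01-01T00:00:04Z ERROR teardown failed
"""
facts = parse_facts(sample)
print(facts["counts"])
print(facts["first_error"])
```

Everything the agent later reasons about comes out of a function like this one, so the numbers in the brief can be re-derived by re-running the parse.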

Architecture Diagram

flowchart LR
    R[result + logs] --> P[Parse facts<br/><i>deterministic</i>]
    P --> C[Classify<br/>PASS/HANG/TEST-BUG/INFRA/<br/>FRAMEWORK/PRODUCT]
    C --> PL[Plan<br/>cheapest-first +<br/>stop conditions]
    PL --> I[Agent investigates<br/><i>judgment</i>]
    I --> S[Synthesize<br/>root-cause brief]
    P -.-> ST[(state.json<br/>typed, file-locked)]
    C -.-> ST
    PL -.-> ST
    I -.-> ST
    ST --> S
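The Plan node in the diagram (cheapest-first, with stop conditions) could be sketched roughly as follows. The `Step` type, the step catalogue, the costs, and the stop conditions are all hypothetical and only show the ordering idea, including how a plan might narrow to logs-only when live access is missing.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    cost: int          # relative cost; cheaper steps run first
    needs_live: bool   # True if the step needs live system access
    stop_when: str     # condition under which the investigation can end here

# Hypothetical catalogue: names, costs, and stop conditions are illustrative.
CATALOGUE = [
    Step("query live system health", 5, True,
         "system reports a blocking fault"),
    Step("read context around first error", 1, False,
         "signature matches a known cause"),
    Step("diff log against last passing run", 3, False,
         "a single divergent line explains the failure"),
]

def build_plan(live_access: bool) -> list:
    """Order usable steps cheapest-first; drop live-only steps without access."""
    usable = [s for s in CATALOGUE if live_access or not s.needs_live]
    return sorted(usable, key=lambda s: s.cost)

logs_only = build_plan(live_access=False)
for step in logs_only:
    print(f"{step.cost}: {step.name} (stop when: {step.stop_when})")
```

With `live_access=False` the plan simply gets shorter instead of raising, which is one plausible reading of the logs-only fallback.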

Concepts

Determinism around nondeterminism — a regex where a regex belongs; the model saved for judgment, never arithmetic.
First error, not loudest — causes precede cascades; the pipeline anchors on the first signal.
Failure taxonomy — PASS / HANG / TEST-BUG / INFRA / FRAMEWORK / PRODUCT, because a label is only useful if it changes what you do next.
State as the source of truth — typed sections (deterministic_findings, evidence_sources, triage, log_investigation, root_cause) under a file lock; the brief is a render of state, not a fresh model call.
Graceful degradation — no live access? The plan scopes itself to logs-only instead of failing.
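One way to picture "state as the source of truth" is the sketch below: section writes go through a lock, and the brief is a pure render of what is on disk. The section names come from the list above; the lock-file mechanism, the function names, and the flat brief format are assumptions for illustration and may not match how the project locks or renders state.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager

SECTIONS = ("deterministic_findings", "evidence_sources", "triage",
            "log_investigation", "root_cause")

@contextmanager
def file_lock(path: str, timeout: float = 5.0):
    """Advisory lock via an exclusive-create lock file (assumed mechanism)."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not lock {path}")
            time.sleep(0.05)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)

def update_section(path: str, section: str, value: dict) -> None:
    """Replace one known section of the state file while holding the lock."""
    if section not in SECTIONS:
        raise KeyError(f"unknown section: {section}")
    with file_lock(path):
        state = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                state = json.load(f)
        state[section] = value
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(state, f, indent=2, sort_keys=True)
        os.replace(tmp, path)  # atomic swap so readers never see a partial file

def render_brief(path: str) -> str:
    """Render the brief from state only; no model call, so output is repeatable."""
    with open(path, encoding="utf-8") as f:
        state = json.load(f)
    lines = []
    for section in SECTIONS:
        for key, val in sorted(state.get(section, {}).items()):
            lines.append(f"{section}.{key}: {val}")
    return "\n".join(lines)

with tempfile.TemporaryDirectory() as d:
    state_path = os.path.join(d, "state.json")
    update_section(state_path, "deterministic_findings", {"failed": 5})
    update_section(state_path, "triage", {"class": "INFRA"})
    first = render_brief(state_path)
    second = render_brief(state_path)
    print(first)
```

Rendering twice from the same file gives the same text, which is the property the "same state, same brief" claim depends on.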

Examples

Classification grounded in a parsed signature, not a vibe:

$ python -m statetriage classify --result examples/result.json
class:      INFRA
confidence: high
first_error: write rejected (FORBIDDEN/8/index read-only)
rationale:  Signature matches a storage flood-stage guard; the test
            asserted nothing — the environment rejected the write.
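A classifier of this kind can be approximated with a small table of signatures. The sketch below is a guess at the shape, not the project's classifier: the patterns, the rationale strings, and the choice to return no label when nothing matches are assumptions.

```python
import re
from typing import NamedTuple, Optional

class Signature(NamedTuple):
    pattern: re.Pattern
    label: str
    rationale: str

# Hypothetical signature table; patterns and labels are illustrative only.
SIGNATURES = [
    Signature(re.compile(r"FORBIDDEN/8/index read-only"), "INFRA",
              "storage guard rejected the write"),
    Signature(re.compile(r"timed out after \d+s"), "HANG",
              "run exceeded its time limit"),
]

def classify(first_error: Optional[str]) -> dict:
    """Label a run from its first error signature; never from free-text guessing."""
    if first_error is None:
        return {"class": "PASS", "confidence": "high",
                "rationale": "no error signature in the parsed facts"}
    for sig in SIGNATURES:
        if sig.pattern.search(first_error):
            return {"class": sig.label, "confidence": "high",
                    "rationale": sig.rationale}
    # Assumption: an unmatched signature stays unlabeled for the agent to judge.
    return {"class": None, "confidence": "low",
            "rationale": "no known signature matched"}

result = classify("write rejected (FORBIDDEN/8/index read-only)")
print(result["class"], result["confidence"])
```

Because the table is plain data, the same lookup would work unchanged if the signatures were loaded from a file.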

Quick Start

pip install -e .          # optional; python -m statetriage --help also works from the repo root

# Full pipeline: parse → classify → plan → write state → brief
python -m statetriage run --result examples/result.json --log examples/run.log --state out/state.json

python -m statetriage classify --result examples/result.json   # just the label
python -m statetriage brief --state out/state.json             # render brief from state

Requires Python 3.10+. No third-party runtime dependencies.

Why It Matters

For engineers: triage stops being a talent that lives in two senior people and becomes a repeatable system. The brief is defensible in a review because every number traces to a parse, not a guess.

For AI agents: this is the pattern for letting a model do the part it's good at (reasoning over evidence) while fencing off the part it's bad at (counting, exact recall). The typed state file also makes the investigation composable — another tool can read root_cause and act on it.
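To make the composability point concrete, here is a minimal sketch of a downstream tool that reads `root_cause` from the state and picks a next step. The `class` key inside `root_cause`, the routing table, and the fallback action are assumptions; the real section schema may differ.

```python
# Hypothetical routing table: the actions are illustrative, not from the project.
NEXT_ACTION = {
    "INFRA": "notify the environment owner; do not file a product bug",
    "PRODUCT": "file a product bug with the first-error evidence",
    "HANG": "collect diagnostics before retrying",
}

def next_action(state: dict) -> str:
    """Choose a follow-up from the root_cause section of an already-loaded state."""
    root = state.get("root_cause") or {}
    label = root.get("class")  # assumed key name
    return NEXT_ACTION.get(label, "escalate to a human reviewer")

# Example state, shaped like a parsed state.json under the assumptions above.
example_state = {"root_cause": {"class": "INFRA", "summary": "index read-only"}}
print(next_action(example_state))
```

The consumer never re-reads the logs; it acts only on what the investigation already wrote down.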

Future Vision

A reference agent runner that consumes the plan and fills the log_investigation / root_cause sections.
A prior-incident matcher keyed by error signature — search memory before investigating fresh.
Signature packs loaded from YAML, so the classifier is data, not code.
A replay mode that re-runs synthesis over historical state to measure classifier drift.

Part of QA Veritas

QA Veritas explores AI-Native Verification Engineering — practical patterns for a future where humans and AI agents operate complex systems together. Every component serves one loop:

Memory → Reasoning → Verification → Action

QA Veritas
├── Resource Ledger                    Memory       operational truth as a git tree
├── State Triage      ◀ you are here   Reasoning    deterministic triage around an agent
├── LogLens                            Reasoning    code-aware evidence from logs
├── Intent Verify                      Verification declarative intent → observable proof
├── Runbook Forge                      Runbooks     procedures derived from verified history
├── SkillPack                          Skills       progressive-disclosure agent capability
└── Future Agents                      Agents       narrow operators that compose the above

Layer         Component
Memory        Resource Ledger
Reasoning     State Triage (this repo) · LogLens
Verification  Intent Verify
Runbooks      Runbook Forge
Skills        SkillPack
Writing       Field notes & essays
Start at the platform overview. MIT licensed.
