docs(adr): add ADR 0049 semantic observability and improvement loop #2423
Defines derived signals, read-first observer, shadow interventions, and lesson extraction on top of JSONL traces, with hybrid OTel-backend export.

Signed-off-by: Hofni Gartner <hgartner@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
E2E tests did not run
E2E tests run automatically for org/repo members and collaborators on pull requests. For other contributors, a maintainer must add the […] See E2E testing guide for details.

🤖 Finished Review · ✅ Success · Started 2:15 PM UTC · Completed 2:27 PM UTC
| @@ -0,0 +1,178 @@ | |||
| --- | |||
[critical] This collides with #1489, which also claims 0049. That PR is closer to merging (has reviews, ready-for-merge label), so I think this one should take the next available number.
There's a renumber-adr skill in the repo that can help — it checks for collisions against the target branch and renumbers automatically.
| ### Option C: Hybrid (recommended) | ||
| Export traces to an OTel-compatible backend as the primary query surface. |
[important] This ADR assumes OTel-compatible traces exist but doesn't reference the companion ADR that decides how they're produced (#1489 — distributed tracing instrumentation). I think we should be explicit about that dependency here, since derived signals and the observer both consume what the tracing layer produces.
Might make sense to wait for #1489 to land and then reference it by number.
| - Harness layout for signal artifacts, shadow logs, and lesson files | ||
| ([ADR 0024](0024-harness-definitions.md)) | ||
| - Per-SIG dashboards and aggregation on trace scores and tags | ||
| - Observer write tools and action allowlist policy |
[non-blocking] Per AGENTS.md, accepted ADRs should update docs/architecture.md and related problem docs in the same PR. Not blocking on it, but worth adding before merge.
| the same schema for offline retro when no backend is configured. | ||
| JSONL remains the forensic source ([ADR 0021](0021-jsonl-reasoning-trace-exposure.md)). | ||
| Derived signals are enrichments, not a replacement. |
[low] design-document-alignment
Section 2 says implementation details are 'deferred to ADR 0024 or a follow-on ADR.' ADR 0024 is already accepted and does not mention observer stages, signal artifacts, shadow logs, or lesson files. The phrasing implies ADR 0024 might already cover these, when in reality only a follow-on ADR would.
| deferred to a follow-on decision. See | ||
| [operational-observability.md](../problems/operational-observability.md). | ||
| ## Decision |
[low] capitalization-consistency
'Rollout order' (sentence case) vs 'Non-goals' (title case) shows minor inconsistency in subsection header capitalization within the Decision section.
Summary
Proposes ADR 0049: Semantic observability and improvement loop — a layered model on top of ADR 0021 JSONL traces:
`delivered: false` by default)

Builds on operational-observability.md and testing-agents.md. Takes inspiration from The Darwin Project for separating raw traces from semantic signals, without adopting that runtime.
Relationship to retro (#131): Retro remains the workflow-level improvement agent (issues + PR comments). This ADR adds the plumbing — cheaper inputs for retro (signals vs full JSONL) and a path from findings → regression tests. Implementation is deferred.
Why now
JSONL gives per-run forensics but is expensive to scan at factory scale. Generic trace backends capture tool spans but not fullsend-specific patterns. This ADR records the what/why before implementation ADRs (trace export, harness layout, observer wiring).
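To make the "signals vs full JSONL" distinction concrete, here is a minimal sketch of folding a raw trace into a small derived-signal record. It is illustrative only: the event fields (`type`, `tool`, `status`), the signal names, and the repeat threshold are hypothetical and are not taken from ADR 0021 or from this ADR.

```python
import json
from collections import Counter


def derive_signals(jsonl_lines, repeat_threshold=3):
    """Fold raw trace events into a small summary that is cheap to scan.

    Assumption: each line is a JSON object with hypothetical fields
    "type", "tool", and "status". The real trace schema may differ.
    """
    tool_calls = Counter()
    error_count = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the trace file
        event = json.loads(line)
        if event.get("type") == "tool_call":
            tool_calls[event.get("tool", "unknown")] += 1
        if event.get("status") == "error":
            error_count += 1
    # Hypothetical signal: tools invoked at least `repeat_threshold` times.
    repeated = sorted(t for t, n in tool_calls.items() if n >= repeat_threshold)
    return {
        "tool_call_count": sum(tool_calls.values()),
        "error_count": error_count,
        "repeated_tools": repeated,
    }
```

A consumer such as retro could then read one summary record per run and open the full JSONL only for runs whose signals look interesting, which keeps the raw trace as the forensic source.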
What this PR does not include
Options considered
Documented in the ADR: artifact-only (A), backend-only (B), hybrid (C, preferred).