feat(roadmap-planner): DORA Lead Time 3-stage attribution (Phase 2 backend) #197
Open
yhuan123 wants to merge 12 commits into
Plan: docs/plans/2026-05-21-dora-optimization-plan.md. Upgrades Lead Time for Changes from a single number to Dev → Review → Release 3-stage attribution, with bottleneck highlighting and a separate 9-month trend panel (sparkline + MoM/QoQ chips). DORA-strict: first commit → release, no backlog. Phase 2 backend half (schema + sync + calculator + API + tests). Frontend follows in a separate PR. The other 4 DORA metrics — Deployment Frequency, Change Failure Rate, Mean Time to Recovery, Cycle Time — calculator + UI are untouched.
added 11 commits
May 21, 2026 23:23
…P2-2]
Review found that the new componentVersionRE only accepted
`{component}-v{semver}`, so versions named `argo-cd-2.9.0` (no v
prefix) were labelled with the full version string and a dashboard
filter for `argo-cd` returned no Lead Time data while other metrics
still grouped by EnrichedRelease.Component.
matchReleasedVersion now returns the matched EnrichedRelease so the
caller reuses the collector-parsed Component field — same source as
release_frequency and friends. The componentVersionRE +
componentFromVersionName helper are removed.
…alize [P1-2]
Review found that a Jira version releaseDate is only a calendar date, parsed at midnight, so any PR merged later on the same release day was dropped by filterPreReleasePRs as a hotfix. That lost the last PR on release day for many shipped issues and skewed Lead Time low.
matchReleasedVersion now snaps the returned T3 to end-of-day. A PR merged at 14:00 on release day stays inside the window; the Release stage end-time is still that calendar day.
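A minimal sketch of the end-of-day snap described in this commit, not the actual `matchReleasedVersion` code. The helper names `endOfDay`, `mergedBeforeRelease`, and `mustParse` are hypothetical, and snapping to 23:59:59 in the timestamp's own location is an assumption about what "end-of-day" means here.

```go
package main

import (
	"fmt"
	"time"
)

// endOfDay snaps a calendar-date timestamp (parsed at midnight) to the last
// second of the same day, in the timestamp's own location (assumed behaviour).
func endOfDay(t time.Time) time.Time {
	y, m, d := t.Date()
	return time.Date(y, m, d, 23, 59, 59, 0, t.Location())
}

// mergedBeforeRelease reports whether a PR merge time is at or before the
// snapped release time, i.e. whether the PR stays inside the release window.
func mergedBeforeRelease(mergedAt, releaseDate time.Time) bool {
	return !mergedAt.After(endOfDay(releaseDate))
}

// mustParse is a demo helper that parses an RFC 3339 timestamp or panics.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	release := mustParse("2026-05-20T00:00:00Z") // Jira releaseDate at midnight
	merged := mustParse("2026-05-20T14:00:00Z")  // PR merged on release day
	fmt.Println(mergedBeforeRelease(merged, release)) // true
}
```

Without the snap, the same comparison against midnight would classify the 14:00 merge as post-release.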
…t [P1-1]
Review found that switching MetricResult.Value/Unit to hours broke existing consumers: MetricCard / MetricBreakdown hard-code days and DORA day thresholds, and PrometheusExporter publishes lead_time_days without conversion. A 30-day Lead Time (720h) would render as "720 days" and trip Low-band classification at 60+ days.
Value is now p50_hours / 24 (days), Unit is "days". The hour-precision values stay in Metadata (total.p50_hours, stages[].p50_hours, trend.points[].p50_hours) for the Phase 1 frontend that needs them. Plan §8 documents the dual encoding.
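The dual encoding can be sketched roughly as below. `MetricResult` here is a hypothetical stand-in for the project's type, `buildLeadTimeResult` is an illustrative name, and the nested-map layout of `Metadata` is an assumption; only the `p50_hours / 24` conversion and the `"days"` unit come from the commit message.

```go
package main

import "fmt"

// MetricResult is a hypothetical stand-in for the project's result type.
type MetricResult struct {
	Value    float64
	Unit     string
	Metadata map[string]any
}

// buildLeadTimeResult keeps Value in days for existing consumers and carries
// the hour-precision figure in Metadata (layout assumed for illustration).
func buildLeadTimeResult(p50Hours float64) MetricResult {
	return MetricResult{
		Value: p50Hours / 24,
		Unit:  "days",
		Metadata: map[string]any{
			"total": map[string]any{"p50_hours": p50Hours},
		},
	}
}

func main() {
	r := buildLeadTimeResult(720)
	fmt.Printf("%.0f %s\n", r.Value, r.Unit) // 30 days
}
```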
…-window PRs [P2-1]
Review found that the PR fetch window matched HistoricalDays exactly, but the Lead Time calculator filters issues by release date. A change released early in the window whose PR merged just before window start (and is not old enough to hit the 180d long-dev exclusion) was therefore classified as C4/no PR and undercounted.
Collector now uses since = now - (HistoricalDays + 180d). The 180d buffer aligns with excludedLongDevThreshold, so PRs older than that would never be referenced by a qualified issue anyway.
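As a rough illustration of the widened window: the sketch below computes `since` from `HistoricalDays` plus the 180-day buffer. The function name `prFetchSince` and the representation of `excludedLongDevThreshold` as a `time.Duration` are assumptions; the formula is the one stated in the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// excludedLongDevThreshold mirrors the 180-day long-dev exclusion; modelling
// it as a time.Duration constant is an assumption of this sketch.
const excludedLongDevThreshold = 180 * 24 * time.Hour

// prFetchSince returns the lower bound of the PR fetch window:
// now - (HistoricalDays + 180d).
func prFetchSince(now time.Time, historicalDays int) time.Time {
	window := time.Duration(historicalDays) * 24 * time.Hour
	return now.Add(-(window + excludedLongDevThreshold))
}

// mustParse is a demo helper that parses an RFC 3339 timestamp or panics.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	now := mustParse("2026-05-21T00:00:00Z")
	since := prFetchSince(now, 90)
	fmt.Println(now.Sub(since).Hours() / 24) // 270
}
```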
…jsx [P2]
Review found that MetricBreakdown.jsx still reads metadata.min/max/count on the lead_time_to_release branch, so the expanded breakdown rendered Min=0 / Max=0 / Epics=0 after Phase 2 even though samples existed.
Lead Time calculator now re-emits the legacy days fields (min/max/average/count/sample_size/percentile) alongside the new hour-precision payload (total / stages / worst_issues / trend etc), mirroring the same backward-compat strategy used for Value/Unit in P1-1. The frontend stays unchanged on the backend PR.
…le [P2]
Review found that storage.enabled defaults to false in the sample config, but lead_time_to_release stays enabled. After Phase 2 the calculator unconditionally required PR data, so every issue was classified as C4/no PR and the whole metric disappeared from the dashboard and Prometheus, which is strictly worse than the pre-Phase-2 behaviour.
CalculationContext now carries PRStoreAvailable, set by the collector based on whether the storage handle is non-nil. When false, Calculate falls back to the pre-Phase-2 calendar path (issue.created → release, days) and emits metadata.degraded = "no_pr_store" plus a reason string so the UI can show a configuration warning. The fallback does not emit stages / worst_issues / trend / coverage, since those require PR data, but Value, min, max, average, and count are populated for MetricBreakdown.jsx. Trade-off documented in plan §10.
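A hedged sketch of the degraded path. The types `Issue`, `CalculationContext`, and `MetricResult` are simplified stand-ins, `calculate` is an illustrative name, and using the median for `Value` is an assumption (the commit message only says `Value` is populated). The metadata keys `degraded`, `reason`, `min`, `max`, `average`, and `count` follow the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Simplified stand-ins for the project's types (assumed shapes).
type Issue struct {
	Created  time.Time
	Released time.Time
}

type CalculationContext struct {
	PRStoreAvailable bool
	Issues           []Issue
}

type MetricResult struct {
	Value    float64
	Unit     string
	Metadata map[string]any
}

// calculate falls back to issue.created -> release (in days) when no PR store
// is available, and flags the result as degraded.
func calculate(ctx CalculationContext) MetricResult {
	if ctx.PRStoreAvailable {
		// The 3-stage path needs PR data and is out of scope for this sketch.
		return MetricResult{Unit: "days", Metadata: map[string]any{}}
	}
	days := make([]float64, 0, len(ctx.Issues))
	sum := 0.0
	for _, is := range ctx.Issues {
		d := is.Released.Sub(is.Created).Hours() / 24
		days = append(days, d)
		sum += d
	}
	meta := map[string]any{
		"degraded": "no_pr_store",
		"reason":   "PR storage is disabled; using issue.created to release",
		"count":    len(days),
	}
	res := MetricResult{Unit: "days", Metadata: meta}
	if len(days) == 0 {
		return res
	}
	sort.Float64s(days)
	mid := len(days) / 2
	median := days[mid]
	if len(days)%2 == 0 {
		median = (days[mid-1] + days[mid]) / 2
	}
	res.Value = median // median as Value is an assumption
	meta["min"] = days[0]
	meta["max"] = days[len(days)-1]
	meta["average"] = sum / float64(len(days))
	return res
}

// mustParse is a demo helper that parses an RFC 3339 timestamp or panics.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	ctx := CalculationContext{Issues: []Issue{
		{mustParse("2026-05-01T00:00:00Z"), mustParse("2026-05-03T00:00:00Z")},
		{mustParse("2026-05-01T00:00:00Z"), mustParse("2026-05-05T00:00:00Z")},
	}}
	r := calculate(ctx)
	fmt.Println(r.Value, r.Metadata["degraded"]) // 3 no_pr_store
}
```

The fallback result deliberately carries no `stages`, `worst_issues`, `trend`, or `coverage` keys, matching the behaviour described above.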
- parseVersionName no longer falls back to the whole version name —
legacy names ("0.3", "v2.1") stop polluting component buckets
- new metrics.exclude_plugins config (D6): v3-era plugins (katanomi,
knative, jenkins, tekton-operator) dropped from all component
dimensions at collection time
- calculators skip component=="" instead of bucketing to "unknown"
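The three rules above could look roughly like this. The regular expression and the helper `bucketByComponent` are hypothetical and are not the project's actual parser; they only illustrate "no whole-name fallback", the plugin exclusion list, and skipping empty components.

```go
package main

import (
	"fmt"
	"regexp"
)

// versionNameRE is a hypothetical pattern accepting "{component}-v{semver}"
// and "{component}-{semver}"; bare versions such as "0.3" do not match.
var versionNameRE = regexp.MustCompile(`^([a-z0-9][a-z0-9-]*?)-v?\d+\.\d+\.\d+$`)

// parseVersionName returns the component, or ok=false when the name carries
// no component prefix (no fallback to the whole version name).
func parseVersionName(name string) (component string, ok bool) {
	m := versionNameRE.FindStringSubmatch(name)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// bucketByComponent counts versions per component, skipping names without a
// component and components listed in the exclusion set.
func bucketByComponent(versionNames []string, exclude map[string]bool) map[string]int {
	counts := map[string]int{}
	for _, name := range versionNames {
		component, ok := parseVersionName(name)
		if !ok || exclude[component] {
			continue // never bucket to "unknown"
		}
		counts[component]++
	}
	return counts
}

func main() {
	names := []string{"argo-cd-2.9.0", "argo-cd-v2.10.1", "0.3", "v2.1", "jenkins-v1.0.0"}
	counts := bucketByComponent(names, map[string]bool{"jenkins": true})
	fmt.Println(counts) // map[argo-cd:2]
}
```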
Plan: docs/plans/2026-05-21-dora-optimization-plan.md — 2026-05-21. Backend half of DORA Phase 2; the lean Lead Time card + Trend Panel frontend lands in a follow-up PR.
Summary
Upgrades Lead Time for Changes from a single number to Dev → Review → Release 3-stage attribution. The card now points at which stage is the bottleneck and a separate panel shows the 9-month trend with MoM/QoQ deltas, so the team can locate improvement targets instead of just reading a headline number. DORA-strict: starts at first commit, ends at release.
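To make the attribution concrete, here is a minimal sketch of splitting one change into the three stages. It assumes T0 is the first commit, T1 the PR open time, T2 the merge time, and T3 the release time; only "first commit" and "release" are stated in this description, so the T1/T2 meanings are assumptions. `Longest` simply picks the largest stage and does not reproduce the plan's dual-condition bottleneck rule (D15).

```go
package main

import (
	"fmt"
	"time"
)

// Timeline holds the four timestamps of one change (T1/T2 meanings assumed).
type Timeline struct {
	FirstCommit time.Time // T0
	PROpened    time.Time // T1
	Merged      time.Time // T2
	Released    time.Time // T3
}

// Stages are the three attributed durations.
type Stages struct {
	Dev, Review, Release time.Duration
}

// splitStages attributes the lead time to Dev, Review, and Release.
func splitStages(t Timeline) Stages {
	return Stages{
		Dev:     t.PROpened.Sub(t.FirstCommit),
		Review:  t.Merged.Sub(t.PROpened),
		Release: t.Released.Sub(t.Merged),
	}
}

// Total is the full lead time, first commit to release.
func (s Stages) Total() time.Duration { return s.Dev + s.Review + s.Release }

// Longest names the largest stage; a simplification, not the D15 rule.
func (s Stages) Longest() string {
	name, longest := "dev", s.Dev
	if s.Review > longest {
		name, longest = "review", s.Review
	}
	if s.Release > longest {
		name = "release"
	}
	return name
}

// mustParse is a demo helper that parses an RFC 3339 timestamp or panics.
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	s := splitStages(Timeline{
		FirstCommit: mustParse("2026-05-01T00:00:00Z"),
		PROpened:    mustParse("2026-05-03T00:00:00Z"),
		Merged:      mustParse("2026-05-04T00:00:00Z"),
		Released:    mustParse("2026-05-07T00:00:00Z"),
	})
	fmt.Println(s.Total().Hours(), s.Longest()) // 144 release
}
```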
Diff
- `0009_pr_first_commit_at.sql` adds `pull_requests.first_commit_at TIMESTAMP NULL`.
- `ListPRCommits` / `ListMRCommits`. Sync populates `first_commit_at` from the platform commits API; failures degrade silently (COALESCE-protected UPSERT).
- `UpsertPullRequests` SQL extended; new `ListPullRequestsSince` read path for the metrics collector.
- `storage.Store`, fetches PRs alongside Jira data, exposes them in `CalculationContext.PullRequests`. `main.go` opens the store once and shares it with team-analytics.
- `lead_time.go` rewritten: 3-stage durations (T0..T3), fallback matrix C1–C6, bottleneck dual-condition rule (D15), worst_issues one-per-stage (P2-4), 9-month trend with MoM/QoQ direction, transparency coverage stats, consistency_warning.
- `/api/metrics/lead_time_to_release` accepts `?include_bots=` and `?with_trend=`. Other calculators unchanged.

Cycle Time / Patch Ratio / Time to Patch / Deploy Freq calculators are not touched (plan D2 / D4 / D5).
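For the two new query parameters, a handler might parse them along these lines. This is a sketch under stated assumptions: `leadTimeOptions`, `parseLeadTimeQuery`, and `boolParam` are hypothetical names, and defaulting both flags to false and rejecting unparseable values are guesses, since the description does not specify either.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// leadTimeOptions is a hypothetical holder for the two query flags.
type leadTimeOptions struct {
	IncludeBots bool
	WithTrend   bool
}

// parseLeadTimeQuery reads include_bots and with_trend from a raw query
// string; absent parameters default to false (assumed default).
func parseLeadTimeQuery(rawQuery string) (leadTimeOptions, error) {
	var opts leadTimeOptions
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return opts, err
	}
	if opts.IncludeBots, err = boolParam(values, "include_bots"); err != nil {
		return opts, err
	}
	if opts.WithTrend, err = boolParam(values, "with_trend"); err != nil {
		return opts, err
	}
	return opts, nil
}

// boolParam parses one optional boolean parameter with strconv.ParseBool.
func boolParam(values url.Values, key string) (bool, error) {
	raw := values.Get(key)
	if raw == "" {
		return false, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("invalid %s=%q: %w", key, raw, err)
	}
	return v, nil
}

func main() {
	opts, err := parseLeadTimeQuery("include_bots=true&with_trend=1")
	fmt.Println(opts.IncludeBots, opts.WithTrend, err) // true true <nil>
}
```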
Test plan
Sequencing
Backend lands first; frontend (`MetricCard` lean redesign + new `LeadTimeTrendPanel.jsx`) follows in a separate PR per plan §Phase 1.
🤖 Generated with Claude Code