feat(roadmap-planner): DORA Lead Time 3-stage attribution (Phase 2 backend) by yhuan123 · Pull Request #197 · AlaudaDevops/toolbox

yhuan123 · 2026-05-21T14:53:40Z

Plan: docs/plans/2026-05-21-dora-optimization-plan.md — 2026-05-21. Backend half of DORA Phase 2; the lean Lead Time card + Trend Panel frontend lands in a follow-up PR.

Summary

Upgrades Lead Time for Changes from a single number to Dev → Review → Release 3-stage attribution. The card now points at which stage is the bottleneck and a separate panel shows the 9-month trend with MoM/QoQ deltas, so the team can locate improvement targets instead of just reading a headline number. DORA-strict: starts at first commit, ends at release.

Diff

| Layer | Change |
| --- | --- |
| Schema | `0009_pr_first_commit_at.sql` adds `pull_requests.first_commit_at TIMESTAMP NULL`. |
| Sync | GitHub + GitLab clients gain `ListPRCommits` / `ListMRCommits`. Sync populates `first_commit_at` from the platform commits API; failures degrade silently (COALESCE-protected UPSERT). |
| Storage | `UpsertPullRequests` SQL extended; new `ListPullRequestsSince` read path for the metrics collector. |
| Collector | Holds a (nullable) `storage.Store`, fetches PRs alongside Jira data, exposes them in `CalculationContext.PullRequests`. `main.go` opens the store once and shares it with team-analytics. |
| Calculator | `lead_time.go` rewritten: 3-stage durations (T0..T3), fallback matrix C1–C6, bottleneck dual-condition rule (D15), worst_issues one-per-stage (P2-4), 9-month trend with MoM/QoQ direction, transparency coverage stats, consistency_warning. |
| API | `/api/metrics/lead_time_to_release` accepts `?include_bots=` and `?with_trend=`. Other calculators unchanged. |

Cycle Time / Patch Ratio / Time to Patch / Deploy Freq calculators are not touched (plan D2 / D4 / D5).

Test plan

- `go test ./internal/...` + `go vet ./...`: green
- 11 new unit tests in `lead_time_test.go` covering percentile math, fallback matrix C1/C3/C4/E9, component regex, bottleneck dual-condition, worst-per-stage selection, MoM/QoQ direction, end-to-end Calculate
- Dev deploy: `curl -s '/api/metrics/lead_time_to_release?include_bots=false&component=tektoncd-operator&with_trend=true' | jq .` returns `value > 0`, `metadata.stages` length 3, `metadata.coverage.coverage_pct >= 70`, `metadata.trend.points` length 9
- Dev deploy: `SELECT COUNT(*) FROM pull_requests WHERE merged_at > NOW() - INTERVAL '7 days' AND first_commit_at IS NULL` stays small after the first sync cycle

Sequencing

Backend lands first; frontend (`MetricCard` lean redesign + new `LeadTimeTrendPanel.jsx`) follows in a separate PR per plan §Phase 1.

🤖 Generated with Claude Code

Plan: docs/plans/2026-05-21-dora-optimization-plan.md. Upgrades Lead Time for Changes from a single number to Dev → Review → Release 3-stage attribution, with bottleneck highlighting and a separate 9-month trend panel (sparkline + MoM/QoQ chips). DORA-strict: first commit → release, no backlog. Phase 2 backend half (schema + sync + calculator + API + tests). Frontend follows in a separate PR. The other 4 DORA metrics — Deployment Frequency, Change Failure Rate, Mean Time to Recovery, Cycle Time — calculator + UI are untouched.

…P2-2] Review found that the new componentVersionRE only accepted `{component}-v{semver}`, so versions named `argo-cd-2.9.0` (no v prefix) were labelled with the full version string and a dashboard filter for `argo-cd` returned no Lead Time data while other metrics still grouped by EnrichedRelease.Component. matchReleasedVersion now returns the matched EnrichedRelease so the caller reuses the collector-parsed Component field — same source as release_frequency and friends. The componentVersionRE + componentFromVersionName helper are removed.

…alize [P1-2] Review found that Jira version releaseDate is only a calendar date, parsed at midnight, so any PR merged later on the same release day was dropped by filterPreReleasePRs as a hotfix — losing the last PR on release day for many shipped issues and skewing Lead Time low. matchReleasedVersion now snaps the returned T3 to end-of-day. A PR merged at 14:00 on release day stays inside the window; the Release stage end-time is still that calendar day.

…t [P1-1] Review found that switching MetricResult.Value/Unit to hours broke existing consumers: MetricCard / MetricBreakdown hard-code days and DORA day thresholds, and PrometheusExporter publishes lead_time_days without conversion. A 30-day Lead Time (720h) would render as "720 days" and trip Low-band classification at 60+ days. Value is now p50_hours / 24 (days), Unit is "days". The hour-precision values stay in Metadata (total.p50_hours, stages[].p50_hours, trend.points[].p50_hours) for the Phase 1 frontend that needs them. Plan §8 documents the dual encoding.
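The dual encoding amounts to a single division at the result boundary. A minimal sketch, assuming a local `metricResult` struct that only mirrors the `Value` / `Unit` / `Metadata` fields named above (the real `MetricResult` type and the exact metadata shape are not shown in this PR page):

```go
package main

import "fmt"

// metricResult is a hypothetical stand-in for MetricResult.
type metricResult struct {
	Value    float64
	Unit     string
	Metadata map[string]any
}

// leadTimeResult reports the headline value in days for existing consumers
// and keeps the hour-precision p50 in metadata.
func leadTimeResult(p50Hours float64) metricResult {
	return metricResult{
		Value: p50Hours / 24,
		Unit:  "days",
		Metadata: map[string]any{
			"total": map[string]float64{"p50_hours": p50Hours},
		},
	}
}

func main() {
	r := leadTimeResult(720)
	fmt.Printf("%.1f %s\n", r.Value, r.Unit)
}
```

For the 720h example from the commit message this yields a headline of 30 days, so consumers that hard-code days keep rendering and classifying correctly.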

…-window PRs [P2-1] Review found that the PR fetch window matched HistoricalDays exactly, but the Lead Time calculator filters issues by release date — so a change released early in the window whose PR merged just before window start (and is not old enough to hit the 180d long-dev exclusion) gets classified as C4/no PR and undercounted. Collector now uses since = now - (HistoricalDays + 180d). The 180d buffer aligns with excludedLongDevThreshold, so PRs older than that would never be referenced by a qualified issue anyway.

…jsx [P2] Review found that MetricBreakdown.jsx still reads metadata.min/max/count on the lead_time_to_release branch, so the expanded breakdown rendered Min=0 / Max=0 / Epics=0 after Phase 2 even though samples existed. Lead Time calculator now re-emits the legacy days fields (min/max/average/count/sample_size/percentile) alongside the new hour-precision payload (total / stages / worst_issues / trend etc), mirroring the same backward-compat strategy used for Value/Unit in P1-1. The frontend stays unchanged on the backend PR.

…le [P2] Review found that storage.enabled defaults to false in the sample config, but lead_time_to_release stays enabled. After Phase 2 the calculator unconditionally required PR data, so every issue was classified as C4/no PR and the whole metric disappeared from the dashboard and Prometheus — strictly worse than the pre-Phase-2 behaviour. CalculationContext now carries PRStoreAvailable, set by the collector based on whether the storage handle is non-nil. When false, Calculate falls back to the pre-Phase-2 calendar path (issue.created → release, days) and emits metadata.degraded = "no_pr_store" + a reason string so the UI can show a configuration warning. The fallback does not emit stages / worst_issues / trend / coverage — those require PR data — but Value, min, max, average, and count are populated for MetricBreakdown.jsx. Trade-off documented in plan §10.
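The degraded path can be pictured as a guard at the top of the calculation. This is a sketch under assumptions: `calculationContext` and `metricResult` are local stand-ins for the real types, the `degraded_reason` key is hypothetical (the commit message only says "a reason string"), and both numeric inputs are precomputed to keep the example short.

```go
package main

import "fmt"

// calculationContext is a hypothetical stand-in carrying only the new flag.
type calculationContext struct {
	PRStoreAvailable bool
}

// metricResult is a hypothetical stand-in for MetricResult.
type metricResult struct {
	Value    float64
	Unit     string
	Metadata map[string]any
}

// calculate falls back to the calendar path when no PR store is configured.
func calculate(ctx calculationContext, calendarDays, p50Hours float64) metricResult {
	if !ctx.PRStoreAvailable {
		return metricResult{
			Value: calendarDays, // issue.created -> release, in days
			Unit:  "days",
			Metadata: map[string]any{
				"degraded":        "no_pr_store",
				"degraded_reason": "storage is disabled, so PR data is unavailable", // key name is an assumption
			},
		}
	}
	return metricResult{
		Value: p50Hours / 24,
		Unit:  "days",
		Metadata: map[string]any{
			"total": map[string]float64{"p50_hours": p50Hours},
		},
	}
}

func main() {
	r := calculate(calculationContext{PRStoreAvailable: false}, 12, 0)
	fmt.Println(r.Value, r.Unit, r.Metadata["degraded"])
}
```

Keeping `Value` populated on the fallback path is what prevents the metric from disappearing when storage is off; the `degraded` marker lets a UI distinguish the two encodings.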

- parseVersionName no longer falls back to the whole version name, so legacy names ("0.3", "v2.1") stop polluting component buckets
- new metrics.exclude_plugins config (D6): v3-era plugins (katanomi, knative, jenkins, tekton-operator) dropped from all component dimensions at collection time
- calculators skip component=="" instead of bucketing to "unknown"

yhuan123 force-pushed the feat/dora-lead-time-phase2 branch from 56ccb4e to 4d92204 on May 21, 2026 15:12

huanyang@alauda.io added 11 commits May 21, 2026 23:23

fix(roadmap-planner): keep Lead Time alive when PR fetch fails transiently [P2]

7392cd2

fix(roadmap-planner): backfill first_commit_at for pre-0009 PR rows [P2]

dec0884

fix(roadmap-planner): align API default window with metrics.historical_days [P2]

49a6b4f

fix(roadmap-planner): fetch commits only for merged PRs/MRs [P2]

4497c54

yhuan123 mentioned this pull request Jun 4, 2026

fix(roadmap-planner): drop invalid components from DORA metrics #201

Open

yhuan123 wants to merge 12 commits into AlaudaDevops:main from yhuan123:feat/dora-lead-time-phase2
