Add DORA band color thresholds and rolling trend panels #396

Merged
ldraney merged 2 commits from 395-dora-dashboard-band-color-thresholds-rol into main 2026-06-13 20:38:37 +00:00
Owner

Summary

Updates the DORA Grafana dashboard with official performance band color thresholds on all 4 overview stat panels, and adds a new Trends row with 7d/30d rolling average timeseries panels for Deployment Frequency and Lead Time.

Changes

  • terraform/dashboards/dora-dashboard.json
    • Deploys/Day (id 2): thresholds changed to red <0.03, yellow 0.03-0.14, blue 0.14-1, green >1
    • Lead Time p50 (id 3): thresholds changed to green <24h, blue 24-168h, yellow 168-720h, red >720h
    • CFR (id 4): thresholds changed to green <5%, blue 5-10%, yellow 10-15%, red >15%
    • MTTR (id 5): thresholds changed to green <1h, blue 1-24h, yellow 24-168h, red >168h
    • Added "Trends" row (id 14) with 4 timeseries panels:
      • DF 7d Rolling Avg (id 15): avg_over_time(...[7d:1d])
      • DF 30d Rolling Avg (id 16): avg_over_time(...[30d:1d])
      • LT 7d Rolling Avg (id 17): avg_over_time(...[7d:1d])
      • LT 30d Rolling Avg (id 18): avg_over_time(...[30d:1d])
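
As an illustration of the threshold change, here is a minimal sketch of the Deploys/Day steps from this PR, written as a Python structure that serializes to the `thresholds` object in a Grafana panel. The `band_color` helper is hypothetical and not part of the dashboard. It only mimics how a reading would be matched to a step, assuming a step applies once the reading reaches its value.

```python
import json
from bisect import bisect_right

# Threshold steps listed in this PR for Deploys/Day (id 2).
# The first step has value None (JSON null) and acts as the base color.
DEPLOYS_PER_DAY_STEPS = [
    {"color": "red", "value": None},
    {"color": "yellow", "value": 0.03},
    {"color": "blue", "value": 0.14},
    {"color": "green", "value": 1},
]


def band_color(steps, reading):
    """Hypothetical helper: color of the last step whose value is <= reading."""
    bounds = [step["value"] for step in steps[1:]]
    return steps[bisect_right(bounds, reading)]["color"]


# Shape of the thresholds object as it would appear in the panel JSON.
thresholds = {"mode": "absolute", "steps": DEPLOYS_PER_DAY_STEPS}
print(json.dumps(thresholds, indent=2))

for reading in (0.01, 0.05, 0.5, 2):
    print(reading, band_color(DEPLOYS_PER_DAY_STEPS, reading))
```

The other three stat panels follow the same shape with the color order reversed, since lower values are better for Lead Time, CFR, and MTTR.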

Test Plan

  • JSON validates without error (verified)
  • Import dashboard into Grafana and confirm stat panels show correct band colors
  • Verify trend panels render with available Prometheus data (30d panels will show partial data until retention is extended past 15d)

Review Checklist

  • [x] JSON parses without errors
  • [x] Threshold steps use correct DORA band colors (green/blue/yellow/red)
  • [x] Threshold values match DORA specifications
  • [x] Trend panel queries use correct avg_over_time subquery syntax
  • [x] Panel IDs are unique and sequential
  • [x] Grid positions do not overlap

Related Notes

None -- dashboard-only change, no docs impact.

Related

Closes #395

  • DORA#4: https://forgejo.tail5b443a.ts.net/ldraney/DORA/issues/4
  • DORA#7: https://forgejo.tail5b443a.ts.net/ldraney/DORA/issues/7

Add DORA band color thresholds and rolling trend panels
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
d111138116
Update the 4 overview stat panels with official DORA performance band
colors (green/blue/yellow/red for Elite/High/Medium/Low) and add a new
Trends row with 7d and 30d rolling average panels for Deployment
Frequency and Lead Time.

Refs: DORA#4, DORA#7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

QA Review -- PR #396

Threshold Band Colors (DORA#4)

| Panel | Base | Step 2 | Step 3 | Step 4 | Verdict |
|-------|------|--------|--------|--------|---------|
| Deploys/Day | red (null) | yellow (0.03) | blue (0.14) | green (1) | PASS |
| Lead Time p50 | green (null) | blue (24) | yellow (168) | red (720) | PASS |
| CFR (%) | green (null) | blue (5) | yellow (10) | red (15) | PASS |
| MTTR (hours) | green (null) | blue (1) | yellow (24) | red (168) | PASS |

All four stat panels match the specified DORA performance band definitions.

Rolling Trend Panels (DORA#7)

| Panel | ID | Query Window | Grid Position | Verdict |
|-------|----|--------------|---------------|---------|
| DF 7d Rolling Avg | 15 | [7d:1d] | 12w x=0 y=42 | PASS |
| DF 30d Rolling Avg | 16 | [30d:1d] | 12w x=12 y=42 | PASS |
| LT p50 7d Rolling Avg | 17 | [7d:1d] | 12w x=0 y=50 | PASS |
| LT p50 30d Rolling Avg | 18 | [30d:1d] | 12w x=12 y=50 | PASS |

  • Trends row (id 14) positioned correctly at y=41 after MTTR section
  • DF and LT panels laid out side-by-side (7d left, 30d right) in two rows
  • Subquery syntax avg_over_time((<inner>)[Nd:1d]) is correct for Prometheus
  • Panel styling (line, smooth interpolation, fillOpacity 20, legend table) matches existing dashboard conventions

Structural Checks

  • [x] JSON valid (verified by python3 json.load)
  • [x] Panel IDs unique and sequential (14-18)
  • [x] Grid positions non-overlapping
  • [x] No existing panels modified beyond threshold color/value changes
  • [x] Dashboard schema version, uid, and templating unchanged
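
The ID and grid checks above could be scripted along these lines. This is a sketch, not the script used for this review. The sample panels reuse the IDs and x/y/w values reported in this comment, while the heights (h=1 for the row, h=8 for the trend panels) and the row width of 24 are assumptions inferred from the y offsets. Only top-level panels are considered.

```python
from itertools import combinations


def duplicate_ids(panels):
    """Return panel IDs that appear more than once, sorted."""
    seen, dupes = set(), set()
    for panel in panels:
        if panel["id"] in seen:
            dupes.add(panel["id"])
        seen.add(panel["id"])
    return sorted(dupes)


def overlaps(a, b):
    """True when two gridPos rectangles share at least one grid cell."""
    return (
        a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
        and a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    )


def overlapping_pairs(panels):
    """Return (id, id) pairs of panels whose gridPos rectangles intersect."""
    return [
        (p["id"], q["id"])
        for p, q in combinations(panels, 2)
        if overlaps(p["gridPos"], q["gridPos"])
    ]


# Sample data: IDs and x/y/w from this review; h values are assumed.
trend_panels = [
    {"id": 14, "gridPos": {"x": 0, "y": 41, "w": 24, "h": 1}},
    {"id": 15, "gridPos": {"x": 0, "y": 42, "w": 12, "h": 8}},
    {"id": 16, "gridPos": {"x": 12, "y": 42, "w": 12, "h": 8}},
    {"id": 17, "gridPos": {"x": 0, "y": 50, "w": 12, "h": 8}},
    {"id": 18, "gridPos": {"x": 12, "y": 50, "w": 12, "h": 8}},
]

print("duplicate ids:", duplicate_ids(trend_panels))
print("overlapping pairs:", overlapping_pairs(trend_panels))
```

Against a real dashboard, the `panels` list would come from `json.load` on the dashboard file rather than the inline sample.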

Findings

No issues found.


VERDICT: APPROVE

Merge branch 'main' into 395-dora-dashboard-band-color-thresholds-rol
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
ci/woodpecker/pull_request_closed/terraform Pipeline was successful
9c1c726596
Author
Owner

PR #396 Review

DOMAIN REVIEW

Tech Stack: Grafana JSON dashboard, Terraform-managed ConfigMap, PromQL queries.

Threshold Verification (Accelerate DORA bands):

| Metric | Band | PR Value | DORA Spec | Match? |
|--------|------|----------|-----------|--------|
| DF (deploys/day) | Low | <0.03 (red) | <monthly (~0.03) | Yes |
| DF | Medium | 0.03-0.14 (yellow) | monthly-weekly | Yes |
| DF | High | 0.14-1 (blue) | weekly-daily | Yes |
| DF | Elite | >1 (green) | daily/on-demand | Yes |
| Lead Time (hours) | Elite+High | <24h (green) | Elite <1h, High <24h | Acceptable -- conflates Elite/High into one green band due to 4-slot limit |
| Lead Time | Medium | 24-168h (blue) | 1day-1week | Yes |
| Lead Time | Low boundary | 168-720h (yellow) | 1week-1month | Yes |
| Lead Time | Low | >720h (red) | >1month | Yes |
| CFR (%) | Elite | <5 (green) | 0-5% | Yes |
| CFR | High | 5-10 (blue) | 5-10% | Yes |
| CFR | Medium | 10-15 (yellow) | 10-15% | Yes |
| CFR | Low | >15 (red) | >15% | Yes |
| MTTR (hours) | Elite | <1h (green) | <1hr | Yes |
| MTTR | High | 1-24h (blue) | 1hr-1day | Yes |
| MTTR | Medium | 24-168h (yellow) | 1day-1week | Yes |
| MTTR | Low | >168h (red) | >1week | See nit below |
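
For reference, the numeric boundaries in the table follow from simple cadence arithmetic. The sketch below reproduces them, assuming a 30-day month. The helper names are illustrative only.

```python
DAYS_PER_WEEK = 7
DAYS_PER_MONTH = 30  # assumption: a month is treated as 30 days
HOURS_PER_DAY = 24


def deploys_per_day(days_between_deploys):
    """Convert a cadence of one deploy every N days into deploys per day."""
    return 1 / days_between_deploys


def hours(days):
    """Convert a duration in days into hours."""
    return days * HOURS_PER_DAY


# Deployment Frequency boundaries (deploys/day), rounded to two decimals.
print("monthly:", round(deploys_per_day(DAYS_PER_MONTH), 2))
print("weekly:", round(deploys_per_day(DAYS_PER_WEEK), 2))

# Lead Time and MTTR boundaries (hours).
print("1 day:", hours(1))
print("1 week:", hours(DAYS_PER_WEEK))
print("1 month:", hours(DAYS_PER_MONTH))
```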

PromQL Queries (Trend Panels):

  • DF subqueries use avg_over_time((sum(rate(...)))[Nd:1d]) -- valid Prometheus subquery syntax.
  • LT subqueries use avg_over_time((histogram_quantile(0.5, ...))[Nd:1d]) -- valid. The by (le) aggregation inside histogram_quantile is correct.
  • All 4 trend queries use the same metric names as the overview stat panels (dora_deployments_total, dora_pr_lead_time_seconds_bucket). Consistent.
  • Trend panels omit $repo filter, matching overview stat panel behavior (aggregate across all repos). Consistent.
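
To make the subquery shape concrete, here is a small sketch that wraps an inner expression in `avg_over_time` with a subquery range and checks that the parentheses balance. The inner expression is left as the `<inner>` placeholder because the PR elides it. Both function names are hypothetical.

```python
def rolling_avg(inner: str, window: str, step: str = "1d") -> str:
    """Wrap an instant-vector expression in an avg_over_time subquery."""
    return f"avg_over_time(({inner})[{window}:{step}])"


def parens_balanced(expr: str) -> bool:
    """True when every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# The 7d and 30d variants differ only in the subquery range.
for window in ("7d", "30d"):
    query = rolling_avg("<inner>", window)
    print(query, parens_balanced(query))
```

The outer parentheses around the inner expression matter: the `[Nd:1d]` subquery range must apply to the whole aggregated expression, not only to its last term.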

Grafana JSON Structure:

  • Panel IDs 14-18 are sequential and unique (existing range: 1-13). No collisions.
  • Grid positions: Row at y=41, trend panels at y=42 and y=50. Each pair uses w=12 with x=0 and x=12 (two-column layout). No overlaps with existing panels (prior content ends at y=41).
  • Datasource UIDs consistently use ${DS_PROMETHEUS} variable. No hardcoded datasource references.

BLOCKERS

None.

NITS

  1. MTTR red threshold is more aggressive than DORA spec. MTTR red fires at >168h (1 week) but DORA "Low" is >1 month (720h). The 168h-720h range (1 week to 1 month) maps to the Medium-Low boundary in Accelerate. For a small team this is arguably better -- you want to see red sooner. But it means the MTTR thresholds don't exactly match the standard while the other three metrics do. Consider documenting this intentional deviation in the dashboard itself (for example in the panel's description field, since JSON has no comment syntax), or adjusting to 720h for consistency.

  2. Detail row thresholds not updated. The CFR gauge panel (id 11, lines 338-344) and MTTR detail stat panel (id 13, lines 392-398) still use the old color scheme (orange instead of blue, CFR red at 30 instead of 15). The overview and detail panels now show different color bands for the same metric. This is out of scope per the PR description ("all 4 overview stat panels"), but the inconsistency could confuse dashboard users. Consider a follow-up issue to align detail row thresholds.

  3. 30d trend panels vs 15d retention. The PR body honestly documents that 30d panels will show partial data. This is fine, but consider adding a panel description or annotation in the Grafana JSON itself (the description field on panels 16 and 18) so dashboard viewers see this caveat without reading the PR.

  4. No CFR/MTTR trend panels. Only DF and LT get trend panels. Issue #430 exists for "7d/30d rolling trend panels for DF and LT" specifically, so this is intentional. But consider noting in the PR body that CFR/MTTR trends are deferred so the scope boundary is explicit.
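
For nit 3, one possible way to add the caveat is sketched below. It sets the `description` field on selected panels of a dashboard loaded as a Python dict. The helper name and the caveat wording are hypothetical, and the sketch only walks top-level panels, so panels nested inside a collapsed row would need an extra pass.

```python
import json


def add_description(dashboard, panel_ids, text):
    """Set the description field on every top-level panel whose id matches."""
    for panel in dashboard.get("panels", []):
        if panel.get("id") in panel_ids:
            panel["description"] = text
    return dashboard


# Hypothetical caveat wording; adjust to match the actual retention setting.
CAVEAT = "Shows partial data until Prometheus retention is extended past 15d."

# Minimal stand-in for the dashboard; the real file would be read with json.load.
dashboard = {
    "panels": [
        {"id": 15, "title": "DF 7d Rolling Avg"},
        {"id": 16, "title": "DF 30d Rolling Avg"},
        {"id": 18, "title": "LT 30d Rolling Avg"},
    ]
}

add_description(dashboard, {16, 18}, CAVEAT)
print(json.dumps(dashboard, indent=2))
```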

SOP COMPLIANCE

  • [x] PR body has Summary, Changes, Test Plan, Related sections
  • [x] Review Checklist included and self-assessed
  • [x] No secrets, .env files, or credentials committed
  • [x] No unnecessary file changes -- single file changed (dora-dashboard.json), tightly scoped
  • [x] Commit scope matches issue description
  • [x] Closes #395 referenced correctly

PROCESS OBSERVATIONS

  • This is a visibility improvement (DORA band colors + trend lines) that directly supports the DORA metrics initiative. No deployment risk -- it is a dashboard-only change with no infrastructure or application impact.
  • The 30d retention limitation is worth tracking. If Prometheus retention is extended later, these panels will automatically benefit with no further changes needed.
  • Issue #430 (DF+LT trend panels) appears to overlap with the trend panel portion of this PR. If this PR merges, #430 may be partially or fully resolved -- verify and close/update #430 accordingly.

VERDICT: APPROVED

ldraney deleted branch 395-dora-dashboard-band-color-thresholds-rol 2026-06-13 20:38:37 +00:00