Deploy DORA exporter + Grafana dashboard (Phase 2+3) #10
Reference
forgejo_admin/pal-e-platform!10
Summary
- Add `woodpecker_api_token` to the Makefile secret rendering pipeline

Closes #9
Changes
terraform/main.tf
- `kubernetes_secret_v1.dora_exporter` -- env vars for Woodpecker + Forgejo API access using internal cluster URLs (`*.svc.cluster.local`)
- `kubernetes_deployment_v1.dora_exporter` -- single replica, 50m CPU / 32Mi-128Mi memory, liveness+readiness probes on `/health`:8000, `envFrom` secret injection
- `kubernetes_service_v1.dora_exporter` -- ClusterIP on port 8000
- `kubernetes_manifest.dora_exporter_service_monitor` -- ServiceMonitor CRD, 60s scrape interval on `/metrics`
- `kubernetes_config_map_v1.dora_dashboard` -- Grafana dashboard ConfigMap with `grafana_dashboard = "1"` label for sidecar auto-discovery
- `tofu fmt` in existing MinIO IAM policy block

terraform/dashboards/dora-dashboard.json (new)
Grafana dashboard JSON with:
- `rate(dora_deployments_total{status="success"}[1d]) * 86400` per repo
- `dora_pr_lead_time_seconds` p50/p95 per repo (displayed in hours)
- `dora_deployments_total{status="failure"} / total` per repo (percentage)
- `dora_deployment_last_success_timestamp`
- `DS_PROMETHEUS` (datasource) and `repo` (multi-select from `label_values`)

terraform/variables.tf
- `woodpecker_api_token` (string, sensitive) -- Woodpecker CI personal API token
- `dora_exporter_image` (string, default `harbor.tail5b443a.ts.net/pal-e-dora-exporter/dora-exporter:latest`)

Makefile
- Added `woodpecker_api_token` to `TF_SECRET_VARS` list

Validation
Test Plan
- Add `woodpecker_api_token` to Salt pillar (`salt/pillar/secrets/platform.sls`)
- `make tofu-secrets` -- verify `woodpecker_api_token` appears in `secrets.auto.tfvars`
- `make tofu-plan` -- review plan output for 5 new resources
- `make tofu-apply` -- deploy
- `kubectl get pods -n monitoring -l app=dora-exporter` shows Running
- `dora-exporter` as UP
- `promql: dora_deployments_total` returns data

Review Checklist
- `tofu fmt` applied
- `tofu validate` passes
- `forgejo_admin_username` / `forgejo_admin_password` vars
- `sensitive = true`
- `depends_on` chains follow existing patterns
- `TF_SECRET_VARS` updated
- `terraform/` and `Makefile`

Pre-Apply Operator Steps
- Add `woodpecker_api_token` to `salt/pillar/secrets/platform.sls` (GPG-encrypted)
- `make tofu-secrets` to render updated tfvars
- `make tofu-plan` to review
- `make tofu-apply`

Related Notes
- `plan-2026-03-01-dora-metrics-dashboard` -- parent plan (Phase 2 + Phase 3)
- `issue-pal-e-platform-dora-deploy-dashboard` -- pal-e-docs issue
- `dora-framework` -- the axiom this makes measurable
- `issue-pal-e-dora-exporter-service` -- Phase 1 (resolved, PR #1 on pal-e-dora-exporter)

Review-fix round 1:
Fixed the MTTR overview panel PromQL query in `terraform/dashboards/dora-dashboard.json`.

Issue: The query used `dora_deployment_last_success_timestamp{status="failure"}`, but this metric does not carry a `status` label -- it tracks the timestamp of the last successful deployment per repo. The `status="failure"` selector would return no data.

Fix: Changed to `max((time() - dora_deployment_last_success_timestamp) / 3600)`, which computes the worst-case hours since last success across all repos, giving a meaningful aggregate MTTR overview value.

Commit:
38a693a
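
The reasoning behind the fix can be checked with a small simulation. This is a minimal Python sketch, not code from the exporter or the dashboard: the sample series, the repo names `repo-a` and `repo-b`, and both helper functions are hypothetical, and it assumes the gauge exposes one series per repo carrying only a `repo` label. It mimics how an equality matcher on a label the series do not have selects nothing, and how the replacement expression reduces to the largest "hours since last success" across repos.

```python
import time

# Hypothetical sample of the gauge: one series per repo, labelled with `repo`
# only (no `status` label), valued as a Unix timestamp in seconds.
SERIES = [
    ({"repo": "repo-a"}, 1_700_000_000.0),
    ({"repo": "repo-b"}, 1_699_990_000.0),
]


def select(series, matchers):
    """Mimic a PromQL equality selector.

    A series is kept only if every matcher equals its label value; a label
    the series does not carry is treated as the empty string.
    """
    return [
        (labels, value)
        for labels, value in series
        if all(labels.get(name, "") == wanted for name, wanted in matchers.items())
    ]


def worst_case_hours_since_success(series, now=None):
    """Mimic max((time() - metric) / 3600) over all series.

    Returns None when there are no series, which corresponds to an empty
    query result.
    """
    now = time.time() if now is None else now
    ages_in_hours = [(now - value) / 3600 for _, value in series]
    return max(ages_in_hours) if ages_in_hours else None


# The old selector matches nothing because no series has status="failure".
old_result = select(SERIES, {"status": "failure"})

# The new expression picks the repo with the oldest last success.
new_result = worst_case_hours_since_success(SERIES, now=1_700_003_600.0)
```

With the sample data, `old_result` is an empty list, while `new_result` is the age of `repo-b`, the repo whose last success is furthest in the past.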