Add tier label to all Prometheus metrics #3

Merged
ldraney merged 1 commit from 2-add-tier-label-to-all-prometheus-metrics into main 2026-06-13 20:05:38 +00:00
Owner

Summary

Adds a tier label ("1", "2", or "3") to all six Prometheus metric families emitted by the DORA exporter. Tier assignment is driven by a YAML config file (tiers.yaml) loaded via the DORA_TIERS_FILE environment variable. Repos not listed default to tier 3. Matching strips the owner prefix so "forgejo_admin/pal-e-docs" resolves to the "pal-e-docs" entry.

Changes

  • src/config.py -- Added DORA_TIERS_FILE env var, load_tiers() YAML parser, and Config.tier_for_repo() lookup method
  • src/collectors/forgejo.py -- Added "tier" to label lists on pr_merges_total, pr_lead_time_seconds, pr_review_count; collectors store Config ref and pass tier to all .labels() calls
  • src/collectors/woodpecker.py -- Added "tier" to label lists on deployments_total, deployment_duration_seconds, deployment_last_success_timestamp; same config/tier pattern
  • tiers.yaml -- Initial tier mapping (tier 1: platform-critical repos, tier 2: tooling repos, tier 3: default)
  • requirements.txt -- Added pyyaml>=6.0
  • tests/test_tiers.py -- 11 unit tests covering YAML parsing, owner-prefix stripping, default tier fallback, and metric label verification

Test Plan

  • python -m pytest tests/test_tiers.py -v -- all 11 tests pass
  • ruff check src/ tests/ and ruff format --check src/ tests/ -- clean
  • Deploy with DORA_TIERS_FILE=/path/to/tiers.yaml and verify /metrics output includes tier="1", tier="2", tier="3" labels
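
The last check above can be scripted. The following is a minimal sketch, assuming the /metrics body has already been fetched as text; the helper name samples_missing_tier and the sample lines are hypothetical and not taken from the exporter. It uses a simple regex rather than a full exposition-format parser, so label values containing "}" would confuse it.

```python
import re

# Matches a sample line such as: dora_x_total{repo="a",tier="1"} 4.0
SAMPLE_RE = re.compile(r'^(dora_[a-zA-Z0-9_]+)\{([^}]*)\}\s')
TIER_RE = re.compile(r'(?:^|,)tier="([^"]*)"')
VALID_TIERS = {"1", "2", "3"}


def samples_missing_tier(metrics_text: str) -> list[str]:
    """Return every dora_* sample line that lacks a valid tier label."""
    missing = []
    for line in metrics_text.splitlines():
        # Skip HELP/TYPE comments and metrics from other exporters.
        if line.startswith("#") or not line.startswith("dora_"):
            continue
        match = SAMPLE_RE.match(line)
        labels = match.group(2) if match else ""
        tier = TIER_RE.search(labels)
        if tier is None or tier.group(1) not in VALID_TIERS:
            missing.append(line)
    return missing


# Hypothetical scrape output; real label sets may carry more labels.
SAMPLE = """\
# TYPE dora_deployments_total counter
dora_deployments_total{repo="pal-e-docs",tier="2"} 4.0
dora_pr_merges_total{repo="pal-e-platform"} 7.0
"""

missing = samples_missing_tier(SAMPLE)
print(missing)
```

An empty list means every dora_* sample carries tier="1", "2", or "3"; here the second sample is reported because it has no tier label.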

Review Checklist

  • No breaking changes to existing metric names or label sets (tier is additive)
  • Dockerfile and .woodpecker.yaml not modified per issue constraints
  • ruff format and ruff check pass
  • All 11 unit tests pass
  • Default tier 3 for unlisted repos

Related Notes

None -- no pal-e-docs notes for this change.

Related

Closes #2
Parent issue: ldraney/DORA#1

Load a YAML tier config via DORA_TIERS_FILE env var and attach a
tier="1"|"2"|"3" label to every metric emitted by both the Forgejo and
Woodpecker collectors. Repos not in the config default to tier 3.
Matching strips the owner prefix so "forgejo_admin/pal-e-docs" matches
the "pal-e-docs" entry in tiers.yaml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

QA Review -- PR #3

Summary

Clean, focused PR. Adds tier label to all 6 Prometheus metric families via a YAML config loaded from DORA_TIERS_FILE env var. Owner prefix stripping handles the owner/repo vs repo mismatch correctly.

Findings

No blocking issues.

Observations (non-blocking):

  1. main.py changes are format-only -- ruff reformatted two asyncio.create_task() calls. No functional change. Acceptable since the repo should stay consistently formatted.

  2. load_tiers() uses open() without encoding -- with open(file_path) as f: opens the file in text mode and decodes it with the locale's default encoding before yaml.safe_load sees it. YAML files are UTF-8 by convention, and the default encoding in a Linux container is usually UTF-8 as well, so this is fine in practice. Could add encoding="utf-8" for explicitness but not required.

  3. Test coverage is thorough -- 11 tests covering: YAML parsing (all tiers, partial, empty, missing file), owner prefix stripping, default tier fallback, no-config-file fallback, and metric label presence on all 6 metrics. Good boundary coverage.

  4. tiers.yaml matches the spec -- Tier 1 and Tier 2 lists match what was specified in the issue. No tier_3 section needed since unlisted repos default to 3.

  5. No breaking changes -- Metric names unchanged, tier is additive to existing label sets. Existing Grafana queries using {repo="..."} will continue to work; they just gain a new label dimension.

Checklist

  • All acceptance criteria from issue #2 met
  • tier label present on all 6 metric families (3 Forgejo + 3 Woodpecker)
  • Owner prefix stripping works (tested with forgejo_admin/pal-e-platform)
  • Default tier 3 for unlisted repos
  • DORA_TIERS_FILE env var optional (graceful fallback when unset)
  • pyyaml added to requirements.txt
  • Dockerfile and .woodpecker.yaml untouched per constraints
  • ruff clean, 11/11 tests pass

VERDICT: APPROVE

Author
Owner

PR #3 Review

DOMAIN REVIEW

Stack: Python / FastAPI / Prometheus client / PyYAML / pytest

Tier label coverage -- all 6 metric families updated:

| Metric | Type | File | Tier added |
|--------|------|------|------------|
| dora_pr_merges_total | Counter | src/collectors/forgejo.py | Yes |
| dora_pr_lead_time_seconds | Histogram | src/collectors/forgejo.py | Yes |
| dora_pr_review_count | Histogram | src/collectors/forgejo.py | Yes |
| dora_deployments_total | Counter | src/collectors/woodpecker.py | Yes |
| dora_deployment_duration_seconds | Histogram | src/collectors/woodpecker.py | Yes |
| dora_deployment_last_success_timestamp | Gauge | src/collectors/woodpecker.py | Yes |

Config design is solid:

  • DORA_TIERS_FILE env var for external config -- not hardcoded
  • load_tiers() parses YAML with tier_1/tier_2/tier_3 keys into flat {repo: tier} mapping
  • tier_for_repo() strips owner prefix via rsplit("/", 1)[-1] -- handles both "owner/repo" and "repo" forms correctly
  • Unlisted repos default to tier "3" via DEFAULT_TIER constant -- safe fallback
  • Missing/empty tiers file logs warning and degrades gracefully (all repos get tier 3)
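
The lookup described in these bullets can be sketched roughly as follows. This is not the code from src/config.py: flatten_tiers is a hypothetical stand-in for the part of load_tiers() that runs after yaml.safe_load, the standalone tier_for_repo function stands in for the Config method, and the tier assignments in the sample dict are made up for illustration.

```python
from typing import Optional

DEFAULT_TIER = "3"
VALID_TIERS = {"1", "2", "3"}


def flatten_tiers(raw: Optional[dict]) -> dict[str, str]:
    """Turn {"tier_1": [...], "tier_2": [...]} into {"repo": "1", ...}."""
    tiers: dict[str, str] = {}
    for key, repos in (raw or {}).items():
        # "tier_1" -> "1"; keys that do not follow the pattern are skipped.
        prefix, _, number = key.partition("_")
        if prefix != "tier" or number not in VALID_TIERS:
            continue
        for repo in repos or []:
            tiers[repo] = number
    return tiers


def tier_for_repo(tiers: dict[str, str], repo: str) -> str:
    """Look up a repo by bare name, ignoring any "owner/" prefix."""
    name = repo.rsplit("/", 1)[-1]
    return tiers.get(name, DEFAULT_TIER)


# Hypothetical parsed config, shaped like the result of yaml.safe_load.
parsed = {"tier_1": ["pal-e-platform"], "tier_2": ["pal-e-docs"]}
tiers = flatten_tiers(parsed)

print(tier_for_repo(tiers, "forgejo_admin/pal-e-docs"))  # owner prefix stripped
print(tier_for_repo(tiers, "some-unlisted-repo"))        # falls back to DEFAULT_TIER
```

Passing None or an empty dict to flatten_tiers yields an empty mapping, which mirrors the graceful fallback: every repo then resolves to the default tier.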

Backwards compatibility:
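Adding a label is technically a breaking change for Prometheus because it changes the identity of every series: after the upgrade each metric is exported under a new label set, and the old series without tier stop receiving samples. The impact is minimal here. Each repo maps to exactly one tier, so there is no cardinality explosion. Selectors with equality matchers like {repo="foo"} still match, and aggregations such as sum(...) or sum by (repo) (...) return the same groups as before. Queries that spell out the full label set, for example in on()/ignoring() vector matching or without() clauses, would need tier added. Any Grafana dashboards filtering by repo will be unaffected. This is safe to ship.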
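
To make the compatibility argument concrete, here is a toy model of equality matchers and by-label aggregation in plain Python. It is not Prometheus and does not evaluate PromQL; match_equal, sum_by, and the sample values are all hypothetical, chosen only to show why an extra label leaves {repo="foo"} selectors and sum by (repo) results intact.

```python
from collections import defaultdict

# Hypothetical samples as exported after the change: each carries a tier.
series = [
    ({"repo": "foo", "tier": "1"}, 5.0),
    ({"repo": "bar", "tier": "3"}, 2.0),
]


def match_equal(samples, **matchers):
    """Keep samples whose labels equal every matcher; extra labels are ignored."""
    return [
        (labels, value)
        for labels, value in samples
        if all(labels.get(key) == want for key, want in matchers.items())
    ]


def sum_by(samples, *keys):
    """Group samples by the given label names and sum their values."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[tuple(labels.get(key, "") for key in keys)] += value
    return dict(totals)


print(match_equal(series, repo="foo"))  # still selects the foo series
print(sum_by(series, "repo"))           # same groups as before the change
print(sum_by(series, "tier"))           # the new dimension the label enables
```

Because each repo maps to exactly one tier, grouping by repo produces one group per repo with or without the new label.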

Test coverage -- 11 tests across 3 test classes:

  • TestLoadTiers (4): YAML parsing, missing file, empty file, partial tiers
  • TestTierForRepo (5): owner-prefix stripping, tier 1/2 lookup, unknown repo defaults, no tiers file scenario
  • TestCollectorTierLabels (2): structural verification that all 6 metric definitions include "tier" in _labelnames

Tests cover the critical paths: config parsing, lookup logic, default behavior, and metric label verification.

Python/PEP compliance:

  • Type hints present (dict[str, str])
  • Docstrings on public functions (PEP 257 style)
  • DEFAULT_TIER as module-level constant rather than magic string
  • PR states ruff check and ruff format pass

BLOCKERS

None.

NITS

  1. src/main.py formatting changes -- Lines 60-67 are pure reformatting of asyncio.create_task() calls, unrelated to the tier feature. Harmless (likely ruff autoformat) but technically scope creep. Not blocking.

  2. from datetime import datetime cleanup in forgejo.py line 177 -- The timezone import was removed. This appears correct: the method uses fromisoformat(), which has parsed explicit UTC offsets such as +00:00 since Python 3.7 but accepts a trailing Z only from Python 3.11. If the API timestamps can end in Z, confirm the target Python version is >= 3.11.

  3. tiers.yaml at repo root -- Consider adding a comment or README note clarifying this is a reference/default config, and production should set DORA_TIERS_FILE to point to a deployed config (e.g., via ConfigMap or volume mount). Currently the file is self-documenting with its header comment, which is adequate.

  4. Duplicate repo name collision -- If two repos in different orgs share the same name (e.g., ldraney/utils and forgejo_admin/utils), they resolve to the same tiers.yaml entry and therefore always share a tier. The current owner-stripping design makes this intentional, but it is worth documenting as a known limitation if multi-org support grows.
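
For nit 4, a small check along these lines could flag colliding names before they matter. It is a sketch only: find_name_collisions is a hypothetical helper, not part of the PR, and the input list would have to come from wherever the exporter already enumerates repos.

```python
from collections import defaultdict


def find_name_collisions(full_names):
    """Map each bare repo name to its owners when more than one owner uses it."""
    owners_by_name = defaultdict(set)
    for full_name in full_names:
        # rpartition mirrors the owner stripping: "owner/repo" -> ("owner", "/", "repo")
        owner, _, name = full_name.rpartition("/")
        owners_by_name[name].add(owner)
    return {
        name: sorted(owners)
        for name, owners in owners_by_name.items()
        if len(owners) > 1
    }


collisions = find_name_collisions(
    ["ldraney/utils", "forgejo_admin/utils", "forgejo_admin/pal-e-docs"]
)
print(collisions)
```

Any name reported here shares a single tier across all of its owners under the current design.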

SOP COMPLIANCE

  • PR body has: Summary, Changes, Test Plan, Related
  • No secrets, .env files, or credentials committed
  • No unnecessary file changes (main.py formatting is minor)
  • Commit messages align with issue scope
  • Closes #2 referenced correctly
  • Parent issue cross-reference (ldraney/DORA#1) included
  • ruff check and ruff format reported clean

PROCESS OBSERVATIONS

  • Deployment frequency: This is a config-driven feature addition. The YAML-based tier mapping means future tier changes are config-only (no code changes), which is excellent for deployment frequency.
  • Change failure risk: Low. The tier label is purely additive to metrics. Default-to-tier-3 means the exporter never fails due to missing config. Graceful degradation on missing tiers file eliminates a class of startup failures.
  • Documentation gap: The tiers.yaml schema (what keys are valid, what values mean) is documented in code docstrings but not in a README. Consider adding a section to the project README for operator reference.

VERDICT: APPROVED

ldraney deleted branch 2-add-tier-label-to-all-prometheus-metrics 2026-06-13 20:05:38 +00:00