Fix landscaping-assistant dashboard metric names #403

Open
ldraney wants to merge 15 commits from fix-dashboard-metric-names into main
Owner

Closes #401

Summary

The dashboard queries used metric names assumed from the docs. Validating them against the live /metrics endpoint turned up 8 mismatched names, all of which are now corrected.

Changes

  • terraform/dashboards/landscaping-assistant-golden-signals.json — fix all PromQL expressions to match actual yabeda output

Test Plan

  • [x] All metric names verified against curl https://landscaping-assistant.tail5b443a.ts.net/metrics
  • [ ] tofu apply to update ConfigMap
  • [ ] Panels show data in Grafana (at least health check traffic visible)

Review Checklist

  • [x] Histogram names include _seconds_ suffix (yabeda convention)
  • [x] Puma gauge names match puma-stats exporter (no threads_ prefix)
  • [x] Panel titles and legends updated to match new metric semantics

Acceptance Criteria

  • All 12 panels resolve queries without "No data" (given active traffic)

Related Notes

  • Follow-up to PR #402
  • Validated by curling live /metrics endpoint
Fix dashboard metric names to match actual yabeda output
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
3a596edee6
Validated against live /metrics endpoint. Fixes:
- rails_request_duration_bucket → rails_request_duration_seconds_bucket
- rails_db_runtime_bucket → rails_db_runtime_seconds_bucket
- rails_view_runtime_bucket → rails_view_runtime_seconds_bucket
- puma_threads_running → puma_running
- puma_threads_total → puma_max_threads
- puma_threads_backlog → puma_backlog
- puma_threads_pool_capacity → puma_pool_capacity
- puma_workers (nonexistent) → puma_requests_count

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #403 Review

DOMAIN REVIEW

Tech stack: Terraform/Grafana dashboard JSON (infrastructure/observability domain).

File changed: terraform/dashboards/landscaping-assistant-golden-signals.json (1 file)

The diff is 659 additions / 193 deletions, but the vast majority is json.dumps reformatting (compact single-line objects expanded to multi-line). The actual semantic changes are 8 metric name corrections and 3 panel title/legend updates. I verified each substantive change:

Histogram names -- adding the _seconds unit suffix ahead of _bucket (yabeda convention):

  • rails_request_duration_bucket -> rails_request_duration_seconds_bucket (p50, p95, p99 panels) -- CORRECT
  • rails_db_runtime_bucket -> rails_db_runtime_seconds_bucket -- CORRECT
  • rails_view_runtime_bucket -> rails_view_runtime_seconds_bucket -- CORRECT

Puma gauge names -- removing threads_ prefix:

  • puma_threads_running -> puma_running (2 occurrences incl. utilization calc) -- CORRECT
  • puma_threads_total -> puma_max_threads (2 occurrences incl. utilization calc) -- CORRECT
  • puma_threads_backlog -> puma_backlog -- CORRECT
  • puma_threads_pool_capacity -> puma_pool_capacity -- CORRECT

Non-existent metric replacement:

  • puma_workers -> puma_requests_count -- CORRECT (panel title and legend also updated)
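
The corrections listed above can be expressed as a simple rename map. The following is a minimal sketch, not the method used in this PR: the helper names rename_metrics and find_stale are hypothetical, and the whole-identifier matching rule is an assumption about how a PromQL string should be rewritten safely.

```python
import re

# Old -> new metric names, copied from the corrections listed in this review.
RENAMES = {
    "rails_request_duration_bucket": "rails_request_duration_seconds_bucket",
    "rails_db_runtime_bucket": "rails_db_runtime_seconds_bucket",
    "rails_view_runtime_bucket": "rails_view_runtime_seconds_bucket",
    "puma_threads_running": "puma_running",
    "puma_threads_total": "puma_max_threads",
    "puma_threads_backlog": "puma_backlog",
    "puma_threads_pool_capacity": "puma_pool_capacity",
    "puma_workers": "puma_requests_count",
}

# Match whole identifiers only, so an old name is never rewritten
# when it is merely a fragment of a longer metric name.
_OLD_NAME = re.compile(
    r"(?<![A-Za-z0-9_:])("
    + "|".join(map(re.escape, RENAMES))
    + r")(?![A-Za-z0-9_:])"
)


def rename_metrics(expr: str) -> str:
    """Return expr with every old metric name replaced by its corrected name."""
    return _OLD_NAME.sub(lambda m: RENAMES[m.group(1)], expr)


def find_stale(expr: str) -> list[str]:
    """Return the old metric names still referenced in expr, sorted."""
    return sorted(set(_OLD_NAME.findall(expr)))
```

Running find_stale over every panel expression after the rename is one way to back up a "no stale references remain" claim; an empty list for each expression means none of the eight old names is left.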

Utilization formula consistency check: The utilization percentage formula puma_running / clamp_min(puma_max_threads, 1) * 100 correctly uses the updated metric names on both sides of the division. No stale references remain.
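
To make the arithmetic of that formula concrete, here is a small Python sketch that mirrors it. The thread counts are made-up sample values, not readings from the live endpoint, and clamp_min here is a plain Python stand-in for the PromQL function of the same name.

```python
def clamp_min(value: float, minimum: float) -> float:
    """Stand-in for PromQL clamp_min: never return less than minimum."""
    return max(value, minimum)


def utilization_percent(running: float, max_threads: float) -> float:
    """Mirror of: puma_running / clamp_min(puma_max_threads, 1) * 100."""
    return running / clamp_min(max_threads, 1) * 100


# Made-up sample values for illustration only.
print(utilization_percent(2, 4))  # half of the threads are busy
print(utilization_percent(0, 0))  # denominator is clamped to 1
```

The clamp keeps the denominator at 1 or more, so a reported maximum of zero threads still yields a finite percentage.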

Panel metadata updates: Title "Puma Workers & Pool Capacity" -> "Puma Requests & Pool Capacity" and legend labels updated to match new metric semantics ("max threads", "requests served"). These are accurate.

No structural changes: Panel IDs, grid positions, datasource references, schema version, and template variables are unchanged. The reformatting is cosmetic only.

BLOCKERS

None.

  • No new functionality requiring tests (this is a data-only config fix)
  • No user input handling, no secrets, no auth paths
  • No code logic -- only PromQL string corrections in a declarative JSON file

NITS

  1. JSON formatting churn: The reformatting from compact to expanded JSON creates noise in the diff (659+/193- for ~8 actual changes). Future dashboard edits could benefit from a consistent formatter (e.g., python -m json.tool or a pre-commit hook) so formatting is always normalized and diffs only show real changes. Not blocking -- this is a one-time cleanup.

  2. Test plan has unchecked items: The PR body shows tofu apply and "Panels show data in Grafana" as unchecked. These are post-merge validation steps, which is fine, but worth confirming they will be done after merge.
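
For nit 1, a formatter along these lines would keep future diffs limited to real changes. This is only a sketch: the function name normalize_json and the choice of a two-space indent are assumptions, not an existing convention in this repo.

```python
import json


def normalize_json(text: str) -> str:
    """Re-serialize JSON text with a fixed layout so formatting never drifts.

    Key order is preserved as written; only whitespace is normalized.
    """
    data = json.loads(text)
    return json.dumps(data, indent=2, ensure_ascii=False) + "\n"


# Hypothetical compact fragment, similar in shape to a dashboard panel.
compact = '{"title":"Puma Requests & Pool Capacity","targets":[{"expr":"puma_pool_capacity"}]}'
print(normalize_json(compact))
```

Because the function is idempotent, running it from a pre-commit hook on every dashboard file would leave already-normalized files untouched.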

SOP COMPLIANCE

  • [ ] Branch named after issue -- Branch is fix-dashboard-metric-names, not 401-fix-dashboard-metric-names. Missing the issue number prefix per {issue-number}-{kebab-case-purpose} convention. Non-blocking since the parent issue #401 is already closed and this is a hotfix follow-up to PR #402.
  • [x] PR body follows template -- Has Summary, Changes, Test Plan, Review Checklist, Acceptance Criteria, Related sections
  • [ ] Related references plan slug -- Related section references PR #402 but no plan slug. Acceptable for a quick-fix PR.
  • [x] No secrets committed -- No credentials, API keys, or sensitive data in the diff
  • [x] No scope creep -- All changes are strictly metric name corrections in the single dashboard file
  • [x] Commit messages -- PR title is descriptive and accurate

PROCESS OBSERVATIONS

  • Change failure risk: LOW. This is a config-only fix to a Grafana dashboard. Incorrect metric names mean "No data" panels, not outages. The fix was validated against the live /metrics endpoint, which is the right approach.
  • Deployment frequency: This is a fast follow-up to PR #402 (the original dashboard), showing good iteration speed.
  • Documentation gap: Consider adding a brief note in the repo (README or a conventions doc) about yabeda's _seconds_ histogram naming convention so future dashboard authors don't repeat the same mistake.
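
The manual validation against /metrics could also be scripted so the same mistake is caught before review. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the sample text is invented rather than copied from the live endpoint, and it only handles the plain Prometheus text exposition format (comment lines starting with #, then one sample per line).

```python
import re

# A metric name is the leading identifier on each non-comment sample line.
_METRIC_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*")


def exposed_metric_names(metrics_text: str) -> set[str]:
    """Collect the metric names present in a /metrics text payload."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def missing_metrics(referenced: list[str], metrics_text: str) -> list[str]:
    """Return referenced names that the endpoint does not expose, sorted."""
    return sorted(set(referenced) - exposed_metric_names(metrics_text))


# Invented sample payload for illustration only.
SAMPLE = """\
# TYPE rails_request_duration_seconds histogram
rails_request_duration_seconds_bucket{le="0.5"} 3
puma_running 2
puma_max_threads 5
"""

print(missing_metrics(["puma_running", "puma_threads_total"], SAMPLE))
```

Feeding it the names used in the dashboard expressions plus the body of a real curl against /metrics would flag any name the exporter does not emit.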

VERDICT: APPROVED

Clean, well-scoped fix. All 8 metric name corrections are consistent with yabeda/puma-stats-exporter conventions. No stale references, formula consistency verified, panel metadata updated to match. The JSON reformatting noise is cosmetic and harmless.

Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
c9ee1c6db7
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
dfff393c6d
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
36c23c941b
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
08cd6ad03b
Merge branch 'main' into fix-dashboard-metric-names
Some checks failed
ci/woodpecker/pr/terraform Pipeline failed
ci/woodpecker/push/terraform Pipeline was successful
6b7a960a8f
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
e8b70adef6
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
7a1681412c
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
ada6bdc073
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
62ec69a3b4
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
277a60f2be
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
bdcc676749
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
055df8171f
Merge branch 'main' into fix-dashboard-metric-names
Some checks failed
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline failed
94cae11fd4
Merge branch 'main' into fix-dashboard-metric-names
All checks were successful
ci/woodpecker/push/terraform Pipeline was successful
ci/woodpecker/pr/terraform Pipeline was successful
7955a36500