Add Stripe webhook alerting rules and basketball-api Grafana dashboard #272
Type
Bug
Lineage
Discovered during forgejo_admin/basketball-api#340 investigation. Stripe webhooks were silently failing for weeks with zero alerting. Depends on forgejo_admin/basketball-api#350 (webhook metrics must exist before platform can alert on them).
Repo
forgejo_admin/pal-e-platform
What Broke
There was no observability on Stripe webhook delivery. Webhooks failed for multiple reasons (secret mismatch, SDK crash, deploy downtime) and nothing alerted. 7+ payments ($870+) went unrecorded. The issue was only discovered when a user emailed about a different bug. basketball-api currently exposes only basketball_api_up on /metrics: no request rates, no webhook counters, no error tracking.
Repro Steps
Expected Behavior
Platform detects and alerts within minutes when:
pending jersey orders exist > 24 hours without jersey_option set (stale checkout)
Grafana dashboard shows webhook success/failure rate and event type breakdown.
Environment
basketball-api exists, scrapes /metrics
basketball_api_up gauge
webhook_received_total, webhook_processed_total, webhook_errors_total, webhook_last_received_timestamp
Acceptance Criteria
WebhookErrorRateHigh: rate(webhook_errors_total[5m]) > 0 for 5m, severity warning
WebhookStale: time() - webhook_last_received_timestamp > 86400 for 1h, severity warning
Related
project-westside-basketball: project this affects
forgejo_admin/basketball-api#350: dependency (metrics must exist first)
platform-architecture: monitoring module
terraform/modules/monitoring/main.tf (PrometheusRules), terraform/dashboards/ (Grafana dashboards)
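
For reference, the two alert rules named in the acceptance criteria could look roughly like this as a Prometheus Operator PrometheusRule manifest. This is a sketch, not the actual contents of terraform/modules/monitoring/main.tf. The alert names, expressions, durations, and severities come from this issue. The resource name, namespace, group name, and annotation text are assumptions for illustration, and the same structure would need to be expressed in whatever form the Terraform module already uses.

```yaml
# Sketch only. metadata.name, metadata.namespace, the group name, and the
# annotation text are hypothetical; the alert names, expr, for, and severity
# values are taken from the acceptance criteria in this issue.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: basketball-api-stripe-webhooks   # hypothetical name
  namespace: monitoring                  # hypothetical namespace
spec:
  groups:
    - name: stripe-webhooks              # hypothetical group name
      rules:
        # Fires when any webhook errors were recorded over the last 5 minutes
        # and the condition holds for a further 5 minutes.
        - alert: WebhookErrorRateHigh
          expr: rate(webhook_errors_total[5m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: Stripe webhook errors on basketball-api
        # Fires when the last received webhook is more than 24 hours
        # (86400 seconds) old and has stayed that way for 1 hour.
        - alert: WebhookStale
          expr: time() - webhook_last_received_timestamp > 86400
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: No Stripe webhook received in over 24 hours
```

Note that both expressions return no result when the underlying series does not exist, so neither alert can fire until the metrics from forgejo_admin/basketball-api#350 are actually being scraped.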