Clean up AlertManager: disable default kube-prometheus-stack rules #95
Type
Feature
Lineage
Standalone — discovered during observability audit session 2026-06-04. Prerequisite to #17.
Repo
ldraney/pal-e-platform
User Story
As the platform operator
I want to disable the ~100 default kube-prometheus-stack alert rules that are noise for a solo-dev cluster
So that AlertManager only fires for actionable, custom alerts I've written
Context
The cluster has 123 alert rules across 18 groups. ~95 are kube-prometheus-stack defaults designed for multi-team Kubernetes operations (alertmanager internals, kubelet health, 26 node-exporter rules, 23 Prometheus self-monitoring rules, API server SLOs). These never fire for real issues on this cluster and dilute attention from the ~28 custom rules that do.
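A minimal sketch of how per-group counts like these could be reproduced, assuming the JSON returned by the Prometheus `/api/v1/rules` endpoint has been saved or fetched already. The function name `count_alert_rules` and the sample group and rule names are hypothetical and used only for illustration; they are not taken from this cluster.

```python
import json
from collections import Counter


def count_alert_rules(rules_response: dict) -> Counter:
    """Count alerting rules per group in a Prometheus /api/v1/rules response."""
    counts = Counter()
    for group in rules_response["data"]["groups"]:
        # Groups can mix alerting and recording rules; keep only alerting ones.
        alerting = [r for r in group["rules"] if r.get("type") == "alerting"]
        counts[group["name"]] += len(alerting)
    return counts


# Hypothetical payload shaped like the Prometheus HTTP API response.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "groups": [
      {"name": "custom-app", "rules": [
        {"type": "alerting", "name": "AppDown"},
        {"type": "alerting", "name": "AppHighLatency"},
        {"type": "recording", "name": "app:requests:rate5m"}
      ]},
      {"name": "node-exporter", "rules": [
        {"type": "alerting", "name": "NodeFilesystemAlmostOutOfSpace"}
      ]}
    ]
  }
}
""")

per_group = count_alert_rules(sample)
total = sum(per_group.values())
print(dict(per_group), total)
```

Running the same function before and after the Helm values change would show which groups disappear and whether the total lands near the expected custom-only count.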
Currently 11 alerts are firing, all from custom rules or general health checks:
File Targets
Files the agent should modify:
pal-e-platform: kube-prometheus-stack Helm values (set defaultRules.create: false, or selectively disable rule groups)
pal-e-platform: review/remove stale blackbox probes for platform-validation and playme2k
pal-e-platform: review OOMKilled pods (raise memory limits or investigate root cause)
Files the agent should NOT touch:
Acceptance Criteria
Test Expectations
kubectl get prometheusrules -n monitoring shows only custom rules
/api/v1/rules alert count is ~28 (custom only)
Constraints
Checklist
Related