fix: remove invalid Slack receiver from alertmanager config #82
Lineage
plan-pal-e-platform → Phase 16 → 16a (Alertmanager Slack URL fix)
Repo
forgejo_admin/pal-e-platform
User Story
As a platform operator
I want the Prometheus Operator to reconcile alertmanager successfully
So that alertmanager config changes actually take effect and the PrometheusOperatorSyncFailed alert stops firing
Context
The Slack receiver in the kube-prometheus-stack helm values has api_url: ' ' (a single space). The Prometheus Operator tries to parse this as a URL every ~3 minutes and fails with unsupported scheme "" for URL. This causes:
- PrometheusOperatorSyncFailed alert firing continuously
- PrometheusOperatorReconcileErrors alert pending
Telegram is the primary alerting receiver. The Slack receiver was a placeholder that was never configured. It should be removed entirely.
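The removal described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the repository's actual code: the real terraform/main.tf is not shown in this issue, so the variable name telegram_bot_token, the chart repository settings, the choice of telegram as the route's receiver, and the Telegram fields are all illustrative placeholders.

```hcl
# Hypothetical sketch only: the real terraform/main.tf is not shown in this issue.
# Variable names, chart settings, and Telegram fields below are assumptions.

variable "telegram_bot_token" {
  type      = string
  sensitive = true
}

resource "helm_release" "kube_prometheus_stack" {
  name       = "kube-prometheus-stack"
  repository = "https://prometheus-community.github.io/helm-charts"
  chart      = "kube-prometheus-stack"

  values = [yamlencode({
    alertmanager = {
      config = {
        route = {
          receiver = "telegram" # assumed default receiver
          routes   = []         # no Slack sub-route
        }
        # Static list with no "slack" entry, so the operator never sees
        # a slack_configs[0].api_url of " ".
        receivers = [
          { name = "default" },
          {
            name = "telegram"
            telegram_configs = [{
              chat_id = 123456789 # placeholder value
            }]
          },
        ]
      }
    }
  })]

  # Telegram sits at index 1 of the list above, so this path is stable.
  set_sensitive {
    name  = "alertmanager.config.receivers[1].telegram_configs[0].bot_token"
    value = var.telegram_bot_token
  }
}
```

With a static list like this, any index-based set_sensitive path depends only on the order of the remaining receivers, so it is worth re-checking those indices whenever the list changes.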
File Targets
Files the agent should modify:
- terraform/main.tf: remove the slack receiver from the alertmanager.config.receivers section in the kube_prometheus_stack helm_release values. Also remove any slack route references.
Files the agent should NOT touch:
Acceptance Criteria
- tofu validate passes
- tofu fmt produces no changes
- tofu plan -lock=false shows only the alertmanager config change
Test Expectations
- tofu validate: must pass
- tofu plan -lock=false: review output for expected changes only
Constraints
- Run tofu fmt before committing
- Run tofu validate before committing
- Do not run tofu apply; Betty Sue handles apply after merge
- Include tofu plan -lock=false output in the PR description
Checklist
- tofu validate passes
Related
- phase-platform-16-alert-tuning: parent phase
- pal-e-platform: project
PR #83 Review
DOMAIN REVIEW
Tech stack: Terraform/Helm (kube-prometheus-stack alertmanager config), Woodpecker CI pipeline, Makefile.
Terraform changes (terraform/main.tf):
- alertmanager.config.receivers: the concat() with a conditional Slack block is replaced by a clean static list containing only default and telegram. Correct.
- route.routes: simplified to routes = []. Correct -- Telegram is the default receiver, so no sub-routes are needed.
- The set_sensitive block for slack_configs[0].api_url is removed. Correct.
- The remaining set_sensitive blocks reference receivers[1], which is still correct after the removal because Telegram remains index 1 in the simplified list. No index drift.
Variable cleanup (terraform/variables.tf):
- slack_webhook_url variable declaration removed (had default = "", sensitive = true). Clean removal.
CI cleanup (.woodpecker.yaml):
- TF_VAR_slack_webhook_url removed from both the plan (line 47-48) and apply (line 116-117) steps. The Woodpecker secret tf_var_slack_webhook_url will remain in Woodpecker's secret store as a harmless orphan -- no functional impact. It can be cleaned up separately if desired.
Makefile cleanup:
- slack_webhook_url removed from TF_SECRET_VARS. The next make tofu-secrets will no longer render it to secrets.auto.tfvars. Correct.
Salt pillar (intentionally retained):
- salt/pillar/secrets_registry.sls (line 105-111) and salt/pillar/secrets/platform.sls (line 261) retain the slack_webhook_url entry. The registry already marks it as dormant -- value 'unused'. The PR body explicitly documents this decision. Acceptable -- the Salt pillar serves as the historical backup/audit layer.
tofu fmt / tofu validate:
- tofu fmt -recursive and tofu validate passed. The CI validate step in .woodpecker.yaml runs both tofu fmt -check -recursive and tofu validate on PR events, providing automated verification.
tofu plan output:
- 0 to add, 2 to change, 0 to destroy. The two changes are helm_release.kube_prometheus_stack (alertmanager config) and kubernetes_secret_v1.dora_exporter (unrelated state drift from write-only attributes). No unexpected resources.
Post-merge note: The local terraform/secrets.auto.tfvars still contains slack_webhook_url = "unused" on disk (gitignored). After merge, it will reference an undeclared variable until the next make tofu-secrets regeneration. OpenTofu emits a warning (not an error) for undefined variables in .auto.tfvars. The CI pipeline does not use this file (secrets come via TF_VAR_* env vars), so CI is unaffected. A manual make tofu-secrets after merge will regenerate the file cleanly. No action required.
BLOCKERS
None.
NITS
- tf_var_slack_webhook_url remains in Woodpecker's secret store. Harmless, but could be cleaned up to reduce confusion. Low priority.
SOP COMPLIANCE
- Branch 82-fix-remove-invalid-slack-receiver-from-a references #82
- Closes #82
- tofu plan output included in PR body (per CLAUDE.md convention)
- tofu fmt and tofu validate run (per CLAUDE.md convention)
- .tfvars is gitignored
- fix: remove invalid Slack receiver from alertmanager config
PROCESS OBSERVATIONS
- tofu plan confirms only 2 resources change, both expected. No new infrastructure introduced.
- References to slack_webhook_url in the Terraform/CI path were updated. The Salt pillar retention decision is documented in the PR body. Thorough.
VERDICT: APPROVED