fix: add monitoring ingress to Keycloak NetworkPolicy + use internal probe URL #117

Merged
forgejo_admin merged 1 commit from 111-keycloak-probe-networkpolicy into main 2026-03-21 16:01:49 +00:00

Summary

Fixes the Keycloak blackbox probe alert by adding monitoring namespace ingress to the Keycloak NetworkPolicy and switching the probe URL from the external HTTPS endpoint to the cluster-internal HTTP endpoint with a /realms/master health path.

Changes

  • terraform/network-policies.tf: Added monitoring namespace to netpol_keycloak ingress rules, matching the pattern used by every other service NetworkPolicy in the file
  • terraform/main.tf: Changed Keycloak blackbox probe URL from https://keycloak.tail5b443a.ts.net to http://keycloak.keycloak.svc.cluster.local:80/realms/master
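
As a rough sketch of the first change, assuming the policy is declared as a kubernetes_manifest resource: the resource name and the two namespaceSelector entries come from this PR's plan output, the keycloak namespace is inferred from the service DNS name, and the policy name, podSelector, and policyTypes are assumptions for illustration only.

```hcl
# Sketch only, not the repository's actual file.
resource "kubernetes_manifest" "netpol_keycloak" {
  manifest = {
    apiVersion = "networking.k8s.io/v1"
    kind       = "NetworkPolicy"
    metadata = {
      name      = "keycloak-ingress" # hypothetical policy name
      namespace = "keycloak"         # inferred from keycloak.keycloak.svc.cluster.local
    }
    spec = {
      podSelector = {}          # assumption: policy covers every pod in the namespace
      policyTypes = ["Ingress"] # assumption
      ingress = [
        # Existing rule: traffic from the tailscale namespace
        { from = [{ namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "tailscale" } } }] },
        # Added by this PR: traffic from the monitoring namespace (blackbox probe)
        { from = [{ namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "monitoring" } } }] },
      ]
    }
  }
}
```

NetworkPolicy ingress entries are additive, so the new entry only widens what is allowed and leaves the existing tailscale rule untouched.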

tofu plan Output

Plan shows the two expected changes from this PR (plus 4 unrelated state drift items from #113):

# kubernetes_manifest.netpol_keycloak will be updated in-place
  ~ ingress = [
        { from = [{ namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "tailscale" } } }] },
      + { from = [{ namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "monitoring" } } }] },
    ]

# helm_release.blackbox_exporter will be updated in-place
  -     "url": "https://keycloak.tail5b443a.ts.net"
  +     "url": "http://keycloak.keycloak.svc.cluster.local:80/realms/master"

Plan: 0 to add, 6 to change, 0 to destroy.

The 4 additional changes are state drift from previously merged PRs (tracked in #113), not introduced by this PR.

Test Plan

  • tofu plan -lock=false shows only expected keycloak-related changes from this PR
  • After apply, BlackboxProbeFailed alert for keycloak clears within 2 minutes
  • Verify probe_success{instance="keycloak"} returns 1 in Prometheus

Review Checklist

  • tofu fmt -- no formatting changes needed
  • tofu validate -- passes
  • tofu plan -lock=false -- only expected changes, 0 to add, 0 to destroy
  • NetworkPolicy pattern matches existing services (forgejo, woodpecker, harbor, minio all have monitoring ingress)
  • README roadmap -- no update needed

Related

  • Plan: plan-pal-e-platform
  • Forgejo issue: #111
  • Parent epic: #109

Closes #111

fix: add monitoring ingress to Keycloak NetworkPolicy + use internal probe URL
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
8c947597e9
The Keycloak blackbox probe was failing because:
1. The NetworkPolicy blocked monitoring namespace ingress to keycloak
2. The probe URL used the external HTTPS endpoint instead of cluster-internal

Adds monitoring namespace to Keycloak NetworkPolicy ingress (matching
the pattern used by every other service). Changes the blackbox probe
URL to http://keycloak.keycloak.svc.cluster.local:80/realms/master
which validates Keycloak is serving its master realm.

Closes #111

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
Owner

Self-review: LGTM

Two changes, both following established patterns:

  1. NetworkPolicy (network-policies.tf): Added monitoring namespace ingress to netpol_keycloak. Every other service in this file (forgejo, woodpecker, harbor, minio, cnpg-system) already has this rule -- keycloak was the only one missing it.

  2. Probe URL (main.tf): Changed from external https://keycloak.tail5b443a.ts.net to internal http://keycloak.keycloak.svc.cluster.local:80/realms/master. The /realms/master path validates Keycloak is actually serving its realm, not just that the HTTP port is open.

tofu plan confirmed only these two resources change (plus 4 unrelated state drift items from #113). No issues found.

Author
Owner

PR #117 Review

DOMAIN REVIEW

Tech stack: Terraform (OpenTofu) / Kubernetes NetworkPolicies / Helm (blackbox exporter)

NetworkPolicy change (terraform/network-policies.tf):

  • Adds monitoring namespace ingress to netpol_keycloak. This exactly matches the pattern used by every other platform service NetworkPolicy in the file (forgejo L45, woodpecker L66, harbor L88, minio L111, postgres L153, cnpg-system L192). Keycloak was the only service missing it. Correct fix.

Blackbox probe URL change (terraform/main.tf):

  • Switches keycloak probe from https://keycloak.tail5b443a.ts.net (external, Tailscale funnel) to http://keycloak.keycloak.svc.cluster.local:80/realms/master (cluster-internal).
  • This matches the pattern for all other platform-tier probes (forgejo, woodpecker, grafana, alertmanager, harbor, argocd, minio) which all use internal svc.cluster.local URLs.
  • The /realms/master path is not a dedicated health endpoint; it serves the master realm's public metadata and returns 200 once Keycloak is handling requests, so it works as a readiness signal. Good choice over the bare root URL, which would 302 redirect.
  • App-tier probes (pal-e-docs, pal-e-app, westside-app) correctly remain on external URLs to validate the full funnel path, so no consistency concern there.
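
To illustrate the platform-tier pattern described above, here is a minimal sketch of how the Keycloak target could be passed to the blackbox exporter chart. Only the resource name helm_release.blackbox_exporter and the Keycloak URL come from this PR; the monitoring namespace is inferred from the NetworkPolicy change, and the release name, chart, repository, local name, and serviceMonitor.targets layout are assumptions about how the values might be structured.

```hcl
locals {
  # Hypothetical local; only the keycloak URL is taken from this PR.
  blackbox_targets = [
    {
      name = "keycloak"
      # Cluster-internal service URL; the path checks that the master realm is served.
      url = "http://keycloak.keycloak.svc.cluster.local:80/realms/master"
    },
  ]
}

resource "helm_release" "blackbox_exporter" {
  name       = "blackbox-exporter"                                  # assumption
  namespace  = "monitoring"                                         # inferred
  repository = "https://prometheus-community.github.io/helm-charts" # assumption
  chart      = "prometheus-blackbox-exporter"                       # assumption

  values = [
    yamlencode({
      serviceMonitor = {
        enabled = true
        targets = local.blackbox_targets
      }
    })
  ]
}
```

Keeping the targets in one list would make the internal versus external split between platform-tier and app-tier probes easy to audit in a single place.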

Scope: Both changes are tightly scoped to the issue. No drift, no unrelated modifications.

BLOCKERS

None.

  • No new functionality requiring test coverage (this is infrastructure config, not application code)
  • No user input handling
  • No secrets or credentials in the diff
  • No DRY violations

NITS

  1. Merge conflict: PR shows mergeable: false. This is likely caused by PR #120 (basketball-api ingress to keycloak) which merged into the same region of network-policies.tf after this PR was created. The branch needs a rebase against main before merge.

  2. Duplicate PR #116: PR #116 has the identical title. If it is a stale duplicate, it should be closed to keep the PR list clean.

SOP COMPLIANCE

  • Branch named after issue (111-keycloak-probe-networkpolicy references issue #111)
  • PR body has Summary, Changes, Test Plan, Related sections
  • Related section references plan-pal-e-platform plan slug
  • Related section references parent epic #109
  • tofu fmt and tofu validate confirmed passing
  • tofu plan -lock=false output included with explanation of unrelated drift (#113)
  • No secrets committed
  • No unnecessary file changes (scope is tight: 2 files, 3 lines)
  • Commit message is descriptive

PROCESS OBSERVATIONS

  • MTTR: This PR directly resolves a firing BlackboxProbeFailed alert for keycloak. Fast fix, minimal risk. Good MTTR contribution.
  • Change failure risk: Very low. The NetworkPolicy addition is purely additive (does not remove existing rules). The probe URL change swaps to an already-proven pattern used by 7 other platform services.
  • Rebase needed: The merge conflict from PR #120 must be resolved. This is mechanical (adjacent lines in the same ingress block) and should be trivial.
  • State drift: The PR body transparently documents 4 unrelated plan changes from #113, which is good practice.

VERDICT: APPROVED

Clean, minimal fix. Both changes follow established patterns in the codebase. No blockers. Rebase against main required to resolve the merge conflict from PR #120 before merge.

forgejo_admin force-pushed 111-keycloak-probe-networkpolicy from 8c947597e9 to d595bd94c7 2026-03-21 03:26:58 +00:00

  • 8c947597e9: All checks were successful (ci/woodpecker/push/woodpecker: Pipeline was successful)
  • d595bd94c7: Some checks failed (ci/woodpecker/push/woodpecker: Pipeline failed; ci/woodpecker/pull_request_closed/woodpecker: Pipeline failed)
forgejo_admin deleted branch 111-keycloak-probe-networkpolicy 2026-03-21 16:01:49 +00:00