fix: add monitoring ingress to postgres NetworkPolicy #95

Merged
forgejo_admin merged 1 commit from 94-re-enable-netpol-fix-postgres into main 2026-03-17 03:12:24 +00:00

Summary

Postgres NetworkPolicy was missing monitoring namespace ingress, causing Prometheus to lose the postgres scrape target after NetworkPolicy re-enablement (37/38 UP, 1 DOWN). All other 8 namespace policies already had monitoring ingress.

Changes

  • terraform/network-policies.tf — add monitoring namespace to netpol_postgres ingress rules (1 line)

Test Plan

  • tofu plan shows only postgres NetworkPolicy changing
  • After apply, Prometheus targets: 38/38 UP
  • All services still reachable via Tailscale funnels

Review Checklist

  • tofu fmt -check passes
  • tofu validate passes
  • One-line change, no unrelated modifications
  • Closes #94
  • bug-kube-router-ipset-empty — root investigation
  • sop-network-security — three-layer security architecture
## Summary Postgres NetworkPolicy was missing `monitoring` namespace ingress, causing Prometheus to lose the postgres scrape target after NetworkPolicy re-enablement (37/38 UP, 1 DOWN). All other 8 namespace policies already had monitoring ingress. ## Changes - `terraform/network-policies.tf` — add `monitoring` namespace to `netpol_postgres` ingress rules (1 line) ## Test Plan - [ ] `tofu plan` shows only postgres NetworkPolicy changing - [ ] After apply, Prometheus targets: 38/38 UP - [ ] All services still reachable via Tailscale funnels ## Review Checklist - [x] `tofu fmt -check` passes - [x] `tofu validate` passes - [x] One-line change, no unrelated modifications ## Related - Closes #94 - `bug-kube-router-ipset-empty` — root investigation - `sop-network-security` — three-layer security architecture
fix: add monitoring ingress to postgres NetworkPolicy
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
711dfafa4f
Postgres NetworkPolicy was missing monitoring namespace, blocking
Prometheus from scraping CNPG postgres metrics (37/38 targets UP,
postgres DOWN). All other 8 namespace policies already included
monitoring ingress.

Discovered during NetworkPolicy re-enablement after kube-router
ipset investigation confirmed the controller is functional.

Closes #94

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author
Owner

Tofu Plan Output

tailscale_acl.this: Refreshing state... [id=acl]
helm_release.nvidia_device_plugin: Refreshing state... [id=nvidia-device-plugin]
kubernetes_namespace_v1.woodpecker: Refreshing state... [id=woodpecker]
kubernetes_namespace_v1.tailscale: Refreshing state... [id=tailscale]
kubernetes_namespace_v1.postgres: Refreshing state... [id=postgres]
kubernetes_namespace_v1.monitoring: Refreshing state... [id=monitoring]
kubernetes_namespace_v1.forgejo: Refreshing state... [id=forgejo]
kubernetes_namespace_v1.keycloak: Refreshing state... [id=keycloak]
kubernetes_namespace_v1.harbor: Refreshing state... [id=harbor]
kubernetes_namespace_v1.minio: Refreshing state... [id=minio]
kubernetes_namespace_v1.ollama: Refreshing state... [id=ollama]
data.kubernetes_namespace_v1.tofu_state: Reading...
data.kubernetes_namespace_v1.pal_e_docs: Reading...
kubernetes_persistent_volume_claim_v1.keycloak_data: Refreshing state... [id=keycloak/keycloak-data]
kubernetes_namespace_v1.cnpg_system: Refreshing state... [id=cnpg-system]
kubernetes_service_v1.keycloak: Refreshing state... [id=keycloak/keycloak]
data.kubernetes_namespace_v1.pal_e_docs: Read complete after 0s [id=pal-e-docs]
data.kubernetes_namespace_v1.tofu_state: Read complete after 0s [id=tofu-state]
kubernetes_secret_v1.keycloak_admin: Refreshing state... [id=keycloak/keycloak-admin]
kubernetes_service_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
helm_release.kube_prometheus_stack: Refreshing state... [id=kube-prometheus-stack]
kubernetes_secret_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
kubernetes_config_map_v1.uptime_dashboard: Refreshing state... [id=monitoring/uptime-dashboard]
helm_release.loki_stack: Refreshing state... [id=loki-stack]
helm_release.forgejo: Refreshing state... [id=forgejo]
kubernetes_secret_v1.woodpecker_db_credentials: Refreshing state... [id=woodpecker/woodpecker-db-credentials]
kubernetes_manifest.netpol_minio: Refreshing state...
kubernetes_manifest.netpol_keycloak: Refreshing state...
kubernetes_manifest.netpol_monitoring: Refreshing state...
kubernetes_manifest.netpol_harbor: Refreshing state...
kubernetes_manifest.netpol_postgres: Refreshing state...
kubernetes_manifest.netpol_forgejo: Refreshing state...
kubernetes_manifest.netpol_woodpecker: Refreshing state...
helm_release.tailscale_operator: Refreshing state... [id=tailscale-operator]
kubernetes_service_v1.embedding_worker_metrics: Refreshing state... [id=pal-e-docs/embedding-worker-metrics]
kubernetes_secret_v1.paledocs_db_url: Refreshing state... [id=pal-e-docs/paledocs-db-url]
kubernetes_service_account_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
kubernetes_role_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
helm_release.cnpg: Refreshing state... [id=cnpg]
kubernetes_manifest.netpol_ollama: Refreshing state...
helm_release.ollama: Refreshing state... [id=ollama]
kubernetes_role_binding_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
kubernetes_manifest.netpol_cnpg_system: Refreshing state...
kubernetes_deployment_v1.keycloak: Refreshing state... [id=keycloak/keycloak]
kubernetes_config_map_v1.dora_dashboard: Refreshing state... [id=monitoring/dora-dashboard]
helm_release.blackbox_exporter: Refreshing state... [id=blackbox-exporter]
kubernetes_manifest.embedding_alerts: Refreshing state...
kubernetes_deployment_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
kubernetes_config_map_v1.pal_e_docs_dashboard: Refreshing state... [id=monitoring/pal-e-docs-dashboard]
helm_release.harbor: Refreshing state... [id=harbor]
kubernetes_manifest.embedding_worker_service_monitor: Refreshing state...
helm_release.minio: Refreshing state... [id=minio]
kubernetes_manifest.dora_exporter_service_monitor: Refreshing state...
kubernetes_manifest.blackbox_alerts: Refreshing state...
kubernetes_config_map_v1.grafana_loki_datasource: Refreshing state... [id=monitoring/grafana-loki-datasource]
kubernetes_ingress_v1.forgejo_funnel: Refreshing state... [id=forgejo/forgejo-funnel]
kubernetes_ingress_v1.grafana_funnel: Refreshing state... [id=monitoring/grafana-funnel]
kubernetes_ingress_v1.alertmanager_funnel: Refreshing state... [id=monitoring/alertmanager-funnel]
kubernetes_ingress_v1.keycloak_funnel: Refreshing state... [id=keycloak/keycloak-funnel]
kubernetes_ingress_v1.harbor_funnel: Refreshing state... [id=harbor/harbor-funnel]
minio_iam_user.tf_backup: Refreshing state... [id=tf-backup]
minio_s3_bucket.tf_state_backups: Refreshing state... [id=tf-state-backups]
minio_iam_policy.tf_backup: Refreshing state... [id=tf-backup]
minio_s3_bucket.postgres_wal: Refreshing state... [id=postgres-wal]
kubernetes_ingress_v1.minio_api_funnel: Refreshing state... [id=minio/minio-api-funnel]
minio_iam_user.cnpg: Refreshing state... [id=cnpg]
kubernetes_ingress_v1.minio_funnel: Refreshing state... [id=minio/minio-funnel]
minio_iam_policy.cnpg_wal: Refreshing state... [id=cnpg-wal]
minio_s3_bucket.assets: Refreshing state... [id=assets]
minio_iam_user_policy_attachment.tf_backup: Refreshing state... [id=tf-backup-20260314163610110100000001]
kubernetes_secret_v1.tf_backup_s3_creds: Refreshing state... [id=tofu-state/tf-backup-s3-creds]
minio_iam_user_policy_attachment.cnpg: Refreshing state... [id=cnpg-20260302210642491000000001]
kubernetes_secret_v1.cnpg_s3_creds: Refreshing state... [id=postgres/cnpg-s3-creds]
kubernetes_secret_v1.woodpecker_cnpg_s3_creds: Refreshing state... [id=woodpecker/cnpg-s3-creds]
kubernetes_cron_job_v1.tf_state_backup: Refreshing state... [id=tofu-state/tf-state-backup]
kubernetes_cron_job_v1.cnpg_backup_verify: Refreshing state... [id=postgres/cnpg-backup-verify]
kubernetes_manifest.woodpecker_postgres: Refreshing state...
helm_release.woodpecker: Refreshing state... [id=woodpecker]
kubernetes_manifest.woodpecker_postgres_scheduled_backup: Refreshing state...
kubernetes_ingress_v1.woodpecker_funnel: Refreshing state... [id=woodpecker/woodpecker-funnel]

OpenTofu used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

OpenTofu will perform the following actions:

  # helm_release.woodpecker will be updated in-place
  ~ resource "helm_release" "woodpecker" {
        id                         = "woodpecker"
      ~ metadata                   = [
          - {
              - app_version    = "3.13.0"
              - chart          = "woodpecker"
              - first_deployed = 1773625582
              - last_deployed  = 1773710708
              - name           = "woodpecker"
              - namespace      = "woodpecker"
              - notes          = <<-EOT
                    1. Get the application URL by running these commands:
                      export POD_NAME=$(kubectl get pods --namespace woodpecker -l "app.kubernetes.io/name=server,app.kubernetes.io/instance=woodpecker" -o jsonpath="{.items[0].metadata.name}")
                      export CONTAINER_PORT=$(kubectl get pod --namespace woodpecker $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
                      echo "Visit http://127.0.0.1:8080 to use your application"
                      kubectl --namespace woodpecker port-forward $POD_NAME 8080:$CONTAINER_PORT
                EOT
              - revision       = 3
              - values         = jsonencode(
                    {
                      - agent  = {
                          - enabled      = true
                          - env          = {
                              - WOODPECKER_AGENT_SECRET              = "(sensitive value)"
                              - WOODPECKER_BACKEND                   = "kubernetes"
                              - WOODPECKER_BACKEND_K8S_NAMESPACE     = "woodpecker"
                              - WOODPECKER_BACKEND_K8S_STORAGE_CLASS = "local-path"
                              - WOODPECKER_BACKEND_K8S_VOLUME_SIZE   = "1Gi"
                            }
                          - replicaCount = 1
                          - resources    = {
                              - limits   = {
                                  - memory = "256Mi"
                                }
                              - requests = {
                                  - cpu    = "50m"
                                  - memory = "64Mi"
                                }
                            }
                        }
                      - server = {
                          - env              = {
                              - WOODPECKER_ADMIN               = "forgejo_admin"
                              - WOODPECKER_AGENT_SECRET        = "(sensitive value)"
                              - WOODPECKER_DATABASE_DATASOURCE = "postgres://woodpecker:kM3L4AhLNiuMhIY7tMQ@woodpecker-db-rw.woodpecker.svc.cluster.local:5432/woodpecker?sslmode=disable"
                              - WOODPECKER_DATABASE_DRIVER     = "postgres"
                              - WOODPECKER_FORGEJO             = "true"
                              - WOODPECKER_FORGEJO_CLIENT      = "(sensitive value)"
                              - WOODPECKER_FORGEJO_CLONE_URL   = "http://forgejo-http.forgejo.svc.cluster.local:80"
                              - WOODPECKER_FORGEJO_SECRET      = "(sensitive value)"
                              - WOODPECKER_FORGEJO_URL         = "https://forgejo.tail5b443a.ts.net"
                              - WOODPECKER_HOST                = "https://woodpecker.tail5b443a.ts.net"
                            }
                          - persistentVolume = {
                              - enabled      = true
                              - size         = "5Gi"
                              - storageClass = "local-path"
                            }
                          - resources        = {
                              - limits   = {
                                  - memory = "512Mi"
                                }
                              - requests = {
                                  - cpu    = "50m"
                                  - memory = "128Mi"
                                }
                            }
                          - statefulSet      = {
                              - replicaCount = 1
                            }
                        }
                    }
                )
              - version        = "3.5.1"
            },
        ] -> (known after apply)
        name                       = "woodpecker"
      ~ status                     = "pending-upgrade" -> "deployed"
        # (25 unchanged attributes hidden)

      - set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      - set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      + set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      + set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }

        # (2 unchanged blocks hidden)
    }

  # kubernetes_cron_job_v1.cnpg_backup_verify will be updated in-place
  ~ resource "kubernetes_cron_job_v1" "cnpg_backup_verify" {
        id = "postgres/cnpg-backup-verify"

      ~ spec {
            # (6 unchanged attributes hidden)

          ~ job_template {
              ~ spec {
                    # (7 unchanged attributes hidden)

                  ~ template {
                      ~ spec {
                            # (12 unchanged attributes hidden)

                          ~ container {
                              ~ args                       = [
                                  - <<-EOT
                                        set -euo pipefail
                                        
                                        apk add --no-cache curl >/dev/null
                                        
                                        # Install mc (MinIO Client)
                                        curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc
                                        chmod +x /tmp/mc
                                        
                                        # Configure MinIO alias
                                        /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY"
                                        
                                        ERRORS=0
                                        MAX_AGE_HOURS=25  # Allow 1h buffer beyond 24h
                                        
                                        # Check each backup path prefix
                                        for PREFIX in "pal-e-postgres" "woodpecker"; do
                                          echo "=== Checking backups for $PREFIX ==="
                                        
                                          # List objects in the backup path
                                          OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true)
                                        
                                          if [ -z "$OBJECTS" ]; then
                                            echo "ERROR: No backup objects found for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                            continue
                                          fi
                                        
                                          echo "Found backup objects for $PREFIX:"
                                          echo "$OBJECTS"
                                        
                                          # Check WAL directory for recent files
                                          RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$RECENT" ]; then
                                            echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                          else
                                            echo "OK: Recent WAL files found for $PREFIX"
                                          fi
                                        done
                                        
                                        if [ "$ERRORS" -gt 0 ]; then
                                          echo "FAILED: $ERRORS backup verification errors"
                                          exit 1
                                        fi
                                        
                                        echo "All backup verifications passed."
                                    EOT,
                                  + <<-EOT
                                        set -euo pipefail
                                        
                                        apk add --no-cache curl >/dev/null
                                        
                                        # Install mc (MinIO Client)
                                        curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc
                                        chmod +x /tmp/mc
                                        
                                        # Configure MinIO alias
                                        /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY"
                                        
                                        ERRORS=0
                                        MAX_AGE_HOURS=25  # Allow 1h buffer beyond 24h
                                        
                                        # Check each backup path prefix
                                        for PREFIX in "pal-e-postgres" "woodpecker"; do
                                          echo "=== Checking backups for $PREFIX ==="
                                        
                                          # List objects in the backup path
                                          OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true)
                                        
                                          if [ -z "$OBJECTS" ]; then
                                            echo "ERROR: No backup objects found for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                            continue
                                          fi
                                        
                                          echo "Found backup objects for $PREFIX:"
                                          echo "$OBJECTS"
                                        
                                          # Check if WAL directory has content (new clusters may not have archived WALs yet)
                                          WAL_EXISTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/wals/" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$WAL_EXISTS" ]; then
                                            echo "SKIP: No WAL archives yet for $PREFIX (new cluster, base backup only)"
                                            continue
                                          fi
                                        
                                          # Check WAL directory for recent files
                                          RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$RECENT" ]; then
                                            echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                          else
                                            echo "OK: Recent WAL files found for $PREFIX"
                                          fi
                                        done
                                        
                                        if [ "$ERRORS" -gt 0 ]; then
                                          echo "FAILED: $ERRORS backup verification errors"
                                          exit 1
                                        fi
                                        
                                        echo "All backup verifications passed."
                                    EOT,
                                ]
                                name                       = "verify"
                                # (8 unchanged attributes hidden)

                                # (3 unchanged blocks hidden)
                            }
                        }

                        # (1 unchanged block hidden)
                    }
                }

                # (1 unchanged block hidden)
            }
        }

        # (1 unchanged block hidden)
    }

  # kubernetes_manifest.netpol_postgres will be updated in-place
  ~ resource "kubernetes_manifest" "netpol_postgres" {
      ~ manifest = {
          ~ spec       = {
              ~ ingress     = [
                    # (1 unchanged element hidden)
                    {
                        from = [
                            {
                                namespaceSelector = {
                                    matchLabels = {
                                        "kubernetes.io/metadata.name" = "cnpg-system"
                                    }
                                }
                            },
                        ]
                    },
                  + {
                      + from = [
                          + {
                              + namespaceSelector = {
                                  + matchLabels = {
                                      + "kubernetes.io/metadata.name" = "monitoring"
                                    }
                                }
                            },
                        ]
                    },
                ]
                # (2 unchanged attributes hidden)
            }
            # (3 unchanged attributes hidden)
        }
      ~ object   = {
          ~ spec       = {
              ~ ingress     = [
                    # (1 unchanged element hidden)
                    {
                        from  = [
                            {
                                ipBlock           = {
                                    cidr   = null
                                    except = null
                                }
                                namespaceSelector = {
                                    matchExpressions = null
                                    matchLabels      = {
                                        "kubernetes.io/metadata.name" = "cnpg-system"
                                    }
                                }
                                podSelector       = {
                                    matchExpressions = null
                                    matchLabels      = null
                                }
                            },
                        ]
                        ports = null
                    },
                  + {
                      + from  = [
                          + {
                              + ipBlock           = {
                                  + cidr   = (known after apply)
                                  + except = (known after apply)
                                }
                              + namespaceSelector = {
                                  + matchExpressions = (known after apply)
                                  + matchLabels      = {
                                      + "kubernetes.io/metadata.name" = "monitoring"
                                    }
                                }
                              + podSelector       = {
                                  + matchExpressions = (known after apply)
                                  + matchLabels      = (known after apply)
                                }
                            },
                        ]
                      + ports = (known after apply)
                    },
                ]
                # (3 unchanged attributes hidden)
            }
            # (3 unchanged attributes hidden)
        }
    }

Plan: 0 to add, 3 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so OpenTofu can't
guarantee to take exactly these actions if you run "tofu apply" now.
## Tofu Plan Output ``` tailscale_acl.this: Refreshing state... [id=acl] helm_release.nvidia_device_plugin: Refreshing state... [id=nvidia-device-plugin] kubernetes_namespace_v1.woodpecker: Refreshing state... [id=woodpecker] kubernetes_namespace_v1.tailscale: Refreshing state... [id=tailscale] kubernetes_namespace_v1.postgres: Refreshing state... [id=postgres] kubernetes_namespace_v1.monitoring: Refreshing state... [id=monitoring] kubernetes_namespace_v1.forgejo: Refreshing state... [id=forgejo] kubernetes_namespace_v1.keycloak: Refreshing state... [id=keycloak] kubernetes_namespace_v1.harbor: Refreshing state... [id=harbor] kubernetes_namespace_v1.minio: Refreshing state... [id=minio] kubernetes_namespace_v1.ollama: Refreshing state... [id=ollama] data.kubernetes_namespace_v1.tofu_state: Reading... data.kubernetes_namespace_v1.pal_e_docs: Reading... kubernetes_persistent_volume_claim_v1.keycloak_data: Refreshing state... [id=keycloak/keycloak-data] kubernetes_namespace_v1.cnpg_system: Refreshing state... [id=cnpg-system] kubernetes_service_v1.keycloak: Refreshing state... [id=keycloak/keycloak] data.kubernetes_namespace_v1.pal_e_docs: Read complete after 0s [id=pal-e-docs] data.kubernetes_namespace_v1.tofu_state: Read complete after 0s [id=tofu-state] kubernetes_secret_v1.keycloak_admin: Refreshing state... [id=keycloak/keycloak-admin] kubernetes_service_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter] helm_release.kube_prometheus_stack: Refreshing state... [id=kube-prometheus-stack] kubernetes_secret_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter] kubernetes_config_map_v1.uptime_dashboard: Refreshing state... [id=monitoring/uptime-dashboard] helm_release.loki_stack: Refreshing state... [id=loki-stack] helm_release.forgejo: Refreshing state... [id=forgejo] kubernetes_secret_v1.woodpecker_db_credentials: Refreshing state... [id=woodpecker/woodpecker-db-credentials] kubernetes_manifest.netpol_minio: Refreshing state... kubernetes_manifest.netpol_keycloak: Refreshing state... kubernetes_manifest.netpol_monitoring: Refreshing state... kubernetes_manifest.netpol_harbor: Refreshing state... kubernetes_manifest.netpol_postgres: Refreshing state... kubernetes_manifest.netpol_forgejo: Refreshing state... kubernetes_manifest.netpol_woodpecker: Refreshing state... helm_release.tailscale_operator: Refreshing state... [id=tailscale-operator] kubernetes_service_v1.embedding_worker_metrics: Refreshing state... [id=pal-e-docs/embedding-worker-metrics] kubernetes_secret_v1.paledocs_db_url: Refreshing state... [id=pal-e-docs/paledocs-db-url] kubernetes_service_account_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup] kubernetes_role_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup] helm_release.cnpg: Refreshing state... [id=cnpg] kubernetes_manifest.netpol_ollama: Refreshing state... helm_release.ollama: Refreshing state... [id=ollama] kubernetes_role_binding_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup] kubernetes_manifest.netpol_cnpg_system: Refreshing state... kubernetes_deployment_v1.keycloak: Refreshing state... [id=keycloak/keycloak] kubernetes_config_map_v1.dora_dashboard: Refreshing state... [id=monitoring/dora-dashboard] helm_release.blackbox_exporter: Refreshing state... [id=blackbox-exporter] kubernetes_manifest.embedding_alerts: Refreshing state... kubernetes_deployment_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter] kubernetes_config_map_v1.pal_e_docs_dashboard: Refreshing state... [id=monitoring/pal-e-docs-dashboard] helm_release.harbor: Refreshing state... [id=harbor] kubernetes_manifest.embedding_worker_service_monitor: Refreshing state... helm_release.minio: Refreshing state... [id=minio] kubernetes_manifest.dora_exporter_service_monitor: Refreshing state... kubernetes_manifest.blackbox_alerts: Refreshing state... kubernetes_config_map_v1.grafana_loki_datasource: Refreshing state... [id=monitoring/grafana-loki-datasource] kubernetes_ingress_v1.forgejo_funnel: Refreshing state... [id=forgejo/forgejo-funnel] kubernetes_ingress_v1.grafana_funnel: Refreshing state... [id=monitoring/grafana-funnel] kubernetes_ingress_v1.alertmanager_funnel: Refreshing state... [id=monitoring/alertmanager-funnel] kubernetes_ingress_v1.keycloak_funnel: Refreshing state... [id=keycloak/keycloak-funnel] kubernetes_ingress_v1.harbor_funnel: Refreshing state... [id=harbor/harbor-funnel] minio_iam_user.tf_backup: Refreshing state... [id=tf-backup] minio_s3_bucket.tf_state_backups: Refreshing state... [id=tf-state-backups] minio_iam_policy.tf_backup: Refreshing state... [id=tf-backup] minio_s3_bucket.postgres_wal: Refreshing state... [id=postgres-wal] kubernetes_ingress_v1.minio_api_funnel: Refreshing state... [id=minio/minio-api-funnel] minio_iam_user.cnpg: Refreshing state... [id=cnpg] kubernetes_ingress_v1.minio_funnel: Refreshing state... [id=minio/minio-funnel] minio_iam_policy.cnpg_wal: Refreshing state... [id=cnpg-wal] minio_s3_bucket.assets: Refreshing state... [id=assets] minio_iam_user_policy_attachment.tf_backup: Refreshing state... [id=tf-backup-20260314163610110100000001] kubernetes_secret_v1.tf_backup_s3_creds: Refreshing state... [id=tofu-state/tf-backup-s3-creds] minio_iam_user_policy_attachment.cnpg: Refreshing state... [id=cnpg-20260302210642491000000001] kubernetes_secret_v1.cnpg_s3_creds: Refreshing state... [id=postgres/cnpg-s3-creds] kubernetes_secret_v1.woodpecker_cnpg_s3_creds: Refreshing state... [id=woodpecker/cnpg-s3-creds] kubernetes_cron_job_v1.tf_state_backup: Refreshing state... [id=tofu-state/tf-state-backup] kubernetes_cron_job_v1.cnpg_backup_verify: Refreshing state... [id=postgres/cnpg-backup-verify] kubernetes_manifest.woodpecker_postgres: Refreshing state... helm_release.woodpecker: Refreshing state... [id=woodpecker] kubernetes_manifest.woodpecker_postgres_scheduled_backup: Refreshing state... kubernetes_ingress_v1.woodpecker_funnel: Refreshing state... [id=woodpecker/woodpecker-funnel] OpenTofu used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols: ~ update in-place OpenTofu will perform the following actions: # helm_release.woodpecker will be updated in-place ~ resource "helm_release" "woodpecker" { id = "woodpecker" ~ metadata = [ - { - app_version = "3.13.0" - chart = "woodpecker" - first_deployed = 1773625582 - last_deployed = 1773710708 - name = "woodpecker" - namespace = "woodpecker" - notes = <<-EOT 1. Get the application URL by running these commands: export POD_NAME=$(kubectl get pods --namespace woodpecker -l "app.kubernetes.io/name=server,app.kubernetes.io/instance=woodpecker" -o jsonpath="{.items[0].metadata.name}") export CONTAINER_PORT=$(kubectl get pod --namespace woodpecker $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") echo "Visit http://127.0.0.1:8080 to use your application" kubectl --namespace woodpecker port-forward $POD_NAME 8080:$CONTAINER_PORT EOT - revision = 3 - values = jsonencode( { - agent = { - enabled = true - env = { - WOODPECKER_AGENT_SECRET = "(sensitive value)" - WOODPECKER_BACKEND = "kubernetes" - WOODPECKER_BACKEND_K8S_NAMESPACE = "woodpecker" - WOODPECKER_BACKEND_K8S_STORAGE_CLASS = "local-path" - WOODPECKER_BACKEND_K8S_VOLUME_SIZE = "1Gi" } - replicaCount = 1 - resources = { - limits = { - memory = "256Mi" } - requests = { - cpu = "50m" - memory = "64Mi" } } } - server = { - env = { - WOODPECKER_ADMIN = "forgejo_admin" - WOODPECKER_AGENT_SECRET = "(sensitive value)" - WOODPECKER_DATABASE_DATASOURCE = "postgres://woodpecker:kM3L4AhLNiuMhIY7tMQ@woodpecker-db-rw.woodpecker.svc.cluster.local:5432/woodpecker?sslmode=disable" - WOODPECKER_DATABASE_DRIVER = "postgres" - WOODPECKER_FORGEJO = "true" - WOODPECKER_FORGEJO_CLIENT = "(sensitive value)" - WOODPECKER_FORGEJO_CLONE_URL = "http://forgejo-http.forgejo.svc.cluster.local:80" - WOODPECKER_FORGEJO_SECRET = "(sensitive value)" - WOODPECKER_FORGEJO_URL = "https://forgejo.tail5b443a.ts.net" - WOODPECKER_HOST = "https://woodpecker.tail5b443a.ts.net" } - persistentVolume = { - enabled = true - size = "5Gi" - storageClass = "local-path" } - resources = { - limits = { - memory = "512Mi" } - requests = { - cpu = "50m" - memory = "128Mi" } } - statefulSet = { - replicaCount = 1 } } } ) - version = "3.5.1" }, ] -> (known after apply) name = "woodpecker" ~ status = "pending-upgrade" -> "deployed" # (25 unchanged attributes hidden) - set_sensitive { # At least one attribute in this block is (or was) sensitive, # so its contents will not be displayed. } - set_sensitive { # At least one attribute in this block is (or was) sensitive, # so its contents will not be displayed. } + set_sensitive { # At least one attribute in this block is (or was) sensitive, # so its contents will not be displayed. } + set_sensitive { # At least one attribute in this block is (or was) sensitive, # so its contents will not be displayed. } # (2 unchanged blocks hidden) } # kubernetes_cron_job_v1.cnpg_backup_verify will be updated in-place ~ resource "kubernetes_cron_job_v1" "cnpg_backup_verify" { id = "postgres/cnpg-backup-verify" ~ spec { # (6 unchanged attributes hidden) ~ job_template { ~ spec { # (7 unchanged attributes hidden) ~ template { ~ spec { # (12 unchanged attributes hidden) ~ container { ~ args = [ - <<-EOT set -euo pipefail apk add --no-cache curl >/dev/null # Install mc (MinIO Client) curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc chmod +x /tmp/mc # Configure MinIO alias /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY" ERRORS=0 MAX_AGE_HOURS=25 # Allow 1h buffer beyond 24h # Check each backup path prefix for PREFIX in "pal-e-postgres" "woodpecker"; do echo "=== Checking backups for $PREFIX ===" # List objects in the backup path OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true) if [ -z "$OBJECTS" ]; then echo "ERROR: No backup objects found for $PREFIX" ERRORS=$((ERRORS + 1)) continue fi echo "Found backup objects for $PREFIX:" echo "$OBJECTS" # Check WAL directory for recent files RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true) if [ -z "$RECENT" ]; then echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX" ERRORS=$((ERRORS + 1)) else echo "OK: Recent WAL files found for $PREFIX" fi done if [ "$ERRORS" -gt 0 ]; then echo "FAILED: $ERRORS backup verification errors" exit 1 fi echo "All backup verifications passed." EOT, + <<-EOT set -euo pipefail apk add --no-cache curl >/dev/null # Install mc (MinIO Client) curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc chmod +x /tmp/mc # Configure MinIO alias /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY" ERRORS=0 MAX_AGE_HOURS=25 # Allow 1h buffer beyond 24h # Check each backup path prefix for PREFIX in "pal-e-postgres" "woodpecker"; do echo "=== Checking backups for $PREFIX ===" # List objects in the backup path OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true) if [ -z "$OBJECTS" ]; then echo "ERROR: No backup objects found for $PREFIX" ERRORS=$((ERRORS + 1)) continue fi echo "Found backup objects for $PREFIX:" echo "$OBJECTS" # Check if WAL directory has content (new clusters may not have archived WALs yet) WAL_EXISTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/wals/" 2>/dev/null | head -1 || true) if [ -z "$WAL_EXISTS" ]; then echo "SKIP: No WAL archives yet for $PREFIX (new cluster, base backup only)" continue fi # Check WAL directory for recent files RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true) if [ -z "$RECENT" ]; then echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX" ERRORS=$((ERRORS + 1)) else echo "OK: Recent WAL files found for $PREFIX" fi done if [ "$ERRORS" -gt 0 ]; then echo "FAILED: $ERRORS backup verification errors" exit 1 fi echo "All backup verifications passed." EOT, ] name = "verify" # (8 unchanged attributes hidden) # (3 unchanged blocks hidden) } } # (1 unchanged block hidden) } } # (1 unchanged block hidden) } } # (1 unchanged block hidden) } # kubernetes_manifest.netpol_postgres will be updated in-place ~ resource "kubernetes_manifest" "netpol_postgres" { ~ manifest = { ~ spec = { ~ ingress = [ # (1 unchanged element hidden) { from = [ { namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "cnpg-system" } } }, ] }, + { + from = [ + { + namespaceSelector = { + matchLabels = { + "kubernetes.io/metadata.name" = "monitoring" } } }, ] }, ] # (2 unchanged attributes hidden) } # (3 unchanged attributes hidden) } ~ object = { ~ spec = { ~ ingress = [ # (1 unchanged element hidden) { from = [ { ipBlock = { cidr = null except = null } namespaceSelector = { matchExpressions = null matchLabels = { "kubernetes.io/metadata.name" = "cnpg-system" } } podSelector = { matchExpressions = null matchLabels = null } }, ] ports = null }, + { + from = [ + { + ipBlock = { + cidr = (known after apply) + except = (known after apply) } + namespaceSelector = { + matchExpressions = (known after apply) + matchLabels = { + "kubernetes.io/metadata.name" = "monitoring" } } + podSelector = { + matchExpressions = (known after apply) + matchLabels = (known after apply) } }, ] + ports = (known after apply) }, ] # (3 unchanged attributes hidden) } # (3 unchanged attributes hidden) } } Plan: 0 to add, 3 to change, 0 to destroy. ───────────────────────────────────────────────────────────────────────────── Note: You didn't use the -out option to save this plan, so OpenTofu can't guarantee to take exactly these actions if you run "tofu apply" now. ```
Author
Owner

PR #95 Review

DOMAIN REVIEW

Tech stack: Terraform (OpenTofu) / Kubernetes NetworkPolicies

Correctness: The one-line addition adds monitoring namespace ingress to netpol_postgres, following the exact same HCL pattern used across all other policies in this file. The syntax matches perfectly -- namespaceSelector with matchLabels on kubernetes.io/metadata.name.

Consistency audit: I reviewed all 9 NetworkPolicy resources in terraform/network-policies.tf. After this PR, 7 of 9 policies include monitoring ingress. The two that do not are:

  • netpol_keycloak (line 119) -- no monitoring ingress
  • netpol_ollama (line 159) -- no monitoring ingress

If Prometheus has scrape targets in either of those namespaces, they would exhibit the same symptom (target DOWN). This is out of scope for this bug fix but worth noting as discovered scope.

Terraform style: The line follows tofu fmt conventions (consistent indentation, trailing comma). PR body confirms tofu fmt -check and tofu validate both pass.

k8s security: The change correctly scopes ingress to only the monitoring namespace via label selector -- no overly broad selectors, no {} podSelector on the source side. This is the right pattern.

BLOCKERS

None.

This is infrastructure-as-code with no application logic, no user input handling, no secrets, and no auth paths. The single-line change is a NetworkPolicy ingress rule addition that matches the established pattern exactly. No BLOCKER criteria are triggered.

NITS

  1. Discovered scope: netpol_keycloak and netpol_ollama lack monitoring ingress. If either namespace has Prometheus scrape targets, a follow-up issue should be filed. Not blocking this PR since the fix is correctly scoped to the reported symptom (postgres target DOWN).

SOP COMPLIANCE

  • Branch named after issue (94-re-enable-netpol-fix-postgres references issue #94)
  • PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related)
  • Related references investigation slug (bug-kube-router-ipset-empty) and SOP (sop-network-security)
  • No secrets committed
  • No unnecessary file changes (1 file, 1 line, zero scope creep)
  • Commit message is descriptive (fix: add monitoring ingress to postgres NetworkPolicy)

PROCESS OBSERVATIONS

  • MTTR: This is a clean, fast fix for a monitoring regression. One-line change with clear root cause analysis in the PR body. Good MTTR pattern.
  • Change failure risk: Very low. The change is additive (new ingress rule), follows an established pattern, and does not modify existing rules.
  • Discovered scope: The keycloak/ollama monitoring gap should be tracked if those namespaces have Prometheus targets. Recommend filing a follow-up issue if applicable.

VERDICT: APPROVED

## PR #95 Review ### DOMAIN REVIEW **Tech stack**: Terraform (OpenTofu) / Kubernetes NetworkPolicies **Correctness**: The one-line addition adds `monitoring` namespace ingress to `netpol_postgres`, following the exact same HCL pattern used across all other policies in this file. The syntax matches perfectly -- `namespaceSelector` with `matchLabels` on `kubernetes.io/metadata.name`. **Consistency audit**: I reviewed all 9 NetworkPolicy resources in `terraform/network-policies.tf`. After this PR, 7 of 9 policies include monitoring ingress. The two that do not are: - `netpol_keycloak` (line 119) -- no monitoring ingress - `netpol_ollama` (line 159) -- no monitoring ingress If Prometheus has scrape targets in either of those namespaces, they would exhibit the same symptom (target DOWN). This is out of scope for this bug fix but worth noting as discovered scope. **Terraform style**: The line follows `tofu fmt` conventions (consistent indentation, trailing comma). PR body confirms `tofu fmt -check` and `tofu validate` both pass. **k8s security**: The change correctly scopes ingress to only the `monitoring` namespace via label selector -- no overly broad selectors, no `{}` podSelector on the source side. This is the right pattern. ### BLOCKERS None. This is infrastructure-as-code with no application logic, no user input handling, no secrets, and no auth paths. The single-line change is a NetworkPolicy ingress rule addition that matches the established pattern exactly. No BLOCKER criteria are triggered. ### NITS 1. **Discovered scope**: `netpol_keycloak` and `netpol_ollama` lack monitoring ingress. If either namespace has Prometheus scrape targets, a follow-up issue should be filed. Not blocking this PR since the fix is correctly scoped to the reported symptom (postgres target DOWN). ### SOP COMPLIANCE - [x] Branch named after issue (`94-re-enable-netpol-fix-postgres` references issue #94) - [x] PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related) - [x] Related references investigation slug (`bug-kube-router-ipset-empty`) and SOP (`sop-network-security`) - [x] No secrets committed - [x] No unnecessary file changes (1 file, 1 line, zero scope creep) - [x] Commit message is descriptive (`fix: add monitoring ingress to postgres NetworkPolicy`) ### PROCESS OBSERVATIONS - **MTTR**: This is a clean, fast fix for a monitoring regression. One-line change with clear root cause analysis in the PR body. Good MTTR pattern. - **Change failure risk**: Very low. The change is additive (new ingress rule), follows an established pattern, and does not modify existing rules. - **Discovered scope**: The keycloak/ollama monitoring gap should be tracked if those namespaces have Prometheus targets. Recommend filing a follow-up issue if applicable. ### VERDICT: APPROVED
forgejo_admin deleted branch 94-re-enable-netpol-fix-postgres 2026-03-17 03:12:24 +00:00
Sign in to join this conversation.
No description provided.