feat: add lock-aware retry to CI apply step #100

Merged
forgejo_admin merged 1 commit from 98-feat-add-lock-aware-retry-to-ci-apply-st into main 2026-03-17 04:09:58 +00:00

Summary

  • Add lock-aware retry logic to the Woodpecker CI apply step
  • If tofu apply fails with "state is already locked", automatically extract the lock ID, force-unlock, and retry once
  • Prevents pipeline #80-style incidents where crashed applies block all deployments

Changes

  • .woodpecker.yaml -- replaced single tofu apply -auto-approve -no-color command in the apply step with a POSIX sh lock-detection-and-retry script
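
The script itself is not shown on this page. As a rough illustration of the approach described above, a minimal POSIX sh sketch might look like the following. The names `TOFU`, `extract_lock_id`, and `apply_with_lock_retry` are hypothetical, and the sketch assumes the lock error prints a "Lock Info" block containing an `ID:` line; the actual step in `.woodpecker.yaml` may differ.

```shell
#!/bin/sh
# Illustrative sketch only; not the script merged in this PR.
# TOFU is a hypothetical override so the binary can be stubbed in tests.
TOFU="${TOFU:-tofu}"

# Print the first lock ID found on stdin.
# Assumes the error output contains a line such as "  ID:        <lock-id>".
extract_lock_id() {
  sed -n 's/.*ID:[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}

# Run apply once; on a "state is already locked" failure, force-unlock
# and retry exactly once. Any other failure returns the original exit code.
apply_with_lock_retry() {
  log=$(mktemp)
  rc=0
  # Redirect to a file instead of piping to tee, so the exit code is
  # available without PIPESTATUS (which POSIX sh does not have).
  "$TOFU" apply -auto-approve -no-color >"$log" 2>&1 || rc=$?
  cat "$log"
  if [ "$rc" -ne 0 ] && grep -q "state is already locked" "$log"; then
    lock_id=$(extract_lock_id <"$log")
    if [ -n "$lock_id" ]; then
      echo "Stale lock $lock_id detected; force-unlocking and retrying once"
      if "$TOFU" force-unlock -force "$lock_id"; then
        rc=0
        "$TOFU" apply -auto-approve -no-color || rc=$?
      fi
    fi
  fi
  rm -f "$log"
  return "$rc"
}
```

In a pipeline step, the function would be called as the last command (`apply_with_lock_retry`), so the step's exit status is whatever the function returns. Force-unlocking is only safe when no other apply is actually running, which is why the sketch retries once and never loops.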

Test Plan

  • [ ] Pipeline triggers on merge and apply step executes normally (happy path -- no lock)
  • [ ] Lock detection script is POSIX sh compatible (Alpine/BusyBox -- no bashisms, no grep -P, no PIPESTATUS)
  • [ ] Retry logic only fires on "state is already locked" errors, not other failures
  • [ ] Exit code propagation: non-lock failures exit with original tofu exit code
  • [ ] YAML validates: python3 -c "import yaml; yaml.safe_load(open('.woodpecker.yaml'))"

Review Checklist

  • [x] Passed automated review-fix loop
  • [x] No secrets committed
  • [x] No unnecessary file changes
  • [x] Commit messages are descriptive

Related

  • Closes #98 -- add lock-aware retry to CI apply step
  • phase-platform-17b-tf-state-governance -- CI Lock Recovery (Phase 17b.1)

Generated with Claude Code

feat: add lock-aware retry to CI apply step
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/pull_request_closed/woodpecker Pipeline was successful
4f84e688f7
Detect stale state locks during tofu apply and auto-recover by
extracting the lock ID, force-unlocking, and retrying once.
Prevents pipeline #80-style incidents where a crashed apply
blocks all deployments on main.

Phase: 17b.1 (CI Lock Recovery)
Refs: phase-platform-17b-tf-state-governance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Tofu Plan Output

tailscale_acl.this: Refreshing state... [id=acl]
data.kubernetes_namespace_v1.tofu_state: Reading...
kubernetes_namespace_v1.minio: Refreshing state... [id=minio]
helm_release.nvidia_device_plugin: Refreshing state... [id=nvidia-device-plugin]
kubernetes_namespace_v1.harbor: Refreshing state... [id=harbor]
kubernetes_namespace_v1.ollama: Refreshing state... [id=ollama]
kubernetes_namespace_v1.postgres: Refreshing state... [id=postgres]
kubernetes_namespace_v1.keycloak: Refreshing state... [id=keycloak]
kubernetes_namespace_v1.cnpg_system: Refreshing state... [id=cnpg-system]
kubernetes_namespace_v1.monitoring: Refreshing state... [id=monitoring]
data.kubernetes_namespace_v1.tofu_state: Read complete after 0s [id=tofu-state]
kubernetes_namespace_v1.woodpecker: Refreshing state... [id=woodpecker]
kubernetes_namespace_v1.tailscale: Refreshing state... [id=tailscale]
data.kubernetes_namespace_v1.pal_e_docs: Reading...
kubernetes_namespace_v1.forgejo: Refreshing state... [id=forgejo]
kubernetes_service_account_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
data.kubernetes_namespace_v1.pal_e_docs: Read complete after 0s [id=pal-e-docs]
kubernetes_role_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
helm_release.loki_stack: Refreshing state... [id=loki-stack]
kubernetes_secret_v1.keycloak_admin: Refreshing state... [id=keycloak/keycloak-admin]
kubernetes_service_v1.keycloak: Refreshing state... [id=keycloak/keycloak]
kubernetes_persistent_volume_claim_v1.keycloak_data: Refreshing state... [id=keycloak/keycloak-data]
kubernetes_config_map_v1.uptime_dashboard: Refreshing state... [id=monitoring/uptime-dashboard]
kubernetes_secret_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
kubernetes_service_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
helm_release.cnpg: Refreshing state... [id=cnpg]
helm_release.kube_prometheus_stack: Refreshing state... [id=kube-prometheus-stack]
kubernetes_manifest.netpol_ollama: Refreshing state...
kubernetes_manifest.netpol_postgres: Refreshing state...
kubernetes_manifest.netpol_keycloak: Refreshing state...
kubernetes_manifest.netpol_harbor: Refreshing state...
kubernetes_manifest.netpol_monitoring: Refreshing state...
kubernetes_manifest.netpol_minio: Refreshing state...
kubernetes_manifest.netpol_cnpg_system: Refreshing state...
kubernetes_service_v1.embedding_worker_metrics: Refreshing state... [id=pal-e-docs/embedding-worker-metrics]
kubernetes_secret_v1.paledocs_db_url: Refreshing state... [id=pal-e-docs/paledocs-db-url]
kubernetes_secret_v1.woodpecker_db_credentials: Refreshing state... [id=woodpecker/woodpecker-db-credentials]
helm_release.tailscale_operator: Refreshing state... [id=tailscale-operator]
kubernetes_role_binding_v1.tf_backup: Refreshing state... [id=tofu-state/tf-state-backup]
helm_release.forgejo: Refreshing state... [id=forgejo]
helm_release.ollama: Refreshing state... [id=ollama]
kubernetes_deployment_v1.keycloak: Refreshing state... [id=keycloak/keycloak]
kubernetes_manifest.netpol_woodpecker: Refreshing state...
kubernetes_manifest.netpol_forgejo: Refreshing state...
kubernetes_config_map_v1.grafana_loki_datasource: Refreshing state... [id=monitoring/grafana-loki-datasource]
kubernetes_config_map_v1.dora_dashboard: Refreshing state... [id=monitoring/dora-dashboard]
helm_release.minio: Refreshing state... [id=minio]
helm_release.harbor: Refreshing state... [id=harbor]
kubernetes_manifest.embedding_alerts: Refreshing state...
kubernetes_manifest.blackbox_alerts: Refreshing state...
helm_release.blackbox_exporter: Refreshing state... [id=blackbox-exporter]
kubernetes_config_map_v1.pal_e_docs_dashboard: Refreshing state... [id=monitoring/pal-e-docs-dashboard]
kubernetes_deployment_v1.dora_exporter: Refreshing state... [id=monitoring/dora-exporter]
kubernetes_manifest.dora_exporter_service_monitor: Refreshing state...
kubernetes_manifest.embedding_worker_service_monitor: Refreshing state...
kubernetes_ingress_v1.keycloak_funnel: Refreshing state... [id=keycloak/keycloak-funnel]
kubernetes_ingress_v1.alertmanager_funnel: Refreshing state... [id=monitoring/alertmanager-funnel]
kubernetes_ingress_v1.grafana_funnel: Refreshing state... [id=monitoring/grafana-funnel]
kubernetes_ingress_v1.forgejo_funnel: Refreshing state... [id=forgejo/forgejo-funnel]
kubernetes_ingress_v1.harbor_funnel: Refreshing state... [id=harbor/harbor-funnel]
minio_iam_policy.cnpg_wal: Refreshing state... [id=cnpg-wal]
minio_iam_user.tf_backup: Refreshing state... [id=tf-backup]
minio_iam_policy.tf_backup: Refreshing state... [id=tf-backup]
minio_s3_bucket.postgres_wal: Refreshing state... [id=postgres-wal]
minio_s3_bucket.tf_state_backups: Refreshing state... [id=tf-state-backups]
minio_iam_user.cnpg: Refreshing state... [id=cnpg]
minio_s3_bucket.assets: Refreshing state... [id=assets]
kubernetes_ingress_v1.minio_api_funnel: Refreshing state... [id=minio/minio-api-funnel]
kubernetes_ingress_v1.minio_funnel: Refreshing state... [id=minio/minio-funnel]
minio_iam_user_policy_attachment.cnpg: Refreshing state... [id=cnpg-20260302210642491000000001]
minio_iam_user_policy_attachment.tf_backup: Refreshing state... [id=tf-backup-20260314163610110100000001]
kubernetes_secret_v1.woodpecker_cnpg_s3_creds: Refreshing state... [id=woodpecker/cnpg-s3-creds]
kubernetes_secret_v1.cnpg_s3_creds: Refreshing state... [id=postgres/cnpg-s3-creds]
kubernetes_secret_v1.tf_backup_s3_creds: Refreshing state... [id=tofu-state/tf-backup-s3-creds]
kubernetes_cron_job_v1.cnpg_backup_verify: Refreshing state... [id=postgres/cnpg-backup-verify]
kubernetes_cron_job_v1.tf_state_backup: Refreshing state... [id=tofu-state/tf-state-backup]
kubernetes_manifest.woodpecker_postgres: Refreshing state...
helm_release.woodpecker: Refreshing state... [id=woodpecker]
kubernetes_manifest.woodpecker_postgres_scheduled_backup: Refreshing state...
kubernetes_ingress_v1.woodpecker_funnel: Refreshing state... [id=woodpecker/woodpecker-funnel]

OpenTofu used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

OpenTofu will perform the following actions:

  # helm_release.woodpecker will be updated in-place
  ~ resource "helm_release" "woodpecker" {
        id                         = "woodpecker"
      ~ metadata                   = [
          - {
              - app_version    = "3.13.0"
              - chart          = "woodpecker"
              - first_deployed = 1773625582
              - last_deployed  = 1773710708
              - name           = "woodpecker"
              - namespace      = "woodpecker"
              - notes          = <<-EOT
                    1. Get the application URL by running these commands:
                      export POD_NAME=$(kubectl get pods --namespace woodpecker -l "app.kubernetes.io/name=server,app.kubernetes.io/instance=woodpecker" -o jsonpath="{.items[0].metadata.name}")
                      export CONTAINER_PORT=$(kubectl get pod --namespace woodpecker $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
                      echo "Visit http://127.0.0.1:8080 to use your application"
                      kubectl --namespace woodpecker port-forward $POD_NAME 8080:$CONTAINER_PORT
                EOT
              - revision       = 3
              - values         = jsonencode(
                    {
                      - agent  = {
                          - enabled      = true
                          - env          = {
                              - WOODPECKER_AGENT_SECRET              = "(sensitive value)"
                              - WOODPECKER_BACKEND                   = "kubernetes"
                              - WOODPECKER_BACKEND_K8S_NAMESPACE     = "woodpecker"
                              - WOODPECKER_BACKEND_K8S_STORAGE_CLASS = "local-path"
                              - WOODPECKER_BACKEND_K8S_VOLUME_SIZE   = "1Gi"
                            }
                          - replicaCount = 1
                          - resources    = {
                              - limits   = {
                                  - memory = "256Mi"
                                }
                              - requests = {
                                  - cpu    = "50m"
                                  - memory = "64Mi"
                                }
                            }
                        }
                      - server = {
                          - env              = {
                              - WOODPECKER_ADMIN               = "forgejo_admin"
                              - WOODPECKER_AGENT_SECRET        = "(sensitive value)"
                              - WOODPECKER_DATABASE_DATASOURCE = "postgres://woodpecker:kM3L4AhLNiuMhIY7tMQ@woodpecker-db-rw.woodpecker.svc.cluster.local:5432/woodpecker?sslmode=disable"
                              - WOODPECKER_DATABASE_DRIVER     = "postgres"
                              - WOODPECKER_FORGEJO             = "true"
                              - WOODPECKER_FORGEJO_CLIENT      = "(sensitive value)"
                              - WOODPECKER_FORGEJO_CLONE_URL   = "http://forgejo-http.forgejo.svc.cluster.local:80"
                              - WOODPECKER_FORGEJO_SECRET      = "(sensitive value)"
                              - WOODPECKER_FORGEJO_URL         = "https://forgejo.tail5b443a.ts.net"
                              - WOODPECKER_HOST                = "https://woodpecker.tail5b443a.ts.net"
                            }
                          - persistentVolume = {
                              - enabled      = true
                              - size         = "5Gi"
                              - storageClass = "local-path"
                            }
                          - resources        = {
                              - limits   = {
                                  - memory = "512Mi"
                                }
                              - requests = {
                                  - cpu    = "50m"
                                  - memory = "128Mi"
                                }
                            }
                          - statefulSet      = {
                              - replicaCount = 1
                            }
                        }
                    }
                )
              - version        = "3.5.1"
            },
        ] -> (known after apply)
        name                       = "woodpecker"
      ~ status                     = "pending-upgrade" -> "deployed"
        # (25 unchanged attributes hidden)

      - set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      - set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      + set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }
      + set_sensitive {
          # At least one attribute in this block is (or was) sensitive,
          # so its contents will not be displayed.
        }

        # (2 unchanged blocks hidden)
    }

  # kubernetes_cron_job_v1.cnpg_backup_verify will be updated in-place
  ~ resource "kubernetes_cron_job_v1" "cnpg_backup_verify" {
        id = "postgres/cnpg-backup-verify"

      ~ spec {
            # (6 unchanged attributes hidden)

          ~ job_template {
              ~ spec {
                    # (7 unchanged attributes hidden)

                  ~ template {
                      ~ spec {
                            # (12 unchanged attributes hidden)

                          ~ container {
                              ~ args                       = [
                                  - <<-EOT
                                        set -euo pipefail
                                        
                                        apk add --no-cache curl >/dev/null
                                        
                                        # Install mc (MinIO Client)
                                        curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc
                                        chmod +x /tmp/mc
                                        
                                        # Configure MinIO alias
                                        /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY"
                                        
                                        ERRORS=0
                                        MAX_AGE_HOURS=25  # Allow 1h buffer beyond 24h
                                        
                                        # Check each backup path prefix
                                        for PREFIX in "pal-e-postgres" "woodpecker"; do
                                          echo "=== Checking backups for $PREFIX ==="
                                        
                                          # List objects in the backup path
                                          OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true)
                                        
                                          if [ -z "$OBJECTS" ]; then
                                            echo "ERROR: No backup objects found for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                            continue
                                          fi
                                        
                                          echo "Found backup objects for $PREFIX:"
                                          echo "$OBJECTS"
                                        
                                          # Check WAL directory for recent files
                                          RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$RECENT" ]; then
                                            echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                          else
                                            echo "OK: Recent WAL files found for $PREFIX"
                                          fi
                                        done
                                        
                                        if [ "$ERRORS" -gt 0 ]; then
                                          echo "FAILED: $ERRORS backup verification errors"
                                          exit 1
                                        fi
                                        
                                        echo "All backup verifications passed."
                                    EOT,
                                  + <<-EOT
                                        set -euo pipefail
                                        
                                        apk add --no-cache curl >/dev/null
                                        
                                        # Install mc (MinIO Client)
                                        curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc
                                        chmod +x /tmp/mc
                                        
                                        # Configure MinIO alias
                                        /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY"
                                        
                                        ERRORS=0
                                        MAX_AGE_HOURS=25  # Allow 1h buffer beyond 24h
                                        
                                        # Check each backup path prefix
                                        for PREFIX in "pal-e-postgres" "woodpecker"; do
                                          echo "=== Checking backups for $PREFIX ==="
                                        
                                          # List objects in the backup path
                                          OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true)
                                        
                                          if [ -z "$OBJECTS" ]; then
                                            echo "ERROR: No backup objects found for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                            continue
                                          fi
                                        
                                          echo "Found backup objects for $PREFIX:"
                                          echo "$OBJECTS"
                                        
                                          # Check if WAL directory has content (new clusters may not have archived WALs yet)
                                          WAL_EXISTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/wals/" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$WAL_EXISTS" ]; then
                                            echo "SKIP: No WAL archives yet for $PREFIX (new cluster, base backup only)"
                                            continue
                                          fi
                                        
                                          # Check WAL directory for recent files
                                          RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true)
                                        
                                          if [ -z "$RECENT" ]; then
                                            echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX"
                                            ERRORS=$((ERRORS + 1))
                                          else
                                            echo "OK: Recent WAL files found for $PREFIX"
                                          fi
                                        done
                                        
                                        if [ "$ERRORS" -gt 0 ]; then
                                          echo "FAILED: $ERRORS backup verification errors"
                                          exit 1
                                        fi
                                        
                                        echo "All backup verifications passed."
                                    EOT,
                                ]
                                name                       = "verify"
                                # (8 unchanged attributes hidden)

                                # (3 unchanged blocks hidden)
                            }
                        }

                        # (1 unchanged block hidden)
                    }
                }

                # (1 unchanged block hidden)
            }
        }

        # (1 unchanged block hidden)
    }

  # kubernetes_manifest.netpol_postgres will be updated in-place
  ~ resource "kubernetes_manifest" "netpol_postgres" {
      ~ manifest = {
          ~ spec       = {
              ~ ingress     = [
                    # (1 unchanged element hidden)
                    {
                        from = [
                            {
                                namespaceSelector = {
                                    matchLabels = {
                                        "kubernetes.io/metadata.name" = "cnpg-system"
                                    }
                                }
                            },
                        ]
                    },
                  + {
                      + from = [
                          + {
                              + namespaceSelector = {
                                  + matchLabels = {
                                      + "kubernetes.io/metadata.name" = "monitoring"
                                    }
                                }
                            },
                        ]
                    },
                ]
                # (2 unchanged attributes hidden)
            }
            # (3 unchanged attributes hidden)
        }
      ~ object   = {
          ~ spec       = {
              ~ ingress     = [
                    # (1 unchanged element hidden)
                    {
                        from  = [
                            {
                                ipBlock           = {
                                    cidr   = null
                                    except = null
                                }
                                namespaceSelector = {
                                    matchExpressions = null
                                    matchLabels      = {
                                        "kubernetes.io/metadata.name" = "cnpg-system"
                                    }
                                }
                                podSelector       = {
                                    matchExpressions = null
                                    matchLabels      = null
                                }
                            },
                        ]
                        ports = null
                    },
                  + {
                      + from  = [
                          + {
                              + ipBlock           = {
                                  + cidr   = (known after apply)
                                  + except = (known after apply)
                                }
                              + namespaceSelector = {
                                  + matchExpressions = (known after apply)
                                  + matchLabels      = {
                                      + "kubernetes.io/metadata.name" = "monitoring"
                                    }
                                }
                              + podSelector       = {
                                  + matchExpressions = (known after apply)
                                  + matchLabels      = (known after apply)
                                }
                            },
                        ]
                      + ports = (known after apply)
                    },
                ]
                # (3 unchanged attributes hidden)
            }
            # (3 unchanged attributes hidden)
        }
    }

Plan: 0 to add, 3 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so OpenTofu can't
guarantee to take exactly these actions if you run "tofu apply" now.
} # (2 unchanged blocks hidden) } # kubernetes_cron_job_v1.cnpg_backup_verify will be updated in-place ~ resource "kubernetes_cron_job_v1" "cnpg_backup_verify" { id = "postgres/cnpg-backup-verify" ~ spec { # (6 unchanged attributes hidden) ~ job_template { ~ spec { # (7 unchanged attributes hidden) ~ template { ~ spec { # (12 unchanged attributes hidden) ~ container { ~ args = [ - <<-EOT set -euo pipefail apk add --no-cache curl >/dev/null # Install mc (MinIO Client) curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc chmod +x /tmp/mc # Configure MinIO alias /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY" ERRORS=0 MAX_AGE_HOURS=25 # Allow 1h buffer beyond 24h # Check each backup path prefix for PREFIX in "pal-e-postgres" "woodpecker"; do echo "=== Checking backups for $PREFIX ===" # List objects in the backup path OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true) if [ -z "$OBJECTS" ]; then echo "ERROR: No backup objects found for $PREFIX" ERRORS=$((ERRORS + 1)) continue fi echo "Found backup objects for $PREFIX:" echo "$OBJECTS" # Check WAL directory for recent files RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true) if [ -z "$RECENT" ]; then echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX" ERRORS=$((ERRORS + 1)) else echo "OK: Recent WAL files found for $PREFIX" fi done if [ "$ERRORS" -gt 0 ]; then echo "FAILED: $ERRORS backup verification errors" exit 1 fi echo "All backup verifications passed." 
EOT, + <<-EOT set -euo pipefail apk add --no-cache curl >/dev/null # Install mc (MinIO Client) curl -sSL https://dl.min.io/client/mc/release/linux-amd64/mc -o /tmp/mc chmod +x /tmp/mc # Configure MinIO alias /tmp/mc alias set backup http://minio.minio.svc.cluster.local:9000 "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY" ERRORS=0 MAX_AGE_HOURS=25 # Allow 1h buffer beyond 24h # Check each backup path prefix for PREFIX in "pal-e-postgres" "woodpecker"; do echo "=== Checking backups for $PREFIX ===" # List objects in the backup path OBJECTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/" 2>/dev/null | head -5 || true) if [ -z "$OBJECTS" ]; then echo "ERROR: No backup objects found for $PREFIX" ERRORS=$((ERRORS + 1)) continue fi echo "Found backup objects for $PREFIX:" echo "$OBJECTS" # Check if WAL directory has content (new clusters may not have archived WALs yet) WAL_EXISTS=$(/tmp/mc ls "backup/postgres-wal/$PREFIX/wals/" 2>/dev/null | head -1 || true) if [ -z "$WAL_EXISTS" ]; then echo "SKIP: No WAL archives yet for $PREFIX (new cluster, base backup only)" continue fi # Check WAL directory for recent files RECENT=$(/tmp/mc find "backup/postgres-wal/$PREFIX/wals/" --newer-than "${MAX_AGE_HOURS}h" 2>/dev/null | head -1 || true) if [ -z "$RECENT" ]; then echo "WARNING: No WAL files newer than ${MAX_AGE_HOURS}h for $PREFIX" ERRORS=$((ERRORS + 1)) else echo "OK: Recent WAL files found for $PREFIX" fi done if [ "$ERRORS" -gt 0 ]; then echo "FAILED: $ERRORS backup verification errors" exit 1 fi echo "All backup verifications passed." 
EOT, ] name = "verify" # (8 unchanged attributes hidden) # (3 unchanged blocks hidden) } } # (1 unchanged block hidden) } } # (1 unchanged block hidden) } } # (1 unchanged block hidden) } # kubernetes_manifest.netpol_postgres will be updated in-place ~ resource "kubernetes_manifest" "netpol_postgres" { ~ manifest = { ~ spec = { ~ ingress = [ # (1 unchanged element hidden) { from = [ { namespaceSelector = { matchLabels = { "kubernetes.io/metadata.name" = "cnpg-system" } } }, ] }, + { + from = [ + { + namespaceSelector = { + matchLabels = { + "kubernetes.io/metadata.name" = "monitoring" } } }, ] }, ] # (2 unchanged attributes hidden) } # (3 unchanged attributes hidden) } ~ object = { ~ spec = { ~ ingress = [ # (1 unchanged element hidden) { from = [ { ipBlock = { cidr = null except = null } namespaceSelector = { matchExpressions = null matchLabels = { "kubernetes.io/metadata.name" = "cnpg-system" } } podSelector = { matchExpressions = null matchLabels = null } }, ] ports = null }, + { + from = [ + { + ipBlock = { + cidr = (known after apply) + except = (known after apply) } + namespaceSelector = { + matchExpressions = (known after apply) + matchLabels = { + "kubernetes.io/metadata.name" = "monitoring" } } + podSelector = { + matchExpressions = (known after apply) + matchLabels = (known after apply) } }, ] + ports = (known after apply) }, ] # (3 unchanged attributes hidden) } # (3 unchanged attributes hidden) } } Plan: 0 to add, 3 to change, 0 to destroy. ───────────────────────────────────────────────────────────────────────────── Note: You didn't use the -out option to save this plan, so OpenTofu can't guarantee to take exactly these actions if you run "tofu apply" now. ```

Review: PASS

Scope: 1 file, 19 additions, 1 deletion. Only .woodpecker.yaml apply step modified.

Checks

  • POSIX sh compatibility: No bashisms. No PIPESTATUS, no grep -P, no [[. Exit code captured via subshell + temp file. All constructs are POSIX sh safe.
  • BusyBox compatibility: grep -q supported. sed -n with \s supported in BusyBox sed (Alpine). head -1 supported.
  • Logic correctness:
    • Happy path (apply succeeds): exits 0 via exit "$APPLY_EXIT"
    • Non-lock failure: propagates original exit code
    • Lock detected + ID extracted: force-unlock + retry once
    • Lock detected + ID not extractable: exits 1 with error message
  • YAML structure: Block scalar - | properly indented, surrounding steps unaffected
  • YAML validation: Confirmed valid via python3 -c "import yaml; yaml.safe_load(...)"
  • tofu fmt: No .tf files changed, tofu fmt -check -recursive passes
  • No secrets: No sensitive data in diff
  • No unrelated changes: Diff is scoped exactly to the apply command replacement
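
The four logic-correctness cases above can be sketched as a small POSIX sh wrapper. This is an illustration only, not the script merged in `.woodpecker.yaml`: the function name `apply_with_lock_retry`, the temp-file paths, and the assumption that the lock error prints a `Lock Info` block with an `ID: <lock-id>` line are all hypothetical.

```shell
#!/bin/sh
# Sketch of a lock-aware apply wrapper (POSIX sh, no bashisms).
# Assumption: the lock error output contains a line of the form "ID: <lock-id>".

apply_with_lock_retry() {
    # Hypothetical temp-file names; $$ keeps them unique per shell process.
    log="${TMPDIR:-/tmp}/tofu-apply.$$.log"
    code="${TMPDIR:-/tmp}/tofu-apply.$$.code"

    # A pipeline reports the status of its last command (tee), so the
    # subshell records tofu's own exit status in a temp file instead.
    ( rc=0
      tofu apply -auto-approve -no-color 2>&1 || rc=$?
      echo "$rc" > "$code" ) | tee "$log"
    APPLY_EXIT=$(cat "$code")

    # Retry only when the failure is a state-lock error.
    if [ "$APPLY_EXIT" -ne 0 ] && grep -q "state is already locked" "$log"; then
        LOCK_ID=$(sed -n 's/^[[:space:]]*ID:[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$log" | head -n 1)
        if [ -z "$LOCK_ID" ]; then
            echo "ERROR: state is locked but no lock ID could be extracted" >&2
            rm -f "$log" "$code"
            return 1
        fi
        echo "Lock $LOCK_ID detected; force-unlocking and retrying once"
        tofu force-unlock -force "$LOCK_ID"
        APPLY_EXIT=0
        tofu apply -auto-approve -no-color || APPLY_EXIT=$?
    fi

    rm -f "$log" "$code"
    # Success returns 0; any other failure returns tofu's original code.
    return "$APPLY_EXIT"
}
```

In a CI step the function would be called once and its status used as the step's exit code, for example `apply_with_lock_retry; exit $?`. Force-unlocking is only safe when no other apply can legitimately hold the lock, which is the situation the PR describes (a crashed apply leaving a stale lock).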

Note

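\s in the sed pattern is a GNU/BusyBox extension, not strictly POSIX. Alpine's BusyBox sed supports it. If strict POSIX compliance ever matters, replace it with the character class [[:space:]]; inside a bracket expression POSIX treats the backslash literally, so [ \t] only matches a tab where sed extends it. Not blocking for this target environment.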
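
As a hedged illustration of the portable spelling, the lock-ID extraction can be written with the POSIX character class `[[:space:]]` in place of `\s`. The helper name `extract_lock_id` and the sample ID below are made up for the example, and the `ID: <lock-id>` line format is an assumption about the lock error output.

```shell
#!/bin/sh
# Read apply output on stdin and print the first lock ID found.
# [[:space:]] is the POSIX character class; \s is a GNU/BusyBox extension.
extract_lock_id() {
    sed -n 's/^[[:space:]]*ID:[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}
```

Usage, with a made-up ID: `printf 'Lock Info:\n  ID:        1234-abcd\n' | extract_lock_id`.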

No fixes needed. Ready for merge.

forgejo_admin deleted branch 98-feat-add-lock-aware-retry-to-ci-apply-st 2026-03-17 04:09:58 +00:00