Add CNPG backup verification CronJob #60

Closed
opened 2026-03-14 19:51:44 +00:00 by forgejo_admin · 0 comments

Lineage

plan-pal-e-platform → Phase 13 (Backup Verification)

Repo

forgejo_admin/pal-e-platform

User Story

As a platform operator
I want automated verification that CNPG Postgres backups in MinIO are restorable
So that I have confidence in disaster recovery readiness

Context

CNPG Postgres clusters (pal-e-docs, soon Woodpecker) back up WAL archives to MinIO (postgres-wal bucket). The tf-state-backup CronJob already proves the pattern for MinIO-based backup verification, but nothing yet checks the Postgres backups automatically. This CronJob lists the latest backup objects in MinIO and verifies that they exist and are recent (within 24h). It is a freshness check, not a full restore test.

File Targets

Files the agent should modify:

  • terraform/main.tf — add CronJob resource after the existing tf-state-backup CronJob (around line 1770)

Files the agent should NOT touch:

  • terraform/variables.tf — no new variables needed
  • .woodpecker.yaml — no CI changes needed

Acceptance Criteria

  • CronJob runs daily at 03:00 UTC (1 hour after tf-state-backup)
  • Checks MinIO postgres-wal bucket for recent backup objects
  • Logs success/failure clearly
  • Uses existing cnpg MinIO IAM user credentials (reuse minio_iam_user.cnpg)

Test Expectations

  • tofu validate passes
  • tofu fmt -check passes
  • CronJob resource is syntactically correct
  • Run command: cd terraform && tofu fmt -check && tofu validate

Constraints

  • Follow the tf-state-backup CronJob pattern (same structure: alpine image, mc client, env_from with S3 creds)
  • Use the existing cnpg MinIO IAM user — no new IAM resources needed
  • Create a cnpg-s3-creds secret copy in the postgres namespace for the CronJob to use (or reference the existing one at kubernetes_secret_v1.cnpg_s3_creds)
  • Keep it simple — just verify backup objects exist and are recent, not full restore
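
One possible shape for the resource, as a rough sketch rather than the final implementation. It assumes the tf-state-backup pattern described above (alpine image, mc client, env_from with S3 creds). The resource and local names, the image tag, the mc download step, the MinIO endpoint URL, and the secret key names `ACCESS_KEY_ID` / `ACCESS_SECRET_KEY` are all assumptions that should be matched to what tf-state-backup and `kubernetes_secret_v1.cnpg_s3_creds` actually use. The `--newer-than 1d` filter on `mc find` should also be checked against the mc version in use.

```hcl
# Sketch only. Every name below that is not in the issue text is hypothetical.
locals {
  # Shell script run inside the container. Only $VAR and $(...) are used so that
  # Terraform does not try to interpolate anything in the heredoc.
  cnpg_backup_verify_script = <<-EOT
    set -eu
    # Assumption: mc is fetched at runtime, as tf-state-backup is presumed to do.
    wget -q -O /usr/local/bin/mc https://dl.min.io/client/mc/release/linux-amd64/mc
    chmod +x /usr/local/bin/mc
    mc alias set backup "$S3_ENDPOINT" "$ACCESS_KEY_ID" "$ACCESS_SECRET_KEY" >/dev/null
    # Count objects in the bucket modified within the last day.
    recent=$(mc find backup/postgres-wal --newer-than 1d | wc -l)
    if [ "$recent" -gt 0 ]; then
      echo "OK: $recent object(s) in postgres-wal modified within the last 24h"
    else
      echo "FAIL: no objects in postgres-wal modified within the last 24h" >&2
      exit 1
    fi
  EOT
}

resource "kubernetes_cron_job_v1" "cnpg_backup_verify" {
  metadata {
    name      = "cnpg-backup-verify"
    namespace = kubernetes_secret_v1.cnpg_s3_creds.metadata[0].namespace
  }

  spec {
    schedule                      = "0 3 * * *" # 03:00 daily; assumes the controller evaluates schedules in UTC
    concurrency_policy            = "Forbid"
    successful_jobs_history_limit = 3
    failed_jobs_history_limit     = 3

    job_template {
      metadata {}

      spec {
        backoff_limit = 1

        template {
          metadata {}

          spec {
            restart_policy = "Never"

            container {
              name    = "verify"
              image   = "alpine:3.20" # assumed tag; use the same one as tf-state-backup
              command = ["/bin/sh", "-c", local.cnpg_backup_verify_script]

              env {
                name  = "S3_ENDPOINT"
                value = "http://minio.minio.svc.cluster.local:9000" # hypothetical in-cluster URL
              }

              # Reuses the existing cnpg credentials; no new IAM resources.
              env_from {
                secret_ref {
                  name = kubernetes_secret_v1.cnpg_s3_creds.metadata[0].name
                }
              }
            }
          }
        }
      }
    }
  }
}
```

A non-zero exit marks the Job as failed, so a stale or empty bucket shows up both in the pod log and in the Job status. If the existing secret lives in a namespace other than the one the CronJob should run in, the secret-copy option from the constraint above applies instead of the direct reference.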

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes

Related

  • plan-pal-e-platform — Platform Hardening plan, Phase 13