Deploy CloudNativePG operator + Postgres cluster to k3s #11
Plan
plan-2026-02-26-tf-modularize-postgres -- Phase 2
Repo
forgejo_admin/pal-e-platform
User Story
As a platform operator
I want a shared Postgres instance running on k3s via CloudNativePG
So that services can migrate from SQLite to enterprise-grade Postgres with transactional DDL, automated failover, and continuous backup
Context
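pal-e-docs has had two production outages caused by SQLite auto-committing DDL during Alembic migrations. Root cause: in this setup SQLite DDL is committed immediately, so a migration that fails partway cannot be rolled back and leaves the schema half-migrated. Postgres runs DDL inside the migration transaction, which eliminates this entire class of bugs.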
We chose CloudNativePG (CNCF project) over Bitnami (too simple) and Zalando operator (heavier, older architecture). CloudNativePG gives us k8s-native CRDs, automated failover, built-in WAL archiving to object storage, and PgBouncer integration.
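A minimal sketch of the operator install, assuming the helm and kubernetes providers are already configured in terraform/main.tf. The namespace, resource names, and release name are placeholders and not values from this issue; only the chart repository URL is taken from the constraints listed here.

```hcl
# Sketch only: namespace, resource names, and release name are assumptions.
resource "kubernetes_namespace" "cnpg_system" {
  metadata {
    name = "cnpg-system" # hypothetical namespace dedicated to the operator
  }
}

resource "helm_release" "cnpg_operator" {
  name       = "cnpg" # hypothetical release name
  namespace  = kubernetes_namespace.cnpg_system.metadata[0].name
  repository = "https://cloudnative-pg.github.io/charts"
  chart      = "cloudnative-pg"

  # Pin a chart version in real use so that repeated `tofu plan` runs stay
  # reproducible; no version is named in this issue, so none is set here.
}
```

Keeping the operator in its own namespace matches the first acceptance criterion and keeps the operator lifecycle separate from the database cluster it manages.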
Architecture: shared Postgres cluster with per-service databases (same pattern as MinIO). One CloudNativePG Cluster resource, CREATE DATABASE per service.
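The shared-cluster pattern could look roughly like the sketch below: one CNPG Cluster resource, bootstrapped with the first per-service database. The cluster name, namespace, storage size, Secret name, and variable name are all assumptions for illustration.

```hcl
# Sketch only: every name and size below is hypothetical.
variable "pal_e_docs_db_password" {
  description = "Password for the pal_e_docs database owner"
  type        = string
  sensitive   = true
}

# CNPG expects a basic-auth Secret whose username matches the initdb owner.
resource "kubernetes_secret" "pal_e_docs_db" {
  metadata {
    name      = "pal-e-docs-db"
    namespace = "postgres" # hypothetical namespace, assumed to exist already
  }

  type = "kubernetes.io/basic-auth"

  data = {
    username = "pal_e_docs"
    password = var.pal_e_docs_db_password
  }
}

resource "kubernetes_manifest" "postgres_cluster" {
  manifest = {
    apiVersion = "postgresql.cnpg.io/v1"
    kind       = "Cluster"
    metadata = {
      name      = "shared-postgres"
      namespace = "postgres"
    }
    spec = {
      instances = 1 # a single primary, as in the acceptance criteria
      storage = {
        size = "10Gi" # assumed size
      }
      bootstrap = {
        initdb = {
          database = "pal_e_docs"
          owner    = "pal_e_docs"
          secret = {
            name = kubernetes_secret.pal_e_docs_db.metadata[0].name
          }
        }
      }
    }
  }
}
```

Two caveats. First, kubernetes_manifest needs the Cluster CRD to exist at plan time, so the operator release probably has to be applied before this resource can be planned. Second, initdb creates only one database; further per-service databases would need a separate mechanism, for example the Database resource available in recent CNPG releases, or plain CREATE DATABASE statements.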
MinIO is already deployed in the cluster and will serve as the WAL archive target for continuous backup with point-in-time recovery.
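WAL archiving to MinIO might be expressed as a backup block on the Cluster spec, sketched here as a local value. The bucket path matches the s3://postgres-wal/ target named in this issue; the MinIO endpoint, Secret name, and key names are assumptions.

```hcl
# Sketch only: endpoint URL, Secret name, and key names are hypothetical.
locals {
  cnpg_backup = {
    barmanObjectStore = {
      destinationPath = "s3://postgres-wal/"
      # Assumed in-cluster MinIO address; replace with the real service URL.
      endpointURL = "http://minio.minio.svc.cluster.local:9000"
      s3Credentials = {
        accessKeyId = {
          name = "postgres-wal-s3" # hypothetical Secret with MinIO credentials
          key  = "ACCESS_KEY_ID"
        }
        secretAccessKey = {
          name = "postgres-wal-s3"
          key  = "ACCESS_SECRET_KEY"
        }
      }
      wal = {
        compression = "gzip"
      }
    }
  }
}
```

The value would be attached as `backup = local.cnpg_backup` inside the Cluster manifest's spec. Newer CloudNativePG releases may favor a plugin-based backup setup, so the field names are worth checking against the operator version actually deployed.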
Key decisions:
s3://postgres-wal/)
File Targets
Files to create or modify:
terraform/main.tf: add CloudNativePG operator helm_release in a new section, add CNPG Cluster resource via kubernetes_manifest, add MinIO bucket for WAL archive
terraform/variables.tf: add Postgres-related variables (admin password, pal-e-docs db credentials)
terraform/terraform.tfvars or salt/pillar/secrets/platform.sls: actual credential values (GPG-encrypted if using Salt)
Files NOT to touch:
Acceptance Criteria
tofu apply, then CloudNativePG operator is installed in its own namespace
kubectl get clusters -A, then a Postgres cluster is running with 1 primary instance
psql -h <service> -U <user> -d pal_e_docs, then I connect successfully
postgres-wal, then WAL files are being archived
tofu plan after apply, then there is zero diff (idempotent)
Test Expectations
tofu validate passes
tofu fmt -check passes
tofu plan shows only additive changes (no modifications to existing resources)
cd terraform && tofu validate && tofu fmt -check
Constraints
tofu not terraform); this is a k3s cluster managed by tofu
cloudnative-pg/cloudnative-pg from https://cloudnative-pg.github.io/charts
sensitive = true
Checklist
tofu validate passes
tofu plan output included in PR
Related
phase-postgres-2-deploy-cnpg: phase note
todo-pal-e-docs-deployment-reliability: the incident analysis