ldraney/pal-e-api

Archived

Fork 0

Migrate pal-e-docs from SQLite to Postgres (CNPG) #76

New issue

Closed

opened 2026-03-06 17:13:25 +00:00 by forgejo_admin · 2 comments

forgejo_admin commented

2026-03-06 17:13:25 +00:00

Contributor

Plan

plan-2026-02-26-tf-modularize-postgres -- Phase 3, Parts B/C/D (traceability only)

Repo

forgejo_admin/pal-e-docs

User Story

As a platform operator
I want pal-e-docs running on the shared CNPG Postgres cluster
So that migrations are safe (no SQLite auto-commit DDL crashes) and we unlock future search capabilities

Context

pal-e-docs currently uses SQLite with Litestream for backup/replication. A CNPG Postgres cluster is already running:

Cluster: pal-e-postgres in postgres namespace (healthy, Postgres 17.4)
Database: paledocs, user: paledocs
Endpoint: pal-e-postgres-rw.postgres.svc.cluster.local:5432
Connection string will be provided via k8s secret paledocs-db-url key DATABASE_URL in the pal-e-docs namespace (created by Terraform — pal-e-platform issue #22, applied before this work deploys)

ArgoCD auto-syncs from pal-e-docs/k8s/ on main with prune + self-heal enabled. When PVC and ConfigMap are removed from kustomization.yaml, ArgoCD will delete them. This is intentional.

This is a two-PR delivery (see Deployment Sequence below).

SQLite-isms in Existing Migrations

The following patterns in alembic/versions/ are SQLite-specific and will fail on Postgres:

sa.text('(CURRENT_TIMESTAMP)') — files: 54714bc57fbe, a1b2c3d4e5f6, b2c3d4e5f6a7. Postgres wants CURRENT_TIMESTAMP (no parens) or now().
server_default=sa.text('1') for Boolean — file: b2c3d4e5f6a7 (users table is_approved). Postgres wants true.
f6a7b8c9d0e1 (sprint schema expansion) — upgrade is pass (no-op on SQLite). On Postgres with native_enum=False, CHECK constraints for expanded enum values may need handling.

Recommended approach: Since we're running against an empty Postgres database, the cleanest path is:

Option A: Fix each migration to be dialect-aware (add if context.get_x_argument... or use sa.func.now() instead of sa.text)
Option B: Generate a single fresh baseline migration from the current models for Postgres, squashing the chain
Preference: Option A — fix in-place so the migration chain works on both SQLite (local dev) and Postgres (production). Replace sa.text('(CURRENT_TIMESTAMP)') with sa.func.now(), replace sa.text('1') with sa.text('true') guarded by dialect check or just use sa.true_().

File Targets

PR 1 — Code changes (Postgres support, backward compatible):

src/pal_e_docs/config.py -- add database_url: str | None = None setting
src/pal_e_docs/database.py -- use database_url when set, fall back to SQLite. Conditionally attach SQLite pragma listener.
alembic/env.py -- use settings.database_url when available
alembic/versions/54714bc57fbe_initial_schema.py -- fix (CURRENT_TIMESTAMP) → sa.func.now()
alembic/versions/a1b2c3d4e5f6_add_repos_table.py -- same fix
alembic/versions/b2c3d4e5f6a7_add_users_table.py -- fix timestamp + boolean default
alembic/versions/c3d4e5f6a7b8_add_project_is_public_and_page_note_fk.py -- fix boolean default if present
alembic/versions/f6a7b8c9d0e1_sprint_schema_expansion.py -- add Postgres CHECK constraint handling
pyproject.toml -- add psycopg2-binary dependency (or psycopg[binary])
Dockerfile -- if psycopg2-binary needs system deps, add them

Files NOT to touch:

k8s/* -- that's PR 2
tests/ -- existing tests should still pass against SQLite

PR 2 — Manifest changes (switch to Postgres, remove Litestream):

k8s/deployment.yaml -- replace PALDOCS_DATABASE_PATH env with PALDOCS_DATABASE_URL from secret, remove Litestream init + sidecar containers, remove data volume + litestream-config volume
k8s/kustomization.yaml -- remove pvc.yaml and litestream-configmap.yaml
k8s/pvc.yaml -- DELETE this file
k8s/litestream-configmap.yaml -- DELETE this file

Data migration script (can live in repo root or scripts/):

scripts/migrate_sqlite_to_postgres.py -- dump SQLite → insert into Postgres, reset PK sequences

Acceptance Criteria

App starts successfully with PALDOCS_DATABASE_URL=postgresql://... env var
App still starts with PALDOCS_DATABASE_PATH=/data/foo.db (backward compat for local dev)
alembic upgrade head runs cleanly against an empty Postgres database
All existing API endpoints return correct data after migration
MCP tools (list_notes, get_note, create_note, etc.) work against Postgres
Litestream containers removed from deployment
PVC and ConfigMap removed from kustomization

Test Expectations

Existing tests pass (they use SQLite — no change needed)
Manual verification: run alembic upgrade head against a local Postgres
Manual verification: run data migration script against test data
Run: pytest tests/ (should pass unchanged for PR 1)

Deployment Sequence (CRITICAL)

1. Merge PR 1 (code) → ArgoCD deploys. App still uses SQLite. Nothing changes.
2. Lucas runs `tofu apply` on pal-e-platform (creates DB secret)
3. Lucas runs `alembic upgrade head` against Postgres (creates schema)
4. Lucas runs `scripts/migrate_sqlite_to_postgres.py` (loads data)
5. Lucas verifies data in Postgres
6. Merge PR 2 (manifests) → ArgoCD deploys. App switches to Postgres. Litestream removed.
7. Verify app is healthy, API responds, data intact.

Steps 2-5 are manual — coordinated by Betty Sue during a short maintenance window. The dev agent delivers PR 1, PR 2, and the migration script. Lucas handles the operational steps.

Constraints

database.py engine creation must be at module level (current pattern) — don't change to lazy init
Keep config.py using pydantic-settings with env_prefix = "PALDOCS_"
The PALDOCS_DATABASE_URL env var name must match the config field name (database_url) with the prefix
psycopg2-binary is fine for now (avoids libpq build dep). Can switch to psycopg[binary] (psycopg3) if preferred.
Do NOT run alembic upgrade head against production — that's Lucas's job during the maintenance window
PR 1 and PR 2 should be separate branches and separate PRs

Checklist

PR 1 opened (code changes)
PR 2 opened (manifest changes)
Migration script included
Tests pass
No unrelated changes

pal-e-platform project -- issue #22 (Terraform secret) must be applied before PR 2 merges
plan-2026-02-25-platform-observability -- Litestream removal affects backup strategy

### Plan `plan-2026-02-26-tf-modularize-postgres` -- Phase 3, Parts B/C/D (traceability only) ### Repo `forgejo_admin/pal-e-docs` ### User Story As a platform operator I want pal-e-docs running on the shared CNPG Postgres cluster So that migrations are safe (no SQLite auto-commit DDL crashes) and we unlock future search capabilities ### Context pal-e-docs currently uses SQLite with Litestream for backup/replication. A CNPG Postgres cluster is already running: - Cluster: `pal-e-postgres` in `postgres` namespace (healthy, Postgres 17.4) - Database: `paledocs`, user: `paledocs` - Endpoint: `pal-e-postgres-rw.postgres.svc.cluster.local:5432` - Connection string will be provided via k8s secret `paledocs-db-url` key `DATABASE_URL` in the `pal-e-docs` namespace (created by Terraform — pal-e-platform issue #22, applied before this work deploys) ArgoCD auto-syncs from `pal-e-docs/k8s/` on main with prune + self-heal enabled. When PVC and ConfigMap are removed from kustomization.yaml, ArgoCD will delete them. This is intentional. **This is a two-PR delivery (see Deployment Sequence below).** ### SQLite-isms in Existing Migrations The following patterns in `alembic/versions/` are SQLite-specific and will fail on Postgres: 1. **`sa.text('(CURRENT_TIMESTAMP)')`** — files: `54714bc57fbe`, `a1b2c3d4e5f6`, `b2c3d4e5f6a7`. Postgres wants `CURRENT_TIMESTAMP` (no parens) or `now()`. 2. **`server_default=sa.text('1')` for Boolean** — file: `b2c3d4e5f6a7` (users table `is_approved`). Postgres wants `true`. 3. **`f6a7b8c9d0e1` (sprint schema expansion)** — upgrade is `pass` (no-op on SQLite). On Postgres with `native_enum=False`, CHECK constraints for expanded enum values may need handling. **Recommended approach:** Since we're running against an empty Postgres database, the cleanest path is: - Option A: Fix each migration to be dialect-aware (add `if context.get_x_argument...` or use `sa.func.now()` instead of `sa.text`) - Option B: Generate a single fresh baseline migration from the current models for Postgres, squashing the chain - **Preference: Option A** — fix in-place so the migration chain works on both SQLite (local dev) and Postgres (production). Replace `sa.text('(CURRENT_TIMESTAMP)')` with `sa.func.now()`, replace `sa.text('1')` with `sa.text('true')` guarded by dialect check or just use `sa.true_()`. ### File Targets **PR 1 — Code changes (Postgres support, backward compatible):** - `src/pal_e_docs/config.py` -- add `database_url: str | None = None` setting - `src/pal_e_docs/database.py` -- use `database_url` when set, fall back to SQLite. Conditionally attach SQLite pragma listener. - `alembic/env.py` -- use `settings.database_url` when available - `alembic/versions/54714bc57fbe_initial_schema.py` -- fix `(CURRENT_TIMESTAMP)` → `sa.func.now()` - `alembic/versions/a1b2c3d4e5f6_add_repos_table.py` -- same fix - `alembic/versions/b2c3d4e5f6a7_add_users_table.py` -- fix timestamp + boolean default - `alembic/versions/c3d4e5f6a7b8_add_project_is_public_and_page_note_fk.py` -- fix boolean default if present - `alembic/versions/f6a7b8c9d0e1_sprint_schema_expansion.py` -- add Postgres CHECK constraint handling - `pyproject.toml` -- add `psycopg2-binary` dependency (or `psycopg[binary]`) - `Dockerfile` -- if psycopg2-binary needs system deps, add them Files NOT to touch: - `k8s/*` -- that's PR 2 - `tests/` -- existing tests should still pass against SQLite **PR 2 — Manifest changes (switch to Postgres, remove Litestream):** - `k8s/deployment.yaml` -- replace `PALDOCS_DATABASE_PATH` env with `PALDOCS_DATABASE_URL` from secret, remove Litestream init + sidecar containers, remove data volume + litestream-config volume - `k8s/kustomization.yaml` -- remove `pvc.yaml` and `litestream-configmap.yaml` - `k8s/pvc.yaml` -- DELETE this file - `k8s/litestream-configmap.yaml` -- DELETE this file **Data migration script (can live in repo root or scripts/):** - `scripts/migrate_sqlite_to_postgres.py` -- dump SQLite → insert into Postgres, reset PK sequences ### Acceptance Criteria - [ ] App starts successfully with `PALDOCS_DATABASE_URL=postgresql://...` env var - [ ] App still starts with `PALDOCS_DATABASE_PATH=/data/foo.db` (backward compat for local dev) - [ ] `alembic upgrade head` runs cleanly against an empty Postgres database - [ ] All existing API endpoints return correct data after migration - [ ] MCP tools (list_notes, get_note, create_note, etc.) work against Postgres - [ ] Litestream containers removed from deployment - [ ] PVC and ConfigMap removed from kustomization ### Test Expectations - [ ] Existing tests pass (they use SQLite — no change needed) - [ ] Manual verification: run `alembic upgrade head` against a local Postgres - [ ] Manual verification: run data migration script against test data - Run: `pytest tests/` (should pass unchanged for PR 1) ### Deployment Sequence (CRITICAL) ``` 1. Merge PR 1 (code) → ArgoCD deploys. App still uses SQLite. Nothing changes. 2. Lucas runs `tofu apply` on pal-e-platform (creates DB secret) 3. Lucas runs `alembic upgrade head` against Postgres (creates schema) 4. Lucas runs `scripts/migrate_sqlite_to_postgres.py` (loads data) 5. Lucas verifies data in Postgres 6. Merge PR 2 (manifests) → ArgoCD deploys. App switches to Postgres. Litestream removed. 7. Verify app is healthy, API responds, data intact. ``` Steps 2-5 are manual — coordinated by Betty Sue during a short maintenance window. The dev agent delivers PR 1, PR 2, and the migration script. Lucas handles the operational steps. ### Constraints - `database.py` engine creation must be at module level (current pattern) — don't change to lazy init - Keep `config.py` using pydantic-settings with `env_prefix = "PALDOCS_"` - The `PALDOCS_DATABASE_URL` env var name must match the config field name (`database_url`) with the prefix - `psycopg2-binary` is fine for now (avoids libpq build dep). Can switch to `psycopg[binary]` (psycopg3) if preferred. - Do NOT run `alembic upgrade head` against production — that's Lucas's job during the maintenance window - PR 1 and PR 2 should be separate branches and separate PRs ### Checklist - [ ] PR 1 opened (code changes) - [ ] PR 2 opened (manifest changes) - [ ] Migration script included - [ ] Tests pass - [ ] No unrelated changes ### Related - `pal-e-platform` project -- issue #22 (Terraform secret) must be applied before PR 2 merges - `plan-2026-02-25-platform-observability` -- Litestream removal affects backup strategy

forgejo_admin referenced this issue from a commit

2026-03-06 17:37:05 +00:00

Add Postgres support with SQLite backward compatibility

forgejo_admin referenced this issue

2026-03-06 17:37:45 +00:00

Add Postgres support with SQLite backward compatibility #77

forgejo_admin referenced this issue from a commit

2026-03-06 17:38:38 +00:00

Switch k8s manifests from SQLite/Litestream to Postgres (CNPG)

forgejo_admin referenced this issue

2026-03-06 17:38:51 +00:00

Switch k8s manifests from SQLite/Litestream to Postgres #78

forgejo_admin commented

2026-03-06 17:41:47 +00:00

Author

Contributor

PR #78 Review

BLOCKERS

None.

NITS

None. The diff is clean and minimal -- 84 lines removed, 5 added. Exactly the right scope for an infrastructure swap.

KEY VERIFICATIONS (per review request)

Secret name paledocs-db-url with key DATABASE_URL -- Confirmed. Matches Terraform resource at main.tf:1108-1117 (kubernetes_secret_v1.paledocs_db_url). Name: paledocs-db-url, key: DATABASE_URL.
Env var PALDOCS_DATABASE_URL -- Confirmed. Matches PR #77's config.py field database_url (pydantic-settings maps PALDOCS_ prefix + field name to env var PALDOCS_DATABASE_URL).
Litestream init container fully removed -- Confirmed. The litestream-restore init container (image, args, envFrom, volumeMounts, resources) is entirely gone.
Litestream sidecar container fully removed -- Confirmed. The litestream-replicate sidecar container is entirely gone.
PVC volume and all volume mounts removed -- Confirmed. The volumes section (PVC pal-e-docs-data + configmap litestream-config) and the main container's volumeMounts for /data are all removed.
pvc.yaml deleted -- Confirmed. File deleted in diff.
litestream-configmap.yaml deleted -- Confirmed. File deleted in diff.
kustomization.yaml updated -- Confirmed. References to both pvc.yaml and litestream-configmap.yaml removed.
No orphaned volume mounts or references remain -- Confirmed. The resulting deployment has zero volumes, volumeMounts, or initContainers sections.

SOP COMPLIANCE

Branch named after issue (76-postgres-manifest-switch references issue #76)
PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related Notes)
Related references plan slug (plan-2026-02-26-tf-modularize-postgres)
No secrets committed (only secretKeyRef references, no values)
No unnecessary file changes (4 files, all directly in scope)
Scope is tight -- pure removal of SQLite/Litestream infra + addition of one env var from secret

VERDICT: APPROVED

The manifest changes are correct, complete, and well-scoped. All three layers (Terraform secret, k8s deployment, Python config) agree on naming. The deploy ordering documented in the PR body and test plan is appropriate -- this must be merged after PR #77, Terraform apply, alembic migration, and data migration.

## PR #78 Review ### BLOCKERS None. ### NITS None. The diff is clean and minimal -- 84 lines removed, 5 added. Exactly the right scope for an infrastructure swap. ### KEY VERIFICATIONS (per review request) - **Secret name `paledocs-db-url` with key `DATABASE_URL`** -- Confirmed. Matches Terraform resource at `main.tf:1108-1117` (`kubernetes_secret_v1.paledocs_db_url`). Name: `paledocs-db-url`, key: `DATABASE_URL`. - **Env var `PALDOCS_DATABASE_URL`** -- Confirmed. Matches PR #77's `config.py` field `database_url` (pydantic-settings maps `PALDOCS_` prefix + field name to env var `PALDOCS_DATABASE_URL`). - **Litestream init container fully removed** -- Confirmed. The `litestream-restore` init container (image, args, envFrom, volumeMounts, resources) is entirely gone. - **Litestream sidecar container fully removed** -- Confirmed. The `litestream-replicate` sidecar container is entirely gone. - **PVC volume and all volume mounts removed** -- Confirmed. The `volumes` section (PVC `pal-e-docs-data` + configmap `litestream-config`) and the main container's `volumeMounts` for `/data` are all removed. - **`pvc.yaml` deleted** -- Confirmed. File deleted in diff. - **`litestream-configmap.yaml` deleted** -- Confirmed. File deleted in diff. - **`kustomization.yaml` updated** -- Confirmed. References to both `pvc.yaml` and `litestream-configmap.yaml` removed. - **No orphaned volume mounts or references remain** -- Confirmed. The resulting deployment has zero `volumes`, `volumeMounts`, or `initContainers` sections. ### SOP COMPLIANCE - [x] Branch named after issue (`76-postgres-manifest-switch` references issue #76) - [x] PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related Notes) - [x] Related references plan slug (`plan-2026-02-26-tf-modularize-postgres`) - [x] No secrets committed (only secretKeyRef references, no values) - [x] No unnecessary file changes (4 files, all directly in scope) - [x] Scope is tight -- pure removal of SQLite/Litestream infra + addition of one env var from secret ### VERDICT: APPROVED The manifest changes are correct, complete, and well-scoped. All three layers (Terraform secret, k8s deployment, Python config) agree on naming. The deploy ordering documented in the PR body and test plan is appropriate -- this must be merged after PR #77, Terraform apply, alembic migration, and data migration.

forgejo_admin commented

2026-03-06 17:42:11 +00:00

Author

Contributor

PR #77 Review

BLOCKERS

None found. The code is correct and well-scoped.

NITS

Unused import in migration script (scripts/migrate_sqlite_to_postgres.py line 15): from sqlalchemy.orm import Session # noqa: F401 is imported but never used. The noqa suppresses the lint warning, but the import should just be removed entirely.
psycopg2-binary vs psycopg2: Using psycopg2-binary is fine for containerized deployments, but worth noting that some teams prefer the source-built psycopg2 for production. Not a blocker -- just a conscious tradeoff to be aware of.
alembic/env.py model imports: The diff shows Sprint and SprintItem models are not imported in the updated alembic/env.py. The current main branch also lacks them (it only imports Note, NoteLink, NoteRevision, NoteTag, Project, Repo, Tag, User). This is a pre-existing gap, not introduced by this PR, but worth noting since those models need to be imported for autogenerate to detect them. Since the sprint migration files were hand-written, this has not caused problems yet.

OBSERVATIONS (Positive)

All SQLite-isms fixed: Every migration file that used sa.text('(CURRENT_TIMESTAMP)') or sa.text('1') has been updated. The later migrations (d4e5f6a7b8c9 through h8c9d0e1f2g3) already used portable constructs -- no gaps.
database.py rewrite is clean: The _build_engine() function correctly branches on settings.database_url, and the SQLite pragma listener is conditionally registered only when the dialect is sqlite. No pragma leakage to Postgres.
config.py change is minimal and correct: database_url: str | None = None maps to PALDOCS_DATABASE_URL via the PALDOCS_ env prefix. When unset, falls back to SQLite. Zero behavior change for existing deployments.
alembic/env.py correctly prefers Postgres URL: Same logic as database.py -- settings.database_url when set, SQLite path otherwise.
Migration script handles FK order correctly: TABLE_ORDER lists parents before children (projects before notes, notes before note_tags/note_links/note_revisions, sprints before sprint_items). All 10 tables accounted for.
Sequence reset is correct: SEQUENCE_TABLES correctly excludes note_tags and note_links (composite PKs, no auto-increment sequences). The setval() call with false as the third argument means the next nextval() returns the set value, which is MAX(id) + 1 -- correct.
Credential masking in migration script: pg_url.split('@')[-1] strips user:pass before printing. Good security practice.
Scope is tight: 9 files changed, 160 additions, 18 deletions. No scope creep. This is purely code-side Postgres support with no infrastructure changes (that is deferred to PR #78).

SOP COMPLIANCE

Branch named after issue: 76-postgres-code-changes references issue #76
PR body has: Summary, Changes, Test Plan, Related -- all present and detailed
Related section references plan slug: plan-2026-02-26-tf-modularize-postgres
No secrets, .env files, or credentials committed
No unnecessary file changes (scope is clean)
Commit messages: PR title is descriptive
290 tests pass per PR description (SQLite path exercised)

VERDICT: APPROVED

## PR #77 Review ### BLOCKERS None found. The code is correct and well-scoped. ### NITS 1. **Unused import in migration script** (`scripts/migrate_sqlite_to_postgres.py` line 15): `from sqlalchemy.orm import Session # noqa: F401` is imported but never used. The `noqa` suppresses the lint warning, but the import should just be removed entirely. 2. **`psycopg2-binary` vs `psycopg2`**: Using `psycopg2-binary` is fine for containerized deployments, but worth noting that some teams prefer the source-built `psycopg2` for production. Not a blocker -- just a conscious tradeoff to be aware of. 3. **`alembic/env.py` model imports**: The diff shows Sprint and SprintItem models are not imported in the updated `alembic/env.py`. The current main branch also lacks them (it only imports Note, NoteLink, NoteRevision, NoteTag, Project, Repo, Tag, User). This is a pre-existing gap, not introduced by this PR, but worth noting since those models need to be imported for autogenerate to detect them. Since the sprint migration files were hand-written, this has not caused problems yet. ### OBSERVATIONS (Positive) - **All SQLite-isms fixed**: Every migration file that used `sa.text('(CURRENT_TIMESTAMP)')` or `sa.text('1')` has been updated. The later migrations (`d4e5f6a7b8c9` through `h8c9d0e1f2g3`) already used portable constructs -- no gaps. - **`database.py` rewrite is clean**: The `_build_engine()` function correctly branches on `settings.database_url`, and the SQLite pragma listener is conditionally registered only when the dialect is sqlite. No pragma leakage to Postgres. - **`config.py` change is minimal and correct**: `database_url: str | None = None` maps to `PALDOCS_DATABASE_URL` via the `PALDOCS_` env prefix. When unset, falls back to SQLite. Zero behavior change for existing deployments. - **`alembic/env.py` correctly prefers Postgres URL**: Same logic as database.py -- `settings.database_url` when set, SQLite path otherwise. - **Migration script handles FK order correctly**: TABLE_ORDER lists parents before children (projects before notes, notes before note_tags/note_links/note_revisions, sprints before sprint_items). All 10 tables accounted for. - **Sequence reset is correct**: SEQUENCE_TABLES correctly excludes `note_tags` and `note_links` (composite PKs, no auto-increment sequences). The `setval()` call with `false` as the third argument means the next `nextval()` returns the set value, which is `MAX(id) + 1` -- correct. - **Credential masking in migration script**: `pg_url.split('@')[-1]` strips user:pass before printing. Good security practice. - **Scope is tight**: 9 files changed, 160 additions, 18 deletions. No scope creep. This is purely code-side Postgres support with no infrastructure changes (that is deferred to PR #78). ### SOP COMPLIANCE - [x] Branch named after issue: `76-postgres-code-changes` references issue #76 - [x] PR body has: Summary, Changes, Test Plan, Related -- all present and detailed - [x] Related section references plan slug: `plan-2026-02-26-tf-modularize-postgres` - [x] No secrets, .env files, or credentials committed - [x] No unnecessary file changes (scope is clean) - [x] Commit messages: PR title is descriptive - [x] 290 tests pass per PR description (SQLite path exercised) ### VERDICT: APPROVED

forgejo_admin closed this issue

2026-03-06 19:51:38 +00:00

Commenting is not possible because the repository is archived.