Harden anchor_id column to NOT NULL #124

Closed
opened 2026-03-09 03:27:48 +00:00 by forgejo_admin · 1 comment

Lineage

todo-anchor-id-not-null — QA nit from PR #120 (Epilogue item 9b)

Repo

forgejo_admin/pal-e-docs

User Story

As a platform operator
I want the database schema to enforce the anchor_id invariant
So that no code path can accidentally create a block without an anchor_id

Context

PR #120 guarantees all blocks get anchor_ids via the parser and create_block API. But the DB column blocks.anchor_id remains nullable=True. The invariant is only enforced at the application layer. Adding NOT NULL to the column pushes enforcement to the schema level — belt and suspenders.

Must run AFTER PR #120's migration has backfilled all existing NULLs.

File Targets

Files the agent should modify:

  • src/pal_e_docs/models.py — change anchor_id column to nullable=False
  • alembic/versions/ — new migration: ALTER COLUMN anchor_id SET NOT NULL

Files the agent should NOT touch:

  • src/pal_e_docs/blocks/parser.py — already correct from PR #120
  • src/pal_e_docs/routes/blocks.py — already correct from PR #120

Acceptance Criteria

  • models.py has nullable=False on anchor_id
  • Alembic migration applies cleanly after PR #120's migration
  • Migration upgrade verifies zero NULLs exist before applying constraint
  • Migration downgrade reverts to nullable=True
  • All tests pass

Test Expectations

  • All existing tests pass: pytest tests/ -v
  • Migration applies cleanly: alembic upgrade head
  • Run command: pytest tests/ -v

Constraints

  • This migration MUST depend on PR #120's migration (use down_revision correctly)
  • Migration should fail-safe: check for NULLs before adding NOT NULL constraint
  • Issue #121 (test seed cleanup) should ideally be merged first so tests don't break

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes
  • todo-anchor-id-not-null — tracking TODO
  • PR #120 — parent change (must be deployed first)
  • Issue #121 — test seed cleanup (should merge first)
### Lineage `todo-anchor-id-not-null` — QA nit from PR #120 (Epilogue item 9b) ### Repo `forgejo_admin/pal-e-docs` ### User Story As a platform operator I want the database schema to enforce the anchor_id invariant So that no code path can accidentally create a block without an anchor_id ### Context PR #120 guarantees all blocks get anchor_ids via the parser and create_block API. But the DB column `blocks.anchor_id` remains `nullable=True`. The invariant is only enforced at the application layer. Adding `NOT NULL` to the column pushes enforcement to the schema level — belt and suspenders. Must run AFTER PR #120's migration has backfilled all existing NULLs. ### File Targets Files the agent should modify: - `src/pal_e_docs/models.py` — change `anchor_id` column to `nullable=False` - `alembic/versions/` — new migration: `ALTER COLUMN anchor_id SET NOT NULL` Files the agent should NOT touch: - `src/pal_e_docs/blocks/parser.py` — already correct from PR #120 - `src/pal_e_docs/routes/blocks.py` — already correct from PR #120 ### Acceptance Criteria - [ ] `models.py` has `nullable=False` on `anchor_id` - [ ] Alembic migration applies cleanly after PR #120's migration - [ ] Migration upgrade verifies zero NULLs exist before applying constraint - [ ] Migration downgrade reverts to `nullable=True` - [ ] All tests pass ### Test Expectations - [ ] All existing tests pass: `pytest tests/ -v` - [ ] Migration applies cleanly: `alembic upgrade head` - Run command: `pytest tests/ -v` ### Constraints - This migration MUST depend on PR #120's migration (use `down_revision` correctly) - Migration should fail-safe: check for NULLs before adding NOT NULL constraint - Issue #121 (test seed cleanup) should ideally be merged first so tests don't break ### Checklist - [ ] PR opened - [ ] Tests pass - [ ] No unrelated changes ### Related - `todo-anchor-id-not-null` — tracking TODO - PR #120 — parent change (must be deployed first) - Issue #121 — test seed cleanup (should merge first)
Author
Owner

PR #127 Review

BLOCKERS

None.

NITS

  1. sync.py:104 uses .get("anchor_id") which can return None -- In src/pal_e_docs/blocks/sync.py line 104, the Block constructor receives anchor_id=bd.get("anchor_id"). While this is currently safe (the parser always populates this key with a non-None value before returning), the .get() pattern silently returns None if the key is missing, which would now cause a NOT NULL violation at the DB level. Consider changing to bd["anchor_id"] so a missing key raises a KeyError at the application level rather than a cryptic IntegrityError from Postgres. This is non-blocking because the parser contract guarantees the key is present.

  2. recompile() line 42 still has if b.anchor_id: guard -- In sync.py line 42, the TOC builder filters on if b.anchor_id:. With the NOT NULL constraint, this condition is always truthy (empty strings are falsy but the parser never produces them). This guard is now redundant for heading blocks. Non-blocking -- defensive code is fine, just noting the dead branch.

SOP COMPLIANCE

  • Branch named after issue (124-anchor-id-not-null references issue #124)
  • PR body follows template (Summary, Changes, Test Plan, Related sections all present)
  • Related section references traceability note (todo-anchor-id-not-null) and Forgejo issue #124
  • No secrets, .env files, or credentials committed
  • No unnecessary file changes -- exactly 3 files changed, all directly related to the constraint hardening
  • Commit message is descriptive

CODE REVIEW

Migration chain is correct:

  • k1f2g3h4i5j6 (PR #120) -- backfill + unique constraint, depends on j0e1f2g3h4i5
  • l2g3h4i5j6k7 (PR #122) -- vector embeddings, depends on k1f2g3h4i5j6
  • m3h4i5j6k7l8 (this PR) -- NOT NULL hardening, depends on l2g3h4i5j6k7

Chain is linear and correct. The new migration sits at the head.

Safety check is correct: The upgrade function queries SELECT COUNT(*) FROM blocks WHERE anchor_id IS NULL and raises RuntimeError if any NULLs remain. This is a proper guard -- it will abort the migration before applying the constraint if prerequisites were somehow skipped.

Downgrade is safe: Reverts nullable=False back to nullable=True. This is a non-destructive operation -- existing data is preserved.

Model annotation matches: Mapped[str | None] with nullable=True changed to Mapped[str] with nullable=False. The SQLAlchemy type annotation and column definition are consistent.

existing_type=sa.String(200) matches the model definition: The column type in the migration matches the model's String(200).

Test changes are correct: Two test helpers in test_blocks_compiled_pages.py that created Block instances without anchor_id now provide explicit values following the {block_type}-{position} pattern. All anchor_id values are unique within their respective notes.

No unrelated changes: All 3 files (migration, model, test) are directly scoped to the NOT NULL hardening.

VERDICT: APPROVED

## PR #127 Review ### BLOCKERS None. ### NITS 1. **`sync.py:104` uses `.get("anchor_id")` which can return `None`** -- In `src/pal_e_docs/blocks/sync.py` line 104, the Block constructor receives `anchor_id=bd.get("anchor_id")`. While this is currently safe (the parser always populates this key with a non-None value before returning), the `.get()` pattern silently returns `None` if the key is missing, which would now cause a NOT NULL violation at the DB level. Consider changing to `bd["anchor_id"]` so a missing key raises a `KeyError` at the application level rather than a cryptic `IntegrityError` from Postgres. This is non-blocking because the parser contract guarantees the key is present. 2. **`recompile()` line 42 still has `if b.anchor_id:` guard** -- In `sync.py` line 42, the TOC builder filters on `if b.anchor_id:`. With the NOT NULL constraint, this condition is always truthy (empty strings are falsy but the parser never produces them). This guard is now redundant for heading blocks. Non-blocking -- defensive code is fine, just noting the dead branch. ### SOP COMPLIANCE - [x] Branch named after issue (`124-anchor-id-not-null` references issue #124) - [x] PR body follows template (Summary, Changes, Test Plan, Related sections all present) - [x] Related section references traceability note (`todo-anchor-id-not-null`) and Forgejo issue #124 - [x] No secrets, .env files, or credentials committed - [x] No unnecessary file changes -- exactly 3 files changed, all directly related to the constraint hardening - [x] Commit message is descriptive ### CODE REVIEW **Migration chain is correct:** - `k1f2g3h4i5j6` (PR #120) -- backfill + unique constraint, depends on `j0e1f2g3h4i5` - `l2g3h4i5j6k7` (PR #122) -- vector embeddings, depends on `k1f2g3h4i5j6` - `m3h4i5j6k7l8` (this PR) -- NOT NULL hardening, depends on `l2g3h4i5j6k7` Chain is linear and correct. The new migration sits at the head. **Safety check is correct:** The upgrade function queries `SELECT COUNT(*) FROM blocks WHERE anchor_id IS NULL` and raises `RuntimeError` if any NULLs remain. This is a proper guard -- it will abort the migration before applying the constraint if prerequisites were somehow skipped. **Downgrade is safe:** Reverts `nullable=False` back to `nullable=True`. This is a non-destructive operation -- existing data is preserved. **Model annotation matches:** `Mapped[str | None]` with `nullable=True` changed to `Mapped[str]` with `nullable=False`. The SQLAlchemy type annotation and column definition are consistent. **`existing_type=sa.String(200)` matches the model definition:** The column type in the migration matches the model's `String(200)`. **Test changes are correct:** Two test helpers in `test_blocks_compiled_pages.py` that created Block instances without `anchor_id` now provide explicit values following the `{block_type}-{position}` pattern. All anchor_id values are unique within their respective notes. **No unrelated changes:** All 3 files (migration, model, test) are directly scoped to the NOT NULL hardening. ### VERDICT: APPROVED
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
forgejo_admin/pal-e-api#124
No description provided.