INCIDENT: Drop HNSW index before vector dimension ALTER #159

Merged

forgejo_admin merged 1 commit from 158-fix-migration-drop-hnsw-index into main

2026-03-14 17:19:58 +00:00

forgejo_admin commented

2026-03-14 17:16:29 +00:00

Owner

Summary

Fixes CrashLoopBackOff on both pal-e-docs pods. The Alembic migration from PR #157 tried to ALTER the embedding column to vector(2560), but the existing HNSW index has a hard 2000-dimension limit in pgvector. The fix drops the index before the ALTER.

Changes

alembic/versions/o5j6k7l8m9n0_fix_embedding_vector_dimension.py -- Added DROP INDEX IF EXISTS ix_blocks_embedding before the ALTER TABLE statement. Downgrade restores the HNSW index on vector(768) for reversibility. No index recreation in upgrade (sequential scan is fine for ~5600 blocks).

Test Plan

[x] pytest tests/ -v -- 497 passed
[x] ruff check + ruff format --check -- clean
[ ] After deploy: verify pods are Running (no CrashLoopBackOff)
[ ] After deploy: verify embedding worker starts processing blocks

Review Checklist

[x] No unrelated changes
[x] Migration is idempotent (DROP INDEX IF EXISTS)
[x] Downgrade is symmetric (restores HNSW index on vector(768))
[ ] Reviewer: confirm no other indexes reference the embedding column

Related

Plan: plan-pal-e-docs
Forgejo issue: #158

Closes #158

forgejo_admin added 1 commit

2026-03-14 17:16:30 +00:00

Fix migration crash: drop HNSW index before vector dimension ALTER

All checks were successful

ci/woodpecker/pr/woodpecker Pipeline was successful

568b0eb21c

The HNSW index has a hard 2000-dimension limit in pgvector. The migration
to vector(2560) was failing because it tried to ALTER the column type
without first dropping the index. This caused CrashLoopBackOff on both
pal-e-docs pods since Alembic runs on startup.

- Drop ix_blocks_embedding before ALTER TABLE
- Do not recreate index (sequential scan is fine for ~5600 blocks)
- Downgrade restores the HNSW index on vector(768) for reversibility

Closes #158

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

forgejo_admin commented

2026-03-14 17:18:00 +00:00

Author

Owner

PR #159 Review

BLOCKERS

None.

CODE ANALYSIS

Upgrade path (correct):

1. DROP INDEX IF EXISTS ix_blocks_embedding -- removes the HNSW index that has a hard 2000-dimension limit in pgvector
2. ALTER TABLE blocks ALTER COLUMN embedding TYPE vector(2560) -- now succeeds without the index blocking it
3. UPDATE blocks SET embedding_status = 'pending' WHERE embedding_status = 'error' -- requeues previously-failed blocks

The ordering is correct. The IF EXISTS guard makes the DROP step idempotent (safe to re-run if an earlier attempt already removed the index). Omitting index recreation is intentional -- a sequential scan over ~5,600 blocks is fine, and an HNSW index on a vector column cannot support 2560 dimensions anyway.
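The upgrade ordering above can be sketched in Python as follows. This is a minimal illustration, not the contents of the actual migration file: the helper name `run_upgrade` and the `execute` parameter are hypothetical, and the SQL strings are copied from the three steps listed in this review. In a real Alembic migration the callable would presumably be `op.execute`.

```python
from typing import Callable, List

# SQL copied from the upgrade steps in this review. The order matters:
# the HNSW index must be gone before the column is widened past 2000 dimensions.
UPGRADE_STATEMENTS: List[str] = [
    "DROP INDEX IF EXISTS ix_blocks_embedding",
    "ALTER TABLE blocks ALTER COLUMN embedding TYPE vector(2560)",
    "UPDATE blocks SET embedding_status = 'pending' "
    "WHERE embedding_status = 'error'",
]


def run_upgrade(execute: Callable[[str], None]) -> None:
    """Apply the upgrade statements in order.

    `execute` is a hypothetical hook; inside an Alembic migration it
    would be something like `op.execute`.
    """
    for statement in UPGRADE_STATEMENTS:
        execute(statement)


if __name__ == "__main__":
    # Dry run: print the statements instead of sending them to a database.
    run_upgrade(print)
```

Passing `print` (or a list's `append`) as the hook gives a dry run that makes the statement order easy to inspect without touching a database.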

Downgrade path (correct):

1. ALTER TABLE blocks ALTER COLUMN embedding TYPE vector(768) -- shrinks column first (will fail if 2560-dim data exists, documented in comment)
2. CREATE INDEX ix_blocks_embedding ON blocks USING hnsw (embedding vector_cosine_ops) -- recreates index on 768-dim column (within 2000-dim limit)

Ordering is correct: column must be 768-dim before the HNSW index can be created on it.
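The downgrade ordering can be sketched the same way. The function names, the `target_dims` parameter, and the `HNSW_MAX_DIMENSIONS` constant are assumptions made for illustration; only the two SQL statements and the 2000-dimension limit come from this review. The guard simply refuses to emit a CREATE INDEX that would exceed the limit.

```python
from typing import Callable, List

# Limit cited in this review for pgvector HNSW indexes (assumed constant name).
HNSW_MAX_DIMENSIONS = 2000


def downgrade_statements(target_dims: int = 768) -> List[str]:
    """Return the downgrade SQL in the order it must run.

    The column is shrunk first so the HNSW index is only ever created
    on a column within the dimension limit.
    """
    if target_dims > HNSW_MAX_DIMENSIONS:
        raise ValueError(
            f"HNSW index cannot be created on vector({target_dims}); "
            f"limit is {HNSW_MAX_DIMENSIONS} dimensions"
        )
    return [
        f"ALTER TABLE blocks ALTER COLUMN embedding TYPE vector({target_dims})",
        "CREATE INDEX ix_blocks_embedding ON blocks "
        "USING hnsw (embedding vector_cosine_ops)",
    ]


def run_downgrade(execute: Callable[[str], None], target_dims: int = 768) -> None:
    """Apply the downgrade; `execute` is a hypothetical hook such as `op.execute`."""
    for statement in downgrade_statements(target_dims):
        execute(statement)


if __name__ == "__main__":
    run_downgrade(print)
```

Note that this sketch does not model the failure mode mentioned above: the ALTER would still fail at the database level if 2560-dimension data is present in the column.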

Cross-reference verified:

The original HNSW index was created in migration l2g3h4i5j6k7_add_vector_embeddings.py (lines 47-49). That is the only CREATE INDEX on the embedding column in the migration history. No other indexes reference the embedding column.
The SQLAlchemy model at /home/ldraney/pal-e-docs/src/pal_e_docs/models.py:207 already has Vector(2560), confirming the model matches the migration target.
down_revision points to n4i5j6k7l8m9 (the boards migration), which is the correct parent in the chain.
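The cross-reference above was done against the migration history. As a complementary check against a live database, one could list the indexes on the table from PostgreSQL's `pg_indexes` view and filter for those whose column list mentions `embedding`. The sketch below is an assumption about how that might look: the helper `indexes_on_column` and the sample rows (including the `ix_blocks_status` index) are hypothetical, and the query string is shown only for reference rather than executed.

```python
import re
from typing import Iterable, List, Tuple

# Query one might run to fetch the rows; pg_indexes exposes indexname and indexdef.
PG_INDEXES_QUERY = (
    "SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'blocks'"
)


def indexes_on_column(rows: Iterable[Tuple[str, str]], column: str) -> List[str]:
    """Return names of indexes whose column list mentions `column`.

    Only the text after the first "(" in indexdef is searched, so an
    index *name* containing the column name does not count as a hit.
    Word boundaries keep `embedding` from matching `embedding_status`.
    """
    pattern = re.compile(rf"\b{re.escape(column)}\b")
    hits = []
    for name, definition in rows:
        _, _, column_part = definition.partition("(")
        if pattern.search(column_part):
            hits.append(name)
    return hits


if __name__ == "__main__":
    # Hypothetical rows, shaped like the output of PG_INDEXES_QUERY.
    sample_rows = [
        ("blocks_pkey",
         "CREATE UNIQUE INDEX blocks_pkey ON public.blocks USING btree (id)"),
        ("ix_blocks_status",
         "CREATE INDEX ix_blocks_status ON public.blocks USING btree (embedding_status)"),
        ("ix_blocks_embedding",
         "CREATE INDEX ix_blocks_embedding ON public.blocks "
         "USING hnsw (embedding vector_cosine_ops)"),
    ]
    print(indexes_on_column(sample_rows, "embedding"))
```

An empty result after the upgrade would support the reviewer checklist item that no other indexes reference the embedding column. The text match is a heuristic and would not catch every expression index.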

NITS

None. This is a clean, minimal incident fix.

SOP COMPLIANCE

[x] Branch named after issue: 158-fix-migration-drop-hnsw-index references issue #158
[x] PR body follows template: Summary, Changes, Test Plan, Review Checklist, Related sections all present
[x] Related references plan slug: plan-pal-e-docs
[x] Closes #158 present in PR body
[x] No secrets committed
[x] No scope creep: 1 file changed, +10/-0 lines, all directly related to the HNSW index fix
[x] Single migration file only -- no unrelated changes

VERDICT: APPROVED

Clean incident fix. Correct ordering, idempotent, symmetric downgrade, no scope creep. Ship it.

forgejo_admin merged commit 0147be264a into main

2026-03-14 17:19:58 +00:00

forgejo_admin deleted branch 158-fix-migration-drop-hnsw-index

2026-03-14 17:19:58 +00:00

forgejo_admin referenced this pull request from a commit

2026-03-14 17:20:00 +00:00

INCIDENT: Drop HNSW index before vector dimension ALTER (#159)

Reference

forgejo_admin/pal-e-api!159
