Add pgvector extension + embedding schema migration #122
Reference
forgejo_admin/pal-e-api!122
Summary
Adds the pgvector extension to Postgres and creates the embedding schema on the blocks table. This enables Phase 6c (embedding worker) to store 768-dimensional vectors for semantic search.
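The schema change summarized above could be sketched as the ordered SQL statements that an Alembic `upgrade()` might pass to `op.execute()`. This is a minimal sketch under stated assumptions, not the PR's actual migration: the index name `ix_blocks_embedding_hnsw` and the helpers `upgrade_statements` and `downgrade_statements` are hypothetical, and leaving the extension installed on downgrade is an assumption. The `vector` type and the `vector_cosine_ops` operator class come from pgvector.

```python
"""Sketch of the DDL for the embedding schema (assumed, not the PR's code)."""

EMBEDDING_DIM = 768  # dimension stated in the PR summary
TABLE = "blocks"  # table named in the PR summary
INDEX_NAME = "ix_blocks_embedding_hnsw"  # hypothetical index name


def upgrade_statements(dim: int = EMBEDDING_DIM) -> list[str]:
    """Return the SQL an Alembic upgrade() could execute, in order."""
    return [
        # pgvector registers itself under the extension name "vector".
        "CREATE EXTENSION IF NOT EXISTS vector",
        f"ALTER TABLE {TABLE} ADD COLUMN embedding vector({dim})",
        f"ALTER TABLE {TABLE} ADD COLUMN embedding_status varchar(20)",
        # vector_cosine_ops lets the HNSW index serve cosine-distance queries.
        f"CREATE INDEX {INDEX_NAME} ON {TABLE} "
        "USING hnsw (embedding vector_cosine_ops)",
    ]


def downgrade_statements() -> list[str]:
    """Reverse the upgrade; the extension is left installed (an assumption)."""
    return [
        f"DROP INDEX IF EXISTS {INDEX_NAME}",
        f"ALTER TABLE {TABLE} DROP COLUMN IF EXISTS embedding_status",
        f"ALTER TABLE {TABLE} DROP COLUMN IF EXISTS embedding",
    ]


if __name__ == "__main__":
    for stmt in upgrade_statements():
        print(stmt + ";")
```

In a real migration each string would be passed to `op.execute(...)` inside `upgrade()` and `downgrade()`; building them as plain strings here keeps the sketch runnable without a database.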
Changes
- `alembic/versions/k1f2g3h4i5j6_add_vector_embeddings.py` -- New Alembic migration that creates the pgvector extension, adds `embedding vector(768)` and `embedding_status varchar(20)` columns, creates an HNSW index for cosine similarity, and adds a trigger function that sets status to pending/skipped and fires NOTIFY on content changes
- `src/pal_e_docs/models.py` -- Adds `embedding` (`Vector(768)`) and `embedding_status` (`String(20)`) columns to the Block SQLAlchemy model
- `pyproject.toml` -- Adds `pgvector>=0.3` dependency
Test Plan
- Tests pass (`pytest tests/ -v`)
- `ruff format` and `ruff check` pass
- `alembic upgrade head` succeeds against running CNPG database
- `alembic downgrade -1` reverses cleanly
- Trigger sets `embedding_status = 'pending'` and fires NOTIFY
Review Checklist
Related
- `plan-2026-02-26-tf-modularize-postgres` -- Phase 6b (pgvector extension + schema migration)
PR #122 Re-Review
Re-review after dev agent addressed two blockers and one nit from the previous review.
Previous Findings Status
1. Blocker: Revision ID collision -- RESOLVED
- Revision ID is now `l2g3h4i5j6k7` (was colliding before).
- Chains from `k1f2g3h4i5j6` via `down_revision`.
- Chain: `j0e1f2g3h4i5` -> `k1f2g3h4i5j6` -> `l2g3h4i5j6k7`. Confirmed correct.
2. Blocker: Trigger column list -- RESOLVED
- Trigger now fires `BEFORE INSERT OR UPDATE OF content, block_type` (was missing `block_type`).
3. Nit: `raw` block type in skip list -- RESOLVED
- Skip list now uses `mermaid`, which is the correct non-textual block type.
New Findings
No new blockers.
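The revision-chain check behind the first resolved blocker can be done mechanically. The following is a minimal sketch in plain Python, assuming each migration is reduced to a `(revision, down_revision)` pair; the helper `linear_chain` is hypothetical, and because the parent of `j0e1f2g3h4i5` is not shown in this review it is represented as `None`.

```python
"""Sketch: verify that migrations form one linear chain with unique IDs."""


def linear_chain(migrations):
    """Order (revision, down_revision) pairs from base to head.

    Raises ValueError on duplicate revision IDs, forks, or stray migrations.
    """
    revisions = [rev for rev, _ in migrations]
    duplicates = {r for r in revisions if revisions.count(r) > 1}
    if duplicates:
        raise ValueError(f"revision ID collision: {sorted(duplicates)}")

    # Map each parent to its single child; two children means a fork.
    children = {}
    for rev, down in migrations:
        if down in children:
            raise ValueError(f"two migrations chain from {down!r}")
        children[down] = rev

    # The base is the only revision whose parent is outside the given set.
    known = set(revisions)
    bases = [rev for rev, down in migrations if down not in known]
    if len(bases) != 1:
        raise ValueError(f"expected one base revision, found {bases}")

    chain = [bases[0]]
    while chain[-1] in children:
        chain.append(children[chain[-1]])
    if len(chain) != len(migrations):
        raise ValueError("some migrations are not reachable from the base")
    return chain


if __name__ == "__main__":
    print(" -> ".join(linear_chain([
        ("j0e1f2g3h4i5", None),  # parent not shown in the review
        ("k1f2g3h4i5j6", "j0e1f2g3h4i5"),
        ("l2g3h4i5j6k7", "k1f2g3h4i5j6"),
    ])))
```

Feeding the function two migrations that share the ID `k1f2g3h4i5j6` raises `ValueError`, which is the kind of collision the earlier review flagged.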
NITS
- Stale PR body filename -- The "Changes" section references `alembic/versions/k1f2g3h4i5j6_add_vector_embeddings.py` but the actual file is `alembic/versions/l2g3h4i5j6k7_add_vector_embeddings.py`. The PR body was not updated after the revision ID fix. Non-blocking but could confuse future readers.
- `mergeable: false` status -- The PR currently shows as not mergeable. This may require a rebase onto main to pick up the `k1f2g3h4i5j6` migration that the new migration chains from. Not a code issue but needs resolution before merge.
Technical Notes (informational, not blocking)
- The backfill `UPDATE` (`SET embedding_status = 'skipped' WHERE block_type = 'mermaid'`) correctly avoids re-triggering the BEFORE trigger since it modifies `embedding_status`, not `content` or `block_type`.
- Creating the HNSW index on the still-empty `embedding` column is safe (creates empty index structure).
- `embedding` and `embedding_status` are not exposed in API schemas -- appropriate for schema-only migration work.
- `pg_notify` with `NEW.id::text` in a BEFORE INSERT trigger is valid because PostgreSQL evaluates SERIAL/sequence defaults before BEFORE triggers fire.
SOP COMPLIANCE
- Branch is `feat/pgvector-embedding-schema`, not `issue-121`. Matches feature convention but does not include issue number.
- `plan-2026-02-26-tf-modularize-postgres` Phase 6b.
VERDICT: APPROVED
All three previous findings are resolved. No new blockers. The two nits (stale PR body, mergeable status) are non-blocking and can be addressed during merge preparation.
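The trigger behavior covered by the second resolved blocker and the technical notes can be sketched as SQL generated from Python. This is a hedged illustration, not the migration's code: the function name `blocks_embedding_status`, the trigger name `blocks_embedding_status_trg`, the channel `embedding_queue`, and the helper `trigger_statements` are all hypothetical, and sending the notification only for rows marked pending is an assumption.

```python
"""Sketch of the embedding-status trigger (assumed names, not the PR's code)."""

SKIPPED_BLOCK_TYPES = ("mermaid",)  # non-textual type named in the review
CHANNEL = "embedding_queue"  # hypothetical NOTIFY channel name


def trigger_statements(skip_types=SKIPPED_BLOCK_TYPES, channel=CHANNEL):
    """Return SQL that creates the trigger function and binds it to blocks."""
    if not skip_types:
        raise ValueError("skip_types must not be empty (IN () is invalid SQL)")
    skip_list = ", ".join(f"'{t}'" for t in skip_types)
    function_sql = f"""
CREATE OR REPLACE FUNCTION blocks_embedding_status() RETURNS trigger AS $$
BEGIN
    IF NEW.block_type IN ({skip_list}) THEN
        NEW.embedding_status := 'skipped';
    ELSE
        NEW.embedding_status := 'pending';
        -- NEW.id is already populated here: column defaults are applied
        -- before BEFORE ROW triggers run.
        PERFORM pg_notify('{channel}', NEW.id::text);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""
    # Listing columns after UPDATE OF means an UPDATE that only sets
    # embedding_status (such as the backfill) does not fire the trigger.
    trigger_sql = """
CREATE TRIGGER blocks_embedding_status_trg
BEFORE INSERT OR UPDATE OF content, block_type ON blocks
FOR EACH ROW EXECUTE FUNCTION blocks_embedding_status()
"""
    return [function_sql.strip(), trigger_sql.strip()]


if __name__ == "__main__":
    for stmt in trigger_statements():
        print(stmt + ";\n")
```

Generating the skip list from a tuple keeps the `raw` versus `mermaid` decision in one place, so a later change to the non-textual block types touches a single constant.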