6c: Async embedding pipeline + backfill #129
Reference: forgejo_admin/pal-e-api#129
Lineage
plan-2026-02-26-tf-modularize-postgres → Phase 6 (Vector Search) → Phase 6c

Repo
forgejo_admin/pal-e-docs

User Story
As an AI agent on the pal-e platform
I want block content automatically embedded as vectors when notes are created or updated
So that semantic search can find relevant knowledge without brute-force enumeration
Context
Phase 6b deployed the pgvector schema: `embedding vector(768)` and `embedding_status varchar(20)` columns on the `blocks` table, plus a Postgres trigger that fires `NOTIFY embedding_queue` on block INSERT/UPDATE and sets `embedding_status = 'pending'`. Ollama is live as a platform service (Phase 6a) at `http://ollama.ollama.svc.cluster.local:11434` with `qwen3-embedding:4b` loaded in VRAM.

The trigger is firing, but nothing is listening. This phase builds the worker that consumes those notifications, calls Ollama for embeddings, and stores the vectors. It also backfills all ~5K existing blocks.
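The loop described above can be sketched as follows. This is illustrative only: the `blocks.content` column name, the trigger payload (assumed to be the block id), error handling, and the exact Ollama response shape are assumptions, not confirmed decisions.

```python
"""Sketch of the worker core: LISTEN, claim, embed, store (names illustrative)."""
import json
import select

OLLAMA_URL = "http://ollama.ollama.svc.cluster.local:11434"
MODEL = "qwen3-embedding:4b"
PROMPT = "Represent this platform knowledge base section for retrieval: {block_text}"


def embed_request(block_text: str) -> dict:
    """Build the JSON body for Ollama's /api/embeddings endpoint."""
    return {"model": MODEL, "prompt": PROMPT.format(block_text=block_text)}


def claim_block(conn, block_id):
    """Atomically claim a pending block; None means another run already took it."""
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE blocks SET embedding_status = 'processing' "
            "WHERE id = %s AND embedding_status = 'pending' RETURNING content",
            (block_id,),
        )
        row = cur.fetchone()
    return row[0] if row else None


def store_embedding(conn, block_id, vector):
    # pgvector accepts the '[0.1, 0.2, ...]' text form, which json.dumps produces.
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE blocks SET embedding = %s, embedding_status = 'completed' "
            "WHERE id = %s",
            (json.dumps(vector), block_id),
        )


def listen_loop(dsn: str) -> None:
    import httpx      # moving httpx to main deps is part of this phase
    import psycopg2   # raw connection: SQLAlchemy doesn't expose LISTEN

    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("LISTEN embedding_queue;")
    client = httpx.Client(timeout=60.0)
    while True:
        # Block until Postgres signals; the 5 s timeout doubles as the
        # poll-fallback slot where a sweep of 'pending' rows would run.
        if select.select([conn], [], [], 5.0) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            text = claim_block(conn, note.payload)
            if text is None:
                continue  # already claimed (duplicate NOTIFY or restart)
            resp = client.post(f"{OLLAMA_URL}/api/embeddings",
                               json=embed_request(text))
            resp.raise_for_status()
            store_embedding(conn, note.payload, resp.json()["embedding"])
```

The conditional `UPDATE ... WHERE embedding_status = 'pending'` is what makes the claim atomic: two workers (or a NOTIFY replay) racing on the same block id can't both win the row.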
Decisions already made (from decision-phase6-vector-search-architecture):
- Embedding prompt template: "Represent this platform knowledge base section for retrieval: {block_text}"
- … skipped by trigger)

File Targets
Files to create:
- `src/pal_e_docs/embedding_worker.py` — main worker process (LISTEN loop, poll fallback, batch processor, health endpoint, metrics, backfill mode)
- `k8s/embedding-worker.yaml` — k8s Deployment manifest

Files to modify:
- `src/pal_e_docs/config.py` — add `ollama_url` setting (`PALDOCS_OLLAMA_URL`)
- `k8s/kustomization.yaml` — add `embedding-worker.yaml` resource
- `pyproject.toml` — move `httpx` from dev to main deps

Files NOT to touch:
- `alembic/versions/` — no new migrations (6b already created the schema + trigger)
- `src/pal_e_docs/routes/` — no API changes (that's 6d)
- `src/pal_e_docs/models.py` — model already has `embedding` and `embedding_status` columns

Acceptance Criteria
- Listens via `LISTEN embedding_queue` and processes notified blocks to `embedding_status = 'completed'`
- Poll fallback sweeps blocks with `embedding_status = 'pending'` (catches missed notifications)
- Status lifecycle: `pending → processing → completed | error`; `processing` prevents duplicate work on restart
- On startup, resets stale `processing` blocks to `pending`
- Metrics: `embedding_total`, `embedding_errors_total`, `embedding_duration_seconds`, `embedding_queue_depth`
- `--backfill` flag: processes all pending blocks in rate-limited batches with progress logging
- Runs as `python -m pal_e_docs.embedding_worker`, no GPU request, minimal resources (10m/64Mi req, 256Mi limit)
- After backfill, every block has `embedding_status = 'completed'` and a non-null `embedding`

Test Expectations
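One way to pin down the status lifecycle in a unit test is a small transition table. This is a sketch only; the table and helper names are assumptions, not the shipped module:

```python
"""Sketch for tests/test_embedding_worker.py (names illustrative)."""

# pending → processing → completed | error, plus the startup recovery
# path that resets stale 'processing' rows back to 'pending'.
VALID_TRANSITIONS = {
    "pending": {"processing"},
    "processing": {"completed", "error", "pending"},
}


def can_transition(src: str, dst: str) -> bool:
    return dst in VALID_TRANSITIONS.get(src, set())


def test_embedding_status_lifecycle():
    assert can_transition("pending", "processing")
    assert can_transition("processing", "completed")
    assert can_transition("processing", "error")
    assert can_transition("processing", "pending")      # startup recovery
    assert not can_transition("completed", "processing")  # completed is terminal
```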
`pytest tests/ -k test_embedding`

Constraints
- Use a raw `psycopg2` connection for LISTEN (SQLAlchemy doesn't expose it)
- Use `httpx` for Ollama HTTP calls (async-capable, already in dev deps)
- … (`http.server` or similar)
- No `runtimeClassName: nvidia` — the worker calls Ollama over HTTP; Ollama owns the GPU

Checklist
Related
- `project-pal-e-docs` — project this affects

forgejo_admin referenced this issue 2026-03-09 13:06:42 +00:00