Activate semantic search pipeline — scale embedding worker, verify hybrid search #154
Reference
forgejo_admin/pal-e-api#154
Lineage
plan-pal-e-docs → Phase: Activate Semantic Search Pipeline (phase-pal-e-docs-activate-semantic-search)
Repo
forgejo_admin/pal-e-docs
User Story
As a platform agent
I want semantic search across all 260+ knowledge base notes
So that I can find relevant SOPs, conventions, and context by meaning rather than exact keywords
Context
The full semantic search stack was built during Act 2 but intentionally left dormant:
- `blocks` table has `embedding` `vector(768)` column
- `embedding_worker.py` (610 lines): async worker with LISTEN/NOTIFY, batch processing, Prometheus metrics, health endpoint, backfill mode
- `services/search.py`: Reciprocal Rank Fusion (RRF) combining tsvector keyword + pgvector semantic
- `/search?mode=keyword|semantic|hybrid` and `/semantic-search` endpoints
- `search_notes()` and `semantic_search()` fully wired
- `k8s/embedding-worker.yaml` with `replicas: 0`
- `qwen3-embedding:4b` (768-dim, 3.5GB VRAM)
File Targets
Files the agent should modify:
- `k8s/embedding-worker.yaml`: change `replicas: 0` to `replicas: 1`
- `k8s/embedding-worker.yaml`: verify image tag is current (check ArgoCD Image Updater annotation or compare to deployed API image)
Files the agent should NOT touch:
- `src/pal_e_docs/embedding_worker.py`: already production-ready
- `src/pal_e_docs/services/search.py`: already complete
- `src/pal_e_docs/routes/notes.py`: search endpoints already wired
Acceptance Criteria
- `k8s/embedding-worker.yaml` has `replicas: 1`
- Embedding worker is healthy (`/healthz` returns 200)
- Blocks are embedded (check `embedding_status` counts in DB)
- `/search?mode=semantic&q=deployment+recovery` returns relevant results via API
- `/search?mode=hybrid&q=how+to+create+a+plan` returns fused results
- `semantic_search()` MCP tool returns ranked results
Test Expectations
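The search checks in this section can also be scripted. The following is a minimal sketch using only the Python standard library. The base URL and the `/notes/search` path are taken from the curl example in this issue and assume an active port-forward; the helper names `build_search_url` and `run_search` are hypothetical and not part of pal-e-docs, and nothing is assumed about the response shape beyond it being JSON.

```python
import json
import urllib.parse
import urllib.request

# Taken from the curl example in this issue; assumes a port-forward to the API.
BASE_URL = "http://localhost:8000"
VALID_MODES = {"keyword", "semantic", "hybrid"}


def build_search_url(query: str, mode: str, limit: int = 3, base_url: str = BASE_URL) -> str:
    """Build a /notes/search URL (hypothetical helper, not part of pal-e-docs)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown search mode: {mode!r}")
    # urlencode escapes spaces as '+', matching the query strings in this issue.
    params = urllib.parse.urlencode({"q": query, "mode": mode, "limit": limit})
    return f"{base_url}/notes/search?{params}"


def run_search(query: str, mode: str, limit: int = 3, timeout: float = 10.0):
    """Send the request and return the decoded JSON body unchanged."""
    url = build_search_url(query, mode, limit)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example calls (require a reachable API, so they are left commented out):
# run_search("deployment recovery", "semantic")
# run_search("how to create a plan", "hybrid")
```

Keeping URL construction separate from the network call makes the query strings easy to verify without a running cluster.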
- `kubectl get pods -n pal-e-docs -l app=pal-e-docs-embedding-worker`
- `kubectl exec -n pal-e-docs deploy/pal-e-docs -- python -c "from sqlalchemy import create_engine, text; import os; e=create_engine(os.environ['PALDOCS_DATABASE_URL']); c=e.connect(); print(c.execute(text(\"SELECT embedding_status, count(*) FROM blocks GROUP BY embedding_status\")).fetchall())"`
- `curl -s 'http://localhost:8000/notes/search?q=deployment&mode=semantic&limit=3'` (via port-forward or in-cluster)
Constraints
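One way to sanity-check the Ollama endpoint and model named in this section is to request a single embedding and confirm its length matches the `vector(768)` column. The sketch below assumes Ollama's `/api/embed` endpoint is used; whether `embedding_worker.py` calls that endpoint is not stated in this issue. The helper names are hypothetical.

```python
import json
import urllib.request

# In-cluster address and model name as given in this issue.
OLLAMA_URL = "http://ollama.ollama.svc.cluster.local:11434"
MODEL = "qwen3-embedding:4b"
EXPECTED_DIM = 768  # must match the vector(768) column on blocks


def build_embed_request(text: str, base_url: str = OLLAMA_URL, model: str = MODEL) -> urllib.request.Request:
    """Prepare a POST to Ollama's /api/embed (hypothetical helper)."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_dimension(vector, expected: int = EXPECTED_DIM) -> None:
    """Raise if the embedding length does not match the column width."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dimensions, got {len(vector)}")


def embed_once(text: str, timeout: float = 60.0):
    """Fetch one embedding and validate its dimension (needs cluster access)."""
    with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
        payload = json.load(resp)
    vector = payload["embeddings"][0]  # /api/embed returns a list of vectors
    check_dimension(vector)
    return vector
```

A dimension mismatch here would surface before any rows are written, which is cheaper to diagnose than a failed insert into the `embedding` column.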
- `pal-e-docs/k8s/`: push to main triggers deploy
- Ollama runs in the `ollama` namespace; worker connects via `http://ollama.ollama.svc.cluster.local:11434`
- Worker command: `python -m pal_e_docs.embedding_worker`
Checklist
Related
- `pal-e-docs`: project this affects
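For readers unfamiliar with the Reciprocal Rank Fusion step mentioned under Context, here is a minimal sketch of the standard formulation, where each document scores the sum of `1 / (k + rank)` over the rankings it appears in. This is not the code in `services/search.py`; the function name, the sample note IDs, and `k = 60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ID lists into one list of (id, score) pairs."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the keyword and semantic searches.
keyword_hits = ["note-a", "note-b", "note-c"]
semantic_hits = ["note-b", "note-d", "note-a"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print([doc_id for doc_id, _ in fused])
```

Notes that appear in both lists (`note-a`, `note-b`) rise above notes found by only one method, which is the behavior the hybrid mode relies on.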