6d: Add semantic search API endpoint (GET /notes/semantic-search) #137
Labels
No labels
domain:backend
domain:devops
domain:frontend
status:approved
status:in-progress
status:needs-fix
status:qa
type:bug
type:devops
type:feature
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
forgejo_admin/pal-e-api#137
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Lineage
plan-2026-02-26-tf-modularize-postgres→ Phase 6 (Vector Search) → Phase 6d (Semantic Search)Repo
forgejo_admin/pal-e-docsUser Story
As an AI agent
I want to search platform knowledge by semantic meaning
So that I can find relevant notes and blocks even when the exact keywords don't match
Context
Phase 6a deployed Ollama (GPU, qwen3-embedding:4b). Phase 6b added pgvector to the blocks table. Phase 6c built the async embedding worker and backfilled 5K+ blocks. The embeddings exist — this issue adds the query endpoint.
The endpoint mirrors the existing
GET /notes/search(tsvector full-text search atroutes/notes.py:169-243) but operates at block granularity using pgvector cosine similarity.Query flow: Receive query text → embed via Ollama (
POST /api/embed) → cosine similarity search onblocks.embedding(HNSW index) → return ranked blocks with note context.Key implementation details from the embedding worker:
PALDOCS_OLLAMA_URLenv var (defaulthttp://ollama.ollama.svc.cluster.local:11434)PALDOCS_EMBEDDING_MODELenv var (defaultqwen3-embedding:4b)"Represent this platform knowledge base section for retrieval: "— the query may need a different prefix (e.g."Represent this query for retrieving relevant documents: "). Check qwen3-embedding docs and decide.<=>(0 = identical, 2 = opposite; similarity = 1 - distance)embedding_status = 'completed'and non-nullembeddingshould be searchedFile Targets
Files to modify:
src/pal_e_docs/routes/notes.py— addGET /notes/semantic-searchroute (followsearch_notespattern at line 169)src/pal_e_docs/schemas.py— addSemanticSearchResultresponse modelFiles to create:
tests/test_semantic_search.py— endpoint tests (or add to existing test file if appropriate)Files to reference (read, don't modify):
src/pal_e_docs/embedding_worker.py— Ollama client pattern (get_embeddingat line 177), instruction prefix, block text extractionsrc/pal_e_docs/models.py— Block model withembedding(Vector(768)) andembedding_statuscolumns (line 200)alembic/versions/l2g3h4i5j6k7_add_vector_embeddings.py— schema referenceFiles NOT to touch:
src/pal_e_docs/embedding_worker.py— owned by 6calembic/versions/— no schema changes neededAcceptance Criteria
GET /notes/semantic-search?q=deployment+strategy&limit=5returns up to 5 blocks ranked by cosine similarityslug,title,note_type,status,project,anchor_id,block_type,content_snippet,similaritycontent_snippetis extracted plain text from the block's JSON content (reuse or adapt the extraction logic fromembedding_worker.py:79-169)note_type,project,status,tags(same filter set assearch_notes)search_notespattern)[]Test Expectations
pytest tests/ -k test_semanticConstraints
search_notesinroutes/notes.py— same parameter style, same raw SQL approach (not ORM), same filter logicembedding_worker.py:get_embedding()— extract it into a shared utility if clean, or inline if simplerix_blocks_embedding) handles the cosine search efficiently — do NOT add application-level similarity thresholds (let the caller decide based on returned scores)extract_block_text()from the embedding worker if it can be imported cleanly, otherwise duplicate the minimal logic neededlimitparameter: default 10, max 100 (matchsearch_notes)Response Schema
SQL Pattern
Checklist
Closes #NRelated
phase-postgres-6d-semantic-search— phase notedecision-phase6-vector-search-architecture— architectural decisionsPR #138 Review
BLOCKERS
None. The implementation is solid and ready to merge.
NITS
extract_block_textcalled withoutnote_title(route line 368)The SQL SELECT already fetches
n.title, andextract_block_textaccepts an optionalnote_titleparameter (used by the embedding worker to provide heading context like"My Note: Section Heading"). The route callsextract_block_text(row["block_type"], content)without passing it. For heading blocks, thecontent_snippetwill just be the raw heading text (e.g.,"Scope") instead of the richer"My Note: Scope". This is cosmetic -- the note title is already in thetitleresponse field -- but passing it would make snippets more self-contained:test_semantic_search_filters_applied_in_sqldoes not actually verify filter passthroughThis test sends all filter params, then asserts only
mock_post.called. Since the Ollama mock raisesConnectError, the function returns 503 before the SQL is ever built or executed. The test verifies that filters don't cause a crash before the Ollama call (which is useful), but its docstring claims it verifies filters are "passed to SQL", which is misleading. Consider renaming totest_semantic_search_filters_dont_break_requestor adding a comment clarifying the actual assertion scope._EMBEDDING_MODELduplicatesOLLAMA_MODELfromembedding_worker.pyBoth read from
os.environ.get("PALDOCS_EMBEDDING_MODEL", "qwen3-embedding:4b"). If someone changes the env var only in one place, they'll silently diverge. Consider importing the constant fromembedding_workeror extracting it toconfig.py. Low risk since the env var is the same, but worth noting for future maintainability.Minor:
b.idselected but unusedThe SQL SELECT includes
b.idbut it's never used in theSemanticSearchResultconstruction. Not harmful (SQLAlchemy mappings just ignore it), but removing it would tighten the query.Instruction prefix asymmetry is correct but not documented
The embedding worker uses
"Represent this platform knowledge base section for retrieval: "for documents, while the route uses"Represent this query for retrieving relevant documents: "for queries. This is the correct asymmetric prefix pattern for qwen3-embedding (document vs. query intent). A brief comment in the route explaining why the prefix differs from the worker would help future readers.SOP COMPLIANCE
137-semantic-search-endpointmatches issue #137plan-2026-02-26-tf-modularize-postgresreferencedCloses #137present in PR body.envfiles, credentials, or sensitive dataSECURITY NOTES
note_type,status,project,tags) use SQLAlchemytext()bind parameters (:note_type,:status, etc.). The query vector is passed as:query_vecwith a::vectorcast. Thewhere_sqlf-string interpolation only inserts pre-built clause strings, never raw user input.ConnectError,TimeoutException,HTTPStatusError, and empty embeddings are all caught and mapped to 503 with descriptive messages. No stack traces leak to the client.VERDICT: APPROVED
Clean implementation that mirrors the existing
search_notespattern faithfully. SQL injection surface is properly parameterized. Ollama error handling is thorough. Route ordering is correct (before/{slug}wildcard). The nits above are all non-blocking improvements.PR #138 Review (Re-review after nit fixes)
BLOCKERS
None.
NIT FIX VERIFICATION
All 5 nits from the first review are properly resolved in commit
7cbfe7b:extract_block_text receives note_title -- FIXED.
routes/notes.pyline 370 passesnote_title=row["title"]. The function signature inembedding_worker.pyline 79 already had thenote_title: str | None = Noneparameter. Headings now get proper context in search snippets.Misleading test docstring reworded -- FIXED. Module docstring in
test_semantic_search.pynow accurately describes "SQLite (in-memory)" and "mocked via httpx mocking."_EMBEDDING_MODEL moved to settings.embedding_model -- FIXED. Single source of truth in
config.pyline 13 (embedding_model: str = "qwen3-embedding:4b"). Bothroutes/notes.py(line 27) andembedding_worker.py(line 46) read fromsettings.embedding_model. The env varPALDOCS_EMBEDDING_MODELworks correctly due to thePALDOCS_prefix in pydantic-settings. No regression -- same default value, same env var name as the oldos.environ.get()pattern.Unused b.id removed from SELECT -- FIXED. SQL query at lines 352-353 selects only
b.anchor_id, b.block_type, b.content-- no extraneous columns.Asymmetric prefix comment added -- FIXED. Lines 28-31 of
routes/notes.pyexplain why query and document prefixes differ and where the document prefix lives (embedding_worker.INSTRUCTION_PREFIX).NITS
None. The formatting-only changes in
embedding_worker.py(dict indentation,2 ** attemptto2**attempt, string literal consolidation) are consistent with ruff auto-format from touching the file. Acceptable.SOP COMPLIANCE
137-semantic-search-endpoint)plan-2026-02-26-tf-modularize-postgres)Closes #137present in PR bodytest_semantic_search.py)VERDICT: APPROVED