Fix semantic search SQL: SQLAlchemy text() fails to bind :query_vec with ::vector cast #160

Closed
opened 2026-03-14 17:40:01 +00:00 by forgejo_admin · 0 comments

Lineage

plan-pal-e-docs → Phase 5a (Activate Semantic Search)

Repo

forgejo_admin/pal-e-docs

User Story

As a pal-e-docs user
I want semantic and hybrid search to return results
So that agents and humans can find contextually relevant notes across the knowledge base

Context

Semantic search endpoints return 500 Internal Server Error. The embedding worker is running and has backfilled 186+ blocks, but the search SQL fails at query time.

SQLAlchemy's text() parameter parser fails to recognize :query_vec when followed by PostgreSQL's ::vector typecast operator. The parameter never gets bound — the error shows parameters: {'limit': 10} with query_vec missing entirely. SQLAlchemy converts :limit to psycopg2 format %(limit)s but leaves :query_vec::vector unconverted because :: confuses the parser.

Fix: replace :query_vec::vector with CAST(:query_vec AS vector) — standard SQL cast syntax that SQLAlchemy handles cleanly.

File Targets

Files the agent should modify:

  • src/pal_e_docs/routes/notes.py — replace 4 occurrences of :query_vec::vector with CAST(:query_vec AS vector)

Files the agent should NOT touch:

  • src/pal_e_docs/embedding_worker.py — uses raw psycopg2 (not SQLAlchemy text()), no issue there
  • src/pal_e_docs/services/search.py — pure Python RRF logic, no SQL

Acceptance Criteria

  • GET /notes/search?q=postgres&mode=semantic returns results (not 500)
  • GET /notes/search?q=postgres&mode=hybrid returns results
  • GET /notes/semantic-search?q=postgres returns results
  • Existing keyword search (GET /notes/search?q=postgres) still works
  • All existing tests pass

Test Expectations

  • Existing search tests continue to pass
  • If semantic search tests exist, verify they pass with the fix
  • Run command: pytest tests/ -v

Constraints

  • Only change the SQL cast syntax — do not refactor the query structure
  • The fix is exactly: :query_vec::vectorCAST(:query_vec AS vector) in 4 places
  • Keep the same query logic (DISTINCT ON, ORDER BY, etc.)

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes
  • plan-pal-e-docs — Phase 5a: Activate Semantic Search
  • todo-fix-mcp-pypi-publish — related Phase 7 TODO
### Lineage `plan-pal-e-docs` → Phase 5a (Activate Semantic Search) ### Repo `forgejo_admin/pal-e-docs` ### User Story As a pal-e-docs user I want semantic and hybrid search to return results So that agents and humans can find contextually relevant notes across the knowledge base ### Context Semantic search endpoints return 500 Internal Server Error. The embedding worker is running and has backfilled 186+ blocks, but the search SQL fails at query time. SQLAlchemy's `text()` parameter parser fails to recognize `:query_vec` when followed by PostgreSQL's `::vector` typecast operator. The parameter never gets bound — the error shows `parameters: {'limit': 10}` with `query_vec` missing entirely. SQLAlchemy converts `:limit` to psycopg2 format `%(limit)s` but leaves `:query_vec::vector` unconverted because `::` confuses the parser. Fix: replace `:query_vec::vector` with `CAST(:query_vec AS vector)` — standard SQL cast syntax that SQLAlchemy handles cleanly. ### File Targets Files the agent should modify: - `src/pal_e_docs/routes/notes.py` — replace 4 occurrences of `:query_vec::vector` with `CAST(:query_vec AS vector)` Files the agent should NOT touch: - `src/pal_e_docs/embedding_worker.py` — uses raw psycopg2 (not SQLAlchemy text()), no issue there - `src/pal_e_docs/services/search.py` — pure Python RRF logic, no SQL ### Acceptance Criteria - [ ] `GET /notes/search?q=postgres&mode=semantic` returns results (not 500) - [ ] `GET /notes/search?q=postgres&mode=hybrid` returns results - [ ] `GET /notes/semantic-search?q=postgres` returns results - [ ] Existing keyword search (`GET /notes/search?q=postgres`) still works - [ ] All existing tests pass ### Test Expectations - [ ] Existing search tests continue to pass - [ ] If semantic search tests exist, verify they pass with the fix - Run command: `pytest tests/ -v` ### Constraints - Only change the SQL cast syntax — do not refactor the query structure - The fix is exactly: `:query_vec::vector` → `CAST(:query_vec AS vector)` in 4 places - Keep the same query logic (DISTINCT ON, ORDER BY, etc.) ### Checklist - [ ] PR opened - [ ] Tests pass - [ ] No unrelated changes ### Related - `plan-pal-e-docs` — Phase 5a: Activate Semantic Search - `todo-fix-mcp-pypi-publish` — related Phase 7 TODO
forgejo_admin 2026-03-14 17:45:22 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
forgejo_admin/pal-e-api#160
No description provided.