Add full-text search: tsvector column, GIN index, trigger, search endpoint #84

Merged
forgejo_admin merged 2 commits from 83-add-fulltext-search-tsvector into main 2026-03-06 20:48:40 +00:00

Summary

Adds Postgres full-text search to pal-e-docs. A tsvector column with weighted ranking (title=A, html_content=B, slug=C) is auto-maintained by a Postgres trigger. A new GET /notes/search endpoint returns ranked results with headline snippets, replacing the current pattern of enumerating all notes via multiple API calls.

Changes

  • alembic/versions/i9d0e1f2g3h4_add_search_vector.py — New migration: adds search_vector tsvector column, GIN index, trigger function (notes_search_vector_update), trigger on INSERT/UPDATE, and backfills all existing rows
  • src/pal_e_docs/models.py — Added comment documenting the server-managed search_vector column (not mapped in ORM to keep SQLite compatibility for tests)
  • src/pal_e_docs/schemas.py — Added NoteSearchResult schema (slug, title, note_type, status, project, headline, rank)
  • src/pal_e_docs/routes/notes.py — Added GET /notes/search endpoint with query param q, optional filters (note_type, project, status, tags), and limit parameter. Returns 501 on SQLite.
  • tests/test_search.py — Tests for 501 on SQLite, missing query param (422), empty query (422)
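
As a rough illustration of the migration described above, the statements below show one way the column, GIN index, trigger and backfill could be expressed. This is a sketch, not the contents of the actual migration file: the table name `notes`, the `'english'` text search configuration, and the index and trigger names are assumptions. Only the function name `notes_search_vector_update` and the weights (title=A, html_content=B, slug=C) come from the PR.

```python
"""Sketch of SQL an Alembic migration could run to add full-text search.

Assumptions (not confirmed by the PR): table `notes`, config 'english',
index name `ix_notes_search_vector`, trigger name `notes_search_vector_trigger`.
"""


def weighted_vector(prefix: str = "") -> str:
    """Build the weighted tsvector expression: title=A, html_content=B, slug=C."""
    cols = (("title", "A"), ("html_content", "B"), ("slug", "C"))
    return " || ".join(
        f"setweight(to_tsvector('english', coalesce({prefix}{col}, '')), '{w}')"
        for col, w in cols
    )


UPGRADE_STATEMENTS = [
    # 1. Server-managed column holding the precomputed document vector.
    "ALTER TABLE notes ADD COLUMN search_vector tsvector",
    # 2. GIN index so `search_vector @@ query` does not scan every row.
    "CREATE INDEX ix_notes_search_vector ON notes USING GIN (search_vector)",
    # 3. Trigger function that recomputes the vector from the new row values.
    f"""
    CREATE FUNCTION notes_search_vector_update() RETURNS trigger AS $$
    BEGIN
        NEW.search_vector := {weighted_vector('NEW.')};
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql
    """,
    # 4. Fire the function before every INSERT or UPDATE on the table.
    """
    CREATE TRIGGER notes_search_vector_trigger
    BEFORE INSERT OR UPDATE ON notes
    FOR EACH ROW EXECUTE FUNCTION notes_search_vector_update()
    """,
    # 5. Backfill rows that existed before the trigger was installed.
    f"UPDATE notes SET search_vector = {weighted_vector()}",
]

# In a real migration each statement would be passed to Alembic's op.execute():
#     def upgrade():
#         for stmt in UPGRADE_STATEMENTS:
#             op.execute(stmt)
```

Keeping the weighted expression in one helper means the trigger and the backfill cannot drift apart, which matters because ranking depends on both producing the same weights.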

Test Plan

  • [x] Tests pass locally (302/302 including 3 new search tests)
  • [x] ruff check and ruff format pass clean
  • [ ] On Postgres: GET /notes/search?q=secrets+management returns ranked results with headline snippets
  • [ ] Creating/updating a note auto-updates search_vector via trigger
  • [ ] Filters (note_type, project, status, tags) narrow results correctly
  • [ ] Results do NOT include html_content; only summary fields and headline snippets are returned
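
The Postgres checks above (ranked results, headline snippets, filters, no html_content in the response) could be served by a query along the following lines. This is a hedged sketch rather than the endpoint's real code: the helper name `build_search_query`, the table name `notes`, the `'english'` configuration, the default limit of 20, and the `:name` bind-parameter style are all assumptions. The project and tags filters are left out because the PR does not say how those are stored.

```python
from typing import Any, Dict, Optional, Tuple


def build_search_query(
    q: str,
    note_type: Optional[str] = None,
    status: Optional[str] = None,
    limit: int = 20,  # assumed default; the PR does not state one
) -> Tuple[str, Dict[str, Any]]:
    """Return (sql, params) for a ranked full-text search (hypothetical helper)."""
    where = ["n.search_vector @@ query"]
    params: Dict[str, Any] = {"q": q, "limit": limit}

    # Optional filters are added only when supplied, always as bind parameters.
    if note_type is not None:
        where.append("n.note_type = :note_type")
        params["note_type"] = note_type
    if status is not None:
        where.append("n.status = :status")
        params["status"] = status

    where_sql = " AND ".join(where)
    # html_content is used only to build the snippet; it is never selected,
    # so the response stays small.
    sql = f"""
        SELECT n.slug, n.title, n.note_type, n.status,
               ts_headline('english', n.html_content, query) AS headline,
               ts_rank(n.search_vector, query) AS rank
        FROM notes AS n, to_tsquery('english', :q) AS query
        WHERE {where_sql}
        ORDER BY rank DESC
        LIMIT :limit
    """
    return sql, params
```

For example, `build_search_query("secrets & management", note_type="doc")` yields a statement with one extra filter clause and a params dict containing `q`, `limit` and `note_type`.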

Review Checklist

  • [x] Passed automated review-fix loop
  • [x] No secrets committed
  • [x] No unnecessary file changes
  • [x] Commit messages are descriptive

Related Notes

  • phase-postgres-5-fulltext-search: the phase this PR implements
  • plan-2026-02-26-tf-modularize-postgres: Phase 5 (Full-Text Search)
  • Forgejo issue: #83
Add full-text search: tsvector column, GIN index, trigger, search endpoint
Some checks failed
ci/woodpecker/pr/woodpecker Pipeline failed
b124d2666e
- Alembic migration adds search_vector (tsvector) column, GIN index,
  trigger function for auto-updating on INSERT/UPDATE, and backfills
  all existing rows
- GET /notes/search endpoint with ranked results, headline snippets,
  and optional filters (note_type, project, status, tags, limit)
- Returns 501 on SQLite (tsvector is Postgres-only)
- NoteSearchResult schema returns slug, title, note_type, status,
  project, headline, and rank -- no html_content for token efficiency

Closes #83

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sanitize tsquery input to strip special characters
Some checks failed
ci/woodpecker/pr/woodpecker Pipeline failed
174f8ba860
Prevents Postgres tsquery parse errors when user input contains
special characters like !, |, &, or parentheses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
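
A minimal sketch of the kind of sanitization this commit describes, assuming the cleaned terms are then AND-ed together for `to_tsquery`. The function name `to_tsquery_text` and the exact character set stripped are illustrative guesses, not the code from the commit.

```python
import re

# Anything that is not a letter or digit (including !, |, &, parentheses,
# colons, quotes and underscores) is treated as a separator.
_SEPARATORS = re.compile(r"[\W_]+")


def to_tsquery_text(raw: str) -> str:
    """Turn free-form user input into a string that is safe for to_tsquery.

    Hypothetical helper: strips tsquery operators and punctuation, then joins
    the remaining terms with '&' so that every term must match.
    """
    terms = _SEPARATORS.sub(" ", raw).split()
    return " & ".join(terms)


print(to_tsquery_text("secrets & (management)!"))  # secrets & management
print(to_tsquery_text("a|b"))                      # a & b
```

Input that consists only of special characters reduces to an empty string, so a caller would likely want to treat that case as "no results" instead of sending it to Postgres.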
forgejo_admin deleted branch 83-add-fulltext-search-tsvector 2026-03-06 20:48:40 +00:00