Platform knowledge service: notes, projects, tags, and revision tracking via a REST API backed by PostgreSQL in production and SQLite for local development

Python 99.8%

argocd-image-updater e5e125d9d3 build: automatic update of pal-e-docs updates image pal-e-docs/api tag '`027ab29e1a`' to '`dd7323c1fe`'		2026-06-13 22:42:21 +00:00
alembic	feat: user attribution on notes — track who creates and edits (#272 )	2026-04-13 15:12:50 +00:00
docs	docs: restructure README, fix CLAUDE.md, create docs/ (#278 ) (#279 )	2026-06-13 20:02:15 +00:00
k8s	build: automatic update of pal-e-docs	2026-06-13 22:42:21 +00:00
scripts	fix: reblock 34 legacy un-decomposed notes (#259 )	2026-04-12 16:55:34 +00:00
src/pal_e_docs	fix: exclude html_content from list_boards response (#276 ) (#277 )	2026-06-13 16:53:56 +00:00
tests	feat: user attribution on notes — track who creates and edits (#272 )	2026-04-13 15:12:50 +00:00
.claude-no-enforce	docs: restructure README, fix CLAUDE.md, create docs/ (#278 ) (#279 )	2026-06-13 20:02:15 +00:00
.gitignore	Remove .argocd-source-* from .gitignore (#163 )	2026-03-14 18:19:05 +00:00
.pre-commit-config.yaml	feat: responsive nav, mobile breakpoints, CSS cleanup, pre-commit config (#48 )	2026-02-27 16:24:40 +00:00
.woodpecker.yaml	fix: add retry loop to CI wget for update-kustomize-tag (#242 )	2026-03-28 20:02:24 +00:00
alembic.ini	Scaffold pal-e-docs FastAPI service (#4 )	2026-02-24 14:03:24 +00:00
CLAUDE.md	docs: restructure README, fix CLAUDE.md, create docs/ (#278 ) (#279 )	2026-06-13 20:02:15 +00:00
Dockerfile	Scaffold pal-e-docs FastAPI service (#4 )	2026-02-24 14:03:24 +00:00
pyproject.toml	feat: dual auth — accept Keycloak JWT alongside API key (#269 )	2026-04-13 14:41:01 +00:00
README.md	docs: rewrite README as project landing page (#281 )	2026-06-13 21:41:40 +00:00

pal-e-docs

A knowledge platform where your docs, your kanban board, and your search index are the same database.

Why This Exists

Most teams run three systems that don't talk to each other: a knowledge base (Notion, Confluence), a work tracker (Jira, Linear), and whatever search they can bolt on. When you ask "what's blocking the auth migration?" you have to check the board for status, the docs for context, and hope your search covers both.

pal-e-docs collapses all three into a single Postgres store. Notes, kanban boards, and vector embeddings share one schema. A query can filter by project and status (structured), then rank by semantic similarity (unstructured) — in one call. No glue code, no sync jobs between tools.

The primary interface isn't a web UI — it's a Model Context Protocol server with 45+ tools. AI agents get native read/write access to everything: docs, boards, search, projects, links, and revisions. The system was built to be an AI agent's working memory, not retrofitted for it.

How It Works

Block-Level Content

Notes are composed of ordered blocks — headings, paragraphs, code, tables, diagrams. Each block is independently addressable by anchor ID. You can read or update a single section without loading the full document, and each block gets its own vector embedding.

This matters because semantic search returns the specific section that answers your question, not a 5,000-word document you have to scan.
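As a rough illustration of block-level addressing, the sketch below models a note as an ordered list of blocks and reads or updates one block by its anchor. The class names, field names (`anchor_id`, `block_type`, `position`), and sample content are hypothetical and are not taken from the pal-e-docs schema.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    anchor_id: str   # hypothetical name for the block's anchor
    block_type: str  # e.g. "heading", "paragraph", "code"
    content: str
    position: int    # ordering within the note


@dataclass
class Note:
    slug: str
    blocks: list[Block] = field(default_factory=list)

    def get_block(self, anchor_id: str) -> Block:
        """Return one block by anchor without rendering the whole note."""
        for block in self.blocks:
            if block.anchor_id == anchor_id:
                return block
        raise KeyError(anchor_id)

    def update_block(self, anchor_id: str, content: str) -> Block:
        """Replace a single block's content; other blocks are untouched."""
        block = self.get_block(anchor_id)
        block.content = content
        return block

    def render(self) -> str:
        """Reassemble the full document in block order."""
        ordered = sorted(self.blocks, key=lambda b: b.position)
        return "\n\n".join(b.content for b in ordered)


note = Note(
    slug="auth-migration",
    blocks=[
        Block("overview", "heading", "Overview", 0),
        Block("status", "paragraph", "Blocked on token rotation.", 1),
    ],
)
note.update_block("status", "Unblocked; rollout scheduled.")
```

In the real service each block would also carry its own embedding, so a search hit can point at `status` rather than at the whole note.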

On-Write Embedding Pipeline

When a block is created or its content changes, a PostgreSQL trigger fires NOTIFY to an async embedding worker. The worker calls Ollama, stores the 2560-dimensional vector, and the block becomes searchable within seconds.

No batch jobs. No stale indexes. The search index is as fresh as the last write.

Block INSERT/UPDATE → Postgres trigger → NOTIFY embedding_queue
                                              ↓
                                     Async worker (LISTEN)
                                              ↓
                                     Ollama embed → pgvector store
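
The flow above can be approximated in-process. This is a simulation only, under stated assumptions: an `asyncio.Queue` stands in for the Postgres NOTIFY channel, a plain dict stands in for the pgvector column, and `fake_embed` is a hypothetical placeholder for the Ollama call. None of these names come from the pal-e-docs codebase.

```python
import asyncio


async def fake_embed(text: str) -> list[float]:
    """Hypothetical stand-in for the Ollama call; returns a tiny vector."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


async def embedding_worker(queue: asyncio.Queue, blocks: dict, vectors: dict) -> None:
    """Consume announced block IDs and store one vector per block."""
    while True:
        block_id = await queue.get()
        try:
            # Read the content at processing time, so the latest write wins.
            vectors[block_id] = await fake_embed(blocks[block_id])
        finally:
            queue.task_done()


async def write_block(queue: asyncio.Queue, blocks: dict, block_id: str, content: str) -> None:
    """Simulate INSERT/UPDATE plus the trigger that announces the change."""
    blocks[block_id] = content
    await queue.put(block_id)  # plays the role of NOTIFY embedding_queue


async def demo() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    blocks: dict[str, str] = {}
    vectors: dict[str, list[float]] = {}
    worker = asyncio.create_task(embedding_worker(queue, blocks, vectors))

    await write_block(queue, blocks, "b1", "Keycloak issues the JWT.")
    await write_block(queue, blocks, "b1", "Keycloak issues and rotates the JWT.")
    await queue.join()  # wait until every announced write has been embedded

    worker.cancel()
    return vectors


vectors = asyncio.run(demo())
```

The point of the sketch is the shape of the pipeline: writers only announce that a block changed, and the worker re-reads the current content before embedding it.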

Hybrid Search (RRF)

One endpoint, three modes:

Mode	Engine	Best For
Keyword	PostgreSQL `tsvector` full-text search	Exact terms, slugs, names
Semantic	pgvector cosine similarity on block embeddings	Fuzzy concepts, "things like X"
Hybrid	Reciprocal Rank Fusion combining both	General queries (default)

All three modes support SQL-level metadata filtering — project, status, note type, tags — applied as WHERE clauses before ranking. "Show me completed items in project X about auth" is one query, not three API calls stitched together.
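
To make the filter-then-rank order concrete, here is a minimal pure-Python sketch of semantic mode. It assumes blocks are dicts with hypothetical `project`, `status`, and `embedding` keys and uses two-dimensional vectors for readability. In the service itself, filtering happens in SQL WHERE clauses and similarity in pgvector, not in Python.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_semantic_search(blocks, query_vec, *, project=None, status=None, limit=5):
    """Apply metadata filters first, then rank the survivors by similarity."""
    candidates = [
        b for b in blocks
        if (project is None or b["project"] == project)
        and (status is None or b["status"] == status)
    ]
    ranked = sorted(
        candidates,
        key=lambda b: cosine_similarity(b["embedding"], query_vec),
        reverse=True,
    )
    return [b["anchor"] for b in ranked[:limit]]


# Illustrative data only; anchors, projects, and statuses are made up.
blocks = [
    {"anchor": "auth-plan", "project": "x", "status": "done", "embedding": [0.9, 0.1]},
    {"anchor": "auth-retro", "project": "x", "status": "done", "embedding": [0.6, 0.8]},
    {"anchor": "auth-todo", "project": "x", "status": "todo", "embedding": [1.0, 0.0]},
    {"anchor": "billing", "project": "y", "status": "done", "embedding": [1.0, 0.0]},
]
results = filtered_semantic_search(blocks, [1.0, 0.0], project="x", status="done")
```

Here `auth-todo` and `billing` match the query vector exactly but never reach the ranking step, because the metadata filter removes them first.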

The RRF alpha parameter (0.0 = pure keyword, 1.0 = pure semantic) lets callers tune the blend per query.
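
One plausible way to weight Reciprocal Rank Fusion with such an alpha is sketched below. The README does not spell out the exact formula, so treat this as an assumption: each result scores `(1 - alpha) / (k + keyword_rank) + alpha / (k + semantic_rank)`, with the smoothing constant `k = 60` that is commonly used for RRF. The function name `rrf_fuse` is hypothetical.

```python
def rrf_fuse(keyword_ranked, semantic_ranked, alpha=0.5, k=60):
    """Blend two ranked ID lists with weighted Reciprocal Rank Fusion.

    alpha=0.0 reproduces the keyword ranking, alpha=1.0 the semantic one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    scores = {}
    weighted_lists = ((1.0 - alpha, keyword_ranked), (alpha, semantic_ranked))
    for weight, ranked in weighted_lists:
        if weight == 0.0:
            continue  # a zero-weight list contributes no results at all
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]
semantic_hits = ["c", "a", "d"]
blended = rrf_fuse(keyword_hits, semantic_hits, alpha=0.5)
```

With these inputs, `a` ranks first in the blend because it sits near the top of both lists, while `b` and `d` each appear in only one.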

Boards Are Notes

Kanban boards aren't a separate subsystem — a board is a note with note_type="board". Board items reference other notes by slug and flow through eight columns:

backlog → todo → next_up → in_progress → qa → needs_approval → validation → done

When a phase note's status changes, its board item moves automatically. The board is a view over note state, not a parallel data store that can drift.
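
The "view over note state" idea can be sketched as a function that derives columns from note statuses on every read. This assumes, for illustration only, that a note's status value is one of the eight column names. The `board_view` function and the sample slugs are hypothetical.

```python
COLUMNS = [
    "backlog", "todo", "next_up", "in_progress",
    "qa", "needs_approval", "validation", "done",
]


def board_view(notes, board_slugs):
    """Group a board's item slugs by the current status of each referenced note."""
    columns = {name: [] for name in COLUMNS}
    for slug in board_slugs:
        columns[notes[slug]["status"]].append(slug)
    return columns


notes = {
    "phase-1-auth": {"status": "in_progress"},
    "phase-2-rollout": {"status": "backlog"},
}
board_items = ["phase-1-auth", "phase-2-rollout"]

before = board_view(notes, board_items)
notes["phase-1-auth"]["status"] = "qa"  # only the note changes
after = board_view(notes, board_items)  # the card has moved with it
```

Because the board stores references rather than its own copy of the status, there is nothing to reconcile after the note changes.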

Note Graph

Notes link bidirectionally (zettelkasten-style), form parent-child hierarchies, carry typed metadata (14 note types, lifecycle statuses), and track full revision history with user attribution. Tags handle topic/domain classification. Projects scope everything.
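
A minimal sketch of bidirectional linking, assuming links are stored once and indexed in both directions so that backlinks need no scan. The `NoteGraph` class and the slugs are hypothetical and do not reflect the actual tables.

```python
from collections import defaultdict


class NoteGraph:
    """In-memory link index keyed by note slug."""

    def __init__(self):
        self._outgoing = defaultdict(set)
        self._incoming = defaultdict(set)

    def link(self, source: str, target: str) -> None:
        """Record one link and make it visible from both ends."""
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def links_from(self, slug: str) -> list[str]:
        return sorted(self._outgoing.get(slug, ()))

    def backlinks(self, slug: str) -> list[str]:
        return sorted(self._incoming.get(slug, ()))


graph = NoteGraph()
graph.link("auth-migration", "keycloak-setup")
graph.link("incident-review", "keycloak-setup")
```

Calling `graph.backlinks("keycloak-setup")` lists both notes that point at it, which is the zettelkasten-style lookup the paragraph describes.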

MCP-Native

The MCP server exposes the full API surface as discrete tools:

Category	Tools	Examples
Notes	10	`get_note`, `search_notes`, `semantic_search`, `create_note_from_template`
Blocks	8	`get_section`, `update_block`, `get_note_toc`
Boards	10	`list_board_items`, `sync_board`, `bulk_move_board_items`
Projects	5	CRUD + nested resource listing
Links, Repos, Tags	9	Bidirectional links, repo registry, tag queries

An AI agent connected via MCP can search your knowledge base, check the board for blockers, update a doc section, and move a card to done — all without leaving the protocol.

The Stack

Layer	Technology
API	Python 3.12, FastAPI, Uvicorn
Database	PostgreSQL 16 + pgvector
ORM	SQLAlchemy 2.0 + Alembic
Embeddings	Ollama (qwen3-embedding, 2560-dim)
Search	tsvector + pgvector + RRF
Auth	Keycloak OIDC (JWT)
Frontend	SvelteKit (separate repo)
MCP	FastMCP (separate server)
CI/CD	Woodpecker CI → Harbor → ArgoCD
Infra	k3s, Tailscale, CNPG

By the Numbers

Metric	Value
API Endpoints	48
MCP Tools	45+
Tests	709+
SQLAlchemy Models	11 tables
Alembic Migrations	24
Embedding Dimensions	2,560
Kanban Columns	8
Note Types	14
Board Item Types	6

Documentation

Doc	Description
Architecture	FastAPI app structure, data model, auth, request lifecycle
Database	Dual-path engine (SQLite local, Postgres prod), shared CNPG cluster
API Endpoints	All 48 endpoints with request/response shapes
Embedding Pipeline	pgvector setup, Ollama integration, RRF hybrid search
Deployment	Woodpecker CI, Harbor registry, ArgoCD GitOps

Quick Start

pip install -e ".[dev]"

# SQLite (local dev)
PALDOCS_DATABASE_PATH=./local.db alembic upgrade head
PALDOCS_DATABASE_PATH=./local.db python -m pal_e_docs.main

# Tests
pytest

# Lint
ruff check src/ tests/
ruff format src/ tests/

Repository	Role
pal-e-app	SvelteKit frontend consuming this API
pal-e-docs-sdk	Python SDK for programmatic API access
pal-e-mcp	MCP server exposing 45+ tools to AI agents
pal-e-platform	Infrastructure bootstrap (k3s, Tailscale, CI/CD)
pal-e-deployments	Kustomize overlays for ArgoCD GitOps