6c-2: QA nits — dead code, N+1 query, dep hygiene, k8s hardening #135
Labels
No labels
domain:backend
domain:devops
domain:frontend
status:approved
status:in-progress
status:needs-fix
status:qa
type:bug
type:devops
type:feature
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
forgejo_admin/pal-e-api#135
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Lineage
plan-2026-02-26-tf-modularize-postgres→ Phase 6 (Vector Search) → Phase 6c-2Repo
forgejo_admin/pal-e-docsUser Story
As a platform operator
I want the embedding worker cleaned up to production quality
So that backfill performance is acceptable and the deployment is properly hardened
Context
QA approved PR #130 (6c: async embedding pipeline) with 7 nits. 2 are no-ops. 5 are actionable. This issue addresses those 5. Depends on PR #130 being merged first.
File Targets
Files to modify:
src/pal_e_docs/embedding_worker.py— nits 1, 2, 3pyproject.toml— nit 4k8s/embedding-worker.yaml— nit 6Acceptance Criteria
EMBEDDABLE_TYPESconstant is either used in block filtering logic or removed. Prefer using it.set_isolation_levelcall removed from reconnection path (~line 523)._connect()already handles this.httpxremoved from[project.optional-dependencies] devsection. It's already in main deps.terminationGracePeriodSeconds: 60added to k8s Deployment spec. Gives worker time to finish batch and resetprocessing→pendingon shutdown.Test Expectations
pytest tests/ -k test_embeddingpytest tests/ -k test_embeddingConstraints
terminationGracePeriodSecondsreasonable (60s) — the worker processes batches of 10, each embedding takes <1sChecklist
Closes #NRelated
phase-postgres-6c2-qa-nits— phase notePR #136 Review
PR: "Fix 5 QA nits: dead code, N+1 query, dep hygiene, k8s hardening"
Branch:
135-qa-nits-dead-code-n1-dep-k8s->mainFiles changed: 4 (+52, -27)
BLOCKERS
None.
NITS
Undeclared formatting changes -- The diff includes three changes not listed in the PR summary: collapsing
INSTRUCTION_PREFIXfrom a parenthesized multi-line string to a single line (lines 47-49), reformatting2 ** attemptto2**attempt, and joining a multi-line logger string. All are semantically identical and likely fromruff format, but the PR body should mention "ruff format pass" as a change for traceability. Non-blocking.Comment on line 51 is now slightly misleading -- The comment says "mermaid is skipped by the Postgres trigger" but the code now uses an allowlist (
IN %s) rather than a denylist. The trigger may still skip mermaid, but the worker-side filtering is no longer about skipping mermaid -- it is about including only embeddable types. Consider rewording to something like "only these types are fetched for embedding; mermaid is also excluded by the Postgres trigger." Non-blocking.CODE REVIEW
Nit 1: EMBEDDABLE_TYPES constant and block filtering -- CORRECT.
{"paragraph", "list", "heading", "table", "code"}(line 50)._fetch_pending_blocksconverts it to a sorted tuple (embeddable = tuple(sorted(EMBEDDABLE_TYPES))) and passes it viaIN %swith psycopg2 tuple parameter binding. This is correct psycopg2 usage -- psycopg2 adapts Python tuples to SQLIN (...)clauses.!= 'mermaid') to allowlist (IN %s) is strictly safer: if new non-embeddable block types are added in the future, they are excluded by default rather than silently included.Nit 2: Redundant
set_isolation_levelremoval -- CORRECT._connect()(lines 220-224) already setsISOLATION_LEVEL_AUTOCOMMIT. The removed call at the reconnection path (old line 523) was a duplicate. No behavioral change.Nit 3: N+1 note title query replaced with correlated subquery -- CORRECT.
SELECT n.title FROM notes n JOIN blocks b ON b.note_id = n.id WHERE b.id = %s. This is N+1.RETURNING id, block_type, content, (SELECT n.title FROM notes n WHERE n.id = blocks.note_id). In PostgreSQL, theblocksreference in the RETURNING clause refers to the row being updated. This correlated subquery is resolved per-row within the single UPDATE statement, so the database planner can optimize it (e.g., index scan onnotes.idPK). The N+1 round-trips are eliminated.Nit 4: httpx removed from dev deps -- CORRECT.
httpx>=0.27is present in maindependencies(pyproject.toml line 25). Having it also in[project.optional-dependencies] devwas redundant. Removal is clean.Nit 5:
terminationGracePeriodSeconds: 60-- CORRECT.spec.template.spec.terminationGracePeriodSeconds(line 20 of the k8s manifest). This is the correct location in a Deployment pod spec. The default is 30s; 60s gives the embedding worker more time to finish in-flight Ollama calls and reset processing blocks on SIGTERM. Aligns with the graceful shutdown logic inrun_worker().Tests -- 3 new tests in
TestEmbeddableTypes:test_embeddable_types_matches_extract_logic: verifies every type in the set produces non-empty text fromextract_block_text. Good coverage link between the constant and the extraction logic.test_mermaid_not_in_embeddable_types: explicit negative case.test_embeddable_types_is_expected_set: snapshot test guards against accidental drift.SOP COMPLIANCE
135-qa-nits-dead-code-n1-dep-k8s)plan-2026-02-26-tf-modularize-postgres)Closes #135present in Related sectionVERDICT: APPROVED