Generate anchor_ids for all block types, not just headings #119
Labels
No labels
domain:backend
domain:devops
domain:frontend
status:approved
status:in-progress
status:needs-fix
status:qa
type:bug
type:devops
type:feature
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
forgejo_admin/pal-e-api#119
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Lineage
todo-block-anchor-ids(no plan ancestry — Epilogue item 9 onphase-postgres-epilogue-cleanup)Repo
forgejo_admin/pal-e-docsUser Story
As an agent using the block API
I want every block to have an anchor_id
So that I can use
update_blockanddelete_blockon paragraphs, lists, tables, and code blocks — not just headingsContext
The HTML parser currently only generates
anchor_idfor heading blocks (by slugifying the heading text). All other block types getanchor_id: null. Since the block API endpoints (/notes/{slug}/blocks/{anchor_id}) useanchor_idas the path parameter, non-heading blocks are completely un-addressable for updates and deletes.This was discovered in a live session when trying to update a single paragraph in a plan note — the only option was
update_note()with the entire HTML content, defeating the purpose of block-level editing.~262 notes have blocks with null anchor_ids. After the parser fix, existing blocks need a backfill migration to populate their anchor_ids.
File Targets
Files the agent should modify or create:
src/pal_e_docs/services/block_parser.py(or wherever the parser lives) — generate anchor_ids for non-heading blockstests/— test anchor_id generation for all block typesFiles the agent should NOT touch:
src/pal_e_docs/routes/— API routes should not need changes (they already use anchor_id)Acceptance Criteria
anchor_id = nullupdate_block(slug, anchor_id)works on paragraph and list blocksdelete_block(slug, anchor_id)works on paragraph and list blocksTest Expectations
pytest tests/ -vConstraints
{block_type}-{position}(e.g.paragraph-3,list-7,table-12). Position-based is simple and deterministic. If the agent finds a better pattern during implementation, document the rationale.anchor_idshould remain unique per note (with a unique constraint on(note_id, anchor_id)if not already present)Checklist
Related
todo-block-anchor-ids— tracking TODO in pal-e-docsphase-postgres-7-block-content— original block implementationconvention-block-first-access— convention that depends on block-level editingPR #120 Review
BLOCKERS
1.
create_blockAPI endpoint does not generate anchor_ids for non-heading blocksThe parser change ensures all blocks parsed from HTML get non-null anchor_ids. However, the
create_blockendpoint insrc/pal_e_docs/routes/blocks.py(lines 222-238) only generates anchor_ids automatically for heading blocks:If a caller creates a paragraph, list, table, code, or mermaid block via POST without providing an
anchor_id, it will be inserted withanchor_id=None. This undermines the goal of making all blocks addressable and creates an inconsistency between parser-created blocks (always have anchors) and API-created blocks (may not).The fix should generate
{block_type}-{position}anchors for non-heading blocks increate_block, using_ensure_unique_anchorto avoid collisions.NITS
1. Anchor-position drift after reindexing (documentation, not a bug)
When
_reindex_positionsruns after a block deletion, or whencreate_blockshifts positions, the position portion of anchor_ids likeparagraph-3will no longer match the block's actual position. This is fine -- anchors are stable identifiers, not position labels -- but the docstring inparse_html_to_blockssays anchors use{block_type}-{position}format, which could mislead someone into thinking they reflect current position. Consider a brief note in the docstring: "position reflects the position at parse time; it does not change when blocks are reordered."2.
_make_blocksignature still acceptsNoneforanchor_id_make_blockhas the signatureanchor_id: str | None. Since the parser now guarantees non-null anchors, the individual_parse_*functions still returnNonefor anchor_id (which is fine since the caller fills it in), but from a type-safety perspective the intermediate state is nullable. This is purely cosmetic -- the code is correct -- but if you ever want to tighten the contract, the fallback fill-in could move into_make_blockitself.3. Migration revision ID is non-standard
The revision ID
k1f2g3h4i5j6follows the project's existing pattern (sequential hex-like IDs), so this is consistent. No action needed, just noting it is a manually chosen ID rather than auto-generated.4.
test_backfill.pyline 316 still hasanchor_id: NoneIn
test_build_toc(line 316), the test fixture still creates a paragraph block dict with"anchor_id": None. This is a test helper and does not affect production, but it is now inconsistent with the parser's output contract. Consider updating to"anchor_id": "paragraph-1"for clarity.SOP COMPLIANCE
119-block-anchor-idsreferences issue #119todo-block-anchor-idsTestAnchorIdGeneration, plus 6 updated assertions in existing testsVERDICT: NOT APPROVED
One blocker: the
create_blockAPI endpoint needs to generate position-based anchor_ids for non-heading blocks, otherwise API-created blocks will have null anchors while parser-created blocks will not. This inconsistency could cause unique constraint violations (if two null-anchor blocks exist for the same note, it works in PostgreSQL, but the block becomes unaddressable) and defeats the purpose of the change.PR #120 Review (Re-review)
Previous review found one blocker: the
create_blockAPI endpoint did not auto-generateanchor_idfor non-heading blocks. This re-review verifies the fix and checks for remaining issues.BLOCKERS
None. The previous blocker is resolved.
NITS
Test seed data still uses
anchor_id=None-- The test helpers_seed_note_with_blocks()and_seed_hierarchical_note()intests/test_blocks_api.py(lines 46, 60, 113) andtests/test_compiled_page_api.py(lines 46, 60) still create Block objects withanchor_id=Nonefor paragraph blocks. This works because (a) SQLite unique constraints treat NULLs as distinct, and (b) theanchor_idcolumn remainsnullable=Truein the model. However, these seed helpers now contradict the invariant this PR establishes -- that all blocks should have non-null anchor_ids. Similarlytests/test_blocks_compiled_pages.py(lines 50-52) creates blocks without anyanchor_idat all. Consider updating these in a follow-up to use the{block_type}-{position}pattern for consistency. Non-blocking because tests pass and the production write paths are all covered.anchor_idcolumn remainsnullable=Truein the model --models.pyline 212:anchor_id: Mapped[str | None] = mapped_column(String(200), nullable=True). After the migration backfills all existing NULLs and both write paths (parser + create_block API) now guarantee non-null values, consider a follow-up migration to make this columnNOT NULL. The unique constraint enforces uniqueness per note but does not prevent NULLs. Non-blocking -- the code paths are correct, this is a schema hardening item._ensure_unique_anchorscans up to suffix-19only --routes/blocks.pyline 58 pre-generates candidates[anchor] + [f"{anchor}-{i}" for i in range(2, 20)]. If a note ever has 19+ blocks with the same{type}-{position}candidate (unlikely but theoretically possible after many insert-at-same-position operations), the loop at line 68 would exhaust the pre-fetched set. This is a pre-existing concern, not introduced by this PR.SOP COMPLIANCE
119-block-anchor-idsreferences issue #119todo-block-anchor-ids.envfiles, credentials, or sensitive data in diffUPDATE ... WHERE anchor_id IS NULLis safe for re-runsk1f2g3h4i5j6->j0e1f2g3h4i5(blocks and compiled_pages migration)Write Path Audit
All three block creation paths now guarantee non-null
anchor_id:Parser (
src/pal_e_docs/blocks/parser.pylines 54-55, 64-67):parse_html_to_blocks()generates{block_type}-{position}for all non-heading blocks, including bare text nodes and fallback elements. Headings use slugified text. No path returnsanchor_id=None.parse_and_store_blocks()(src/pal_e_docs/blocks/sync.pyline 104): Usesbd.get("anchor_id")from parser output. Since the parser now always populatesanchor_id, this path is covered. Used by note create and update routes.create_blockAPI (src/pal_e_docs/routes/blocks.pylines 222-235): The previous blocker. Now correctly generates{block_type}-{position}for non-heading blocks (line 234) and runs it through_ensure_unique_anchor()for deduplication.Migration backfill (
alembic/versions/k1f2g3h4i5j6): Covers all existing rows withanchor_id IS NULLusing the same{block_type}-{position}pattern.Test Coverage
TestAnchorIdGenerationcovering all 6 block types, bare text, fallback elements, mixed documents, uniqueness, and idempotencytest_create_non_heading_auto_generates_anchorandtest_create_non_heading_anchor_uniqueness(with dedup suffix verification)anchor_id is Noneto expected valuesVERDICT: APPROVED
The blocker from the previous review is resolved. The
create_blockroute now auto-generatesanchor_idfor all block types, matching the parser behavior. The migration safely backfills existing data. Test coverage is thorough. The two nits (test seed data consistency and column nullability hardening) are follow-up items, not blockers.