7e-1: Source-of-truth cutover — note writes parse to blocks first #109

Closed
opened 2026-03-08 06:32:28 +00:00 by forgejo_admin · 0 comments

Lineage

plan-2026-02-26-tf-modularize-postgres → Phase 7 (Block Content) → Phase 7e (Compiled Pages) → Phase 7e-1

Repo

forgejo_admin/pal-e-docs

User Story

As an agent using block tools (get_section, update_block),
I want blocks to always reflect the latest note content,
So that block reads are never stale after an update_note() call.

Context

Block API endpoints exist (Phase 7d) and work correctly — block writes trigger recompilation of both compiled_pages and notes.html_content. But the reverse path is broken: create_note() and update_note() write directly to html_content without parsing to blocks. Every note write via the original API silently causes blocks and html_content to go out of sync.

This makes block tools unreliable. An agent that calls update_note(content=...) then later calls get_section() will get stale block data. The fix: make blocks canonical by having note writes parse to blocks first, then recompile.

The parser (parse_html_to_blocks) and compiler (compile_blocks_to_html) are already proven — they processed all 274 notes during the Phase 7c backfill. The _recompile() helper in routes/blocks.py already handles the blocks→html_content sync. This phase wires up the reverse direction.

File Targets

Files the agent should modify:

  • src/pal_e_docs/routes/notes.py: create_note() and update_note() functions. When html_content is provided:
    1. Parse to blocks via parse_html_to_blocks()
    2. Store blocks (for create: insert new; for update: delete existing + insert new)
    3. Recompile via the same _recompile() pattern from routes/blocks.py
    4. Set html_content from compiled output (not from raw input — ensures round-trip consistency)
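The four steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's code: the parser and compiler here are toy stand-ins (the real parse_html_to_blocks() and compile_blocks_to_html() signatures are not shown in this issue), the Note dataclass replaces the database models, and write_note_content() is a hypothetical helper name.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Toy stand-ins (assumption): the real functions live in src/pal_e_docs/blocks/.
_TAG_RE = re.compile(r"<(h[1-6]|p)>(.*?)</\1>", re.DOTALL)


def parse_html_to_blocks(html: str) -> list[dict]:
    """Split flat HTML into block dicts with the four keys named in this issue."""
    blocks = []
    for position, match in enumerate(_TAG_RE.finditer(html)):
        tag, text = match.group(1), match.group(2).strip()
        is_heading = tag.startswith("h")
        anchor = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-") if is_heading else None
        blocks.append({
            "block_type": "heading" if is_heading else "paragraph",
            "content": f"<{tag}>{text}</{tag}>",
            "anchor_id": anchor,
            "position": position,
        })
    return blocks


def compile_blocks_to_html(blocks: list[dict]) -> str:
    """Join block content in position order."""
    ordered = sorted(blocks, key=lambda b: b["position"])
    return "\n".join(b["content"] for b in ordered)


@dataclass
class Note:  # hypothetical stand-in for the Note model plus its blocks
    slug: str
    title: str
    html_content: str = ""
    blocks: list[dict] = field(default_factory=list)


def write_note_content(note: Note, raw_html: str) -> None:
    """Hypothetical helper: blocks are written first, html_content is derived."""
    new_blocks = parse_html_to_blocks(raw_html)               # step 1: parse
    note.blocks = new_blocks                                  # step 2: delete old + insert new
    note.html_content = compile_blocks_to_html(note.blocks)   # steps 3-4: recompile, store compiled output


def update_note(note: Note, *, title: Optional[str] = None,
                html_content: Optional[str] = None) -> None:
    if title is not None:
        note.title = title
    if html_content is not None:  # metadata-only updates skip block processing
        write_note_content(note, html_content)
```

Because html_content is always taken from the compiler, stray whitespace or formatting in the raw input does not survive the write; what is stored is whatever the blocks compile to.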

Files the agent should reference (read, don't modify):

  • src/pal_e_docs/routes/blocks.py: _recompile() helper is the pattern to follow
  • src/pal_e_docs/blocks/parser.py: parse_html_to_blocks()
  • src/pal_e_docs/blocks/compiler.py: compile_blocks_to_html()
  • src/pal_e_docs/models.py: Block, CompiledPage, Note models

Files the agent should NOT touch:

  • src/pal_e_docs/routes/blocks.py — block write endpoints already work correctly
  • src/pal_e_docs/blocks/parser.py — parser is proven, no changes needed
  • src/pal_e_docs/blocks/compiler.py — compiler is proven, no changes needed

Acceptance Criteria

  • create_note(html_content=...) results in blocks + compiled_page created alongside the note
  • update_note(html_content=...) results in blocks matching the updated HTML (delete old blocks, parse new ones)
  • Metadata-only updates (title, tags, status, etc. without html_content) do NOT trigger block processing
  • Round-trip consistency: create note → read blocks → matches original HTML structure
  • Round-trip consistency: update note via html_content → get_section() returns updated content
  • Existing tests pass unchanged (API contract is identical — consumers still send/receive html_content)

Test Expectations

  • Unit test: create_note with html_content → verify blocks created with correct types and content
  • Unit test: update_note with html_content → verify old blocks deleted, new blocks created
  • Unit test: update_note without html_content (metadata only) → verify blocks unchanged
  • Integration test: create note → get_note_toc → verify TOC matches headings
  • Integration test: update_note → get_section → verify section content matches update
  • Round-trip test: create → update via blocks → read html_content → update via html_content → read blocks → all consistent
  • Run command: pytest tests/ -v
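As an illustration of the shape these tests might take, here is a pytest-style sketch against a small in-memory fake. Everything in it is an assumption made for the example: FakeDocs, its method signatures, and the line-based toy parser are hypothetical and do not reflect the real API client or test fixtures in pal-e-docs.

```python
import re


def _parse(html):
    """Toy parser (assumption): one block per non-empty line."""
    blocks = []
    for line in html.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = re.fullmatch(r"<h[1-6]>(.*)</h[1-6]>", line)
        blocks.append({
            "block_type": "heading" if heading else "paragraph",
            "content": line,
            "anchor_id": heading.group(1).lower().replace(" ", "-") if heading else None,
            "position": len(blocks),
        })
    return blocks


def _compile(blocks):
    return "\n".join(b["content"] for b in sorted(blocks, key=lambda b: b["position"]))


class FakeDocs:
    """Hypothetical in-memory stand-in exposing both write paths."""

    def __init__(self):
        self.notes = {}   # slug -> {"title": ..., "html_content": ...}
        self.blocks = {}  # slug -> list of block dicts

    def create_note(self, slug, title, html_content):
        self.notes[slug] = {"title": title, "html_content": ""}
        self._replace_blocks(slug, html_content)

    def update_note(self, slug, title=None, html_content=None):
        if title is not None:
            self.notes[slug]["title"] = title
        if html_content is not None:
            self._replace_blocks(slug, html_content)

    def update_block(self, slug, position, content):
        self.blocks[slug][position]["content"] = content
        self._recompile(slug)

    def get_section(self, slug, anchor_id):
        # A section is a heading plus everything up to the next heading.
        section, inside = [], False
        for block in self.blocks[slug]:
            if block["block_type"] == "heading":
                inside = block["anchor_id"] == anchor_id
            if inside:
                section.append(block["content"])
        return "\n".join(section)

    def _replace_blocks(self, slug, html):
        self.blocks[slug] = _parse(html)
        self._recompile(slug)

    def _recompile(self, slug):
        self.notes[slug]["html_content"] = _compile(self.blocks[slug])


def test_metadata_only_update_leaves_blocks_unchanged():
    docs = FakeDocs()
    docs.create_note("n", "Old title", "<h2>Intro</h2>\n<p>body</p>")
    before = [dict(b) for b in docs.blocks["n"]]
    docs.update_note("n", title="New title")
    assert docs.blocks["n"] == before


def test_round_trip_through_both_write_paths():
    docs = FakeDocs()
    docs.create_note("n", "T", "<h2>Intro</h2>\n<p>old</p>")
    docs.update_block("n", 1, "<p>edited via block</p>")
    assert docs.notes["n"]["html_content"] == "<h2>Intro</h2>\n<p>edited via block</p>"
    docs.update_note("n", html_content="<h2>Intro</h2>\n<p>edited via note</p>")
    assert docs.get_section("n", "intro") == "<h2>Intro</h2>\n<p>edited via note</p>"
```

The second test follows the round-trip expectation in the list: a block write is visible through html_content, and a later html_content write is visible through get_section().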

Constraints

  • Follow the _recompile() pattern from routes/blocks.py — don't reinvent the sync logic
  • Consider extracting _recompile() to a shared module if importing across route files is cleaner than duplicating
  • Parser produces block dicts with block_type, content, anchor_id, position — these map directly to the Block model
  • html_content stored on the note should be the COMPILED output, not the raw input — this ensures the round-trip is deterministic
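To make the "map directly" constraint concrete, here is a small sketch of turning parser output into model rows. The Block dataclass is a hypothetical stand-in for the real model in models.py (whose columns beyond the four named keys are not shown in this issue), and blocks_from_parser_output() and the note_id field are assumed names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:  # hypothetical stand-in for the Block model
    note_id: int
    block_type: str
    content: str
    position: int
    anchor_id: Optional[str] = None


# The four keys the issue says the parser emits.
PARSER_KEYS = {"block_type", "content", "anchor_id", "position"}


def blocks_from_parser_output(note_id: int, parsed: list[dict]) -> list[Block]:
    """Build Block rows from parser dicts, rejecting keys the model does not expect."""
    rows = []
    for item in parsed:
        unexpected = set(item) - PARSER_KEYS
        if unexpected:
            raise ValueError(f"unexpected parser keys: {sorted(unexpected)}")
        rows.append(Block(note_id=note_id, **item))
    # Keep rows in document order regardless of input order.
    return sorted(rows, key=lambda b: b.position)
```

Failing loudly on unknown keys is a design choice for the sketch: if the parser output ever grows a field, the write path surfaces that rather than silently dropping data.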

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes

Related

  • phase-postgres-7e-compiled-pages — phase note with full context
  • phase-postgres-7d-api-mcp-tools — block API (prerequisite, COMPLETED)
  • benchmark-phase7-block-baseline — baseline measurements