Harden check-note-template.sh: verify mermaid fence + facet keyword for architecture notes #239

Closed
opened 2026-04-10 21:51:15 +00:00 by forgejo_admin · 2 comments
Contributor

Type

Feature

Lineage

Standalone — discovered during 2026-04-10 architecture-template gate review. Current hook enforces headings only; a note with ### Diagram\n(TODO: add diagram) would pass. Confirmed live: the new arch-domain-pal-e-docs note was accepted by the hook without any fence validation. This ticket tightens the gate to catch empty diagrams and facet-keyword mismatches.

Scope review round 1: NEEDS_REFINEMENT (review-945-2026-04-10). Fixed in this revision: (1) fence detection regex expanded to include the dominant <pre><code class="language-mermaid"> form the pal-e-docs markdown→HTML pipeline produces, (2) regression fixture list expanded from 3 to 13 (7 full triplets + 6 non-triplet arch notes), (3) new AC added for server-rendered HTML round-trip, (4) arch:hooks relabeled to arch:claude-custom (maps to existing component), (5) story:agent-write home clarified (the reviewer searched project-claude-custom which does not exist — the story lives in project-pal-e-docs user-stories table).

Repo

forgejo_admin/claude-custom

User Story

As an Agent (Ava creating architecture notes via MCP)
I want the template hook to fail when the Diagram section lacks a valid mermaid fence or contains the wrong diagram type for the facet
So that zero malformed architecture notes land in production (matching the story:agent-write success metric: "Zero malformed notes in production")

Traces to: story:agent-write on project-pal-e-docs (confirmed via get_section(slug="project-pal-e-docs", anchor_id="user-stories")).

Architecture

  • arch:claude-custom — the claude-custom row in the Shared Infrastructure table of arch-domain-pal-e-agency

(Note: arch-domain-pal-e-agency predates the strict Components-section template; its Shared Infrastructure table functions as the de facto Components surface. Template-conformance migration of that note is separate scope — do not touch in this ticket.)

Context

hooks/check-note-template.sh routes on note_type=architecture and currently enforces that four heading strings are present: Diagram, Components, Key Decisions, Related (lines 75–81). It does NOT verify:

  1. That a mermaid code fence actually exists in the note body.
  2. That the fence's leading keyword matches the facet the slug implies.

The template prescribes a fixed mermaid type per facet (see template-architecture):

Slug pattern Expected mermaid keyword
arch-domain-{project} erDiagram
arch-dataflow-{project} sequenceDiagram
arch-deployment-{project} graph TB (accept graph TD and flowchart TB/flowchart TD as aliases)

The triplet-per-project axiom is the value of the template. Enforcing the mermaid type at the gate makes the three-facet discipline self-enforcing.

File Targets

Files to modify:

  • hooks/check-note-template.sh — extend the architecture) case to verify (a) a mermaid fence is present, (b) the fence keyword matches the expected facet derived from the slug prefix.

Files the agent should NOT touch:

  • Any other hook — this change is scoped to architecture-note validation only
  • template-architecture in pal-e-docs — template prescribes the rule, hook enforces it; no template change needed
  • arch-domain-pal-e-agency — template-conformance migration is separate scope

Fence Detection — Container Forms to Accept

The hook reads tool_input.content, which may arrive in three container forms depending on who's calling create_note/update_note:

  1. Markdown source (what humans type): ```mermaid ... ```
  2. Inline HTML class (what some callers pre-render): <pre class="mermaid"> ... </pre>
  3. Nested HTML class (what the pal-e-docs markdown→HTML pipeline produces, and what every existing architecture note stores): <pre><code class="language-mermaid"> ... </code></pre>

All three must be accepted. The server-rendered form (#3) is the dominant stored form — confirmed by reading arch-domain-pal-e-docs, arch-domain-westside-basketball, etc. If only forms #1 and #2 are detected, agent round-trips (get_note → modify → update_note) will be denied because the content coming back from the server is in form #3.

Suggested detection pattern (grep -E):

<pre class="mermaid">|<pre><code class="[^"]*language-mermaid[^"]*">|<pre><code class="[^"]*mermaid[^"]*">|\x60\x60\x60mermaid

(The \x60 escape avoids triple-backtick collision inside a bash heredoc.)

Keyword Extraction

After detecting a fence, extract the first non-comment, non-blank line inside the fence and match against the expected facet keyword. Keyword check must tolerate:

  • Leading whitespace
  • Leading %% comment lines (Mermaid comment syntax)
  • Trailing diagram config (e.g., erDiagram or sequenceDiagram autonumber — match on the keyword prefix, not exact line equality)

Acceptance Criteria

  • Architecture note with no mermaid fence → hook denies with message naming the missing fence
  • Architecture note with mermaid fence but wrong keyword for the facet (e.g., arch-domain-foo containing sequenceDiagram) → hook denies with message naming the expected keyword
  • Architecture note with correct facet keyword, any of the three container forms → hook allows
  • Round-trip case: architecture note update where tool_input.content is server-rendered HTML (form #3, <pre><code class="language-mermaid">) → hook allows
  • Non-architecture notes unaffected
  • Hook remains fail-open on any parse error (existing behavior; do not regress)
  • Slug-to-facet detection handles all three prefixes: arch-domain-, arch-dataflow-, arch-deployment-
  • Arch notes with slugs NOT matching any prefix (e.g., arch-auth-westside-basketball, arch-sitemap-westside-basketball, arch-domain-sponsorship, arch-generic-checkout, arch-secrets-pipeline) → fence required, keyword check skipped

Test Expectations

  • Manual fixture tests via mcp__pal-e-docs__create_note with stub content:
    • Missing fence → deny
    • Wrong keyword → deny
    • Correct keyword (form #1 markdown) → allow
    • Correct keyword (form #2 inline HTML) → allow
    • Correct keyword (form #3 nested HTML) → allow
    • Unknown slug prefix with valid fence → allow
  • Regression: dry-run the new hook logic against all existing architecture notes (complete list below). Every single one must pass — any failure is a false positive and must be fixed before merge.
  • Run: bash -n hooks/check-note-template.sh to parse-check, then the fixture tests

Regression Fixture List (must all still pass)

Full triplets (7):

  • arch-domain-westside-basketball, arch-dataflow-westside-basketball, arch-deployment-westside-basketball
  • arch-domain-mcd-tracker, arch-dataflow-mcd-tracker, arch-deployment-mcd-tracker
  • arch-domain-pal-e-pac, arch-dataflow-pal-e-pac, arch-deployment-pal-e-pac
  • arch-domain-pal-e-app, arch-dataflow-pal-e-app, arch-deployment-pal-e-app
  • arch-domain-twitch-2k-wager, arch-dataflow-twitch-2k-wager, arch-deployment-twitch-2k-wager
  • arch-domain-westside-ops, arch-dataflow-westside-ops, arch-deployment-westside-ops
  • arch-domain-westside-mcp, arch-dataflow-westside-mcp, arch-deployment-westside-mcp

New (1):

  • arch-domain-pal-e-docs (created 2026-04-10 as backing note for ticket pal-e-api#252; dataflow + deployment siblings TODO)

Non-triplet arch notes (6) — fence required, facet keyword NOT checked:

  • arch-auth-westside-basketball
  • arch-sitemap-westside-basketball
  • arch-domain-sponsorship
  • arch-generic-checkout
  • arch-secrets-pipeline
  • arch-domain-pal-e-agency (legacy org chart — predates template; uses graph TD instead of erDiagram despite the arch-domain- prefix. This is a known inconsistency that should be accepted as pre-existing, not a regression target for this ticket. Consider adding a grandfather list if clean enforcement would break it.)

Heads-up on the legacy note: arch-domain-pal-e-agency will FAIL a strict arch-domain-erDiagram check because it uses graph TD. The hook should either (a) grandfather this note by slug, or (b) accept the failure and flag it as a separate migration ticket. Recommendation: grandfather by slug in a clearly-commented exclusion list, and file a follow-up ticket to migrate the org chart to a template-conformant note. Do not block this ticket on the migration.

Constraints

  • Preserve fail-open behavior on any error — a broken hook must not block knowledge-base writes
  • String matching is acceptable; do not pull in a YAML/mermaid parser
  • Match existing hook style (bash, jq, trap-based fail-open)

Checklist

  • PR opened
  • Hook parse-checked (bash -n)
  • Regression verified against all 22 existing architecture notes listed above
  • No unrelated hook changes
  • project-pal-e-docs — knowledge base this protects; home of story:agent-write
  • template-architecture — the rule being enforced
  • arch-domain-pal-e-docs — example of a template-conformant arch note (created 2026-04-10)
  • convention-architecture-ids — arch label derivation depends on valid diagrams
### Type Feature ### Lineage Standalone — discovered during 2026-04-10 architecture-template gate review. Current hook enforces headings only; a note with `### Diagram\n(TODO: add diagram)` would pass. Confirmed live: the new `arch-domain-pal-e-docs` note was accepted by the hook without any fence validation. This ticket tightens the gate to catch empty diagrams and facet-keyword mismatches. **Scope review round 1:** NEEDS_REFINEMENT (review-945-2026-04-10). Fixed in this revision: (1) fence detection regex expanded to include the dominant `<pre><code class="language-mermaid">` form the pal-e-docs markdown→HTML pipeline produces, (2) regression fixture list expanded from 3 to 13 (7 full triplets + 6 non-triplet arch notes), (3) new AC added for server-rendered HTML round-trip, (4) `arch:hooks` relabeled to `arch:claude-custom` (maps to existing component), (5) `story:agent-write` home clarified (the reviewer searched `project-claude-custom` which does not exist — the story lives in `project-pal-e-docs` user-stories table). ### Repo `forgejo_admin/claude-custom` ### User Story As an Agent (Ava creating architecture notes via MCP) I want the template hook to fail when the Diagram section lacks a valid mermaid fence or contains the wrong diagram type for the facet So that zero malformed architecture notes land in production (matching the `story:agent-write` success metric: "Zero malformed notes in production") Traces to: `story:agent-write` on `project-pal-e-docs` (confirmed via `get_section(slug="project-pal-e-docs", anchor_id="user-stories")`). ### Architecture - `arch:claude-custom` — the `claude-custom` row in the Shared Infrastructure table of `arch-domain-pal-e-agency` (Note: `arch-domain-pal-e-agency` predates the strict Components-section template; its Shared Infrastructure table functions as the de facto Components surface. Template-conformance migration of that note is separate scope — do not touch in this ticket.) ### Context `hooks/check-note-template.sh` routes on `note_type=architecture` and currently enforces that four heading strings are present: `Diagram`, `Components`, `Key Decisions`, `Related` (lines 75–81). It does NOT verify: 1. That a mermaid code fence actually exists in the note body. 2. That the fence's leading keyword matches the facet the slug implies. The template prescribes a fixed mermaid type per facet (see `template-architecture`): | Slug pattern | Expected mermaid keyword | |---|---| | `arch-domain-{project}` | `erDiagram` | | `arch-dataflow-{project}` | `sequenceDiagram` | | `arch-deployment-{project}` | `graph TB` (accept `graph TD` and `flowchart TB`/`flowchart TD` as aliases) | The triplet-per-project axiom is the value of the template. Enforcing the mermaid type at the gate makes the three-facet discipline self-enforcing. ### File Targets Files to modify: - `hooks/check-note-template.sh` — extend the `architecture)` case to verify (a) a mermaid fence is present, (b) the fence keyword matches the expected facet derived from the slug prefix. Files the agent should NOT touch: - Any other hook — this change is scoped to architecture-note validation only - `template-architecture` in pal-e-docs — template prescribes the rule, hook enforces it; no template change needed - `arch-domain-pal-e-agency` — template-conformance migration is separate scope ### Fence Detection — Container Forms to Accept The hook reads `tool_input.content`, which may arrive in three container forms depending on who's calling `create_note`/`update_note`: 1. **Markdown source** (what humans type): ```` ```mermaid ```` ... ```` ``` ```` 2. **Inline HTML class** (what some callers pre-render): `<pre class="mermaid">` ... `</pre>` 3. **Nested HTML class** (what the pal-e-docs markdown→HTML pipeline produces, and what every existing architecture note stores): `<pre><code class="language-mermaid">` ... `</code></pre>` All three must be accepted. The server-rendered form (#3) is **the dominant stored form** — confirmed by reading `arch-domain-pal-e-docs`, `arch-domain-westside-basketball`, etc. If only forms #1 and #2 are detected, agent round-trips (`get_note` → modify → `update_note`) will be denied because the content coming back from the server is in form #3. Suggested detection pattern (`grep -E`): ``` <pre class="mermaid">|<pre><code class="[^"]*language-mermaid[^"]*">|<pre><code class="[^"]*mermaid[^"]*">|\x60\x60\x60mermaid ``` (The `\x60` escape avoids triple-backtick collision inside a bash heredoc.) ### Keyword Extraction After detecting a fence, extract the first non-comment, non-blank line inside the fence and match against the expected facet keyword. Keyword check must tolerate: - Leading whitespace - Leading `%%` comment lines (Mermaid comment syntax) - Trailing diagram config (e.g., `erDiagram` or `sequenceDiagram autonumber` — match on the keyword prefix, not exact line equality) ### Acceptance Criteria - [ ] Architecture note with no mermaid fence → hook denies with message naming the missing fence - [ ] Architecture note with mermaid fence but wrong keyword for the facet (e.g., `arch-domain-foo` containing `sequenceDiagram`) → hook denies with message naming the expected keyword - [ ] Architecture note with correct facet keyword, any of the three container forms → hook allows - [ ] **Round-trip case:** architecture note update where `tool_input.content` is server-rendered HTML (form #3, `<pre><code class="language-mermaid">`) → hook allows - [ ] Non-architecture notes unaffected - [ ] Hook remains fail-open on any parse error (existing behavior; do not regress) - [ ] Slug-to-facet detection handles all three prefixes: `arch-domain-`, `arch-dataflow-`, `arch-deployment-` - [ ] Arch notes with slugs NOT matching any prefix (e.g., `arch-auth-westside-basketball`, `arch-sitemap-westside-basketball`, `arch-domain-sponsorship`, `arch-generic-checkout`, `arch-secrets-pipeline`) → fence required, keyword check skipped ### Test Expectations - [ ] Manual fixture tests via `mcp__pal-e-docs__create_note` with stub content: - Missing fence → deny - Wrong keyword → deny - Correct keyword (form #1 markdown) → allow - Correct keyword (form #2 inline HTML) → allow - Correct keyword (form #3 nested HTML) → allow - Unknown slug prefix with valid fence → allow - [ ] Regression: dry-run the new hook logic against all existing architecture notes (complete list below). Every single one must pass — any failure is a false positive and must be fixed before merge. - Run: `bash -n hooks/check-note-template.sh` to parse-check, then the fixture tests ### Regression Fixture List (must all still pass) **Full triplets (7):** - `arch-domain-westside-basketball`, `arch-dataflow-westside-basketball`, `arch-deployment-westside-basketball` - `arch-domain-mcd-tracker`, `arch-dataflow-mcd-tracker`, `arch-deployment-mcd-tracker` - `arch-domain-pal-e-pac`, `arch-dataflow-pal-e-pac`, `arch-deployment-pal-e-pac` - `arch-domain-pal-e-app`, `arch-dataflow-pal-e-app`, `arch-deployment-pal-e-app` - `arch-domain-twitch-2k-wager`, `arch-dataflow-twitch-2k-wager`, `arch-deployment-twitch-2k-wager` - `arch-domain-westside-ops`, `arch-dataflow-westside-ops`, `arch-deployment-westside-ops` - `arch-domain-westside-mcp`, `arch-dataflow-westside-mcp`, `arch-deployment-westside-mcp` **New (1):** - `arch-domain-pal-e-docs` (created 2026-04-10 as backing note for ticket pal-e-api#252; dataflow + deployment siblings TODO) **Non-triplet arch notes (6) — fence required, facet keyword NOT checked:** - `arch-auth-westside-basketball` - `arch-sitemap-westside-basketball` - `arch-domain-sponsorship` - `arch-generic-checkout` - `arch-secrets-pipeline` - `arch-domain-pal-e-agency` (legacy org chart — predates template; uses `graph TD` instead of `erDiagram` despite the `arch-domain-` prefix. This is a known inconsistency that should be accepted as pre-existing, not a regression target for this ticket. Consider adding a grandfather list if clean enforcement would break it.) **Heads-up on the legacy note:** `arch-domain-pal-e-agency` will FAIL a strict `arch-domain-` → `erDiagram` check because it uses `graph TD`. The hook should either (a) grandfather this note by slug, or (b) accept the failure and flag it as a separate migration ticket. Recommendation: grandfather by slug in a clearly-commented exclusion list, and file a follow-up ticket to migrate the org chart to a template-conformant note. Do not block this ticket on the migration. ### Constraints - Preserve fail-open behavior on any error — a broken hook must not block knowledge-base writes - String matching is acceptable; do not pull in a YAML/mermaid parser - Match existing hook style (bash, jq, trap-based fail-open) ### Checklist - [ ] PR opened - [ ] Hook parse-checked (`bash -n`) - [ ] Regression verified against all 22 existing architecture notes listed above - [ ] No unrelated hook changes ### Related - `project-pal-e-docs` — knowledge base this protects; home of `story:agent-write` - `template-architecture` — the rule being enforced - `arch-domain-pal-e-docs` — example of a template-conformant arch note (created 2026-04-10) - `convention-architecture-ids` — arch label derivation depends on valid diagrams
Author
Contributor

Scope Review: NEEDS_REFINEMENT

Review note: review-945-2026-04-10

Scope is solid — file targets and line references verify cleanly. Two content fixes and two traceability gaps need to be resolved before dispatch.

[BODY] fixes required in issue body:

  • Fence detection regex misses the dominant stored form. Stored html_content for existing arch notes uses <pre><code class="language-mermaid"> (markdown→HTML conversion), not <pre class="mermaid">. Agents round-tripping via get_note → update_note will send this form. Extend the detection regex to a third alternation (e.g., <code class="[^"]*mermaid[^"]*">) and make keyword extraction work across all three container forms. Without this, the hook will deny legitimate updates to every existing architecture note.
  • Regression fixture list understated. Ticket says "3 triplets" but the DB has 7 full triplets (westside-basketball, mcd-tracker, pal-e-pac, pal-e-app, twitch-2k-wager, westside-ops, westside-mcp) plus 6 non-triplet arch notes (arch-auth-westside-basketball, arch-sitemap-westside-basketball, arch-domain-sponsorship, arch-generic-checkout, arch-secrets-pipeline, arch-domain-pal-e-agency). Regression must dry-run all ~27 current architecture notes. The non-prefix ones are load-bearing fixtures for AC 7.
  • Add an AC covering the round-trip case: update_note where content is server-rendered HTML (<pre><code class="language-mermaid">…) must be accepted, not denied.

[SCOPE] gaps (Ava decision):

  • story:agent-write label has no backing story note. No project-claude-custom project page exists. Either create one with a user-stories section or add the agent-write story under project-pal-e-docs.
  • arch:hooks label has no backing arch note. search_notes("arch-hooks") returns zero. Either create an arch note that lists hooks as a component or relabel to an existing component.

Decomposition: not needed. 1 file, 1 function, 7 AC, string-matching. ~15-30 min agent work.

## Scope Review: NEEDS_REFINEMENT Review note: `review-945-2026-04-10` Scope is solid — file targets and line references verify cleanly. Two content fixes and two traceability gaps need to be resolved before dispatch. **[BODY] fixes required in issue body:** - Fence detection regex misses the dominant stored form. Stored `html_content` for existing arch notes uses `<pre><code class="language-mermaid">` (markdown→HTML conversion), not `<pre class="mermaid">`. Agents round-tripping via `get_note → update_note` will send this form. Extend the detection regex to a third alternation (e.g., `<code class="[^"]*mermaid[^"]*">`) and make keyword extraction work across all three container forms. Without this, the hook will deny legitimate updates to every existing architecture note. - Regression fixture list understated. Ticket says "3 triplets" but the DB has 7 full triplets (westside-basketball, mcd-tracker, pal-e-pac, pal-e-app, twitch-2k-wager, westside-ops, westside-mcp) plus 6 non-triplet arch notes (arch-auth-westside-basketball, arch-sitemap-westside-basketball, arch-domain-sponsorship, arch-generic-checkout, arch-secrets-pipeline, arch-domain-pal-e-agency). Regression must dry-run all ~27 current architecture notes. The non-prefix ones are load-bearing fixtures for AC 7. - Add an AC covering the round-trip case: update_note where content is server-rendered HTML (`<pre><code class="language-mermaid">…`) must be accepted, not denied. **[SCOPE] gaps (Ava decision):** - `story:agent-write` label has no backing story note. No `project-claude-custom` project page exists. Either create one with a user-stories section or add the agent-write story under `project-pal-e-docs`. - `arch:hooks` label has no backing arch note. `search_notes("arch-hooks")` returns zero. Either create an arch note that lists hooks as a component or relabel to an existing component. Decomposition: not needed. 1 file, 1 function, 7 AC, string-matching. ~15-30 min agent work.
Author
Contributor

Scope Review: APPROVED

Review note: review-945-2026-04-10-r2

Round 2 review verdict: APPROVED. All five round-1 blockers are resolved or correctly refuted.

  • [BODY] fence regex — RESOLVED (all three container forms enumerated with concrete grep pattern; server-rendered form #3 called out as dominant)
  • [BODY] regression fixture list — RESOLVED (28 slugs enumerated: 21 triplet + 1 new + 6 non-triplet)
  • [BODY] round-trip AC — RESOLVED (new AC4 covers server-rendered HTML update path)
  • [SCOPE] story:agent-write — REFUTED correctly; verified in project-pal-e-docs user-stories table row 4 (round 1 searched nonexistent project-claude-custom)
  • [SCOPE] arch:hooks — RESOLVED via relabel to arch:claude-custom; verified in arch-domain-pal-e-agency#shared-infrastructure (claude-custom row: "Hooks, skills, agent configs, commands")

Traceability triangle fully verified end-to-end. File target hooks/check-note-template.sh confirmed at 251 lines, architecture case at lines 75–81. Grandfather strategy for legacy arch-domain-pal-e-agency (uses graph TD not erDiagram) is reasonable and correctly defers migration to a follow-up ticket.

Minor non-blocking nit: the Checklist says "all 22" and the Lineage says "expanded from 3 to 13" but the Regression Fixture List itself enumerates 28 explicit slugs. The enumerated list is authoritative — dev should iterate it directly. Worth tidying eventually, not a review blocker.

Ticket can advance backlog → todo.

## Scope Review: APPROVED Review note: `review-945-2026-04-10-r2` Round 2 review verdict: **APPROVED**. All five round-1 blockers are resolved or correctly refuted. - `[BODY]` fence regex — RESOLVED (all three container forms enumerated with concrete grep pattern; server-rendered form #3 called out as dominant) - `[BODY]` regression fixture list — RESOLVED (28 slugs enumerated: 21 triplet + 1 new + 6 non-triplet) - `[BODY]` round-trip AC — RESOLVED (new AC4 covers server-rendered HTML update path) - `[SCOPE]` story:agent-write — REFUTED correctly; verified in `project-pal-e-docs` user-stories table row 4 (round 1 searched nonexistent `project-claude-custom`) - `[SCOPE]` arch:hooks — RESOLVED via relabel to `arch:claude-custom`; verified in `arch-domain-pal-e-agency#shared-infrastructure` (claude-custom row: "Hooks, skills, agent configs, commands") Traceability triangle fully verified end-to-end. File target `hooks/check-note-template.sh` confirmed at 251 lines, architecture case at lines 75–81. Grandfather strategy for legacy `arch-domain-pal-e-agency` (uses `graph TD` not `erDiagram`) is reasonable and correctly defers migration to a follow-up ticket. Minor non-blocking nit: the Checklist says "all 22" and the Lineage says "expanded from 3 to 13" but the Regression Fixture List itself enumerates 28 explicit slugs. The enumerated list is authoritative — dev should iterate it directly. Worth tidying eventually, not a review blocker. Ticket can advance backlog → todo.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/claude-custom#239
No description provided.