Woodpecker plan-step quoting bug recurs on certain content patterns (#307 v2 fix incomplete) #321

Open
opened 2026-04-30 11:55:07 +00:00 by forgejo_admin · 0 comments
Contributor

Type

Bug

Lineage

Discovered scope from PR #320's CI run (pipeline 473, 2026-04-30). PR #307's v2 fix (commit a49c1d0, merged via PR #309) successfully fixed the original unterminated quoted string bug for PRs #310 and #313 — both passed CI cleanly. PR #320's CI hit the SAME error class. The v2 fix's coverage is content-dependent.

Repo

forgejo_admin/pal-e-platform

What Broke

PR #320 changes the SQL heredoc inside kubernetes_job_v1.admin_app_user_provision's container args (replaces DO block with psql \gset+\if). Tofu plan ran successfully and emitted the diff. Then in the post-plan comment-posting block (the v2 3-command pattern):

+ apk add --no-cache jq           <-- step succeeds, jq installed
OK: 29 MiB in 41 packages
/bin/sh: syntax error: unterminated quoted string   <-- next step's sh fails to parse

The error happens at the start of the second Woodpecker command (the heredoc-write pattern from v2). v2 was supposed to immunize this from quote-heavy content because the heredoc body is bytes-only via <<'POSTEOF' (no shell substitution). Yet the error recurs.

Repro Steps

  1. Open a PR to pal-e-platform that introduces SQL containing psql backslash directives + colons (e.g. \gset, \if, :'var')
  2. Pipeline runs plan step
  3. After apk add jq succeeds, next step (the - | heredoc-write block) fails with unterminated quoted string
  4. Reproduces consistently for PR #320 (commit fe701ad). Did NOT reproduce for PRs #310 or #313 (which had simpler plan output).

Expected Behavior

v2 fix should immunize the comment-posting from any plan output content. Either:

  • The fix is content-independent (should always work)
  • OR there's a remaining edge case for certain content patterns we need to identify and harden against

Environment

  • pal-e-platform .woodpecker.yaml lines 162-195 (the v2 comment-posting block)
  • Tofu image ghcr.io/opentofu/opentofu:1.9 (Alpine 3.20 + busybox sh)
  • Triggered by: PR #320's plan diff content (psql metacommands in SQL heredoc inside kubernetes_job_v1.spec.template.spec.containers[0].args)

User Story

story:platform-stability — As a dev/operator merging infra PRs, I need the Woodpecker plan step's comment-posting to NOT false-positive on certain plan content, so that valid infra changes don't keep requiring force-merge bypasses (eroding the safety the hook is meant to enforce).

Architecture

arch:ci-pipeline

Context

Hypothesis: The plan output is captured to /tmp/plan-output.txt BEFORE the comment-posting block runs. Then apk add jq runs in its own command. Then the heredoc-write command starts. The heredoc body itself is static (doesn't reference plan content) — should be content-independent.

But the error emerges in step 2 (heredoc write) for #320 only. What's different about #320's environment when step 2 starts?

Possible angles:

  • Plan output written to /tmp/plan-output.txt has content that breaks something at step boundary (Woodpecker base64-decoding the next step's CI_SCRIPT?)
  • Some interaction with the larger plan diff size?
  • Specific characters in #320's plan that match some Woodpecker shellescape edge case?

Worth: pull pipeline 473's full step boundary trace; compare against 453's (which succeeded). Diff isolates the trigger pattern.

This is the second recurrence of #307's pattern. The original bug (PR #304's plan with k8s metadata labels) was thought fixed by v1 — wasn't. v2 thought fixed it generally — also wasn't. Each force-merge bypass is a safety erosion. Worth treating as P1 once westside-admin bootstrap completes.

File Targets

  • .woodpecker.yaml lines 162-195 — investigate + harden the v2 pattern OR move to Path C/D from #307's original ticket (separate alpine step + jq, or wget --post-data with URL-encoding)

Acceptance Criteria

  • Reproduce #320's failure on a synthetic branch with similar SQL heredoc content
  • Identify the specific trigger character/pattern
  • Harden the v2 fix OR migrate to a different post-comment approach
  • Verify across diverse plan content (k8s labels, SQL heredocs, multi-line YAML, large diffs)

Test Expectations

  • Synthetic PR with SQL heredoc + psql backslash directives + colons passes plan step cleanly
  • Existing pattern (k8s metadata labels) continues to pass
  • Large plan diffs (>30k chars) pass

Constraints

  • No regression on existing working PR shapes
  • Don't disable CI's plan step — the gate is load-bearing

Checklist

  • Reviewed via /review-ticket
  • Triggered investigation of pipeline 473 vs 453 step boundaries
  • Fix applied + tested
  • PR opened with Closes #THIS
  • Verified across multiple plan content shapes
  • Original: #307 (v1 + v2 both incomplete)
  • Force-merged due to this bug: #304 (PR), #308 (PR), #320 (PR — this discovery)
  • Memory: feedback_woodpecker_quote_heavy_scripts (now needs update to reflect v2 incompleteness)
### Type Bug ### Lineage Discovered scope from PR #320's CI run (pipeline 473, 2026-04-30). PR #307's v2 fix (commit a49c1d0, merged via PR #309) successfully fixed the original `unterminated quoted string` bug for PRs #310 and #313 — both passed CI cleanly. PR #320's CI hit the SAME error class. The v2 fix's coverage is content-dependent. ### Repo forgejo_admin/pal-e-platform ### What Broke PR #320 changes the SQL heredoc inside `kubernetes_job_v1.admin_app_user_provision`'s container args (replaces DO block with psql `\gset`+`\if`). Tofu plan ran successfully and emitted the diff. Then in the post-plan comment-posting block (the v2 3-command pattern): ``` + apk add --no-cache jq <-- step succeeds, jq installed OK: 29 MiB in 41 packages /bin/sh: syntax error: unterminated quoted string <-- next step's sh fails to parse ``` The error happens at the start of the second Woodpecker command (the heredoc-write pattern from v2). v2 was supposed to immunize this from quote-heavy content because the heredoc body is bytes-only via `<<'POSTEOF'` (no shell substitution). Yet the error recurs. ### Repro Steps 1. Open a PR to pal-e-platform that introduces SQL containing psql backslash directives + colons (e.g. `\gset`, `\if`, `:'var'`) 2. Pipeline runs plan step 3. After `apk add jq` succeeds, next step (the `- |` heredoc-write block) fails with `unterminated quoted string` 4. Reproduces consistently for PR #320 (commit fe701ad). Did NOT reproduce for PRs #310 or #313 (which had simpler plan output). ### Expected Behavior v2 fix should immunize the comment-posting from any plan output content. Either: - The fix is content-independent (should always work) - OR there's a remaining edge case for certain content patterns we need to identify and harden against ### Environment - pal-e-platform `.woodpecker.yaml` lines 162-195 (the v2 comment-posting block) - Tofu image `ghcr.io/opentofu/opentofu:1.9` (Alpine 3.20 + busybox sh) - Triggered by: PR #320's plan diff content (psql metacommands in SQL heredoc inside kubernetes_job_v1.spec.template.spec.containers[0].args) ### User Story story:platform-stability — As a dev/operator merging infra PRs, I need the Woodpecker plan step's comment-posting to NOT false-positive on certain plan content, so that valid infra changes don't keep requiring force-merge bypasses (eroding the safety the hook is meant to enforce). ### Architecture arch:ci-pipeline ### Context Hypothesis: The plan output is captured to `/tmp/plan-output.txt` BEFORE the comment-posting block runs. Then `apk add jq` runs in its own command. Then the heredoc-write command starts. The heredoc body itself is static (doesn't reference plan content) — should be content-independent. But the error emerges in step 2 (heredoc write) for #320 only. What's different about #320's environment when step 2 starts? Possible angles: - Plan output written to /tmp/plan-output.txt has content that breaks something at step boundary (Woodpecker base64-decoding the next step's CI_SCRIPT?) - Some interaction with the larger plan diff size? - Specific characters in #320's plan that match some Woodpecker shellescape edge case? Worth: pull pipeline 473's full step boundary trace; compare against 453's (which succeeded). Diff isolates the trigger pattern. This is the **second** recurrence of #307's pattern. The original bug (PR #304's plan with k8s metadata labels) was thought fixed by v1 — wasn't. v2 thought fixed it generally — also wasn't. Each force-merge bypass is a safety erosion. Worth treating as P1 once westside-admin bootstrap completes. ### File Targets - `.woodpecker.yaml` lines 162-195 — investigate + harden the v2 pattern OR move to Path C/D from #307's original ticket (separate alpine step + jq, or wget --post-data with URL-encoding) ### Acceptance Criteria - [ ] Reproduce #320's failure on a synthetic branch with similar SQL heredoc content - [ ] Identify the specific trigger character/pattern - [ ] Harden the v2 fix OR migrate to a different post-comment approach - [ ] Verify across diverse plan content (k8s labels, SQL heredocs, multi-line YAML, large diffs) ### Test Expectations - Synthetic PR with SQL heredoc + psql backslash directives + colons passes plan step cleanly - Existing pattern (k8s metadata labels) continues to pass - Large plan diffs (>30k chars) pass ### Constraints - No regression on existing working PR shapes - Don't disable CI's plan step — the gate is load-bearing ### Checklist - [ ] Reviewed via /review-ticket - [ ] Triggered investigation of pipeline 473 vs 453 step boundaries - [ ] Fix applied + tested - [ ] PR opened with `Closes #THIS` - [ ] Verified across multiple plan content shapes ### Related - Original: #307 (v1 + v2 both incomplete) - Force-merged due to this bug: #304 (PR), #308 (PR), #320 (PR — this discovery) - Memory: `feedback_woodpecker_quote_heavy_scripts` (now needs update to reflect v2 incompleteness)
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/pal-e-platform#321
No description provided.