Woodpecker plan step fails with shell escaping when tofu plan output contains certain quote patterns #307

Closed
opened 2026-04-26 01:02:08 +00:00 by forgejo_admin · 1 comment
Contributor

Type

Bug

Lineage

Long-standing CI infra issue — repeatedly forces force-merges on infra PRs because the failing step is the PR-comment-posting wrapper, NOT the actual tofu plan (which succeeds). Surfaced again 2026-04-25 on PR #304 (admin-app Postgres Job). User authorized force-merge as workaround pending this fix.

Repo

forgejo_admin/pal-e-platform

What Broke

The plan step in .woodpecker.yaml (line 173) uses fragile nested shell quoting to build a JSON payload for the PR comment containing the plan output:

printf '{"body":"%s"}' "$$(echo "$$BODY" | sed 's/"/\\"/g' | sed ':a;N;$$!ba;s/\n/\\n/g')" > /tmp/comment-payload.json

When tofu plan output contains certain quote patterns (e.g., Kubernetes metadata labels like "app.kubernetes.io/managed-by" = "pal-e-platform-terraform", sensitive-value markers, multi-line resource diffs), the nested sed sequence inside $$(...) produces an unterminated quoted string in the wrapping shell, causing:

/bin/sh: syntax error: unterminated quoted string

The step exits with code 2 → pipeline marked failure → forgejo combined commit status failurecheck-infra-validation.sh hook blocks merge.

The actual plan succeedsPlan: 4 to add, 1 to change, 0 to destroy is visible in the logs above the syntax error. The infrastructure change is sound. Only the comment-posting wrapper is broken.

Repro Steps

  1. Open a PR against forgejo_admin/pal-e-platform whose tofu plan output contains kubernetes resource diffs with labels (e.g., any kubernetes_secret_v1, kubernetes_namespace_v1, kubernetes_service_v1 addition)
  2. Wait for Woodpecker pipeline to run on the PR
  3. Observe pipeline 442 (PR #304 of 2026-04-25) as a reference repro: https://woodpecker.tail5b443a.ts.net/repos/24/pipeline/442/1
  4. plan step shows tofu plan ran successfully then crashes on /bin/sh: syntax error: unterminated quoted string
  5. check-infra-validation.sh hook blocks PR merge despite the plan being valid

Expected Behavior

  • The plan step posts the plan output as a PR comment regardless of quote characters in the plan body
  • Step exit code reflects whether the actual tofu plan succeeded, NOT whether the comment-posting wrapper had a quoting accident
  • Infra PRs with sound tofu plan output can merge through the standard hook gate

Environment

  • pal-e-platform .woodpecker.yaml (line 173 specifically; also potentially line 162 tr '\n' ', ')
  • Woodpecker CI on tail5b443a
  • Tofu image ghcr.io/opentofu/opentofu:1.9 (uses Alpine /bin/sh, not bash — fewer escape niceties)
  • Triggers consistently on PRs that introduce new k8s resources

User Story

story:platform-stability — As an operator merging infra PRs, I need the Woodpecker plan step to fail only when the actual tofu plan fails, not when the comment-posting wrapper hits a shell quoting edge case, so that valid infra changes don't require force-merge bypasses that defeat the safety the hook is meant to enforce.

Context

Pattern repeats: feedback_ci_pipeline_lessons lists 12 prior root causes already fixed in this CI; this is the 13th. The escaping approach in the comment-posting block is the architectural smell. Better options:

  • Replace the sed-pipeline with jq -Rs '.' to JSON-encode the plan output safely (jq is reliable for this; the fragile bit is the hand-rolled escape).
  • Write the JSON payload via printf to a heredoc then post — let the shell handle the bytes raw, encode at the JSON layer.
  • Switch image: alpine for the comment-posting sub-step and install jq once — separate concerns from the tofu container.
  • Use wget --post-data with URL-encoding rather than building JSON by hand — simpler shape.

File Targets

  • ~/pal-e-platform/.woodpecker.yaml line 168-178 (the BODY= / printf / wget block in the plan step)
  • Possibly: same pattern duplicated in the apply step (verify by inspection)

Acceptance Criteria

  • Plan step posts PR comment for any tofu plan output, including outputs containing nested quotes, k8s label maps, sensitive-value markers, multi-line resource diffs
  • Plan step exit code reflects ONLY the success/failure of tofu plan itself, not the comment-posting wrapper
  • Apply step (line 182+) audited for the same pattern; fixed if present
  • Verified end-to-end on a fresh PR that adds a kubernetes_secret_v1 (matches the failing pattern)

Test Expectations

  • Test PR with kubernetes_secret_v1 addition runs through plan step without unterminated quoted string error
  • PR comment is posted with full plan output (or properly truncated to 60000 chars per existing logic)
  • check-infra-validation.sh hook reports success for the plan step

Constraints

  • Don't bypass plan validation — the actual gate (was tofu plan clean?) must remain enforced
  • Don't introduce new dependencies in the tofu image — if jq is needed, run the comment-post in a separate Alpine step
  • Existing 60000-char truncation logic must be preserved

Checklist

  • Reviewed via /review-ticket and moved backlog → todo
  • Comment-posting block rewritten with safe JSON encoding
  • Apply step audited
  • PR opened with Closes #THIS
  • Verified on a real PR that previously triggered the bug (e.g., re-run pipeline 442 on a synthetic branch)
  • Last force-merge driven by this bug: PR #304 (2026-04-25)
  • feedback_ci_pipeline_lessons — 12 prior fixes; this is the next one
  • Memory: feedback_discovered_scope_always_tracked, feedback_no_workarounds_just_fix
### Type Bug ### Lineage Long-standing CI infra issue — repeatedly forces force-merges on infra PRs because the failing step is the PR-comment-posting wrapper, NOT the actual `tofu plan` (which succeeds). Surfaced again 2026-04-25 on PR #304 (admin-app Postgres Job). User authorized force-merge as workaround pending this fix. ### Repo forgejo_admin/pal-e-platform ### What Broke The `plan` step in `.woodpecker.yaml` (line 173) uses fragile nested shell quoting to build a JSON payload for the PR comment containing the plan output: ```sh printf '{"body":"%s"}' "$$(echo "$$BODY" | sed 's/"/\\"/g' | sed ':a;N;$$!ba;s/\n/\\n/g')" > /tmp/comment-payload.json ``` When `tofu plan` output contains certain quote patterns (e.g., Kubernetes metadata labels like `"app.kubernetes.io/managed-by" = "pal-e-platform-terraform"`, sensitive-value markers, multi-line resource diffs), the nested sed sequence inside `$$(...)` produces an unterminated quoted string in the wrapping shell, causing: ``` /bin/sh: syntax error: unterminated quoted string ``` The step exits with code 2 → pipeline marked `failure` → forgejo combined commit status `failure` → `check-infra-validation.sh` hook blocks merge. **The actual plan succeeds** — `Plan: 4 to add, 1 to change, 0 to destroy` is visible in the logs above the syntax error. The infrastructure change is sound. Only the comment-posting wrapper is broken. ### Repro Steps 1. Open a PR against `forgejo_admin/pal-e-platform` whose `tofu plan` output contains kubernetes resource diffs with labels (e.g., any kubernetes_secret_v1, kubernetes_namespace_v1, kubernetes_service_v1 addition) 2. Wait for Woodpecker pipeline to run on the PR 3. Observe pipeline 442 (PR #304 of 2026-04-25) as a reference repro: https://woodpecker.tail5b443a.ts.net/repos/24/pipeline/442/1 4. `plan` step shows `tofu plan` ran successfully then crashes on `/bin/sh: syntax error: unterminated quoted string` 5. `check-infra-validation.sh` hook blocks PR merge despite the plan being valid ### Expected Behavior - The `plan` step posts the plan output as a PR comment regardless of quote characters in the plan body - Step exit code reflects whether the actual `tofu plan` succeeded, NOT whether the comment-posting wrapper had a quoting accident - Infra PRs with sound `tofu plan` output can merge through the standard hook gate ### Environment - pal-e-platform `.woodpecker.yaml` (line 173 specifically; also potentially line 162 `tr '\n' ', '`) - Woodpecker CI on tail5b443a - Tofu image `ghcr.io/opentofu/opentofu:1.9` (uses Alpine `/bin/sh`, not bash — fewer escape niceties) - Triggers consistently on PRs that introduce new k8s resources ### User Story story:platform-stability — As an operator merging infra PRs, I need the Woodpecker plan step to fail only when the actual `tofu plan` fails, not when the comment-posting wrapper hits a shell quoting edge case, so that valid infra changes don't require force-merge bypasses that defeat the safety the hook is meant to enforce. ### Context Pattern repeats: `feedback_ci_pipeline_lessons` lists 12 prior root causes already fixed in this CI; this is the 13th. The escaping approach in the comment-posting block is the architectural smell. Better options: - **Replace the sed-pipeline with `jq -Rs '.'`** to JSON-encode the plan output safely (jq is reliable for this; the fragile bit is the hand-rolled escape). - **Write the JSON payload via `printf` to a heredoc** then post — let the shell handle the bytes raw, encode at the JSON layer. - **Switch `image: alpine` for the comment-posting sub-step** and install `jq` once — separate concerns from the tofu container. - **Use `wget --post-data` with URL-encoding** rather than building JSON by hand — simpler shape. ### File Targets - `~/pal-e-platform/.woodpecker.yaml` line 168-178 (the `BODY=` / `printf` / `wget` block in the `plan` step) - Possibly: same pattern duplicated in the `apply` step (verify by inspection) ### Acceptance Criteria - [ ] Plan step posts PR comment for any tofu plan output, including outputs containing nested quotes, k8s label maps, sensitive-value markers, multi-line resource diffs - [ ] Plan step exit code reflects ONLY the success/failure of `tofu plan` itself, not the comment-posting wrapper - [ ] Apply step (line 182+) audited for the same pattern; fixed if present - [ ] Verified end-to-end on a fresh PR that adds a kubernetes_secret_v1 (matches the failing pattern) ### Test Expectations - Test PR with kubernetes_secret_v1 addition runs through plan step without `unterminated quoted string` error - PR comment is posted with full plan output (or properly truncated to 60000 chars per existing logic) - `check-infra-validation.sh` hook reports `success` for the plan step ### Constraints - Don't bypass plan validation — the actual gate (was `tofu plan` clean?) must remain enforced - Don't introduce new dependencies in the tofu image — if `jq` is needed, run the comment-post in a separate Alpine step - Existing 60000-char truncation logic must be preserved ### Checklist - [ ] Reviewed via /review-ticket and moved backlog → todo - [ ] Comment-posting block rewritten with safe JSON encoding - [ ] Apply step audited - [ ] PR opened with `Closes #THIS` - [ ] Verified on a real PR that previously triggered the bug (e.g., re-run pipeline 442 on a synthetic branch) ### Related - Last force-merge driven by this bug: PR #304 (2026-04-25) - `feedback_ci_pipeline_lessons` — 12 prior fixes; this is the next one - Memory: `feedback_discovered_scope_always_tracked`, `feedback_no_workarounds_just_fix`
Author
Contributor

PR opened: forgejo_admin/pal-e-platform#309

Fix uses approach #1 from the ticket Context section: replaced the sed-based JSON escaping in the plan step's comment-posting wrapper with jq -n --rawfile body /tmp/comment-body.txt '{body: $body}'. Added apk add --no-cache jq since the opentofu image (Alpine-based) does not ship jq.

Apply step audit: the apply step runs on event: push to main and posts no PR comment, so no parallel fix needed. The cross-pillar-review step already uses jq correctly.

Verification will land on this PR's own CI run — watch for the comment to post cleanly.

PR opened: https://forgejo.tail5b443a.ts.net/forgejo_admin/pal-e-platform/pulls/309 Fix uses approach #1 from the ticket Context section: replaced the sed-based JSON escaping in the plan step's comment-posting wrapper with `jq -n --rawfile body /tmp/comment-body.txt '{body: $body}'`. Added `apk add --no-cache jq` since the opentofu image (Alpine-based) does not ship jq. Apply step audit: the `apply` step runs on `event: push` to main and posts no PR comment, so no parallel fix needed. The `cross-pillar-review` step already uses jq correctly. Verification will land on this PR's own CI run — watch for the comment to post cleanly.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/pal-e-platform#307
No description provided.