Validate session 2026-03-28 merges (3 merged + 1 pending) #223

Open
opened 2026-03-28 12:09:02 +00:00 by forgejo_admin · 3 comments

Type

Task

Scope

Validate 3 merged PRs and 1 pending PR from session 2026-03-28. Woodpecker pipeline #350 shows FAILURE on staging namespace push.

Lineage: Validation audit — session 2026-03-28 pipeline gap

User Story: As a platform operator, I want to verify that merged code is deployed and working, so that done means done.

PRs merged:

  • #216
  • #217
  • #218

PR pending:

  • #222: blocked until CI is verified green

CI status: Pipeline #350 FAILURE (staging namespace push)

Test Expectations: Pipeline green on Woodpecker, tofu plan -lock=false runs clean, tofu apply succeeds, no infrastructure drift.

Constraints: Do NOT merge PR #222 until pal-e-platform CI is verified green. Diagnose before fixing: pull the logs first. All tofu commands must use -lock=false to avoid blocking CI.

Acceptance Criteria

  • Pipeline #350 failure diagnosed and root cause documented
  • tofu plan succeeds cleanly (with -lock=false)
  • tofu apply succeeds or path to success is clear
  • PR #222 merged after CI is verified green
  • Pipeline verified
  • Deployment confirmed
  • Features validated

Related

  • project-pal-e-platform
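The "tofu plan runs clean, no infrastructure drift" expectation above could be made machine-checkable with the -detailed-exitcode flag of tofu plan, which exits 0 when nothing would change, 1 on error, and 2 when changes are pending. This is a minimal sketch, assuming it runs from the root module directory. The function names are illustrative and do not come from the pal-e-platform repo.

```shell
# Sketch only: function names are hypothetical, not taken from pal-e-platform.

# Map the exit code of `tofu plan -detailed-exitcode` to a short status word.
plan_status() {
  case "$1" in
    0) echo "clean" ;;   # plan succeeded, no changes
    2) echo "drift" ;;   # plan succeeded, changes are pending
    *) echo "error" ;;   # the plan itself failed
  esac
}

# Run the plan without taking the state lock, as the ticket requires,
# and print the status word. Returns non-zero unless the plan is clean.
check_drift() {
  local rc=0 state
  tofu plan -lock=false -input=false -detailed-exitcode >/dev/null || rc=$?
  state="$(plan_status "$rc")"
  echo "$state"
  [ "$state" = "clean" ]
}
```

A CI step or an operator on archbox might then call `check_drift || echo "plan is not clean"`. Only `plan_status` is pure; `check_drift` needs tofu and initialized state to do anything useful.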
Author
Owner

Initial investigation complete. Symptom: tofu apply hangs on staging namespace creation. The root cause is not yet confirmed.

The staging module (PR #218) creates a kubernetes_namespace_v1 resource but may be missing depends_on for the core modules. The apply step's output was captured to /tmp/apply-output.txt, but the actual error is not visible in the Woodpecker logs: the process appears to hang rather than exit with an error.

Possible causes:

  1. Missing depends_on = [module.networking, module.monitoring] in staging module
  2. CI service account may lack namespace-creation RBAC
  3. Network policy in network-policies.tf references staging namespace before it exists

This needs manual investigation with tofu plan -lock=false on archbox before dispatching a fix.
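Because the apply hangs instead of failing, two of the hypotheses above can be narrowed down from a shell. Wrapping the CI apply step in coreutils timeout would turn a hang into a distinguishable exit code (124), and kubectl auth can-i tests the RBAC hypothesis directly. This is a rough sketch under stated assumptions: the 10-minute default limit is arbitrary, and the service account namespace and name are placeholders, since the ticket does not name the CI service account.

```shell
# Sketch only: the time limit and service account names are placeholders.

# Classify the exit code of `timeout <limit> tofu apply ...`.
# GNU timeout exits with 124 when the command ran past the limit.
apply_outcome() {
  case "$1" in
    0)   echo "applied" ;;
    124) echo "hang" ;;
    *)   echo "error" ;;
  esac
}

# Run apply with a time limit, keeping the output for later inspection
# in the same file the pipeline already uses.
bounded_apply() {
  local limit="${1:-10m}" rc=0
  timeout "$limit" tofu apply -lock=false -input=false -auto-approve \
    >/tmp/apply-output.txt 2>&1 || rc=$?
  apply_outcome "$rc"
}

# Ask the API server whether a service account may create namespaces.
# $1 = service account namespace, $2 = service account name (both hypothetical).
can_create_namespaces() {
  kubectl auth can-i create namespaces \
    --as="system:serviceaccount:$1:$2"
}
```

If `bounded_apply` prints "hang", the tail of /tmp/apply-output.txt should show which resource was in progress. A "no" from `can_create_namespaces` would point at cause 2 rather than at a missing depends_on.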

Author
Owner

Scope Review: NEEDS_REFINEMENT

Review note: review-512-2026-03-27

Scope is well-structured but exceeds the 5-minute rule and has template gaps that need tightening before dispatch.

Issues found:

  • Decomposition needed: 7 AC across sequential phases with a manual gate (archbox tofu plan). Exceeds 5-AC and 5-minute thresholds. Recommend splitting into 3 tickets via template-board: (1) diagnose pipeline #350, (2) fix + apply staging module, (3) merge PR #222 + validate features.
  • Vague acceptance criteria: AC5 "Pipeline verified", AC6 "Deployment confirmed", AC7 "Features validated" lack specific verification commands. An agent cannot determine completion.
  • Missing template sections: ### Repo, ### Context, ### Test Expectations (separate section), ### Checklist
  • Undocumented dependencies: Board item #411 (Harbor CI timeout, in_progress) shares arch:ci-pipeline and could cause apply failures for unrelated reasons. Board item #515 (pal-e-deployments validation) is sequentially dependent on this ticket.
  • Human-in-the-loop gate: The investigation comment calls for "manual investigation with tofu plan -lock=false on archbox". Clarify which steps are manual-only and which are agent-dispatchable.

Recommend decomposition via template-board.

Author
Owner

Validation Investigation Findings (2026-03-28)

tofu plan result: 0 to add, 4 to change, 2 to destroy

Child ticket created: #224 — Apply 6 pending terraform changes
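The "6 pending changes" in the child ticket title is the sum of the counts in the plan result (0 + 4 + 2). A small helper could extract that total from a saved plan summary, so the closing condition for this ticket becomes a comparison against zero. This is an illustrative sketch; `pending_changes` is a hypothetical name, and it only counts the add, change, and destroy figures.

```shell
# Sketch only: pending_changes is a hypothetical helper.

# Sum the add/change/destroy counts found in a plan summary such as
# "0 to add, 4 to change, 2 to destroy". Prints 0 when no counts are found.
pending_changes() {
  echo "$1" | grep -oE '[0-9]+ to (add|change|destroy)' |
    awk '{ total += $1 } END { print total + 0 }'
}
```

For example, `pending_changes "0 to add, 4 to change, 2 to destroy"` prints 6. Text without any counts yields 0, so this helper should be paired with a check that the plan command itself succeeded.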

Key findings:

  • secrets.auto.tfvars was missing woodpecker_api_token and woodpecker_agent_secret (added locally)
  • Staging module resources are orphaned and are being destroyed
  • Woodpecker helm values need an update for the new token/secret
  • Blackbox probes need to switch to internal URLs
  • DORA exporter needs a new API token
  • MinIO signatures bucket policy needs consolidation
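The first finding, variables missing from secrets.auto.tfvars, is the kind of gap a pre-flight check could catch before tofu plan runs. The following is a minimal sketch, assuming one `name = value` assignment per line. `missing_tfvars` is a hypothetical helper, and it prints variable names only, never values.

```shell
# Sketch only: missing_tfvars is a hypothetical helper.

# Print each variable name that has no assignment in the given tfvars file.
# $1 = path to the tfvars file, remaining arguments = variable names.
missing_tfvars() {
  local file="$1" name
  shift
  for name in "$@"; do
    grep -qE "^[[:space:]]*${name}[[:space:]]*=" "$file" || echo "$name"
  done
}
```

A call such as `missing_tfvars secrets.auto.tfvars woodpecker_api_token woodpecker_agent_secret` prints nothing when both variables are assigned.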

This parent ticket is validated once #224 is applied and tofu plan shows 0 changes.

Reference
forgejo_admin/pal-e-platform#223