Fix cache step falsely failing the CI pipeline #97

Merged
ldraney merged 1 commit from fix-cache-step-blocking-deploy into main 2026-06-04 12:21:39 +00:00
Owner

Summary

  • save-bundle-cache was marking the entire pipeline as failure even though build-and-push succeeded
  • Seven consecutive main-branch pipelines showed red despite images deploying fine via ArgoCD
  • Add failure: ignore since bundle cache is an optimization, not a deploy gate

Changes

  • .woodpecker.yaml: Add failure: ignore to save-bundle-cache step, mkdir -p /cache safety net, and ls -la /cache/ diagnostic output

Test Plan

  • Merge and verify next main-branch pipeline shows green
  • Check save-bundle-cache logs for ls -la /cache/ output to diagnose the mount issue

Review Checklist

  • No secrets committed
  • No unnecessary file changes
  • Commit messages are descriptive
  • ldraney/landscaping-assistant #94 — related deploy issue (fix merged but not deployed)
  • landscaping-assistant — project

Closes #96

## Summary - `save-bundle-cache` was marking the entire pipeline as failure even though `build-and-push` succeeded - Seven consecutive main-branch pipelines showed red despite images deploying fine via ArgoCD - Add `failure: ignore` since bundle cache is an optimization, not a deploy gate ## Changes - `.woodpecker.yaml`: Add `failure: ignore` to save-bundle-cache step, `mkdir -p /cache` safety net, and `ls -la /cache/` diagnostic output ## Test Plan - [ ] Merge and verify next main-branch pipeline shows green - [ ] Check save-bundle-cache logs for `ls -la /cache/` output to diagnose the mount issue ## Review Checklist - [ ] No secrets committed - [ ] No unnecessary file changes - [ ] Commit messages are descriptive ## Related Notes - `ldraney/landscaping-assistant #94` — related deploy issue (fix merged but not deployed) - `landscaping-assistant` — project Closes #96
Fix cache step falsely failing the CI pipeline
Some checks failed
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
c2a8621636
save-bundle-cache was marking the entire pipeline as failure even though
build-and-push succeeded. This blocked 7 merges from showing green despite
images deploying fine via ArgoCD. Add failure: ignore since cache is an
optimization, not a gate. Also add mkdir -p and mount diagnostics for
the underlying hostPath volume issue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #97 Review

DOMAIN REVIEW

Tech stack: Woodpecker CI pipeline (YAML). CI/CD domain review applied.

failure: ignore placement is correct. The save-bundle-cache step is a cache optimization, not a deploy gate. It runs after lint and test, and critically, build-and-push depends only on lint and test -- never on save-bundle-cache. Adding failure: ignore here means a cache-save failure no longer marks the pipeline red while images deploy successfully via ArgoCD. This directly addresses the seven consecutive false-failure pipelines described in the issue.

mkdir -p /cache is a reasonable safety net. The volume mount already specifies type: DirectoryOrCreate, so the host path should exist. However, if the volume mount itself fails to attach (the suspected root cause), mkdir -p /cache would create a local container directory instead -- cache would be written to ephemeral storage and lost, but the step would not error out. Combined with failure: ignore, this is defense-in-depth.

ls -la /cache/ diagnostic is good for root-cause investigation. The 2>&1 || echo "WARNING..." pattern ensures the step continues regardless. This will reveal whether /cache is the host volume or an empty local dir after mkdir -p. Consider removing this diagnostic line once the mount issue is diagnosed and resolved.

No symmetry issue with restore-bundle-cache. The restore step does NOT have failure: ignore. If the volume mount fails during restore, the restore step would also fail and block bundle-install. This is arguably correct (a failed restore is less forgiving since it gates the install), but worth noting as a potential follow-up if the mount issue persists. Out of scope for this PR.

BLOCKERS

None. This is a CI config fix with no application code, no secrets exposure, and no security surface.

NITS

  1. Diagnostic line is temporary. The ls -la /cache/ 2>&1 || echo "WARNING: /cache volume not mounted" line is useful now for debugging the mount issue. Once the root cause is found and fixed, consider removing it to keep the pipeline config clean. Not blocking -- leaving diagnostic output in CI steps is low-risk.

  2. restore-bundle-cache symmetry. If the volume mount issue is systemic, restore-bundle-cache could also benefit from failure: ignore. That said, a failed restore should probably block the pipeline (it means bundle-install runs without cache, which is slower but functional). This is a separate concern -- noting it for awareness only.

SOP COMPLIANCE

  • Branch named after issue -- branch is fix-cache-step-blocking-deploy, should be 96-fix-cache-step-blocking-deploy per convention
  • PR body follows template -- Summary, Changes, Test Plan, Related all present
  • Related references plan slug -- no plan slug referenced (acknowledged; none exists for this hotfix)
  • No secrets committed -- Harbor credentials use from_secret, no plaintext secrets
  • No unnecessary file changes -- single file changed, tightly scoped to the issue
  • Commit message is descriptive

PROCESS OBSERVATIONS

This is a hotfix to unblock CI visibility. Seven consecutive green-in-practice but red-in-CI pipelines is a meaningful DORA metric drag -- it erodes trust in pipeline status and delays detection of real failures. Merging this restores pipeline signal fidelity.

The branch naming miss (no issue number prefix) is a minor SOP gap. For hotfixes like this, the tradeoff is understandable -- speed of fix vs. naming convention. Not worth blocking over.

The missing plan slug is also expected for a reactive hotfix. No planning artifact was needed here.

VERDICT: APPROVED

Small, focused, correct fix. The failure: ignore directive is the right Woodpecker mechanism for a non-critical cache step. The diagnostic output and mkdir -p are sensible additions for investigating the underlying volume mount issue. No blockers found.

## PR #97 Review ### DOMAIN REVIEW **Tech stack**: Woodpecker CI pipeline (YAML). CI/CD domain review applied. **`failure: ignore` placement is correct.** The `save-bundle-cache` step is a cache optimization, not a deploy gate. It runs after `lint` and `test`, and critically, `build-and-push` depends only on `lint` and `test` -- never on `save-bundle-cache`. Adding `failure: ignore` here means a cache-save failure no longer marks the pipeline red while images deploy successfully via ArgoCD. This directly addresses the seven consecutive false-failure pipelines described in the issue. **`mkdir -p /cache` is a reasonable safety net.** The volume mount already specifies `type: DirectoryOrCreate`, so the host path should exist. However, if the volume mount itself fails to attach (the suspected root cause), `mkdir -p /cache` would create a local container directory instead -- cache would be written to ephemeral storage and lost, but the step would not error out. Combined with `failure: ignore`, this is defense-in-depth. **`ls -la /cache/` diagnostic is good for root-cause investigation.** The `2>&1 || echo "WARNING..."` pattern ensures the step continues regardless. This will reveal whether `/cache` is the host volume or an empty local dir after `mkdir -p`. Consider removing this diagnostic line once the mount issue is diagnosed and resolved. **No symmetry issue with `restore-bundle-cache`.** The restore step does NOT have `failure: ignore`. If the volume mount fails during restore, the restore step would also fail and block `bundle-install`. This is arguably correct (a failed restore is less forgiving since it gates the install), but worth noting as a potential follow-up if the mount issue persists. Out of scope for this PR. ### BLOCKERS None. This is a CI config fix with no application code, no secrets exposure, and no security surface. ### NITS 1. **Diagnostic line is temporary.** The `ls -la /cache/ 2>&1 || echo "WARNING: /cache volume not mounted"` line is useful now for debugging the mount issue. Once the root cause is found and fixed, consider removing it to keep the pipeline config clean. Not blocking -- leaving diagnostic output in CI steps is low-risk. 2. **`restore-bundle-cache` symmetry.** If the volume mount issue is systemic, `restore-bundle-cache` could also benefit from `failure: ignore`. That said, a failed restore should probably block the pipeline (it means bundle-install runs without cache, which is slower but functional). This is a separate concern -- noting it for awareness only. ### SOP COMPLIANCE - [ ] Branch named after issue -- branch is `fix-cache-step-blocking-deploy`, should be `96-fix-cache-step-blocking-deploy` per convention - [x] PR body follows template -- Summary, Changes, Test Plan, Related all present - [ ] Related references plan slug -- no plan slug referenced (acknowledged; none exists for this hotfix) - [x] No secrets committed -- Harbor credentials use `from_secret`, no plaintext secrets - [x] No unnecessary file changes -- single file changed, tightly scoped to the issue - [x] Commit message is descriptive ### PROCESS OBSERVATIONS This is a hotfix to unblock CI visibility. Seven consecutive green-in-practice but red-in-CI pipelines is a meaningful DORA metric drag -- it erodes trust in pipeline status and delays detection of real failures. Merging this restores pipeline signal fidelity. The branch naming miss (no issue number prefix) is a minor SOP gap. For hotfixes like this, the tradeoff is understandable -- speed of fix vs. naming convention. Not worth blocking over. The missing plan slug is also expected for a reactive hotfix. No planning artifact was needed here. ### VERDICT: APPROVED Small, focused, correct fix. The `failure: ignore` directive is the right Woodpecker mechanism for a non-critical cache step. The diagnostic output and `mkdir -p` are sensible additions for investigating the underlying volume mount issue. No blockers found.
ldraney deleted branch fix-cache-step-blocking-deploy 2026-06-04 12:21:39 +00:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/landscaping-assistant!97
No description provided.