fix: replace plugin-git clone with alpine/git + sleep for CI connectivity #93
Summary
All 108 pal-e-docs-app CI pipelines have failed at the clone step due to a race condition in k3s pod networking. The woodpeckerci/plugin-git image starts cloning immediately, before iptables ClusterIP rules propagate to the new pod's network namespace. This fix switches to the alpine/git + sleep 2 pattern used by every other working repo.
Changes
.woodpecker.yaml -- Replace the woodpeckerci/plugin-git clone step with the alpine/git image plus manual git commands preceded by a 2-second sleep, matching the proven pattern from basketball-api, pal-e-docs, and all other repos.
Investigation Findings
- pal-e-docs-app exists?
- pal-e-app?
- harbor-creds in namespace?
- woodpeckerci/plugin-git uses curl/libcurl, which hits the k3s iptables race; wget in the same pod succeeds; adding sleep 2 before the git commands resolves it
Root Cause Detail
Tested from within the woodpecker namespace:
- wget http://forgejo-http.forgejo.svc.cluster.local/ -- succeeds immediately
- nc -vz 10.43.106.198 80 -- succeeds immediately
- curl http://forgejo-http.forgejo.svc.cluster.local/ -- "Connection refused" after 2 ms
- sleep 3 && git ls-remote ... -- succeeds
The k3s ClusterIP iptables rules need a brief window to propagate to newly scheduled pods. The sleep 2 workaround is already in use by every other repo in the fleet.
Test Plan
pal-e-docs-app/app
Review Checklist
Related Notes
- forgejo_admin/pal-e-docs-app #92 -- the Forgejo issue this PR implements
- forgejo_admin/pal-e-platform #234 -- parent tracking issue (ImagePullBackOff investigation)
Closes #92
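
The fixed delay described under Root Cause Detail could also be written as a bounded retry, so that the clone step waits only as long as the pod network actually needs. The following is a minimal POSIX shell sketch and not what this PR ships: the retry helper, its argument order, and the CLONE_URL placeholder are all hypothetical names introduced for illustration.

```shell
#!/bin/sh
# retry MAX DELAY CMD...
# Hypothetical helper (not part of this PR): run CMD until it succeeds,
# sleeping DELAY seconds between tries, and give up after MAX attempts.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Possible use before the manual git commands (CLONE_URL is a placeholder):
#   retry 5 1 git ls-remote "$CLONE_URL" >/dev/null
```

Compared with an unconditional sleep 2, a loop like this returns as soon as the first probe succeeds and fails loudly if the service never becomes reachable. Whether that is worth the extra lines in every .woodpecker.yaml is a fleet-level decision.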
PR #93 Review
DOMAIN REVIEW
Tech stack: Woodpecker CI YAML (CI/CD pipeline configuration).
Clone pattern consistency verified. I compared this PR's clone step against every .woodpecker.yaml in the fleet. There are two variants:
- alpine/git + 5 commands
- FORGEJO_TOKEN + .netrc setup
pal-e-docs-app is a public repo ("private": false confirmed via Forgejo API). This PR correctly uses the no-auth variant, matching westside-contracts, westside-app, basketball-api, minio-api, and pal-e-mail exactly.
Command sequence is identical to fleet standard:
All five commands match the canonical public-repo pattern character-for-character. The ${CI_REPO} variable correctly replaces the previously hardcoded forgejo_admin/pal-e-docs-app.git path, which is a minor improvement (the old plugin-git step hardcoded the repo path in settings.remote).
Root cause analysis in PR body is thorough. The investigation table documenting the Harbor project state, curl vs wget behavior, and iptables race condition is excellent operational documentation.
BLOCKERS
None.
This is a 1-file CI config change. No application code, no secrets, no user input handling, no test coverage requirement. The change is a direct port of a proven pattern already running in 5+ repos.
NITS
The clone step uses alpine/git (no tag), which resolves to latest. Other repos in the fleet also do this, so it is consistent -- but worth noting as fleet-wide tech debt. A pinned tag (e.g., alpine/git:2.43) would prevent surprise breakage if the upstream image changes. Non-blocking since the entire fleet has this same exposure.
SOP COMPLIANCE
Branch 92-fix-replace-plugin-git-clone-with-alpine references issue #92
PROCESS OBSERVATIONS
The sleep 2 workaround is already fleet-standard. A future fleet-wide fix (e.g., Woodpecker init container with retry) would eliminate this class of failure entirely, but that is out of scope here.
VERDICT: APPROVED