Kaniko build-and-push intermittently fails: cluster-internal Harbor unreachable #82
Type
Bug
Lineage
Related to ldraney/landscaping-assistant #23 (CI optimization) and #77 (build-arg regression). This is the deeper root cause that #77's fixes only partially address.
Repo
ldraney/landscaping-assistant
What Broke
Kaniko build-and-push step intermittently fails to connect to harbor.harbor.svc.cluster.local. The failure pattern:
The --insecure-pull flag (PR #79) does not skip the HTTPS attempt; it only allows HTTP fallback. By the time the HTTPS probe times out (~30s), the HTTP connection is also refused.
Meanwhile, regular pods in the woodpecker namespace (busybox, test pods) can reach Harbor on port 80 without issues. The problem is specific to Kaniko's connection behavior after the HTTPS timeout.
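To compare what a regular pod sees against the HTTPS-then-HTTP order described above, a small probe can time a TCP connect to port 443 and then port 80. This is only a sketch: it checks raw TCP reachability, not Kaniko's actual registry client, and the function names and the 5-second default timeout are assumptions made for illustration.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class ProbeResult:
    port: int
    ok: bool
    elapsed_s: float
    error: str


def probe(host: str, port: int, timeout_s: float = 5.0) -> ProbeResult:
    """Attempt a single TCP connect and record the outcome and elapsed time."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return ProbeResult(port, True, time.monotonic() - start, "")
    except OSError as exc:
        # Covers timeouts, refused connections, and DNS resolution failures.
        detail = f"{type(exc).__name__}: {exc}"
        return ProbeResult(port, False, time.monotonic() - start, detail)


def probe_https_then_http(host: str, timeout_s: float = 5.0) -> list[ProbeResult]:
    """Probe port 443 first, then port 80, mirroring the order in the issue."""
    return [probe(host, 443, timeout_s), probe(host, 80, timeout_s)]
```

Run from a pod in the woodpecker namespace, `probe_https_then_http("harbor.harbor.svc.cluster.local")` would show whether port 80 is still reachable right after the 443 attempt fails, and how long each attempt took.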
Successful builds (#164, #161, #147) used the Tailscale FQDN for pulls and the cluster-internal address for pushes, but the FQDN path also fails intermittently via DERP relay drops.
Repro Steps
build-and-push starts: ~50% chance of failure
Expected Behavior
Kaniko reliably pulls base images and pushes to Harbor on every pipeline run.
Environment
Investigation Notes
What was tried:
--insecure-pull → still tries HTTPS first, times out
Possible fixes (not yet tried):
--registry-mirror with HTTP mirror for cluster-internal Harbor
Auto-cancel interference: Rapid merges to main cause pipelines to supersede each other (pipeline #184 was canceled by #185). Issue #62 tracks Woodpecker agent scaling, which would help.
Acceptance Criteria
Related
ldraney/landscaping-assistant #23: parent CI optimization issue
ldraney/landscaping-assistant #62: Woodpecker agent scaling
ldraney/landscaping-assistant #77: build-arg regression (partially addressed)
landscaping-assistant: project this affects