Bug: Woodpecker CI clone fails — can't reach Forgejo internal URL #121
Reference: forgejo_admin/pal-e-platform#121
Type
Bug
Lineage
plan-pal-e-platform → Platform Hardening — standalone, discovered during operations
Repo
forgejo_admin/pal-e-platform
What Broke
Every Woodpecker pipeline fails at the clone step with:
All pipelines (#131-#140) have status `error` or `failure`. No CI checks run, PRs can't pass required checks, and apply-on-merge is dead.
Repro Steps
pal-e-platform
Expected Behavior
Clone step connects to `forgejo-http.forgejo.svc.cluster.local:80`, clones the repo, and the pipeline proceeds to the `tofu plan` / `tofu apply` steps.
Environment
`woodpecker` namespace (agent), unknown namespace (pipeline pods); agent pod `woodpecker-agent-0`
Investigation so far
- Service `forgejo-http`: ClusterIP 10.43.106.198:80, endpoint 10.42.0.28:80
- Forgejo NetworkPolicy allows ingress from the `woodpecker` namespace (matched via the `kubernetes.io/metadata.name` label)
- If pipeline pods run in a different namespace (e.g. `woodpecker-pipelines`), they won't match the NetworkPolicy. Need to check Helm values for `WOODPECKER_BACKEND_K8S_NAMESPACE`.

Acceptance Criteria
Related
pal-e-platform — project board

Scope Review: NEEDS_REFINEMENT
Review note: `review-221-2026-03-21`
Investigation hypothesis is incorrect — `WOODPECKER_BACKEND_K8S_NAMESPACE = "woodpecker"` is explicitly set and the Forgejo NetworkPolicy already allows ingress from the `woodpecker` namespace. Root cause is NOT a namespace mismatch.

Root Cause Found
Forgejo listens on IPv6 only (`[::]:80` and `[::]:2222`). No IPv4 LISTEN sockets exist. Verified via `/proc/net/tcp6` inside the pod.
When pods from other namespaces connect to the Forgejo ClusterIP (`10.43.106.198:80`) or the direct pod IP (`10.42.0.28:80`) via IPv4, the connection is refused because the process only accepts IPv6 connections.
Why the Tailscale proxy and kubelet work: they connect through the pod's network namespace, where IPv4-mapped IPv6 works. CNI-routed cross-pod traffic takes a different path where the IPv4→IPv6 mapping doesn't apply.
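The `/proc/net/tcp6` check mentioned above can be decoded without extra tooling. A minimal sketch — the sample line is a stand-in for what a Forgejo pod would show, not the actual capture:

```python
# /proc/net/tcp6 entries encode local_address as hex "ADDR:PORT";
# state 0A means LISTEN. Sample line stands in for the pod capture.
sample = "   0: 00000000000000000000000000000000:0050 00000000000000000000000000000000:0000 0A"

fields = sample.split()
local_addr, local_port_hex = fields[1].rsplit(":", 1)
state = fields[3]

print("port:", int(local_port_hex, 16))               # 0x0050 -> 80
print("listening:", state == "0A")                    # 0A == TCP LISTEN
print("wildcard v6 addr:", set(local_addr) == {"0"})  # all-zero address == [::]
```

An all-zero 128-bit address with state `0A` is exactly the `[::]:80` listener described above; the absence of a matching entry in `/proc/net/tcp` is what proves there is no IPv4 socket.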
The fix is one of:
- Bind on `0.0.0.0:80` (IPv4) in addition to `[::]:80` — this is likely a Gitea/Forgejo `app.ini` setting (`HTTP_ADDR`)
- Set the `net.ipv6.bindv6only=0` sysctl on the pod/node so IPv6 sockets accept IPv4

Corrected hypothesis: This is NOT a NetworkPolicy issue, NOT a DNS issue. It's an IPv4/IPv6 dual-stack binding issue in the Forgejo container. The previous hypothesis about the pipeline pod namespace was wrong.
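The dual-stack binding behavior described above can be reproduced in isolation. A minimal sketch using plain Python sockets (not Forgejo itself): a `[::]` listener with `IPV6_V6ONLY=1` refuses IPv4 clients, while `IPV6_V6ONLY=0` accepts them via IPv4-mapped addresses:

```python
import socket

def ipv4_reachable(v6only: bool) -> bool:
    """Listen on [::] with the given IPV6_V6ONLY setting, then try an IPv4 connect."""
    srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, int(v6only))
    srv.bind(("::", 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    try:
        # IPv4 loopback connect, like cross-pod traffic arriving over IPv4
        socket.create_connection(("127.0.0.1", port), timeout=1).close()
        return True
    except OSError:
        return False
    finally:
        srv.close()

print("v6only=1, ipv4 reachable:", ipv4_reachable(True))   # the observed Forgejo behavior
print("v6only=0, ipv4 reachable:", ipv4_reachable(False))  # dual-stack behavior
```

This is the same effect the `net.ipv6.bindv6only` sysctl controls as a default for sockets that don't set the option explicitly.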
File targets:
- `terraform/main.tf` — Forgejo Helm values, look for `HTTP_ADDR` or `PROTOCOL` settings
- `app.ini` ConfigMap

Blast radius: This also explains why the blackbox probe for Forgejo might intermittently fail — the probe connects via IPv4 to the internal URL.
Scope Review: READY (v2)
Review note: `review-221-2026-03-21-v2`
Root cause correction validated: Forgejo binds IPv6-only (`[::]:80`), no IPv4 LISTEN sockets. The fix is adding `HTTP_ADDR = "0.0.0.0"` to `gitea.config.server` in `terraform/main.tf` (lines 626-630). All file targets verified against the codebase. Blast radius is positive — it also fixes the blackbox probe and Woodpecker→Forgejo API connectivity.
Root Cause Update (2026-03-21)
NOT an IPv4/IPv6 issue. The real root cause is a stale kube-router ipset sync for short-lived pods.
Evidence
- Forgejo listens on `[::]:80` and responds to `127.0.0.1:80` from inside the pod
- The `KUBE-SRC-YXJOFHSE3SDW2PSE` ipset (woodpecker namespace) contains only 3 entries: the 3 long-running Woodpecker pods. Pipeline pods get new IPs that are not in this set.

Temporary fix applied
Deleted the `default-deny-ingress` and `allow-woodpecker-to-forgejo` NetworkPolicies from the forgejo namespace. CI pipeline #145 triggered to verify.
This is a security regression — the forgejo namespace now accepts all cluster traffic. It must be re-secured after kube-router is fixed.
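When re-securing, the deleted allow policy will need to be restored. A hedged sketch of what `allow-woodpecker-to-forgejo` presumably looked like — the pod selector and port are assumptions reconstructed from the investigation notes, not the original manifest:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-woodpecker-to-forgejo
  namespace: forgejo
spec:
  podSelector: {}            # assumed: applies to all forgejo pods
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: woodpecker  # per investigation notes
      ports:
        - protocol: TCP
          port: 80
```

Note that restoring this policy only helps once kube-router keeps the backing ipset in sync; with the sync broken, pipeline pod IPs would again fall outside the allowed set.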
Proper fix needed
New issue to scope: kube-router ipset sync is broken → investigate the k3s embedded kube-router, its sync interval, and a possible k3s restart or CNI switch.