Bug: kube-router ipset sync stale — NetworkPolicy blocks short-lived pods #127
Type
Bug
Lineage
plan-pal-e-platform → Platform Hardening (standalone, discovered during CI investigation)
Repo
forgejo_admin/pal-e-platform
What Broke
kube-router (embedded in k3s) maintains ipsets for NetworkPolicy source selectors. These ipsets contain pod IPs that should be allowed through. The ipsets are stale — they contain IPs of long-running pods but do NOT add newly created short-lived pods (Woodpecker pipeline containers, kubectl run test pods).
This causes all NetworkPolicy-protected namespaces to reject traffic from new pods, even when the policy explicitly allows the source namespace. Currently the forgejo namespace NetworkPolicies have been temporarily deleted as a workaround — this is a security regression that must be reversed.
Repro Steps
kubectl run test --image=alpine/curl -n X -- curl http://target-svc:80/
Expected Behavior
kube-router should add pod IPs to namespace ipsets immediately when pods are created, and remove them when pods are deleted. Short-lived pods should be able to connect to NetworkPolicy-protected services.
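One way to check this expectation directly is to look for the pod's IP in the node's ipsets. The helper below is a minimal sketch, not part of kube-router: the name ipset_has_ip is hypothetical, and it assumes member lines in the `ipset list` output begin with the address (optionally followed by fields such as a timeout).

```shell
# Hypothetical helper: succeeds if the exact IP in $1 appears as the first
# field of any line read from stdin (intended input: `ipset list` output).
ipset_has_ip() {
    ip=$1
    # Exact first-field match, so 10.42.0.15 does not match 10.42.0.150.
    awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }'
}

# Usage on a k3s node (needs root; X and test are the placeholders from the repro step):
#   pod_ip=$(kubectl get pod test -n X -o jsonpath='{.status.podIP}')
#   ipset list | ipset_has_ip "$pod_ip" && echo "pod IP present in an ipset"
```

The check only says that the IP is in some ipset on the node; it does not identify which NetworkPolicy the set belongs to.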
Environment
/usr/local/bin/k3s --version
bug-kube-router-ipset-empty (#157, marked done but issue resurfaced)
Acceptance Criteria
Related
pal-e-platform: project board
Closing: Diagnosis Was Wrong
kube-router ipset sync is working correctly. The issue was with my test methodology:
kubectl run --rm -it pods are so short-lived they execute BEFORE kube-router adds their IP to the ipset (~5s sync interval)
The forgejo NetworkPolicy was recreated by tofu apply and is correctly allowing woodpecker, tailscale, and monitoring namespaces. No security regression.
Root cause of the original CI clone failure was the Forgejo IPv4 binding issue (PR #124) combined with the internal URL clone override (PR #118, now reverted). Both are fixed.
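A test pod that avoids the timing problem described above could wait past the ~5s sync interval before probing. The following is a minimal sketch: probe_script is a hypothetical helper, the 10-second default wait is an assumption chosen to exceed that interval, and X and target-svc are the placeholders from the repro step.

```shell
# Hypothetical helper: print the command line a test pod should run,
# i.e. sleep first so kube-router can add the pod IP, then probe once.
probe_script() {
    url=$1
    wait_secs=${2:-10}   # assumed default; should exceed the ipset sync interval
    printf 'sleep %s; curl -sS --max-time 5 %s' "$wait_secs" "$url"
}

# Usage against a cluster:
#   kubectl run test --image=alpine/curl -n X --rm -i --restart=Never -- \
#       sh -c "$(probe_script http://target-svc:80/)"
```

If the probe still fails after the wait, the NetworkPolicy or the ipset contents are worth inspecting. A failure without the wait says little either way.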