Woodpecker agent label routing — platform-wide pipeline contract #191
Type
Feature
Lineage
Discovered during incident #184 investigation. Mac agent was root cause of "random" CI failures across all repos.
Repo
forgejo_admin/pal-e-platform (primary: Helm values + this repo's pipeline)
Cross-repo: all repos with .woodpecker.yaml need label updates.
User Story
As a platform operator
I want pipelines routed to capable agents via label contracts
So that adding a second agent (Mac, ARM, etc.) doesn't cause random CI failures across all repos
Context
Woodpecker uses work-stealing scheduling. When the Mac agent (lucass-macbook-air-1, agent ID 3, local backend) registered with no_schedule = false, it began racing the k8s agent for every queued workflow. Every pipeline it won failed: no Forgejo git credentials, no Harbor access, no container backend.
Agent-side WOODPECKER_FILTER_LABELS only restricts what an agent accepts. But if pipelines declare no labels: constraint, they match ANY agent. The platform had no routing contract; the k8s agent has no filter labels either.
Additionally, WOODPECKER_CONNECT_RETRY_COUNT=1 makes the k8s agent fragile during pod restart cascades (DB → server → agent ordering).
Immediate mitigation (done 2026-03-27): Mac agent set to no_schedule = true in DB.
File Targets
Files to modify:
terraform/main.tf - k8s agent Helm values: add WOODPECKER_FILTER_LABELS, bump CONNECT_RETRY_COUNT
.woodpecker.yaml - add labels: { platform: linux } to all workflows
Files NOT to touch:
salt/pillar/mac-agent.sls - already has filter_labels: "platform=darwin" (correct)
salt/states/mac-agent/ - plist template already renders WOODPECKER_FILTER_LABELS (correct)
Cross-repo follow-ups (separate issues):
basketball-api/.woodpecker.yaml
pal-e-deployments/.woodpecker.yaml
Acceptance Criteria
WOODPECKER_FILTER_LABELS=platform=linux in Helm values
WOODPECKER_CONNECT_RETRY_COUNT=10
pal-e-platform/.woodpecker.yaml has labels: { platform: linux } on all workflows
tofu plan shows only agent env changes (no unrelated drift)
convention-pipeline-labels documents the routing contract
no_schedule = false) and verified: unlabeled pipelines don't route to it
Test Expectations
tofu validate passes
tofu plan -lock=false shows expected agent env changes only
tofu plan -lock=false in terraform/
Constraints
tofu not terraform
tofu plan must include -lock=false (state lock blocks CI)
labels: syntax: labels: { platform: linux } at workflow level in .woodpecker.yaml
Checklist
Related
project-pal-e-platform
Scope Review: READY
Review note: review-425-2026-03-26
All file targets verified, template complete, traceability triangle satisfied. Scope is solid: ready to move to next_up.
One sequencing note for the implementation agent: cross-repo .woodpecker.yaml label updates must land before or simultaneously with the k8s agent WOODPECKER_FILTER_LABELS change, otherwise unlabeled pipelines will match no agent.
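Given the sequencing concern, it may help to check label coverage across repos before changing the agent filter. Below is a minimal Python sketch, not an existing tool in this repo. It assumes each .woodpecker.yaml has already been parsed into a dictionary (for example with a YAML library). The function names and the sample workflow contents are hypothetical; only the file paths and the labels: { platform: linux } contract come from this issue.

```python
from typing import Any, Dict, List, Mapping

# Routing contract from this issue: labels: { platform: linux }
REQUIRED_LABELS: Dict[str, str] = {"platform": "linux"}


def missing_labels(
    workflow: Mapping[str, Any],
    required: Mapping[str, str] = REQUIRED_LABELS,
) -> Dict[str, str]:
    """Return the required labels the workflow does not declare with the expected value."""
    # A workflow with no `labels:` key (or an empty one) declares nothing.
    declared = workflow.get("labels") or {}
    return {key: value for key, value in required.items() if declared.get(key) != value}


def unlabeled_workflows(workflows: Mapping[str, Mapping[str, Any]]) -> List[str]:
    """Return the sorted paths of workflows that do not satisfy the contract."""
    return sorted(path for path, wf in workflows.items() if missing_labels(wf))


if __name__ == "__main__":
    # Hypothetical parsed contents; the real files would be loaded from each repo.
    parsed = {
        "pal-e-platform/.woodpecker.yaml": {"labels": {"platform": "linux"}, "steps": []},
        "basketball-api/.woodpecker.yaml": {"steps": []},
        "pal-e-deployments/.woodpecker.yaml": {"labels": {"platform": "darwin"}},
    }
    for path in unlabeled_workflows(parsed):
        print(f"missing or wrong platform label: {path}")
```

An empty result from unlabeled_workflows would suggest the cross-repo label updates have landed and the WOODPECKER_FILTER_LABELS change can follow.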