infra: Salt-manage k3s unit file with maxPods=250 #332
Reference
ldraney/pal-e-platform#332
Type
Feature
Lineage
Standalone — discovered during westside-admin-dev overlay setup (2026-05-04). Node hit 110 pod ceiling, fixed live, needs codification.
Repo
forgejo_admin/pal-e-platform
User Story
As a platform operator
I want the k3s maxPods limit codified in Salt
So that it survives reprovisioning and is auditable as infrastructure-as-code
Context
archbox hit the default kubelet maxPods: 110 ceiling with 125 pods running (38 Tailscale proxies alone). The limit was raised to 250 via direct edit to /etc/systemd/system/k3s.service on 2026-05-04. Currently salt/states/k3s/init.sls only verifies that the binary exists and the service is running; it explicitly does NOT manage the unit file. This ticket changes that by templating the unit file through Salt, with kubelet args configurable via pillar.
File Targets
Files to modify or create:
- salt/states/k3s/init.sls: convert from verify-only to managed unit file
- salt/states/k3s/k3s.service.j2: new Jinja2 template for the systemd unit
- salt/pillar/k3s.sls: new pillar file with k3s config values
- salt/pillar/top.sls: include new k3s pillar
Files NOT to touch:
- terraform/: this is Salt-layer (system config), not Terraform-layer (k8s resources)
Acceptance Criteria
- Unit file managed via file.managed
- --kubelet-arg=max-pods=250 rendered in ExecStart args
- max-pods value configurable via pillar:k3s:kubelet:max_pods
- service.running watches the unit file (auto-restart on change)
Test Expectations
- salt-call state.show_sls k3s renders without errors
- salt-call state.apply k3s test=True shows no changes (matches live state)
- sudo salt-call state.apply k3s test=True
Constraints
- Template must match the live /etc/systemd/system/k3s.service exactly (including --disable=traefik)
- cmd.run for systemctl daemon-reload triggered by unit file changes
- See salt/states/firewall/init.sls for the template+service pattern
Checklist
- salt-call state.apply k3s test=True shows no diff
Related
- project-pal-e-platform: platform infrastructure
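Sketch
A minimal sketch of how salt/states/k3s/init.sls could wire together the three pieces named in the acceptance criteria and constraints: file.managed for the unit file, cmd.run for systemctl daemon-reload, and service.running with a watch. The state IDs (k3s_unit_file, k3s_daemon_reload, k3s_service), the salt:// source path, the file ownership and mode, and the service name k3s are assumptions for illustration, not taken from the repo. The template body is deliberately omitted because it has to be copied from the live unit file.

```yaml
# salt/states/k3s/init.sls (illustrative sketch; state IDs are hypothetical)

# Assumed pillar layout in salt/pillar/k3s.sls, following the key
# pillar:k3s:kubelet:max_pods from the acceptance criteria:
#
#   k3s:
#     kubelet:
#       max_pods: 250

k3s_unit_file:
  file.managed:
    - name: /etc/systemd/system/k3s.service
    # Assumes file_roots points at salt/states, so the template resolves here.
    - source: salt://k3s/k3s.service.j2
    # The template is assumed to read k3s:kubelet:max_pods from pillar itself.
    - template: jinja
    - user: root
    - group: root
    - mode: '0644'

k3s_daemon_reload:
  cmd.run:
    - name: systemctl daemon-reload
    # Only runs when the unit file actually changed.
    - onchanges:
      - file: k3s_unit_file

k3s_service:
  service.running:
    - name: k3s
    # Make sure systemd has reloaded before the service is (re)started.
    - require:
      - cmd: k3s_daemon_reload
    # A change to the unit file triggers a restart of the service.
    - watch:
      - file: k3s_unit_file
```

With this shape, a no-op run leaves all three states unchanged, which is what the test=True expectation checks for, provided the rendered template is byte-identical to the live unit file.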