infra: Salt-manage k3s unit file with maxPods=250 #332

Closed
opened 2026-05-05 01:24:03 +00:00 by forgejo_admin · 0 comments
Contributor

Type

Feature

Lineage

Standalone — discovered during westside-admin-dev overlay setup (2026-05-04). Node hit 110 pod ceiling, fixed live, needs codification.

Repo

forgejo_admin/pal-e-platform

User Story

As a platform operator
I want the k3s maxPods limit codified in Salt
So that it survives reprovisioning and is auditable as infrastructure-as-code

Context

archbox hit the default kubelet maxPods: 110 ceiling with 125 pods running (38 of them Tailscale proxies). The limit was raised to 250 on 2026-05-04 by editing /etc/systemd/system/k3s.service directly. Currently salt/states/k3s/init.sls only verifies that the binary exists and the service is running; it explicitly does NOT manage the unit file. This ticket changes that by templating the unit file through Salt, with kubelet args configurable via pillar.

File Targets

Files to modify or create:

  • salt/states/k3s/init.sls — convert from verify-only to managed unit file
  • salt/states/k3s/k3s.service.j2 — new Jinja2 template for the systemd unit
  • salt/pillar/k3s.sls — new pillar file with k3s config values
  • salt/pillar/top.sls — include new k3s pillar

Files NOT to touch:

  • terraform/ — this is Salt-layer (system config), not Terraform-layer (k8s resources)

Acceptance Criteria

  • k3s Salt state templates the systemd unit file via file.managed
  • --kubelet-arg=max-pods=250 rendered in ExecStart args
  • max-pods value driven by pillar:k3s:kubelet:max_pods
  • service.running watches the unit file (auto-restart on change)
  • Template matches current live state (no unintended drift on first apply)
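
A minimal sketch of the pillar data these criteria imply. Only the key path k3s:kubelet:max_pods and the value 250 come from this ticket; the base environment and the '*' target in top.sls are assumptions, and any entries already present in top.sls would stay in place.

```yaml
# --- salt/pillar/k3s.sls (new file) ---
k3s:
  kubelet:
    max_pods: 250   # intended to render as --kubelet-arg=max-pods=250

# --- salt/pillar/top.sls (addition; 'base' and '*' are assumed) ---
base:
  '*':
    - k3s
```

The two fragments belong in separate files; they are shown together only for brevity.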

Test Expectations

  • salt-call state.show_sls k3s renders without errors
  • salt-call state.apply k3s test=True shows no changes (matches live state)
  • Run command: sudo salt-call state.apply k3s test=True

Constraints

  • Template must match the current live /etc/systemd/system/k3s.service exactly (including --disable=traefik)
  • Use cmd.run for systemctl daemon-reload triggered by unit file changes
  • Follow existing Salt patterns in this repo (see salt/states/firewall/init.sls for template+service pattern)
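
The constraints and acceptance criteria could be wired together roughly as follows. This is a sketch, not the pattern from salt/states/firewall/init.sls, which is not shown in this ticket. The state IDs, the salt:// source path, and the 110 fallback (the kubelet default mentioned in Context) are assumptions. Ownership and mode are left unmanaged because the ticket does not state them. The template is assumed to interpolate the max_pods context variable into the --kubelet-arg=max-pods= argument on ExecStart.

```yaml
# salt/states/k3s/init.sls (sketch; state IDs are hypothetical)

k3s_unit_file:
  file.managed:
    - name: /etc/systemd/system/k3s.service
    # Assumes file_roots exposes salt/states/ as salt://
    - source: salt://k3s/k3s.service.j2
    - template: jinja
    - context:
        # Falls back to the kubelet default if the pillar key is missing
        max_pods: {{ salt['pillar.get']('k3s:kubelet:max_pods', 110) }}

# systemd must re-read unit files before a restart picks up the change
k3s_daemon_reload:
  cmd.run:
    - name: systemctl daemon-reload
    - onchanges:
      - file: k3s_unit_file

k3s_service:
  service.running:
    - name: k3s
    - require:
      - cmd: k3s_daemon_reload
    - watch:
      - file: k3s_unit_file
```

With this wiring, the daemon-reload runs only when the unit file changes, and the require on it orders the reload ahead of the restart that watch triggers.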

Checklist

  • PR opened
  • salt-call state.apply k3s test=True shows no diff
  • No unrelated changes

Related

  • project-pal-e-platform — platform infrastructure