Clean up pal-enterprises stale ReplicaSets + CreateContainerConfigError pod #371

Open
opened 2026-05-19 02:18:02 +00:00 by ldraney · 0 comments
Type

Bug

Lineage

Standalone — discovered during platform health audit 2026-05-18.

Repo

ldraney/pal-e-platform

What Broke

The pod pal-enterprises-94c89dc5c-hgjvh in namespace pal-enterprises has been stuck in CreateContainerConfigError for 6+ days. It references env vars from the secret pal-enterprises-secrets, which is missing or incomplete. The service itself is healthy: a newer ReplicaSet (5747957c7b, 1/1 Running) uses the rails-env secret instead. The namespace holds 12 ReplicaSets in total, 10 of which are stale (scaled to 0, 7-9 days old).
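To confirm which secret or key the stuck pod is waiting on, a read-only check along these lines may help. It is a sketch only: the function name diagnose_stuck_pod is hypothetical, and the pod, namespace, and secret names are taken from the description above.

```shell
NS=pal-enterprises
POD=pal-enterprises-94c89dc5c-hgjvh

# Hypothetical helper: read-only, changes nothing in the cluster.
diagnose_stuck_pod() {
  # The Events section at the end of the output normally states which
  # secret or key could not be found for the container config.
  kubectl describe pod "$POD" -n "$NS"
  # Lists the secret's key names and sizes without printing values.
  # Fails with NotFound if the secret does not exist at all.
  kubectl describe secret pal-enterprises-secrets -n "$NS"
}
```

Run it as `diagnose_stuck_pod` and compare the keys listed for the secret with the env vars the pod spec references.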

Repro Steps

  1. kubectl get pods -n pal-enterprises — observe CreateContainerConfigError pod
  2. kubectl get rs -n pal-enterprises — observe 10 stale ReplicaSets at 0 desired
  3. kubectl get secret pal-enterprises-secrets -n pal-enterprises — check if secret exists
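Step 2 can be narrowed to just the stale ReplicaSets with a small filter. This is a sketch: the function name stale_replicasets is hypothetical, and it assumes the default column order of `kubectl get rs --no-headers` (NAME, DESIRED, CURRENT, READY, AGE).

```shell
# Hypothetical helper: reads `kubectl get rs --no-headers` output on stdin
# and prints the names of ReplicaSets whose DESIRED, CURRENT and READY
# counts are all zero.
stale_replicasets() {
  awk '$2 == 0 && $3 == 0 && $4 == 0 { print $1 }'
}

# Usage (read-only):
#   kubectl get rs -n pal-enterprises --no-headers | stale_replicasets
```

The filter only prints names, so it is safe to run before deciding what to delete.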

Expected Behavior

Only healthy ReplicaSets should exist. Failed rollouts should be cleaned up.
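One possible reason exactly 10 scaled-down ReplicaSets linger: a Deployment keeps old ReplicaSets for rollback, and .spec.revisionHistoryLimit defaults to 10. A sketch of lowering that limit follows. The helper name limit_rs_history is hypothetical, and the Deployment name passed to it is an assumption, since this issue does not name the Deployment.

```shell
# Hypothetical helper.
#   $1: Deployment name (assumption, not stated in this issue)
#   $2: number of old ReplicaSets to keep for rollback
limit_rs_history() {
  kubectl patch deployment "$1" -n pal-enterprises --type merge \
    -p "{\"spec\":{\"revisionHistoryLimit\":$2}}"
}

# Usage:
#   limit_rs_history pal-enterprises 2
```

A lower limit trades rollback depth for less clutter. If the Deployment is managed from a manifest or chart, the field would need to be set there as well, or the patch may be overwritten on the next apply.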

Environment

  • Cluster/namespace: prod / pal-enterprises
  • Related alerts: EndpointDown (pal-enterprises)

Acceptance Criteria

  • Stuck pod pal-enterprises-94c89dc5c-hgjvh deleted
  • Stale ReplicaSets (10, scaled to 0) cleaned up
  • Healthy pod from ReplicaSet 5747957c7b remains Running
  • Orphan pal-enterprises-secrets secret deleted if it exists

Related

  • project-pal-e-platform: platform health
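The cleanup listed under Acceptance Criteria could be sketched as the function below. It is not a vetted runbook: the function name cleanup_pal_enterprises is hypothetical, and the resource names come from this issue. Extra arguments are passed through to every `kubectl delete`, so a first pass with `--dry-run=client` shows what would be removed.

```shell
NS=pal-enterprises

# Hypothetical helper; extra arguments (e.g. --dry-run=client) are
# forwarded to every `kubectl delete` call.
cleanup_pal_enterprises() {
  # 1. Delete the stuck pod. If its owning ReplicaSet still desires one
  #    replica, the pod will be recreated and the ReplicaSet needs attention.
  kubectl delete pod pal-enterprises-94c89dc5c-hgjvh -n "$NS" "$@"

  # 2. Delete ReplicaSets with DESIRED, CURRENT and READY all at zero.
  #    This also removes them as rollback targets for the Deployment.
  kubectl get rs -n "$NS" --no-headers \
    | awk '$2 == 0 && $3 == 0 && $4 == 0 { print $1 }' \
    | while read -r rs; do
        kubectl delete rs "$rs" -n "$NS" "$@"
      done

  # 3. Delete the orphan secret only if it exists.
  kubectl delete secret pal-enterprises-secrets -n "$NS" --ignore-not-found "$@"

  # 4. Confirm the healthy pod is still Running.
  kubectl get pods -n "$NS"
}

# Usage:
#   cleanup_pal_enterprises --dry-run=client   # preview
#   cleanup_pal_enterprises                    # apply
```

Before step 3, it is worth checking that nothing still running references pal-enterprises-secrets.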