Fix embedding worker CrashLoopBackOff — set replicas to 0 #142

Closed
opened 2026-03-13 06:38:48 +00:00 by forgejo_admin · 0 comments

Lineage

bug-embedding-worker-crashloop (no plan ancestry — bug fix)

Repo

forgejo_admin/pal-e-docs

User Story

As a platform operator
I want the embedding worker deployment scaled to 0 replicas
So that CrashLoopBackOff noise stops and ArgoCD shows a clean sync

Context

Two embedding worker pods have been in CrashLoopBackOff for 3.5+ days (1000+ restarts). The error is No module named pal_e_docs.embedding_worker — the deployment manifest pins an image tag (bec9a4ecb1...) that predates the embedding_worker.py module. The module exists in the repo at src/pal_e_docs/embedding_worker.py but was never included in that image build. Embeddings are computed inline in the API, so this worker is not needed for current functionality.

ArgoCD reconciles kubectl scale changes back to replicas: 1, so the fix must go through the repo.
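This revert behavior is typical of an ArgoCD Application with automated sync and self-heal enabled. A hypothetical snippet (the actual Application manifest is not in this repo) illustrating why live edits get stomped:

```yaml
# Hypothetical ArgoCD Application excerpt — not from this repo.
# With selfHeal on, any drift from Git (e.g. kubectl scale)
# is reverted on the next reconcile, so the replica count
# must be changed in the repo itself.
spec:
  syncPolicy:
    automated:
      selfHeal: true
```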

File Targets

Files to modify:

  • k8s/embedding-worker.yaml — change replicas: 1 to replicas: 0 on line 6

Files NOT to touch:

  • Everything else — this is a 1-line fix
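For clarity, the intended change as a diff. The surrounding context lines are a sketch of a standard Deployment manifest; only the `replicas` line is specified by this issue:

```diff
 # k8s/embedding-worker.yaml (sketch — only the replicas change is prescribed)
 spec:
-  replicas: 1
+  replicas: 0
```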

Acceptance Criteria

  • k8s/embedding-worker.yaml has replicas: 0
  • No other changes in the PR

Test Expectations

  • No tests needed — k8s manifest change only
  • Verify: grep replicas k8s/embedding-worker.yaml shows replicas: 0

Constraints

  • Do NOT remove the deployment entirely — we'll need it when the image is rebuilt with the worker module
  • Do NOT change the image tag or any other field

Checklist

  • PR opened
  • No unrelated changes

Related

  • pal-e-docs — affected project
  • bug-embedding-worker-crashloop — investigation note in pal-e-docs