ArgoCD CMP sidecar fails to render kustomize+SOPS overlays (blocks services apply) #225

Closed
opened 2026-03-28 16:21:59 +00:00 by forgejo_admin · 3 comments
Contributor

Type

Bug

Lineage

  • Board: board-pal-e-platform
  • Story: story:superuser-deploy
  • Arch: arch:argocd, arch:k8s-deploy

Repo

forgejo_admin/pal-e-services (primary — ArgoCD Helm release with CMP sidecar definition at terraform/main.tf lines 35-191)

What Broke

ArgoCD's CMP sidecar (cmp-sops) fails to render kustomize+SOPS overlays from pal-e-deployments. tofu apply for pal-e-services fails on 3+ ArgoCD app updates with:

plugin sidecar failed. error generating manifests in cmp: 
rpc error: code = Unavailable desc = error reading from server: EOF

The cmp-sops container starts successfully (logs show it's serving on socket), the SOPS age key exists in the argocd namespace, but manifest generation fails with EOF.

Affected apps: pal-e-docs, gcal-scheduler, pal-e-app, platform-validation (all apps pointing to pal-e-deployments overlays).

Repro Steps

  1. Run cd ~/pal-e-services && tofu apply
  2. Observe: ArgoCD app resources fail with CMP sidecar EOF error
  3. Check pod: kubectl -n argocd describe pod -l app.kubernetes.io/name=argocd-repo-server
  4. Observe: cmp-sops sidecar running but manifest generation fails
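
To confirm that a given run hit this specific failure rather than an unrelated apply error, the captured `tofu apply` output can be checked for the two-part error signature quoted above. This is a minimal sketch: `has_cmp_eof` and the `/tmp/apply.log` path are hypothetical names chosen for illustration, not part of tofu or ArgoCD.

```shell
#!/bin/sh
# Sketch only: has_cmp_eof is a hypothetical helper, not a tofu/ArgoCD command.
# Reads captured apply output on stdin and succeeds (exit 0) only when both
# halves of the CMP sidecar error from this ticket are present.
has_cmp_eof() {
  # Buffer stdin so both patterns are checked against the same text.
  _out=$(cat)
  printf '%s\n' "$_out" | grep -q 'error generating manifests in cmp' &&
    printf '%s\n' "$_out" | grep -q 'error reading from server: EOF'
}

# Example usage (assumes the log path below; adjust as needed):
#   cd ~/pal-e-services && tofu apply 2>&1 | tee /tmp/apply.log
#   has_cmp_eof < /tmp/apply.log && echo "CMP sidecar EOF reproduced"
```

Checking for both substrings avoids matching an unrelated gRPC EOF elsewhere in the output.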

Expected Behavior

tofu apply for pal-e-services succeeds. ArgoCD dashboard shows all apps Synced. CMP sidecar logs show successful kustomize+SOPS manifest generation.

Environment

  • Cluster/namespace: prod / argocd
  • Service version: ArgoCD Helm release in pal-e-services/terraform/main.tf
  • Related alerts: ArgoCD app sync failures

File Targets

Files the agent should investigate and modify:

  • ~/pal-e-services/terraform/main.tf:35-191 — ArgoCD Helm release with CMP sidecar definition (repoServer.extraContainers, volumes, volumeMounts)
  • ~/pal-e-services/terraform/main.tf — CMP plugin config (generate command, discover config)

Files to check (read-only investigation):

  • kubectl -n argocd logs for repo-server cmp-sops container
  • kubectl -n argocd describe pod for resource limits / OOMKill
  • ArgoCD configmap for plugin registration

Files the agent should NOT touch:

  • ~/pal-e-platform/terraform/modules/ci/ — this is Woodpecker CI, not ArgoCD

Acceptance Criteria

  • tofu apply for pal-e-services succeeds for all ArgoCD apps
  • ArgoCD dashboard shows all apps Synced
  • CMP sidecar logs show successful manifest generation
  • No regression in existing working ArgoCD apps

Test Expectations

  • ArgoCD repo-server pod 2/2 Running with no restarts after fix
  • tofu plan -lock=false for pal-e-services shows 0 changes after successful apply
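
These two expectations can be checked mechanically. The sketch below assumes the default column layout of `kubectl get pod --no-headers` (NAME, READY, STATUS, RESTARTS, AGE); `pod_healthy` is a hypothetical helper name. It passes only if every input line shows 2/2, Running, and zero restarts.

```shell
#!/bin/sh
# Sketch only: pod_healthy is a hypothetical helper.
# Reads `kubectl get pod --no-headers` lines on stdin and succeeds (exit 0)
# only if at least one line was read and every line is 2/2 Running with
# a RESTARTS value of 0.
pod_healthy() {
  awk '
    { seen = 1 }
    $2 != "2/2" || $3 != "Running" || $4 != "0" { bad = 1 }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# Example usage:
#   kubectl -n argocd get pod -l app.kubernetes.io/name=argocd-repo-server \
#     --no-headers | pod_healthy && echo "repo-server healthy"
#
# For the plan check, -detailed-exitcode makes the result scriptable:
# exit 0 means no changes, 2 means changes are pending, 1 means error.
#   cd ~/pal-e-services/terraform && tofu plan -lock=false -detailed-exitcode
```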

Constraints

  • Must not disrupt existing working ArgoCD apps
  • CMP sidecar handles SOPS decryption — age key at /sops/age/age.agekey
  • Use tofu not terraform
  • Include -lock=false on plan commands

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes

Related

  • pal-e-platform — project
  • Blocks: pal-e-services #39 (ArgoCD source migrations)
  • Blocks: pal-e-app #88 (CI validation)
  • Discovered during: validation campaign 2026-03-28
Author
Contributor

Scope Review: NEEDS_REFINEMENT

Review note: review-525-2026-03-29

File targets point to wrong repo/directory -- terraform/modules/ci/ is Woodpecker CI, not ArgoCD. Actual CMP sidecar config is in pal-e-services/terraform/main.tf lines 35-191.

  • [BODY] Fix file targets: replace terraform/modules/ci/ with pal-e-services/terraform/main.tf (cmp-sops at lines 114-167, init at 99-113, plugin config at 51-76)
  • [BODY] Fix repo description: ArgoCD Helm config is in pal-e-services, not pal-e-platform (inverted)
  • [SCOPE] Consider moving issue to forgejo_admin/pal-e-services since all code changes are in that repo
  • [LABEL] Consider type:bug instead of type:feature -- CMP sidecar fails to render (broken behavior)
  • [BODY] Add investigation commands to Checklist: kubectl describe/logs for repo-server and cmp-sops containers
Author
Contributor

Scope refinement (review-525-2026-03-29):

  1. Changed type:feature → type:bug (broken behavior, not new functionality)
  2. Corrected primary repo to pal-e-services (ArgoCD Helm release at terraform/main.tf:35-191)
  3. Updated file targets — removed terraform/modules/ci/ (that's Woodpecker), added explicit pal-e-services paths
  4. Added full bug template sections (What Broke, Repro Steps, Expected Behavior, Environment)
  5. Added -lock=false to tofu plan command per convention
Author
Contributor

Scope Review: READY

Review note: review-525-2026-03-29-v2

Re-review after refinements — all prior issues addressed. Type corrected to bug, file targets corrected to pal-e-services/terraform/main.tf, full bug template sections present, tofu -lock=false convention applied. File targets verified against codebase (lines 35-191 confirmed). Ticket is actionable for a single agent pass.

Minor documentation gaps (not blockers):

  • [SCOPE] Architecture note arch-argocd does not exist in pal-e-docs
  • [SCOPE] Architecture note arch-k8s-deploy does not exist in pal-e-docs
Reference
ldraney/pal-e-platform#225