ArgoCD CMP sidecar fails to render kustomize+SOPS overlays (blocks services apply) #225

Open
opened 2026-03-28 16:21:59 +00:00 by forgejo_admin · 0 comments

Type

Feature

Lineage

  • Board: board-pal-e-platform
  • Story: story:superuser-deploy
  • Arch: arch:argocd, arch:k8s-deploy

Repo

pal-e-platform (ArgoCD Helm config), pal-e-services (ArgoCD app definitions)

User Story

As the superuser, I need ArgoCD's CMP sidecar (cmp-sops) to successfully render kustomize+SOPS overlays from pal-e-deployments so that services can be migrated to centralized deployments.

Context

During validation session 2026-03-28, tofu apply for pal-e-services failed on 3+ ArgoCD app updates with:

plugin sidecar failed. error generating manifests in cmp: 
rpc error: code = Unavailable desc = error reading from server: EOF

The cmp-sops container starts successfully (its logs show it serving on the socket) and the SOPS age key exists in the argocd namespace, but manifest generation still fails with EOF.

Affected apps: pal-e-docs, gcal-scheduler, pal-e-app, platform-validation (all apps pointing to pal-e-deployments overlays).

Scope

  1. Investigate why cmp-sops fails during render (memory? timeout? tool missing?)
  2. Check CMP plugin definition in ArgoCD Helm values
  3. Test manual kustomize+SOPS render inside the container
  4. Fix the root cause
  5. Retry pal-e-services tofu apply

File Targets

  • terraform/modules/ci/ or ArgoCD Helm values (CMP plugin definition)
  • ArgoCD repo-server pod spec (sidecar resources, volume mounts)

Constraints

  • Must not disrupt existing working ArgoCD apps
  • CMP sidecar handles SOPS decryption — age key at /sops/age/age.agekey

Acceptance Criteria

  • tofu apply for pal-e-services succeeds for all ArgoCD apps
  • ArgoCD dashboard shows all apps Synced
  • CMP sidecar logs show successful manifest generation
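
The "all apps Synced" criterion can also be checked without the dashboard. The following is a minimal sketch, assuming the output of `kubectl get applications.argoproj.io -n argocd -o json` has been parsed into a dict; the function name `unsynced_apps` is hypothetical and not part of any existing tooling in these repos.

```python
def unsynced_apps(app_list: dict) -> dict[str, str]:
    """Map app name -> sync status for every Application that is not Synced.

    `app_list` is assumed to be the parsed JSON of an ArgoCD Application
    list, where each item carries its sync state at status.sync.status.
    """
    result = {}
    for app in app_list.get("items", []):
        sync = (app.get("status") or {}).get("sync") or {}
        status = sync.get("status", "Unknown")
        if status != "Synced":
            result[app["metadata"]["name"]] = status
    return result
```

An empty result means every app in the list reports Synced; any entry names an app that still needs attention, for example one of the four affected apps listed under Context.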

Checklist

  • Check cmp-sops resource limits (OOMKill?)
  • Check CMP plugin config (generate command)
  • Test manual render in container
  • Fix and verify
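
For the first checklist item, one way to confirm or rule out an OOMKill is to inspect the container statuses of the repo-server pod. This is a sketch under assumptions: the pod name passed to `fetch_pod` is a placeholder to be filled in, both helper names are hypothetical, and OOMKill is only one of the candidate causes listed under Scope.

```python
import json
import subprocess


def oom_killed_containers(pod: dict) -> list[str]:
    """Return names of containers whose current or last state is an OOMKilled termination."""
    hits = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        for state_key in ("state", "lastState"):
            terminated = (cs.get(state_key) or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append(cs["name"])
                break  # count each container at most once
    return hits


def fetch_pod(name: str, namespace: str = "argocd") -> dict:
    """Fetch a pod as parsed JSON via kubectl (assumes kubectl is on PATH and configured)."""
    out = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

If `oom_killed_containers(fetch_pod("<repo-server-pod-name>"))` includes `cmp-sops`, raising the sidecar's memory limit is the first thing to try. An empty list points toward the other candidates (timeout, missing tool, plugin config).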

Test Expectations

  • ArgoCD repo-server pod 2/2 Running with no restarts after fix
  • tofu plan for pal-e-services shows 0 changes after successful apply
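
The "2/2 Running with no restarts" expectation above can be expressed as a small predicate over the pod's JSON. This is a sketch, assuming the pod has been fetched with `kubectl get pod <name> -n argocd -o json` and parsed into a dict; `repo_server_healthy` is a hypothetical name. Note that `restartCount` is counted since pod creation, so the check is only meaningful on a pod created after the fix.

```python
def repo_server_healthy(pod: dict, expected_containers: int = 2) -> bool:
    """True if the pod is Running, all expected containers are ready, and none has restarted."""
    status = pod.get("status", {})
    containers = status.get("containerStatuses", [])
    return (
        status.get("phase") == "Running"
        and len(containers) == expected_containers
        and all(cs.get("ready") is True for cs in containers)
        and all(cs.get("restartCount", 0) == 0 for cs in containers)
    )
```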

Related

  • Blocks: pal-e-services #39 (ArgoCD source migrations)
  • Blocks: pal-e-app #88 (CI validation — needs centralized deploy)
  • Discovered during: validation campaign 2026-03-28
Reference
forgejo_admin/pal-e-platform#225