Fix production CrashLoopBackOff: activate Puma control app #68
Summary
activate_control_app in config/puma.rb
Closes #67
Changes
config/puma.rb: Remove the if ENV.fetch("RAILS_ENV", "development") == "development" guard from activate_control_app. This is safe in all environments (it binds a Unix socket within the container).
Test Plan
/up health check returns 200
/metrics endpoint returns Puma thread/worker gauges
Review Checklist
Related Notes
ldraney/landscaping-assistant #67: production CrashLoopBackOff bug
ldraney/landscaping-assistant #19: original observability issue (regression source)
landscaping-assistant: project
PR #68 Review
DOMAIN REVIEW
Tech stack: Ruby on Rails 8.1, Puma web server, yabeda-puma-plugin (Prometheus metrics), Woodpecker CI, k8s deployment.
Change: Single-file, 2-line diff in config/puma.rb. Removes the if ENV.fetch("RAILS_ENV", "development") == "development" guard from activate_control_app, making the Puma control app active in all environments. Adds a clarifying comment.
Analysis of the fix:
yabeda-puma-plugin requires activate_control_app to collect thread/worker metrics. With the directive gated to development only, production crashes at startup when the plugin tries to read metrics from the control app socket.
activate_control_app without arguments binds to a random Unix socket (not a TCP port), so there is no network exposure risk. This is safe for production use inside a container.
No security concerns: The control app Unix socket is container-local and not exposed externally. No secrets, no new network listeners.
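As an illustration of the change analyzed above, the relevant part of config/puma.rb might look roughly like the sketch below. This is a reconstruction from the review text, not the actual diff: the exact form of the removed guard, the comment wording, and the presence of the plugin :yabeda line (the activation line documented by yabeda-puma-plugin) are assumptions.

```ruby
# config/puma.rb (excerpt; a sketch reconstructed from the review, not the real file)

# Before, as described in the review: the control app was only activated
# in development, so production had no control socket for the plugin to query.
# activate_control_app if ENV.fetch("RAILS_ENV", "development") == "development"

# After: activate the control app in every environment. Called without
# arguments, Puma binds it to a temporary Unix socket rather than a TCP port.
activate_control_app

# Assumed to already be present in the file: the yabeda-puma-plugin
# activation line, which reads stats through the control app.
plugin :yabeda
```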
BLOCKERS
None.
Test coverage exemption reasoning: This is a one-line configuration fix to a Puma DSL directive. The change is not testable via unit/integration tests -- it is a server boot configuration that can only be validated by observing that the pod starts successfully and that /metrics returns Puma gauges. The PR's Test Plan appropriately covers this with post-deploy verification steps. The BLOCKER criterion for "new functionality with zero test coverage" does not apply to infrastructure configuration fixes.
NITS
None. The diff is clean, minimal, and well-commented.
SOP COMPLIANCE
Branch is named fix/puma-control-app, not 67-puma-control-app or 67-fix-puma-control-app. Convention calls for {issue-number}-{kebab-case-purpose}. Minor deviation, but this is a hotfix for a production crash so the urgency is understood.
PROCESS OBSERVATIONS
rails runner "puts :ok" or a container startup probe test).
VERDICT: APPROVED
The fix is correct, minimal, safe, and well-documented. The branch naming deviation is noted but not blocking given the production urgency. Ship it and verify pod recovery.
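The post-deploy checks from the Test Plan could be scripted along the following lines. This is a minimal sketch, not part of the PR: the helper names (metric_names, fetch, verify), the base URL, and the assumption that Puma gauges are exposed with a puma_ name prefix are all illustrative and should be checked against the real /metrics output.

```ruby
require "net/http"
require "uri"

# Returns the unique sample names in a Prometheus text payload that start
# with the given prefix. The "puma_" default is an assumption about naming.
def metric_names(body, prefix: "puma_")
  body.each_line
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") } # skip HELP/TYPE comments
      .map { |line| line[/\A[a-zA-Z_:][a-zA-Z0-9_:]*/] }      # metric name before labels/value
      .compact
      .select { |name| name.start_with?(prefix) }
      .uniq
end

# Fetches a path relative to base_url and returns [status_code, body].
def fetch(base_url, path)
  response = Net::HTTP.get_response(URI.join(base_url, path))
  [response.code.to_i, response.body.to_s]
end

# Runs both Test Plan checks: /up returns 200 and /metrics contains
# at least one Puma metric. Returns true only if both hold.
def verify(base_url)
  up_status, = fetch(base_url, "/up")
  metrics_status, metrics_body = fetch(base_url, "/metrics")
  up_status == 200 && metrics_status == 200 && !metric_names(metrics_body).empty?
end
```

For example, verify("http://localhost:3000") could be run against a port-forwarded pod; the URL here is hypothetical, and the metrics endpoint may live on a different port depending on how the exporter is mounted.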