Draft "Rename Audit" SOP — completeness checklist for service/hostname renames #294

Open
opened 2026-04-17 05:47:16 +00:00 by forgejo_admin · 0 comments
Contributor

Type

Feature

Lineage

Three post-rename drift bugs in succession (#287 monitoring refs, #281 westside-landing, pal-e-deployments#125 CORS env var). Each was a user-visible regression from an incomplete pal-e-production → pal-e-app rename. QA on PR #125 flagged this as a meta-pattern requiring a process fix, not just another point fix.

Repo

forgejo_admin/pal-e-platform

User Story

As an engineer (human or agent) executing a service or hostname rename, I want a single SOP to walk so that I can declare the rename done with confidence, instead of re-deriving "did I get everything" each time and inevitably missing an item in the long tail.

Context

Renames touch a surprisingly large surface: source code, kustomize overlays, ArgoCD apps, Tailscale funnels, Harbor projects, Keycloak clients, monitoring probes, network policies, cross-service env vars (*_BASE_URL, *_ALLOWED_ORIGINS, OAuth callbacks), secret references, and pal-e-docs dictionary entries. Without a checklist, each agent and human re-discovers the surface on every rename. We've burned three incidents proving this.

The SOP is the cheapest possible intervention: documentation that each rename PR cites in its description and walks line-by-line.
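
To make the cross-service env var part of that surface concrete, here is a minimal sketch of a drift check. It assumes the deployment's environment is already available as a plain mapping. The function name `stale_url_vars` and the hostnames in the usage note are hypothetical, and it only covers the `*_BASE_URL` and `*_ALLOWED_ORIGINS` patterns named above, not OAuth callbacks.

```python
from fnmatch import fnmatchcase

# Env var name patterns called out in the Context section.
URL_VAR_PATTERNS = ("*_BASE_URL", "*_ALLOWED_ORIGINS")


def stale_url_vars(env: dict[str, str], old_name: str) -> dict[str, str]:
    """Return URL-style env vars whose value still mentions the old name."""
    return {
        key: value
        for key, value in sorted(env.items())
        # Only inspect variables whose name matches a URL-style pattern.
        if any(fnmatchcase(key, pattern) for pattern in URL_VAR_PATTERNS)
        and old_name in value
    }
```

Called with a mapping such as `{"API_BASE_URL": "https://pal-e-production.example.internal"}` (a made-up host) and `old_name="pal-e-production"`, it would return that one entry. Variables that do not match the patterns are ignored even if they contain the old name.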

File Targets

  • pal-e-docs note: new SOP at slug sop-rename-audit (note_type=sop, tags=sop,active)
  • pal-e-docs note: link from definition-app
  • pal-e-docs note: link from convention-dictionary-authority
  • template-pr-body (if a rename-specific template makes sense): add a ## Rename Audit section that references the SOP

Test Expectations

  • Walk the SOP against the open pal-e-app#110 rename revert as a dry-run; gaps in the SOP get patched immediately as part of this ticket (close-the-loop test)
  • Each checklist item must have a concrete verification command (rg, kubectl get, argocd app list, etc.) — not vague guidance
  • A second engineer (or agent) can execute the SOP without reading the original incident write-ups
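
As an illustration of what "concrete verification command" could mean for the source code category, the sketch below walks a checkout and reports every line that still mentions the old name, roughly what running `rg -n` with the old name would show. The function name `find_stale_refs` is hypothetical, and this is not proposed as automation for this ticket.

```python
from pathlib import Path


def find_stale_refs(root: str, old_name: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each mention of old_name."""
    hits = []
    root_path = Path(root)
    for path in sorted(root_path.rglob("*")):
        rel = path.relative_to(root_path)
        # Skip directories and anything inside the git metadata folder.
        if not path.is_file() or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                hits.append((rel.as_posix(), lineno, line.strip()))
    return hits
```

An empty result is the post-rename pass condition. Any hit is drift that belongs in the rename PR.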

Constraints

  • SOP-only deliverable — no automation in this ticket. Automation is a separate (future) feature.
  • Cover at least these 12 categories: source code refs; package metadata (name fields); kustomize overlays; deployment env vars; ingresses + Tailscale funnels; ArgoCD application names + paths; Harbor project paths; Keycloak client IDs + redirect URIs; monitoring probes (PrometheusRule, blackbox, Grafana); network policies (allow-from selectors); cross-service URL env vars; pal-e-docs dictionary entries
  • Each item: command to detect drift + command to verify post-rename + recommended fix path
  • Must reference, not duplicate, convention-dictionary-authority and feedback_naming_convention.md
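
The shape these constraints ask for (12 categories, each with a detect command, a verify command, and a fix path) can be sketched as a small data structure plus a completeness check. This only makes the required shape concrete for whoever drafts the SOP; it is not automation to ship with this ticket. The names `ChecklistItem` and `audit_gaps` are hypothetical, and the sample entry's fix path is a placeholder.

```python
from dataclasses import dataclass

# The 12 categories listed in the Constraints section.
REQUIRED_CATEGORIES = [
    "source code refs",
    "package metadata (name fields)",
    "kustomize overlays",
    "deployment env vars",
    "ingresses + Tailscale funnels",
    "ArgoCD application names + paths",
    "Harbor project paths",
    "Keycloak client IDs + redirect URIs",
    "monitoring probes (PrometheusRule, blackbox, Grafana)",
    "network policies (allow-from selectors)",
    "cross-service URL env vars",
    "pal-e-docs dictionary entries",
]


@dataclass(frozen=True)
class ChecklistItem:
    category: str
    detect: str    # command that surfaces drift before or during the rename
    verify: str    # command that proves the rename is complete afterwards
    fix_path: str  # recommended way to fix whatever the detect command finds


def audit_gaps(items: list[ChecklistItem]) -> list[str]:
    """Return human-readable problems; an empty list means the SOP draft is complete."""
    problems = []
    by_category = {item.category: item for item in items}
    for category in REQUIRED_CATEGORIES:
        item = by_category.get(category)
        if item is None:
            problems.append(f"missing category: {category}")
            continue
        for field in ("detect", "verify", "fix_path"):
            if not getattr(item, field).strip():
                problems.append(f"{category}: empty {field}")
    return problems


# Illustrative entry only; the real commands belong in the SOP itself.
SAMPLE_ITEM = ChecklistItem(
    category="source code refs",
    detect="rg -n 'pal-e-production' .",
    verify="rg -n 'pal-e-production' . (expect no matches)",
    fix_path="replace each hit with the new name in the rename PR",
)
```

Running `audit_gaps([SAMPLE_ITEM])` would list the eleven categories still missing, which is the kind of gap the dry-run against pal-e-app#110 is meant to surface by hand.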

Acceptance Criteria

  • sop-rename-audit note created in pal-e-docs
  • All 12 categories covered with concrete verification commands
  • Linked from definition-app and convention-dictionary-authority
  • Dry-run against pal-e-app#110 revert; gaps in SOP captured + fixed in this ticket
  • PR template updated with ## Rename Audit section pointer
  • Pinned in sop-index under "platform operations"

Checklist

Same as Acceptance Criteria above; tracked there.

Out of Scope

  • Automation that runs the audit (separate ticket if pursued)
  • Backfilling historical renames
  • Renaming Tailscale device hostnames at the device layer (only ingress/funnel resource names)

Environment

  • Repo: pal-e-docs (the SOP itself)
  • Trigger context: any future rename PR
  • Reference incidents: #287, #281, pal-e-deployments#125, pal-e-app#110

Related

  • forgejo_admin/pal-e-deployments#124, #125, #126, #127 — concrete drift incidents
  • forgejo_admin/pal-e-app#110 — open rename revert; dry-run target
  • Past rename PRs: #287 (monitoring), #281 (westside-landing rename)
  • feedback_naming_convention.md, definition-app, convention-dictionary-authority