refactor: consolidate 3+3 agent model back to Dev + QA (Phase 12v L2 finding) #107

Closed
opened 2026-03-15 02:30:45 +00:00 by forgejo_admin · 0 comments

Lineage

plan-pal-e-agency → Phase 12 → Phase 12v (L2 quality comparison conclusion)

Repo

forgejo_admin/claude-custom

User Story

As Lucas
I want a single Dev agent and single QA agent that leverage Claude Opus's inherent capability
So that reviews are both thorough AND domain-aware without overengineered specialization

Context

Phase 12v L2 quality comparison proved two things:

  1. QA specialization hurts more than it helps. Domain QA agents found 7 domain-specific nits but missed 6 generic blockers (missing tests, DRY violations, input validation) that the old generic QA caught. Claude Opus already has domain expertise — the specialization CONSTRAINED attention rather than ADDING capability.

  2. Execution specialization adds value only when it adds TOOLS (Impeccable skills for frontend, tofu commands for infra), not when it just adds constraints (dev-backend's "never write frontend" — the model naturally stays in-domain). One enhanced Dev agent with all tools trusts the model to apply the right ones based on context.

Decision: Consolidate back to the five-agent model: Betty Sue, Penny, Dev, QA, Dottie. This matches the project page architecture diagram, which already shows Dev (code writer) and QA (reviewer) as workflow-based agents, not domain-based.

File Targets

Files to CREATE (enhanced versions):

  • agents/dev.md — NEW enhanced Dev with all capabilities: Impeccable skills (for frontend work), tofu enforcement (for infra work), ruff enforcement (for Python work). No domain constraints. Model decides what's relevant based on repo/issue context.
  • agents/qa.md — NEW enhanced QA with: generic code quality checklist (DRY, tests, input validation, hardcoded values), DOMAIN REVIEW section (model dynamically applies domain expertise), PROCESS OBSERVATIONS section (DORA), explicit BLOCKER criteria (missing tests for new functionality = BLOCKER, not nit).

Files to DEPRECATE (rename to .deprecated):

  • agents/dev-backend.md → agents/dev-backend.md.deprecated
  • agents/dev-frontend.md → agents/dev-frontend.md.deprecated
  • agents/devops.md → agents/devops.md.deprecated
  • agents/dev-qa.md → agents/dev-qa.md.deprecated
  • agents/frontend-qa.md → agents/frontend-qa.md.deprecated
  • agents/devops-qa.md → agents/devops-qa.md.deprecated
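
The six renames listed here could be scripted roughly as follows. This is a minimal sketch, not part of the issue: `deprecate_agents` is a hypothetical helper name, it assumes the configs live under an `agents/` directory below the given root, and it uses plain `mv`. Inside a git checkout, `git mv` may be preferable so history follows the files.

```shell
#!/bin/sh
# Hypothetical helper: rename the six specialized agent configs to *.deprecated.
# $1 is the repository root (defaults to the current directory).
deprecate_agents() {
    root="${1:-.}"
    for name in dev-backend dev-frontend devops dev-qa frontend-qa devops-qa; do
        src="$root/agents/$name.md"
        if [ -f "$src" ]; then
            mv "$src" "$src.deprecated"
            echo "deprecated: agents/$name.md"
        else
            # Already renamed or never present: report it, but keep going.
            echo "skipped (not found): agents/$name.md" >&2
        fi
    done
}
```

Calling `deprecate_agents .` from the repository root leaves every other file in `agents/` alone, including dottie.md, betty-sue.md, and the two existing .deprecated files.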

Files to UPDATE:

  • schemas/agent-spawn-requirements.json — replace 6 specialized types with dev and qa. Keep general-purpose, Explore, claude-code-guide unchanged.
  • skills/review-pr/SKILL.md — remove domain label routing (Steps 3-4). Simplify to: parse PR, find parent issue, spawn qa. No domain label lookup needed.

Files NOT to touch:

  • agents/dottie.md — unchanged
  • agents/betty-sue.md — unchanged
  • agents/dev.md.deprecated — this is the OLD dev config. Don't un-deprecate it — create a NEW enhanced dev.md instead.
  • agents/qa.md.deprecated — same. Create a NEW enhanced qa.md.

Acceptance Criteria

  • agents/dev.md exists with Impeccable skills, tofu enforcement, ruff enforcement, and NO domain constraints
  • agents/qa.md exists with generic code quality checklist, DOMAIN REVIEW section, PROCESS OBSERVATIONS, and explicit BLOCKER criteria
  • All 6 specialized agent configs renamed to .deprecated
  • schemas/agent-spawn-requirements.json has dev and qa entries, no dev-backend/dev-frontend/devops/dev-qa/frontend-qa/devops-qa
  • skills/review-pr/SKILL.md spawns qa directly without domain routing
  • JSON schema validates with jq

Test Expectations

  • jq . schemas/agent-spawn-requirements.json validates
  • ls agents/*.md shows only: betty-sue.md, dev.md, dottie.md, qa.md
  • ls agents/*.deprecated shows 8 files (old dev, old qa, plus 6 specialized)
  • grep -l "Impeccable\|frontend-design" agents/dev.md confirms skills are present
  • grep -l "tofu fmt" agents/dev.md confirms tofu enforcement is present
  • grep -l "BLOCKER" agents/qa.md confirms severity calibration is present
  • Run command: jq '.types | keys' schemas/agent-spawn-requirements.json should show dev, qa, general-purpose, Explore, claude-code-guide
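
The checks in this list can be bundled into one script so they run the same way every time. The sketch below is only an illustration: the function names are hypothetical, it assumes `jq` is installed, and it assumes the schema keeps its agent types in a top-level `types` object, as the `jq '.types | keys'` command implies. The `grep` checks use `-E` with `|` instead of the `\|` form, which should match the same lines.

```shell
#!/bin/sh
# Hypothetical helpers; each takes the repository root as its first argument.

# Exactly these four active agent configs should remain in agents/.
check_active_agents() {
    [ "$(cd "$1/agents" 2>/dev/null && echo *.md)" = "betty-sue.md dev.md dottie.md qa.md" ]
}

# Eight deprecated configs: old dev, old qa, plus the six specialized ones.
check_deprecated_count() {
    count=$(ls "$1"/agents/*.deprecated 2>/dev/null | wc -l | tr -d ' ')
    [ "$count" -eq 8 ]
}

# Required markers in the new configs.
check_markers() {
    grep -q -E 'Impeccable|frontend-design' "$1/agents/dev.md" &&
    grep -q 'tofu fmt' "$1/agents/dev.md" &&
    grep -q 'BLOCKER' "$1/agents/qa.md"
}

# The schema must parse, contain the five expected types, and none of the old six.
check_schema_types() {
    file="$1/schemas/agent-spawn-requirements.json"
    for t in dev qa general-purpose Explore claude-code-guide; do
        jq -e --arg t "$t" '.types | has($t)' "$file" >/dev/null || return 1
    done
    for t in dev-backend dev-frontend devops dev-qa frontend-qa devops-qa; do
        jq -e --arg t "$t" '.types | has($t) | not' "$file" >/dev/null || return 1
    done
}

# Run every check and report each result; returns non-zero if any check fails.
run_checks() {
    root="${1:-.}"
    status=0
    for check in check_active_agents check_deprecated_count check_markers check_schema_types; do
        if "$check" "$root"; then
            echo "ok   $check"
        else
            echo "FAIL $check"
            status=1
        fi
    done
    return "$status"
}
```

Running `run_checks .` from the repository root prints one line per check, which makes it easy to paste the result into the PR body.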

Constraints

  • The NEW dev.md should be a FRESH write, not a copy of old dev.md.deprecated. It should incorporate the best of all three specialized configs.
  • The NEW qa.md should be a FRESH write incorporating L2 lessons: generic hygiene items as table-stakes PLUS dynamic domain expertise.
  • QA blocker criteria MUST include: "new functionality with zero test coverage", "unvalidated user input", "secrets/credentials in code", "DRY violation in auth/security paths"
  • Dev must include ALL Impeccable skills from dev-frontend: /frontend-design, /audit, /polish, /bolder, /colorize, /optimize, /critique
  • Dev must include tofu enforcement from devops: tofu not terraform, tofu fmt, tofu validate, tofu plan output in PR body for TF changes
  • Dev PR template should be dynamic: include "Design Decisions" section when doing frontend work, include "tofu plan Output" when doing infra work. Instruct the agent to include the relevant section based on the type of work.
  • Keep the review-pr skill file at skills/review-pr/SKILL.md — just simplify the routing logic

Checklist

  • PR opened
  • Tests pass (jq + ls + grep checks)
  • No unrelated changes

Related

  • project-pal-e-agency: five-agent model matches architecture diagram
  • Phase 12v L2 data: legacy generic QA outperformed domain-specific QA on blockers (6 vs 0)
Reference
ldraney/claude-custom#107