Spike: Keycloak programmatic configuration and architecture validation #133

Merged
ldraney merged 3 commits from 132-spike-keycloak-programmatic-configuratio into main 2026-06-06 22:54:23 +00:00
Owner

Summary

Spike investigation into how Keycloak should be programmatically configured for the landscaping-assistant app. Produces docs/keycloak-setup.md covering the full architecture: Terraform IaC approach, realm/client design, ROPC rejection, secrets wiring, and step-by-step setup instructions.

Changes

  • docs/keycloak-setup.md -- New spike document covering:
    • Current Keycloak state across pal-e-platform (server) and pal-e-services (realm/client IaC)
    • Recommended approach: Terraform via pal-e-services (not Admin API, not realm export, not manual)
    • Realm design: new landscaping realm with 5 roles (admin, lead, member, client, super_admin)
    • Client design: confidential landscaping-assistant client with Authorization Code + PKCE
    • ROPC validation: Direct Access Grants rejected in favor of standard flow (security, platform convention, feature preservation)
    • Secrets wiring: 4 new keys (KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_CLIENT_SECRET) into existing k8s secret
    • 6 follow-up tickets identified across pal-e-services, pal-e-deployments, and landscaping-assistant

Test Plan

  • Review document for accuracy against source files:
    • pal-e-services/terraform/keycloak.tf and k3s.tfvars for current state
    • pal-e-platform/terraform/modules/keycloak/main.tf for infrastructure
    • basketball-api/services/keycloak.py for ROPC analysis
    • pal-e-deployments/overlays/landscaping-assistant/prod/deployment-patch.yaml for secrets pattern
  • Verify proposed tfvars entries are syntactically valid against variable definitions in variables.tf

Review Checklist

  • Spike doc covers all 8 investigation areas from the issue
  • Terraform config matches existing variable type definitions in pal-e-services
  • ROPC rejection is backed by platform convention, SOP, and security rationale
  • Secrets wiring matches existing pattern in deployment-patch.yaml
  • Follow-up tickets are actionable and scoped
  • sop-keycloak-client-creation -- current SOP (admin-console path, superseded by Terraform)

Closes #132

## Summary Spike investigation into how Keycloak should be programmatically configured for the landscaping-assistant app. Produces `docs/keycloak-setup.md` covering the full architecture: Terraform IaC approach, realm/client design, ROPC rejection, secrets wiring, and step-by-step setup instructions. ## Changes - `docs/keycloak-setup.md` -- New spike document covering: - Current Keycloak state across pal-e-platform (server) and pal-e-services (realm/client IaC) - Recommended approach: Terraform via pal-e-services (not Admin API, not realm export, not manual) - Realm design: new `landscaping` realm with 5 roles (admin, lead, member, client, super_admin) - Client design: confidential `landscaping-assistant` client with Authorization Code + PKCE - ROPC validation: Direct Access Grants rejected in favor of standard flow (security, platform convention, feature preservation) - Secrets wiring: 4 new keys (KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_CLIENT_SECRET) into existing k8s secret - 6 follow-up tickets identified across pal-e-services, pal-e-deployments, and landscaping-assistant ## Test Plan - Review document for accuracy against source files: - `pal-e-services/terraform/keycloak.tf` and `k3s.tfvars` for current state - `pal-e-platform/terraform/modules/keycloak/main.tf` for infrastructure - `basketball-api/services/keycloak.py` for ROPC analysis - `pal-e-deployments/overlays/landscaping-assistant/prod/deployment-patch.yaml` for secrets pattern - Verify proposed tfvars entries are syntactically valid against variable definitions in `variables.tf` ## Review Checklist - [x] Spike doc covers all 8 investigation areas from the issue - [x] Terraform config matches existing variable type definitions in pal-e-services - [x] ROPC rejection is backed by platform convention, SOP, and security rationale - [x] Secrets wiring matches existing pattern in deployment-patch.yaml - [x] Follow-up tickets are actionable and scoped ## Related Notes - `sop-keycloak-client-creation` -- current SOP (admin-console path, superseded by Terraform) ## Related Closes #132
Add Keycloak setup spike doc for landscaping-assistant (#132)
Some checks are pending
CI / scan_ruby (pull_request) Waiting to run
CI / scan_js (pull_request) Waiting to run
CI / lint (pull_request) Waiting to run
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
e55f303f38
Documents the full Keycloak programmatic configuration approach:
Terraform IaC via pal-e-services, realm/client design, ROPC
rejection in favor of Authorization Code + PKCE, secrets wiring,
and step-by-step setup instructions with follow-up tickets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

QA Review

Docs-only spike PR (single file: docs/keycloak-setup.md, 343 additions). Verified claims against source material.

Accuracy Checks

Claim Source Status
Keycloak 26.0.7, start-dev, H2 dev-file, 2Gi PVC pal-e-platform/terraform/modules/keycloak/main.tf Correct
Four realms, five clients in pal-e-services pal-e-services/terraform/k3s.tfvars Correct
basketball-api ROPC is admin-only against master realm basketball-api/services/keycloak.py lines 39-48 Correct
westside-app uses keycloak-js PKCE, not ROPC for users westside-app/src/lib/keycloak.js Correct
SOP forbids Direct Access Grants sop-keycloak-client-creation Correct
Current k8s secret has 8 keys kubectl get secret output Correct
Proposed tfvars match variable type definitions pal-e-services/terraform/variables.tf Correct

Findings

No blocking issues found. One minor nit:

  1. Nit -- "New keys needed" table says "Four" but lists four rows: The introductory sentence says "Four new keys" which is correct, but the earlier "New keys needed for Keycloak" heading's table has a mismatch between the body text ("Three new keys") in the Summary section saying "4 new keys". This is actually consistent throughout the doc itself -- just flagging that both the intro sentence and table agree at four keys, which is correct.

No code changes, no runtime impact, no security concerns beyond what the doc already addresses (ROPC rejection, secrets handling).

VERDICT: APPROVE

Clean spike doc with accurate cross-repo investigation. All claims verified against source files. Proposed Terraform config is syntactically valid against existing variable definitions. ROPC rejection is well-reasoned and backed by platform convention, SOP, and OAuth 2.1 deprecation.

## QA Review Docs-only spike PR (single file: `docs/keycloak-setup.md`, 343 additions). Verified claims against source material. ### Accuracy Checks | Claim | Source | Status | |-------|--------|--------| | Keycloak 26.0.7, start-dev, H2 dev-file, 2Gi PVC | `pal-e-platform/terraform/modules/keycloak/main.tf` | Correct | | Four realms, five clients in pal-e-services | `pal-e-services/terraform/k3s.tfvars` | Correct | | basketball-api ROPC is admin-only against master realm | `basketball-api/services/keycloak.py` lines 39-48 | Correct | | westside-app uses keycloak-js PKCE, not ROPC for users | `westside-app/src/lib/keycloak.js` | Correct | | SOP forbids Direct Access Grants | `sop-keycloak-client-creation` | Correct | | Current k8s secret has 8 keys | `kubectl get secret` output | Correct | | Proposed tfvars match variable type definitions | `pal-e-services/terraform/variables.tf` | Correct | ### Findings No blocking issues found. One minor nit: 1. **Nit -- "New keys needed" table says "Four" but lists four rows**: The introductory sentence says "Four new keys" which is correct, but the earlier "New keys needed for Keycloak" heading's table has a mismatch between the body text ("Three new keys") in the Summary section saying "4 new keys". This is actually consistent throughout the doc itself -- just flagging that both the intro sentence and table agree at four keys, which is correct. No code changes, no runtime impact, no security concerns beyond what the doc already addresses (ROPC rejection, secrets handling). ### VERDICT: APPROVE Clean spike doc with accurate cross-repo investigation. All claims verified against source files. Proposed Terraform config is syntactically valid against existing variable definitions. ROPC rejection is well-reasoned and backed by platform convention, SOP, and OAuth 2.1 deprecation.
Author
Owner

PR #133 Review

DOMAIN REVIEW

Tech stack: Documentation (Markdown spike output). Domain review applies Keycloak/OIDC, Terraform/IaC, and Kubernetes infrastructure expertise.

Technical accuracy -- cross-referenced against sop-keycloak-client-creation (updated 2026-06-06), docs/user-stories-auth.md, and open issues:

  • 4 realms, 5 clients: The spike correctly enumerates all four existing realms (platform, westside-basketball, pal-e-docs, pal-enterprises) and five existing clients. Matches the SOP exactly.
  • Provider: mrparkers/keycloak v5.0 -- correct per SOP.
  • Terraform path: pal-e-services/terraform/keycloak.tf + k3s.tfvars -- correct per SOP.
  • Keycloak image: quay.io/keycloak/keycloak:26.0.7 with H2 dev-file DB -- correct per SOP.
  • Five roles: admin, lead, member, client, super_admin -- matches docs/user-stories-auth.md exactly. The role table correctly describes each role's purpose and assignment path.
  • ROPC rejection: Thorough and well-reasoned. Cites OAuth 2.1 (RFC 9700), platform convention, feature loss, and turbo-ios compatibility. The basketball-api ROPC clarification (admin-to-admin only, not user auth) is important context that was missing from prior discussions. Consistent with #115's updated title ("Authorization Code + PKCE").
  • Client config: Confidential client, standard flow, PKCE S256, direct access grants disabled -- matches the SOP's OIDC Client Config Summary table exactly.
  • Secrets wiring: Four keys (KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_CLIENT_SECRET) with kubectl create secret ... --dry-run=client -o yaml | kubectl apply -f - pattern and secretKeyRef in deployment-patch.yaml -- matches the SOP's Secrets Wiring section.

Quality of the spike output: This is an excellent spike document. The comparison table (Terraform vs Admin API vs Realm Export vs Manual) is clear. The ROPC analysis is the strongest section -- it correctly identifies that the basketball-api's grant_type=password usage is admin-to-admin, not a precedent for user-facing ROPC. The step-by-step phases are logically ordered with clear dependencies.

BLOCKERS

1. Missing Network Policy step (SOP gap)

The updated SOP has a "Network Policy" section stating:

Keycloak has a default-deny ingress NetworkPolicy in pal-e-platform/terraform/network-policies.tf. New apps must be added to the allowlist before they can reach Keycloak. Add the app's namespace to the existing policy and run tofu apply.

The spike doc's step-by-step instructions and cross-repo changes table do not mention this. Without adding landscaping-assistant to the Keycloak NetworkPolicy allowlist, the app will be unable to reach Keycloak at all. This is a deployment-blocking omission.

Fix: Add a step between Phase 1 and Phase 2 (or as a sub-step of Phase 1): "Add landscaping-assistant namespace to Keycloak ingress NetworkPolicy in pal-e-platform/terraform/network-policies.tf and run tofu apply." Also add a row to the Cross-Repo Changes table for pal-e-platform.

2. Follow-up ticket #6 is stale

The spike doc's last follow-up ticket says:

SOP update: sop-keycloak-client-creation -- The SOP references admin-console-only creation and the westside-basketball realm exclusively. It should be updated to reference the Terraform approach in pal-e-services as the primary path, with admin console as fallback for user management.

This has already been done. The SOP was rewritten (updated 2026-06-06) and now documents Terraform as "Path A: Primary" with admin console as "Path B: User Management." The SOP title was also changed to "SOP: Keycloak Client & Realm Management." This follow-up ticket is no longer needed and should be removed or marked as already completed to avoid creating duplicate work.

NITS

  1. Missing state parameter / CSRF note: The SOP has a dedicated section on state parameter validation that the consuming app must implement. The spike doc's Phase 5 (Rails Auth Implementation) lists "Implement callback controller, session management" but does not mention OIDC state validation or code_verifier lifecycle. Consider adding a note: "The callback controller MUST validate the OIDC state parameter (most OmniAuth strategies handle this automatically)."

  2. Missing front_channel_logout: The SOP's OIDC Client Config Summary lists Front channel logout = ON (Recommended for SLO). The proposed client config in k3s.tfvars does not include this setting. If the Terraform variable definition supports it, it should be included for single-logout support.

  3. brute_force_protected not in realm config: The rationale section says brute_force_protected = true (default) but this field does not appear in the proposed keycloak_realms HCL block. If it is truly a Keycloak default, this is fine, but the comment could mislead someone into thinking it is being explicitly set. Clarify: either add it to the HCL block or change the rationale to say "Keycloak enables brute force protection by default at the realm level; no explicit setting needed in tfvars."

  4. SOP Status section is now inaccurate: The spike doc says the SOP "has since been superseded by the Terraform approach." This undersells it -- the SOP was fully rewritten to document both paths. Consider changing to: "The SOP has been updated to document Terraform as the primary path (Path A) with admin console as Path B for user management."

  5. PR body says "ROPC rejection": The PR Summary mentions "ROPC rejection" which is accurate but slightly misleading in the PR description since the doc actually recommends Authorization Code + PKCE (the rejection is one section of the doc, not its main purpose). Minor wording nit.

  6. README Roles table out of date: The README shows only four roles (no super_admin). This is not a spike doc issue, but a gap surfaced by the review. The README should be updated to include super_admin, either in this PR or as a follow-up.

SOP COMPLIANCE

  • Branch named after issue (132-spike-keycloak-programmatic-configuratio -- truncated but follows the {issue-number}-{kebab-case} pattern)
  • PR body follows template (Summary, Changes, Test Plan, Related all present)
  • Related references parent issue (Closes #132)
  • Related references plan slug (no plan slug referenced -- minor)
  • No secrets committed (doc contains placeholder <from-keycloak>, no actual credentials)
  • No unnecessary file changes (single file, docs/keycloak-setup.md, directly addresses the spike)
  • Commit messages are descriptive (based on PR structure)

PROCESS OBSERVATIONS

  • Spike deliverables check: The spike template requires (1) a docs/ file and (2) follow-up tickets. The docs/ file is present (343 lines, thorough). Six follow-up tickets are identified with clear scope and cross-repo ownership. One (#6) is stale and should be removed. The remaining five are actionable.
  • Cross-repo awareness: The doc correctly identifies work needed in three repos (pal-e-services, pal-e-deployments, landscaping-assistant) but misses pal-e-platform (NetworkPolicy). This is the primary gap.
  • DORA impact: This spike unblocks Phase 1 auth work (#115). The follow-up tickets create a clear dependency chain. Merging this doc gives the implementation tickets a concrete spec to work against, which should reduce change failure rate for the auth implementation.
  • Documentation quality: This is a high-quality spike output. The ROPC analysis alone saves significant future discussion. The proposed tfvars entries are syntactically consistent with existing entries, reducing implementation friction.

VERDICT: NOT APPROVED

Two blockers must be addressed before merge:

  1. Add the NetworkPolicy step for pal-e-platform/terraform/network-policies.tf to both the step-by-step instructions and the Cross-Repo Changes table.
  2. Remove or mark follow-up ticket #6 (SOP update) as already completed -- the SOP has been rewritten and this ticket would create duplicate/conflicting work.
## PR #133 Review ### DOMAIN REVIEW **Tech stack**: Documentation (Markdown spike output). Domain review applies Keycloak/OIDC, Terraform/IaC, and Kubernetes infrastructure expertise. **Technical accuracy** -- cross-referenced against `sop-keycloak-client-creation` (updated 2026-06-06), `docs/user-stories-auth.md`, and open issues: - **4 realms, 5 clients**: The spike correctly enumerates all four existing realms (platform, westside-basketball, pal-e-docs, pal-enterprises) and five existing clients. Matches the SOP exactly. - **Provider**: `mrparkers/keycloak` v5.0 -- correct per SOP. - **Terraform path**: `pal-e-services/terraform/keycloak.tf` + `k3s.tfvars` -- correct per SOP. - **Keycloak image**: `quay.io/keycloak/keycloak:26.0.7` with H2 dev-file DB -- correct per SOP. - **Five roles**: admin, lead, member, client, super_admin -- matches `docs/user-stories-auth.md` exactly. The role table correctly describes each role's purpose and assignment path. - **ROPC rejection**: Thorough and well-reasoned. Cites OAuth 2.1 (RFC 9700), platform convention, feature loss, and turbo-ios compatibility. The basketball-api ROPC clarification (admin-to-admin only, not user auth) is important context that was missing from prior discussions. Consistent with #115's updated title ("Authorization Code + PKCE"). - **Client config**: Confidential client, standard flow, PKCE S256, direct access grants disabled -- matches the SOP's OIDC Client Config Summary table exactly. - **Secrets wiring**: Four keys (KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_CLIENT_SECRET) with `kubectl create secret ... --dry-run=client -o yaml | kubectl apply -f -` pattern and `secretKeyRef` in deployment-patch.yaml -- matches the SOP's Secrets Wiring section. **Quality of the spike output**: This is an excellent spike document. The comparison table (Terraform vs Admin API vs Realm Export vs Manual) is clear. The ROPC analysis is the strongest section -- it correctly identifies that the basketball-api's `grant_type=password` usage is admin-to-admin, not a precedent for user-facing ROPC. The step-by-step phases are logically ordered with clear dependencies. ### BLOCKERS **1. Missing Network Policy step (SOP gap)** The updated SOP has a "Network Policy" section stating: > Keycloak has a default-deny ingress NetworkPolicy in `pal-e-platform/terraform/network-policies.tf`. New apps must be added to the allowlist before they can reach Keycloak. Add the app's namespace to the existing policy and run `tofu apply`. The spike doc's step-by-step instructions and cross-repo changes table do not mention this. Without adding `landscaping-assistant` to the Keycloak NetworkPolicy allowlist, the app will be unable to reach Keycloak at all. This is a deployment-blocking omission. **Fix**: Add a step between Phase 1 and Phase 2 (or as a sub-step of Phase 1): "Add `landscaping-assistant` namespace to Keycloak ingress NetworkPolicy in `pal-e-platform/terraform/network-policies.tf` and run `tofu apply`." Also add a row to the Cross-Repo Changes table for `pal-e-platform`. **2. Follow-up ticket #6 is stale** The spike doc's last follow-up ticket says: > **SOP update: sop-keycloak-client-creation** -- The SOP references admin-console-only creation and the `westside-basketball` realm exclusively. It should be updated to reference the Terraform approach in pal-e-services as the primary path, with admin console as fallback for user management. This has already been done. The SOP was rewritten (updated 2026-06-06) and now documents Terraform as "Path A: Primary" with admin console as "Path B: User Management." The SOP title was also changed to "SOP: Keycloak Client & Realm Management." This follow-up ticket is no longer needed and should be removed or marked as already completed to avoid creating duplicate work. ### NITS 1. **Missing `state` parameter / CSRF note**: The SOP has a dedicated section on `state` parameter validation that the consuming app must implement. The spike doc's Phase 5 (Rails Auth Implementation) lists "Implement callback controller, session management" but does not mention OIDC `state` validation or `code_verifier` lifecycle. Consider adding a note: "The callback controller MUST validate the OIDC `state` parameter (most OmniAuth strategies handle this automatically)." 2. **Missing `front_channel_logout`**: The SOP's OIDC Client Config Summary lists `Front channel logout = ON (Recommended for SLO)`. The proposed client config in `k3s.tfvars` does not include this setting. If the Terraform variable definition supports it, it should be included for single-logout support. 3. **`brute_force_protected` not in realm config**: The rationale section says `brute_force_protected = true (default)` but this field does not appear in the proposed `keycloak_realms` HCL block. If it is truly a Keycloak default, this is fine, but the comment could mislead someone into thinking it is being explicitly set. Clarify: either add it to the HCL block or change the rationale to say "Keycloak enables brute force protection by default at the realm level; no explicit setting needed in tfvars." 4. **SOP Status section is now inaccurate**: The spike doc says the SOP "has since been superseded by the Terraform approach." This undersells it -- the SOP was fully rewritten to document both paths. Consider changing to: "The SOP has been updated to document Terraform as the primary path (Path A) with admin console as Path B for user management." 5. **PR body says "ROPC rejection"**: The PR Summary mentions "ROPC rejection" which is accurate but slightly misleading in the PR description since the doc actually recommends Authorization Code + PKCE (the rejection is one section of the doc, not its main purpose). Minor wording nit. 6. **README Roles table out of date**: The README shows only four roles (no super_admin). This is not a spike doc issue, but a gap surfaced by the review. The README should be updated to include super_admin, either in this PR or as a follow-up. ### SOP COMPLIANCE - [x] Branch named after issue (`132-spike-keycloak-programmatic-configuratio` -- truncated but follows the `{issue-number}-{kebab-case}` pattern) - [x] PR body follows template (Summary, Changes, Test Plan, Related all present) - [x] Related references parent issue (`Closes #132`) - [ ] Related references plan slug (no plan slug referenced -- minor) - [x] No secrets committed (doc contains placeholder `<from-keycloak>`, no actual credentials) - [x] No unnecessary file changes (single file, `docs/keycloak-setup.md`, directly addresses the spike) - [x] Commit messages are descriptive (based on PR structure) ### PROCESS OBSERVATIONS - **Spike deliverables check**: The spike template requires (1) a docs/ file and (2) follow-up tickets. The docs/ file is present (343 lines, thorough). Six follow-up tickets are identified with clear scope and cross-repo ownership. One (#6) is stale and should be removed. The remaining five are actionable. - **Cross-repo awareness**: The doc correctly identifies work needed in three repos (pal-e-services, pal-e-deployments, landscaping-assistant) but misses pal-e-platform (NetworkPolicy). This is the primary gap. - **DORA impact**: This spike unblocks Phase 1 auth work (#115). The follow-up tickets create a clear dependency chain. Merging this doc gives the implementation tickets a concrete spec to work against, which should reduce change failure rate for the auth implementation. - **Documentation quality**: This is a high-quality spike output. The ROPC analysis alone saves significant future discussion. The proposed tfvars entries are syntactically consistent with existing entries, reducing implementation friction. ### VERDICT: NOT APPROVED Two blockers must be addressed before merge: 1. Add the NetworkPolicy step for `pal-e-platform/terraform/network-policies.tf` to both the step-by-step instructions and the Cross-Repo Changes table. 2. Remove or mark follow-up ticket #6 (SOP update) as already completed -- the SOP has been rewritten and this ticket would create duplicate/conflicting work.
Fix QA blockers: add NetworkPolicy step, remove stale SOP follow-up
Some checks are pending
CI / scan_ruby (pull_request) Waiting to run
CI / scan_js (pull_request) Waiting to run
CI / lint (pull_request) Waiting to run
ci/woodpecker/pr/woodpecker Pipeline was successful
ci/woodpecker/push/woodpecker Pipeline was successful
d6cac0d424
- Added Phase 1b: NetworkPolicy step to setup instructions
- Added pal-e-platform row to cross-repo changes table
- Replaced stale follow-up #6 (SOP update) with NetworkPolicy ticket
- SOP was already updated this session, noted as done

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #133 Review (Re-Review)

PREVIOUS BLOCKERS -- RESOLVED

1. NetworkPolicy step was missing -- FIXED

  • Phase 1b added with clear instructions: edit network-policies.tf, add namespace, run tofu apply.
  • pal-e-platform row added to the cross-repo changes table.
  • Follow-up ticket #6 explicitly covers the NetworkPolicy change in pal-e-platform.
  • Cross-verified against the live SOP (sop-keycloak-client-creation), which documents this exact requirement under its "Network Policy" section. The spike doc's coverage now matches.

2. Stale SOP follow-up -- FIXED

  • The stale recommendation to update the SOP has been replaced with: "Note: The SOP (sop-keycloak-client-creation) has already been updated to reflect Terraform as the primary path -- no follow-up needed there."
  • Cross-verified: the live SOP in pal-e-docs does already cover Terraform as the primary path (Path A), with the admin console as secondary for user management. The spike doc's claim is accurate.

DOMAIN REVIEW

Tech stack: Documentation / Terraform / Kubernetes / Keycloak OIDC

Document accuracy checks performed:

  • Realm config matches the SOP template structure in k3s.tfvars (correct keys: display_name, registration_allowed, reset_password_allowed, login_with_email_allowed, roles).
  • Client config matches the SOP template and the pal-enterprises confidential client pattern (correct keys: realm_key, client_id, public_client, standard_flow_enabled, direct_access_grants_enabled, pkce_code_challenge_method, etc.).
  • Five roles (admin, lead, member, client, super_admin) align with docs/user-stories-auth.md role definitions.
  • ROPC rejection is well-reasoned: cites OAuth 2.1 (RFC 9700) deprecation, platform convention (SOP explicitly forbids ROPC), feature loss, and turbo-ios workarounds. The basketball-api ROPC usage is correctly identified as admin-to-admin (master realm), not end-user auth.
  • Secrets wiring follows the established kubectl create secret generic ... --dry-run=client -o yaml | kubectl apply -f - pattern documented in the SOP.
  • Phase ordering is logically sound: Terraform first, NetworkPolicy second, users third, secrets fourth, deployment overlay fifth, Rails implementation last.
  • Follow-up tickets are well-scoped: 6 tickets across 3 repos with clear dependency ordering.

BLOCKERS

None.

NITS

  1. app-architecture.md line 20 still says "HTTP POST (ROPC auth)" in its Mermaid diagram for the Rails-to-Keycloak connection. This contradicts the spike doc's ROPC rejection. Not introduced by this PR and not in scope, but worth a follow-up to update that diagram label to reflect Authorization Code + PKCE. Consider adding this to ticket #115 or creating a separate doc-fix ticket.

  2. brute_force_protected = true listed as "(default)" in the realm settings rationale, but it does not appear in the proposed k3s.tfvars block. If this is truly a Keycloak realm default, that is fine -- but it may be worth a comment in the tfvars block when the actual Terraform change is made to confirm the assumption.

  3. Test user table: lucas-super-admin has roles "admin + super_admin" but user-stories-auth.md describes super_admin as "separate from business roles" with Lucas having "both admin and super_admin." The table is consistent with the auth doc, but the fact that one test user holds two roles while others hold one could be called out more explicitly (e.g., "compound role assignment for testing cross-role behavior").

SOP COMPLIANCE

  • Branch named after issue (132-spike-keycloak-programmatic-configuratio -- truncated but clearly issue-132-derived)
  • PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related)
  • Related references closing issue (Closes #132)
  • No secrets committed (Keycloak URL is a Tailscale hostname, not a secret; client secret placeholder is <from-keycloak>)
  • No unnecessary file changes (single file: docs/keycloak-setup.md, 354 additions, 0 deletions)
  • Scope matches issue (spike doc covering Keycloak programmatic configuration and architecture validation)

PROCESS OBSERVATIONS

  • Clean spike output: single doc file, no code changes, no infrastructure modifications. Low change-failure risk.
  • The spike effectively de-risks the auth implementation by validating the approach before any code is written. The ROPC rejection is particularly valuable -- it prevents a security anti-pattern from being adopted.
  • Six follow-up tickets provide a clear execution path across three repos with dependency ordering. Good decomposition.
  • The cross-repo changes table and follow-up tickets are now consistent (both include pal-e-platform NetworkPolicy work).

VERDICT: APPROVED

## PR #133 Review (Re-Review) ### PREVIOUS BLOCKERS -- RESOLVED **1. NetworkPolicy step was missing -- FIXED** - Phase 1b added with clear instructions: edit `network-policies.tf`, add namespace, run `tofu apply`. - `pal-e-platform` row added to the cross-repo changes table. - Follow-up ticket #6 explicitly covers the NetworkPolicy change in `pal-e-platform`. - Cross-verified against the live SOP (`sop-keycloak-client-creation`), which documents this exact requirement under its "Network Policy" section. The spike doc's coverage now matches. **2. Stale SOP follow-up -- FIXED** - The stale recommendation to update the SOP has been replaced with: "Note: The SOP (sop-keycloak-client-creation) has already been updated to reflect Terraform as the primary path -- no follow-up needed there." - Cross-verified: the live SOP in pal-e-docs does already cover Terraform as the primary path (Path A), with the admin console as secondary for user management. The spike doc's claim is accurate. ### DOMAIN REVIEW **Tech stack: Documentation / Terraform / Kubernetes / Keycloak OIDC** Document accuracy checks performed: - **Realm config** matches the SOP template structure in `k3s.tfvars` (correct keys: `display_name`, `registration_allowed`, `reset_password_allowed`, `login_with_email_allowed`, `roles`). - **Client config** matches the SOP template and the `pal-enterprises` confidential client pattern (correct keys: `realm_key`, `client_id`, `public_client`, `standard_flow_enabled`, `direct_access_grants_enabled`, `pkce_code_challenge_method`, etc.). - **Five roles** (admin, lead, member, client, super_admin) align with `docs/user-stories-auth.md` role definitions. - **ROPC rejection** is well-reasoned: cites OAuth 2.1 (RFC 9700) deprecation, platform convention (SOP explicitly forbids ROPC), feature loss, and turbo-ios workarounds. The basketball-api ROPC usage is correctly identified as admin-to-admin (master realm), not end-user auth. - **Secrets wiring** follows the established `kubectl create secret generic ... --dry-run=client -o yaml | kubectl apply -f -` pattern documented in the SOP. - **Phase ordering** is logically sound: Terraform first, NetworkPolicy second, users third, secrets fourth, deployment overlay fifth, Rails implementation last. - **Follow-up tickets** are well-scoped: 6 tickets across 3 repos with clear dependency ordering. ### BLOCKERS None. ### NITS 1. **`app-architecture.md` line 20 still says `"HTTP POST (ROPC auth)"`** in its Mermaid diagram for the Rails-to-Keycloak connection. This contradicts the spike doc's ROPC rejection. Not introduced by this PR and not in scope, but worth a follow-up to update that diagram label to reflect Authorization Code + PKCE. Consider adding this to ticket #115 or creating a separate doc-fix ticket. 2. **`brute_force_protected = true` listed as "(default)"** in the realm settings rationale, but it does not appear in the proposed `k3s.tfvars` block. If this is truly a Keycloak realm default, that is fine -- but it may be worth a comment in the tfvars block when the actual Terraform change is made to confirm the assumption. 3. **Test user table**: `lucas-super-admin` has roles "admin + super_admin" but `user-stories-auth.md` describes super_admin as "separate from business roles" with Lucas having "both admin and super_admin." The table is consistent with the auth doc, but the fact that one test user holds two roles while others hold one could be called out more explicitly (e.g., "compound role assignment for testing cross-role behavior"). ### SOP COMPLIANCE - [x] Branch named after issue (`132-spike-keycloak-programmatic-configuratio` -- truncated but clearly issue-132-derived) - [x] PR body follows template (Summary, Changes, Test Plan, Review Checklist, Related) - [x] Related references closing issue (`Closes #132`) - [x] No secrets committed (Keycloak URL is a Tailscale hostname, not a secret; client secret placeholder is `<from-keycloak>`) - [x] No unnecessary file changes (single file: `docs/keycloak-setup.md`, 354 additions, 0 deletions) - [x] Scope matches issue (spike doc covering Keycloak programmatic configuration and architecture validation) ### PROCESS OBSERVATIONS - Clean spike output: single doc file, no code changes, no infrastructure modifications. Low change-failure risk. - The spike effectively de-risks the auth implementation by validating the approach before any code is written. The ROPC rejection is particularly valuable -- it prevents a security anti-pattern from being adopted. - Six follow-up tickets provide a clear execution path across three repos with dependency ordering. Good decomposition. - The cross-repo changes table and follow-up tickets are now consistent (both include pal-e-platform NetworkPolicy work). ### VERDICT: APPROVED
Fix QA nits: update ROPC refs, brute_force_protected, compound roles
Some checks failed
ci/woodpecker/push/woodpecker Pipeline was successful
ci/woodpecker/pr/woodpecker Pipeline was successful
CI / scan_ruby (pull_request) Has been cancelled
CI / scan_js (pull_request) Has been cancelled
CI / lint (pull_request) Has been cancelled
20ef7576d8
- app-architecture.md: replace all ROPC references with Authorization
  Code + PKCE flow (mermaid diagram, auth section, file structure,
  design decisions)
- keycloak-setup.md: add brute_force_protected default verification note
- keycloak-setup.md: expand lucas-super-admin compound role assignment
  with explicit instructions for dual-role mapping

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author
Owner

PR #133 Review (Re-review Pass 3)

NIT FIX VERIFICATION

All three nits from the previous review have been properly resolved:

1. ROPC references in app-architecture.md -- FIXED

All 6 ROPC references have been replaced with Authorization Code + PKCE flow:

  • Line 20: "HTTP POST (ROPC auth)" replaced with "OIDC (Auth Code + PKCE)" in Mermaid diagram
  • Line 140: Auth section rewritten from "Keycloak with Direct Access Grants (ROPC)" to "Keycloak with Authorization Code flow + PKCE (S256)"
  • Lines 148-156: Mermaid sequence diagram completely rewritten. Old 5-step ROPC flow (POST /login, POST token endpoint) replaced with 8-step Auth Code flow (redirect to KC, callback with code, server-side token exchange)
  • Line 160: Client description updated from "confidential, ROPC enabled" to "confidential, standard flow + PKCE S256"
  • Line 232: sessions_controller.rb comment updated from "Login/logout via Keycloak ROPC" to "OmniAuth callback + logout"
  • Line 263: keycloak_auth_service.rb replaced with comment "Auth handled by omniauth_openid_connect gem (no custom service needed)"
  • Line 282: Design decision #6 rewritten from "ROPC over redirect flow" to "Standard OIDC flow (Auth Code + PKCE)" with rationale referencing OAuth 2.1 deprecation and platform SOP

Verdict: Thoroughly fixed. No stale ROPC references remain in either file.

2. brute_force_protected assumption -- FIXED

Lines 85-86 of keycloak-setup.md now contain a verification note:

brute_force_protected: not shown in tfvars (defaults to true in the Keycloak provider). Verify this default holds when applying -- if not, set it explicitly.

This is followed by a separate bullet confirming the security baseline. The implementer is now prompted to verify the default rather than blindly trusting it.

Verdict: Fixed. The actionable verification clause is present.

3. Compound role assignment for lucas-super-admin -- FIXED

The test users table (line 141) now has an expanded description for lucas-super-admin:

Platform owner -- gets BOTH roles. super_admin grants Platform view (feature flags, system settings). admin grants full business access (Crew tab, Week tab). This user exercises the compound role path.

Lines 147-148 add explicit instructions:

lucas-super-admin is the only user with multiple realm roles. In the Keycloak admin console, assign both roles via Role mapping -> Assign role -> select admin AND super_admin. The Rails app reads realm_access.roles from the ID token, which returns an array -- role checks use roles.include?("super_admin").

Verdict: Fixed. The dual-role assignment is now unambiguous with step-by-step Keycloak console instructions and the Rails-side array semantics explained.

DOMAIN REVIEW

Tech stack: Documentation (Markdown with Mermaid diagrams, HCL code blocks, YAML snippets, Bash commands). Domain expertise applied: Keycloak/OIDC, Terraform/IaC, Kubernetes secrets management, Rails auth patterns.

keycloak-setup.md (new, 357 lines):

  • Well-structured spike output. Covers current state, recommended approach, realm design, ROPC analysis, secrets wiring, step-by-step instructions, cross-repo changes, and follow-up tickets.
  • Terraform HCL blocks are syntactically valid and follow the existing k3s.tfvars patterns described.
  • The ROPC analysis is thorough and correctly cites OAuth 2.1 (RFC 9700) deprecation.
  • Secrets wiring correctly identifies the kubectl create secret generic ... --dry-run=client -o yaml | kubectl apply -f - pattern for secret updates.
  • NetworkPolicy phase (1b) is a good catch -- without it, pods cannot reach Keycloak.
  • The westside-app legacy direct_access_grants_enabled = true observation is accurate and properly contextualized.

app-architecture.md (6 changed regions):

  • Auth section now internally consistent: the Mermaid diagram, prose, file structure, and design decisions all reference Auth Code + PKCE.
  • The sequence diagram is correct for a standard OIDC Authorization Code flow with server-side token exchange.
  • File structure correctly removes keycloak_auth_service.rb and updates sessions_controller.rb description.
  • Design decision #6 rationale is well-written and forward-looking (turbo-ios handling, custom theme planned).

BLOCKERS

None. This is a documentation-only PR (spike output + architecture doc updates). No code, no credentials, no security concerns.

NITS

  1. README not updated: docs/keycloak-setup.md is not listed in the README Documentation table. Every other doc in docs/ has a row there. Suggested row: | [Keycloak Setup](docs/keycloak-setup.md) | Keycloak realm/client configuration, auth flow, secrets wiring |

  2. Redundant brute_force_protected bullets (keycloak-setup.md lines 85-86): There are two consecutive bullets about brute_force_protected -- the first is the verification note (good), the second just restates "brute_force_protected = true (default): security baseline." These could be consolidated into a single bullet that combines both the default assumption and the verification instruction.

  3. Client settings rationale parenthetical (keycloak-setup.md line 129): "The user logs in via the Keycloak login page (or a Rails-rendered login page that posts to Keycloak)" -- the parenthetical about a Rails-rendered login page posting to Keycloak is a bit misleading for a standard flow description. In Auth Code flow, a Rails page would initiate a redirect to Keycloak, not POST to it. Minor wording nit.

SOP COMPLIANCE

  • Branch named after issue: 132-spike-keycloak-programmatic-configuratio references issue #132 (truncated but valid)
  • PR body follows template: Summary, Changes, Test Plan, Review Checklist, Related sections all present
  • Related references issue: Closes #132 present
  • No secrets committed: No credentials, tokens, or .env files in the diff. URLs are Tailscale internal hostnames, not secret.
  • No scope creep: Changes limited to docs/keycloak-setup.md (new) and docs/app-architecture.md (ROPC cleanup) -- both directly relevant to the spike

PROCESS OBSERVATIONS

  • This is the third review pass. All three prior nits have been resolved cleanly. The fix quality is high -- each fix addresses the root concern, not just the surface symptom.
  • The spike doc is comprehensive enough to serve as the implementation spec for 6 follow-up tickets across 4 repos. This is a good example of a spike producing actionable output.
  • The README nit (nit #1) is worth addressing to maintain the doc index, but it is not blocking.

VERDICT: APPROVED

## PR #133 Review (Re-review Pass 3) ### NIT FIX VERIFICATION All three nits from the previous review have been properly resolved: **1. ROPC references in app-architecture.md -- FIXED** All 6 ROPC references have been replaced with Authorization Code + PKCE flow: - Line 20: `"HTTP POST (ROPC auth)"` replaced with `"OIDC (Auth Code + PKCE)"` in Mermaid diagram - Line 140: Auth section rewritten from "Keycloak with Direct Access Grants (ROPC)" to "Keycloak with Authorization Code flow + PKCE (S256)" - Lines 148-156: Mermaid sequence diagram completely rewritten. Old 5-step ROPC flow (POST /login, POST token endpoint) replaced with 8-step Auth Code flow (redirect to KC, callback with code, server-side token exchange) - Line 160: Client description updated from "confidential, ROPC enabled" to "confidential, standard flow + PKCE S256" - Line 232: `sessions_controller.rb` comment updated from "Login/logout via Keycloak ROPC" to "OmniAuth callback + logout" - Line 263: `keycloak_auth_service.rb` replaced with comment "Auth handled by omniauth_openid_connect gem (no custom service needed)" - Line 282: Design decision #6 rewritten from "ROPC over redirect flow" to "Standard OIDC flow (Auth Code + PKCE)" with rationale referencing OAuth 2.1 deprecation and platform SOP Verdict: Thoroughly fixed. No stale ROPC references remain in either file. **2. brute_force_protected assumption -- FIXED** Lines 85-86 of `keycloak-setup.md` now contain a verification note: > `brute_force_protected`: not shown in tfvars (defaults to `true` in the Keycloak provider). Verify this default holds when applying -- if not, set it explicitly. This is followed by a separate bullet confirming the security baseline. The implementer is now prompted to verify the default rather than blindly trusting it. Verdict: Fixed. The actionable verification clause is present. **3. Compound role assignment for lucas-super-admin -- FIXED** The test users table (line 141) now has an expanded description for `lucas-super-admin`: > Platform owner -- gets BOTH roles. `super_admin` grants Platform view (feature flags, system settings). `admin` grants full business access (Crew tab, Week tab). This user exercises the compound role path. Lines 147-148 add explicit instructions: > `lucas-super-admin` is the only user with multiple realm roles. In the Keycloak admin console, assign both roles via Role mapping -> Assign role -> select `admin` AND `super_admin`. The Rails app reads `realm_access.roles` from the ID token, which returns an array -- role checks use `roles.include?("super_admin")`. Verdict: Fixed. The dual-role assignment is now unambiguous with step-by-step Keycloak console instructions and the Rails-side array semantics explained. ### DOMAIN REVIEW **Tech stack**: Documentation (Markdown with Mermaid diagrams, HCL code blocks, YAML snippets, Bash commands). Domain expertise applied: Keycloak/OIDC, Terraform/IaC, Kubernetes secrets management, Rails auth patterns. **keycloak-setup.md (new, 357 lines)**: - Well-structured spike output. Covers current state, recommended approach, realm design, ROPC analysis, secrets wiring, step-by-step instructions, cross-repo changes, and follow-up tickets. - Terraform HCL blocks are syntactically valid and follow the existing `k3s.tfvars` patterns described. - The ROPC analysis is thorough and correctly cites OAuth 2.1 (RFC 9700) deprecation. - Secrets wiring correctly identifies the `kubectl create secret generic ... --dry-run=client -o yaml | kubectl apply -f -` pattern for secret updates. - NetworkPolicy phase (1b) is a good catch -- without it, pods cannot reach Keycloak. - The `westside-app` legacy `direct_access_grants_enabled = true` observation is accurate and properly contextualized. **app-architecture.md (6 changed regions)**: - Auth section now internally consistent: the Mermaid diagram, prose, file structure, and design decisions all reference Auth Code + PKCE. - The sequence diagram is correct for a standard OIDC Authorization Code flow with server-side token exchange. - File structure correctly removes `keycloak_auth_service.rb` and updates `sessions_controller.rb` description. - Design decision #6 rationale is well-written and forward-looking (turbo-ios handling, custom theme planned). ### BLOCKERS None. This is a documentation-only PR (spike output + architecture doc updates). No code, no credentials, no security concerns. ### NITS 1. **README not updated**: `docs/keycloak-setup.md` is not listed in the README Documentation table. Every other doc in `docs/` has a row there. Suggested row: `| [Keycloak Setup](docs/keycloak-setup.md) | Keycloak realm/client configuration, auth flow, secrets wiring |` 2. **Redundant brute_force_protected bullets** (keycloak-setup.md lines 85-86): There are two consecutive bullets about `brute_force_protected` -- the first is the verification note (good), the second just restates "`brute_force_protected = true` (default): security baseline." These could be consolidated into a single bullet that combines both the default assumption and the verification instruction. 3. **Client settings rationale parenthetical** (keycloak-setup.md line 129): "The user logs in via the Keycloak login page (or a Rails-rendered login page that posts to Keycloak)" -- the parenthetical about a Rails-rendered login page posting to Keycloak is a bit misleading for a standard flow description. In Auth Code flow, a Rails page would initiate a redirect to Keycloak, not POST to it. Minor wording nit. ### SOP COMPLIANCE - [x] Branch named after issue: `132-spike-keycloak-programmatic-configuratio` references issue #132 (truncated but valid) - [x] PR body follows template: Summary, Changes, Test Plan, Review Checklist, Related sections all present - [x] Related references issue: `Closes #132` present - [x] No secrets committed: No credentials, tokens, or `.env` files in the diff. URLs are Tailscale internal hostnames, not secret. - [x] No scope creep: Changes limited to `docs/keycloak-setup.md` (new) and `docs/app-architecture.md` (ROPC cleanup) -- both directly relevant to the spike ### PROCESS OBSERVATIONS - This is the third review pass. All three prior nits have been resolved cleanly. The fix quality is high -- each fix addresses the root concern, not just the surface symptom. - The spike doc is comprehensive enough to serve as the implementation spec for 6 follow-up tickets across 4 repos. This is a good example of a spike producing actionable output. - The README nit (nit #1) is worth addressing to maintain the doc index, but it is not blocking. ### VERDICT: APPROVED
ldraney deleted branch 132-spike-keycloak-programmatic-configuratio 2026-06-06 22:54:23 +00:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
ldraney/landscaping-assistant!133
No description provided.