Spike: Feature flag strategy (gate features, rollback, gradual rollout)

ldraney commented

2026-06-06 21:09:28 +00:00

Owner

Type

Spike

Lineage

Cross-cutting concern. Should land early so every feature built after this can be gated. Standalone — emerged from operational need to toggle features without redeploying.

Repo

ldraney/landscaping-assistant

Question

What is the simplest feature flag approach that lets us turn features off in prod without redeploying, and how does it compose with the four-role permission model?

Philosophy — what gets flagged? Every new feature? Only risky/external integrations (Stripe, Keycloak)? Only user-facing UI? Define the rule so it's consistent.
Implementation approach — evaluate: ENV vars + restart (simplest), database-backed FeatureFlag model (no restart), Flipper gem (per-user/role/percentage — possibly overkill), single YAML/JSON config (middle ground).
Minimum bar — "turn it off in prod without redeploying." Which approaches meet this?
Granularity — global on/off? Per-role? Per-user? Per-business (when multi-tenancy lands)?
Multi-tenancy readiness — should a FeatureFlag model be scoped to Business from the start, or retrofitted in Phase 4? The Business entity is already defined in docs/user-stories-auth.md.
Admin UI — does admin get a feature flags panel, or is this developer-only (console)?
Flag lifecycle — when does a flag get removed? After N days stable? Who cleans up stale flags?
Composition with roles — feature flags gate availability, roles gate permissions. Example: Stripe flagged off globally → no one sees payment links, even admin. How do views/controllers check both?
Turbo Streams interaction — the app uses Hotwire with Turbo Streams for real-time updates. When a flag is toggled, how do stale Turbo frames/broadcasts get handled for connected clients?
Testing — how do specs handle flags? Always on? Always off? Parameterized?
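
As a starting point for the composition question, here is a minimal plain-Ruby sketch of checking both layers: the flag decides whether a feature is available at all, and the role decides whether this user may use it. Everything in it is an assumption for illustration: `FeatureFlags`, `PERMISSIONS`, `can_use?`, the `:crew` role, and the `:send_payment_link` permission are hypothetical names, not code from the repo. The real role matrix lives in docs/user-stories-auth.md.

```ruby
# Hypothetical sketch: none of these names exist in the repo.
# Flags answer "is the feature available at all?"; roles answer "may this user do it?".
class FeatureFlags
  def initialize(states = {})
    @states = states
  end

  # Unknown flags default to off, so a typo cannot silently enable a feature.
  def enabled?(name)
    @states.fetch(name.to_sym, false)
  end

  def set(name, value)
    @states[name.to_sym] = value ? true : false
  end
end

# Illustrative permission table (assumed roles and permissions).
PERMISSIONS = {
  admin: [:send_payment_link],
  crew:  []
}.freeze

# A feature is usable only when the flag is on AND the role grants the permission.
def can_use?(flags, role, feature, permission)
  flags.enabled?(feature) && PERMISSIONS.fetch(role, []).include?(permission)
end

flags = FeatureFlags.new(stripe: false)
puts can_use?(flags, :admin, :stripe, :send_payment_link)  # flag off: false, even for admin
flags.set(:stripe, true)
puts can_use?(flags, :admin, :stripe, :send_payment_link)  # flag on, role allowed: true
puts can_use?(flags, :crew, :stripe, :send_payment_link)   # flag on, role lacks permission: false
```

Checking the flag first keeps the "Stripe flagged off globally, no one sees payment links" rule independent of the permission table. Whether the check belongs in a helper, a controller concern, or a policy object is left for the spike to decide.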

Deliverables

Required outputs:

docs/feature-flags.md created — philosophy + chosen approach, with rationale. Merged via docs-only PR.
At least one implementation ticket created from decisions made. Existing tickets (#125, #123) updated if they need flag gating noted.

Time-box

1 session. Favor decision speed — this is V1, we can evolve the approach later. If time-box expires, document findings in the docs file and escalate remaining questions to Lucas for direction.

Related

landscaping-assistant — project slug
#125 — Stripe integration (first candidate for feature flagging)
#123 — Client request UI (second candidate)
#107 — Auth/roles (flags compose with role-based access)
#115 — Keycloak login (potential flag candidate for auth switchover)
docs/user-stories-auth.md — role permission matrix that flags interact with


ldraney commented

2026-06-06 21:18:00 +00:00

Author

Owner

Issue #129 Spike Template Review

TEMPLATE CONFORMANCE

Checked against template-issue-spike from pal-e-docs. Section-by-section:

Section	Status	Notes
Type	PASS	"Spike" -- correct
Lineage	PASS	"Standalone -- emerged from operational need" -- well framed
Repo	PASS	`ldraney/landscaping-assistant` -- correct
Question	PASS	Clear framing question with 8 sub-questions
Deliverables	NIT	See below
Time-box	NIT	See below
Related	NIT	See below

Deliverables -- missing checkbox format. The template specifies checkboxes (- [ ]) for both required deliverables. The issue uses plain bullet points (*). This is cosmetic but matters for trackability -- Forgejo renders checkboxes as interactive task lists.

Current:

* docs/feature-flags.md created...
* At least one implementation ticket created...

Should be:

- [ ] docs/feature-flags.md created...
- [ ] At least one implementation ticket created...

Time-box -- missing escalation clause. Template requires: "If time-box expires without answer: close spike, document findings in the docs file, escalate to Lucas for direction." The issue says only "1 session. Favor decision speed -- this is V1, we can evolve the approach later." The pragmatic note is good but the escalation path is absent. Since this is a solo-developer project and "escalate to Lucas" would mean escalating to yourself, this is a minor nit -- but the template exists to keep spike discipline consistent.

Related -- missing project slug. Template requires a project-slug line (e.g., landscaping-assistant -- the project this affects). The issue lists issue references and a doc file path but no pal-e-docs project slug. Add:

- landscaping-assistant -- project this affects

CONTENT QUALITY

Framing question: strong. "What is the simplest feature flag approach that lets us turn features off in prod without redeploying, and how does it compose with the four-role permission model?" -- this is clear, answerable, and correctly scoped as a "which approach" question per template guidance.

Sub-questions: comprehensive and well-ordered. The 8 sub-questions cover:

1. Philosophy (what gets flagged)
2. Implementation approaches (ENV, DB model, Flipper, YAML)
3. Minimum bar (no-redeploy toggle)
4. Granularity (global/role/user/business)
5. Admin UI (developer-only vs admin panel)
6. Flag lifecycle (cleanup policy)
7. Composition with roles (how flags + roles interact in views/controllers)
8. Testing (spec strategy for flags)

This is thorough without being bloated. The composition-with-roles question (#7) is the most important one and correctly references the interaction pattern.

One missing sub-question: multi-tenancy interaction. The granularity bullet mentions "Per-business (when multi-tenancy lands)" but does not dig into it as a sub-question. Given that the data model in docs/user-stories-auth.md defines Business as the multi-tenancy entity and the delivery phases show multi-tenancy landing in Phase 4, a sub-question worth adding:

Multi-tenancy readiness -- if flags are database-backed, should the FeatureFlag model be scoped to Business from the start, or added later? What is the migration cost of retrofitting per-business scoping?

This is not a blocker -- the spike can discover this organically -- but explicitly calling it out would ensure it is not missed.
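
To make the retrofit question concrete, here is one shape per-business scoping could take, sketched in plain Ruby with an in-memory store. It assumes a nullable business reference, where `nil` marks the global row and a business-specific row overrides it. `FlagStore`, its method names, and the business ids are hypothetical. A real `FeatureFlag` model would hold the same two-level lookup in a database table.

```ruby
# Hypothetical sketch: FlagStore and its keys are illustrative, not existing code.
# A nil business_id marks the global row; a business-specific row overrides it.
class FlagStore
  def initialize
    @rows = {} # { [name, business_id] => boolean }
  end

  def set(name, enabled, business_id: nil)
    @rows[[name.to_sym, business_id]] = enabled
  end

  def enabled?(name, business_id: nil)
    key = name.to_sym
    # Prefer the business override, then the global row, then off.
    @rows.fetch([key, business_id]) { @rows.fetch([key, nil], false) }
  end
end

store = FlagStore.new
store.set(:stripe, true)                    # global default: on
store.set(:stripe, false, business_id: 42)  # business 42 opts out

puts store.enabled?(:stripe, business_id: 42) # override wins: false
puts store.enabled?(:stripe, business_id: 7)  # falls back to global: true
puts store.enabled?(:stripe)                  # global row: true
```

With this shape, a V1 that only writes global rows would not need a data migration when multi-tenancy lands, since per-business rows are additive. That is one data point for the spike, not a conclusion.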

Another angle worth considering: Hotwire/Turbo interaction. The app uses Hotwire with Turbo Streams (per the conflict handling section of user-stories-auth.md -- "admin sees the change in real-time via Turbo Streams"). Feature flags that gate UI elements need to interact correctly with Turbo frame updates. If a flag is toggled while a user has the page open, Turbo Stream broadcasts could show/hide elements that the flag state no longer permits. Worth a sub-question:

Turbo Streams interaction -- when a flag toggles, do Turbo Stream broadcasts respect the new flag state, or could stale frames show gated features?

Again, not a blocker for the spike ticket itself -- these are "nice to have" investigation angles.

Deliverables: concrete and verifiable. The docs file target (docs/feature-flags.md) is specific. The follow-up ticket deliverable correctly calls out updating existing tickets (#125, #123) if they need flag gating. Good.

Time-box: realistic. "1 session" is appropriate for a decision-focused spike that evaluates 4 known approaches against clear criteria. The "favor decision speed" note aligns with spike philosophy.

Related references: all verified. All four referenced issues exist and are open:

#125 (Stripe integration) -- correct, first flag candidate
#123 (Client request UI) -- correct, second flag candidate
#107 (Auth/roles) -- correct, flags compose with roles
#115 (Keycloak login) -- correct, auth switchover candidate

The doc file reference (docs/user-stories-auth.md) exists and contains the role permission matrix that flags would interact with.

SOP COMPLIANCE

- [x] All required template sections present (Type, Lineage, Repo, Question, Deliverables, Time-box, Related)
- [ ] Deliverables use checkbox format (uses plain bullets instead)
- [ ] Time-box includes escalation clause
- [ ] Related includes project slug (landscaping-assistant)
- [x] No secrets or credentials in issue body
- [x] Referenced issues all exist and are open
- [x] Framing question is answerable (yes -- "which approach" format)
- [x] Deliverables include both required artifacts (docs file + follow-up tickets)

VERDICT: PASS (with nits)

The spike is well-written and ready to execute. The three template conformance nits are minor formatting/completeness issues, not structural problems. The content quality is strong -- the framing question is clear, sub-questions are comprehensive, and the composition-with-roles angle correctly identifies the key design tension.

Required fixes (template conformance):

1. Convert deliverable bullets to checkbox format (- [ ])
2. Add escalation clause to Time-box (or explicitly note it is N/A for solo-dev)
3. Add landscaping-assistant project slug to Related

Optional improvements (content quality):

1. Add sub-question on multi-tenancy readiness (per-Business flag scoping)
2. Add sub-question on Turbo Streams interaction with flag state changes
