Bug: Alembic migration chain stuck — 020 applied but not stamped, app down #184

Closed
opened 2026-03-27 06:04:33 +00:00 by forgejo_admin · 1 comment

Type

Bug

Lineage

Discovered during westside deploy session. App is down — pod crash-loops on migration 020 (column already exists).

Repo

forgejo_admin/basketball-api

What Broke

Alembic migration 020 (custom_notes column) was applied to the database but the alembic_version table was never stamped past 018. After the chain was linearized to 018→019→020→..., alembic tries to run 020 again and fails because the column already exists.

Migration state:

| Migration | Expected | DB State |
|-----------|----------|----------|
| 019 player_teams | player_teams table | Not applied (table missing) |
| 020 custom_notes | custom_notes column | Applied but NOT stamped |
| 021 oauth_tokens | oauth_tokens table | Not applied |
| 022 merge_heads | passthrough | Not applied |
| 023 jersey backfill | data only | Not applied |
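
The drift in this table (stamped revision behind the real schema) can be detected by comparing `alembic_version` with what actually exists. A minimal sketch, assuming a SQLite stand-in for the real database; the `players` table name and the literal revision string `'018'` are hypothetical, since the issue names neither the table that holds `custom_notes` nor the full revision ids.

```python
import sqlite3


def stamped_revision(conn):
    """Return the revision recorded in alembic_version, or None if unstamped."""
    row = conn.execute("SELECT version_num FROM alembic_version").fetchone()
    return row[0] if row else None


def has_table(conn, table):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def has_column(conn, table, column):
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return any(r[1] == column for r in conn.execute(f"PRAGMA table_info({table})"))


def build_drifted_db():
    """Recreate the reported state: stamped at 018, custom_notes already present."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
    conn.execute("INSERT INTO alembic_version VALUES ('018')")  # hypothetical id
    # 'players' is a hypothetical home for the custom_notes column.
    conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, custom_notes TEXT)")
    return conn


def report(conn):
    """Summarise stamped revision versus the objects each migration would create."""
    return {
        "stamped": stamped_revision(conn),
        "019_player_teams_table": has_table(conn, "player_teams"),
        "020_custom_notes_column": has_column(conn, "players", "custom_notes"),
        "021_oauth_tokens_table": has_table(conn, "oauth_tokens"),
    }
```

A `True` for an object whose migration sits after the stamped revision is the "applied but not stamped" case.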

Repro Steps

  1. Pod starts, runs alembic upgrade
  2. Alembic sees current version = 018
  3. Runs 019 (succeeds — player_teams doesn't exist)
  4. Runs 020 (FAILS — custom_notes column already exists)
  5. Pod crash-loops
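
The failure in step 4 can be reproduced locally without the cluster. This is only an illustration on in-memory SQLite, with plain SQL standing in for what migrations 019 and 020 would emit; the `players` table and the `player_teams` columns are hypothetical, and the production database will word the error differently.

```python
import sqlite3


def run_chain(conn, applied):
    """Apply 019 then 020 the way a non-idempotent chain would."""
    # 019: the table is missing, so this succeeds.
    conn.execute("CREATE TABLE player_teams (player_id INTEGER, team_id INTEGER)")
    applied.append("019")
    # 020: unconditional ADD COLUMN, fails if the column is already there.
    conn.execute("ALTER TABLE players ADD COLUMN custom_notes TEXT")
    applied.append("020")


def reproduce():
    """Return (revisions applied before the failure, error message or None)."""
    conn = sqlite3.connect(":memory:")
    # Current state: custom_notes exists although the DB is stamped at 018.
    conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, custom_notes TEXT)")
    applied = []
    try:
        run_chain(conn, applied)
    except sqlite3.OperationalError as exc:
        return applied, str(exc)
    return applied, None
```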

Expected Behavior

Migration 020 should be idempotent: it should check whether the column exists before adding it. The alternative, stamping alembic to 020 so the already-applied migration is skipped, is ruled out under Constraints.

Environment

  • Cluster/namespace: basketball-api
  • Current alembic version stamped: 018
  • Actual DB state: 020 applied (column exists), 019/021/022/023 not applied

Acceptance Criteria

  • Migration 020 is idempotent (checks column existence before ALTER TABLE)
  • Alembic runs cleanly from 018 through 023
  • App starts without crash-loop
  • All 5 migrations (019-023) applied and stamped

Test Expectations

  • pytest tests/ passes (existing 555 tests)
  • Alembic upgrade runs cleanly on a fresh DB
  • Alembic upgrade runs cleanly on a DB with 020 pre-applied (the current state)
  • Run command: pytest tests/ -x

Constraints

  • Do NOT stamp the DB manually — fix the migration code to be idempotent
  • Do NOT alter prod DB directly — migration goes through the normal deploy pipeline
  • The app is DOWN — this is high priority

Checklist

  • PR opened
  • Tests pass
  • No unrelated changes

Related

  • project-westside-basketball
  • forgejo_admin/pal-e-platform#184 (incident context)
Author
Owner

Scope Review: READY

Review note: review-449-2026-03-26
Scope is solid. All bug template sections present, file targets verified (020 confirmed non-idempotent add_column), migration chain 018-023 validated, no blocking dependencies. Acceptance criteria are agent-verifiable.

  • Blast radius note: 30+ non-idempotent add_column calls across basketball-api migrations; same pattern in mcd-tracker-api. Consider follow-up hardening ticket.
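
For the follow-up hardening ticket, a rough way to list candidate call sites is to walk each migration file's AST and collect `op.add_column(...)` calls. This is a sketch only: it reports every such call whether or not it is already guarded, and it assumes migrations refer to the operations object as `op`.

```python
import ast


def find_add_column_calls(source, filename="<migration>"):
    """Return the line numbers of every op.add_column(...) call in `source`."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_column"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "op"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical migration body used only to exercise the scanner.
SAMPLE = '''\
from alembic import op
import sqlalchemy as sa


def upgrade():
    op.add_column("players", sa.Column("custom_notes", sa.Text()))


def downgrade():
    op.drop_column("players", "custom_notes")
'''
```

Running `find_add_column_calls(path.read_text(), str(path))` over each file in the migrations directory gives a per-file list to review by hand.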