Outbox processor dead since March 25 — 23 unsent welcome emails #402
Type
Bug
Lineage
Standalone — discovered during post-signing user flow investigation (2026-04-08).
Repo
forgejo_admin/basketball-api
What Broke
The outbox processor stopped processing contract_signed events on 2026-03-25. Since then, 23 welcome emails have not been sent (21 pending, 2 failed). Every family that signed a contract in the last two weeks got radio silence: no welcome email, no GroupMe invite link.
Root cause: There is no Kubernetes CronJob deployed to poll the outbox. The only processing path is a fire-and-forget HTTP ping from westside-contracts (POST /admin/process-outbox?tenant_id=1), which silently fails with no alerting. The two failed events (David Kaneko 2026-03-26, Tristen Thorn 2026-03-27) suggest the Gmail OAuth token expired around that time (Google app in Testing mode, 7-day token expiry). After those failures, subsequent events were never even attempted.
Current state (as of 2026-04-08):
Gmail OAuth is currently healthy — announcement emails sent successfully today (2026-04-08).
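
The root cause above is a ping that fails silently. As a rough illustration of the opposite behavior, here is a minimal sketch of a trigger script that a scheduled job could run and that exits non-zero when the call fails. Only the endpoint path and `tenant_id=1` come from this issue; the function names, the base URL, and the timeout are hypothetical, and any authentication the admin endpoint requires is left out.

```python
import sys
import urllib.error
import urllib.parse
import urllib.request


def build_url(base_url: str, tenant_id: int) -> str:
    """Build the process-outbox URL (path and query taken from this issue)."""
    query = urllib.parse.urlencode({"tenant_id": tenant_id})
    return f"{base_url.rstrip('/')}/admin/process-outbox?{query}"


def trigger_outbox(base_url, tenant_id, opener=urllib.request.urlopen, timeout=30.0):
    """POST to the endpoint and return the HTTP status.

    Raises instead of swallowing errors, so the caller can fail loudly.
    `opener` is injectable only to make the sketch testable offline.
    """
    request = urllib.request.Request(
        build_url(base_url, tenant_id), data=b"", method="POST"
    )
    with opener(request, timeout=timeout) as response:
        status = response.status
    if not 200 <= status < 300:
        raise RuntimeError(f"process-outbox returned HTTP {status}")
    return status


def main(argv, opener=urllib.request.urlopen):
    """Return a process exit code: 0 on success, 1 on any failure."""
    # Hypothetical in-cluster address; replace with the real service URL.
    base_url = argv[1] if len(argv) > 1 else "http://basketball-api:8000"
    try:
        status = trigger_outbox(base_url, tenant_id=1, opener=opener)
    except (urllib.error.URLError, RuntimeError, TimeoutError) as exc:
        print(f"outbox trigger failed: {exc}", file=sys.stderr)
        return 1
    print(f"outbox trigger ok: HTTP {status}")
    return 0
```

Run as a script by ending the file with `sys.exit(main(sys.argv))`. A non-zero exit code is what lets a scheduler mark the run as failed, which gives alerting something to hook into.
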
Additional data issue: Outbox event #13 (Creed Draney Jr) has team_id=11 in its payload, but only teams 1–7 exist. The processor falls back gracefully ("Westside Kings & Queens", no GroupMe link), but the stale team_id should be corrected before draining.
Repro Steps
status=pending
Expected Behavior
Environment
k8s/ directory
src/basketball_api/services/outbox.py: code is correct, just never invoked
Acceptance Criteria
k8s/ that calls /admin/process-outbox?tenant_id=1 on a schedule (every 5 min)
failed events have a retry mechanism (reset to pending after N minutes, with max retry count)
Checklist
Related
westside-basketball: project this affects
src/basketball_api/services/outbox.py
src/basketball_api/services/email.py:888-1010
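
The retry criterion in this issue (reset failed events to pending after N minutes, with a max retry count) could be sketched roughly as follows. This is an in-memory illustration and not the code in outbox.py: the `OutboxEvent` shape, the `retry_count` and `failed_at` fields, and the values 15 minutes and 3 retries are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

RETRY_AFTER = timedelta(minutes=15)  # assumed value for "N minutes"
MAX_RETRIES = 3                      # assumed retry cap


@dataclass
class OutboxEvent:
    """Hypothetical stand-in for an outbox row."""
    id: int
    status: str                           # "pending", "failed", or "sent"
    retry_count: int = 0                  # hypothetical column
    failed_at: Optional[datetime] = None  # hypothetical column


def requeue_failed(
    events: List[OutboxEvent],
    now: datetime,
    retry_after: timedelta = RETRY_AFTER,
    max_retries: int = MAX_RETRIES,
) -> List[int]:
    """Reset eligible failed events to pending and return their ids."""
    requeued = []
    for event in events:
        if event.status != "failed":
            continue
        if event.retry_count >= max_retries:
            # Out of retries: leave as failed for manual follow-up.
            continue
        if event.failed_at is not None and now - event.failed_at < retry_after:
            # Failed too recently: wait before trying again.
            continue
        # Events with no recorded failure time are treated as old enough.
        event.status = "pending"
        event.retry_count += 1
        requeued.append(event.id)
    return requeued
```

Calling `requeue_failed(events, datetime.now())` at the start of each scheduled run would put eligible events back in the queue before the processor drains it. Capping retries keeps a permanently broken event from being retried forever.
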