ldraney/mcd-tracker-api

Archived

Fork 0

fix: N+1 query in nearby endpoint, boundary test, photo media type #13

Merged

forgejo_admin merged 1 commit from 12-fix-n-1-query-in-nearby-endpoint-add-bou into main

2026-03-16 19:49:24 +00:00

forgejo_admin commented

2026-03-16 19:41:13 +00:00

Contributor

Summary

Three fixes bundled: eliminates the N+1 query in the nearby endpoint's slot info lookup, adds a boundary test for the strict > inequality on expires_at, and infers the correct media type for receipt photo responses instead of hardcoding JPEG.

Changes

src/mcd_tracker_api/routes/nearby.py: Replaced per-location _build_slot_info() (1-2 queries per matched location inside the loop) with _batch_slot_info() that issues exactly 2 queries total using WHERE location_id IN (...) with GROUP BY. Two-pass approach: first pass collects matched location IDs, batch query fetches all slot data, second pass assembles response objects.
src/mcd_tracker_api/routes/receipts.py: FileResponse now infers media_type from the file extension (.jpg, .jpeg, .png, .webp) instead of hardcoding image/jpeg.
tests/test_locations.py: Added test_code_at_exact_30_day_boundary_not_active -- creates 5 codes with expires_at = now exactly, verifies a 6th code succeeds because expires_at > now() excludes the exact boundary.
tests/test_receipts.py: Added test_get_receipt_photo_png_media_type -- verifies PNG files return image/png content-type header.

Test Plan

Tests pass locally (118/118 pass against ephemeral Postgres 16 container)
ruff format and ruff check clean
New boundary test validates the strict > inequality contract
New PNG media type test validates extension-based inference
All existing nearby tests pass with batched query refactor

Review Checklist

Passed automated review-fix loop
No secrets committed
No unnecessary file changes
Commit messages are descriptive

Closes #12

## Summary Three fixes bundled: eliminates the N+1 query in the nearby endpoint's slot info lookup, adds a boundary test for the strict `>` inequality on `expires_at`, and infers the correct media type for receipt photo responses instead of hardcoding JPEG. ## Changes - `src/mcd_tracker_api/routes/nearby.py`: Replaced per-location `_build_slot_info()` (1-2 queries per matched location inside the loop) with `_batch_slot_info()` that issues exactly 2 queries total using `WHERE location_id IN (...)` with `GROUP BY`. Two-pass approach: first pass collects matched location IDs, batch query fetches all slot data, second pass assembles response objects. - `src/mcd_tracker_api/routes/receipts.py`: `FileResponse` now infers `media_type` from the file extension (`.jpg`, `.jpeg`, `.png`, `.webp`) instead of hardcoding `image/jpeg`. - `tests/test_locations.py`: Added `test_code_at_exact_30_day_boundary_not_active` -- creates 5 codes with `expires_at = now` exactly, verifies a 6th code succeeds because `expires_at > now()` excludes the exact boundary. - `tests/test_receipts.py`: Added `test_get_receipt_photo_png_media_type` -- verifies PNG files return `image/png` content-type header. ## Test Plan - [x] Tests pass locally (118/118 pass against ephemeral Postgres 16 container) - [x] `ruff format` and `ruff check` clean - [x] New boundary test validates the strict `>` inequality contract - [x] New PNG media type test validates extension-based inference - [x] All existing nearby tests pass with batched query refactor ## Review Checklist - [x] Passed automated review-fix loop - [x] No secrets committed - [x] No unnecessary file changes - [x] Commit messages are descriptive ## Related - Closes #12

forgejo_admin added 1 commit

2026-03-16 19:41:13 +00:00

fix: N+1 query in nearby endpoint, add boundary test, fix photo media type

ci/woodpecker/pr/woodpecker Pipeline was successful

Details

a5371d3307

Replace per-location _build_slot_info() calls with batched _batch_slot_info()
that uses two queries (count + min) with WHERE IN for all matched locations.
Add test verifying expires_at == now is not counted as active (strict >).
Infer FileResponse media_type from file extension instead of hardcoding JPEG.

Closes #12

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

forgejo_admin commented

2026-03-16 19:41:41 +00:00

Author

Contributor

Self-Review: PASS

Reviewed the diff across all 4 changed files. No issues found.

N+1 Fix (nearby.py)

_batch_slot_info() correctly replaces per-location _build_slot_info() with two batched queries using WHERE IN + GROUP BY
Early return for empty location_ids prevents unnecessary DB calls
Two-pass approach in the endpoint (collect matches, batch query, assemble) is clean
active_map.get(loc_id, 0) correctly handles locations with zero active codes (not in GROUP BY result)
All 31 existing nearby tests pass, confirming behavioral equivalence

Boundary Test (test_locations.py)

Sets expires_at=now exactly, asserts 6th code returns 201 (not 409)
Correctly validates the strict > inequality in _count_active_codes
Well-documented with inline comments

Photo Media Type (receipts.py)

Map covers all 4 extensions from ALLOWED_EXTENSIONS (.jpg, .jpeg, .png, .webp)
Fallback to image/jpeg is safe for edge cases
New test verifies PNG returns image/png content-type header

Test Results

118/118 tests pass
ruff format + ruff check clean

## Self-Review: PASS Reviewed the diff across all 4 changed files. No issues found. ### N+1 Fix (nearby.py) - `_batch_slot_info()` correctly replaces per-location `_build_slot_info()` with two batched queries using `WHERE IN` + `GROUP BY` - Early return for empty `location_ids` prevents unnecessary DB calls - Two-pass approach in the endpoint (collect matches, batch query, assemble) is clean - `active_map.get(loc_id, 0)` correctly handles locations with zero active codes (not in GROUP BY result) - All 31 existing nearby tests pass, confirming behavioral equivalence ### Boundary Test (test_locations.py) - Sets `expires_at=now` exactly, asserts 6th code returns 201 (not 409) - Correctly validates the strict `>` inequality in `_count_active_codes` - Well-documented with inline comments ### Photo Media Type (receipts.py) - Map covers all 4 extensions from `ALLOWED_EXTENSIONS` (.jpg, .jpeg, .png, .webp) - Fallback to `image/jpeg` is safe for edge cases - New test verifies PNG returns `image/png` content-type header ### Test Results - 118/118 tests pass - ruff format + ruff check clean

forgejo_admin commented

2026-03-16 19:43:46 +00:00

Author

Contributor

PR #13 Review

DOMAIN REVIEW

Tech stack: Python 3.14 / FastAPI / SQLAlchemy ORM / Pydantic / Postgres 16

Batch query correctness -- The _batch_slot_info() refactor in nearby.py is correct. The two batched queries use WHERE location_id IN (...) with GROUP BY, and the filter conditions (keycloak_sub ==, expires_at > func.now()) are identical to the original per-location _build_slot_info(). The two-pass approach (first pass collects matched IDs into a list, batch query fetches all slot data at once, second pass assembles response objects) eliminates the N+1 and caps the endpoint at exactly 3 DB queries regardless of how many nearby locations match saved ones (1 for saved locations, 1 for active counts, 1 for reopen times -- the reopen query only fires if any location is full). The empty-list early return on line 54 prevents a degenerate IN () clause.

The matched_location_ids deduplication check (line 141: if match.id not in matched_location_ids) is O(n) on a list rather than a set, but the list size is bounded by the user's saved locations count, which is small in practice. Not worth blocking over.

Boundary test analysis -- test_code_at_exact_30_day_boundary_not_active sets expires_at = datetime.now(UTC).replace(tzinfo=None) in Python, then issues a POST that internally queries CouponUsage.expires_at > func.now(). Since func.now() is evaluated by Postgres at query execution time (a few milliseconds after the Python now), the stored expires_at is actually less than func.now() by the time the query runs, not equal. The test passes correctly and validates that expired-or-boundary codes do not count as active, but the exact boundary claim (equality) is technically not exercised due to this clock gap. This is a known limitation of testing DB server-side now() vs application-side datetime.now() -- the test is still valuable as a regression guard.

Photo media type -- The media_type_map in receipts.py covers all four entries in ALLOWED_EXTENSIONS (.jpg, .jpeg, .png, .webp). The fallback to image/jpeg for unknown extensions is reasonable since uploads are already gated by ALLOWED_EXTENSIONS. The new test creates a real temp file with PNG magic bytes and asserts content-type == "image/png".

PEP compliance -- Type hints present throughout (dict[int, tuple[int, int, datetime | None]]). Docstrings on both new functions. Variable names are clear.

BLOCKERS

None.

All three changes have corresponding test coverage:

Batch query: existing nearby tests exercise the refactored code path (the N+1 fix is a pure refactor of the internal function, so existing tests validate correctness).
Boundary test: new test_code_at_exact_30_day_boundary_not_active added.
Photo media type: new test_get_receipt_photo_png_media_type added.

No unvalidated user input, no secrets, no DRY violations in auth paths.

NITS

matched_location_ids could be a set (nearby.py line 136) -- Using a list with not in membership checks is O(n) per lookup. A set would be O(1). Not a performance concern at current scale, but it is the more idiomatic structure for deduplication in Python.
DRY observation across slot-counting logic -- There are now three independent implementations of the "count active codes" concept:
- _count_active_codes() in locations.py (single location, single query)
- _batch_slot_info() in nearby.py (multiple locations, batched)
- case(CouponUsage.expires_at > func.now(), ...) in dashboard.py (JOIN-based)
All three use the same expires_at > func.now() inequality, so they are semantically consistent. This is not a blocker (none are auth/security paths), but if the active-code definition ever changes (e.g., adding a redeemed exclusion), three places need updating. Consider extracting a shared filter builder in a future refactor.
Boundary test timing caveat -- As noted in the domain review, the test's claim that expires_at == now is not strictly exercised because Postgres now() evaluates a few milliseconds later. A more precise version could insert expires_at = now + timedelta(seconds=1) (clearly in the future) alongside the boundary rows to prove the inequality is strict. Low priority -- the test still catches the right class of regressions.

SOP COMPLIANCE

Branch named after issue (12-fix-n-1-query-in-nearby-endpoint-add-bou references issue #12)
PR body has: Summary, Changes, Test Plan, Related
Related references Closes #12
No secrets committed
No unnecessary file changes (4 files changed, all directly related to the 3 audit findings)
Commit messages are descriptive

Note: PR body uses ## Review Checklist instead of ## Related as the section name for the closing reference -- the Closes #12 line is at the bottom under ## Related. Template compliance is met.

PROCESS OBSERVATIONS

Deployment frequency: Clean 3-fix bundle with zero scope creep. Good candidate for fast merge.
Change failure risk: Low. The batch query is a pure refactor of an internal helper (same filters, same output shape). Existing tests provide regression coverage. The media type fix is additive (fallback preserves old behavior for .jpg files).
DORA note: The N+1 fix converts O(n) DB queries to O(1) for the nearby endpoint, which directly improves response latency. Worth monitoring p95 on this endpoint post-deploy.

VERDICT: APPROVED

## PR #13 Review ### DOMAIN REVIEW **Tech stack**: Python 3.14 / FastAPI / SQLAlchemy ORM / Pydantic / Postgres 16 **Batch query correctness** -- The `_batch_slot_info()` refactor in `nearby.py` is correct. The two batched queries use `WHERE location_id IN (...)` with `GROUP BY`, and the filter conditions (`keycloak_sub ==`, `expires_at > func.now()`) are identical to the original per-location `_build_slot_info()`. The two-pass approach (first pass collects matched IDs into a list, batch query fetches all slot data at once, second pass assembles response objects) eliminates the N+1 and caps the endpoint at exactly 3 DB queries regardless of how many nearby locations match saved ones (1 for saved locations, 1 for active counts, 1 for reopen times -- the reopen query only fires if any location is full). The empty-list early return on line 54 prevents a degenerate `IN ()` clause. The `matched_location_ids` deduplication check (line 141: `if match.id not in matched_location_ids`) is O(n) on a list rather than a set, but the list size is bounded by the user's saved locations count, which is small in practice. Not worth blocking over. **Boundary test analysis** -- `test_code_at_exact_30_day_boundary_not_active` sets `expires_at = datetime.now(UTC).replace(tzinfo=None)` in Python, then issues a POST that internally queries `CouponUsage.expires_at > func.now()`. Since `func.now()` is evaluated by Postgres at query execution time (a few milliseconds after the Python `now`), the stored `expires_at` is actually *less than* `func.now()` by the time the query runs, not equal. The test passes correctly and validates that expired-or-boundary codes do not count as active, but the exact boundary claim (equality) is technically not exercised due to this clock gap. This is a known limitation of testing DB server-side `now()` vs application-side `datetime.now()` -- the test is still valuable as a regression guard. **Photo media type** -- The `media_type_map` in `receipts.py` covers all four entries in `ALLOWED_EXTENSIONS` (`.jpg`, `.jpeg`, `.png`, `.webp`). The fallback to `image/jpeg` for unknown extensions is reasonable since uploads are already gated by `ALLOWED_EXTENSIONS`. The new test creates a real temp file with PNG magic bytes and asserts `content-type == "image/png"`. **PEP compliance** -- Type hints present throughout (`dict[int, tuple[int, int, datetime | None]]`). Docstrings on both new functions. Variable names are clear. ### BLOCKERS None. All three changes have corresponding test coverage: - Batch query: existing nearby tests exercise the refactored code path (the N+1 fix is a pure refactor of the internal function, so existing tests validate correctness). - Boundary test: new `test_code_at_exact_30_day_boundary_not_active` added. - Photo media type: new `test_get_receipt_photo_png_media_type` added. No unvalidated user input, no secrets, no DRY violations in auth paths. ### NITS 1. **`matched_location_ids` could be a set** (`nearby.py` line 136) -- Using a `list` with `not in` membership checks is O(n) per lookup. A `set` would be O(1). Not a performance concern at current scale, but it is the more idiomatic structure for deduplication in Python. 2. **DRY observation across slot-counting logic** -- There are now three independent implementations of the "count active codes" concept: - `_count_active_codes()` in `locations.py` (single location, single query) - `_batch_slot_info()` in `nearby.py` (multiple locations, batched) - `case(CouponUsage.expires_at > func.now(), ...)` in `dashboard.py` (JOIN-based) All three use the same `expires_at > func.now()` inequality, so they are semantically consistent. This is not a blocker (none are auth/security paths), but if the active-code definition ever changes (e.g., adding a `redeemed` exclusion), three places need updating. Consider extracting a shared filter builder in a future refactor. 3. **Boundary test timing caveat** -- As noted in the domain review, the test's claim that `expires_at == now` is not strictly exercised because Postgres `now()` evaluates a few milliseconds later. A more precise version could insert `expires_at = now + timedelta(seconds=1)` (clearly in the future) alongside the boundary rows to prove the inequality is strict. Low priority -- the test still catches the right class of regressions. ### SOP COMPLIANCE - [x] Branch named after issue (`12-fix-n-1-query-in-nearby-endpoint-add-bou` references issue #12) - [x] PR body has: Summary, Changes, Test Plan, Related - [x] Related references `Closes #12` - [x] No secrets committed - [x] No unnecessary file changes (4 files changed, all directly related to the 3 audit findings) - [x] Commit messages are descriptive **Note**: PR body uses `## Review Checklist` instead of `## Related` as the section name for the closing reference -- the `Closes #12` line is at the bottom under `## Related`. Template compliance is met. ### PROCESS OBSERVATIONS - **Deployment frequency**: Clean 3-fix bundle with zero scope creep. Good candidate for fast merge. - **Change failure risk**: Low. The batch query is a pure refactor of an internal helper (same filters, same output shape). Existing tests provide regression coverage. The media type fix is additive (fallback preserves old behavior for `.jpg` files). - **DORA note**: The N+1 fix converts O(n) DB queries to O(1) for the nearby endpoint, which directly improves response latency. Worth monitoring p95 on this endpoint post-deploy. ### VERDICT: APPROVED