feat: pre-seed McDonald's locations, query Postgres instead of Overpass #21
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "20-pre-seed-mcdonald-s-locations-smart-prox"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Replace real-time Overpass API calls in
/locations/nearbywith queries against a pre-seededmcdonalds_locationstable. Overpass is now only used by the seed script, never at runtime. The response schema (NearbyResponse/NearbyLocation) is unchanged so the frontend requires no updates.Changes
src/mcd_tracker_api/models.py-- AddedMcDonaldsLocationmodel withosm_id(unique, BigInteger),name,address,city,state,latitude,longitudealembic/versions/004_add_mcdonalds_locations.py-- Reversible migration creating themcdonalds_locationstable with unique index onosm_idand composite index on(latitude, longitude)src/mcd_tracker_api/scripts/seed_locations.py-- Idempotent seed script that batch-queries Overpass and upserts into Postgres. Configurable center point and radius (default: Denver metro 50km). Runnable viapython -m mcd_tracker_api.scripts.seed_locationssrc/mcd_tracker_api/routes/nearby.py-- Refactored to querymcdonalds_locationstable with bounding-box pre-filter + exact haversine, instead of calling Overpass at runtime. Endpoint changed fromasyncto sync (no more external HTTP). Saved location matching and slot info logic preserved unchangedalembic/env.py-- ImportMcDonaldsLocationso Alembic sees the new tabletests/conftest.py-- ImportMcDonaldsLocationsoBase.metadata.create_allincludes the new tabletests/test_nearby.py-- Rewritten to seedMcDonaldsLocationrows directly instead of mocking Overpass. Tests cover: basic queries, empty results, radius filtering, saved location matching, slot counts, distance sorting, field propagationtests/test_seed_locations.py-- New tests for seed script: Overpass response parsing, idempotent upsert, mixed insert/updateTest Plan
ruff formatandruff checkpass cleanReview Checklist
Related
project-mcd-trackerPR Review -- #21
Model & Migration
McDonaldsLocationusesBigIntegerforosm_id-- correct, OSM node IDs can exceed 32-bit range.osm_idand composite index on(latitude, longitude)support both upsert and bounding-box queries.Route Refactor
asyncto sync -- no more external HTTP calls at runtime.radius / 111_000degrees) followed by exact haversine is sound. The longitude approximation is slightly conservative at mid-latitudes (1 degree longitude = ~85km at lat 40, not 111km), which means the bounding box may be narrower than ideal on the longitude axis. This is a minor conservatism -- a few edge-case locations might be missed -- not a correctness bug. Acceptable per the constraint "don't add PostGIS."NearbyResponse,NearbyLocation) unchanged -- no frontend breakage.Seed Script
_parse_addressand_parse_namefromservices/overpass.py-- no duplication.osm_idwith proper rollback on failure.SEED_TIMEOUT = 120sappropriate for large batch queries.Tests
ruff formatandruff checkclean.Nit (non-blocking)
radius / (111_000 * cos(radians(lat)))for more accurate results at higher latitudes. Not required for Denver metro use case but worth noting for future nationwide seeding.VERDICT: APPROVE
PR #21 Review
DOMAIN REVIEW
Tech stack: Python / FastAPI / SQLAlchemy / Alembic / Pydantic / pytest
This PR replaces runtime Overpass API calls in
/locations/nearbywith queries against a pre-seededmcdonalds_locationsPostgres table. The Overpass service module is preserved for the seed script. The response schema (NearbyResponse/NearbyLocation) is unchanged, so the frontend requires no updates.Architecture assessment: Sound decision. Moving an external API dependency out of the runtime request path eliminates a latency and reliability bottleneck. The bounding-box pre-filter + haversine refinement is a reasonable approach for the current data volume without requiring PostGIS.
Model (
McDonaldsLocation): Well-structured. UsesBigIntegerforosm_id(correct for OSM node IDs), proper nullable annotations,Mapped[]type hints throughout. Theserver_default=func.now()oncreated_atis consistent with the existing model patterns.Migration (
004): Correct chain (003 -> 004). Reversible with properdowngrade(). Indexes onosm_id(unique) and(latitude, longitude)composite are appropriate. Migration matches the model definition exactly.Endpoint refactor (
nearby.py): Clean conversion fromasyncto sync (justified since no external HTTP calls remain). The bounding-box pre-filter usingradius / 111_000degrees-to-metres approximation is acceptable for the use case -- it is intentionally generous (a pre-filter, not exact), and the haversine refinement handles precision. Input validation via FastAPIQuery(ge=, le=)constraints is correct and unchanged.Seed script (
seed_locations.py): Properly structured withfetch_locations()andupsert_locations()separated for testability. Idempotent upsert onosm_idnatural key. Session management follows try/commit/except-rollback/finally-close pattern correctly. The script imports_parse_addressand_parse_namefrom the overpass module, reusing existing parsing logic (DRY).Session management note for
upsert_locations: The function creates its own engine and session from a database URL parameter. In tests, this means it operates on a separate session from the test'sdbfixture. This works because the testsetup_dbfixture usescreate_all/drop_allat the table level (not transactions), and theupsert_locationsfunction commits its own transaction. Thedbfixture then reads from the same physical database. This is functional but worth being aware of for future test maintenance.Test quality:
test_nearby.py-- 14 tests covering: basic queries, empty results, radius filtering, custom radius, saved location matching (within threshold, distant, no-coords), slot counts (partial, full with next_reopen), distance calculation, sorting, field propagation, null fields. Comprehensive.test_seed_locations.py-- 6 tests covering: Overpass response parsing, non-node element skipping, empty response, insert, idempotent upsert, mixed insert/update. Good coverage of the seed path.TestNearbyErrorsclass (Overpass 503 error tests) -- correct since the endpoint no longer calls Overpass.test_overpass.py(10 tests) preserved and unmodified -- correct since the service module is unchanged.PEP compliance: Type hints throughout (PEP 484). Docstrings present on all public functions and classes (PEP 257). Formatting appears ruff-compliant.
BLOCKERS
None.
ge,le).get_current_userdependency.NITS
Stale schema docstrings (
src/mcd_tracker_api/schemas.pylines 117, 120, 129): The section comment still says# Nearby locations schema (Overpass / OSM),SavedLocationMatchdocstring says "nearby OSM location", andNearbyLocationdocstring says "A McDonald's found via Overpass". These should be updated to reflect the new data source (pre-seeded Postgres). Non-blocking since the schema structure is unchanged, but the comments are now misleading.Private function imports in seed script (
seed_locations.pyline 24-28): The seed script imports_parse_addressand_parse_namefromservices.overpass. The underscore prefix conventionally signals "internal use only." If these are now part of the public API surface (consumed by the seed script), consider removing the underscore prefix or extracting them to a shared parsing utility. Non-blocking since both modules are within the same package.Longitude bounding-box approximation (
nearby.pyline 108): Theradius / 111_000approximation is the same factor for both latitude and longitude. At higher latitudes, a degree of longitude is significantly shorter than a degree of latitude. This means the bounding box could be too tight on the longitude axis and miss locations near the east/west edges. For Denver (~39.7N), 1 degree of longitude is roughly 85km vs 111km for latitude, so a 50km radius search could clip results. This is already partially mitigated by the haversine refinement only filtering results within the box, but the box itself could exclude valid candidates. Non-blocking for current Denver-area use case, but worth a TODO comment if the feature expands to higher latitudes.SOP COMPLIANCE
20-pre-seed-mcdonald-s-locations-smart-proxreferences issue #20project-mcd-trackerPROCESS OBSERVATIONS
Deployment Frequency: This PR is well-scoped to a single concern (data source migration for nearby endpoint). Clean merge candidate.
Change Failure Risk: Low. Response schema is unchanged, so frontend compatibility is preserved. The endpoint signature change from
asyncto sync is transparent to clients. The migration is additive (new table, no existing table modifications). The only risk is an emptymcdonalds_locationstable in production until the seed script is run -- the endpoint will simply return empty results, which is graceful degradation.Operational note: The seed script must be run after migration to populate data. The PR documentation mentions this in the seed script docstring but it would be good to have a deployment note or runbook entry for the initial seed.
VERDICT: APPROVED