Pattern

Geospatial / proximity lookup

When to reach for this

Reach for this when…

"Find N nearest" queries
Bounded-radius search ("within 2 km")
Real-time location tracking + dispatch
Map-based product UX

Not really this pattern when…

Locations are fixed and small in number (< 1000) — memory + linear scan is fine
Lookup is by region id, not coordinates (that's a plain index)

Good vs bad answer

Interviewer probe

“Find all drivers within 2 km of a pickup.”

Weak answer

"WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?."

Strong answer

"Full scan at any real scale. Use a spatial index. Primary: PostGIS with a 2dsphere GiST index — ST_DWithin returns matches in milliseconds. Hot path: Redis GEO sharded by region, sub-ms GEORADIUS for the dispatch loop. Driver pings update both (Postgres as truth, Redis as fast path). Redis dies → falls back to PostGIS with higher latency. At planet scale (Uber territory) I'd move to H3 cells with custom per-cell sharded in-memory stores."

Why it wins: Rejects BETWEEN, names the right index, layers truth + hot tier, and calibrates to scale.

Cheat sheet

•Never BETWEEN on lat/lng.
•Default: PostGIS 2dsphere.
•Hot path: Redis GEO for sub-ms.
•Planet-scale: S2 or H3 custom.
•Geohash: query 9 neighbours + filter actual distance.
•Shard by region when one index outgrows a node.

Core concept

A spatial index answers "find all points within radius R of (lat, lng)" efficiently. Every modern option beats two B-trees intersected.

The main systems:

PostGIS (Postgres + GiST/SP-GiST): R-tree-based. Supports points, polygons, complex shapes. Default when you're on Postgres.
MongoDB 2dsphere: simple API, JSON-native. Fine for most apps.
Redis GEO: geohash-backed sorted set; sub-ms GEORADIUS. Perfect for hot dispatch tier.
Elasticsearch geo_point: combine geo with full-text search and aggregations.
S2 / H3: cell-based libraries (Google / Uber). Planet-scale, uniform cells. Custom infrastructure.

Geohash edge-cell problem: two points near a cell boundary can have totally different prefixes despite being close. Always query the target cell + 8 neighbours, then filter by actual Haversine distance.

Architecture at scale (ride-sharing):

Truth store: PostGIS for durability.
Hot tier: Redis GEO sharded by region, rebuilt from truth on failure.
Dispatch reads from Redis; falls back to PostGIS on Redis outage (slower but works).

Canonical examples

→Uber / Lyft driver matching
→DoorDash restaurant discovery
→Yelp / Google Maps nearby search
→Pokemon Go / AR proximity
→ATM / store locator

Decision levers

Primary store

PostGIS if you're on Postgres. MongoDB 2dsphere if Mongo. For pure location workloads at high scale, a Redis-first design with a durable truth store.

Hot-path cache

Redis GEO for sub-ms. Size: 1M points ≈ 200 MB in a Redis instance. Shard by region to keep each shard small.

Query shape

Radius vs bounding box vs kNN. Radius + kNN most common in dispatch. Bounding box for map viewport queries.

Scale tier

< 10M points globally: single PostGIS index. 10–100M: sharded PostGIS + Redis GEO. 100M+ with ride-dispatch QPS: S2/H3 custom (Uber territory).

Failure modes

BETWEEN on lat/lng

Two B-trees intersected. Full scan once rows > 10k. Use a spatial index always.

Geohash without neighbour lookup

Two points 50m apart on opposite sides of a cell have different prefixes. Query 9 cells + filter by actual distance.

Giant single indexAdvanced

100M points in one index is painful. Shard by region / country / S2 cell-level.

geometry vs geography confusionAdvanced

PostGIS geometry uses planar math (wrong for lat/lng at distance); geography uses geodesic (right). Mixing silently breaks.

Drills

Why does BETWEEN fail?Reveal

It's two independent B-tree lookups intersected. With N points in a latitude band and M in a longitude band, you fetch N+M and intersect — linear in bands, not in the answer size. A spatial index walks only cells that overlap the query region.

kNN on a geospatial index — how?Reveal

Priority-queue walk of nearby cells. Start at the target cell; expand to neighbours; keep a running top-K heap; stop when no undiscovered cell can contain a point closer than the current K-th. PostGIS: ORDER BY location <-> target LIMIT K uses the spatial index natively.

11% complete

Current

When to reach for this

Step 1 of 9

Good vs bad answer

Jump to next

All patterns

Pattern

Geospatial / proximity lookup

When to reach for this

Reach for this when…

"Find N nearest" queries
Bounded-radius search ("within 2 km")
Real-time location tracking + dispatch
Map-based product UX

Not really this pattern when…

Locations are fixed and small in number (< 1000) — memory + linear scan is fine
Lookup is by region id, not coordinates (that's a plain index)

Good vs bad answer

Interviewer probe

“Find all drivers within 2 km of a pickup.”

Weak answer

"WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?."

Strong answer

Why it wins: Rejects BETWEEN, names the right index, layers truth + hot tier, and calibrates to scale.

Cheat sheet

•Never BETWEEN on lat/lng.
•Default: PostGIS 2dsphere.
•Hot path: Redis GEO for sub-ms.
•Planet-scale: S2 or H3 custom.
•Geohash: query 9 neighbours + filter actual distance.
•Shard by region when one index outgrows a node.

Core concept

A spatial index answers "find all points within radius R of (lat, lng)" efficiently. Every modern option beats two B-trees intersected.

The main systems:

PostGIS (Postgres + GiST/SP-GiST): R-tree-based. Supports points, polygons, complex shapes. Default when you're on Postgres.
MongoDB 2dsphere: simple API, JSON-native. Fine for most apps.
Redis GEO: geohash-backed sorted set; sub-ms GEORADIUS. Perfect for hot dispatch tier.
Elasticsearch geo_point: combine geo with full-text search and aggregations.
S2 / H3: cell-based libraries (Google / Uber). Planet-scale, uniform cells. Custom infrastructure.

Architecture at scale (ride-sharing):

Truth store: PostGIS for durability.
Hot tier: Redis GEO sharded by region, rebuilt from truth on failure.
Dispatch reads from Redis; falls back to PostGIS on Redis outage (slower but works).

Canonical examples

→Uber / Lyft driver matching
→DoorDash restaurant discovery
→Yelp / Google Maps nearby search
→Pokemon Go / AR proximity
→ATM / store locator

Decision levers

Primary store

PostGIS if you're on Postgres. MongoDB 2dsphere if Mongo. For pure location workloads at high scale, a Redis-first design with a durable truth store.

Hot-path cache

Redis GEO for sub-ms. Size: 1M points ≈ 200 MB in a Redis instance. Shard by region to keep each shard small.

Query shape

Radius vs bounding box vs kNN. Radius + kNN most common in dispatch. Bounding box for map viewport queries.

Scale tier

< 10M points globally: single PostGIS index. 10–100M: sharded PostGIS + Redis GEO. 100M+ with ride-dispatch QPS: S2/H3 custom (Uber territory).

Failure modes

BETWEEN on lat/lng

Two B-trees intersected. Full scan once rows > 10k. Use a spatial index always.

Geohash without neighbour lookup

Two points 50m apart on opposite sides of a cell have different prefixes. Query 9 cells + filter by actual distance.

Giant single indexAdvanced

100M points in one index is painful. Shard by region / country / S2 cell-level.

geometry vs geography confusionAdvanced

PostGIS geometry uses planar math (wrong for lat/lng at distance); geography uses geodesic (right). Mixing silently breaks.

Drills

Why does BETWEEN fail?Reveal

kNN on a geospatial index — how?Reveal

11% complete

Current

When to reach for this

Step 1 of 9

Good vs bad answer

Jump to next