← System Design
🚕
System Design

Uber Ride Request

Driver matching, geo-spatial sharding, and real-time location streams at city scale.

TL;DR

When you tap "Request Ride," a geo-spatially sharded matching system finds drivers within a 5-mile radius using a quadtree or H3 hex grid, scores them by ETA + driver acceptance rate + surge pricing, dispatches the request via a fan-out, and tracks the resulting trip with real-time location streams over WebSocket — all backed by a Kafka event log so the rest of the company can react asynchronously.

When to use

Any system that matches geographically dispersed supply to real-time demand — ride-hailing, food delivery, on-demand grocery, last-mile logistics, dynamic field service, even multiplayer game matchmaking with proximity. The core primitives (geo-indexing, scoring, fan-out dispatch, real-time location streams) generalize way beyond Uber.

Why this problem is harder than it looks

When you tap Request Ride, three constraints fight each other:

  • Latency — the rider expects a driver within 5 seconds
  • Quality — assign the right driver, not just the nearest one
  • Scale — millions of drivers and riders, all moving, all the time

The full request lifecycle

sequenceDiagram
    autonumber
    participant R as Rider
    participant API as API Gateway
    participant M as Matching Service
    participant Geo as Geo-index<br/>(Redis + H3)
    participant D1 as Driver (top 1-3)
    participant D2 as Driver (next 5)
    participant T as Trip Service
    participant K as Kafka

    R->>API: Request ride
    API->>M: Match request
    M->>Geo: Drivers within 5mi of (lat,lng)
    Geo-->>M: ~100 candidates
    Note over M: Score by ETA + accept rate +<br/>cancellation + vehicle + surge
    M->>D1: Dispatch to top 3
    Note over M,D1: Wait 6s for accept
    D1-->>M: No accept (timeout)
    M->>D2: Expand to next 5
    D2-->>M: ✓ Driver accepts
    M->>T: Create trip (atomic UPDATE)
    T-->>R: Matched! Driver ETA 4 min
    T->>K: trip.matched event
    K-->>K: Pricing · Payments · Fraud ·<br/>Notifications · Analytics

Six round-trips, sub-second from tap to “matched.” Each step is its own scaling story.

1. The geo-spatial index: how you find candidates fast

Every driver’s phone publishes a location update every few seconds. Each update lands in a geo-spatial index that lets you ask:

Three index families dominate:

IndexHowUsed by
QuadtreeRecursive 2D subdivisionOlder Uber, classical games
S2 cellsHilbert curve on the sphereGoogle, Foursquare
H3 hexagonsHexagonal hierarchicalModern Uber, DoorDash

Uber settled on H3 — hexagons tile the sphere with minimal distortion and every cell has exactly six neighbors at equal distance. To find candidates within 5 miles, you compute the hex containing the rider, then expand k rings of neighbors. Each ring is a constant-time set operation.

rider_hex = h3.geoToH3(lat, lng, resolution=8)  // ~1 km hex
candidates = []
for ring in range(0, 5):
    for hex in h3.kRing(rider_hex, ring):
        candidates += redis.smembers("drivers:" + hex)

This drops 5 million → 100 candidates in microseconds.

2. The scoring function: which driver actually wins

Distance is a starting point, not the answer. The matching service scores each candidate on multiple axes:

  • ETA — accounting for traffic, not just Euclidean distance
  • Driver acceptance rate — drivers who accept frequently get prioritized
  • Cancellation history — drivers who cancel often get deprioritized
  • Vehicle type match — UberX vs Comfort vs XL
  • Surge zone — driver currently in a high-demand area gets surge multiplier
  • Streak / earnings target — if the driver is one ride from a bonus, they’re more likely to accept

The score is a weighted sum, often run through a small ML model. The output: top 10 candidates, ranked.

3. Tiered dispatch: fan-out without spam

If you ping all 10 candidates simultaneously, three problems:

  • Driver UX — they all see the same request, only one wins, the rest got a useless ping
  • Race conditions — two drivers tap “Accept” at the same time
  • Inefficiency — most of the candidates are noise

The solution: tiered fan-out.

1. Send to top 3 candidates
2. Wait 6 seconds for any accept
3. If none, expand to next 5
4. Wait another 6 seconds
5. If none, expand the radius and re-score

4. The state machine of a trip

stateDiagram-v2
    [*] --> REQUESTED
    REQUESTED --> DISPATCHING: matching starts
    DISPATCHING --> MATCHED: driver accepts
    MATCHED --> DRIVER_EN_ROUTE: driver moves toward rider
    DRIVER_EN_ROUTE --> DRIVER_ARRIVED: GPS within 50m
    DRIVER_ARRIVED --> ON_TRIP: rider taps "I'm in"
    ON_TRIP --> COMPLETED: trip end + payment
    DISPATCHING --> NO_DRIVERS: timeout, no accepts
    MATCHED --> RIDER_CANCELED: rider cancels
    MATCHED --> DRIVER_CANCELED: driver cancels
    ON_TRIP --> PAYMENT_FAILED: charge declines
    PAYMENT_FAILED --> COMPLETED: retry succeeds
    NO_DRIVERS --> [*]
    RIDER_CANCELED --> [*]
    DRIVER_CANCELED --> [*]
    COMPLETED --> [*]

5. The location stream: WebSockets all the way

Once matched, the rider’s app needs to see the driver’s car move. This is a soft real-time problem — updates every few seconds, latency under 1 second.

Architecture:

  • Driver’s phone holds a persistent WebSocket (or MQTT) to the Location service
  • Each location update writes to Redis Geo (with TTL) and emits to Kafka
  • The rider’s phone holds a separate WebSocket to a Notify service
  • The Notify service subscribes to the Kafka partition for “driver_id == matched_driver”
  • Filtered updates flow rider-ward at 1–2 Hz

6. Sharding: how the system scales horizontally

Two natural shard axes:

  • Geographic sharding — each city is its own cluster (US-East has its own matching service, India-Bangalore its own). Riders and drivers in a city only ever talk to that city’s cluster. Cross-city is rare and goes through a federation layer.
  • Driver-ID sharding — within a city, location updates are partitioned by hash(driver_id) % N. This keeps Redis hotspots even.

7. Surge pricing: a feedback loop

When demand exceeds supply in a hex cell, surge kicks in. The pricing engine watches:

  • Active rider requests in the hex
  • Available drivers in the hex
  • Recent acceptance rate

If the ratio of demand-to-supply crosses a threshold, the price multiplier rises. This:

  • Discourages some riders (price-sensitive ones drop off)
  • Attracts more drivers (they see “$2.5x” on their map and drive into the zone)
  • Self-balances within minutes

8. What goes wrong (and how Uber handles it)

9. Why this architecture is everywhere now

Uber pioneered the pattern, but you’ll see it in DoorDash (food), Instacart (grocery), Lyft (rides), Lime (scooters), Grubhub (delivery), and increasingly in B2B field service software. Anywhere you match supply to demand by location in real time, the same five primitives appear:

  1. Geo-indexing for fast candidate retrieval
  2. Multi-factor scoring for quality matches
  3. Tiered fan-out dispatch for fairness without spam
  4. Authoritative state machine for the matched entity
  5. Event-driven downstream via Kafka
🧪 Simulator

An interactive simulator for this concept is on the way — tweak the knobs, watch behaviour change in real time.

🎨 Visualization

An interactive diagram you can hover, click, and explore.

💻 Code

A 30-line build challenge with starter code, hints, and a reference implementation.

🎯 Common interview questions
Q1. How do you find "all drivers within 5 miles" efficiently?

Geo-spatial indexing — partition the world into cells (quadtree, S2 cells, or Uber's H3 hexagons). Drivers publish their cell ID with each location update. To find candidates, look up the cell containing the rider plus all neighboring cells, then refine by exact distance. This turns an O(N) problem ("scan every driver") into O(log N) or O(1) for a single hex lookup.

Q2. Why hexagons and not squares?

Three reasons. (1) Every neighbor of a hex is exactly the same distance away (squares have diagonal neighbors at √2 distance). (2) Hexagons tile the sphere with less distortion than lat/lng grids. (3) You can resolve nicely at multiple zoom levels — Uber's H3 has 16 resolutions. The simpler the neighbor model, the simpler the matching code.

Q3. A rider taps Request — how do you avoid notifying every nearby driver at once?

Tiered fan-out. Send to the top 1–3 candidates first, wait 5–10 seconds for an accept. If none accept, expand to the next ring of 5. Keep expanding until somebody accepts or you exhaust the radius. This protects driver UX (no "everybody got the same ping") and ensures only one driver actually accepts per ride.

Q4. How do you handle the live location stream of millions of drivers?

Drivers send location updates every ~3–5 seconds over a persistent WebSocket or MQTT connection. The location service is sharded by driver_id, writes to an in-memory store (Redis Geo, or DynamoDB with TTL), and emits to Kafka for downstream consumers. Riders poll or subscribe to a single driver's stream once matched. Don't store every position permanently — keep a recent window and aggregate.

Q5. What happens when a driver loses connectivity mid-trip?

The system detects the gap (heartbeat missed for >30s), marks the driver as DEGRADED, and switches the rider's UI to "approximate location" mode. If the gap exceeds a threshold (~2 min), it triggers an alert to the rider and offers a rebook. Trip state is reconciled from the GPS history of the driver's phone when they reconnect. The trip never crashes — it gracefully degrades.

↗ Related concepts

Comments 0

Discuss this page. Markdown supported. Be kind.

Loading…
Loading comments…