← Distributed Systems

🔄

Distributed Systems

Consistent Hashing

How to shard data across servers so adding or removing a node only moves a small fraction of keys.

TL;DR

Consistent hashing maps keys and servers to positions on a circular hash ring. Each key is owned by the next server clockwise. Adding or removing a server only moves the keys in that server's slice — about 1/N of the data — instead of forcing a full re-shard. Add virtual nodes (multiple positions per server) and the load distributes evenly even with few servers.

When to use

Whenever you need to partition data across N servers and want to add/remove servers without reshuffling everything. Used by every major distributed cache (Memcached, Redis Cluster), every distributed database (DynamoDB, Cassandra, Riak), every CDN (Cloudflare, Akamai), every load balancer with sticky sessions, and every vector database that shards across nodes.

Try it

Why this beats `hash mod N`

Imagine you have 4 cache servers and the naive partitioner server = hash(key) % 4. Now you add a 5th server. The new function is hash(key) % 5. Almost every key now maps to a different server. Your cache hit rate plunges to near zero while everything re-warms.

The hash ring, visualised

flowchart LR
    subgraph Ring [Hash ring &lpar;0..2³²&rpar;]
        direction TB
        S0((S0))
        S1((S1))
        S2((S2))
        S3((S3))

        K1[K1 → S1]
        K2[K2 → S2]
        K3[K3 → S2]
        K4[K4 → S3]
        K5[K5 → S0]

        K1 -.lookup<br/>clockwise.-> S1
        K2 -.lookup<br/>clockwise.-> S2
        K3 -.lookup<br/>clockwise.-> S2
        K4 -.lookup<br/>clockwise.-> S3
        K5 -.wrap.-> S0
    end

    style S0 fill:#1e3a8a,stroke:#3b82f6,color:#fff
    style S1 fill:#0e7490,stroke:#06b6d4,color:#fff
    style S2 fill:#581c87,stroke:#a855f7,color:#fff
    style S3 fill:#365314,stroke:#84cc16,color:#fff
    style K1 fill:#1c2333,stroke:#475569,color:#cdd3df
    style K2 fill:#1c2333,stroke:#475569,color:#cdd3df
    style K3 fill:#1c2333,stroke:#475569,color:#cdd3df
    style K4 fill:#1c2333,stroke:#475569,color:#cdd3df
    style K5 fill:#1c2333,stroke:#475569,color:#cdd3df

Both keys and servers get hashed onto positions around a fixed circle (typically [0, 2³²)). Each key is owned by the next server going clockwise.

How real systems use it

Cassandra & DynamoDB — partition the keyspace via consistent hashing. Each node owns several token ranges (vnodes). Replication factor R means each key is stored on the next R nodes clockwise.
Memcached clients (Ketama) — every client uses the same ring so all clients agree on which server owns a key. This is how Facebook ran web-scale caching long before centralized cluster managers.
Cloudflare’s load balancer — uses a variant called “rendezvous hashing” (HRW) which is a close cousin and also avoids the mod N problem.
CDN edge selection — request URLs are hashed to pick which edge cache holds the asset, so cache misses don’t propagate when the edge fleet changes.

The vnode trick

The visualization shows it: with one position per server, the random hash angles produce uneven slices. Server S0 might end up with 25% of keys while S2 gets 5%. Bump virtual nodes to 10+ and the load smooths out — by the law of large numbers, more random samples converge to a uniform distribution.

What about hot keys?

When NOT to use it

Trade-offs

Pros: Add/remove servers cheaply, even load distribution, no global coordinator needed.

Cons: No range queries (keys are scattered), hot keys still concentrated, slightly more complex than mod N.

🧪 Simulator soon

An interactive simulator for this concept is on the way — tweak the knobs, watch behaviour change in real time.

💻 Code Phase 4 soon

A 30-line build challenge with starter code, hints, and a reference implementation.

🎯 Common interview questions

Q1. Why not just `hash(key) % N`? ▾

That works until N changes. Add one server (N→N+1) and almost EVERY key now maps to a different server, forcing a full re-shard. With consistent hashing, adding a server only moves the keys in that server's arc — about 1/N of the data — and existing key→server mappings stay the same.

Q2. Why are virtual nodes necessary? ▾

With one position per server, the random angles cause skew — some servers get 2× the keys of others. Each server publishing 100–256 virtual nodes (replicas at different ring positions) averages out the angles. Cassandra, DynamoDB, Riak all use vnodes for this reason. The visualization above shows it directly — bump vnodes from 1 to 10 and watch the load flatten.

Q3. How is replication handled? ▾

Most systems store each key on the next R servers clockwise (the "preference list"). If R=3, the key lives on the 1st, 2nd, and 3rd servers clockwise from its hash position. When the primary fails, reads transparently fall back to the secondaries — no key migration needed.

Q4. What's a "hot key" and how does consistent hashing handle it? ▾

A single key getting 1000× more traffic than average. Consistent hashing alone doesn't help — that key is still on one server. The fix is layered — replicate hot keys aggressively (R=10 instead of 3), or use a separate "hot key cache" in front, or shard by `hash(key + user_id)` so the same logical key spreads across users.

Q5. When should you NOT use consistent hashing? ▾

When you need range queries — `SELECT * WHERE id BETWEEN 100 AND 200` is impossible if those IDs are scattered across the ring. For range queries, use range-based sharding (each server owns a contiguous key range). For point lookups and uniformly-distributed loads, consistent hashing wins.

↗ Related concepts

Comments 0

Discuss this page. Markdown supported. Be kind.

Loading…

Loading comments…