Consistent hashing maps keys and servers to positions on a circular hash ring. Each key is owned by the next server clockwise. Adding or removing a server only moves the keys in that server's slice — about 1/N of the data — instead of forcing a full re-shard. Add virtual nodes (multiple positions per server) and the load distributes evenly even with few servers.
Whenever you need to partition data across N servers and want to add/remove servers without reshuffling everything. Used by every major distributed cache (Memcached, Redis Cluster), every distributed database (DynamoDB, Cassandra, Riak), every CDN (Cloudflare, Akamai), every load balancer with sticky sessions, and every vector database that shards across nodes.
Try it
Why this beats hash mod N
Imagine you have 4 cache servers and the naive partitioner server = hash(key) % 4. Now you add a 5th server. The new function is hash(key) % 5. Almost every key now maps to a different server. Your cache hit rate plunges to near zero while everything re-warms.
The hash ring, visualised
flowchart LR
subgraph Ring [Hash ring (0..2³²)]
direction TB
S0((S0))
S1((S1))
S2((S2))
S3((S3))
K1[K1 → S1]
K2[K2 → S2]
K3[K3 → S2]
K4[K4 → S3]
K5[K5 → S0]
K1 -.lookup<br/>clockwise.-> S1
K2 -.lookup<br/>clockwise.-> S2
K3 -.lookup<br/>clockwise.-> S2
K4 -.lookup<br/>clockwise.-> S3
K5 -.wrap.-> S0
end
style S0 fill:#1e3a8a,stroke:#3b82f6,color:#fff
style S1 fill:#0e7490,stroke:#06b6d4,color:#fff
style S2 fill:#581c87,stroke:#a855f7,color:#fff
style S3 fill:#365314,stroke:#84cc16,color:#fff
style K1 fill:#1c2333,stroke:#475569,color:#cdd3df
style K2 fill:#1c2333,stroke:#475569,color:#cdd3df
style K3 fill:#1c2333,stroke:#475569,color:#cdd3df
style K4 fill:#1c2333,stroke:#475569,color:#cdd3df
style K5 fill:#1c2333,stroke:#475569,color:#cdd3df
Both keys and servers get hashed onto positions around a fixed circle (typically [0, 2³²)). Each key is owned by the next server going clockwise.
How real systems use it
- Cassandra & DynamoDB — partition the keyspace via consistent hashing. Each node owns several token ranges (vnodes). Replication factor
Rmeans each key is stored on the nextRnodes clockwise. - Memcached clients (Ketama) — every client uses the same ring so all clients agree on which server owns a key. This is how Facebook ran web-scale caching long before centralized cluster managers.
- Cloudflare’s load balancer — uses a variant called “rendezvous hashing” (HRW) which is a close cousin and also avoids the
mod Nproblem. - CDN edge selection — request URLs are hashed to pick which edge cache holds the asset, so cache misses don’t propagate when the edge fleet changes.
The vnode trick
The visualization shows it: with one position per server, the random hash angles produce uneven slices. Server S0 might end up with 25% of keys while S2 gets 5%. Bump virtual nodes to 10+ and the load smooths out — by the law of large numbers, more random samples converge to a uniform distribution.
What about hot keys?
When NOT to use it
Trade-offs
Pros: Add/remove servers cheaply, even load distribution, no global coordinator needed.
Cons: No range queries (keys are scattered), hot keys still concentrated, slightly more complex than mod N.
Comments 0
Discuss this page. Markdown supported. Be kind.