← Distributed Systems

⚖️

Distributed Systems

CAP Theorem

You can have at most two of consistency, availability, and partition tolerance — and partition isn't optional.

TL;DR

A distributed data store can only fully guarantee two of consistency, availability, and partition tolerance. In any real network, partitions WILL happen — so you're effectively choosing between CP (refuse stale reads when partitioned) and AP (keep serving but accept stale data). The "CA" corner is a single-machine system. The pick is per-system, sometimes per-API.

When to use

Every time you design a distributed data store, you implicitly pick a CAP corner. Knowing which one tells you immediately how the system behaves under failure — whether reads return stale data, whether writes are accepted, whether the service stays up. Modern systems (Cassandra, DynamoDB, MongoDB) have tunable consistency knobs that let different operations pick different corners.

Pick a corner

The theorem in one sentence

Eric Brewer’s 2000 conjecture (proven 2002 by Gilbert & Lynch):

Twenty-five years later, this is the most-cited and most-misunderstood result in distributed systems.

Why “pick two” is misleading

The problem with “pick two” is that partition tolerance isn’t really optional in any real network. Packets drop, switches die, fiber gets cut. If your system runs across more than one machine, you must handle partitions — there’s no “we’ll just not have any.”

The CP vs AP fork, visualised

flowchart TD
    P[Network partition occurs<br/><i>switch dies, link drops, region goes dark</i>]
    P --> Q{System detects<br/>partition}

    Q -->|CP choice| CP[CP — Consistency wins]
    Q -->|AP choice| AP[AP — Availability wins]

    CP --> CP1[Minority side stops<br/>accepting writes<br/><i>no quorum</i>]
    CP --> CP2[Some reads return errors<br/>instead of stale data]
    CP --> CP3[Majority side continues<br/>at reduced capacity]
    CP --> CPex[Etcd · Zookeeper · Spanner ·<br/>MongoDB default · HBase]

    AP --> AP1[Both sides accept writes]
    AP --> AP2[Reads return whatever<br/>local replica has]
    AP --> AP3[Conflicts reconciled later<br/><i>LWW · vector clocks · CRDTs</i>]
    AP --> APex[Cassandra · DynamoDB ·<br/>Riak · CouchDB]

    style P fill:#7e1d1d,stroke:#ef4444,color:#fff
    style Q fill:#9a3412,stroke:#f97316,color:#fff
    style CP fill:#1e3a8a,stroke:#3b82f6,color:#fff
    style AP fill:#0e7490,stroke:#06b6d4,color:#fff
    style CP1 fill:#0f1320,stroke:#475569,color:#cdd3df
    style CP2 fill:#0f1320,stroke:#475569,color:#cdd3df
    style CP3 fill:#0f1320,stroke:#475569,color:#cdd3df
    style AP1 fill:#0f1320,stroke:#475569,color:#cdd3df
    style AP2 fill:#0f1320,stroke:#475569,color:#cdd3df
    style AP3 fill:#0f1320,stroke:#475569,color:#cdd3df
    style CPex fill:#581c87,stroke:#a855f7,color:#fff
    style APex fill:#365314,stroke:#84cc16,color:#fff

What CP looks like in practice

A CP system, on detecting a partition, refuses to serve some clients to preserve consistency:

The minority side of the partition stops accepting writes (it can’t reach quorum)
Some reads return errors instead of stale data
The majority side continues, but with reduced capacity

You see this with Etcd, Zookeeper, MongoDB (default), HBase. They use Raft, ZAB, or Paxos to ensure linearizable reads and writes.

What AP looks like in practice

An AP system, on partition, keeps every replica answering — accepting that they may temporarily disagree:

Both sides of the partition accept writes
Reads return whatever the local replica has
Conflicts are reconciled later (last-write-wins, vector clocks, CRDTs)

You see this with Cassandra, DynamoDB, Riak, CouchDB. They optimize for “always responsive” and provide tunable consistency — you can ask for stronger guarantees per-operation if you need them.

The right choice depends on what’s worse

For many domains, brief inconsistency is fine but downtime isn’t:

Social feeds (eventually consistent likes/comments)
Shopping carts (always accept items, reconcile at checkout)
Sensor data (last value wins is fine)

For other domains, stale data is dangerous:

Bank transfers (don’t show the balance before the deposit settled)
Inventory (don’t sell the same item twice)
Locks and leases (only one holder at a time)

CAP-C vs ACID-C — same word, different meaning

PACELC — the more honest framework

What this means when you read system architectures

💻 Code Phase 4 soon

A 30-line build challenge with starter code, hints, and a reference implementation.

🎯 Common interview questions

Q1. Why is CA not really an option in distributed systems? ▾

Because partitions WILL happen — networks fail, packets drop, switches crash. Saying "we don't tolerate partitions" really means "if a partition happens, we lose either consistency or availability." So in practice, "CA" is just CP or AP under a different name. The only true CA system is single-node — and that's not distributed.

Q2. Is this really a binary choice — fully consistent OR fully available? ▾

No. Modern systems are tunable. Cassandra has consistency levels per operation — you can write at QUORUM (CP-leaning) or ONE (AP-leaning). DynamoDB has strong vs eventual consistency reads. MongoDB has read concerns and write concerns. The CAP corner is per-operation, not per-system.

Q3. What's the difference between consistency in CAP and consistency in ACID? ▾

They mean different things. ACID-C is "constraints aren't violated by a transaction" (foreign keys, uniqueness). CAP-C is "after a write, every reader sees the new value" (linearizability). A system can be CAP-strong-consistent and ACID-weak (or vice versa). Confusingly the same word means two unrelated things.

Q4. When does it actually matter for users? ▾

It matters in failure modes. In normal operation, CP and AP systems look the same. The difference shows up when a partition happens — does your shopping-cart API accept new items (AP) or refuse to update until it can talk to the master (CP)? The right choice depends on whether stale data or lost availability is worse for your users.

Q5. How does this relate to PACELC? ▾

PACELC extends CAP. CAP only describes behavior during a partition; PACELC adds "Else, in normal operation, choose between Latency and Consistency." DynamoDB is PA/EL (available during partition, low-latency in normal). Spanner is PC/EC (consistent during partition, consistent in normal — paying latency for it). PACELC is a more complete framework for real-world systems.

↗ Related concepts

Comments 0

Discuss this page. Markdown supported. Be kind.

Loading…

Loading comments…