Cosmos DB in 5 Minutes

TL;DR

Cosmos DB is Microsoft's globally distributed, multi-model database: every container is automatically partitioned, any region can be a peer, and point reads and writes carry an SLA-backed latency target (under 10 ms at the 99th percentile). You write to one endpoint; under the hood it's running a sharded, replicated, multi-master cluster with five tunable consistency levels.

Key takeaways

▸One database, five APIs — SQL (Core), MongoDB, Cassandra, Gremlin (graph), Table. Pick the API your app already speaks.
▸Every container is partitioned by a key you choose. Get this right and Cosmos scales horizontally; get it wrong and you'll fight hot partitions forever.
▸Throughput is sold as Request Units per second (RU/s). Reads, writes, queries each cost RUs — predictability over raw IOPS.
▸Multi-region writes are turnkey. Tick a checkbox; reads and writes happen in the closest region. Conflicts resolve via last-write-wins or a custom merge function.
▸Five consistency levels. From strong (linearizable, slow) to eventual (fast, possibly stale). Most production apps run on session — strong enough, fast enough.

You open the Azure portal, click + Cosmos DB, fill in three fields, and 90 seconds later you have a globally replicated, infinitely-scaling, multi-model database. The same one Microsoft uses to back parts of Xbox Live, LinkedIn, and Skype. So what’s actually inside that turnkey button?

This lesson is the panoramic view. The other 19 lessons drill into pieces; here we just look at the whole machine.

The five APIs

Cosmos DB is one storage engine with five wire-protocol front-ends:

API	Wire protocol	Use when
SQL (Core)	HTTPS + JSON	New apps. The most powerful — has stored procedures, triggers, vector search.
MongoDB	Mongo wire	You have an existing Mongo app. Existing drivers connect unchanged; feature coverage depends on the Mongo server version the account emulates.
Cassandra	CQL	Existing Cassandra app — wide-column data model.
Gremlin	Apache TinkerPop	Graph data: friend networks, recommendation graphs, knowledge graphs.
Table	Azure Table Storage	Legacy migration from Azure Table Storage.

You pick one per account at creation time. You can’t mix — but you can run multiple Cosmos accounts side-by-side.

The rest of this course assumes the SQL (Core) API unless explicitly noted. It’s the API Microsoft is investing in most heavily, and it’s what you’d pick for any new application.

Three nouns to know

Cosmos’s hierarchy is shorter than you’d expect:

Account                          (one per region group, has its own endpoint URL)
  └── Database                   (logical namespace — like a Postgres database)
      └── Container              (the unit of throughput + partitioning — like a table)
          └── Item               (a JSON document)

That’s it. No tables-of-tables, no schemas, no namespaces inside namespaces. Container is the unit you reason about. Every container has:

A partition key — a JSON path you choose, e.g. /tenantId or /userId. This is the single most consequential decision (lesson V02 is dedicated to it).
A throughput setting in RU/s, either provisioned (fixed) or autoscale (scales automatically between 10% and 100% of a maximum you set).
An indexing policy — by default, every path is indexed. You opt out for cost (lesson V07).
A TTL setting — optional auto-expiry on items.
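
To make those four settings concrete, here is a minimal sketch that builds a dict shaped like a container definition. The field names (`partitionKey`, `indexingPolicy`, `defaultTtl`) follow the JSON resource shape as commonly documented, but treat them as assumptions and check them against the SDK you use. The helper `container_definition`, the container name `orders`, and the path `/rawPayload/*` are hypothetical. Throughput is left out because it is set alongside the container rather than inside its definition.

```python
from typing import Optional


def container_definition(
    container_id: str,
    partition_key_path: str,
    default_ttl_seconds: Optional[int] = None,
    excluded_paths: tuple = (),
) -> dict:
    """Build a dict shaped like a container resource (illustrative only)."""
    if not partition_key_path.startswith("/"):
        raise ValueError("partition key must be a JSON path such as /tenantId")
    definition: dict = {
        "id": container_id,
        "partitionKey": {"paths": [partition_key_path], "kind": "Hash"},
        "indexingPolicy": {
            "indexingMode": "consistent",
            "automatic": True,
            # Default behaviour: every path is indexed.
            "includedPaths": [{"path": "/*"}],
            # Opt specific paths out to reduce write cost.
            "excludedPaths": [{"path": p} for p in excluded_paths],
        },
    }
    if default_ttl_seconds is not None:
        # Items expire this many seconds after their last write.
        definition["defaultTtl"] = default_ttl_seconds
    return definition


orders = container_definition(
    "orders",
    "/tenantId",
    default_ttl_seconds=86_400,
    excluded_paths=("/rawPayload/*",),
)
```

The point of the sketch is that the partition key is part of the container's identity, while indexing and TTL are policies layered on top of it.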

Partitioning, in one paragraph

Every distinct partition-key value defines its own logical partition, and a logical partition is capped at 20 GB. Cosmos hashes the key value to place each logical partition on a physical partition (the replicated servers that actually hold the data). Each physical partition holds up to 50 GB and serves up to 10,000 RU/s. When a physical partition approaches those limits, Cosmos automatically splits it behind your back and redistributes its logical partitions, with no downtime and no migration on your side.

What this means for you: all documents sharing a partition-key value live together and can be queried/transacted atomically. Crossing partitions costs more RUs and breaks transactions. Picking the wrong key — /status (only 3 distinct values) or /customerId for one giant customer — creates hot partitions that throttle while everything else sits idle.
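
A toy model makes the hot-partition problem visible. The sketch below hashes key values onto a fixed number of physical partitions and counts where items land. It is not Cosmos's real placement logic: the MD5-plus-modulo hash, the count of 4 partitions, and the key values are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter

PHYSICAL_PARTITIONS = 4  # hypothetical; the service picks and changes this itself


def physical_partition(key_value: str, partitions: int = PHYSICAL_PARTITIONS) -> int:
    """Map a partition-key value to a partition by hashing (toy stand-in)."""
    digest = hashlib.md5(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def load_per_partition(key_values) -> Counter:
    """Count how many items land on each physical partition."""
    return Counter(physical_partition(v) for v in key_values)


# 10,000 items keyed by a high-cardinality value vs. a 3-value status field.
by_user = load_per_partition(f"user-{i}" for i in range(10_000))
by_status = load_per_partition(
    ["open", "closed", "pending"][i % 3] for i in range(10_000)
)

print("by /userId:", sorted(by_user.items()))
print("by /status:", sorted(by_status.items()))
```

With many distinct user ids the items spread roughly evenly. With three status values, at most three partitions ever receive data, and each of those carries at least a third of the load no matter how many partitions exist.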

The currency: Request Units

Cosmos charges in RUs, not IOPS or seconds. Every operation costs a deterministic number of RUs:

Operation	Approximate cost
Point read (by id + partition key), 1 KB item	1 RU
Single-partition query, 10 results	~5–10 RUs
Cross-partition query (fan-out)	30+ RUs
Insert / replace, 1 KB doc, indexed	5–7 RUs
Delete	~5 RUs

You provision RU/s at the container (or database) level. Hit the limit, requests get HTTP 429 and the SDK retries with backoff. Lesson V09 dives in.
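
The retry behaviour is easy to picture with a small sketch. This is not the SDK's implementation: `ThrottledError` is a hypothetical stand-in for an HTTP 429 response, and `retry_after_ms` models the retry hint the service sends back with it (the `x-ms-retry-after-ms` header). The jitter amounts are arbitrary.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, retry_after_ms: int):
        super().__init__(f"429: retry after {retry_after_ms} ms")
        self.retry_after_ms = retry_after_ms


def call_with_backoff(operation, max_attempts: int = 5, sleep=time.sleep):
    """Run operation(), retrying on ThrottledError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttle to the caller
            # Wait at least as long as the server asked, plus growing jitter.
            jitter_ms = random.uniform(0, 10 * attempt)
            sleep((err.retry_after_ms + jitter_ms) / 1000)


# Usage: an operation that is throttled twice, then succeeds.
attempts = {"count": 0}


def flaky_write():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise ThrottledError(retry_after_ms=20)
    return "created"


waits = []  # record the sleeps instead of actually sleeping
result = call_with_backoff(flaky_write, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the example fast and testable. Retries hide short bursts, but sustained 429s mean the container needs more RU/s or a better partition key.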

The big idea: RUs make cost predictable. You can model a workload’s cost on a napkin: (reads × 1) + (writes × 6) + (queries × 8). No surprise IOPS bill at the end of the month.
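
The napkin formula translates directly into a few lines of code. The per-operation costs below are the approximate figures from the formula above. The 20% headroom and the rounding to multiples of 100 RU/s are assumptions for illustration, and the workload numbers are made up.

```python
import math

# Approximate per-operation costs, taken from the napkin formula.
RU_PER_READ = 1
RU_PER_WRITE = 6
RU_PER_QUERY = 8


def estimate_ru_per_second(reads: int, writes: int, queries: int) -> int:
    """Estimate RU/s from per-second operation counts."""
    return reads * RU_PER_READ + writes * RU_PER_WRITE + queries * RU_PER_QUERY


def provisioned_ru(peak_ru: int, headroom_pct: int = 20, step: int = 100) -> int:
    """Add headroom to the peak, then round up to a multiple of `step`."""
    return math.ceil(peak_ru * (100 + headroom_pct) / (100 * step)) * step


# Hypothetical peak workload: 500 reads, 100 writes, 50 queries per second.
peak = estimate_ru_per_second(reads=500, writes=100, queries=50)
to_provision = provisioned_ru(peak)
print(f"peak ~{peak} RU/s, provision {to_provision} RU/s")
```

Measured RU charges from your own requests should replace the constants once you have them; the napkin numbers are only a starting point.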

Five consistency levels

This is the headline feature. Cosmos lets you choose where on the consistency spectrum you want to sit, with an account-wide default that individual requests can relax:

Level	What it guarantees	When to use
Strong	Linearizable. Every read sees the latest committed write.	Financial ledgers. Slow at multi-region scale.
Bounded staleness	Reads can lag by at most K versions or T seconds.	Leaderboards, inventory — “fresh enough”.
Session	A single client session is read-your-writes consistent. Other sessions may see staleness.	The default. What 90% of apps want.
Consistent prefix	You see writes in order, but maybe not the latest.	Comments, feeds — order matters, freshness less so.
Eventual	All replicas converge eventually.	Likes, view counts, anything where stale-by-seconds is invisible.

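You set a default at the account level and override per request when needed, but a request can only relax the default, never strengthen it. Stronger = slower, and reads at strong or bounded staleness cost about twice the RUs of the three weaker levels. Lesson V04 walks through each with a concrete scenario.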
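
Session consistency is the level people most often misread, so here is a toy model of it. This is a sketch of the idea only: real session tokens are scoped per partition and carry more structure, and the `Replica` and `Session` classes are hypothetical. The client remembers the version of its last write and refuses to read from any replica that has not caught up to it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica that has applied writes up to `version` (toy model)."""
    version: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, version: int, key: str, value: str) -> None:
        self.data[key] = value
        self.version = version


@dataclass
class Session:
    """Tracks the highest version this client has written or read."""
    token: int = 0


def write(primary: Replica, session: Session, key: str, value: str) -> None:
    primary.apply(primary.version + 1, key, value)
    session.token = primary.version  # remember our own write


def session_read(replicas, session: Session, key: str):
    """Read from the first replica at least as fresh as the session token."""
    for replica in replicas:
        if replica.version >= session.token:
            session.token = replica.version  # never go backwards afterwards
            return replica.data.get(key)
    raise RuntimeError("no replica has caught up to this session yet")


def eventual_read(replicas, key: str):
    """Read from whichever replica comes first, however stale."""
    return replicas[0].data.get(key)


primary, lagging = Replica(), Replica()
alice, bob = Session(), Session()
write(primary, alice, "cart", "3 items")  # not yet replicated to `lagging`

alice_sees = session_read([lagging, primary], alice, "cart")
bob_sees = session_read([lagging, primary], bob, "cart")
```

Alice skips the lagging replica and sees her own write. Bob has no token yet, so the stale replica is acceptable for him and he sees nothing. That is the "other sessions may see staleness" clause in action.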

Multi-region writes — the magic checkbox

The feature that genuinely sets Cosmos apart: turnkey multi-master.

Add a region in the portal → that region becomes a peer write endpoint. Your app, using the SDK with PreferredLocations = ["West US 2", "East US"], automatically writes to the nearest region. Conflicts resolve by:

Last-write-wins (default) using a server-side timestamp
Custom merge via a stored procedure you provide
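
The two strategies differ only in what they do with the losing version. The sketch below shows the logic in Python for readability; in Cosmos a custom merge runs server-side as a stored procedure, so this is an illustration rather than deployable code. `_ts` is the system timestamp property on each item, while `merge_cart` and the sample documents are hypothetical.

```python
def last_write_wins(versions, path="_ts"):
    """Pick the conflicting version with the highest value at `path`.

    On a tie, the version listed first is kept.
    """
    return max(versions, key=lambda doc: doc[path])


def merge_cart(versions):
    """Hypothetical custom merge: union the items of every conflicting version."""
    merged = dict(last_write_wins(versions))  # start from the newest version
    merged["items"] = sorted({item for v in versions for item in v["items"]})
    return merged


# The same cart was updated concurrently in two write regions.
west = {"id": "cart-1", "items": ["book"], "_ts": 1700000005}
east = {"id": "cart-1", "items": ["lamp"], "_ts": 1700000009}

lww = last_write_wins([west, east])
merged = merge_cart([west, east])
```

Last-write-wins silently drops the book that was added in the west region. The merge keeps both items, at the price of writing and maintaining the merge logic yourself.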

Other databases either don’t do multi-master at all, or do it with painful caveats (eventual-only, no transactions, manual conflict files, etc.). Cosmos makes it a checkbox.

The trade-off is real: an account with multi-region writes cannot use strong consistency, so session or weaker is the practical choice. And throughput on a multi-region-write account is billed at a higher rate, in every region, than on a single-write-region account.

What’s next in this course

You’ve now seen the whole machine at altitude. The next 14 engineer-course lessons drill into the moving parts:

V02 — Partitioning (the #1 decision)
V03 — Data Modeling (embed vs reference)
V04 — Consistency Levels (the five knobs in detail)
V05 — CRUD, Batch & Transactions
V06–V07 — Querying & Indexing
V08 — SDK Best Practices
V09 — Request Units & Cost
V10 — Monitoring & Troubleshooting
V11 — Change Feed
V12 — Global Distribution
V13 — Security & Access
V14 — Vector Search & AI
V15 — Local Dev & CI/CD

If you want the underlying theory first — replication algorithms, storage engines, partitioning math — start with the Foundations track (F1–F5). Otherwise, jump into V02 next; partition keys are where most production pain originates.

🎯 Common questions

Q1. When should I pick Cosmos DB over a regular SQL database?

When you need (1) global distribution with single-digit ms latency in multiple regions, (2) automatic horizontal scaling without manual sharding, (3) tunable consistency per operation, or (4) multi-model access (e.g. SQL queries on JSON documents). If your data fits comfortably on one Postgres node and you don't need multi-region writes, plain Postgres is simpler and cheaper.

Q2. How is Cosmos different from MongoDB Atlas or DynamoDB?

All three are managed NoSQL. Cosmos's differentiators are its multi-API surface (SQL + Mongo wire protocol + Cassandra wire protocol), five named consistency levels (vs Mongo's read and write concerns and Dynamo's two read modes), and SLA-backed latency at the 99th percentile in every region. DynamoDB is simpler and often cheaper at low scale; Mongo Atlas has the richest aggregation pipeline; Cosmos wins on multi-region writes + flexibility.

Q3. What's the cheapest way to start?

Serverless mode (no minimum RU/s, pay per request) for dev/test or spiky workloads. Free tier gives you 1,000 RU/s + 25 GB free for the lifetime of the account, limited to one free-tier account per subscription, which is enough for most personal projects.

Q4. What does "schema-free" actually mean in Cosmos?

You don't declare columns up-front. Every document is JSON; fields can vary across documents in the same container. Cosmos still indexes everything by default — meaning queries work on any field, but write cost grows with index breadth (lesson V07 covers tuning this).