CRUD, Batch & Transactions — Cosmos DB

TL;DR

Cosmos has no `BEGIN TRANSACTION`. What it has instead is **transactional batch**: an atomic group of up to 100 operations, all within the same partition key. For multi-document workflows that span partitions, you fall back to compensating transactions (sagas) or an outbox; stored procedures do not help here, because they are also limited to a single partition. The good news: a point read of a 1 KB document costs 1 RU and is the fastest operation Cosmos offers.

Key takeaways

▸ A point read (id + partition key) of a 1 KB document costs 1 RU and typically completes in under 10 ms. It is the cheapest, fastest op in Cosmos.
▸ Use upsert when you don't care if the doc exists; use create + replace when you do.
▸ Transactional batch: up to 100 ops, must share a partition key, all-or-nothing, < 2 MB total.
▸ Stored procedures run server-side in JS, also single-partition, useful for read-then-write atomicity.
▸ Cross-partition transactions don't exist. If you need them often, your model is probably wrong (revisit lesson V03).

After modeling and partitioning, the day-to-day stuff — read a doc, write a doc, update a doc. Cosmos’s CRUD surface looks similar to a SQL database at first, but the rules underneath are different. The big surprise — there are no cross-partition transactions, and that constraint shapes everything else.

The basic operations

Op	RU cost	Notes
Point read	1 RU	id + partition key, fastest path
Query	2.5+ RUs	Indexed, can fan out across partitions
Create	~5–7 RUs (1 KB doc)	Fails if id already exists
Replace	~5–7 RUs	Fails if doc doesn’t exist
Upsert	~5–7 RUs	Create-or-replace, idempotent
Patch	~10 RUs	Partial update — change one field, don’t rewrite the doc
Delete	~5 RUs

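Patch is underrated. Take a 1 MB doc with a small views counter: incrementing via replace means reading the doc, changing it on the client, and sending all 1 MB back; via Patch you send only the counter change. The RU charge is about the same, but there is no client-side read-modify-write race.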
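A minimal sketch of that counter increment as a Patch call. It assumes the Python `azure-cosmos` SDK, whose container object exposes `patch_item`; the `/views` path and both helper names are illustrative, not part of the lesson.

```python
from typing import Any


def build_view_increment(amount: int = 1) -> list[dict[str, Any]]:
    """Return a patch-operation list that adds `amount` to the /views field."""
    # "incr" is the Cosmos DB patch operation for numeric increments.
    return [{"op": "incr", "path": "/views", "value": amount}]


def increment_views(container: Any, doc_id: str, partition_key: str) -> Any:
    """Send only the increment to the server instead of the whole document.

    `container` is assumed to be an azure.cosmos ContainerProxy, or any
    object with a compatible patch_item method.
    """
    return container.patch_item(
        item=doc_id,
        partition_key=partition_key,
        patch_operations=build_view_increment(),
    )
```

The request body is a few dozen bytes regardless of how large the document is, and the server applies the increment without the client ever reading the current value.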

Transactional batch

Cosmos’s “transaction” primitive. Up to 100 operations, all within the same logical partition (same partition-key value), all-or-nothing.

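# Each entry is (operation, positional args, optional kwargs), per the azure-cosmos Python SDK.
container.execute_item_batch(
    batch_operations=[
        ("create", (order,)),
        # Patch takes the item id and a list of patch operations.
        ("patch", (cart_id, [{"op": "set", "path": "/state", "value": "submitted"}])),
        ("create", (audit_event,)),
    ],
    partition_key=user_id,  # every operation must target this partition key value
)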

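If any op fails (validation error, etag conflict, throttle), all are rolled back. On success you get one result per operation; in the Python SDK a failed batch instead raises `CosmosBatchOperationError`, whose `error_index` points at the operation that failed.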

Constraints:

Same partition key for every op
Total payload < 2 MB
100 ops max
No queries inside the batch — just CRUD ops

This is the primitive for “place order, decrement inventory, log audit” within a single user/tenant scope.
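The constraints above are cheap to check on the client before sending a batch. The sketch below is one way to do it; the helper names are hypothetical, and the size check measures a JSON encoding of the operations, which only approximates the real wire payload.

```python
import json

MAX_BATCH_OPS = 100                # operation limit stated above
MAX_BATCH_BYTES = 2 * 1024 * 1024  # 2 MB payload limit stated above


def check_batch(operations: list) -> None:
    """Raise ValueError if the batch would break the op-count or size limit."""
    if not operations:
        raise ValueError("batch is empty")
    if len(operations) > MAX_BATCH_OPS:
        raise ValueError(f"{len(operations)} ops exceeds {MAX_BATCH_OPS}")
    # Approximation: size of the JSON-encoded operations, not the exact wire format.
    size = len(json.dumps(operations).encode("utf-8"))
    if size >= MAX_BATCH_BYTES:
        raise ValueError(f"payload of {size} bytes reaches the 2 MB limit")


def chunk_operations(operations: list, size: int = MAX_BATCH_OPS) -> list:
    """Split a long operation list into batch-sized chunks."""
    return [operations[i:i + size] for i in range(0, len(operations), size)]
```

Chunking trades away atomicity: each chunk is its own transaction, so it only suits bulk writes where partial success is acceptable.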

Stored procedures (use sparingly)

Server-side JavaScript. Runs inside the partition replica, sees the latest committed data, and needs no network hops between the read and the write. Good for read-then-write atomicity that batch can’t express (e.g. “find the doc with the lowest score, increment it”).

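function incrementScore(id) {
  const collection = getContext().getCollection();
  // Parameterized query: avoids building SQL from the raw id string.
  const query = {
    query: 'SELECT * FROM c WHERE c.id = @id',
    parameters: [{ name: '@id', value: id }],
  };
  const accepted = collection.queryDocuments(collection.getSelfLink(), query,
    (err, docs) => {
      if (err) throw err;
      if (!docs.length) throw new Error('not found');
      docs[0].score++;
      // Throwing on a failed replace rolls back the whole procedure.
      collection.replaceDocument(docs[0]._self, docs[0], (replaceErr) => {
        if (replaceErr) throw replaceErr;
      });
    });
  // false means the request was not queued (time or size budget exhausted).
  if (!accepted) throw new Error('query not accepted');
}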

The catches — JavaScript-only, hard to debug, single-partition only, and the runtime kills your sproc at 5 seconds. Most teams find batch + Patch cover everything stored procs would.

Cross-partition writes — the hard case

Sometimes you genuinely need to write across partitions atomically. Two patterns:

Saga — break the workflow into independently idempotent steps with compensating actions. Service A creates the order; if Service B’s payment fails, Service A’s compensator cancels the order. Frameworks — Temporal, Dapr, your own state machine.
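The saga idea reduces to a small loop: run each step, remember what succeeded, and on failure run the compensators in reverse. The sketch below is an in-memory illustration only; `Step` and `run_saga` are hypothetical names, and a real orchestrator (Temporal, Dapr, or your own) would also persist progress and retry compensators.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], None]      # performs the step; raises on failure
    compensate: Callable[[], None]  # undoes the step; should be idempotent


def run_saga(steps: list[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    done: list[Step] = []
    for step in steps:
        try:
            step.action()
            done.append(step)
        except Exception:
            # The failed step never completed, so only earlier steps are undone.
            for finished in reversed(done):
                finished.compensate()
            return False
    return True
```

In the order example, the first step's action creates the order and its compensator cancels it; a failed payment step then triggers the cancellation.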

Outbox — write the cross-partition intent as an event in the same partition as the primary write. A separate consumer (Change Feed, lesson V11) reads the outbox and applies the secondary write with retries. Eventual, but reliably eventual.
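A sketch of the write side of an outbox, under several assumptions: the container is partitioned on a `userId` field, the operation tuples follow the shape the Python `azure-cosmos` SDK accepts for `execute_item_batch`, and the event fields (`type`, `targetPartition`, `processed`) are invented for illustration.

```python
from typing import Any


def order_with_outbox(order: dict[str, Any], target_partition: str) -> list[tuple]:
    """Build batch operations that write an order and its outbox event together."""
    event = {
        # Deterministic id, so a retried batch cannot create a second event.
        "id": f"outbox-{order['id']}",
        "userId": order["userId"],  # same partition key value as the order
        "type": "outbox",
        "targetPartition": target_partition,
        "payload": {"orderId": order["id"]},
        "processed": False,
    }
    # Both creates go into one batch, so they commit or fail together.
    return [("create", (order,)), ("create", (event,))]
```

The resulting list would be passed as `batch_operations` with `partition_key=order["userId"]`. The consumer that later applies the secondary write should itself be idempotent, since it may see the same event more than once.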

If you’re doing this often, your model is probably wrong — re-read lesson V03 and look for a partition key that keeps related data together.

Idempotency, the unsung hero

The Cosmos SDKs retry on their own after network blips, throttling, and transient failures. Throttled (429) requests, for example, are retried up to 9 times by default. If your write isn’t idempotent, a retry at the SDK or application level can produce duplicates.

Two cheap habits — always use upsert instead of create when business logic permits, and always pass an explicit id (don’t let the SDK generate one). Then a retry of the same operation produces the same result.
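One way to combine the two habits is to derive the id from a business key, so every retry targets the same document. In this sketch a plain dict stands in for the container, and the namespace, field names, and helper names are all assumptions made for illustration.

```python
import uuid

# Hypothetical namespace for this application's ids; any fixed UUID works.
ORDER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "orders.example.com")


def deterministic_id(user_id: str, request_key: str) -> str:
    """Derive the same document id every time from the same business key."""
    return str(uuid.uuid5(ORDER_NAMESPACE, f"{user_id}:{request_key}"))


def save_order(store: dict, user_id: str, request_key: str, body: dict) -> str:
    """Upsert into `store`, a dict standing in for a Cosmos container."""
    doc_id = deterministic_id(user_id, request_key)
    # Keyed by (partition key, id): a repeat call overwrites, never duplicates.
    store[(user_id, doc_id)] = {**body, "id": doc_id, "userId": user_id}
    return doc_id
```

Calling `save_order` twice with the same `request_key` leaves exactly one document in the store, which is the behavior you want from a retried write.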

🎯 Common questions

Q1. What's the difference between upsert and replace?

Replace requires the document to exist — fails with 404 otherwise. Upsert inserts if missing, replaces if present. Use replace when "this should already exist" is part of your business logic; use upsert when you're idempotent by design (event handlers, syncs).
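The two semantics fit in a few lines against an in-memory stand-in. This is a toy model, not SDK code: the dict plays the role of one partition, and `NotFound` stands in for the 404 response.

```python
class NotFound(Exception):
    """Stand-in for the 404 a replace returns when the document is missing."""


def replace(store: dict, doc: dict) -> None:
    """Overwrite an existing document; fail if it is not there."""
    if doc["id"] not in store:
        raise NotFound(doc["id"])
    store[doc["id"]] = doc


def upsert(store: dict, doc: dict) -> None:
    """Insert if missing, overwrite if present; never fails on existence."""
    store[doc["id"]] = doc
```

Replace turns "this should already exist" into an error you can act on; upsert silently absorbs both cases, which is what makes it safe to repeat.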

Q2. How do I implement an atomic counter?

Three options: (1) a stored procedure that does read-modify-write server-side, (2) a Patch operation using `incr`, on its own or inside a transactional batch, which applies the increment atomically on the server, (3) for high-write counters, multiple bucket documents that you sum at read time (avoids the single-partition write hot-spot).
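Option (3) can be sketched as follows. A dict stands in for the container, each key represents one bucket document, and the bucket count of 10 is an arbitrary assumption; in Cosmos each bucket would need its own partition key value for the writes to actually spread out.

```python
import random

BUCKETS = 10  # assumed bucket count; tune to the write rate


def bucket_id(counter: str, bucket: int) -> str:
    """Name of one bucket document, e.g. 'page-views-3'."""
    return f"{counter}-{bucket}"


def increment(store: dict, counter: str, rng: random.Random) -> None:
    """Add 1 to a randomly chosen bucket (in Cosmos: a Patch `incr` on that doc)."""
    key = bucket_id(counter, rng.randrange(BUCKETS))
    store[key] = store.get(key, 0) + 1


def read_total(store: dict, counter: str) -> int:
    """Sum all buckets (in Cosmos: read each bucket doc, or run one query)."""
    return sum(store.get(bucket_id(counter, b), 0) for b in range(BUCKETS))
```

Writes get cheaper to scale and reads get more expensive: the total now costs one read per bucket, so this only pays off when increments far outnumber reads.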

Q3. Can I get a SQL-style transaction across two partition keys?

No. The recommended pattern is the **saga** — break the workflow into steps, each idempotent and reversible. If step 3 fails, run the compensating actions for steps 1 and 2. More work than `BEGIN/COMMIT`, but it's the price of horizontal scale.