Consistency trade-offs
Strong vs eventual vs causal, quorum, read-your-writes.
CAP is not a trivia question. It's the trade-off that every distributed system lives under, and getting it wrong is how you end up with "strong consistency" backed by a single node — or "eventual consistency" on data that absolutely cannot be eventually wrong.
Read this if your last attempt…
- You said "strong consistency" for the whole system without justifying it per data class
- You can't explain the difference between linearizable, sequential, causal, and eventual
- You've heard of CAP but haven't heard of PACELC
- You default to "eventual consistency" for everything because "it's more scalable"
The concept
CAP vs PACELC
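The CAP theorem says: in the presence of a network partition (which will happen in any distributed system), you choose between Consistency (all nodes see the same data) and Availability (every request gets a response). Partition tolerance is mandatory because networks fail, so you're really choosing between C and A during partitions. PACELC adds the half that CAP leaves out: if there is a Partition, choose Availability or Consistency; Else, in normal operation, choose Latency or Consistency. Without a partition you can have both C and A, but you pay for the C in latency on every request.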
Not a binary choice. Each data class picks its level on this spectrum — strong costs latency, eventual costs correctness guarantees.
Consistency levels you'll actually cite.
| Level | Guarantee | Cost | Typical use |
|---|---|---|---|
| Linearizable (strong) | Every read sees the latest committed write | Needs consensus; high latency | Uniqueness checks, financial balances, inventory |
| Sequential | All nodes agree on operation order | Similar to linearizable; needs global ordering | Audit logs, state machines, distributed locks |
| Causal | Causally related operations appear in order | Vector clocks; moderate overhead | Comments (reply after parent), collaborative docs |
| Read-your-writes | A client sees its own writes immediately | Session affinity or sticky reads | Dashboards, edit-then-view flows, profile updates |
| Monotonic reads | A client never sees time go backwards | Session routing / version tracking | Timelines, comment threads, notifications |
| Eventual | Replicas converge given no new writes | Cheapest — no coordination | Counts, metrics, recommendations, search indexes |
Variants
Linearizable (strong consistency)
Every read returns the most recent committed write, as if there's a single copy of the data.
N=3, W=2, R=2. Any write quorum (2 nodes) and any read quorum (2 nodes) must overlap in at least one node — that node has the latest write.
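The overlap claim can be checked by brute force rather than taken on faith. The sketch below enumerates every possible write quorum and read quorum for small clusters and confirms that they always share a node exactly when W + R > N. The function names are illustrative, not from any library.

```typescript
// Enumerate every subset of `size` node indices out of n nodes.
function subsets(n: number, size: number): number[][] {
  const out: number[][] = [];
  const pick = (start: number, chosen: number[]): void => {
    if (chosen.length === size) {
      out.push(chosen.slice());
      return;
    }
    for (let i = start; i < n; i++) {
      chosen.push(i);
      pick(i + 1, chosen);
      chosen.pop();
    }
  };
  pick(0, []);
  return out;
}

// True when every possible write quorum shares at least one node
// with every possible read quorum.
function everyQuorumPairOverlaps(n: number, w: number, r: number): boolean {
  const writes = subsets(n, w);
  const reads = subsets(n, r);
  return writes.every((ws) =>
    reads.every((rs) => ws.some((node) => rs.indexOf(node) !== -1)),
  );
}

// The arithmetic shortcut the brute force should agree with.
function overlapByArithmetic(n: number, w: number, r: number): boolean {
  return w + r > n;
}
```

For N=3 this reports overlap for W=2, R=2 and no guaranteed overlap for W=1, R=1 or W=2, R=1.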
How it works: Single-leader with synchronous replication to a quorum, or a consensus protocol like Raft or Paxos. Every write blocks until a majority of replicas acknowledge.
Real-world systems: Google Spanner (TrueTime + Paxos), CockroachDB, etcd/ZooKeeper (for metadata/coordination).
Cost: Every write has the latency of the slowest quorum member. For a 3-replica cluster with one cross-AZ replica, that's ~2-5ms extra per write. For cross-region: 50-200ms per write — often prohibitive.
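To make the cost concrete: a quorum write completes when the W-th acknowledgement arrives, so its latency is the W-th fastest ack, not the average. A minimal sketch, assuming you already know each replica's ack time; the function name and the example latencies are illustrative.

```typescript
// ackLatenciesMs: time for each replica (including the leader's own local
// write) to acknowledge. The write completes when the w-th ack arrives.
function quorumWriteLatencyMs(ackLatenciesMs: number[], w: number): number {
  if (w < 1 || w > ackLatenciesMs.length) {
    throw new Error("w must be between 1 and the number of replicas");
  }
  const sorted = ackLatenciesMs.slice().sort((a, b) => a - b);
  return sorted[w - 1];
}
```

With hypothetical ack times of 0.5 ms (local), 2 ms (same region), and 80 ms (remote region), W=2 costs 2 ms, while W=3 drags every write out to 80 ms. That gap is why the quorum's placement matters more than its size.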
When it's worth it:
- Uniqueness constraints (username, short-URL alias, ticket seat)
- Financial transactions (account balance, transfer)
- Distributed locks / leader election
- Anything where reading stale data causes a business-level bug (double-booking, double-spending)
When it's NOT worth it: Anything read-heavy where staleness is tolerable. Forcing linearizable reads on a social media timeline kills throughput for zero user benefit.
Choose this variant when
- Uniqueness checks
- Financial data
- Inventory / booking
- Coordination metadata
Causal consistency
Causally related operations appear in order; concurrent operations may appear in any order.
Regional primary (most common) vs Spanner-style (globally linearizable) vs multi-leader (eventual). Picking wrong costs either write latency or correctness.
The insight: most operations aren't concurrent with each other — they have causal relationships. A reply to a comment is causally dependent on the comment. Causal consistency guarantees you'll never see the reply without the comment.
How it works: Track dependencies with vector clocks or logical timestamps. When operation B depends on A, B carries A's timestamp, and no replica serves B until it has also processed A.
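The "no replica serves B until it has processed A" rule can be sketched as a replica that buffers operations until their dependencies are visible. This is a simplified illustration that tracks explicit dependency ids rather than full vector clocks; `Op` and `CausalReplica` are hypothetical names.

```typescript
interface Op {
  id: string;
  dependsOn: string[]; // ids of operations this one causally follows
  payload: string;
}

class CausalReplica {
  private applied: { [id: string]: true } = {};
  private pending: Op[] = [];
  readonly visible: string[] = []; // payloads in the order they became visible

  receive(op: Op): void {
    // Ignore duplicates delivered more than once by replication.
    if (this.applied[op.id] || this.pending.some((p) => p.id === op.id)) return;
    this.pending.push(op);
    this.drain();
  }

  private ready(op: Op): boolean {
    return op.dependsOn.every((dep) => this.applied[dep] === true);
  }

  // Apply buffered ops whose dependencies are satisfied; repeat until a
  // full pass over the buffer makes no progress.
  private drain(): void {
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (let i = 0; i < this.pending.length; i++) {
        const op = this.pending[i];
        if (this.ready(op)) {
          this.applied[op.id] = true;
          this.visible.push(op.payload);
          this.pending.splice(i, 1);
          progressed = true;
          break;
        }
      }
    }
  }
}
```

If a reply arrives before its parent comment, the replica holds the reply back and shows nothing; once the comment arrives, both become visible in causal order. Independent operations are shown in arrival order, which may differ between replicas.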
Why it matters: Causal consistency gives you much of the benefit of strong consistency with much less coordination overhead. You don't need global ordering — just ordering of related operations.
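Real-world systems: MongoDB (causal consistency sessions), COPS (academic). Cassandra's lightweight transactions are sometimes listed here, but they run Paxos and provide linearizable compare-and-set, which is stronger (and costlier) than causal ordering.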
Example: In a collaborative document, if User A types "Hello" and User B replies "Hi", causal consistency ensures every user sees "Hello" before "Hi". But two independent edits to different paragraphs can appear in either order.
Choose this variant when
- Comment threads (reply after parent)
- Collaborative editing
- Chat message ordering within a conversation
- Any case where "this happened because of that" matters
Read-your-writes
A client always sees its own writes immediately; other clients may see a delay.
Write returns an LSN; client stores it in session; subsequent reads pass the LSN; replica waits until it has applied up to that LSN before answering.
The most common "good enough" consistency level. Users expect that after they click "Save", they see their changes. They don't expect other users to see the change instantly.
Implementation options:
1. Session affinity: route the user's reads to the same replica they wrote to. Simple but brittle (what if that replica fails?).
2. Read-after-write token: the write returns a logical timestamp; the client passes it on the next read; the replica waits until it's caught up to that timestamp before responding.
3. Read from leader: after a write, read from the leader for N seconds, then fall back to replicas. Blunt but effective.
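The third option is small enough to sketch in full. Assuming the session records when it last wrote, the router only needs a time comparison. `chooseReadTarget` and its parameters are hypothetical names, and the window should be chosen to exceed your observed replication lag.

```typescript
type ReadTarget = "leader" | "replica";

// lastWriteAtMs: when this session last wrote (undefined if it never has).
// pinWindowMs: how long to keep reading from the leader after a write.
function chooseReadTarget(
  lastWriteAtMs: number | undefined,
  nowMs: number,
  pinWindowMs: number,
): ReadTarget {
  if (lastWriteAtMs === undefined) return "replica";
  return nowMs - lastWriteAtMs < pinWindowMs ? "leader" : "replica";
}
```

The trade-off is that the leader absorbs extra read traffic from every recently active writer, so this works best when writes per session are infrequent.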
Where most candidates go wrong: They promise "strong consistency" when all they actually need is read-your-writes. The difference is huge: strong consistency requires every read from every client to see the latest write, which needs consensus. Read-your-writes only requires your own session to be consistent, which just needs routing.
Cost: Nearly free — just route correctly. No consensus needed.
Choose this variant when
- User dashboards
- Profile edits
- Settings changes
- Any "I just saved, why don't I see it?" scenario
Eventual consistency
All replicas converge to the same state given no new writes — but there's no guarantee on when.
Two replicas accept conflicting writes to the same field. Each strategy picks a different tradeoff between simplicity, correctness, and data preservation.
Default for read-heavy, scale-out systems. Async replication means writes are fast (ack from one node) and replicas catch up in the background. The convergence window is typically milliseconds to low seconds.
The catch: conflict resolution. If two replicas accept conflicting writes concurrently (e.g., two users edit the same field), you need a strategy:
- Last-writer-wins (LWW): use timestamps; latest write wins. Simple but loses data silently.
- Merge / CRDT: automatically merge concurrent updates (e.g., sets can union, counters can add). No data loss but not always possible for arbitrary data.
- Application-level resolution: surface the conflict to the user (like Git merge conflicts).
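The difference between the first two strategies is easiest to see side by side. The sketch below is illustrative only: `mergeLww` keeps one of two conflicting writes and drops the other, while `mergeGrowOnlySet` keeps both additions. The type and function names are assumptions, not any database's API.

```typescript
interface Stamped<T> {
  value: T;
  timestampMs: number;
  replicaId: string; // tie-breaker so every replica picks the same winner
}

// Last-writer-wins: keep the later write; the other is discarded silently.
function mergeLww<T>(a: Stamped<T>, b: Stamped<T>): Stamped<T> {
  if (a.timestampMs !== b.timestampMs) {
    return a.timestampMs > b.timestampMs ? a : b;
  }
  return a.replicaId > b.replicaId ? a : b;
}

// Grow-only set merge: union, so concurrent additions both survive.
function mergeGrowOnlySet(a: string[], b: string[]): string[] {
  const merged = a.slice();
  for (const item of b) {
    if (merged.indexOf(item) === -1) merged.push(item);
  }
  return merged.sort();
}
```

Both merges give the same result regardless of argument order, which is what lets replicas converge. The LWW version converges by throwing one write away, and its correctness depends on the timestamps being comparable across replicas.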
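Real-world systems: DynamoDB (default reads), Cassandra (at consistency level ONE), DNS. S3 was the textbook example until it moved to strong read-after-write consistency in December 2020.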
When it's appropriate:
- Click counts, view counts, analytics
- Recommendation scores
- Search index updates
- Session data that's regenerable
- Anything where "slightly stale" is invisible to the user
When it's NOT appropriate: Anything where the divergence window creates a business bug (double-sell, double-book, lost money).
Choose this variant when
- Counters and analytics
- Search indexes
- Recommendations
- CDN cache content
- Any data where millisecond staleness is invisible
Worked example
Scenario: designing an e-commerce platform. The interviewer asks about consistency.
Data class analysis:
Product inventory → Linearizable. Overselling is a business-critical bug. Write quorum W=2 of N=3 on decrement; read quorum R=2 on stock check. Or single-leader with sync replication.
User cart → Read-your-writes. User adds item, expects to see it. No other user reads this cart. Route reads to the same replica via session affinity or read-after-write token.
Product reviews → Eventual. A review appearing 5 seconds late is invisible. Async replication to read replicas; replicas serve the read-heavy review page.
Product catalog (name, price, description) → Eventual with short window. Price changes propagate within seconds but don't need consensus. Cache with 30s TTL.
Order confirmation → Linearizable. The order record must be durable and consistent before sending the confirmation email. Write to leader + sync replicate to standby before returning 200.
What the interviewer hears: "I'm not using one consistency level for the whole system. Inventory gets linearizable because overselling costs money. Cart gets read-your-writes because only the owner reads it. Reviews are eventual because seconds of staleness are invisible. I can implement this with a single Postgres cluster: SELECT FOR UPDATE on inventory, session-pinned reads for cart, read-replica serving for reviews."
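One way to keep the per-class decisions honest is to write them down as data instead of leaving them in people's heads. A minimal sketch of the mapping above; the type names and route labels are illustrative assumptions, not a framework API.

```typescript
type ConsistencyLevel = "linearizable" | "read-your-writes" | "eventual";
type DataClass = "inventory" | "cart" | "reviews" | "catalog" | "order";

// The worked example's decisions, one entry per data class.
const consistencyPolicy: { [K in DataClass]: ConsistencyLevel } = {
  inventory: "linearizable",
  cart: "read-your-writes",
  reviews: "eventual",
  catalog: "eventual",
  order: "linearizable",
};

type ReadRoute = "leader" | "session-pinned-replica" | "any-replica";

// Translate the required level into where a read is allowed to go.
function readRouteFor(dataClass: DataClass): ReadRoute {
  const level = consistencyPolicy[dataClass];
  if (level === "linearizable") return "leader";
  if (level === "read-your-writes") return "session-pinned-replica";
  return "any-replica";
}
```

Because the policy is a typed map, adding a new data class without choosing a level is a compile error, which forces the conversation to happen at design time.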
Quorum math for inventory: N=3 replicas, W=2, R=2. W+R=4 > N=3, so reads always overlap with the latest write. Writes are slower by ~2ms (latency of second replica within AZ), but inventory accuracy is worth it.
Good vs bad answer
Interviewer probe
“What consistency model does your system use?”
Weak answer
"Strong consistency — we use Postgres which is ACID compliant."
Strong answer
"It depends on the data class. Inventory is linearizable — I use SELECT FOR UPDATE on the stock count because overselling costs real money. The user's cart is read-your-writes — I route their reads to the replica they last wrote to via a session token, which is nearly free. Product reviews are eventually consistent — async replication to read replicas is fine because a 2-second delay on a review is invisible. I'm not paying the consensus overhead for data that doesn't need it."
Why it wins: Names three different consistency levels for three data classes, justifies each with a business reason, costs out the trade-off, and doesn't over-engineer.
When it comes up
- Whenever a database is introduced in the HLD
- During multi-region design — CAP becomes acute
- When replication lag, stale reads, or read-your-writes come up
- Whenever "ACID", "strong consistency", or "eventual consistency" enters the conversation
- In any deep-dive on inventory, booking, balance, or uniqueness
Order of reveal
1. Reject one-size-fits-all. "Consistency isn't a system-wide switch; it's a per-data-class decision. Different pieces of data need different guarantees."
2. Classify the data. "Three classes here: must-be-exact (inventory, balance, uniqueness), must-see-my-own-writes (dashboards, profiles), and stale-is-fine (counts, analytics, search results)."
3. Assign the minimum viable level per class. "Linearizable for inventory, because overselling costs money. Read-your-writes for the user's own dashboard: route reads with session affinity. Eventual for review counts; they'll converge in seconds."
4. State the cost of each choice. "Linearizable writes pay quorum latency: ~2 ms intra-AZ, 50-100 ms cross-region. Read-your-writes is nearly free. Eventual is the cheapest at every layer."
5. Use PACELC, not just CAP. "CAP is about partitions, which are rare. PACELC includes normal-operation latency vs consistency, which affects every request. Most systems are PA/EL or PC/EC."
6. Prove consistency with quorum math if quorum is used. "N=3 replicas, W=2, R=2. W+R=4 > N=3, so reads overlap with the latest write. Linearizable by construction."
Signature phrases
- “Consistency is per data class, not per system” — The single most important reframe; separates seniors instantly.
- “PACELC, not just CAP” — Signals depth beyond the interview cliché.
- “Read-your-writes is cheap; strong consistency is expensive” — Prevents over-engineering.
- “W + R > N ⇒ linearizable reads” — Concrete math, not vibes.
- “Eventual is fine when staleness is invisible” — Justifies the choice with a user-centric test.
- “"ACID" is local; CAP is distributed” — Catches the common conflation.
Likely follow-ups
“Your system is globally distributed. Give me the full consistency story across regions.”
The core tension: cross-region latency is 50-200 ms. Synchronous cross-region replication turns every strongly-consistent write into a multi-region round-trip. Most systems can't pay this.
Three common shapes:
1. Regional primary, global async replicas (most common).
- Each user is pinned to their home region. Writes go to the regional primary.
- Async replicas in other regions for DR and global read traffic.
- Consistency: strong within region, eventual across regions.
- Cost: cross-region reads from replicas may be seconds behind. Failover to another region has RPO > 0.
- When: geographically partitioned users (US users mostly write US data, EU users mostly write EU data).
2. Spanner-style globally consistent.
- Shards distributed across regions; each shard's replicas span 3+ regions; commit requires a majority.
- Consistency: linearizable globally.
- Cost: write latency ~100-150 ms (cross-region RTT for quorum). Expensive engineering (TrueTime, GPS clocks).
- When: regulatory requirement for strong global consistency (financial exchanges, compliance-heavy). Rare.
3. Multi-leader with conflict resolution.
- Every region accepts all writes; async replication between regions; conflicts resolved by LWW, CRDTs, or app-level merge.
- Consistency: eventual with convergence.
- Cost: conflicts are a real problem; LWW loses data silently, CRDTs limit data types.
- When: high write rate with clear convergence semantics (collaborative editing, shopping carts). Cassandra, DynamoDB global tables, Riak.
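The first shape (regional primary, global async replicas) comes down to a routing rule. A minimal sketch, assuming each user record carries a home region; `User`, `Route`, and both functions are hypothetical names.

```typescript
interface User {
  id: string;
  homeRegion: string;
}

interface Route {
  region: string;
  role: "primary" | "replica";
}

// Writes always go to the primary in the user's home region.
function routeWrite(user: User): Route {
  return { region: user.homeRegion, role: "primary" };
}

// Reads from the owner's home region can use the primary (strong within
// region). Reads from anywhere else use the local async replica and may
// be stale.
function routeRead(owner: User, readerRegion: string): Route {
  return readerRegion === owner.homeRegion
    ? { region: owner.homeRegion, role: "primary" }
    : { region: readerRegion, role: "replica" };
}
```

A write from a travelling user still crosses regions to reach their home primary. That is the price of keeping a single writer per user.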
Per-data-class across regions:
- Global balance / inventory: Spanner-style or regional-primary with cross-region confirmation.
- User profile: regional primary, eventual cross-region.
- Analytics: eventual everywhere, async streaming to a central warehouse.
- Session state: regional (no cross-region consistency needed if session is tied to the user's home region).
The honest interview answer: "Global strong consistency is expensive. I'd partition by user region and accept eventual for cross-region reads. For truly global strong data (like unique usernames), I'd either use a dedicated Spanner-backed service or accept a single-region bottleneck for that one check."
“What is read-your-writes consistency and how do you implement it without requiring strong consistency?”
Definition: a client always sees its own writes, even if other clients might not yet. Weaker than linearizable (which requires every client to see the write). Stronger than eventual.
Why it matters: most user-facing "consistency" complaints are really RYW violations. A user saves a setting, reloads, sees the old value — they're confused. But whether another user sees the new setting immediately is irrelevant.
Three implementation strategies:
1. Session affinity (sticky reads).
- Route all reads for a user session to the same replica they wrote to.
- Cheap, simple, works for session lifetime.
- Breaks if the replica fails mid-session (fallback to leader).
2. Read-after-write token / LSN tracking.
- Write returns a logical timestamp (Postgres LSN, Cassandra write time, MongoDB's ClusterTime).
- Client stores the token in the session.
- Subsequent reads pass the token; the replica either:
  - Waits until it has applied up to that LSN, OR
  - Rejects the read (redirect to another replica or the leader).
- More robust than session affinity; handles replica failover gracefully.
3. Leader reads for N seconds post-write.
- After a write, route reads to the leader for a bounded time (say 5 seconds — longer than typical replica lag).
- After the window, reads go to replicas.
- Blunt but works for most apps. Simple to implement if you already track "time since last write" per session.
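Strategy 1, including its failure fallback, can be sketched as a small routing function. This assumes the session remembers which node accepted its last write; `Session` and `pickReadNode` are hypothetical names, and health checking is passed in as a callback rather than implemented.

```typescript
interface Session {
  lastWriteNode?: string; // node that accepted this session's latest write
}

// Read from the node the session last wrote to. If that node is down,
// fall back to the leader. Sessions that never wrote can read anywhere.
function pickReadNode(
  session: Session,
  anyReplica: string,
  isHealthy: (node: string) => boolean,
  leader: string,
): string {
  const pinned = session.lastWriteNode;
  if (pinned === undefined) return anyReplica;
  return isHealthy(pinned) ? pinned : leader;
}
```

Falling back to the leader rather than to another replica is the important detail: a different replica might not have the write yet, which would silently break the guarantee.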
Cost comparison:
- Linearizable reads: quorum read on every request. Full quorum latency for every read.
- Read-your-writes: session routing or token check. ~0 extra latency; uses existing replication.
Typical production shape: token-based RYW for the user's own sessions, eventual reads for other users' views of the same data. This is what MongoDB's causal consistency sessions implement; Postgres supports it via pg_current_wal_lsn() + pg_last_wal_replay_lsn() on replicas.
The interview move: when someone says "we need strong consistency," ask "strong for whom?" Usually they want RYW.
“You have a 3-node cluster with W=1, R=1. A user writes to node A, then reads from node B. What do they see?”
Probably not their write. W=1 means only node A has the write committed locally. R=1 means we return the value from whatever node the read hits — which could be B, which hasn't received the replication yet.
W + R = 2, N = 3. W + R ≤ N, so reads are NOT guaranteed to see the latest write. This is the eventual consistency case.
The user experience: they save, they refresh, they see stale data. Confusing and feels like a bug.
Options to fix, in order of cost:
1. Increase R to 3 (read from all, take latest).
- Now W + R = 4 > 3. Linearizable reads.
- Cost: reads need responses from all nodes; latency = max of N nodes.
- Good for small clusters where N=3 is typical.
2. Increase W to 2 (write quorum).
- N=3, W=2, R=1. W + R = 3 = N, so overlap is still not guaranteed (the inequality must be strict: W + R > N).
- Two of the three nodes have the write. A read that lands on either of them sees it; a read that lands on the third node is stale.
- Usually combined with R=2 (W + R = 4 > 3) so that every read quorum overlaps the write quorum.
3. Route the read to node A (session affinity / RYW).
- Now the read hits the same node as the write; immediate visibility.
- Only the writing user sees the write immediately; others still eventual.
- Cheapest option for the common case.
4. Use read-after-write tokens.
- Write returns the LSN. Subsequent reads pass the LSN. Replica waits to apply or redirects.
- Robust to replica failover.
The interview insight: the "correct" answer depends on what guarantee you need. For dashboards: option 3 (RYW). For uniqueness checks: option 1 or 2 (linearizable). For social-media timelines: accept option 0 (eventual) and move on.
“Two users edit the same document field simultaneously in an eventually consistent system. What happens and how do you resolve it?”
What happens by default: both writes are accepted by whatever replica they hit. Eventually they propagate. When the replicas meet, they see two conflicting versions.
Resolution strategies (pick one or combine):
1. Last-writer-wins (LWW) by timestamp.
- Each write carries a timestamp; the one with the later time wins.
- Simple, but loses data silently. If User A writes at t=10:00:00.500 and User B writes at t=10:00:00.501, A's change vanishes with no warning.
- Requires synchronised clocks — cross-region clock skew breaks this.
- When: data where "last edit wins" is semantically correct (user profile bio, where both users see "that's my edit now"). Not for collaborative work or counters.
2. CRDTs (Conflict-free Replicated Data Types).
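- Data types with mathematical merge rules. Counters (G-Counter, PN-Counter) merge by taking the per-replica maximum and summing. Sets (G-Set, OR-Set) merge by union. Registers either pick one value by timestamp (LWW-Register) or keep every concurrent value for the reader to resolve (MV-Register).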
- No data loss, deterministic merge.
- Limitation: works only for supported types. You can't CRDT-merge arbitrary application objects; you must design data structures that decompose into CRDT primitives.
- When: counters (likes, views), shopping carts (union of items), collaborative text (Y.js, Automerge).
3. Vector clocks + application-level merge.
- Each replica tracks per-node version vectors. Concurrent writes are detected explicitly.
- When conflict detected, return both values to the application. The app (or user) resolves.
- When: Git-like workflows, version-controlled documents, complex business objects where only a domain expert can merge.
4. Pessimistic locking.
- Before writing, acquire a distributed lock on the resource.
- Forces serialization; no conflicts possible.
- Cost: latency of the lock round-trip, availability risk if the lock service fails.
- When: narrow, high-value operations where you can't afford conflict (financial transfers, seat reservations).
5. Operational transformation (OT).
- Used by Google Docs, Office Online. Operations are transformed against concurrent operations so they compose correctly.
- Complex to implement; being displaced by CRDTs in newer systems.
- When: real-time collaborative editing where latency matters.
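Strategy 3 hinges on telling "one write supersedes the other" apart from "the writes are concurrent". A minimal sketch of that comparison using version vectors follows; the names are illustrative, and a real store would also bump the vector on each local write.

```typescript
type VersionVector = { [replicaId: string]: number };
type Ordering = "before" | "after" | "equal" | "concurrent";

// Compare two version vectors entry by entry; a missing entry counts as 0.
function compareVersions(a: VersionVector, b: VersionVector): Ordering {
  let aAhead = false;
  let bAhead = false;
  const ids = Object.keys(a).concat(Object.keys(b));
  for (const id of ids) {
    const av = a[id] || 0;
    const bv = b[id] || 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

interface Versioned<T> {
  value: T;
  version: VersionVector;
}

// Returns the surviving versions: one value if one write supersedes the
// other, both (siblings) if they are concurrent and the app must merge.
function reconcile<T>(a: Versioned<T>, b: Versioned<T>): Versioned<T>[] {
  const order = compareVersions(a.version, b.version);
  if (order === "after" || order === "equal") return [a];
  if (order === "before") return [b];
  return [a, b];
}
```

Unlike LWW, this never guesses from wall-clock time: a conflict is reported only when neither write had seen the other.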
Interview answer: "It depends on the data. Counters → CRDT (PN-Counter). User profile field → LWW with caveat documented. Collaborative doc content → CRDT (Y.js-style) or OT. Anything financial → pessimistic lock or strong consistency; I wouldn't use eventual for that at all."
“Why would you ever choose eventual over strong? Isn't strong always safer?”
Strong is never "free." It costs latency on every write and scalability on every read.
The concrete tradeoff:
| | Strong | Eventual |
|---|---|---|
| Write latency | Quorum round-trip (2-100+ ms) | Local node ack (sub-ms) |
| Write availability | Blocks if quorum unreachable | Accepts writes from any live node |
| Read throughput | Quorum read rate | Per-replica read rate × replica count |
| Complexity | Consensus protocol (Raft, Paxos) | Async replication |
| Cross-region cost | 50-200 ms per write | Negligible |
For a social media feed viewed 1M times/sec:
- Strong: every read is a quorum read. At 10K quorum reads/sec per unit of capacity, 1M reads/sec ÷ 10K = 100 units (for example, 100 replica groups). Expensive.
- Eventual: 1M reads/sec ÷ per-replica rate. At 50K reads/sec per replica: need 20 replicas. Each replica serves independently. Much cheaper, linearly scalable.
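The sizing above is a ceiling division, and it is worth being able to redo it on the spot when the interviewer changes the numbers. A minimal sketch; the function name is illustrative and the throughput figures are the ones quoted in the text.

```typescript
// How many units of capacity are needed to serve a target read rate,
// rounding up because you cannot provision a fraction of a unit.
function unitsNeeded(totalReadsPerSec: number, readsPerSecPerUnit: number): number {
  if (readsPerSecPerUnit <= 0) throw new Error("per-unit rate must be positive");
  return Math.ceil(totalReadsPerSec / readsPerSecPerUnit);
}
```

With the figures from the text, `unitsNeeded(1000000, 10000)` gives 100 for quorum reads and `unitsNeeded(1000000, 50000)` gives 20 for independent replicas.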
When eventual is genuinely safer:
- Partition tolerance. Strong requires majority available. During a network partition, a strong system refuses writes (unavailable). An eventual system keeps accepting writes (available). For a content feed, unavailable is worse than stale.
- Availability under node failure. Strong needs quorum; lose half the nodes, lose the service. Eventual degrades gracefully.
- Geographic distribution. Strong across continents is ~150 ms per write. Users won't wait. Eventual with regional primaries lets you serve from close by.
The test for "is eventual OK?":
1. Is the divergence window shorter than human perception (replicas typically converge in under 1 s)? If yes, eventual is invisible.
2. Does a stale read cause a business bug (double-sell, double-spend, incorrect balance)? If yes, eventual is not safe. Use strong.
3. Is the write rate high enough that strong consistency would create a single-node bottleneck? If yes, eventual is necessary for scale.
The interview mental model: "Strong is the default for data where mistakes cost money. Eventual is the default for data where mistakes are invisible. Pick per class."
Code examples
// --- Writer returns the logical timestamp (LSN / clusterTime / etag) ---
// Db and Doc are minimal placeholder types; substitute your client's own.
interface Db {
  execute(sql: string, params?: unknown[]): Promise<{ rows: any[] }>;
}
type Doc = unknown[];

async function writeAndGetToken(db: Db, doc: Doc): Promise<string> {
  await db.execute('INSERT INTO items (...) VALUES (...)', doc);
  // Read the WAL position after the insert has committed (autocommit assumed).
  // Capturing it in the INSERT's RETURNING clause can yield an LSN that
  // precedes the commit record, so a replica could pass the check too early.
  const result = await db.execute('SELECT pg_current_wal_lsn()::text AS lsn');
  return result.rows[0].lsn; // e.g. "0/1A4F3B8"
}

// --- Client stores the token in session cookie / localStorage ---
// Usage (session, db and doc come from the surrounding request handler):
//   session.set('min_read_lsn', await writeAndGetToken(db, doc));

// --- Subsequent read forces the replica to catch up ---
async function readWithRyw<T>(
  db: Db, // connection to a Postgres read replica
  sql: string,
  lsn: string,
  maxWaitMs = 500,
): Promise<T> {
  // 1. Compare the replica's replayed LSN with the required LSN.
  // 2. Poll for a bounded time, then give up so the router can redirect
  //    the read to the leader.
  const deadline = Date.now() + maxWaitMs;
  for (;;) {
    const status = await db.execute(
      'SELECT pg_last_wal_replay_lsn() >= $1::pg_lsn AS caught_up', [lsn]);
    if (status.rows[0].caught_up) break;
    if (Date.now() >= deadline) {
      throw new Error('replica has not replayed ' + lsn);
    }
    await new Promise((resolve) => setTimeout(resolve, 20));
  }
  const result = await db.execute(sql);
  return result.rows as T; // T is the caller's row-array type
}

type Versioned<T> = { value: T; ts: number };
interface Replica<T> {
  get(key: string): Promise<Versioned<T>>;
}

// Placeholder hook: a real client would push the winning value to any
// replica that returned older data.
function scheduleReadRepair<T>(
  replicas: Replica<T>[], key: string, winner: Versioned<T>): void {}

async function quorumRead<T>(
  replicas: Replica<T>[], // N replicas
  key: string,
  R: number, // read quorum size
): Promise<T> {
  // Send the read to all replicas and wait for each to answer or fail.
  // (A production client would return as soon as R replies arrive.)
  const results = await Promise.all(
    replicas.map((r) =>
      r.get(key).catch((e: unknown): { err: unknown } => ({ err: e }))),
  );
  const ok = results.filter((r): r is Versioned<T> => 'value' in r);
  if (ok.length < R) throw new Error('quorum not reached');
  // Latest-wins by timestamp; optionally trigger read-repair on stale replicas.
  const winner = ok.reduce((a, b) => (a.ts > b.ts ? a : b));
  scheduleReadRepair(replicas, key, winner);
  return winner.value;
}
// Guarantee: if W + R > N, any read quorum overlaps with the write quorum
// in at least one replica, so the latest write is always in the read set.
// Example: N=3, W=2, R=2 ⇒ W+R=4 > 3 ⇒ linearizable reads.

// Each replica tracks its own increments and decrements separately.
// Value = sum(increments) - sum(decrements). Merging is a pairwise max.
class PNCounter {
  // replicaId → count contributed by that replica.
  private P = new Map<string, number>(); // positive
  private N = new Map<string, number>(); // negative

  constructor(private readonly me: string) {}

  increment(by = 1) { this.P.set(this.me, (this.P.get(this.me) ?? 0) + by); }
  decrement(by = 1) { this.N.set(this.me, (this.N.get(this.me) ?? 0) + by); }

  value(): number {
    const sum = (m: Map<string, number>) =>
      [...m.values()].reduce((a, b) => a + b, 0);
    return sum(this.P) - sum(this.N);
  }

  // Merge with another replica's state. Idempotent, commutative, associative.
  // Concurrent increments are never lost, unlike LWW on a single integer.
  merge(other: PNCounter) {
    for (const [k, v] of other.P) this.P.set(k, Math.max(this.P.get(k) ?? 0, v));
    for (const [k, v] of other.N) this.N.set(k, Math.max(this.N.get(k) ?? 0, v));
  }
}

Common mistakes
Announcing "we use strong consistency" when most reads don't need it costs latency on every request and forces single-leader shapes that won't scale. Name the data that actually needs it.
Account balance, ticket inventory, alias uniqueness — these need stronger guarantees. "It'll converge" is not acceptable when the divergence window is a double-spend.
A single Postgres instance is ACID-compliant, but that says nothing about what happens with replicas, multi-region, or multiple services. ACID is local; CAP/PACELC is about the distributed system. Don't conflate them.
Eventual consistency IS faster — but the interviewer wants to hear: "the risk is a stale read during the convergence window, which for this data class is acceptable because [specific business reason]." Naming the risk is the signal.
If you say "we use quorum" but can't state N, W, R and prove W+R>N, the claim is empty. For N=3: W=2, R=2 → linearizable. W=1, R=1 → eventual. Know the numbers.
Practice drills
Interviewer: "Is your system CP or AP?" How do you answer?
Don't answer with one letter. Say: "It depends on the data class. Inventory is CP — during a partition, we reject writes rather than risk overselling. Product catalog is AP — during a partition, we serve stale data from replicas because a 30-second-old price is better than an error page. The system isn't uniformly CP or AP." Then mention PACELC for extra credit.
You have a 3-node cluster with W=1, R=1. A user writes to node A and immediately reads from node B. Do they see their write?
Not necessarily. W=1 means only node A has the write. R=1 means we read from one node, which might be B, which hasn't replicated yet. W+R=2 which is ≤ N=3, so reads are NOT guaranteed to see the latest write. For read-your-writes, either: (1) set R=3 (read from all, take latest), (2) route reads to the same node they wrote to, or (3) use a read-after-write token.
Two users simultaneously edit the same document field in an eventually consistent system. What happens?
It depends on the conflict-resolution strategy. LWW: whichever write has the later timestamp wins; the other is silently dropped. This is fine for "last edit wins" fields (user profile bio) but terrible for counters (two increments → only one counted). CRDTs: for supported types (counters, sets, registers), merges automatically with no data loss. For arbitrary data: surface the conflict to the application layer, like Git does.
Why would you ever choose eventual consistency over strong? Isn't strong always better?
Strong consistency costs latency and throughput. For N=3 with sync replication: every write is as slow as the slowest quorum member (2-5ms within AZ, 50-200ms cross-region). Read throughput is capped at quorum read rate. Eventual consistency: writes ack from one node (sub-ms), and reads can be served by any replica, so read capacity grows with every replica you add. For a social media feed viewed 1M times/sec, the choice between 1ms reads from any replica vs 5ms reads from quorum majority is the difference between 50 servers and 250 servers.
Cheat sheet
- Name the consistency level per data class, not per system.
- PACELC > CAP: Partition → A or C; Else → Latency or Consistency.
- Read-your-writes ≠ strong; it's much cheaper and usually what users actually need.
- Quorum: W+R>N → linearizable reads. Prove it with numbers.
- Eventual + idempotent reconciliation > strong + slow for most read-heavy data.
- Conflict resolution: LWW (lossy but simple) vs CRDTs (safe but limited types) vs app-level merge.
- Strong consistency tax: every write waits for quorum ack. Within AZ: ~2ms. Cross-region: 50-200ms.
- "We use Postgres so it's consistent" conflates ACID (local) with distributed consistency.