CAP Theorem

0/2 in this phase0/45 across the roadmap

📖 Concept

The CAP Theorem states that a distributed system can provide at most two out of three guarantees simultaneously:

  • C (Consistency): Every read receives the most recent write
  • A (Availability): Every request receives a response (even if stale)
  • P (Partition Tolerance): System continues operating despite network failures between nodes

Why Can't You Have All Three?

Network partitions will happen in distributed systems. When they do, you must choose:

  • CP: Favor consistency — reject requests that can't guarantee fresh data (system becomes unavailable during partition)
  • AP: Favor availability — serve requests with potentially stale data (system stays available but may be inconsistent)

Real-World Database Classification

Database Category Behavior During Partition
PostgreSQL CP Rejects writes to disconnected replicas
MongoDB CP (default) Primary handles writes, disconnected nodes are unavailable
Cassandra AP All nodes accept writes, reconcile later
DynamoDB AP Eventually consistent reads by default
Redis CP Primary only accepts writes
CockroachDB CP Strongly consistent, Raft consensus

PACELC Extension

CAP only describes behavior during a partition. PACELC extends this:

When there IS a Partition: choose Availability or Consistency Else (normal operation): choose Latency or Consistency

System P: A/C E: L/C
Cassandra A L (fast, eventual consistency)
PostgreSQL C C (consistent, higher latency)
DynamoDB A L (fast by default, optional consistency)
MongoDB C C (consistent reads from primary)

Pro tip: In interviews, never say "I choose consistency" or "I choose availability" globally. Different parts of the system need different trade-offs. User accounts need CP; a social media feed can be AP.

💻 Code Example

codeTap to expand ⛶
1// ============================================
2// CAP Theorem — Practical Demonstration
3// ============================================
4
5// ---------- CP System (Consistency over Availability) ----------
6class CPDatabase {
7 constructor(nodes) {
8 this.nodes = nodes;
9 this.quorum = Math.floor(nodes.length / 2) + 1;
10 }
11
12 async write(key, value) {
13 let acks = 0;
14 for (const node of this.nodes) {
15 try {
16 await node.write(key, value);
17 acks++;
18 } catch (e) {
19 console.log(`Node \${node.id} unreachable`);
20 }
21 }
22
23 if (acks >= this.quorum) {
24 return { success: true, acks };
25 }
26 // CP: Reject write if quorum not met
27 throw new Error(`Write failed: only \${acks}/\${this.quorum} acks (UNAVAILABLE)`);
28 }
29
30 async read(key) {
31 let responses = [];
32 for (const node of this.nodes) {
33 try {
34 const value = await node.read(key);
35 responses.push(value);
36 } catch (e) { /* Node unreachable */ }
37 }
38
39 if (responses.length >= this.quorum) {
40 // Return the most recent value
41 return responses.sort((a, b) => b.version - a.version)[0];
42 }
43 throw new Error('Read failed: quorum not met (UNAVAILABLE)');
44 }
45}
46
47// ---------- AP System (Availability over Consistency) ----------
48class APDatabase {
49 constructor(nodes) {
50 this.nodes = nodes;
51 }
52
53 async write(key, value) {
54 // AP: Write to ANY available node, even during partition
55 for (const node of this.nodes) {
56 try {
57 await node.write(key, value);
58 // Background: async replicate to other nodes
59 this.asyncReplicate(node, key, value);
60 return { success: true, node: node.id };
61 } catch (e) { continue; }
62 }
63 throw new Error('All nodes down');
64 }
65
66 async read(key) {
67 // AP: Read from ANY available node (might be stale!)
68 for (const node of this.nodes) {
69 try {
70 return await node.read(key);
71 } catch (e) { continue; }
72 }
73 throw new Error('All nodes down');
74 }
75
76 async asyncReplicate(sourceNode, key, value) {
77 for (const node of this.nodes) {
78 if (node.id !== sourceNode.id) {
79 try { await node.write(key, value); } catch (e) { /* Retry later */ }
80 }
81 }
82 }
83}
84
85// ---------- Per-Component Consistency Decisions ----------
86const systemDesignDecisions = {
87 'User Authentication': {
88 consistency: 'CP (Strong)',
89 reason: 'Cannot serve stale auth data — security risk',
90 database: 'PostgreSQL with synchronous replication',
91 },
92 'Social Feed': {
93 consistency: 'AP (Eventual)',
94 reason: 'Stale feed for a few seconds is acceptable',
95 database: 'Cassandra with async replication',
96 },
97 'Shopping Cart': {
98 consistency: 'AP with conflict resolution',
99 reason: 'Must be available during sales events, merge conflicts later',
100 database: 'DynamoDB with last-writer-wins',
101 },
102 'Inventory Count': {
103 consistency: 'CP for checkout, AP for display',
104 reason: 'Display can be stale, but checkout must prevent overselling',
105 database: 'PostgreSQL for checkout, Redis cache for display',
106 },
107 'Chat Messages': {
108 consistency: 'AP with causal ordering',
109 reason: 'Messages must be available even during partial outages',
110 database: 'Cassandra with client-side ordering',
111 },
112};
113
114console.log('System design consistency decisions:');
115Object.entries(systemDesignDecisions).forEach(([component, config]) => {
116 console.log(`\n\${component}: \${config.consistency}`);
117 console.log(` Reason: \${config.reason}`);
118});

🏋️ Practice Exercise

  1. CAP Classification: For each database, classify as CP or AP and explain with a partition scenario: PostgreSQL, Cassandra, MongoDB (default config), Redis Cluster, DynamoDB.

  2. Per-Component Consistency: For an e-commerce platform, decide CP or AP for each: user accounts, product catalog, search results, order processing, inventory, reviews, recommendations.

  3. Partition Scenario: Two datacenters lose network connectivity. Users in DC-A update their profile while users in DC-B read the same profile. Walk through what happens for both a CP and AP system.

  4. Consistency vs Business: A bank allows "available balance" (AP) to differ slightly from "actual balance" (CP). When does each display? Why is this acceptable?

⚠️ Common Mistakes

  • Treating CAP as a permanent, system-wide choice — you don't pick CP or AP for the entire system. Different data has different consistency needs. Choose per component.

  • Forgetting that partitions WILL happen — in a distributed system, network issues are inevitable. Don't design as if all nodes are always reachable.

  • Confusing consistency models — CAP 'Consistency' means linearizability (every read sees the latest write). This is different from ACID 'Consistency' (database invariants are maintained).

  • Thinking AP means 'no consistency at all' — AP systems are eventually consistent. Writes propagate, just not instantly. Most data becomes consistent within seconds.

💼 Interview Questions

🎤 Mock Interview

Practice a live interview for CAP Theorem