Database Replication


📖 Concept

Database replication is the practice of maintaining copies of a database on multiple servers. It serves two purposes: high availability (if one server dies, others take over) and read scalability (distribute reads across multiple servers).

Replication Topologies

1. Single Leader (Primary-Replica)

One server (primary) handles all writes. Changes are replicated to one or more replicas (read-only).

Client Writes → [Primary] → replicates → [Replica 1] ← Client Reads
                                        → [Replica 2] ← Client Reads
                                        → [Replica 3] ← Client Reads

Pros: Simple, consistent writes, easy to understand
Cons: Write bottleneck (single primary), failover complexity

2. Multi-Leader

Multiple servers can accept writes. Changes are replicated between all leaders.

Pros: Write scalability, multi-region support
Cons: Conflict resolution is HARD (two leaders update the same row differently)

3. Leaderless (Dynamo-style)

No leader — any node can accept reads and writes. Uses quorum for consistency.

Pros: High availability, no single point of failure
Cons: Complex conflict resolution, eventual consistency
Used by: Cassandra, DynamoDB, Riak
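
The quorum condition that makes leaderless reads safe is w + r > n: every read set must overlap every write set in at least one node. A minimal sketch of the arithmetic for a hypothetical 5-node cluster:

```javascript
// Majority quorum: with n nodes, floor(n/2) + 1 acknowledgments
// guarantees any two quorums share at least one node.
function majorityQuorum(n) {
  return Math.floor(n / 2) + 1;
}

const n = 5;
const w = majorityQuorum(n); // 3 — a write must be acknowledged by 3 nodes
const r = majorityQuorum(n); // 3 — a read must contact 3 nodes
console.log(w + r > n);      // true → some node in any read set saw the latest write
```

Smaller r (faster reads) forces a larger w, and vice versa; w + r ≤ n gives you eventual consistency only.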

Replication Methods

| Method | How It Works | Lag | Consistency |
|---|---|---|---|
| Synchronous | Primary waits for ALL replicas to confirm | None | Strong |
| Asynchronous | Primary doesn't wait for replicas | Milliseconds to seconds | Eventual |
| Semi-synchronous | Primary waits for ONE replica, rest are async | Minimal | Near-strong |
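
The three acknowledgment strategies can be sketched in a few lines. This is a simplified model, assuming hypothetical replica objects whose async `apply(change)` resolves once the change is durable on that replica:

```javascript
// Synchronous: wait for ALL replicas → zero lag, but the slowest
// replica gates every write.
async function syncWrite(replicas, change) {
  await Promise.all(replicas.map(r => r.apply(change)));
}

// Asynchronous: don't wait at all → fast writes, but replicas lag
// behind and recent writes can be lost if the primary crashes.
function asyncWrite(replicas, change) {
  replicas.forEach(r => r.apply(change).catch(() => {})); // fire and forget
}

// Semi-synchronous: wait for the FIRST replica to confirm; the rest
// catch up asynchronously. (Promise.any resolves on first success.)
async function semiSyncWrite(replicas, change) {
  await Promise.any(replicas.map(r => r.apply(change)));
}
```

Semi-sync guarantees at least one replica has the write before the client is acknowledged, which bounds data loss during failover.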

Replication Lag Problem

With asynchronous replication, there is a delay between a write committing on the primary and that write becoming visible on the replicas. This causes:

  • Read-your-own-write inconsistency: User updates profile → reads from replica → sees old data
  • Causal inconsistency: User B sees User A's comment but not the post it refers to

Solutions:

  1. Read-after-write consistency: Route reads to primary for recently-written data
  2. Monotonic reads: Route a user to the same replica consistently (sticky routing)
  3. Causal consistency: Track dependencies and ensure replicas apply changes in order

Interview tip: When you mention read replicas, always acknowledge replication lag and how you'd handle it for your specific use case.
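
Monotonic reads (solution 2 above) can be sketched with simple sticky routing: hash the user ID to a fixed replica index so a given user never hops between replicas with different lag and sees time "go backwards". The hash function here is a hypothetical illustration, not a production choice:

```javascript
// Sticky routing: the same user always maps to the same replica,
// so their successive reads never regress to a more-lagged replica.
function pickReplica(userId, replicas) {
  let hash = 0;
  for (const ch of String(userId)) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit string hash
  }
  return replicas[hash % replicas.length];
}
```

Note the trade-off: if that replica dies or the replica set changes, the user may still observe a consistency jump; consistent hashing reduces how many users are remapped.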

💻 Code Example

```javascript
// ============================================
// Database Replication — Patterns & Pitfalls
// ============================================

// ---------- Read-Write Splitting ----------

class ReplicatedDatabase {
  constructor(primary, replicas) {
    this.primary = primary;   // Write + strong-consistency reads
    this.replicas = replicas; // Read-only, eventually consistent
    this.currentReplicaIndex = 0;
  }

  // ✅ Route writes to primary
  async write(query, params) {
    console.log(`[WRITE] → Primary: ${this.primary.host}`);
    return await this.primary.pool.query(query, params);
  }

  // ✅ Route reads to replicas (round-robin)
  async read(query, params) {
    const replica = this.getNextReplica();
    console.log(`[READ] → Replica: ${replica.host}`);
    return await replica.pool.query(query, params);
  }

  // ✅ Strong read — when you MUST read your own write
  async readStrong(query, params) {
    console.log(`[STRONG READ] → Primary: ${this.primary.host}`);
    return await this.primary.pool.query(query, params);
  }

  getNextReplica() {
    const replica = this.replicas[this.currentReplicaIndex];
    this.currentReplicaIndex = (this.currentReplicaIndex + 1) % this.replicas.length;
    return replica;
  }
}

// ---------- Read-After-Write Consistency ----------

class ConsistentReadRouter {
  constructor(db) {
    this.db = db;
    // Track recent writes per user (use Redis with a TTL in production)
    this.recentWrites = new Map(); // userId → timestamp
  }

  async handleWrite(userId, query, params) {
    const result = await this.db.write(query, params);
    // Record that this user just wrote data
    this.recentWrites.set(userId, Date.now());
    return result;
  }

  async handleRead(userId, query, params) {
    const lastWrite = this.recentWrites.get(userId);
    const replicationLagMs = 2000; // Assume 2 seconds max replication lag

    if (lastWrite && Date.now() - lastWrite < replicationLagMs) {
      // User wrote recently — read from primary to see their own write
      console.log(`[READ] User ${userId} wrote recently → routing to PRIMARY`);
      return await this.db.readStrong(query, params);
    }

    // No recent write — safe to read from a replica
    console.log(`[READ] User ${userId} no recent writes → routing to REPLICA`);
    return await this.db.read(query, params);
  }
}

// ---------- Failover Manager ----------

class FailoverManager {
  constructor(primary, replicas) {
    this.primary = primary;
    this.replicas = replicas;
    this.healthCheckInterval = 5000; // Check every 5 seconds
    this.failoverThreshold = 3;      // 3 missed heartbeats → failover
    this.missedHeartbeats = 0;
  }

  startHealthChecks() {
    setInterval(async () => {
      try {
        await this.primary.pool.query('SELECT 1'); // Heartbeat
        this.missedHeartbeats = 0;
      } catch (error) {
        this.missedHeartbeats++;
        console.warn(
          `⚠️ Primary heartbeat missed (${this.missedHeartbeats}/${this.failoverThreshold})`
        );

        if (this.missedHeartbeats >= this.failoverThreshold) {
          await this.performFailover();
        }
      }
    }, this.healthCheckInterval);
  }

  async performFailover() {
    console.log('🚨 PRIMARY DOWN — Initiating failover...');

    // 1. Select the most up-to-date replica
    let bestReplica = null;
    let minLag = Infinity;

    for (const replica of this.replicas) {
      try {
        const lag = await this.getReplicationLag(replica);
        if (lag < minLag) {
          minLag = lag;
          bestReplica = replica;
        }
      } catch (e) {
        console.log(`Replica ${replica.host} is also down`);
      }
    }

    if (!bestReplica) {
      console.error('🚨 ALL REPLICAS DOWN — FULL OUTAGE');
      return;
    }

    // 2. Promote the best replica to primary (PostgreSQL 12+)
    console.log(`✅ Promoting ${bestReplica.host} to PRIMARY (lag: ${minLag}ms)`);
    await bestReplica.pool.query('SELECT pg_promote()');

    // 3. Update routing. In production, also fence the old primary so it
    //    can't come back and accept writes ("split brain").
    this.primary = bestReplica;
    this.replicas = this.replicas.filter(r => r !== bestReplica);

    // 4. Notify other replicas to follow the new primary
    for (const replica of this.replicas) {
      console.log(`Reconfiguring ${replica.host} to follow new primary`);
      // In production: update the replica's replication settings and restart streaming
    }

    console.log('✅ Failover complete');
  }

  async getReplicationLag(replica) {
    // PostgreSQL: ask the replica how far its replay is behind.
    // (pg_stat_replication only exists on the primary, which is down here.)
    const result = await replica.pool.query(
      `SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) * 1000 AS lag_ms`
    );
    return result.rows[0]?.lag_ms || 0;
  }
}

// ---------- Quorum Reads/Writes (Leaderless) ----------

class QuorumDatabase {
  constructor(nodes) {
    this.nodes = nodes;                            // All database nodes
    this.n = nodes.length;                         // Total nodes
    this.writeQuorum = Math.floor(this.n / 2) + 1; // Write majority
    this.readQuorum = Math.floor(this.n / 2) + 1;  // Read majority
    // w + r > n ensures overlap → at least one node has the latest data
  }

  async quorumWrite(key, value) {
    const writePromises = this.nodes.map(node =>
      node.write(key, value).catch(err => ({ error: err }))
    );

    const results = await Promise.all(writePromises);
    // Optional chaining: a successful write may resolve to undefined
    const successes = results.filter(r => !r?.error);

    if (successes.length >= this.writeQuorum) {
      console.log(`✅ Write succeeded (${successes.length}/${this.n} nodes)`);
      return true;
    } else {
      console.log(`❌ Write failed (${successes.length} acks, ${this.writeQuorum} required)`);
      return false;
    }
  }

  async quorumRead(key) {
    const readPromises = this.nodes.map(node =>
      node.read(key).catch(err => ({ error: err }))
    );

    const results = await Promise.all(readPromises);
    const successes = results.filter(r => !r?.error);

    if (successes.length >= this.readQuorum) {
      // Return the value with the highest version/timestamp
      const latest = successes.reduce((a, b) =>
        a.version > b.version ? a : b
      );
      return latest.value;
    }
    throw new Error('Read quorum not met');
  }
}

console.log("Replication patterns demonstrated.");
```

🏋️ Practice Exercise

  1. Read Replica Design: Your PostgreSQL primary handles 5K writes/sec and 50K reads/sec. Design a read replica architecture that distributes the read load. How many replicas do you need? How do you handle replication lag?

  2. Failover Scenario: Your primary database goes down. Walk through the exact steps of a manual failover, then design an automatic failover system. What data could be lost during failover?

  3. Multi-Region Replication: Design a replication strategy for a global application with users in US, EU, and Asia. How do you handle: (a) read latency, (b) write conflicts, (c) data residency (GDPR)?

  4. Quorum Calculator: For a 5-node leaderless system, calculate the minimum W (write) and R (read) quorum sizes for: (a) strong consistency, (b) eventual consistency, (c) if one node is permanently down.

  5. Replication Lag Measurement: Design a monitoring system that detects when replication lag exceeds 5 seconds and automatically routes reads to the primary until lag recovers.

⚠️ Common Mistakes

  • Assuming replicas are always consistent with the primary — asynchronous replication has lag. Reading from a replica immediately after writing to the primary may return stale data.

  • Not planning for failover — when the primary dies, promoting a replica isn't automatic. You need automated failover with proper health checks, or you risk extended downtime.

  • Using replicas without monitoring replication lag — if lag grows to minutes or hours (due to long-running queries or heavy writes), replicas return severely outdated data. Always monitor and alert on lag.

  • Forgetting that writes during failover may be lost — asynchronous replicas may not have received the last few writes before the primary crashed. This data is gone. For zero data loss, use synchronous replication (at the cost of write latency).
