Consistent Hashing

0/3 in this phase0/45 across the roadmap

📖 Concept

Consistent hashing solves the resharding problem in distributed systems. When you add or remove servers, only a small fraction of keys need to be remapped — unlike traditional hashing where adding one server reshuffles almost everything.

The Problem with Simple Hashing

server = hash(key) % num_servers When num_servers changes from 4 to 5, approximately 80% of keys map to different servers.

How Consistent Hashing Works

  1. Arrange servers on a virtual hash ring (0 to 2^32)
  2. Hash each server name to find its position on the ring
  3. Hash each key to find its position on the ring
  4. Walk clockwise from the key's position to find the first server — that's where the key is stored

Adding/Removing Servers

  • Add server: Only keys between the new server and its predecessor need to move (~1/N keys)
  • Remove server: Only keys on the removed server move to the next server clockwise

Virtual Nodes

Problem: With few servers, distribution is uneven. Solution: Each physical server gets multiple positions (virtual nodes) on the ring, ensuring even distribution.

Used by: Cassandra, DynamoDB, Memcached, Nginx, CDN edge routing

Interview tip: Mention consistent hashing whenever you discuss distributed caching, database sharding, or any system where data is distributed across multiple nodes.

💻 Code Example

codeTap to expand ⛶
1// ============================================
2// Consistent Hashing — Implementation
3// ============================================
4
5class ConsistentHash {
6 constructor(virtualNodes = 150) {
7 this.virtualNodes = virtualNodes;
8 this.ring = new Map(); // hash position → node
9 this.sortedKeys = []; // sorted hash positions
10 this.nodes = new Set();
11 }
12
13 addNode(node) {
14 this.nodes.add(node);
15 for (let i = 0; i < this.virtualNodes; i++) {
16 const hash = this.hash(`\${node}:vn\${i}`);
17 this.ring.set(hash, node);
18 this.sortedKeys.push(hash);
19 }
20 this.sortedKeys.sort((a, b) => a - b);
21 }
22
23 removeNode(node) {
24 this.nodes.delete(node);
25 for (let i = 0; i < this.virtualNodes; i++) {
26 const hash = this.hash(`\${node}:vn\${i}`);
27 this.ring.delete(hash);
28 this.sortedKeys = this.sortedKeys.filter(k => k !== hash);
29 }
30 }
31
32 getNode(key) {
33 if (this.sortedKeys.length === 0) return null;
34 const hash = this.hash(key);
35 // Find first position on ring >= hash (clockwise)
36 for (const pos of this.sortedKeys) {
37 if (pos >= hash) return this.ring.get(pos);
38 }
39 return this.ring.get(this.sortedKeys[0]); // Wrap around
40 }
41
42 hash(key) {
43 let hash = 0;
44 for (let i = 0; i < key.length; i++) {
45 hash = ((hash << 5) - hash + key.charCodeAt(i)) >>> 0;
46 }
47 return hash;
48 }
49}
50
51// Demo: Add/remove node only affects ~1/N keys
52const ch = new ConsistentHash(100);
53ch.addNode('server-1');
54ch.addNode('server-2');
55ch.addNode('server-3');
56
57const keys = Array.from({length: 100}, (_, i) => `key_\${i}`);
58const before = keys.map(k => ({ key: k, node: ch.getNode(k) }));
59
60ch.addNode('server-4');
61const after = keys.map(k => ({ key: k, node: ch.getNode(k) }));
62
63const moved = keys.filter((k, i) => before[i].node !== after[i].node);
64console.log(`Keys moved after adding server-4: \${moved.length}/\${keys.length}`);
65// ~25% moved (1/4), not 75% like with mod hashing!

🏋️ Practice Exercise

  1. Consistent Hash Ring: Implement a consistent hash ring with virtual nodes. Test that adding a server only moves ~1/N keys.

  2. Replication: Extend consistent hashing to replicate each key to the next N clockwise servers for fault tolerance.

  3. Hot Spot Handling: A celebrity user has 1000x more traffic. How do you handle this with consistent hashing? Design a "hot key" detection and redistribution mechanism.

  4. Comparison: Compare consistent hashing, range-based sharding, and directory-based sharding for a distributed cache with 100 servers.

⚠️ Common Mistakes

  • Too few virtual nodes — with 1 virtual node per server, distribution is extremely uneven. Use 100-200 virtual nodes per server for good distribution.

  • Not replicating across the ring — storing data on just one node means a node failure loses data. Replicate to the next 2-3 clockwise nodes.

  • Using consistent hashing when simple modulo works — if your number of servers never changes, consistent hashing adds unnecessary complexity.

💼 Interview Questions

🎤 Mock Interview

Practice a live interview for Consistent Hashing