Database Sharding & Partitioning


📖 Concept

When a single database server can't handle your data volume or traffic, you need to split the data across multiple servers. This is called sharding (horizontal partitioning). It's one of the most important — and most complex — scaling strategies in system design.

Partitioning vs Sharding

  • Partitioning: single server. Split one table into multiple partitions on the same server.
  • Sharding: multiple servers. Each partition lives on a different server (distributed partitioning).

Sharding Strategies

1. Range-Based Sharding

Data is split by ranges of a key value:

  • Shard 1: users with IDs 1 – 1,000,000
  • Shard 2: users with IDs 1,000,001 – 2,000,000
  • Shard 3: users with IDs 2,000,001 – 3,000,000

Pros: simple; supports range queries.
Cons: hotspots (new users always land on the last shard).

2. Hash-Based Sharding

Apply a hash function to the shard key: shard = hash(user_id) % num_shards

Pros: even distribution; no hotspots.
Cons: range queries become scatter-gather (must query all shards); resharding is painful.

3. Directory-Based Sharding

A lookup service maps each key to its shard:

  • user_123 → Shard A
  • user_456 → Shard B
  • user_789 → Shard A

Pros: maximum flexibility; easy to rebalance.
Cons: the lookup service becomes a bottleneck and a single point of failure.
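The directory approach above can be sketched in a few lines (hypothetical shard names; in production the directory would live in a replicated store such as ZooKeeper or a small SQL table, cached aggressively to avoid the lookup bottleneck):

```javascript
// Minimal sketch of a directory-based shard router.
class DirectoryShardRouter {
  constructor() {
    this.directory = new Map(); // key → shard id
    this.shards = ['shard-A', 'shard-B'];
    this.nextShard = 0;
  }

  // New keys are assigned round-robin; known keys use their recorded mapping.
  getShard(key) {
    if (!this.directory.has(key)) {
      this.directory.set(key, this.shards[this.nextShard]);
      this.nextShard = (this.nextShard + 1) % this.shards.length;
    }
    return this.directory.get(key);
  }

  // Rebalancing is just a directory update — no hash math, no mass key movement.
  moveKey(key, targetShard) {
    this.directory.set(key, targetShard); // caller migrates the row first, then flips the entry
  }
}

const dir = new DirectoryShardRouter();
console.log(dir.getShard('user_123')); // → shard-A (first assignment)
dir.moveKey('user_123', 'shard-B');
console.log(dir.getShard('user_123')); // → shard-B
```

Note how rebalancing a single key is trivial here — exactly the flexibility the pros list describes, paid for with the extra lookup hop.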

4. Geographic Sharding

Data is split by geographic region:

  • US users → US datacenter
  • EU users → EU datacenter
  • Asia users → Asia datacenter

Pros: data locality (low latency); compliance (e.g. GDPR data residency).
Cons: cross-region queries are slow; complex for users who travel or relocate.
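At its core, geographic routing is a static lookup from region to datacenter. A minimal sketch (hypothetical hostnames; real systems derive the region from the user's profile or a GeoIP lookup at signup):

```javascript
// Region → database host lookup with a fallback bucket.
const REGION_SHARDS = {
  us: 'db-us-east.example.com',
  eu: 'db-eu-west.example.com',
  asia: 'db-ap-southeast.example.com',
};

function getRegionShard(userRegion) {
  // Users outside any known bucket fall back to a default region.
  return REGION_SHARDS[userRegion] ?? REGION_SHARDS.us;
}

console.log(getRegionShard('eu'));      // → db-eu-west.example.com
console.log(getRegionShard('unknown')); // → db-us-east.example.com
```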

Choosing a Shard Key

The shard key is the most important decision in sharding. A bad shard key creates hotspots, cross-shard queries, and data imbalances.

Good shard key properties:

  1. High cardinality — many distinct values (user_id ✅, boolean ❌)
  2. Even distribution — no value dominates (user_id ✅, country ❌)
  3. Query alignment — most queries include the shard key in WHERE clause
  4. Immutable — the shard key shouldn't change (user_id ✅, email ❌)

The Resharding Problem

When you need to add more shards (e.g., going from 4 to 8), hash(key) % num_shards produces different results — data must be redistributed. Consistent hashing solves this by minimizing data movement (covered in Phase 10).
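To make the contrast concrete, here is a minimal consistent-hash ring sketch (assumptions: FNV-1a as the hash, 100 virtual nodes per shard; Phase 10 covers the real thing). Each key maps to the first ring position clockwise from its hash, so adding shard N+1 only claims the slices just before its own positions — roughly 1/(N+1) of the keys move, instead of most of them:

```javascript
// 32-bit FNV-1a hash (Math.imul keeps the multiply in 32-bit range).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

class ConsistentHashRing {
  constructor(vnodes = 100) {
    this.vnodes = vnodes;
    this.ring = []; // sorted [position, shard] pairs
  }

  addShard(shard) {
    // Each shard occupies many positions ("virtual nodes") for even spread.
    for (let v = 0; v < this.vnodes; v++) {
      this.ring.push([fnv1a(`${shard}#${v}`), shard]);
    }
    this.ring.sort((a, b) => a[0] - b[0]);
  }

  getShard(key) {
    const h = fnv1a(String(key));
    // First ring position clockwise from the key's hash; wrap past the end.
    const entry = this.ring.find(([pos]) => pos >= h) ?? this.ring[0];
    return entry[1];
  }
}

const ring = new ConsistentHashRing();
['shard-0', 'shard-1', 'shard-2', 'shard-3'].forEach(s => ring.addShard(s));
console.log(ring.getShard('user_123')); // one of the four shards
ring.addShard('shard-4'); // only ~1/5 of keys change shards
```

Contrast with modulo hashing, where going from 4 to 5 shards moves roughly 80% of keys.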

Interview tip: Sharding is a tool of last resort. Always try vertical scaling, read replicas, and caching first. Sharding adds enormous operational complexity.

💻 Code Example

// ============================================
// Database Sharding — Implementation Patterns
// ============================================

// ---------- Hash-Based Sharding ----------

class HashShardRouter {
  constructor(shardCount) {
    this.shardCount = shardCount;
    this.shards = Array.from({ length: shardCount }, (_, i) => ({
      id: i,
      host: `db-shard-${i}.example.com`,
      connectionPool: null, // In production: pg.Pool
    }));
  }

  // Simple modulo hashing
  getShard(key) {
    const hash = this.hashFunction(key);
    const shardIndex = hash % this.shardCount;
    return this.shards[shardIndex];
  }

  // FNV-1a hash for even distribution
  hashFunction(key) {
    let hash = 0x811c9dc5; // FNV offset basis
    const keyStr = String(key);
    for (let i = 0; i < keyStr.length; i++) {
      hash ^= keyStr.charCodeAt(i);
      hash = Math.imul(hash, 0x01000193) >>> 0; // FNV prime; Math.imul avoids 53-bit overflow
    }
    return hash;
  }

  // ❌ Problem: adding a shard redistributes most keys.
  // A key stays put only when hash % oldCount === hash % newCount,
  // i.e. for roughly oldCount / lcm(oldCount, newCount) of all keys.
  addShard() {
    const newShardCount = this.shardCount + 1;
    const gcd = (a, b) => (b === 0 ? a : gcd(b, a % b));
    const lcm = (this.shardCount * newShardCount) / gcd(this.shardCount, newShardCount);
    const movedPct = Math.round(100 * (1 - this.shardCount / lcm));
    console.log(`⚠️ Resharding from ${this.shardCount} to ${newShardCount} shards`);
    console.log(`   ~${movedPct}% of keys will move!`);
    // Going from 4 → 5 shards: ~80% of keys move
    // Going from 4 → 8 shards: ~50% of keys move
    this.shardCount = newShardCount;
  }
}
49
// ---------- Range-Based Sharding ----------

class RangeShardRouter {
  constructor() {
    this.ranges = [
      { min: 0, max: 1_000_000, shard: 'shard-0' },
      { min: 1_000_001, max: 2_000_000, shard: 'shard-1' },
      { min: 2_000_001, max: 3_000_000, shard: 'shard-2' },
      { min: 3_000_001, max: Infinity, shard: 'shard-3' },
    ];
  }

  getShard(userId) {
    const range = this.ranges.find(r => userId >= r.min && userId <= r.max);
    return range ? range.shard : 'shard-default';
  }

  // ❌ Problem: hotspot on the last shard (new users always go there).
  // Shard-3 might see 10x more writes than Shard-0!
}
70
// ---------- Sharded Query Router ----------

class ShardedDatabase {
  constructor(shardCount) {
    this.router = new HashShardRouter(shardCount);
  }

  // ✅ Query by shard key — goes to ONE shard
  async getUserById(userId) {
    const shard = this.router.getShard(userId);
    console.log(`Query user ${userId} → ${shard.host}`);
    return await shard.connectionPool.query(
      'SELECT * FROM users WHERE id = $1', [userId]
    );
  }

  // ❌ Query without shard key — must query ALL shards (scatter-gather)
  async getUserByEmail(email) {
    console.log(`⚠️ Scatter-gather query for email: ${email}`);
    const promises = this.router.shards.map(shard =>
      shard.connectionPool.query('SELECT * FROM users WHERE email = $1', [email])
    );
    const results = await Promise.all(promises);
    // Merge results from all shards
    return results.flat().find(user => user != null);
  }

  // Cross-shard JOIN — the nightmare scenario
  async getUserWithOrders(userId) {
    // If users and orders are sharded differently, you need:
    // 1. Query user from users shard
    // 2. Query orders from orders shard(s)
    // 3. Join in application code
    // This is why you should co-locate related data!

    const userShard = this.router.getShard(userId);
    const user = await userShard.connectionPool.query(
      'SELECT * FROM users WHERE id = $1', [userId]
    );

    // If orders are sharded by user_id too, this goes to one shard.
    // If orders are sharded differently... scatter-gather :(
    const orderShard = this.router.getShard(userId); // Same shard key!
    const orders = await orderShard.connectionPool.query(
      'SELECT * FROM orders WHERE user_id = $1', [userId]
    );

    return { ...user, orders };
  }
}
121
// ---------- Database Partitioning (Single Server) ----------

const partitioningExamples = `
  -- Range partitioning by date (great for time-series)
  CREATE TABLE events (
    id SERIAL,
    user_id INTEGER,
    event_type TEXT,
    created_at TIMESTAMP
  ) PARTITION BY RANGE (created_at);

  -- One partition per month
  CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
  CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

  -- Benefits:
  -- ✅ Query "events from January" only scans one partition
  -- ✅ Drop old data by detaching a partition (instant, no DELETE)
  -- ✅ Each partition can have its own indexes

  -- List partitioning by region
  CREATE TABLE users (
    id SERIAL,
    name TEXT,
    region TEXT
  ) PARTITION BY LIST (region);

  CREATE TABLE users_us PARTITION OF users FOR VALUES IN ('us-east', 'us-west');
  CREATE TABLE users_eu PARTITION OF users FOR VALUES IN ('eu-west', 'eu-central');
`;
154
// Demo
const router = new HashShardRouter(4);
console.log('User 123 →', router.getShard(123));
console.log('User 456 →', router.getShard(456));
console.log('User 789 →', router.getShard(789));

const rangeRouter = new RangeShardRouter();
console.log('User 500000 →', rangeRouter.getShard(500000));
console.log('User 1500000 →', rangeRouter.getShard(1500000));

🏋️ Practice Exercise

  1. Shard Key Selection: For each system, recommend a shard key and sharding strategy, explaining trade-offs: (a) E-commerce orders, (b) Chat messages, (c) Global user accounts, (d) Social media posts/feed.

  2. Hotspot Simulation: With hash-based sharding (4 shards), simulate what happens when 70% of traffic goes to 5% of users (celebrity effect). How would you handle this hotspot?

  3. Cross-Shard Query Design: You've sharded users by user_id and orders by user_id. Now the product team wants "top 10 highest spending users this month." Design the query strategy across shards.

  4. Resharding Plan: You need to go from 8 shards to 16 shards with zero downtime. Design a step-by-step migration plan including: data migration, dual-writing, verification, and cutover.

  5. Partition Strategy: Design the partitioning strategy for a logging table with 50M new rows per day and a 90-day retention policy. How do you efficiently purge old data?

⚠️ Common Mistakes

  • Sharding too early — sharding adds enormous complexity (cross-shard queries, distributed transactions, operational overhead). Start with vertical scaling, read replicas, and caching. Only shard when you've exhausted other options.

  • Choosing a bad shard key — sharding by a low-cardinality key (like country) creates massive hotspots. Sharding by a key not used in queries forces expensive scatter-gather queries.

  • Ignoring cross-shard queries — if your most common query requires data from all shards, sharding may actually make performance worse. Your shard key must align with your most frequent query patterns.

  • Not co-locating related data — if users and their orders are on different shards, every 'get user with orders' query becomes a distributed join. Shard related tables by the same key.
