Database Sharding & Partitioning
📖 Concept
When a single database server can't handle your data volume or traffic, you need to split the data across multiple servers. This is called sharding (horizontal partitioning). It's one of the most important — and most complex — scaling strategies in system design.
Partitioning vs Sharding
| Term | Scope | Description |
|---|---|---|
| Partitioning | Single server | Split one table into multiple partitions on the same server |
| Sharding | Multiple servers | Each partition lives on a different server (distributed partitioning) |
Sharding Strategies
1. Range-Based Sharding
Data is split by ranges of a key value:
- Shard 1: users with IDs 1 – 1,000,000
- Shard 2: users with IDs 1,000,001 – 2,000,000
- Shard 3: users with IDs 2,000,001 – 3,000,000
Pros: Simple, supports range queries
Cons: Hotspots (new users with sequential IDs always go to the last shard)
2. Hash-Based Sharding
Apply a hash function to the shard key: `shard = hash(user_id) % num_shards`
Pros: Even distribution, no hotspots
Cons: Range queries become scatter-gather (must query all shards), and resharding is painful
3. Directory-Based Sharding
A lookup service maps each key to its shard:
- user_123 → Shard A
- user_456 → Shard B
- user_789 → Shard A
Pros: Maximum flexibility, easy to rebalance
Cons: The lookup service becomes a bottleneck and a single point of failure
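The directory approach can be sketched as a plain lookup table plus an assignment policy for new keys. This is a minimal illustration (class and shard names are hypothetical; a real system would keep the directory in a replicated store such as etcd or ZooKeeper rather than in memory):

```javascript
// Minimal directory-based shard router (illustrative sketch).
class DirectoryShardRouter {
  constructor(shardIds) {
    this.shardIds = shardIds;   // e.g. ['shard-a', 'shard-b']
    this.directory = new Map(); // key → shard id
    this.nextIndex = 0;         // round-robin counter for new keys
  }

  getShard(key) {
    if (!this.directory.has(key)) {
      // Assign unseen keys round-robin; any placement policy works here
      const shard = this.shardIds[this.nextIndex % this.shardIds.length];
      this.nextIndex++;
      this.directory.set(key, shard);
    }
    return this.directory.get(key);
  }

  // Rebalancing is just an update to one directory entry —
  // no rehashing of other keys (data migration happens out of band)
  moveKey(key, targetShard) {
    this.directory.set(key, targetShard);
  }
}

const dir = new DirectoryShardRouter(['shard-a', 'shard-b']);
console.log(dir.getShard('user_123')); // 'shard-a'
console.log(dir.getShard('user_456')); // 'shard-b'
dir.moveKey('user_123', 'shard-b');    // rebalance a single key
console.log(dir.getShard('user_123')); // 'shard-b'
```

Note how `moveKey` makes rebalancing trivial compared to hash-based sharding; the price is that every query first hits the directory.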
4. Geographic Sharding
Data is split by geographic region:
- US users → US datacenter
- EU users → EU datacenter
- Asia users → Asia datacenter
Pros: Data locality (low latency), easier compliance (e.g., GDPR data residency)
Cons: Cross-region queries are slow, and users who move between regions are complex to handle
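At its core this is a static mapping from region to datacenter. A tiny sketch (region codes and hostnames are made up for illustration):

```javascript
// Minimal geographic router (hypothetical regions and hosts).
const REGION_TO_DATACENTER = {
  US: 'us-east-1.db.example.com',
  EU: 'eu-west-1.db.example.com',
  ASIA: 'ap-southeast-1.db.example.com',
};

function getDatacenter(userRegion) {
  // Unmapped regions fall back to a default datacenter
  return REGION_TO_DATACENTER[userRegion] ?? REGION_TO_DATACENTER.US;
}

console.log(getDatacenter('EU')); // 'eu-west-1.db.example.com'
console.log(getDatacenter('XX')); // falls back to the US datacenter
```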
Choosing a Shard Key
The shard key is the most important decision in sharding. A bad shard key creates hotspots, cross-shard queries, and data imbalances.
Good shard key properties:
- High cardinality — many distinct values (user_id ✅, boolean ❌)
- Even distribution — no value dominates (user_id ✅, country ❌)
- Query alignment — most queries include the shard key in their WHERE clause
- Immutable — the shard key shouldn't change (user_id ✅, email ❌)
The Resharding Problem
When you need to add more shards (e.g., going from 4 to 8), hash(key) % num_shards produces a different result for a large fraction of keys, so most of the data must be physically redistributed. Consistent hashing solves this by minimizing data movement (covered in Phase 10).
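The scale of the problem is easy to demonstrate: count how many keys land on a different shard after changing the shard count under plain modulo hashing. A small sketch, treating the numeric key itself as its hash for simplicity:

```javascript
// Fraction of keys that change shards when resharding with modulo hashing.
function movedFraction(numKeys, oldShards, newShards) {
  let moved = 0;
  for (let key = 0; key < numKeys; key++) {
    // Using the key itself as the hash for simplicity
    if (key % oldShards !== key % newShards) moved++;
  }
  return moved / numKeys;
}

console.log(movedFraction(100000, 4, 5)); // 0.8 — 80% of keys move!
console.log(movedFraction(100000, 4, 8)); // 0.5 — exactly half move
// Consistent hashing would move only ~1/newShards of the keys instead.
```

Counter-intuitively, adding a single shard (4 → 5) is worse than doubling (4 → 8): a key stays put only when its hash gives the same remainder for both moduli, which for 4 → 5 happens for just 4 residues out of every 20.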
Interview tip: Sharding is a tool of last resort. Always try vertical scaling, read replicas, and caching first. Sharding adds enormous operational complexity.
💻 Code Example
```javascript
// ============================================
// Database Sharding — Implementation Patterns
// ============================================

// ---------- Hash-Based Sharding ----------

class HashShardRouter {
  constructor(shardCount) {
    this.shardCount = shardCount;
    this.shards = Array.from({ length: shardCount }, (_, i) => ({
      id: i,
      host: `db-shard-${i}.example.com`,
      connectionPool: null, // In production: pg.Pool
    }));
  }

  // Simple modulo hashing
  getShard(key) {
    const hash = this.hashFunction(key);
    const shardIndex = hash % this.shardCount;
    return this.shards[shardIndex];
  }

  // FNV-1a hash for even distribution
  hashFunction(key) {
    let hash = 0x811c9dc5; // FNV offset basis
    const keyStr = String(key);
    for (let i = 0; i < keyStr.length; i++) {
      hash ^= keyStr.charCodeAt(i);
      hash = (hash * 0x01000193) >>> 0; // FNV prime, unsigned
    }
    return hash;
  }

  // ❌ Problem: Adding a new shard redistributes most keys
  addShard() {
    const newShardCount = this.shardCount + 1;
    // Count keys that stay put: hashes h with h % old === h % new
    let stayed = 0;
    const sampleSize = 100_000;
    for (let h = 0; h < sampleSize; h++) {
      if (h % this.shardCount === h % newShardCount) stayed++;
    }
    console.log(
      `⚠️ Resharding from ${this.shardCount} to ${newShardCount} shards`
    );
    console.log(
      `   ~${Math.round(100 * (1 - stayed / sampleSize))}% of keys will move!`
    );
    // Going from 4 → 5 shards: ~80% of keys move!
    // Going from 4 → 8 shards: ~50% of keys move
    this.shardCount = newShardCount;
  }
}

// ---------- Range-Based Sharding ----------

class RangeShardRouter {
  constructor() {
    this.ranges = [
      { min: 0, max: 1_000_000, shard: 'shard-0' },
      { min: 1_000_001, max: 2_000_000, shard: 'shard-1' },
      { min: 2_000_001, max: 3_000_000, shard: 'shard-2' },
      { min: 3_000_001, max: Infinity, shard: 'shard-3' },
    ];
  }

  getShard(userId) {
    const range = this.ranges.find(r => userId >= r.min && userId <= r.max);
    return range ? range.shard : 'shard-default';
  }

  // ❌ Problem: Hotspot on the last shard (new users always go there)
  // Shard-3 might have 10x more writes than Shard-0!
}

// ---------- Sharded Query Router ----------

class ShardedDatabase {
  constructor(shardCount) {
    this.router = new HashShardRouter(shardCount);
  }

  // ✅ Query by shard key — goes to ONE shard
  async getUserById(userId) {
    const shard = this.router.getShard(userId);
    console.log(`Query user ${userId} → ${shard.host}`);
    return await shard.connectionPool.query(
      'SELECT * FROM users WHERE id = $1', [userId]
    );
  }

  // ❌ Query without shard key — must query ALL shards (scatter-gather)
  async getUserByEmail(email) {
    console.log(`⚠️ Scatter-gather query for email: ${email}`);
    const promises = this.router.shards.map(shard =>
      shard.connectionPool.query('SELECT * FROM users WHERE email = $1', [email])
    );
    const results = await Promise.all(promises);
    // Merge results from all shards
    return results.flat().find(user => user !== null);
  }

  // Cross-shard JOIN — the nightmare scenario
  async getUserWithOrders(userId) {
    // If users and orders are sharded differently, you need:
    // 1. Query user from users shard
    // 2. Query orders from orders shard(s)
    // 3. Join in application code
    // This is why you should co-locate related data!

    const userShard = this.router.getShard(userId);
    const user = await userShard.connectionPool.query(
      'SELECT * FROM users WHERE id = $1', [userId]
    );

    // If orders are sharded by user_id too, this goes to one shard
    // If orders are sharded differently... scatter-gather :(
    const orderShard = this.router.getShard(userId); // Same shard key!
    const orders = await orderShard.connectionPool.query(
      'SELECT * FROM orders WHERE user_id = $1', [userId]
    );

    return { ...user, orders };
  }
}

// ---------- Database Partitioning (Single Server) ----------

const partitioningExamples = `
  -- Range partitioning by date (great for time-series)
  CREATE TABLE events (
    id SERIAL,
    user_id INTEGER,
    event_type TEXT,
    created_at TIMESTAMP
  ) PARTITION BY RANGE (created_at);

  -- One partition per month
  CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
  CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

  -- Benefits:
  -- ✅ Query "events from January" only scans one partition
  -- ✅ Drop old data by detaching a partition (instant, no DELETE)
  -- ✅ Each partition can have its own indexes

  -- List partitioning by region
  CREATE TABLE users (
    id SERIAL,
    name TEXT,
    region TEXT
  ) PARTITION BY LIST (region);

  CREATE TABLE users_us PARTITION OF users FOR VALUES IN ('us-east', 'us-west');
  CREATE TABLE users_eu PARTITION OF users FOR VALUES IN ('eu-west', 'eu-central');
`;

// Demo
const router = new HashShardRouter(4);
console.log('User 123 →', router.getShard(123));
console.log('User 456 →', router.getShard(456));
console.log('User 789 →', router.getShard(789));

const rangeRouter = new RangeShardRouter();
console.log('User 500000 →', rangeRouter.getShard(500000));
console.log('User 1500000 →', rangeRouter.getShard(1500000));
```
🏋️ Practice Exercise
Shard Key Selection: For each system, recommend a shard key and sharding strategy, explaining trade-offs: (a) E-commerce orders, (b) Chat messages, (c) Global user accounts, (d) Social media posts/feed.
Hotspot Simulation: With hash-based sharding (4 shards), simulate what happens when 70% of traffic goes to 5% of users (celebrity effect). How would you handle this hotspot?
Cross-Shard Query Design: You've sharded users by user_id and orders by user_id. Now the product team wants "top 10 highest spending users this month." Design the query strategy across shards.
Resharding Plan: You need to go from 8 shards to 16 shards with zero downtime. Design a step-by-step migration plan including: data migration, dual-writing, verification, and cutover.
Partition Strategy: Design the partitioning strategy for a logging table with 50M new rows per day and a 90-day retention policy. How do you efficiently purge old data?
⚠️ Common Mistakes
Sharding too early — sharding adds enormous complexity (cross-shard queries, distributed transactions, operational overhead). Start with vertical scaling, read replicas, and caching. Only shard when you've exhausted other options.
Choosing a bad shard key — sharding by a low-cardinality key (like country) creates massive hotspots. Sharding by a key not used in queries forces expensive scatter-gather queries.
Ignoring cross-shard queries — if your most common query requires data from all shards, sharding may actually make performance worse. Your shard key must align with your most frequent query patterns.
Not co-locating related data — if users and their orders are on different shards, every 'get user with orders' query becomes a distributed join. Shard related tables by the same key.