Unique ID Generation at Scale
📖 Concept
Generating unique IDs in a distributed system is surprisingly challenging. Auto-increment IDs don't work when you have multiple databases. You need IDs that are globally unique, sortable, and generated without coordination.
Approaches
| Approach | Format | Sortable | Size | Coordination |
|---|---|---|---|---|
| UUID v4 | Random 128-bit | ❌ No | 36 chars | None |
| ULID | Timestamp + random | ✅ Yes | 26 chars | None |
| Snowflake | Timestamp + machine + seq | ✅ Yes | 64-bit int | Machine ID assignment |
| Nano ID | Random (configurable) | ❌ No | Custom | None |
| DB sequence | Auto-increment | ✅ Yes | Varies | DB coordination |
Twitter Snowflake (Most Popular for System Design)
64-bit ID composed of:
- 1 bit: Reserved (sign bit)
- 41 bits: Timestamp (milliseconds since epoch → 69 years)
- 10 bits: Machine/datacenter ID (1024 machines)
- 12 bits: Sequence number (4096 IDs per millisecond per machine)
Result: Each machine generates up to 4096 unique, time-sorted IDs per millisecond. No coordination between machines. Total: ~4 million IDs/sec across 1024 machines.
Interview tip: When asked about ID generation, mention Snowflake IDs. They're time-sorted (great for database indexes), unique without coordination, and encode useful metadata.
💻 Code Example
1// ============================================2// Unique ID Generation — Snowflake Implementation3// ============================================45class SnowflakeGenerator {6 constructor(machineId, datacenterId = 0) {7 this.machineId = machineId & 0x1F; // 5 bits: 0-318 this.datacenterId = datacenterId & 0x1F; // 5 bits: 0-319 this.sequence = 0;10 this.lastTimestamp = -1;11 this.epoch = 1704067200000; // Jan 1, 202412 }1314 generate() {15 let timestamp = Date.now() - this.epoch;1617 if (timestamp === this.lastTimestamp) {18 this.sequence = (this.sequence + 1) & 0xFFF; // 12 bits: 0-409519 if (this.sequence === 0) {20 // Wait for next millisecond21 while (Date.now() - this.epoch <= this.lastTimestamp) {}22 timestamp = Date.now() - this.epoch;23 }24 } else {25 this.sequence = 0;26 }2728 this.lastTimestamp = timestamp;2930 // Compose 64-bit ID (using BigInt for 64-bit precision)31 const id = (BigInt(timestamp) << 22n)32 | (BigInt(this.datacenterId) << 17n)33 | (BigInt(this.machineId) << 12n)34 | BigInt(this.sequence);3536 return id.toString();37 }3839 // Extract components from ID40 static parse(id) {41 const n = BigInt(id);42 const epoch = 1704067200000;43 return {44 timestamp: new Date(Number(n >> 22n) + epoch),45 datacenterId: Number((n >> 17n) & 0x1Fn),46 machineId: Number((n >> 12n) & 0x1Fn),47 sequence: Number(n & 0xFFFn),48 };49 }50}5152// Demo53const gen = new SnowflakeGenerator(1, 0);54const ids = Array.from({length: 5}, () => gen.generate());55console.log('Generated IDs:', ids);56console.log('Parsed:', SnowflakeGenerator.parse(ids[0]));5758// ---------- ULID (Universally Unique Lexicographically Sortable ID) ----------59function generateULID() {60 const timestamp = Date.now();61 const timeChars = encodeBase32(timestamp, 10);62 const randomChars = encodeBase32(Math.floor(Math.random() * 2**40), 8)63 + encodeBase32(Math.floor(Math.random() * 2**40), 8);64 return timeChars + randomChars;65}6667function encodeBase32(value, length) {68 const chars = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';69 let result = '';70 for (let i = 0; i < length; i++) {71 result = chars[value & 0x1F] + result;72 value = Math.floor(value / 32);73 }74 return result;75}7677console.log('ULID:', generateULID());
🏋️ Practice Exercise
Snowflake Implementation: Implement a Snowflake ID generator that handles: clock skew (system clock goes backward), sequence overflow (>4096 IDs/ms), and multi-datacenter deployment.
UUID vs Snowflake: Compare UUID v4 and Snowflake IDs for B-Tree index performance. Why are random UUIDs terrible for database indexes?
URL Shortener IDs: Design the ID generation for a URL shortener (like bit.ly) that generates 7-character alphanumeric short codes. How many unique URLs can it support?
Distributed Counter: Design a counter service that creates sequential IDs across 10 servers without a single point of failure.
⚠️ Common Mistakes
Using UUID v4 as database primary key — random UUIDs cause B-Tree index fragmentation because inserts are scattered across the entire index. Use time-sorted IDs (Snowflake, ULID) for sequential inserts.
Clock synchronization dependency — Snowflake IDs rely on timestamps. If the system clock goes backward (NTP adjustment), you could generate duplicate IDs. Implement clock skew detection.
Not considering ID size — 128-bit UUIDs use 2x the space of 64-bit Snowflake IDs. In tables with billions of rows, this significantly impacts index size and join performance.
💼 Interview Questions
🎤 Mock Interview
Practice a live interview for Unique ID Generation at Scale