Unique ID Generation at Scale

0/3 in this phase0/45 across the roadmap

📖 Concept

Generating unique IDs in a distributed system is surprisingly challenging. Auto-increment IDs don't work when you have multiple databases. You need IDs that are globally unique, sortable, and generated without coordination.

Approaches

Approach Format Sortable Size Coordination
UUID v4 Random 128-bit ❌ No 36 chars None
ULID Timestamp + random ✅ Yes 26 chars None
Snowflake Timestamp + machine + seq ✅ Yes 64-bit int Machine ID assignment
Nano ID Random (configurable) ❌ No Custom None
DB sequence Auto-increment ✅ Yes Varies DB coordination

Twitter Snowflake (Most Popular for System Design)

64-bit ID composed of:

  • 1 bit: Reserved (sign bit)
  • 41 bits: Timestamp (milliseconds since epoch → 69 years)
  • 10 bits: Machine/datacenter ID (1024 machines)
  • 12 bits: Sequence number (4096 IDs per millisecond per machine)

Result: Each machine generates up to 4096 unique, time-sorted IDs per millisecond. No coordination between machines. Total: ~4 million IDs/sec across 1024 machines.

Interview tip: When asked about ID generation, mention Snowflake IDs. They're time-sorted (great for database indexes), unique without coordination, and encode useful metadata.

💻 Code Example

codeTap to expand ⛶
1// ============================================
2// Unique ID Generation — Snowflake Implementation
3// ============================================
4
5class SnowflakeGenerator {
6 constructor(machineId, datacenterId = 0) {
7 this.machineId = machineId & 0x1F; // 5 bits: 0-31
8 this.datacenterId = datacenterId & 0x1F; // 5 bits: 0-31
9 this.sequence = 0;
10 this.lastTimestamp = -1;
11 this.epoch = 1704067200000; // Jan 1, 2024
12 }
13
14 generate() {
15 let timestamp = Date.now() - this.epoch;
16
17 if (timestamp === this.lastTimestamp) {
18 this.sequence = (this.sequence + 1) & 0xFFF; // 12 bits: 0-4095
19 if (this.sequence === 0) {
20 // Wait for next millisecond
21 while (Date.now() - this.epoch <= this.lastTimestamp) {}
22 timestamp = Date.now() - this.epoch;
23 }
24 } else {
25 this.sequence = 0;
26 }
27
28 this.lastTimestamp = timestamp;
29
30 // Compose 64-bit ID (using BigInt for 64-bit precision)
31 const id = (BigInt(timestamp) << 22n)
32 | (BigInt(this.datacenterId) << 17n)
33 | (BigInt(this.machineId) << 12n)
34 | BigInt(this.sequence);
35
36 return id.toString();
37 }
38
39 // Extract components from ID
40 static parse(id) {
41 const n = BigInt(id);
42 const epoch = 1704067200000;
43 return {
44 timestamp: new Date(Number(n >> 22n) + epoch),
45 datacenterId: Number((n >> 17n) & 0x1Fn),
46 machineId: Number((n >> 12n) & 0x1Fn),
47 sequence: Number(n & 0xFFFn),
48 };
49 }
50}
51
52// Demo
53const gen = new SnowflakeGenerator(1, 0);
54const ids = Array.from({length: 5}, () => gen.generate());
55console.log('Generated IDs:', ids);
56console.log('Parsed:', SnowflakeGenerator.parse(ids[0]));
57
58// ---------- ULID (Universally Unique Lexicographically Sortable ID) ----------
59function generateULID() {
60 const timestamp = Date.now();
61 const timeChars = encodeBase32(timestamp, 10);
62 const randomChars = encodeBase32(Math.floor(Math.random() * 2**40), 8)
63 + encodeBase32(Math.floor(Math.random() * 2**40), 8);
64 return timeChars + randomChars;
65}
66
67function encodeBase32(value, length) {
68 const chars = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
69 let result = '';
70 for (let i = 0; i < length; i++) {
71 result = chars[value & 0x1F] + result;
72 value = Math.floor(value / 32);
73 }
74 return result;
75}
76
77console.log('ULID:', generateULID());

🏋️ Practice Exercise

  1. Snowflake Implementation: Implement a Snowflake ID generator that handles: clock skew (system clock goes backward), sequence overflow (>4096 IDs/ms), and multi-datacenter deployment.

  2. UUID vs Snowflake: Compare UUID v4 and Snowflake IDs for B-Tree index performance. Why are random UUIDs terrible for database indexes?

  3. URL Shortener IDs: Design the ID generation for a URL shortener (like bit.ly) that generates 7-character alphanumeric short codes. How many unique URLs can it support?

  4. Distributed Counter: Design a counter service that creates sequential IDs across 10 servers without a single point of failure.

⚠️ Common Mistakes

  • Using UUID v4 as database primary key — random UUIDs cause B-Tree index fragmentation because inserts are scattered across the entire index. Use time-sorted IDs (Snowflake, ULID) for sequential inserts.

  • Clock synchronization dependency — Snowflake IDs rely on timestamps. If the system clock goes backward (NTP adjustment), you could generate duplicate IDs. Implement clock skew detection.

  • Not considering ID size — 128-bit UUIDs use 2x the space of 64-bit Snowflake IDs. In tables with billions of rows, this significantly impacts index size and join performance.

💼 Interview Questions

🎤 Mock Interview

Practice a live interview for Unique ID Generation at Scale