Unique ID Generation at Scale

0/3 in this phase0/45 across the roadmap

📖 Concept

Generating unique IDs in a distributed system is surprisingly challenging. Auto-increment IDs don't work when you have multiple databases. You need IDs that are globally unique, sortable, and generated without coordination.

Approaches

Approach	Format	Sortable	Size	Coordination
UUID v4	Random 128-bit	❌ No	36 chars	None
ULID	Timestamp + random	✅ Yes	26 chars	None
Snowflake	Timestamp + machine + seq	✅ Yes	64-bit int	Machine ID assignment
Nano ID	Random (configurable)	❌ No	Custom	None
DB sequence	Auto-increment	✅ Yes	Varies	DB coordination

Twitter Snowflake (Most Popular for System Design)

64-bit ID composed of:

1 bit: Reserved (sign bit)
41 bits: Timestamp (milliseconds since epoch → 69 years)
10 bits: Machine/datacenter ID (1024 machines)
12 bits: Sequence number (4096 IDs per millisecond per machine)

Result: Each machine generates up to 4096 unique, time-sorted IDs per millisecond. No coordination between machines. Total: ~4 million IDs/sec across 1024 machines.

Interview tip: When asked about ID generation, mention Snowflake IDs. They're time-sorted (great for database indexes), unique without coordination, and encode useful metadata.

💻 Code Example

codeTap to expand ⛶

1// ============================================
2// Unique ID Generation — Snowflake Implementation
3// ============================================
4
5class SnowflakeGenerator {
6  constructor(machineId, datacenterId = 0) {
7    this.machineId = machineId & 0x1F;     // 5 bits: 0-31
8    this.datacenterId = datacenterId & 0x1F; // 5 bits: 0-31
9    this.sequence = 0;
10    this.lastTimestamp = -1;
11    this.epoch = 1704067200000; // Jan 1, 2024
12  }
13
14  generate() {
15    let timestamp = Date.now() - this.epoch;
16
17    if (timestamp === this.lastTimestamp) {
18      this.sequence = (this.sequence + 1) & 0xFFF; // 12 bits: 0-4095
19      if (this.sequence === 0) {
20        // Wait for next millisecond
21        while (Date.now() - this.epoch <= this.lastTimestamp) {}
22        timestamp = Date.now() - this.epoch;
23      }
24    } else {
25      this.sequence = 0;
26    }
27
28    this.lastTimestamp = timestamp;
29
30    // Compose 64-bit ID (using BigInt for 64-bit precision)
31    const id = (BigInt(timestamp) << 22n)
32             | (BigInt(this.datacenterId) << 17n)
33             | (BigInt(this.machineId) << 12n)
34             | BigInt(this.sequence);
35
36    return id.toString();
37  }
38
39  // Extract components from ID
40  static parse(id) {
41    const n = BigInt(id);
42    const epoch = 1704067200000;
43    return {
44      timestamp: new Date(Number(n >> 22n) + epoch),
45      datacenterId: Number((n >> 17n) & 0x1Fn),
46      machineId: Number((n >> 12n) & 0x1Fn),
47      sequence: Number(n & 0xFFFn),
48    };
49  }
50}
51
52// Demo
53const gen = new SnowflakeGenerator(1, 0);
54const ids = Array.from({length: 5}, () => gen.generate());
55console.log('Generated IDs:', ids);
56console.log('Parsed:', SnowflakeGenerator.parse(ids[0]));
57
58// ---------- ULID (Universally Unique Lexicographically Sortable ID) ----------
59function generateULID() {
60  const timestamp = Date.now();
61  const timeChars = encodeBase32(timestamp, 10);
62  const randomChars = encodeBase32(Math.floor(Math.random() * 2**40), 8)
63                    + encodeBase32(Math.floor(Math.random() * 2**40), 8);
64  return timeChars + randomChars;
65}
66
67function encodeBase32(value, length) {
68  const chars = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
69  let result = '';
70  for (let i = 0; i < length; i++) {
71    result = chars[value & 0x1F] + result;
72    value = Math.floor(value / 32);
73  }
74  return result;
75}
76
77console.log('ULID:', generateULID());

🏋️ Practice Exercise

Snowflake Implementation: Implement a Snowflake ID generator that handles: clock skew (system clock goes backward), sequence overflow (>4096 IDs/ms), and multi-datacenter deployment.
UUID vs Snowflake: Compare UUID v4 and Snowflake IDs for B-Tree index performance. Why are random UUIDs terrible for database indexes?
URL Shortener IDs: Design the ID generation for a URL shortener (like bit.ly) that generates 7-character alphanumeric short codes. How many unique URLs can it support?
Distributed Counter: Design a counter service that creates sequential IDs across 10 servers without a single point of failure.

⚠️ Common Mistakes

Using UUID v4 as database primary key — random UUIDs cause B-Tree index fragmentation because inserts are scattered across the entire index. Use time-sorted IDs (Snowflake, ULID) for sequential inserts.
Clock synchronization dependency — Snowflake IDs rely on timestamps. If the system clock goes backward (NTP adjustment), you could generate duplicate IDs. Implement clock skew detection.
Not considering ID size — 128-bit UUIDs use 2x the space of 64-bit Snowflake IDs. In tables with billions of rows, this significantly impacts index size and join performance.

💼 Interview Questions

🎤 Mock Interview

Practice a live interview for Unique ID Generation at Scale

Was this topic helpful?

← PreviousConsistent Hashing Next →Rate Limiter Design