Rate Limiter Design


📖 Concept

A rate limiter controls the number of requests a client can make within a time window. It protects services from abuse, prevents cascading failures, and ensures fair resource allocation.

Common Algorithms

| Algorithm | Description | Pros | Cons |
| --- | --- | --- | --- |
| Token Bucket | Tokens added at a fixed rate; each request consumes a token | Allows bursts, smooth average rate | Slightly complex state |
| Leaky Bucket | Requests processed at a fixed rate; overflow rejected | Smooth output rate | No burst handling |
| Fixed Window | Count requests in fixed time windows | Simple to implement | Boundary spike problem |
| Sliding Window Log | Track the timestamp of each request | Precise rate limiting | High memory usage |
| Sliding Window Counter | Combine fixed windows with a weighted overlap | Good accuracy, low memory | Approximation |
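
The code example below covers the token bucket and sliding window counter; the leaky bucket can be sketched as a counter that drains at a fixed rate. This is a minimal illustrative sketch (class and field names are not from any particular library):

```javascript
// Leaky bucket as a counter drained at a fixed rate (illustrative sketch)
class LeakyBucket {
  constructor(capacity, leakRatePerSec) {
    this.capacity = capacity;             // max queued requests
    this.water = 0;                       // current queue depth
    this.leakRatePerSec = leakRatePerSec; // drain rate
    this.lastLeak = Date.now();
  }

  allowRequest() {
    const now = Date.now();
    const elapsed = (now - this.lastLeak) / 1000;
    // Drain at the fixed leak rate since the last check
    this.water = Math.max(0, this.water - elapsed * this.leakRatePerSec);
    this.lastLeak = now;
    if (this.water < this.capacity) {
      this.water++;      // request enters the bucket
      return true;
    }
    return false;        // bucket full — reject
  }
}
```

Note the contrast with the token bucket: a full leaky bucket rejects a burst outright, while a full token bucket would have allowed it.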

Where to Rate Limit

| Location | What It Protects | Example |
| --- | --- | --- |
| API Gateway | Backend services from overload | 1,000 req/min per API key |
| Application | Specific endpoints | 5 login attempts per hour |
| Database | DB from excessive queries | Connection pool limits |
| Third-party | Your quota with external APIs | 100 Stripe API calls/sec |
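
At the application layer, the limiter typically sits in front of the request handler as middleware. A framework-agnostic sketch (the `req.clientId` field and the limiter's boolean `allowRequest` interface are assumptions for illustration):

```javascript
// Middleware sketch: wrap a handler with a per-client rate limit.
// `limiter` is any object exposing allowRequest(clientId) → boolean.
function rateLimitMiddleware(limiter, handler) {
  return (req, res) => {
    const clientId = req.clientId; // e.g. API key or client IP (assumed field)
    if (!limiter.allowRequest(clientId)) {
      res.statusCode = 429;        // Too Many Requests
      return res.end('rate limit exceeded');
    }
    return handler(req, res);
  };
}
```

The same wrapper can be applied per endpoint, so a login route gets a much stricter limiter than a read-only route.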

Interview tip: Rate limiting is a common system design interview question. Be prepared to implement a sliding window or token bucket algorithm and discuss distributed rate limiting with Redis.

💻 Code Example

```javascript
// ============================================
// Rate Limiter — Algorithm Implementations
// ============================================

// ---------- Token Bucket ----------
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  allowRequest(tokensNeeded = 1) {
    this.refill();
    if (this.tokens >= tokensNeeded) {
      this.tokens -= tokensNeeded;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// ---------- Sliding Window Counter ----------
class SlidingWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.windows = new Map(); // userId → { prevCount, currCount, currWindow }
  }

  allowRequest(userId) {
    const now = Date.now();
    const currentWindow = Math.floor(now / this.windowMs);
    let record = this.windows.get(userId);

    if (!record || record.currWindow !== currentWindow) {
      record = {
        // Carry the count forward only if it belongs to the immediately
        // preceding window; otherwise start fresh.
        prevCount: record?.currWindow === currentWindow - 1 ? record.currCount : 0,
        currCount: 0,
        currWindow: currentWindow,
      };
    }

    // Weighted count: portion of the previous window still inside the sliding window
    const elapsed = (now % this.windowMs) / this.windowMs;
    const estimatedCount = record.prevCount * (1 - elapsed) + record.currCount;

    if (estimatedCount < this.maxRequests) {
      record.currCount++;
      this.windows.set(userId, record);
      return { allowed: true, remaining: Math.floor(this.maxRequests - estimatedCount - 1) };
    }

    return { allowed: false, retryAfter: this.windowMs - (now % this.windowMs) };
  }
}

// ---------- Distributed Rate Limiter (Redis) ----------
class DistributedRateLimiter {
  constructor(redis, config) {
    this.redis = redis;
    this.windowMs = config.windowMs || 60000;
    this.maxRequests = config.maxRequests || 100;
  }

  async allowRequest(clientId) {
    const key = `ratelimit:${clientId}`;
    const now = Date.now();
    const windowStart = now - this.windowMs;
    // Unique member so concurrent requests in the same millisecond
    // don't collapse into a single sorted-set entry.
    const member = `${now}:${Math.random()}`;

    // Lua script so trim + count + add happen atomically
    const lua = `
      redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, ARGV[1])
      local count = redis.call('ZCARD', KEYS[1])
      if count < tonumber(ARGV[3]) then
        redis.call('ZADD', KEYS[1], ARGV[2], ARGV[5])
        redis.call('PEXPIRE', KEYS[1], ARGV[4])
        return 1
      end
      return 0
    `;

    const allowed = await this.redis.eval(
      lua, 1, key,
      windowStart.toString(), now.toString(),
      this.maxRequests.toString(), this.windowMs.toString(),
      member
    );

    return { allowed: allowed === 1 };
  }
}

// Demo
const bucket = new TokenBucket(10, 2); // capacity 10, refill 2 tokens/sec
console.log('Request 1:', bucket.allowRequest()); // true
for (let i = 0; i < 10; i++) bucket.allowRequest();
console.log('Request 12:', bucket.allowRequest()); // false (bucket empty)

const slider = new SlidingWindowCounter(60000, 100);
console.log('Sliding window:', slider.allowRequest('user_1'));
```

🏋️ Practice Exercise

  1. Algorithm Comparison: Implement token bucket, sliding window log, and sliding window counter. Compare memory usage, accuracy, and burst handling with 1000 simulated requests.

  2. Distributed Rate Limiter: Design a rate limiter using Redis that works across 10 API servers, limiting each user to 100 requests per minute.

  3. Tiered Rate Limiting: Design rate limits for a SaaS API: Free tier (100 req/day), Pro (10K req/day), Enterprise (unlimited). Include per-endpoint limits (e.g., search is more expensive).

  4. Rate Limit Headers: Implement standard rate limit response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After.
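
As a starting point for exercise 4, the headers can be attached to any response object. A sketch assuming the limiter reports a result shaped like `{ allowed, remaining, resetMs }` (that shape, and the `setRateLimitHeaders` helper, are illustrative, not a standard API):

```javascript
// Sketch: attach standard rate-limit headers to a response.
// `result` is assumed to look like { allowed, remaining, resetMs }.
function setRateLimitHeaders(res, limit, result) {
  res.setHeader('X-RateLimit-Limit', limit);
  res.setHeader('X-RateLimit-Remaining', Math.max(0, result.remaining));
  // Seconds until the current window resets
  res.setHeader('X-RateLimit-Reset', Math.ceil(result.resetMs / 1000));
  if (!result.allowed) {
    // Retry-After is only meaningful on a rejected (429) response
    res.setHeader('Retry-After', Math.ceil(result.resetMs / 1000));
  }
}
```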

⚠️ Common Mistakes

  • Fixed window boundary spikes — a user can make 100 requests at 11:59 and 100 at 12:00, effectively 200 in 1 minute. Use sliding window to prevent this.

  • Rate limiting only at the API gateway — internal services also need rate limiting to prevent cascading failures between microservices.

  • Not returning rate limit headers — clients need to know their current limits to implement proper backoff. Always return X-RateLimit-Remaining and Retry-After headers.
