Rate Limiter Design
📖 Concept
A rate limiter controls the number of requests a client can make within a time window. It protects services from abuse, prevents cascading failures, and ensures fair resource allocation.
Common Algorithms
| Algorithm | Description | Pros | Cons |
|---|---|---|---|
| Token Bucket | Tokens added at fixed rate; request consumes a token | Allows bursts, smooth rate | Slightly complex state |
| Leaky Bucket | Requests processed at fixed rate; overflow rejected | Smooth output rate | No burst handling |
| Fixed Window | Count requests in fixed time windows | Simple to implement | Boundary spike problem |
| Sliding Window Log | Track timestamp of each request | Precise rate limiting | High memory usage |
| Sliding Window Counter | Combine fixed windows with weighted overlap | Good accuracy, low memory | Approximation |
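The code example below implements token bucket and sliding window counter; for comparison, the two simplest algorithms in the table — leaky bucket and fixed window — can be sketched as follows (a minimal illustration; class and method names are our own):

```javascript
// ---------- Leaky Bucket: queue drained at a fixed rate ----------
class LeakyBucket {
  constructor(capacity, leakRatePerSec) {
    this.capacity = capacity;
    this.water = 0;                 // current queue depth
    this.leakRate = leakRatePerSec; // requests drained per second
    this.lastLeak = Date.now();
  }

  allowRequest() {
    this.leak();
    if (this.water < this.capacity) {
      this.water++;
      return true;
    }
    return false; // bucket full: overflow is rejected
  }

  leak() {
    const now = Date.now();
    const elapsed = (now - this.lastLeak) / 1000;
    this.water = Math.max(0, this.water - elapsed * this.leakRate);
    this.lastLeak = now;
  }
}

// ---------- Fixed Window: one counter per time window ----------
class FixedWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.window = -1;
    this.count = 0;
  }

  allowRequest() {
    const currentWindow = Math.floor(Date.now() / this.windowMs);
    if (currentWindow !== this.window) {
      this.window = currentWindow; // new window: reset the counter
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false;
  }
}
```

Note how the fixed window's counter reset is exactly what creates the boundary spike problem listed in the cons column.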
Where to Rate Limit
| Location | What It Protects | Example |
|---|---|---|
| API Gateway | Backend services from overload | 1000 req/min per API key |
| Application | Specific endpoints | 5 login attempts per hour |
| Database | DB from excessive queries | Connection pool limits |
| Third-party | Your quota with external APIs | 100 Stripe API calls/sec |
Interview tip: Rate limiting is a common system design interview question. Be prepared to implement a sliding window or token bucket algorithm and discuss distributed rate limiting with Redis.
💻 Code Example
```javascript
// ============================================
// Rate Limiter — Algorithm Implementations
// ============================================

// ---------- Token Bucket ----------
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  allowRequest(tokensNeeded = 1) {
    this.refill();
    if (this.tokens >= tokensNeeded) {
      this.tokens -= tokensNeeded;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// ---------- Sliding Window Counter ----------
class SlidingWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.windows = new Map(); // userId → { prevCount, currCount, currWindow }
  }

  allowRequest(userId) {
    const now = Date.now();
    const currentWindow = Math.floor(now / this.windowMs);
    let record = this.windows.get(userId);

    if (!record || record.currWindow !== currentWindow) {
      record = {
        // Carry the old count forward only if the previous window was adjacent
        prevCount: record?.currWindow === currentWindow - 1 ? record.currCount : 0,
        currCount: 0,
        currWindow: currentWindow,
      };
    }

    // Weighted count: include the remaining overlap of the previous window
    const elapsed = (now % this.windowMs) / this.windowMs;
    const estimatedCount = record.prevCount * (1 - elapsed) + record.currCount;

    if (estimatedCount < this.maxRequests) {
      record.currCount++;
      this.windows.set(userId, record);
      return { allowed: true, remaining: Math.floor(this.maxRequests - estimatedCount - 1) };
    }

    return { allowed: false, retryAfter: this.windowMs - (now % this.windowMs) };
  }
}

// ---------- Distributed Rate Limiter (Redis) ----------
class DistributedRateLimiter {
  constructor(redis, config) {
    this.redis = redis;
    this.windowMs = config.windowMs || 60000;
    this.maxRequests = config.maxRequests || 100;
  }

  async allowRequest(clientId) {
    const key = `ratelimit:${clientId}`;
    const now = Date.now();
    const windowStart = now - this.windowMs;

    // Lua script so the check-and-add runs atomically inside Redis
    const lua = `
      redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, ARGV[1])
      local count = redis.call('ZCARD', KEYS[1])
      if count < tonumber(ARGV[3]) then
        redis.call('ZADD', KEYS[1], ARGV[2], ARGV[2])
        redis.call('PEXPIRE', KEYS[1], ARGV[4])
        return 1
      end
      return 0
    `;

    const allowed = await this.redis.eval(
      lua, 1, key,
      windowStart.toString(), now.toString(),
      this.maxRequests.toString(), this.windowMs.toString()
    );

    return { allowed: allowed === 1 };
  }
}

// Demo
const bucket = new TokenBucket(10, 2); // capacity 10, refills 2 tokens/sec
console.log('Request 1:', bucket.allowRequest()); // true
for (let i = 0; i < 10; i++) bucket.allowRequest();
console.log('Request 12:', bucket.allowRequest()); // false (bucket empty)

const slider = new SlidingWindowCounter(60000, 100);
console.log('Sliding window:', slider.allowRequest('user_1'));
```
🏋️ Practice Exercise
Algorithm Comparison: Implement token bucket, sliding window log, and sliding window counter. Compare memory usage, accuracy, and burst handling with 1000 simulated requests.
Distributed Rate Limiter: Design a rate limiter using Redis that works across 10 API servers, limiting each user to 100 requests per minute.
Tiered Rate Limiting: Design rate limits for a SaaS API: Free tier (100 req/day), Pro (10K req/day), Enterprise (unlimited). Include per-endpoint limits (e.g., search is more expensive).
Rate Limit Headers: Implement standard rate limit response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After.
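For the headers exercise, one framework-agnostic approach is a helper that turns a limiter decision into a headers object (a sketch, assuming the decision objects look like those returned by the sliding window counter above: `{ allowed, remaining?, retryAfter? }` with `retryAfter` in milliseconds):

```javascript
// Build standard rate-limit response headers from a limiter decision.
// `result` shape is an assumption: { allowed, remaining?, retryAfter? }.
function rateLimitHeaders(limit, result, windowMs, now = Date.now()) {
  const msUntilReset = result.retryAfter ?? windowMs - (now % windowMs);
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, result.remaining ?? 0)),
    // Reset is conventionally a Unix timestamp in seconds
    'X-RateLimit-Reset': String(Math.ceil((now + msUntilReset) / 1000)),
  };
  if (!result.allowed) {
    // Retry-After is specified in whole seconds (RFC 9110)
    headers['Retry-After'] = String(Math.ceil(msUntilReset / 1000));
  }
  return headers;
}
```

In an HTTP framework, this object would be merged into every response from the rate-limiting middleware, whether the request is allowed or rejected with 429.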
⚠️ Common Mistakes
Fixed window boundary spikes — a user can make 100 requests at 11:59 and 100 at 12:00, effectively 200 in 1 minute. Use sliding window to prevent this.
Rate limiting only at the API gateway — internal services also need rate limiting to prevent cascading failures between microservices.
Not returning rate limit headers — clients need to know their current limits to implement proper backoff. Always return X-RateLimit-Remaining and Retry-After headers.
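The boundary-spike mistake is easy to demonstrate with a fixed window counter that takes an injectable clock, so we can simulate requests straddling a window boundary (names are illustrative):

```javascript
// Fixed window counter with an injectable clock, to simulate the
// boundary spike without waiting for real time to pass.
class FixedWindow {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.window = -1;
    this.count = 0;
  }
  allowRequest(now) {
    const w = Math.floor(now / this.windowMs);
    if (w !== this.window) {
      this.window = w; // boundary crossed: counter resets
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false;
  }
}

const limiter = new FixedWindow(60000, 100); // 100 req/min
let granted = 0;

// 100 requests half a second before the boundary ...
for (let i = 0; i < 100; i++) if (limiter.allowRequest(59500)) granted++;
// ... and 100 more half a second after it
for (let i = 0; i < 100; i++) if (limiter.allowRequest(60500)) granted++;

console.log(granted); // 200 — double the limit, within one second
```

A sliding window counter over the same simulated timestamps would weight the previous window's 100 requests and reject most of the second burst.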