Rate Limiter Design
📖 Concept
A rate limiter controls the number of requests a client can make within a time window. It protects services from abuse, prevents cascading failures, and ensures fair resource allocation.
Common Algorithms
| Algorithm | Description | Pros | Cons |
|---|---|---|---|
| Token Bucket | Tokens added at fixed rate; request consumes a token | Allows bursts, smooth rate | Slightly complex state |
| Leaky Bucket | Requests processed at fixed rate; overflow rejected | Smooth output rate | No burst handling |
| Fixed Window | Count requests in fixed time windows | Simple to implement | Boundary spike problem |
| Sliding Window Log | Track timestamp of each request | Precise rate limiting | High memory usage |
| Sliding Window Counter | Combine fixed windows with weighted overlap | Good accuracy, low memory | Approximation |
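The code example below implements token bucket and sliding window counter; for comparison, the two simplest algorithms in the table — leaky bucket and fixed window — can be sketched as follows (a minimal illustration; class and method names are our own):

```javascript
// ---------- Leaky Bucket: queue drained at a fixed rate ----------
class LeakyBucket {
  constructor(capacity, leakRatePerSec) {
    this.capacity = capacity;
    this.water = 0;                 // current queue depth
    this.leakRate = leakRatePerSec; // requests drained per second
    this.lastLeak = Date.now();
  }

  allowRequest() {
    this.leak();
    if (this.water < this.capacity) {
      this.water++;
      return true;
    }
    return false; // bucket full: overflow is rejected
  }

  leak() {
    const now = Date.now();
    const elapsed = (now - this.lastLeak) / 1000;
    this.water = Math.max(0, this.water - elapsed * this.leakRate);
    this.lastLeak = now;
  }
}

// ---------- Fixed Window: one counter per time window ----------
class FixedWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.window = -1;
    this.count = 0;
  }

  allowRequest() {
    const currentWindow = Math.floor(Date.now() / this.windowMs);
    if (currentWindow !== this.window) {
      this.window = currentWindow; // new window: reset the counter
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false;
  }
}
```

Note how the fixed window's counter reset is exactly what creates the boundary spike problem listed in the cons column.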
Where to Rate Limit
| Location | What It Protects | Example |
|---|---|---|
| API Gateway | Backend services from overload | 1000 req/min per API key |
| Application | Specific endpoints | 5 login attempts per hour |
| Database | DB from excessive queries | Connection pool limits |
| Third-party | Your quota with external APIs | 100 Stripe API calls/sec |
Interview tip: Rate limiting is a common system design interview question. Be prepared to implement a sliding window or token bucket algorithm and discuss distributed rate limiting with Redis.
💻 Code Example
```javascript
// ============================================
// Rate Limiter — Algorithm Implementations
// ============================================

// ---------- Token Bucket ----------
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  allowRequest(tokensNeeded = 1) {
    this.refill();
    if (this.tokens >= tokensNeeded) {
      this.tokens -= tokensNeeded;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// ---------- Sliding Window Counter ----------
class SlidingWindowCounter {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.windows = new Map(); // userId → { prevCount, currCount, currWindow }
  }

  allowRequest(userId) {
    const now = Date.now();
    const currentWindow = Math.floor(now / this.windowMs);
    let record = this.windows.get(userId);

    if (!record || record.currWindow !== currentWindow) {
      record = {
        // Carry the old count forward only if the previous window was adjacent
        prevCount: record?.currWindow === currentWindow - 1 ? record.currCount : 0,
        currCount: 0,
        currWindow: currentWindow,
      };
    }

    // Weighted count: include the remaining overlap of the previous window
    const elapsed = (now % this.windowMs) / this.windowMs;
    const estimatedCount = record.prevCount * (1 - elapsed) + record.currCount;

    if (estimatedCount < this.maxRequests) {
      record.currCount++;
      this.windows.set(userId, record);
      return { allowed: true, remaining: Math.floor(this.maxRequests - estimatedCount - 1) };
    }

    return { allowed: false, retryAfter: this.windowMs - (now % this.windowMs) };
  }
}

// ---------- Distributed Rate Limiter (Redis) ----------
class DistributedRateLimiter {
  constructor(redis, config) {
    this.redis = redis;
    this.windowMs = config.windowMs || 60000;
    this.maxRequests = config.maxRequests || 100;
  }

  async allowRequest(clientId) {
    const key = `ratelimit:${clientId}`;
    const now = Date.now();
    const windowStart = now - this.windowMs;

    // Lua script so the check-and-add runs atomically inside Redis
    const lua = `
      redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, ARGV[1])
      local count = redis.call('ZCARD', KEYS[1])
      if count < tonumber(ARGV[3]) then
        redis.call('ZADD', KEYS[1], ARGV[2], ARGV[2])
        redis.call('PEXPIRE', KEYS[1], ARGV[4])
        return 1
      end
      return 0
    `;

    const allowed = await this.redis.eval(
      lua, 1, key,
      windowStart.toString(), now.toString(),
      this.maxRequests.toString(), this.windowMs.toString()
    );

    return { allowed: allowed === 1 };
  }
}

// Demo
const bucket = new TokenBucket(10, 2); // capacity 10, refills 2 tokens/sec
console.log('Request 1:', bucket.allowRequest()); // true
for (let i = 0; i < 10; i++) bucket.allowRequest();
console.log('Request 12:', bucket.allowRequest()); // false (bucket empty)

const slider = new SlidingWindowCounter(60000, 100);
console.log('Sliding window:', slider.allowRequest('user_1'));
```
🏋️ Practice Exercise
Algorithm Comparison: Implement token bucket, sliding window log, and sliding window counter. Compare memory usage, accuracy, and burst handling with 1000 simulated requests.
Distributed Rate Limiter: Design a rate limiter using Redis that works across 10 API servers, limiting each user to 100 requests per minute.
Tiered Rate Limiting: Design rate limits for a SaaS API: Free tier (100 req/day), Pro (10K req/day), Enterprise (unlimited). Include per-endpoint limits (e.g., search is more expensive).
Rate Limit Headers: Implement standard rate limit response headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After.
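For the headers exercise, one framework-agnostic approach is a helper that turns a limiter decision into a headers object (a sketch, assuming the decision objects look like those returned by the sliding window counter above: `{ allowed, remaining?, retryAfter? }` with `retryAfter` in milliseconds):

```javascript
// Build standard rate-limit response headers from a limiter decision.
// `result` shape is an assumption: { allowed, remaining?, retryAfter? }.
function rateLimitHeaders(limit, result, windowMs, now = Date.now()) {
  const msUntilReset = result.retryAfter ?? windowMs - (now % windowMs);
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, result.remaining ?? 0)),
    // Reset is conventionally a Unix timestamp in seconds
    'X-RateLimit-Reset': String(Math.ceil((now + msUntilReset) / 1000)),
  };
  if (!result.allowed) {
    // Retry-After is specified in whole seconds (RFC 9110)
    headers['Retry-After'] = String(Math.ceil(msUntilReset / 1000));
  }
  return headers;
}
```

In an HTTP framework, this object would be merged into every response from the rate-limiting middleware, whether the request is allowed or rejected with 429.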
⚠️ Common Mistakes
Fixed window boundary spikes — a user can make 100 requests at 11:59 and 100 at 12:00, effectively 200 in 1 minute. Use sliding window to prevent this.
Rate limiting only at the API gateway — internal services also need rate limiting to prevent cascading failures between microservices.
Not returning rate limit headers — clients need to know their current limits to implement proper backoff. Always return X-RateLimit-Remaining and Retry-After headers.
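The boundary-spike mistake is easy to demonstrate with a fixed window counter that takes an injectable clock, so we can simulate requests straddling a window boundary (names are illustrative):

```javascript
// Fixed window counter with an injectable clock, to simulate the
// boundary spike without waiting for real time to pass.
class FixedWindow {
  constructor(windowMs, maxRequests) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
    this.window = -1;
    this.count = 0;
  }
  allowRequest(now) {
    const w = Math.floor(now / this.windowMs);
    if (w !== this.window) {
      this.window = w; // boundary crossed: counter resets
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false;
  }
}

const limiter = new FixedWindow(60000, 100); // 100 req/min
let granted = 0;

// 100 requests half a second before the boundary ...
for (let i = 0; i < 100; i++) if (limiter.allowRequest(59500)) granted++;
// ... and 100 more half a second after it
for (let i = 0; i < 100; i++) if (limiter.allowRequest(60500)) granted++;

console.log(granted); // 200 — double the limit, within one second
```

A sliding window counter over the same simulated timestamps would weight the previous window's 100 requests and reject most of the second burst.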