Real-World DSA: Rate Limiting & Scheduling
📖 Concept
Rate limiting controls how many requests a user or IP can make within a time window. Common implementations rely on queues, sliding windows, and token buckets — all built on core DSA concepts.
Rate Limiting Algorithms:
1. Fixed Window Counter:
- Count requests per fixed time window (e.g., 100 req/minute)
- Simple, but suffers from the "boundary burst" problem: up to 2× the limit can pass in a short span straddling a window boundary (e.g., 100 requests at :59 plus 100 at :00)
2. Sliding Window Log:
- Store timestamp of each request
- Count requests in last N seconds
- Accurate but memory-heavy
3. Sliding Window Counter:
- Combines fixed window + interpolation
- Low memory, good accuracy
4. Token Bucket:
- Bucket fills at constant rate (r tokens/sec)
- Each request consumes a token
- Allows bursts up to bucket size
- Used by: AWS, Stripe, most cloud APIs
5. Leaky Bucket:
- Requests enter a queue (bucket), processed at constant rate
- Queue overflow → rejected
- Smooth output rate
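The fixed window counter (algorithm 1) is the simplest of the five to implement, and it also makes the boundary-burst weakness easy to demonstrate. A minimal sketch — class and parameter names are our own, and the optional timestamp arguments exist only to keep the example deterministic:

```javascript
// FIXED WINDOW COUNTER — one counter per window; resets when the window rolls over
class FixedWindowCounter {
  constructor(maxRequests, windowMs, startMs = Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.windowStart = startMs;
    this.count = 0;
  }

  allowRequest(now = Date.now()) {
    // Start a fresh window (and reset the counter) once the current one expires
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count < this.maxRequests) {
      this.count++;
      return true;
    }
    return false; // over the per-window limit
  }
}
```

Note how a limit of 2 requests per second still admits 4 requests within a few milliseconds when they straddle a window boundary — exactly the boundary-burst problem described above.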
Task Scheduling:
- Round Robin: Circular queue of tasks, each gets equal time slice
- Priority Queue: Highest priority task runs first (heap-based)
- Job Scheduler: Cron-like scheduling using min-heap of next-run times
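The cron-like job scheduler above can be sketched as a min-heap keyed on each job's next-run time: the root is always the next job due. Everything here (class names, the `runDue` helper, the job shape) is illustrative:

```javascript
// Min-heap ordered by each job's nextRun timestamp
class MinHeap {
  constructor() { this.heap = []; }
  push(job) {
    this.heap.push(job);
    let i = this.heap.length - 1;
    while (i > 0) {                       // sift up
      const p = (i - 1) >> 1;
      if (this.heap[p].nextRun <= this.heap[i].nextRun) break;
      [this.heap[p], this.heap[i]] = [this.heap[i], this.heap[p]];
      i = p;
    }
  }
  pop() {
    const top = this.heap[0];
    const last = this.heap.pop();
    if (this.heap.length) {               // sift down
      this.heap[0] = last;
      let i = 0;
      while (true) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let s = i;
        if (l < this.heap.length && this.heap[l].nextRun < this.heap[s].nextRun) s = l;
        if (r < this.heap.length && this.heap[r].nextRun < this.heap[s].nextRun) s = r;
        if (s === i) break;
        [this.heap[s], this.heap[i]] = [this.heap[i], this.heap[s]];
        i = s;
      }
    }
    return top;
  }
  peek() { return this.heap[0]; }
  get size() { return this.heap.length; }
}

// CRON-LIKE JOB SCHEDULER — pop jobs while the heap root is due
class JobScheduler {
  constructor() { this.heap = new MinHeap(); }
  // intervalMs of 0 means a one-shot job
  schedule(name, nextRun, intervalMs) {
    this.heap.push({ name, nextRun, intervalMs });
  }
  // Run every job due at or before `now`; recurring jobs are re-queued
  runDue(now) {
    const ran = [];
    while (this.heap.size && this.heap.peek().nextRun <= now) {
      const job = this.heap.pop();
      ran.push(job.name);
      if (job.intervalMs) {
        job.nextRun += job.intervalMs;
        this.heap.push(job);
      }
    }
    return ran;
  }
}
```

The heap makes "what runs next?" an O(1) peek and each schedule/dispatch an O(log n) operation, which is why min-heaps underpin most timer wheels and cron implementations.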
🏠 Real-world analogy: Token bucket is like a vending machine coin slot — coins (tokens) accumulate at a constant rate. You can spend them quickly (burst) but once out, you wait. Leaky bucket is like a funnel — liquid (requests) flows out at a constant rate regardless of how fast you pour in.
💻 Code Example
// TOKEN BUCKET RATE LIMITER
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  allowRequest() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens--;
      return true; // Allowed
    }
    return false; // Rate limited
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// SLIDING WINDOW LOG RATE LIMITER
class SlidingWindowRateLimiter {
  constructor(maxRequests, windowMs) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.requests = new Map(); // userId → [timestamps]
  }

  allowRequest(userId) {
    const now = Date.now();
    if (!this.requests.has(userId)) this.requests.set(userId, []);
    const timestamps = this.requests.get(userId);

    // Remove expired timestamps
    while (timestamps.length && timestamps[0] <= now - this.windowMs) {
      timestamps.shift();
    }

    if (timestamps.length < this.maxRequests) {
      timestamps.push(now);
      return true;
    }
    return false;
  }
}

// TASK SCHEDULER — Minimum intervals with cooldown
function leastInterval(tasks, n) {
  const freq = new Array(26).fill(0);
  for (const t of tasks) freq[t.charCodeAt(0) - 65]++;
  freq.sort((a, b) => b - a);

  const maxFreq = freq[0];
  let idleSlots = (maxFreq - 1) * n;

  for (let i = 1; i < 26 && freq[i] > 0; i++) {
    idleSlots -= Math.min(freq[i], maxFreq - 1);
  }

  return tasks.length + Math.max(0, idleSlots);
}

// ROUND-ROBIN SCHEDULER
class RoundRobinScheduler {
  constructor(timeSlice) {
    this.queue = [];
    this.timeSlice = timeSlice;
  }

  addTask(task) { this.queue.push(task); }

  run() {
    while (this.queue.length) {
      const task = this.queue.shift();
      const remaining = task.execute(this.timeSlice);
      if (remaining > 0) {
        task.remainingTime = remaining;
        this.queue.push(task); // Back to queue
      }
    }
  }
}
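One algorithm from the concept section that the examples above skip is the sliding window counter (algorithm 3). A possible sketch, assuming linear interpolation between the previous and current fixed windows — class and field names are illustrative, and the optional timestamp arguments only make the example deterministic:

```javascript
// SLIDING WINDOW COUNTER — two counters plus interpolation; O(1) memory per user
class SlidingWindowCounter {
  constructor(maxRequests, windowMs, startMs = Date.now()) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
    this.currentWindowStart = startMs;
    this.currentCount = 0;
    this.previousCount = 0;
  }

  allowRequest(now = Date.now()) {
    // Roll the windows forward if time has advanced past the current one
    const elapsedWindows = Math.floor((now - this.currentWindowStart) / this.windowMs);
    if (elapsedWindows >= 1) {
      // Only the immediately preceding window contributes; older ones are empty
      this.previousCount = elapsedWindows === 1 ? this.currentCount : 0;
      this.currentCount = 0;
      this.currentWindowStart += elapsedWindows * this.windowMs;
    }
    // Weight the previous window by how much of it still overlaps the sliding window
    const overlap = 1 - (now - this.currentWindowStart) / this.windowMs;
    const estimated = this.previousCount * overlap + this.currentCount;
    if (estimated < this.maxRequests) {
      this.currentCount++;
      return true;
    }
    return false;
  }
}
```

Compared with the sliding window log, this keeps just two counters per user instead of a timestamp list, at the cost of approximating (rather than exactly counting) requests in the trailing window.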
🏋️ Practice Exercise
Practice Problems & Design Questions:
- Implement a token bucket rate limiter
- Implement a sliding window rate limiter
- Design an API rate limiter for a web service
- Task scheduler with cooldown — minimum total time
- Implement round-robin scheduling
- Design a job queue with priority and retry logic
- Rate limiting with different tiers (free: 10/min, pro: 100/min)
- Implement a leaky bucket rate limiter
- Design a distributed rate limiter (multiple servers)
- Compare fixed window vs sliding window rate limiting
⚠️ Common Mistakes
Fixed window counter boundary burst — 100 requests at :59 + 100 at :00 = 200 in 2 seconds
Not considering distributed scenarios — rate limiting per server doesn't limit total load
Not handling race conditions — concurrent requests may bypass the limit
Using in-memory rate limiting in load-balanced systems — use Redis for shared state
Not implementing graceful degradation — return 429 Too Many Requests, not 500
💼 Interview Questions
🎤 Mock Interview
Practice a live interview for Real-World DSA: Rate Limiting & Scheduling