Load Balancing
📖 Concept
A load balancer distributes incoming traffic across multiple backend servers to improve availability, throughput, and fault tolerance.
Load Balancing Algorithms
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Distribute sequentially (1→2→3→1→2→3) | Equal-capacity servers |
| Weighted Round Robin | Higher-capacity servers get more traffic | Mixed server sizes |
| Least Connections | Route to server with fewest active connections | Variable request duration |
| Least Response Time | Route to fastest responding server | Heterogeneous backends |
| IP Hash | Hash client IP to determine server | Session affinity (sticky sessions) |
| Random | Pick a random server | Simple, surprisingly effective |
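As a rough illustration of the IP Hash row above, the sketch below hashes the client IP and takes it modulo the server count. The names `hashString` and `IpHashBalancer` are made up for this example, and the FNV-1a hash is just one reasonable choice, not something the table prescribes.

```javascript
// Sketch of IP-hash selection: the same client IP maps to the same server
// for as long as the server list does not change.
// hashString and IpHashBalancer are illustrative names, not a library API.
function hashString(str) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime in 32 bits
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

class IpHashBalancer {
  constructor(servers) {
    this.servers = servers;
  }

  getServer(clientIp) {
    return this.servers[hashString(clientIp) % this.servers.length];
  }
}

const ipHash = new IpHashBalancer(['server-1', 'server-2', 'server-3']);
const first = ipHash.getServer('203.0.113.7');
const second = ipHash.getServer('203.0.113.7');
console.log(first === second); // true: same IP, same server
```

One trade-off of plain modulo hashing: adding or removing a server changes the mapping for most clients. Consistent hashing is the usual way to limit that churn.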
Types of Load Balancers
Layer 4 (Transport Layer)
Routes based on IP address and port. Very fast — doesn't inspect content. Examples: AWS NLB, HAProxy (TCP mode)
Layer 7 (Application Layer)
Routes based on HTTP content (URL path, headers, cookies). More flexible. Examples: AWS ALB, Nginx, HAProxy (HTTP mode), Envoy
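To make the Layer 7 idea concrete, here is a minimal sketch of content-based routing: each rule inspects the request's path or headers and names a backend pool. The rule shape, the pool names, the paths, and the `x-beta-user` header are all hypothetical; real L7 load balancers express this through their own configuration formats.

```javascript
// Sketch of Layer 7 routing: pick a backend pool from the URL path and headers.
// Rule shape, pool names, and the x-beta-user header are hypothetical.
const routingRules = [
  { match: (req) => req.path.startsWith('/api/'), pool: 'api-servers' },
  { match: (req) => req.path.startsWith('/static/'), pool: 'static-servers' },
  { match: (req) => req.headers['x-beta-user'] === 'true', pool: 'beta-servers' },
];

function pickPool(req, defaultPool = 'web-servers') {
  // The first matching rule wins; otherwise fall back to the default pool.
  const rule = routingRules.find((r) => r.match(req));
  return rule ? rule.pool : defaultPool;
}

console.log(pickPool({ path: '/api/orders', headers: {} })); // api-servers
console.log(pickPool({ path: '/home', headers: {} }));       // web-servers
```

A Layer 4 balancer could not make this decision, because it never parses the HTTP request.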
Health Checks
Load balancers continuously check backend health:
- Active: LB sends periodic requests to the /health endpoint
- Passive: LB monitors response codes from real traffic
- Unhealthy server: Removed from pool, traffic redirected to healthy servers
- Recovery: After passing N consecutive health checks, server re-added
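The removal and recovery rules in the list above can be sketched as a small state machine driven by consecutive results. The class name `HealthTracker` and the default thresholds (3 failures, 2 passes) are assumptions chosen for illustration.

```javascript
// Sketch of threshold-based health state: a server leaves the pool after
// `unhealthyThreshold` consecutive failures and rejoins after
// `healthyThreshold` consecutive passes. Names and defaults are assumptions.
class HealthTracker {
  constructor({ unhealthyThreshold = 3, healthyThreshold = 2 } = {}) {
    this.unhealthyThreshold = unhealthyThreshold;
    this.healthyThreshold = healthyThreshold;
    this.healthy = true;
    this.failStreak = 0;
    this.passStreak = 0;
  }

  // Record one check result (from an active probe or a passive observation)
  // and return the server's current health state.
  record(passed) {
    if (passed) {
      this.passStreak++;
      this.failStreak = 0;
      if (!this.healthy && this.passStreak >= this.healthyThreshold) {
        this.healthy = true;
      }
    } else {
      this.failStreak++;
      this.passStreak = 0;
      if (this.healthy && this.failStreak >= this.unhealthyThreshold) {
        this.healthy = false;
      }
    }
    return this.healthy;
  }
}

const tracker = new HealthTracker();
const states = [false, false, false, true, true].map((r) => tracker.record(r));
console.log(states); // [ true, true, false, false, true ]
```

Requiring streaks in both directions keeps a single slow response or a single lucky probe from flapping the server in and out of the pool.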
Load Balancing at Scale
| Level | What's Balanced | Technology |
|---|---|---|
| DNS | Geographic regions / datacenters | Route53, Cloudflare DNS |
| L4 | TCP connections to server pools | NLB, F5, MetalLB |
| L7 | HTTP requests to app instances | ALB, Nginx, Envoy |
| Service Mesh | Microservice-to-microservice | Istio, Linkerd |
Interview tip: In every system design, add a load balancer between the client and your app servers. It's expected in every design at scale.
💻 Code Example
```javascript
// ============================================
// Load Balancing: Algorithm Implementations
// ============================================

// ---------- Round Robin ----------
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers;
    this.currentIndex = 0;
  }

  getServer() {
    const server = this.servers[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.servers.length;
    return server;
  }
}

// ---------- Weighted Round Robin ----------
// Assumes every weight is a positive integer.
class WeightedRoundRobin {
  constructor(servers) {
    this.servers = servers; // [{host, weight}]
    this.currentIndex = -1; // start before the first server so index 0 is considered first
    this.currentWeight = 0;
    this.maxWeight = Math.max(...servers.map(s => s.weight));
    this.gcd = this.calculateGCD();
  }

  getServer() {
    while (true) {
      this.currentIndex = (this.currentIndex + 1) % this.servers.length;
      if (this.currentIndex === 0) {
        // Start of a new pass: lower the weight threshold, wrapping back to the max
        this.currentWeight -= this.gcd;
        if (this.currentWeight <= 0) this.currentWeight = this.maxWeight;
      }
      if (this.servers[this.currentIndex].weight >= this.currentWeight) {
        return this.servers[this.currentIndex];
      }
    }
  }

  calculateGCD() {
    const weights = this.servers.map(s => s.weight);
    return weights.reduce((a, b) => {
      while (b) { [a, b] = [b, a % b]; }
      return a;
    });
  }
}

// ---------- Least Connections ----------
class LeastConnectionsBalancer {
  constructor(servers) {
    this.servers = servers.map(s => ({ ...s, connections: 0 }));
  }

  getServer() {
    const server = this.servers.reduce((min, s) =>
      s.connections < min.connections ? s : min
    );
    server.connections++;
    return server;
  }

  // Call when a request finishes so the count reflects active connections only
  releaseConnection(server) {
    server.connections = Math.max(0, server.connections - 1);
  }
}

// ---------- Health-Checked Load Balancer ----------
class HealthCheckedLB {
  constructor(servers, healthCheckInterval = 5000) {
    this.servers = servers.map(s => ({ ...s, healthy: true, failCount: 0 }));
    this.balancer = new RoundRobinBalancer(this.getHealthyServers());
    // Keep the timer handle so the checks can be stopped with stop()
    this.timer = setInterval(() => this.runHealthChecks(), healthCheckInterval);
  }

  getHealthyServers() {
    return this.servers.filter(s => s.healthy);
  }

  async runHealthChecks() {
    for (const server of this.servers) {
      try {
        // fetch has no `timeout` option; an AbortSignal enforces the 2 s limit
        const res = await fetch(`http://${server.host}/health`, {
          signal: AbortSignal.timeout(2000),
        });
        if (res.ok) {
          server.failCount = 0;
          if (!server.healthy) {
            server.healthy = true;
            console.log(`✅ ${server.host} is back HEALTHY`);
          }
        } else { throw new Error('Unhealthy'); }
      } catch (e) {
        server.failCount++;
        // Mark unhealthy only after 3 consecutive failures
        if (server.failCount >= 3 && server.healthy) {
          server.healthy = false;
          console.log(`❌ ${server.host} marked UNHEALTHY`);
        }
      }
    }
    // Rebuild the pool from the servers that are currently healthy
    this.balancer = new RoundRobinBalancer(this.getHealthyServers());
  }

  // Returns undefined when no server is healthy
  route() { return this.balancer.getServer(); }

  stop() { clearInterval(this.timer); }
}

// Demo
const rr = new RoundRobinBalancer(['server-1', 'server-2', 'server-3']);
console.log([1,2,3,4,5,6].map(() => rr.getServer()));

const wrr = new WeightedRoundRobin([
  { host: 'big-server', weight: 5 },
  { host: 'small-server', weight: 1 },
]);
const counts = {};
for (let i = 0; i < 60; i++) {
  const s = wrr.getServer().host;
  counts[s] = (counts[s] || 0) + 1;
}
console.log('Weighted distribution:', counts); // { 'big-server': 50, 'small-server': 10 }
```
🏋️ Practice Exercise
Algorithm Selection: For each scenario, choose the best load balancing algorithm: (a) 4 identical servers, (b) 2 powerful + 4 small servers, (c) WebSocket connections, (d) API with variable response times.
Health Check Design: Design a health check system that distinguishes between: (a) server completely down, (b) server overloaded but alive, (c) server healthy but with degraded DB connection.
Multi-Layer LB: Design the load balancing architecture for a global e-commerce site: DNS-level (geo-routing), L4 (TCP), and L7 (HTTP routing).
Sticky Sessions: When are sticky sessions necessary? Design an alternative using external session storage that avoids sticky sessions entirely.
⚠️ Common Mistakes
Using sticky sessions when services should be stateless — sticky sessions reduce the effectiveness of load balancing and create failure scenarios. Store sessions externally (Redis).
Not implementing health checks — without health checks, the load balancer routes traffic to dead servers, causing errors for users.
Single load balancer as SPOF: the load balancer itself needs redundancy. Use multiple LBs with failover, or a managed LB that is already redundant (e.g., AWS ALB runs across multiple Availability Zones).
Using round robin with heterogeneous servers — if one server has 2x the capacity, round robin wastes half its potential. Use weighted round robin.
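The first mistake in the list above recommends external session storage instead of sticky sessions. The sketch below shows why that works: once session data lives in a shared store, any app server can serve any request. A plain `Map` stands in for an external store such as Redis, and `makeAppServer`, the session ID, and the server names are hypothetical.

```javascript
// Sketch: a shared session store lets any app server handle any request,
// so the load balancer does not need sticky sessions.
// A Map stands in for an external store such as Redis; all names are hypothetical.
const sessionStore = new Map();

function makeAppServer(name) {
  return {
    login(sessionId, user) {
      sessionStore.set(sessionId, { user });
      return `${name}: logged in ${user}`;
    },
    profile(sessionId) {
      const session = sessionStore.get(sessionId);
      return session ? `${name}: hello ${session.user}` : `${name}: not logged in`;
    },
  };
}

const appA = makeAppServer('app-a');
const appB = makeAppServer('app-b');
console.log(appA.login('sess-123', 'alice')); // app-a: logged in alice
console.log(appB.profile('sess-123'));        // app-b: hello alice
```

If each server kept sessions in its own memory, the second call would report "not logged in" whenever the load balancer picked a different server, which is the failure sticky sessions try to paper over.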