Load Balancing


📖 Concept

A load balancer distributes incoming traffic across multiple backend servers to improve availability, throughput, and fault tolerance.

Load Balancing Algorithms

Algorithm            | How It Works                                   | Best For
Round Robin          | Distribute sequentially (1→2→3→1→2→3)          | Equal-capacity servers
Weighted Round Robin | Higher-capacity servers get more traffic       | Mixed server sizes
Least Connections    | Route to server with fewest active connections | Variable request duration
Least Response Time  | Route to fastest responding server             | Heterogeneous backends
IP Hash              | Hash client IP to determine server             | Session affinity (sticky sessions)
Random               | Pick a random server                           | Simple, surprisingly effective
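
IP Hash and Random from the table are small enough to sketch in a few lines. The following is a minimal illustration, not a production implementation: the class names `IpHashBalancer` and `RandomBalancer`, the `hashString` helper, and the choice of FNV-1a as the hash are all assumptions made for this example.

```javascript
// Hypothetical helper: 32-bit FNV-1a hash of a string.
// Any stable hash function would work here.
function hashString(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by FNV prime, 32-bit
  }
  return hash >>> 0; // convert to unsigned
}

// IP Hash: the same client IP always maps to the same server
// as long as the pool does not change.
class IpHashBalancer {
  constructor(servers) {
    this.servers = servers;
  }

  getServer(clientIp) {
    return this.servers[hashString(clientIp) % this.servers.length];
  }
}

// Random: pick any server with equal probability.
class RandomBalancer {
  constructor(servers) {
    this.servers = servers;
  }

  getServer() {
    return this.servers[Math.floor(Math.random() * this.servers.length)];
  }
}

const ipHash = new IpHashBalancer(['server-1', 'server-2', 'server-3']);
const first = ipHash.getServer('203.0.113.7');
const second = ipHash.getServer('203.0.113.7');
console.log(first === second); // true: same client IP, same server
```

One caveat with plain modulo hashing: adding or removing a server changes the mapping for most clients. Consistent hashing is the usual way to limit that reshuffling.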

Types of Load Balancers

Layer 4 (Transport Layer)

Routes based on IP address and TCP/UDP port. Very fast because it does not inspect the request content. Examples: AWS NLB, HAProxy (TCP mode).

Layer 7 (Application Layer)

Routes based on HTTP content (URL path, headers, cookies). More flexible, at the cost of parsing every request. Examples: AWS ALB, Nginx, HAProxy (HTTP mode), Envoy.
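
As a rough illustration of content-based routing, the sketch below picks a backend pool from the URL path and a request header. The `routes` table, the pool names, and the `x-beta-user` header are hypothetical and exist only for this example. Real L7 balancers express the same idea in their own configuration syntax.

```javascript
// Hypothetical routing table: the first matching rule wins.
const routes = [
  { match: (req) => req.headers['x-beta-user'] === '1', pool: 'beta-pool' },
  { match: (req) => req.path.startsWith('/api/'), pool: 'api-pool' },
  { match: (req) => req.path.startsWith('/static/'), pool: 'static-pool' },
];
const DEFAULT_POOL = 'web-pool';

// Returns the name of the pool that should handle the request.
function choosePool(req) {
  const rule = routes.find((r) => r.match(req));
  return rule ? rule.pool : DEFAULT_POOL;
}

console.log(choosePool({ path: '/api/orders', headers: {} })); // api-pool
console.log(choosePool({ path: '/index.html', headers: {} })); // web-pool
console.log(choosePool({ path: '/api/orders', headers: { 'x-beta-user': '1' } })); // beta-pool
```

A Layer 4 balancer cannot make these decisions because it never parses the HTTP request.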

Health Checks

Load balancers continuously check backend health:

  • Active: LB sends periodic requests to /health endpoint
  • Passive: LB monitors response codes from real traffic
  • Unhealthy server: Removed from pool, traffic redirected to healthy servers
  • Recovery: After passing N consecutive health checks, server re-added
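
The passive check and the "N consecutive checks" recovery rule can be sketched as a small state tracker. This is an assumption-laden illustration: `PassiveHealthTracker`, `failThreshold`, and `recoverThreshold` are hypothetical names, and treating any 5xx status as a failure is a simplification.

```javascript
class PassiveHealthTracker {
  constructor({ failThreshold = 3, recoverThreshold = 2 } = {}) {
    this.failThreshold = failThreshold;       // consecutive failures before ejection
    this.recoverThreshold = recoverThreshold; // consecutive successes before re-adding
    this.state = new Map();                   // host -> { healthy, fails, successes }
  }

  _entry(host) {
    if (!this.state.has(host)) {
      this.state.set(host, { healthy: true, fails: 0, successes: 0 });
    }
    return this.state.get(host);
  }

  // Call with the status code of each observed response
  // (real traffic, or a probe once the server has been ejected).
  record(host, statusCode) {
    const s = this._entry(host);
    if (statusCode >= 500) {
      s.fails++;
      s.successes = 0;
      if (s.healthy && s.fails >= this.failThreshold) s.healthy = false;
    } else {
      s.successes++;
      s.fails = 0;
      if (!s.healthy && s.successes >= this.recoverThreshold) s.healthy = true;
    }
    return s.healthy;
  }

  isHealthy(host) {
    return this._entry(host).healthy;
  }
}

const tracker = new PassiveHealthTracker();
[500, 503, 502].forEach((code) => tracker.record('app-1', code));
console.log(tracker.isHealthy('app-1')); // false: three consecutive 5xx responses
tracker.record('app-1', 200);
console.log(tracker.isHealthy('app-1')); // false: only one success so far
tracker.record('app-1', 200);
console.log(tracker.isHealthy('app-1')); // true: two consecutive successes
```

An ejected server receives no real traffic, so purely passive checking cannot observe its recovery. That is why passive checks are usually paired with active probes.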

Load Balancing at Scale

Level        | What's Balanced                    | Technology
DNS          | Geographic regions / datacenters   | Route53, Cloudflare DNS
L4           | TCP connections to server pools    | NLB, F5, MetalLB
L7           | HTTP requests to app instances     | ALB, Nginx, Envoy
Service Mesh | Microservice-to-microservice       | Istio, Linkerd

Interview tip: In a system design interview, place a load balancer between the clients and your app servers whenever the design runs more than one instance. Interviewers expect it in any design at scale.

💻 Code Example

// ============================================
// Load Balancing: Algorithm Implementations
// ============================================

// ---------- Round Robin ----------
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers;
    this.currentIndex = 0;
  }

  // Returns the next server in order, or undefined if the pool is empty.
  getServer() {
    if (this.servers.length === 0) return undefined;
    const server = this.servers[this.currentIndex];
    this.currentIndex = (this.currentIndex + 1) % this.servers.length;
    return server;
  }
}

// ---------- Weighted Round Robin ----------
class WeightedRoundRobin {
  constructor(servers) {
    this.servers = servers; // [{ host, weight }], weights are positive integers
    this.currentIndex = -1; // start before index 0 so the first server is checked first
    this.currentWeight = 0;
    this.maxWeight = Math.max(...servers.map(s => s.weight));
    this.gcd = this.calculateGCD();
  }

  getServer() {
    while (true) {
      this.currentIndex = (this.currentIndex + 1) % this.servers.length;
      if (this.currentIndex === 0) {
        // Completed a pass over the pool: lower the weight bar, wrapping at zero.
        this.currentWeight -= this.gcd;
        if (this.currentWeight <= 0) this.currentWeight = this.maxWeight;
      }
      if (this.servers[this.currentIndex].weight >= this.currentWeight) {
        return this.servers[this.currentIndex];
      }
    }
  }

  // Greatest common divisor of all weights (Euclid's algorithm).
  calculateGCD() {
    const weights = this.servers.map(s => s.weight);
    return weights.reduce((a, b) => {
      while (b) { [a, b] = [b, a % b]; }
      return a;
    });
  }
}

// ---------- Least Connections ----------
class LeastConnectionsBalancer {
  constructor(servers) {
    this.servers = servers.map(s => ({ ...s, connections: 0 }));
  }

  // Picks the server with the fewest active connections and counts the new one.
  getServer() {
    const server = this.servers.reduce((min, s) =>
      s.connections < min.connections ? s : min
    );
    server.connections++;
    return server;
  }

  // Call when a request finishes so the count stays accurate.
  releaseConnection(server) {
    server.connections = Math.max(0, server.connections - 1);
  }
}

// ---------- Health-Checked Load Balancer ----------
class HealthCheckedLB {
  constructor(servers, healthCheckInterval = 5000) {
    this.servers = servers.map(s => ({ ...s, healthy: true, failCount: 0 }));
    this.balancer = new RoundRobinBalancer(this.getHealthyServers());
    this.timer = setInterval(() => this.runHealthChecks(), healthCheckInterval);
  }

  getHealthyServers() {
    return this.servers.filter(s => s.healthy);
  }

  async runHealthChecks() {
    let changed = false;
    for (const server of this.servers) {
      try {
        // fetch has no `timeout` option; abort the request after 2 seconds instead.
        const res = await fetch(`http://${server.host}/health`, {
          signal: AbortSignal.timeout(2000),
        });
        if (!res.ok) throw new Error('Unhealthy');
        server.failCount = 0;
        if (!server.healthy) {
          server.healthy = true;
          changed = true;
          console.log(`✅ ${server.host} is back HEALTHY`);
        }
      } catch (e) {
        server.failCount++;
        if (server.failCount >= 3 && server.healthy) {
          server.healthy = false;
          changed = true;
          console.log(`❌ ${server.host} marked UNHEALTHY`);
        }
      }
    }
    // Rebuild the pool only when it changed, so the rotation is not reset every round.
    if (changed) {
      this.balancer = new RoundRobinBalancer(this.getHealthyServers());
    }
  }

  // Returns undefined when no server is healthy.
  route() { return this.balancer.getServer(); }

  // Stops the periodic checks so the process can exit.
  stop() { clearInterval(this.timer); }
}

// Demo
const rr = new RoundRobinBalancer(['server-1', 'server-2', 'server-3']);
console.log([1, 2, 3, 4, 5, 6].map(() => rr.getServer()));
// server-1, server-2, server-3, server-1, server-2, server-3

const wrr = new WeightedRoundRobin([
  { host: 'big-server', weight: 5 },
  { host: 'small-server', weight: 1 },
]);
const counts = {};
for (let i = 0; i < 60; i++) {
  const s = wrr.getServer().host;
  counts[s] = (counts[s] || 0) + 1;
}
console.log('Weighted distribution:', counts);
// big-server: 50, small-server: 10 (a 5:1 ratio)

🏋️ Practice Exercise

  1. Algorithm Selection: For each scenario, choose the best load balancing algorithm: (a) 4 identical servers, (b) 2 powerful + 4 small servers, (c) WebSocket connections, (d) API with variable response times.

  2. Health Check Design: Design a health check system that distinguishes between: (a) server completely down, (b) server overloaded but alive, (c) server healthy but with degraded DB connection.

  3. Multi-Layer LB: Design the load balancing architecture for a global e-commerce site: DNS-level (geo-routing), L4 (TCP), and L7 (HTTP routing).

  4. Sticky Sessions: When are sticky sessions necessary? Design an alternative using external session storage that avoids sticky sessions entirely.

⚠️ Common Mistakes

  • Using sticky sessions when services should be stateless: sticky sessions pin each user to one server, which skews the load and loses every pinned session when that server dies. Store sessions externally (e.g., in Redis).

  • Not implementing health checks: without health checks, the load balancer keeps routing traffic to dead servers, causing errors for users.

  • Single load balancer as SPOF: the load balancer itself needs redundancy. Use multiple LBs with failover, or a managed service such as AWS ALB, which runs across multiple Availability Zones.

  • Using round robin with heterogeneous servers: if one server has 2x the capacity, round robin still sends it the same share as the others, so half of its capacity sits idle once the smaller servers saturate. Use weighted round robin.
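
To illustrate the sticky-session point above, here is a minimal sketch of keeping session state outside the app servers so that any instance can serve any request. A `Map` stands in for an external store such as Redis so the example runs without dependencies. `SessionStore`, `AppServer`, and `addToCart` are hypothetical names.

```javascript
// Stand-in for an external session store (assumption: a Map in one process
// plays the role that Redis would play across machines).
class SessionStore {
  constructor() {
    this.data = new Map();
  }

  get(sessionId) {
    return this.data.get(sessionId) ?? null;
  }

  set(sessionId, value) {
    this.data.set(sessionId, value);
  }
}

// An app instance that keeps no session state of its own.
class AppServer {
  constructor(name, store) {
    this.name = name;
    this.store = store;
  }

  addToCart(sessionId, item) {
    const session = this.store.get(sessionId) ?? { cart: [] };
    session.cart.push(item);
    this.store.set(sessionId, session);
    return { servedBy: this.name, cart: session.cart };
  }
}

const sharedStore = new SessionStore();
const appA = new AppServer('app-a', sharedStore);
const appB = new AppServer('app-b', sharedStore);

appA.addToCart('sess-42', 'book');                // first request lands on app-a
const result = appB.addToCart('sess-42', 'pen');  // next request lands on app-b
console.log(result); // { servedBy: 'app-b', cart: [ 'book', 'pen' ] }
```

Because both instances read and write the same store, the load balancer is free to use any algorithm without session affinity.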
