Clustering & Horizontal Scaling

📖 Concept

Node.js executes JavaScript on a single thread, so one process can fully utilize only one CPU core. The built-in cluster module enables running multiple Node.js processes side by side to leverage all CPU cores.

Scaling strategies:

| Strategy | How | Use Case |
| --- | --- | --- |
| Vertical | Bigger server (more CPU/RAM) | Quick fix, limited ceiling |
| Node.js Cluster | Fork worker processes | Multi-core utilization |
| PM2 | Process manager with cluster mode | Production deployment |
| Docker + K8s | Container orchestration | Microservices, cloud-native |
| Load Balancer | nginx / HAProxy / ALB | Multiple servers |

cluster module:

  • Primary (master) process — manages workers, doesn't handle requests
  • Worker processes — handle actual HTTP requests
  • Workers share the same port (OS distributes connections)
  • Workers are independent processes (crash isolation)

PM2 — Production Process Manager:

```bash
pm2 start app.js -i max         # Cluster mode (all cores)
pm2 start app.js -i 4           # 4 worker processes
pm2 reload app.js               # Zero-downtime reload
pm2 monit                       # Real-time monitoring
pm2 logs                        # View logs from all workers
pm2 save && pm2 startup         # Auto-start on server reboot
```

When to scale:

  1. Single Node.js process → PM2 cluster mode (same machine)
  2. PM2 cluster hits limits → multiple machines + load balancer
  3. Multiple machines → Docker + Kubernetes
  4. Global scale → CDN + edge computing + auto-scaling groups

🏠 Real-world analogy: Clustering is like a restaurant with multiple kitchens. Instead of one chef (single thread) handling all orders, you have multiple chefs (worker processes) in separate kitchens (processes), with a host (master/load balancer) assigning customers to the least busy kitchen.

💻 Code Example

```javascript
// Clustering & Horizontal Scaling

const cluster = require("cluster");
const os = require("os");
const express = require("express");

// 1. Built-in cluster module
if (cluster.isPrimary) {
  const numCPUs = os.cpus().length;
  console.log(`Primary ${process.pid} starting ${numCPUs} workers...`);

  // Fork workers
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Handle worker crashes (but don't refork during shutdown)
  let shuttingDown = false;
  cluster.on("exit", (worker, code, signal) => {
    if (shuttingDown) return;
    console.error(`Worker ${worker.process.pid} died (code: ${code}). Restarting...`);
    cluster.fork(); // Auto-restart
  });

  // Graceful shutdown
  process.on("SIGTERM", () => {
    console.log("Primary received SIGTERM. Shutting down workers...");
    shuttingDown = true;
    for (const worker of Object.values(cluster.workers)) {
      worker.process.kill("SIGTERM");
    }
  });
} else {
  // Worker process: each runs its own Express server
  const app = express();

  app.get("/api/health", (req, res) => {
    res.json({
      status: "healthy",
      pid: process.pid,
      worker: cluster.worker.id,
      uptime: process.uptime(),
    });
  });

  app.get("/api/heavy", (req, res) => {
    // CPU-intensive work (only blocks THIS worker)
    let result = 0;
    for (let i = 0; i < 1e7; i++) result += Math.sqrt(i);
    res.json({ result, pid: process.pid });
  });

  app.listen(3000, () => {
    console.log(`Worker ${process.pid} listening on port 3000`);
  });
}

// 2. PM2 ecosystem file (ecosystem.config.js)
const pm2Config = {
  apps: [
    {
      name: "my-api",
      script: "src/server.js",
      instances: "max",       // Use all CPU cores
      exec_mode: "cluster",   // Cluster mode
      max_memory_restart: "500M",
      env: {
        NODE_ENV: "production",
        PORT: 3000,
      },
      env_development: {
        NODE_ENV: "development",
        PORT: 3000,
      },
      // Logging
      log_file: "./logs/combined.log",
      error_file: "./logs/error.log",
      merge_logs: true,
      log_date_format: "YYYY-MM-DD HH:mm:ss",
      // Auto-restart
      watch: false,
      max_restarts: 10,
      restart_delay: 4000,
      // Graceful shutdown
      kill_timeout: 5000,
      listen_timeout: 10000,
    },
  ],
};

// 3. nginx load balancer configuration (reference)
const nginxConfig = `
# /etc/nginx/sites-available/my-api
upstream nodejs_cluster {
  least_conn;              # Least connections algorithm
  server 127.0.0.1:3001;   # Node instance 1
  server 127.0.0.1:3002;   # Node instance 2
  server 127.0.0.1:3003;   # Node instance 3
  server 127.0.0.1:3004;   # Node instance 4
  keepalive 64;            # Connection pooling
}

server {
  listen 80;
  listen 443 ssl;
  server_name api.example.com;

  # SSL
  ssl_certificate /etc/ssl/cert.pem;
  ssl_certificate_key /etc/ssl/key.pem;

  # Proxy to Node.js cluster
  location / {
    proxy_pass http://nodejs_cluster;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_cache_bypass $http_upgrade;

    # Timeouts
    proxy_connect_timeout 60s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
  }

  # Serve static files directly (bypass Node.js)
  location /static/ {
    alias /var/www/static/;
    expires 1y;
    add_header Cache-Control "public, immutable";
  }

  # Gzip compression
  gzip on;
  gzip_types text/plain application/json application/javascript text/css;
}
`;

module.exports = pm2Config;
```

🏋️ Practice Exercise

Exercises:

  1. Implement clustering using the cluster module — fork workers for each CPU core with auto-restart
  2. Set up PM2 with an ecosystem file — configure cluster mode, log files, and memory restart limits
  3. Configure nginx as a reverse proxy / load balancer for multiple Node.js instances
  4. Implement zero-downtime deployment using PM2's reload command
  5. Load test a single-instance server vs. clustered server — compare throughput and response times
  6. Build health check endpoints that report per-worker statistics

⚠️ Common Mistakes

  • Storing session state in process memory with clustering — each worker has its own memory; use Redis or a database for shared state

  • Not implementing graceful shutdown — when restarting workers, allow in-flight requests to complete before killing the process

  • Using cluster.fork() without auto-restart — if a worker crashes without restart logic, your capacity degrades over time

  • Running more workers than CPU cores — this causes context switching overhead; match workers to cores (or use PM2's max setting)

  • Not putting nginx in front of Node.js in production — nginx handles SSL termination, static files, gzip, rate limiting, and DDoS protection more efficiently
