Horizontal vs Vertical Scaling


📖 Concept

Scaling is the ability to handle increased load. There are two fundamental approaches:

Vertical Scaling (Scale Up)

Add more resources to a single machine: more CPU, RAM, or faster SSDs.

| Pros | Cons |
| --- | --- |
| Simple — no code changes | Hardware limits (max ~128 cores, 4TB RAM) |
| No distributed system complexity | Single point of failure |
| Works for any application | Expensive at the high end |
| ACID transactions remain simple | Downtime during upgrades |

Horizontal Scaling (Scale Out)

Add more machines and distribute the load across them.

| Pros | Cons |
| --- | --- |
| Virtually unlimited scale | Application must be designed for distribution |
| Better fault tolerance (no SPOF) | Data consistency challenges |
| Commodity hardware (cheaper) | Network latency between nodes |
| Can scale independently per component | Operational complexity |
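The core mechanic of scale-out — spreading requests across identical machines — can be sketched with a round-robin dispatcher. This is a minimal illustration, not production code; the server names are placeholders:

```javascript
// Minimal round-robin load balancer sketch (illustrative only)
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers; // identical, interchangeable backends
    this.next = 0;
  }
  pick() {
    const server = this.servers[this.next];
    this.next = (this.next + 1) % this.servers.length; // cycle through backends
    return server;
  }
}

const lb = new RoundRobinBalancer(['app-1', 'app-2', 'app-3']);
console.log([1, 2, 3, 4].map(() => lb.pick())); // ['app-1', 'app-2', 'app-3', 'app-1']
```

Note that this only works if every backend can serve every request — which is why statelessness (covered in the code example below) is a prerequisite for horizontal scaling.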

When to Use Each

| Scenario | Strategy |
| --- | --- |
| Database at 80% CPU | Vertical first (cheaper, simpler) |
| 100K concurrent WebSocket users | Horizontal (memory-bound, need many servers) |
| Startup with < 10K users | Vertical (simplicity wins) |
| Global service with millions of users | Horizontal (must distribute geographically) |
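The table above can be condensed into a rough rule of thumb. The thresholds below are illustrative, not authoritative — real decisions depend on the actual bottleneck:

```javascript
// Rough scaling-strategy heuristic based on the scenarios above.
// Thresholds are illustrative assumptions, not hard rules.
function chooseStrategy({ users, needsGeoDistribution }) {
  if (needsGeoDistribution) return 'horizontal';  // must distribute geographically
  if (users < 10_000) return 'vertical';          // simplicity wins at small scale
  if (users > 1_000_000) return 'horizontal';     // beyond comfortable single-machine limits
  return 'vertical-first';                        // exhaust the cheap option, then scale out
}

console.log(chooseStrategy({ users: 5_000, needsGeoDistribution: false }));    // 'vertical'
console.log(chooseStrategy({ users: 5_000_000, needsGeoDistribution: true })); // 'horizontal'
```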

The Scaling Path

Most companies follow this progression:

  1. Single server (web + DB on one machine)
  2. Separate DB (web server + dedicated database server)
  3. Read replicas (one primary, multiple replicas for reads)
  4. Caching layer (Redis between app and DB)
  5. Load balancer + multiple app servers (horizontal scaling)
  6. Database sharding (horizontal DB scaling)
  7. Microservices (scale different components independently)
  8. Multi-region (globally distributed)

Interview tip: Always mention "I'd start simple and scale as needed" rather than immediately jumping to a complex distributed architecture for a new system.

💻 Code Example

```javascript
// ============================================
// Scaling Patterns — Practical Approaches
// ============================================

// ---------- Scaling Path Demonstration ----------

// Stage 1: Single Server
const stage1 = {
  architecture: 'Monolith on single server',
  capacity: '~1K req/sec, ~10K users',
  components: ['Web Server', 'App Logic', 'Database'],
  cost: '$50/month',
};

// Stage 2: Separate Web and DB servers
const stage2 = {
  architecture: 'Web server + dedicated DB',
  capacity: '~5K req/sec, ~50K users',
  components: ['Web Server (4 CPU, 16GB)', 'DB Server (8 CPU, 32GB)'],
  improvement: 'DB can be scaled independently, backups easier',
};

// Stage 3: Add caching and read replicas
const stage3 = {
  architecture: 'Web + Cache + DB Primary + Read Replicas',
  capacity: '~20K req/sec, ~200K users',
  components: ['Web Server', 'Redis Cache', 'DB Primary', '2x DB Replicas'],
  improvement: 'Cache absorbs 80% of reads, replicas handle the rest',
};

// Stage 4: Horizontal scaling with load balancer
const stage4 = {
  architecture: 'Load Balancer + N Web Servers + Cache + DB',
  capacity: '~100K req/sec, ~1M users',
  components: ['Load Balancer (Nginx/ALB)', '4x Web Servers', 'Redis Cluster', 'DB Primary + 3 Replicas'],
  improvement: 'Add/remove web servers based on traffic',
};

// Stage 5: Full distributed system
const stage5 = {
  architecture: 'Multi-region with CDN, sharded DB, microservices',
  capacity: '~1M+ req/sec, ~100M users',
  components: ['CDN', 'API Gateway', 'Microservices', 'Sharded DB', 'Kafka', 'Redis Cluster'],
};

// ---------- Stateless Application Server (Required for Horizontal Scaling) ----------

// ❌ BAD: Stateful server — can't scale horizontally
class StatefulServer {
  constructor() {
    this.sessions = {}; // In-memory state!
  }
  handleRequest(req) {
    // If the load balancer routes to a DIFFERENT server, the session is lost
    return this.sessions[req.sessionId];
  }
}

// ✅ GOOD: Stateless server — scales to any number of instances
class StatelessServer {
  constructor(redisClient) {
    this.redis = redisClient; // External state store
  }
  async handleRequest(req) {
    // Any server can handle any request
    const session = await this.redis.get(`session:${req.sessionId}`);
    return JSON.parse(session);
  }
}

// ---------- Connection Pooling (Critical for Scaling) ----------
const { Pool } = require('pg');

// ❌ BAD: New connection per request
async function badHandler(req) {
  const client = await new Pool().connect(); // connection setup overhead on every request
  const result = await client.query('SELECT 1');
  client.release();
  return result;
}

// ✅ GOOD: Shared connection pool
const pool = new Pool({
  max: 20,                       // cap concurrent DB connections
  idleTimeoutMillis: 30000,      // close idle connections after 30s
  connectionTimeoutMillis: 2000, // fail fast if the pool is exhausted
});

async function goodHandler(req) {
  const result = await pool.query('SELECT 1'); // reuses an existing connection
  return result;
}

console.log("Scaling stages:", [stage1, stage2, stage3, stage4, stage5].map(s => s.architecture));
```

🏋️ Practice Exercise

  1. Scaling Plan: Your app has 1K DAU and is growing 10x yearly. Design a 3-year scaling plan with specific milestones (10K, 100K, 1M users). What changes at each stage?

  2. Cost Analysis: Compare the cost of vertically scaling one server to 128 cores / 512GB RAM vs horizontally scaling to 16 x 8-core / 32GB RAM servers. Include operational costs.

  3. Stateless Refactor: Given a Node.js app storing user sessions and file uploads in memory, refactor it to be stateless so it can run on multiple servers behind a load balancer.

  4. Bottleneck Identification: Your system handles 50K req/sec but response times spike during peak hours. Database CPU is at 90%, app servers at 30%. Identify the bottleneck and design the scaling strategy.

⚠️ Common Mistakes

  • Scaling the wrong component — if the database is the bottleneck, adding more app servers won't help. Always identify the bottleneck first (CPU, memory, I/O, network).

  • Premature horizontal scaling — adding distributed system complexity before it's needed. A single well-optimized server handles more than most startups need.

  • Not making servers stateless before scaling horizontally — if servers store session data in memory, load balancers must use sticky sessions, which defeats the purpose of horizontal scaling.

  • Ignoring database scaling — app servers scale easily horizontally, but the database is usually the bottleneck. Plan for read replicas, caching, and eventually sharding.
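Since the database is the usual bottleneck, a common first step is read/write splitting: writes go to the primary, reads are spread over replicas. A minimal sketch, where the "connections" are plain strings standing in for real DB clients:

```javascript
// Read/write splitting sketch: writes hit the primary, reads round-robin over replicas.
// Connection objects are stand-ins; a real router would hold actual DB clients.
class ReadWriteRouter {
  constructor(primary, replicas) {
    this.primary = primary;
    this.replicas = replicas;
    this.next = 0;
  }
  route(sql) {
    const isRead = /^\s*select\b/i.test(sql); // naive read detection
    if (!isRead) return this.primary;
    const replica = this.replicas[this.next];
    this.next = (this.next + 1) % this.replicas.length;
    return replica;
  }
}

const router = new ReadWriteRouter('primary', ['replica-1', 'replica-2']);
console.log(router.route('SELECT * FROM users'));          // 'replica-1'
console.log(router.route('INSERT INTO users VALUES (1)')); // 'primary'
```

One caveat worth mentioning in an interview: replicas lag the primary, so reads that must see a just-written value (read-your-writes) may need to be routed to the primary anyway.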
