Horizontal vs Vertical Scaling


📖 Concept

Scaling is the ability to handle increased load. There are two fundamental approaches:

Vertical Scaling (Scale Up)

Add more resources to a single machine: more CPU, RAM, or faster SSDs.

| Pros | Cons |
| --- | --- |
| Simple — no code changes | Hardware limits (max ~128 cores, 4TB RAM) |
| No distributed system complexity | Single point of failure |
| Works for any application | Expensive at the high end |
| ACID transactions remain simple | Downtime during upgrades |

Horizontal Scaling (Scale Out)

Add more machines and distribute the load across them.

| Pros | Cons |
| --- | --- |
| Virtually unlimited scale | Application must be designed for distribution |
| Better fault tolerance (no SPOF) | Data consistency challenges |
| Commodity hardware (cheaper) | Network latency between nodes |
| Can scale independently per component | Operational complexity |
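The core mechanic of scale-out — spreading requests across identical machines — can be sketched with a round-robin dispatcher. This is a minimal illustration, not production code; the server names are placeholders:

```javascript
// Minimal round-robin load balancer sketch (illustrative only)
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers; // identical, interchangeable backends
    this.next = 0;
  }
  pick() {
    const server = this.servers[this.next];
    this.next = (this.next + 1) % this.servers.length; // cycle through backends
    return server;
  }
}

const lb = new RoundRobinBalancer(['app-1', 'app-2', 'app-3']);
console.log([1, 2, 3, 4].map(() => lb.pick())); // ['app-1', 'app-2', 'app-3', 'app-1']
```

Note that this only works if every backend can serve every request — which is why statelessness (covered in the code example below) is a prerequisite for horizontal scaling.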

When to Use Each

| Scenario | Strategy |
| --- | --- |
| Database at 80% CPU | Vertical first (cheaper, simpler) |
| 100K concurrent WebSocket users | Horizontal (memory-bound, need many servers) |
| Startup with < 10K users | Vertical (simplicity wins) |
| Global service with millions of users | Horizontal (must distribute geographically) |
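The table above can be condensed into a rough rule of thumb. The thresholds below are illustrative, not authoritative — real decisions depend on the actual bottleneck:

```javascript
// Rough scaling-strategy heuristic based on the scenarios above.
// Thresholds are illustrative assumptions, not hard rules.
function chooseStrategy({ users, needsGeoDistribution }) {
  if (needsGeoDistribution) return 'horizontal';  // must distribute geographically
  if (users < 10_000) return 'vertical';          // simplicity wins at small scale
  if (users > 1_000_000) return 'horizontal';     // beyond comfortable single-machine limits
  return 'vertical-first';                        // exhaust the cheap option, then scale out
}

console.log(chooseStrategy({ users: 5_000, needsGeoDistribution: false }));    // 'vertical'
console.log(chooseStrategy({ users: 5_000_000, needsGeoDistribution: true })); // 'horizontal'
```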

The Scaling Path

Most companies follow this progression:

  1. Single server (web + DB on one machine)
  2. Separate DB (web server + dedicated database server)
  3. Read replicas (one primary, multiple replicas for reads)
  4. Caching layer (Redis between app and DB)
  5. Load balancer + multiple app servers (horizontal scaling)
  6. Database sharding (horizontal DB scaling)
  7. Microservices (scale different components independently)
  8. Multi-region (globally distributed)

Interview tip: Always mention "I'd start simple and scale as needed" rather than immediately jumping to a complex distributed architecture for a new system.

💻 Code Example

```javascript
// ============================================
// Scaling Patterns — Practical Approaches
// ============================================

// ---------- Scaling Path Demonstration ----------

// Stage 1: Single Server
const stage1 = {
  architecture: 'Monolith on single server',
  capacity: '~1K req/sec, ~10K users',
  components: ['Web Server', 'App Logic', 'Database'],
  cost: '$50/month',
};

// Stage 2: Separate Web and DB servers
const stage2 = {
  architecture: 'Web server + dedicated DB',
  capacity: '~5K req/sec, ~50K users',
  components: ['Web Server (4 CPU, 16GB)', 'DB Server (8 CPU, 32GB)'],
  improvement: 'DB can be scaled independently, backups easier',
};

// Stage 3: Add caching and read replicas
const stage3 = {
  architecture: 'Web + Cache + DB Primary + Read Replicas',
  capacity: '~20K req/sec, ~200K users',
  components: ['Web Server', 'Redis Cache', 'DB Primary', '2x DB Replicas'],
  improvement: 'Cache absorbs 80% of reads, replicas handle the rest',
};

// Stage 4: Horizontal scaling with load balancer
const stage4 = {
  architecture: 'Load Balancer + N Web Servers + Cache + DB',
  capacity: '~100K req/sec, ~1M users',
  components: ['Load Balancer (Nginx/ALB)', '4x Web Servers', 'Redis Cluster', 'DB Primary + 3 Replicas'],
  improvement: 'Add/remove web servers based on traffic',
};

// Stage 5: Full distributed system
const stage5 = {
  architecture: 'Multi-region with CDN, sharded DB, microservices',
  capacity: '~1M+ req/sec, ~100M users',
  components: ['CDN', 'API Gateway', 'Microservices', 'Sharded DB', 'Kafka', 'Redis Cluster'],
};

// ---------- Stateless Application Server (Required for Horizontal Scaling) ----------

// ❌ BAD: Stateful server — can't scale horizontally
class StatefulServer {
  constructor() {
    this.sessions = {}; // In-memory state!
  }
  handleRequest(req) {
    // If the load balancer routes to a DIFFERENT server, the session is lost
    return this.sessions[req.sessionId];
  }
}

// ✅ GOOD: Stateless server — scales to any number of instances
class StatelessServer {
  constructor(redisClient) {
    this.redis = redisClient; // External state store
  }
  async handleRequest(req) {
    // Any server can handle any request
    const session = await this.redis.get(`session:${req.sessionId}`);
    return JSON.parse(session);
  }
}

// ---------- Connection Pooling (Critical for Scaling) ----------
const { Pool } = require('pg');

// ❌ BAD: New connection per request
async function badHandler(req) {
  const client = await new Pool().connect(); // connection setup overhead on every request
  const result = await client.query('SELECT 1');
  client.release();
  return result;
}

// ✅ GOOD: Shared connection pool
const pool = new Pool({
  max: 20,                       // cap concurrent DB connections
  idleTimeoutMillis: 30000,      // close idle connections after 30s
  connectionTimeoutMillis: 2000, // fail fast if the pool is exhausted
});

async function goodHandler(req) {
  const result = await pool.query('SELECT 1'); // reuses an existing connection
  return result;
}

console.log("Scaling stages:", [stage1, stage2, stage3, stage4, stage5].map(s => s.architecture));
```

🏋️ Practice Exercise

  1. Scaling Plan: Your app has 1K DAU and is growing 10x yearly. Design a 3-year scaling plan with specific milestones (10K, 100K, 1M users). What changes at each stage?

  2. Cost Analysis: Compare the cost of vertically scaling one server to 128 cores / 512GB RAM vs horizontally scaling to 16 x 8-core / 32GB RAM servers. Include operational costs.

  3. Stateless Refactor: Given a Node.js app storing user sessions and file uploads in memory, refactor it to be stateless so it can run on multiple servers behind a load balancer.

  4. Bottleneck Identification: Your system handles 50K req/sec but response times spike during peak hours. Database CPU is at 90%, app servers at 30%. Identify the bottleneck and design the scaling strategy.

⚠️ Common Mistakes

  • Scaling the wrong component — if the database is the bottleneck, adding more app servers won't help. Always identify the bottleneck first (CPU, memory, I/O, network).

  • Premature horizontal scaling — adding distributed system complexity before it's needed. A single well-optimized server handles more than most startups need.

  • Not making servers stateless before scaling horizontally — if servers store session data in memory, load balancers must use sticky sessions, which defeats the purpose of horizontal scaling.

  • Ignoring database scaling — app servers scale easily horizontally, but the database is usually the bottleneck. Plan for read replicas, caching, and eventually sharding.
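Since the database is the usual bottleneck, a common first step is read/write splitting: writes go to the primary, reads are spread over replicas. A minimal sketch, where the "connections" are plain strings standing in for real DB clients:

```javascript
// Read/write splitting sketch: writes hit the primary, reads round-robin over replicas.
// Connection objects are stand-ins; a real router would hold actual DB clients.
class ReadWriteRouter {
  constructor(primary, replicas) {
    this.primary = primary;
    this.replicas = replicas;
    this.next = 0;
  }
  route(sql) {
    const isRead = /^\s*select\b/i.test(sql); // naive read detection
    if (!isRead) return this.primary;
    const replica = this.replicas[this.next];
    this.next = (this.next + 1) % this.replicas.length;
    return replica;
  }
}

const router = new ReadWriteRouter('primary', ['replica-1', 'replica-2']);
console.log(router.route('SELECT * FROM users'));          // 'replica-1'
console.log(router.route('INSERT INTO users VALUES (1)')); // 'primary'
```

One caveat worth mentioning in an interview: replicas lag the primary, so reads that must see a just-written value (read-your-writes) may need to be routed to the primary anyway.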
