Horizontal vs Vertical Scaling
📖 Concept
Scalability is a system's ability to handle increased load. There are two fundamental ways to scale:
Vertical Scaling (Scale Up)
Add more resources to a single machine: more CPU, RAM, or faster SSDs.
| Pros | Cons |
|---|---|
| Simple: usually no code changes | Hard ceiling per machine (on the order of hundreds of cores and terabytes of RAM) |
| No distributed system complexity | Single point of failure |
| Works for any application | Expensive at high end |
| ACID transactions remain simple | Downtime during upgrade |
Horizontal Scaling (Scale Out)
Add more machines and distribute the load across them.
| Pros | Cons |
|---|---|
| Virtually unlimited scale | Application must be designed for distribution |
| Better fault tolerance (no single server is a SPOF) | Data consistency challenges |
| Commodity hardware (cheaper) | Network latency between nodes |
| Can scale independently per component | Operational complexity |
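To make "distribute the load across them" concrete, here is a minimal round-robin sketch. It is an illustration under assumptions, not how any particular load balancer is implemented: the server names and the manual `setHealthy` call are hypothetical, and a real balancer would rely on active health checks.

```javascript
// Minimal round-robin balancer: a sketch, not a production load balancer.
// Server names and the `healthy` flag are hypothetical, for illustration only.
class RoundRobinBalancer {
  constructor(servers) {
    this.servers = servers.map((name) => ({ name, healthy: true }));
    this.next = 0; // index of the next server to try
  }

  // Mark a server up or down (a real balancer would use health checks).
  setHealthy(name, healthy) {
    const server = this.servers.find((s) => s.name === name);
    if (server) server.healthy = healthy;
  }

  // Return the next healthy server, skipping any that are down.
  pick() {
    for (let i = 0; i < this.servers.length; i++) {
      const server = this.servers[this.next];
      this.next = (this.next + 1) % this.servers.length;
      if (server.healthy) return server.name;
    }
    throw new Error('No healthy servers available');
  }
}

const balancer = new RoundRobinBalancer(['app-1', 'app-2', 'app-3']);
const beforeFailure = [balancer.pick(), balancer.pick(), balancer.pick()];

balancer.setHealthy('app-2', false); // simulate one node failing
const afterFailure = [balancer.pick(), balancer.pick(), balancer.pick()];

console.log(beforeFailure); // [ 'app-1', 'app-2', 'app-3' ]
console.log(afterFailure);  // [ 'app-1', 'app-3', 'app-1' ]
```

The second result shows the fault-tolerance benefit from the table: when one node is down, traffic keeps flowing to the remaining ones.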
When to Use Each
| Scenario | Strategy |
|---|---|
| Database at 80% CPU | Vertical first (cheaper, simpler) |
| 100K concurrent WebSocket users | Horizontal (memory-bound, need many servers) |
| Startup with < 10K users | Vertical (simplicity wins) |
| Global service with millions of users | Horizontal (must distribute geographically) |
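A rough sizing calculation can help decide between the two strategies. The sketch below is back-of-the-envelope arithmetic only: the function name, the 20,000 connections per server, the 70% target utilization, and the single spare node are all assumed inputs rather than measurements.

```javascript
// Back-of-the-envelope fleet sizing. Every number here is an assumed input,
// not a benchmark: measure your own workload before relying on it.
function serversNeeded({ peakLoad, capacityPerServer, targetUtilization = 0.7, spare = 1 }) {
  // Plan to run each server at only a fraction of its measured capacity.
  const usable = capacityPerServer * targetUtilization;
  // Round up, then add spare nodes so losing one server is survivable.
  return Math.ceil(peakLoad / usable) + spare;
}

const websocketFleet = serversNeeded({
  peakLoad: 100_000,         // concurrent connections (from the table above)
  capacityPerServer: 20_000, // ASSUMED connections a single server can hold
});

console.log(websocketFleet); // 9
```

If the same calculation returns 1 or 2 for your workload, a bigger single machine is probably the simpler answer.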
The Scaling Path
Most companies follow this progression:
- Single server (web + DB on one machine)
- Separate DB (web server + dedicated database server)
- Read replicas (one primary, multiple replicas for reads)
- Caching layer (Redis between app and DB)
- Load balancer + multiple app servers (horizontal scaling)
- Database sharding (horizontal DB scaling)
- Microservices (scale different components independently)
- Multi-region (globally distributed)
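The caching-layer step in the progression above is commonly implemented with the cache-aside pattern. The following is a minimal sketch under stated assumptions: a `Map` stands in for Redis, and `createCachedReader` and the fake loader are hypothetical names invented for this illustration.

```javascript
// Cache-aside sketch. A Map stands in for Redis; `loadFromDb` is a
// hypothetical stand-in for a real database query.
function createCachedReader(loadFromDb, ttlMs = 60_000) {
  const cache = new Map();

  return async function read(key) {
    const hit = cache.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit

    const value = await loadFromDb(key); // cache miss: go to the database
    cache.set(key, { value, expiresAt: Date.now() + ttlMs });
    return value;
  };
}

let dbReads = 0; // counts how often the "database" is actually queried
const getUser = createCachedReader(async (id) => {
  dbReads += 1;
  return { id, name: `user-${id}` };
});

async function demo() {
  await getUser(42); // miss: reads the database
  await getUser(42); // hit: served from the cache
  await getUser(7);  // miss: different key
  return dbReads;
}

demo().then((reads) => console.log('database reads:', reads)); // database reads: 2
```

Three lookups cost only two database reads here. A real deployment also has to decide how to invalidate entries when the underlying row changes, which this sketch leaves to the TTL.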
Interview tip: Always mention "I'd start simple and scale as needed" rather than immediately jumping to a complex distributed architecture for a new system.
💻 Code Example
```javascript
// ============================================
// Scaling Patterns: Practical Approaches
// ============================================

// ---------- Scaling Path Demonstration ----------

// Stage 1: Single Server
const stage1 = {
  architecture: 'Monolith on single server',
  capacity: '~1K req/sec, ~10K users',
  components: ['Web Server', 'App Logic', 'Database'],
  cost: '$50/month',
};

// Stage 2: Separate Web and DB servers
const stage2 = {
  architecture: 'Web server + dedicated DB',
  capacity: '~5K req/sec, ~50K users',
  components: ['Web Server (4 CPU, 16GB)', 'DB Server (8 CPU, 32GB)'],
  improvement: 'DB can be scaled independently, backups easier',
};

// Stage 3: Add caching and read replicas
const stage3 = {
  architecture: 'Web + Cache + DB Primary + Read Replicas',
  capacity: '~20K req/sec, ~200K users',
  components: ['Web Server', 'Redis Cache', 'DB Primary', '2x DB Replicas'],
  improvement: 'Cache absorbs 80% of reads, replicas handle the rest',
};

// Stage 4: Horizontal scaling with load balancer
const stage4 = {
  architecture: 'Load Balancer + N Web Servers + Cache + DB',
  capacity: '~100K req/sec, ~1M users',
  components: ['Load Balancer (Nginx/ALB)', '4x Web Servers', 'Redis Cluster', 'DB Primary + 3 Replicas'],
  improvement: 'Add/remove web servers based on traffic',
};

// Stage 5: Full distributed system
const stage5 = {
  architecture: 'Multi-region with CDN, sharded DB, microservices',
  capacity: '~1M+ req/sec, ~100M users',
  components: ['CDN', 'API Gateway', 'Microservices', 'Sharded DB', 'Kafka', 'Redis Cluster'],
};

// ---------- Stateless Application Server (Required for Horizontal Scaling) ----------

// ❌ BAD: Stateful server, can't scale horizontally
class StatefulServer {
  constructor() {
    this.sessions = {}; // In-memory state!
  }
  handleRequest(req) {
    // If the load balancer routes to a DIFFERENT server, the session is lost
    return this.sessions[req.sessionId];
  }
}

// ✅ GOOD: Stateless server, scales to any number of instances
class StatelessServer {
  constructor(redisClient) {
    this.redis = redisClient; // External state store
  }
  async handleRequest(req) {
    // Any server can handle any request
    const session = await this.redis.get(`session:${req.sessionId}`);
    return JSON.parse(session); // null if the session does not exist
  }
}

// ---------- Connection Pooling (Critical for Scaling) ----------
const { Pool } = require('pg');

// ❌ BAD: New pool (and new connection) per request
async function badHandler(req) {
  // Pays the full connection handshake every time, and the pool is never shut down
  const client = await new Pool().connect();
  const result = await client.query('SELECT 1');
  client.release();
  return result;
}

// ✅ GOOD: Shared connection pool, created once at startup
const pool = new Pool({
  max: 20,                       // upper bound on open connections
  idleTimeoutMillis: 30000,      // close connections idle for 30s
  connectionTimeoutMillis: 2000, // fail fast if no connection is available
});

async function goodHandler(req) {
  const result = await pool.query('SELECT 1'); // Reuses an existing connection
  return result;
}

console.log("Scaling stages:", [stage1, stage2, stage3, stage4, stage5].map(s => s.architecture));
```
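The database sharding step from the scaling path can be sketched in a few lines as well. This is one simple approach (hash the key, take it modulo the shard count), shown as an illustration only: the shard names and `shardFor` are hypothetical.

```javascript
const { createHash } = require('crypto');

// Hash-based shard routing sketch. Shard names are hypothetical.
const shards = ['users-db-0', 'users-db-1', 'users-db-2', 'users-db-3'];

function shardFor(userId) {
  // Hash the key so that sequential IDs spread across shards.
  const digest = createHash('sha256').update(String(userId)).digest();
  const bucket = digest.readUInt32BE(0) % shards.length;
  return shards[bucket];
}

// The same key always maps to the same shard.
console.log(shardFor('user-1001') === shardFor('user-1001')); // true

// Tally how 1,000 sequential IDs spread across the shards.
const tally = {};
for (let i = 0; i < 1000; i++) {
  const shard = shardFor(`user-${i}`);
  tally[shard] = (tally[shard] || 0) + 1;
}
console.log(tally);
```

One caveat with plain modulo routing: changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often preferred once resharding becomes likely.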
🏋️ Practice Exercise
Scaling Plan: Your app has 1K DAU and is growing 10x yearly. Design a 3-year scaling plan with specific milestones (10K, 100K, 1M users). What changes at each stage?
Cost Analysis: Compare the cost of vertically scaling one server to 128 cores / 512GB RAM vs horizontally scaling to 16 x 8-core / 32GB RAM servers. Include operational costs.
Stateless Refactor: Given a Node.js app storing user sessions and file uploads in memory, refactor it to be stateless so it can run on multiple servers behind a load balancer.
Bottleneck Identification: Your system handles 50K req/sec but response times spike during peak hours. Database CPU is at 90%, app servers at 30%. Identify the bottleneck and design the scaling strategy.
⚠️ Common Mistakes
Scaling the wrong component: if the database is the bottleneck, adding more app servers won't help. Measure first to find the saturated resource (CPU, memory, disk I/O, or network), then scale that component.
Premature horizontal scaling: adding distributed-system complexity before it's needed. A single well-optimized server often handles more load than an early-stage startup generates.
Not making servers stateless before scaling horizontally: if servers keep session data in memory, the load balancer must pin each user to one server (sticky sessions). That leads to uneven load and drops sessions whenever a server fails or is removed.
Ignoring database scaling: stateless app servers are easy to scale horizontally, so the database usually becomes the bottleneck. Plan for caching, read replicas, and eventually sharding.
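Identifying the bottleneck before scaling can start with something as simple as ranking utilization readings. The sketch below is hypothetical: the metric names, the memory figures, and the 80% threshold are assumptions for illustration, while the 90% database CPU and 30% app-server CPU mirror the practice exercise above.

```javascript
// Bottleneck triage sketch. Metric names and the 80% threshold are
// assumptions for illustration; real systems need per-resource thresholds.
function findBottleneck(metrics, threshold = 0.8) {
  // Flatten { component: { resource: utilization } } into one list.
  const readings = Object.entries(metrics).flatMap(([component, resources]) =>
    Object.entries(resources).map(([resource, utilization]) => ({
      component,
      resource,
      utilization,
    }))
  );

  // Highest utilization first.
  readings.sort((a, b) => b.utilization - a.utilization);
  const worst = readings[0];

  // Report a bottleneck only if something is actually saturated.
  return worst && worst.utilization >= threshold ? worst : null;
}

const peakHour = {
  appServers: { cpu: 0.30, memory: 0.45 }, // memory values are assumed
  database: { cpu: 0.90, memory: 0.60 },
};

console.log(findBottleneck(peakHour));
// { component: 'database', resource: 'cpu', utilization: 0.9 }
```

With these readings, adding app servers would change nothing. The database is the component to scale, for example with caching or read replicas.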