Back-of-Envelope Estimation


📖 Concept

Back-of-envelope estimation is the skill of quickly calculating approximate system capacity requirements — and it's a critical part of system design interviews. You don't need exact numbers; you need to be within an order of magnitude.

Essential Numbers to Memorize

| Metric | Value |
| --- | --- |
| Seconds in a day | ~86,400 (~100K for easy math) |
| Seconds in a month | ~2.5 million |
| 1 million seconds | ~12 days |
| 1 billion seconds | ~31 years |
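
As a sketch of how these numbers get used (function names are illustrative), the ~100K-seconds-per-day approximation turns a daily volume into QPS in one step:

```javascript
// Exact conversion: divide a daily count by 86,400 seconds.
function dailyToQPS(requestsPerDay) {
  return requestsPerDay / 86_400;
}

// Quick interview math: divide by 100,000 instead.
function dailyToQPSQuick(requestsPerDay) {
  return requestsPerDay / 100_000;
}

// 1B requests/day: exact ≈ 11,574 QPS, quick math says "about 10K QPS" --
// same order of magnitude, which is all an estimate needs.
```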

Storage & Memory

| Unit | Value |
| --- | --- |
| 1 char (ASCII) | 1 byte |
| 1 char (UTF-8) | 1-4 bytes |
| Average tweet | ~300 bytes |
| Average image (compressed) | ~300 KB |
| HD video (1 minute) | ~100 MB |
| 1 million rows (avg 1 KB each) | ~1 GB |
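
Two of these rows are worth sanity-checking once, since they anchor most storage estimates (a quick sketch, using the table's own numbers):

```javascript
// "1 million rows at ~1 KB each ≈ 1 GB"
const rowSize = 1_000;                    // ~1 KB per row
const millionRows = 1_000_000 * rowSize;  // = 1e9 bytes ≈ 1 GB

// One compressed image vs. one tweet of text:
const tweetSize = 300;                    // ~300 bytes
const imageSize = 300_000;                // ~300 KB
const ratio = imageSize / tweetSize;      // one image ≈ 1,000 tweets of text
```

That 1,000x gap is why media, not text, dominates storage estimates.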

Latency Numbers

| Operation | Time |
| --- | --- |
| L1 cache reference | 0.5 ns |
| L2 cache reference | 7 ns |
| RAM access | 100 ns |
| SSD random read | 150 µs |
| HDD random read | 10 ms |
| Network round-trip (same datacenter) | 0.5 ms |
| Network round-trip (cross-continent) | 150 ms |
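
These numbers matter most as ratios. A quick sketch of two ratios worth internalizing, derived from the table above:

```javascript
// How many RAM reads fit inside one same-datacenter round trip?
const ramAccessNs = 100;                // ~100 ns per RAM access
const sameDcRoundTripNs = 0.5e6;        // 0.5 ms = 500,000 ns
const ramReadsPerRoundTrip = sameDcRoundTripNs / ramAccessNs; // = 5,000

// How much slower is an HDD random read than an SSD random read?
const ssdReadNs = 150_000;              // 150 µs
const hddReadNs = 10_000_000;           // 10 ms
const hddPenalty = hddReadNs / ssdReadNs; // ≈ 67x slower
```

The first ratio is why batching requests matters; the second is why random-read-heavy workloads moved off spinning disks.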

Throughput Rules of Thumb

| System | Throughput |
| --- | --- |
| Single web server | 1K-10K req/sec |
| MySQL (single node) | 1K-5K queries/sec |
| Redis | 100K-500K ops/sec |
| Kafka (single broker) | 1M messages/sec |
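
These rules of thumb translate directly into node counts. A sketch (the 70% utilization headroom is an assumed planning convention, not a hard rule):

```javascript
// Rough node count for a target QPS, leaving headroom so nodes
// aren't planned to run at 100% of their rule-of-thumb capacity.
function nodesFor(targetQPS, perNodeQPS, utilization = 0.7) {
  return Math.ceil(targetQPS / (perNodeQPS * utilization));
}

// 100K read QPS against MySQL at ~5K queries/sec per node:
// nodesFor(100_000, 5_000) -> 29 nodes -- a strong hint that you
// want a cache in front, not 29 database servers.
```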

The 2-Step Estimation Process

  1. Start with daily active users (DAU) → derive requests per second
  2. Multiply by data per request → derive storage, bandwidth, and memory needs
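
The two steps above can be sketched as a single reusable helper (the function and parameter names are illustrative):

```javascript
// Step 1: DAU * requests/user/day -> average requests per second.
// Step 2: multiply by bytes per request -> daily storage/bandwidth.
function estimate({ dau, requestsPerUserPerDay, bytesPerRequest }) {
  const secondsPerDay = 86_400;
  const requestsPerDay = dau * requestsPerUserPerDay;
  const avgQPS = requestsPerDay / secondsPerDay;
  const dailyBytes = requestsPerDay * bytesPerRequest;
  return { avgQPS, dailyBytes };
}

// 100M DAU, 10 requests/user/day, 1 KB per request:
// avgQPS ≈ 11,574 and dailyBytes = 1e12 (1 TB/day)
```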

Power of 2 Table

| Power | Exact | Approx |
| --- | --- | --- |
| 2^10 | 1,024 | 1 Thousand (KB) |
| 2^20 | 1,048,576 | 1 Million (MB) |
| 2^30 | 1,073,741,824 | 1 Billion (GB) |
| 2^40 | 1,099,511,627,776 | 1 Trillion (TB) |
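
A small illustrative helper that walks this table, mapping a raw byte count to its nearest power-of-2 unit:

```javascript
// Approximate a byte count as the nearest power-of-2 unit.
function humanize(bytes) {
  const units = ["B", "KB", "MB", "GB", "TB"];
  let i = 0;
  while (bytes >= 1024 && i < units.length - 1) {
    bytes /= 1024;
    i += 1;
  }
  return `${Math.round(bytes)} ${units[i]}`;
}

// humanize(2 ** 30) -> "1 GB"; humanize(5 * 2 ** 40) -> "5 TB"
```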

Pro tip: In interviews, always round aggressively. Say "about 100K QPS" not "exactly 115,740.74 QPS." Interviewers want to see your approach, not your arithmetic.

💻 Code Example

// ============================================
// Back-of-Envelope Estimation — Practice Examples
// ============================================

// ---------- Example 1: Twitter-like Feed Service ----------

function estimateTwitterScale() {
  // Step 1: Users
  const monthlyActiveUsers = 500_000_000; // 500M MAU
  const dailyActiveUsers = monthlyActiveUsers * 0.4; // 200M DAU (40% of MAU)

  // Step 2: Activity per user
  const tweetsPerUserPerDay = 2; // Average (most users read, few tweet)
  const feedRefreshesPerUserPerDay = 20; // Users scroll feed often
  const followsPerUser = 200; // Average followings

  // Step 3: QPS (Queries Per Second)
  const secondsInDay = 86_400; // Round to ~100K for quick math
  const tweetWriteQPS = (dailyActiveUsers * tweetsPerUserPerDay) / secondsInDay;
  // = (200M * 2) / 86400 ≈ 4,630 writes/sec

  const feedReadQPS = (dailyActiveUsers * feedRefreshesPerUserPerDay) / secondsInDay;
  // = (200M * 20) / 86400 ≈ 46,300 reads/sec

  const peakMultiplier = 3; // Peak hours = ~3x average
  const peakReadQPS = feedReadQPS * peakMultiplier;
  // ≈ 139,000 reads/sec at peak

  // Step 4: Storage
  const avgTweetSize = 300; // bytes (280 chars + metadata)
  const tweetsPerDay = dailyActiveUsers * tweetsPerUserPerDay;
  // = 400M tweets/day

  const dailyStorage = tweetsPerDay * avgTweetSize;
  // = 400M * 300 bytes = 120 GB/day

  const yearlyStorage = dailyStorage * 365;
  // = 120 GB * 365 ≈ 43.8 TB/year (text only, no media)

  // Step 5: Media storage (images/videos)
  const percentWithMedia = 0.2; // 20% of tweets have media
  const avgMediaSize = 500_000; // 500 KB average
  const dailyMediaStorage = tweetsPerDay * percentWithMedia * avgMediaSize;
  // = 400M * 0.2 * 500KB = 40 TB/day ← THIS is why CDNs exist!

  return {
    tweetWriteQPS: Math.round(tweetWriteQPS),
    feedReadQPS: Math.round(feedReadQPS),
    peakReadQPS: Math.round(peakReadQPS),
    dailyStorageGB: Math.round(dailyStorage / 1e9),
    yearlyStorageTB: Math.round(yearlyStorage / 1e12),
    dailyMediaStorageTB: Math.round(dailyMediaStorage / 1e12),
  };
}

// ---------- Example 2: Estimating Cache Size ----------

function estimateCacheRequirements() {
  // Rule: Cache the hottest 20% of data (Pareto principle: 80/20 rule)

  const totalDailyRequests = 10_000_000_000; // 10B requests/day
  const uniqueURLs = 1_000_000_000; // 1B unique URLs
  const avgResponseSize = 500; // 500 bytes

  // 20% of URLs serve 80% of traffic
  const hotURLs = uniqueURLs * 0.2; // 200M URLs to cache
  const cacheSize = hotURLs * avgResponseSize;
  // = 200M * 500 bytes = 100 GB ← fits in a single Redis cluster!

  // How many Redis nodes?
  const redisMemoryPerNode = 64 * 1e9; // 64 GB per node
  const nodesNeeded = Math.ceil(cacheSize / redisMemoryPerNode);
  // = ceil(100GB / 64GB) = 2 nodes (with replication: 4-6 nodes)

  return {
    hotURLs,
    cacheSizeGB: Math.round(cacheSize / 1e9),
    redisNodes: nodesNeeded,
    withReplication: nodesNeeded * 3, // 3x for replication factor
  };
}

// ---------- Example 3: Bandwidth Estimation ----------

function estimateBandwidth() {
  // Video streaming service like Netflix
  const concurrentViewers = 10_000_000; // 10M concurrent
  const avgBytesPerSecond = 625_000; // 5 Mbps = ~625 KB/s per stream
  const totalBandwidth = concurrentViewers * avgBytesPerSecond;
  // = 10M * 625 KB/s = 6.25 TB/s ← This is why Netflix uses CDNs globally

  // Convert to Gbps (8 bits per byte)
  const bandwidthGbps = (totalBandwidth * 8) / 1e9;
  // = 50,000 Gbps = 50 Tbps

  return {
    totalBandwidthTBps: totalBandwidth / 1e12, // 6.25 TB/s
    bandwidthGbps: Math.round(bandwidthGbps),
  };
}

console.log("Twitter estimates:", estimateTwitterScale());
console.log("Cache estimates:", estimateCacheRequirements());
console.log("Bandwidth estimates:", estimateBandwidth());

🏋ïļ Practice Exercise

  1. YouTube Storage: Estimate how much storage YouTube needs per day if 500 hours of video are uploaded every minute. Assume average quality of 720p at 2.5 Mbps.

  2. WhatsApp Message Volume: With 2 billion users and 100 billion messages per day, calculate the QPS for message delivery. What's the peak QPS assuming a 3x peak factor?

  3. Instagram Feed Cache: Instagram has 500M DAU who refresh their feed 10 times/day. Each feed contains 20 posts at 2KB metadata each. How much cache memory do you need for a 24-hour window?

  4. Uber Driver Matching: If Uber has 5M active drivers sending GPS updates every 4 seconds, calculate the QPS for location updates and the daily storage if each update is 100 bytes.

  5. Email Service Scale: Design capacity estimates for a Gmail-like service with 1.8B users, 300B emails/day, avg 50KB per email. Calculate daily storage, QPS, and how many servers you need.

  6. Practice Quick Math: Convert these without a calculator: (a) 500M / 86400, (b) 1TB in GB, (c) 1M * 4KB in GB, (d) 50K req/sec * 1KB per request in MB/sec.
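
To show the expected level of precision, here is a sketch of one way to check exercise 2 (the WhatsApp numbers), using the exercise's own assumptions:

```javascript
// Exercise 2: 100B messages/day -> average QPS, then 3x for peak.
const messagesPerDay = 100_000_000_000; // 100B
const avgQPS = messagesPerDay / 86_400; // ≈ 1.16M messages/sec
const peakQPS = avgQPS * 3;             // ≈ 3.5M messages/sec

// Quick-math version: 100B / 100K ≈ 1M QPS -- same order of magnitude.
```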

⚠ïļ Common Mistakes

  • Getting bogged down in exact arithmetic — interviewers want to see your thought process and order-of-magnitude reasoning, not precise calculations. Round aggressively (86,400 → ~100K).

  • Forgetting peak traffic — average QPS is meaningless without accounting for peak hours. Always multiply by 2-3x for peak traffic and design for the peak, not the average.

  • Not considering data growth over time — if you need 10TB today, you'll need 30-50TB in 3 years. Always project 3-5 years ahead for storage and capacity planning.

  • Ignoring media/attachments — text data is tiny compared to images and videos. A chat app's storage is 90%+ media, not messages. Always ask 'does this system handle media?'
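
The growth-projection mistake is easy to quantify. A minimal sketch, where the 40% annual growth rate is an assumed figure for illustration, not a universal constant:

```javascript
// Compound a storage estimate forward by an assumed annual growth rate.
function projectStorageTB(currentTB, annualGrowth, years) {
  return currentTB * (1 + annualGrowth) ** years;
}

// 10 TB today at 40% yearly growth:
// year 3 ≈ 27 TB, year 5 ≈ 54 TB -- roughly the 30-50 TB range
// the bullet above warns about.
```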
