Design: Real-Time Chat Application


📖 Concept

System Design: Real-Time Chat App for React Native

Requirement Clarification:

  • 1:1 and group messaging (up to 500 members)
  • Real-time message delivery (<100ms for online users)
  • Offline support (read/compose while offline, sync when online)
  • Media attachments (images, videos, files)
  • Read receipts and typing indicators
  • Push notifications for offline users
  • End-to-end encryption (optional, for premium users)
  • Scale: 10M DAU, 1B messages/day
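
To make the scale figures concrete, here is a back-of-envelope sketch. The DAU and daily message volume come from the requirements above; the peak factor, the share of users connected at peak, and the connections-per-gateway figure are illustrative assumptions, not measured values.

```typescript
// Back-of-envelope capacity estimate for the stated scale.
// Inputs marked ASSUMPTION are illustrative and not part of the requirements.
const DAILY_ACTIVE_USERS = 10_000_000;   // from requirements
const MESSAGES_PER_DAY = 1_000_000_000;  // from requirements
const SECONDS_PER_DAY = 86_400;

const PEAK_FACTOR = 3;                   // ASSUMPTION: peak traffic is 3x the average
const CONCURRENT_PERCENT = 10;           // ASSUMPTION: 10% of DAU connected at peak
const CONNECTIONS_PER_GATEWAY = 50_000;  // ASSUMPTION: sockets held by one gateway node

const avgMessagesPerSecond = Math.round(MESSAGES_PER_DAY / SECONDS_PER_DAY);
const peakMessagesPerSecond = avgMessagesPerSecond * PEAK_FACTOR;
const messagesPerUserPerDay = MESSAGES_PER_DAY / DAILY_ACTIVE_USERS;
const peakConnections = (DAILY_ACTIVE_USERS * CONCURRENT_PERCENT) / 100;
const gatewayNodes = Math.ceil(peakConnections / CONNECTIONS_PER_GATEWAY);

console.log({
  avgMessagesPerSecond,   // about 11.6k messages per second on average
  peakMessagesPerSecond,
  messagesPerUserPerDay,  // 100 messages per active user per day
  peakConnections,
  gatewayNodes,
});
```

These numbers count messages sent. Group chats multiply the delivery work, since one message to a 500-member group fans out to as many as 499 recipients.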

High-Level Architecture:

React Native Client
  ├── UI Layer (Screens, Components)
  ├── Chat Engine
  │   ├── Message Queue (outgoing)
  │   ├── WebSocket Manager (real-time)
  │   ├── Sync Engine (offline reconciliation)
  │   └── Encryption Layer (E2E optional)
  ├── Local Storage (SQLite/WatermelonDB)
  └── Media Manager (upload/download/cache)

Backend
  ├── WebSocket Gateway (connection mgmt)
  ├── Message Service (routing, storage)
  ├── Presence Service (online/offline/typing)
  ├── Push Notification Service
  ├── Media Service (upload, CDN)
  └── Storage (Cassandra for messages, Redis for presence)

Key Design Decisions:

  1. Message Delivery: WebSocket + Fallback

    • Primary: Persistent WebSocket for real-time delivery
    • Fallback: Long-polling when WebSocket isn't available
    • Offline: Queue messages locally, push on reconnect
  2. Data Model — Chat Messages:

    • Messages stored locally in SQLite (WatermelonDB for RN)
    • Server stores in append-only log (Cassandra)
    • Each message has: id, chatId, senderId, content, type, status, timestamp, localId
    • Message status: sending → sent → delivered → read
  3. Offline Strategy:

    • Local-first: all reads from local DB
    • Writes queued locally and retried with exponential backoff
    • On reconnect: pull missed messages via cursor-based sync
    • Conflict resolution: the server timestamp is the source of truth for message ordering
  4. Trade-offs:

    • WebSocket vs HTTP polling: WS = real-time but complex reconnection; polling = simpler but higher latency
    • SQLite vs AsyncStorage: SQLite = better querying and indexing; AsyncStorage = simpler for small key-value data
    • E2E encryption: adds complexity (key management, no server-side search) but critical for privacy
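
The message fields and status flow from decision 2 can be sketched as types. This is a minimal sketch, not a definitive schema: the members of `MessageType`, the nullable `id`, and the `advanceStatus` helper are assumptions added for illustration; the source names only the fields and the status order.

```typescript
type MessageStatus = 'sending' | 'sent' | 'delivered' | 'read';
type MessageType = 'text' | 'image' | 'video' | 'file'; // ASSUMPTION: example values

interface ChatMessage {
  id: string | null;   // server-assigned; null until the server acknowledges (ASSUMPTION)
  localId: string;     // client-generated, used for deduplication
  chatId: string;
  senderId: string;
  content: string;
  type: MessageType;
  status: MessageStatus;
  timestamp: number;   // epoch milliseconds
}

// Status only moves forward: sending → sent → delivered → read.
const STATUS_ORDER: MessageStatus[] = ['sending', 'sent', 'delivered', 'read'];

// Returns the later of the two statuses, so a late or out-of-order
// acknowledgement can never move a message backwards.
function advanceStatus(current: MessageStatus, incoming: MessageStatus): MessageStatus {
  return STATUS_ORDER.indexOf(incoming) > STATUS_ORDER.indexOf(current) ? incoming : current;
}

const draft: ChatMessage = {
  id: null,
  localId: 'local-1',
  chatId: 'chat-1',
  senderId: 'user-1',
  content: 'hello',
  type: 'text',
  status: 'sending',
  timestamp: Date.now(),
};

// A read receipt arrives before the delivery acknowledgement.
const afterRead: ChatMessage = { ...draft, status: advanceStatus(draft.status, 'read') };
const afterLateDelivered = advanceStatus(afterRead.status, 'delivered');
console.log(afterRead.status, afterLateDelivered);
```

Treating status as monotonic keeps the UI stable when acknowledgements arrive out of order, which is common after a reconnect.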

💻 Code Example

// === CHAT SYSTEM DESIGN: KEY COMPONENTS ===

// 1. WebSocket Manager with auto-reconnect
class WebSocketManager {
  private ws: WebSocket | null = null;
  private url = '';
  private token = '';
  private reconnectAttempts = 0;
  private maxReconnectAttempts = 10;
  private messageQueue: QueuedMessage[] = [];

  connect(url: string, token: string) {
    // Keep the endpoint and token so scheduleReconnect() can reuse them
    this.url = url;
    this.token = token;
    this.ws = new WebSocket(`${url}?token=${encodeURIComponent(token)}`);

    this.ws.onopen = () => {
      this.reconnectAttempts = 0;
      this.flushQueue(); // Send messages queued while disconnected
    };

    this.ws.onmessage = (event) => {
      const message = JSON.parse(event.data);
      this.handleServerMessage(message);
    };

    this.ws.onclose = (event) => {
      // Reconnect only after an abnormal close (network drop, server crash)
      if (!event.wasClean) {
        this.scheduleReconnect();
      }
    };
  }

  private scheduleReconnect() {
    if (this.reconnectAttempts >= this.maxReconnectAttempts) return;

    const delay = Math.min(
      1000 * Math.pow(2, this.reconnectAttempts), // Exponential backoff
      30000 // Max 30 seconds
    );

    this.reconnectAttempts++;
    setTimeout(() => this.connect(this.url, this.token), delay);
  }

  send(message: OutgoingMessage) {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message));
    } else {
      // Not connected: hold the message until the socket reopens
      this.messageQueue.push({ ...message, queuedAt: Date.now() });
    }
  }

  // Drain the offline queue in FIFO order once the socket is open
  private flushQueue() {
    const queued = this.messageQueue;
    this.messageQueue = [];
    for (const { queuedAt, ...message } of queued) {
      this.send(message as OutgoingMessage);
    }
  }

  private handleServerMessage(message: ServerMessage) {
    switch (message.type) {
      case 'MESSAGE':
        chatStore.receiveMessage(message.payload);
        break;
      case 'TYPING':
        chatStore.setTyping(message.payload.chatId, message.payload.userId);
        break;
      case 'READ_RECEIPT':
        chatStore.markRead(message.payload.chatId, message.payload.messageId);
        break;
      case 'PRESENCE':
        presenceStore.update(message.payload);
        break;
    }
  }
}

// 2. Message sync engine
class SyncEngine {
  async syncChat(chatId: string) {
    const PAGE_SIZE = 100;

    // Pull missed messages page by page until the server has no more
    let cursor = await localDB.getLastSyncCursor(chatId);
    while (true) {
      const page = await api.getMessages({ chatId, after: cursor, limit: PAGE_SIZE });

      // Upsert server messages into local DB, then persist the new cursor
      await localDB.upsertMessages(chatId, page.messages);
      await localDB.setSyncCursor(chatId, page.cursor);
      cursor = page.cursor;

      if (page.messages.length < PAGE_SIZE) break; // Last page reached
    }

    // Push pending local messages one at a time to preserve send order
    const pending = await localDB.getPendingMessages(chatId);
    for (const msg of pending) {
      try {
        const serverMsg = await api.sendMessage(msg);
        await localDB.updateMessageStatus(msg.localId, 'sent', serverMsg.id);
      } catch (error) {
        if ((error as { status?: number }).status === 409) {
          // Duplicate: the server already has this message, just update local
          await localDB.updateMessageStatus(msg.localId, 'sent');
        }
        // Other errors: keep in pending queue for next sync
      }
    }
  }
}

// 3. Optimistic message sending
import { useCallback } from 'react';

function useSendMessage(chatId: string) {
  return useCallback(async (content: string) => {
    const localId = generateUUID();
    const optimisticMessage = {
      localId,
      chatId,
      content,
      senderId: currentUserId,
      status: 'sending',
      timestamp: Date.now(),
    };

    // Show immediately in UI
    chatStore.addOptimisticMessage(optimisticMessage);

    try {
      // Send via WebSocket for speed
      wsManager.send({ type: 'MESSAGE', payload: optimisticMessage });

      // Also hit REST API for reliability. The message now travels over both
      // paths, so the server must deduplicate by localId.
      const serverMsg = await api.sendMessage(optimisticMessage);
      chatStore.confirmMessage(localId, serverMsg.id);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      chatStore.failMessage(localId, reason);
    }
  }, [chatId]);
}
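
The sync engine above calls `localDB.upsertMessages` without showing what it does. Below is a minimal in-memory sketch of one way such an upsert might reconcile server messages with optimistic local ones. `StoredMessage` and `mergeMessages` are hypothetical names, and the sketch assumes the server echoes each message's `localId`; a real implementation would run the same logic as indexed queries against SQLite/WatermelonDB.

```typescript
interface StoredMessage {
  id?: string;        // server id, absent while the message is still pending
  localId: string;    // client-generated id (ASSUMPTION: echoed back by the server)
  content: string;
  timestamp: number;  // server timestamp once confirmed, client clock before that
  status: 'sending' | 'sent' | 'delivered' | 'read';
}

// Merge server messages into the local list.
// - A server message whose localId matches a local row replaces that row,
//   which confirms an optimistic message without duplicating it.
// - The result is ordered by timestamp, so the server's value decides the
//   final position of every confirmed message.
function mergeMessages(local: StoredMessage[], fromServer: StoredMessage[]): StoredMessage[] {
  const byLocalId = new Map<string, StoredMessage>();
  for (const msg of local) byLocalId.set(msg.localId, msg);
  for (const msg of fromServer) byLocalId.set(msg.localId, msg);
  return Array.from(byLocalId.values()).sort((a, b) => a.timestamp - b.timestamp);
}

// One optimistic message is pending locally; the server returns its
// confirmed copy plus a message from another user that arrived earlier.
const localRows: StoredMessage[] = [
  { localId: 'a', content: 'hi', timestamp: 1000, status: 'sending' },
];
const serverRows: StoredMessage[] = [
  { id: 's2', localId: 'b', content: 'hello', timestamp: 900, status: 'delivered' },
  { id: 's1', localId: 'a', content: 'hi', timestamp: 950, status: 'sent' },
];

const merged = mergeMessages(localRows, serverRows);
console.log(merged.map((m) => `${m.localId}:${m.status}`));
```

Keying on `localId` is what makes retries safe: the same message can arrive through the WebSocket, the REST response, and a later sync without appearing twice.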

🏋️ Practice Exercise

System Design Exercises:

  1. Design the complete data model for a chat app — messages, chats, participants, read cursors
  2. Implement a WebSocket manager with auto-reconnect and message queuing
  3. Design the offline sync strategy — handle network partitions lasting hours
  4. Design the media upload pipeline — chunked upload, progress, resume on failure
  5. Draw the full architecture diagram including client, backend, and infrastructure
  6. Discuss: how would you add E2E encryption? What changes in the architecture?
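
As a possible starting point for exercise 4, the client-side bookkeeping for a chunked, resumable upload can be sketched as pure functions. All names here (`Chunk`, `planChunks`, `remainingChunks`, `uploadProgress`) are hypothetical, and the sketch assumes the server can report which chunk indexes it has already stored; the actual transfer and server protocol are left out.

```typescript
interface Chunk {
  index: number;
  start: number; // inclusive byte offset
  end: number;   // exclusive byte offset
}

// Split a file of `fileSize` bytes into fixed-size byte ranges.
function planChunks(fileSize: number, chunkSize: number): Chunk[] {
  if (fileSize < 0 || chunkSize <= 0) {
    throw new RangeError('fileSize must be >= 0 and chunkSize must be > 0');
  }
  const chunks: Chunk[] = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    chunks.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return chunks;
}

// After a failure, only chunks the server has not acknowledged are re-sent.
function remainingChunks(chunks: Chunk[], uploaded: Set<number>): Chunk[] {
  return chunks.filter((chunk) => !uploaded.has(chunk.index));
}

// Progress as the fraction of bytes acknowledged so far (0 to 1).
function uploadProgress(chunks: Chunk[], uploaded: Set<number>, fileSize: number): number {
  if (fileSize === 0) return 1;
  const doneBytes = chunks
    .filter((chunk) => uploaded.has(chunk.index))
    .reduce((sum, chunk) => sum + (chunk.end - chunk.start), 0);
  return doneBytes / fileSize;
}

// A 10.5 MB file in 4 MB chunks; the connection dropped after two chunks.
const fileSize = 10_500_000;
const plan = planChunks(fileSize, 4_000_000);
const acknowledged = new Set<number>([0, 1]);
const toResume = remainingChunks(plan, acknowledged);
const fraction = uploadProgress(plan, acknowledged, fileSize);
console.log(plan.length, toResume.map((c) => c.index), fraction.toFixed(2));
```

Tracking acknowledged indexes rather than a single byte offset also allows chunks to be uploaded in parallel, at the cost of slightly more state on the server.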

⚠️ Common Mistakes

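  • Polling a REST endpoint for incoming messages: delivery latency is bounded by the poll interval and most requests come back empty. Use a persistent WebSocket for real-time delivery; REST remains fine for sending and for history sync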

  • Not implementing message deduplication — retried messages can appear twice if server doesn't deduplicate by localId

  • Storing all messages as a flat array — use a proper local database (SQLite/WatermelonDB) with indexes for performance

  • Not handling WebSocket reconnection — users lose real-time updates and don't know it

  • Sending read receipts immediately — batch them to avoid flooding the server (debounce per chat)
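
The read-receipt batching from the last point can be sketched as a small per-chat debouncer. `ReadReceiptBatcher` and its callback are hypothetical names, and the sketch assumes receipts are cumulative ("read up to this message"), so only the latest message id per chat needs to be sent.

```typescript
type SendReceipt = (chatId: string, lastReadMessageId: string) => void;

class ReadReceiptBatcher {
  private pending = new Map<string, string>();
  private timers = new Map<string, ReturnType<typeof setTimeout>>();
  private send: SendReceipt;
  private delayMs: number;

  constructor(send: SendReceipt, delayMs = 500) {
    this.send = send;
    this.delayMs = delayMs;
  }

  // Record that the user has read up to `messageId` and restart the chat's timer.
  markRead(chatId: string, messageId: string): void {
    this.pending.set(chatId, messageId);
    const existing = this.timers.get(chatId);
    if (existing !== undefined) clearTimeout(existing);
    this.timers.set(chatId, setTimeout(() => this.flush(chatId), this.delayMs));
  }

  // Send the latest receipt for one chat right away, e.g. when leaving the screen.
  flush(chatId: string): void {
    const timer = this.timers.get(chatId);
    if (timer !== undefined) clearTimeout(timer);
    this.timers.delete(chatId);

    const messageId = this.pending.get(chatId);
    if (messageId === undefined) return; // Nothing new to report
    this.pending.delete(chatId);
    this.send(chatId, messageId);
  }
}

// Three messages scroll into view in quick succession; one receipt goes out.
const sentReceipts: string[] = [];
const batcher = new ReadReceiptBatcher((chatId, messageId) => {
  sentReceipts.push(`${chatId}:${messageId}`);
}, 500);

batcher.markRead('chat-1', 'm1');
batcher.markRead('chat-1', 'm2');
batcher.markRead('chat-1', 'm3');
batcher.flush('chat-1');
console.log(sentReceipts);
```

In the client above, `send` would typically wrap `wsManager.send` with a `READ_RECEIPT` message; the delay trades receipt freshness for fewer messages to the server.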
