Designing Offline-First Sync Systems
📖 Concept
Offline-first sync is a frequently asked mobile system design interview topic, including at Google. The core challenge: how do you keep local data consistent with the server when the device can go offline at any time, keep accepting edits, and reconnect later?
Architecture overview:
┌────────────────────────────────────────────┐
│  UI Layer (Compose/XML)                    │
│      ↕ observes                            │
│  ViewModel (StateFlow)                     │
│      ↕                                     │
│  Repository (Single Source of Truth)       │
│      ↕              ↕                      │
│  Room DB        Sync Engine                │
│  (local)            ↕                      │
│                 Network API                │
│                 (remote)                   │
└────────────────────────────────────────────┘
Sync engine responsibilities:
- Track local changes (insertion, update, deletion)
- Queue pending changes for upload
- Push changes when connectivity is available
- Pull remote changes and merge
- Resolve conflicts
- Handle partial sync failures
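The tracking and queueing responsibilities listed above can be sketched as a small in-memory outbox. This is only an illustration: `ChangeType`, `LocalChange`, and `Outbox` are hypothetical names, and the coalescing rules are an assumption rather than something the skeleton in this article prescribes. A real app would persist the queue (for example in a Room table) so it survives process death.

```kotlin
// Hypothetical types for illustration only.
enum class ChangeType { INSERT, UPDATE, DELETE }

data class LocalChange(val entityId: String, val type: ChangeType, val payload: String?)

class Outbox {
    // LinkedHashMap keeps the order in which entities were first queued.
    private val pending = LinkedHashMap<String, LocalChange>()

    // Record a local change, collapsing it with any change already queued for the same entity.
    fun record(change: LocalChange) {
        val existing = pending[change.entityId]
        val merged: LocalChange? = when {
            existing == null -> change
            // Created and deleted while offline: the server never needs to hear about it.
            existing.type == ChangeType.INSERT && change.type == ChangeType.DELETE -> null
            // Still unknown to the server, so an edit stays an INSERT with the newest payload.
            existing.type == ChangeType.INSERT -> change.copy(type = ChangeType.INSERT)
            // Deleted and recreated locally: the server still has the row, so send an UPDATE.
            existing.type == ChangeType.DELETE && change.type == ChangeType.INSERT ->
                change.copy(type = ChangeType.UPDATE)
            else -> change
        }
        if (merged == null) pending.remove(change.entityId) else pending[change.entityId] = merged
    }

    // Changes to upload on the next push.
    fun snapshot(): List<LocalChange> = pending.values.toList()

    // Call only after the server has acknowledged these entities.
    fun acknowledge(entityIds: Collection<String>) {
        entityIds.forEach { pending.remove(it) }
    }
}
```

Coalescing keeps the upload small after a long offline session. Acknowledging by entity ID is a simplification: a production queue would more likely acknowledge by change ID so that an edit made while an upload is in flight is not dropped.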
Delta sync protocol:
- Client sends: last sync token + pending local changes
- Server responds: new sync token + remote changes since last token
- Client merges: apply remote changes to local DB, update sync token
- Only the changes made since the last token cross the network, which is much cheaper than a full sync for large datasets
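The token exchange above can be sketched with an in-memory stand-in for the server. Everything here is an assumption made for illustration: the type names (`Change`, `SyncRequest`, `SyncResponse`, `FakeSyncServer`, `SyncClient`), the use of a monotonically increasing version number as the sync token, and the rule that a client's own pending edit beats an older remote edit to the same entity.

```kotlin
// Hypothetical wire types; field names are assumptions for this sketch.
data class Change(val entityId: String, val value: String?, val deleted: Boolean = false)
data class SyncRequest(val lastSyncToken: Long, val pendingChanges: List<Change>)
data class SyncResponse(val newSyncToken: Long, val remoteChanges: List<Change>)

// In-memory stand-in for the server. The sync token is a monotonically increasing version.
class FakeSyncServer {
    private var version = 0L
    private val log = mutableListOf<Pair<Long, Change>>() // (version, change)

    fun sync(request: SyncRequest): SyncResponse {
        // Changes the client has not seen yet, computed before its own push is logged.
        val unseen = log.filter { it.first > request.lastSyncToken }.map { it.second }
        for (change in request.pendingChanges) {
            version += 1
            log.add(Pair(version, change))
        }
        return SyncResponse(newSyncToken = version, remoteChanges = unseen)
    }
}

class SyncClient(private val server: FakeSyncServer) {
    val local = mutableMapOf<String, String>()
    private val pending = mutableListOf<Change>()
    var syncToken = 0L
        private set

    fun put(id: String, value: String) {
        local[id] = value
        pending.add(Change(id, value))
    }

    fun sync() {
        val pushedIds = pending.map { it.entityId }.toSet()
        val response = server.sync(SyncRequest(syncToken, pending.toList()))
        pending.clear()
        for (change in response.remoteChanges) {
            // The server logged our pushed edit after this remote one, so ours is newer: skip.
            if (change.entityId in pushedIds) continue
            val value = change.value
            if (change.deleted || value == null) local.remove(change.entityId)
            else local[change.entityId] = value
        }
        // Store the new token only after the remote changes have been applied.
        syncToken = response.newSyncToken
    }
}
```

Two clients sharing one `FakeSyncServer` converge after each has called `sync()`, and each call transfers only the log entries newer than the caller's token.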
Conflict resolution strategies:
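- LWW (Last Write Wins): Timestamp-based. The version with the newer timestamp overwrites the other. Simple, but the losing edit is silently discarded.
- Field-level merge: Compare the two versions field by field. If each side changed different fields, keep both changes; a fallback rule is only needed when both sides changed the same field.
- CRDT: Conflict-free replicated data types, whose merge operation is designed so that replicas converge automatically (counters, sets).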
- User-prompted: Show both versions, let user choose.
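The first three strategies can be sketched in a few lines each. This is a minimal sketch under stated assumptions: `Note`, `lastWriteWins`, `mergeFields`, and `GCounter` are hypothetical names, `updatedAt` is assumed to be a server-assigned timestamp, and the field-level merge assumes the client kept the last synced version (`base`) of the entity.

```kotlin
// Hypothetical note model; `updatedAt` is assumed to be a server-assigned timestamp.
data class Note(val id: String, val title: String, val body: String, val updatedAt: Long)

// Last Write Wins: keep whichever version has the newer timestamp (ties go to remote).
fun lastWriteWins(local: Note, remote: Note): Note =
    if (local.updatedAt > remote.updatedAt) local else remote

// Field-level three-way merge. `base` is the last version both sides agreed on.
fun mergeFields(base: Note, local: Note, remote: Note): Note {
    val winner = lastWriteWins(local, remote)
    fun <T> pick(baseV: T, localV: T, remoteV: T, winnerV: T): T = when {
        localV == remoteV -> localV   // both sides hold the same value
        localV == baseV -> remoteV    // only remote changed this field
        remoteV == baseV -> localV    // only local changed this field
        else -> winnerV               // both changed it: fall back to LWW for this field
    }
    return Note(
        id = base.id,
        title = pick(base.title, local.title, remote.title, winner.title),
        body = pick(base.body, local.body, remote.body, winner.body),
        updatedAt = maxOf(local.updatedAt, remote.updatedAt)
    )
}

// Grow-only counter (G-Counter) CRDT: one slot per replica, merged by per-replica maximum.
data class GCounter(val counts: Map<String, Long> = emptyMap()) {
    fun increment(replicaId: String, by: Long = 1): GCounter =
        copy(counts = counts + (replicaId to ((counts[replicaId] ?: 0L) + by)))

    fun merge(other: GCounter): GCounter =
        GCounter((counts.keys + other.counts.keys).associateWith { key ->
            maxOf(counts[key] ?: 0L, other.counts[key] ?: 0L)
        })

    val value: Long get() = counts.values.sum()
}
```

If one device renames a note while another edits its body, `mergeFields` keeps both edits, whereas `lastWriteWins` would drop the rename. The G-Counter merge is commutative and idempotent, so replicas can exchange state in any order, any number of times, and still agree.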
💻 Code Example
// Complete delta sync implementation skeleton.
// SyncApi, AppDatabase, the DAOs, SyncResult, RemoteChange, toApiModel(), toEntity()
// and resolveConflict() are assumed to be defined elsewhere in the app.

import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.room.withTransaction
import javax.inject.Inject
import kotlinx.coroutines.CancellationException

// Sync metadata stored alongside data
@Entity(tableName = "sync_metadata")
data class SyncMetadata(
    @PrimaryKey val tableName: String,
    val lastSyncToken: String = "",
    val lastSyncTimestamp: Long = 0
)

// Change tracking for outgoing sync
@Entity(tableName = "pending_changes")
data class PendingChange(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val entityType: String,
    val entityId: String,
    val changeType: String, // INSERT, UPDATE, DELETE
    val payload: String,    // JSON serialized entity
    val createdAt: Long = System.currentTimeMillis()
)

// Sync engine implementation
class SyncEngine @Inject constructor(
    private val api: SyncApi,
    private val db: AppDatabase,
    private val changeDao: PendingChangeDao,
    private val metadataDao: SyncMetadataDao
) {
    suspend fun performSync(): SyncResult {
        return try {
            // Phase 1: Push local changes
            val pendingChanges = changeDao.getAll()
            var pushed = 0
            if (pendingChanges.isNotEmpty()) {
                val pushResult = api.pushChanges(
                    changes = pendingChanges.map { it.toApiModel() }
                )
                if (pushResult.isSuccessful) {
                    // Delete only the changes that were actually uploaded
                    changeDao.deleteAll(pendingChanges.map { it.id })
                    pushed = pendingChanges.size
                }
                // On failure the changes stay queued and are retried on the next sync
            }

            // Phase 2: Pull remote changes
            val metadata = metadataDao.get("main") ?: SyncMetadata("main")
            val pullResult = api.pullChanges(metadata.lastSyncToken)

            // Phase 3: Apply remote changes and the new token in one transaction,
            // so a crash mid-sync cannot leave the token ahead of the data
            db.withTransaction {
                for (change in pullResult.changes) {
                    applyRemoteChange(change)
                }
                metadataDao.upsert(metadata.copy(
                    lastSyncToken = pullResult.newSyncToken,
                    lastSyncTimestamp = System.currentTimeMillis()
                ))
            }

            SyncResult.Success(
                pushed = pushed,
                pulled = pullResult.changes.size
            )
        } catch (e: CancellationException) {
            // Never swallow coroutine cancellation
            throw e
        } catch (e: Exception) {
            SyncResult.Error(e)
        }
    }

    private suspend fun applyRemoteChange(change: RemoteChange) {
        val localEntity = db.noteDao().getById(change.entityId)
        val localPending = changeDao.getByEntityId(change.entityId)

        if (localPending != null) {
            // Conflict: local has pending changes for this entity
            resolveConflict(localEntity, change, localPending)
        } else {
            // No conflict: apply remote change directly
            when (change.type) {
                "INSERT", "UPDATE" -> db.noteDao().upsert(change.toEntity())
                "DELETE" -> db.noteDao().delete(change.entityId)
            }
        }
    }
}
🏋️ Practice Exercise
Practice:
- Design a delta sync protocol with sync tokens and implement it
- Implement field-level conflict resolution for a note-taking app
- Handle the edge case: user creates entity offline, another user creates entity with same name
- Design a sync system that handles network interruption mid-sync
- Implement exponential backoff retry with WorkManager for sync failures
⚠️ Common Mistakes
Using full sync instead of delta sync — wastes bandwidth for large datasets
Not handling mid-sync failures: if sync crashes halfway, the local data is left in an inconsistent state. Apply each batch of remote changes and the new sync token inside a single DB transaction.
Ignoring clock skew — device time may be wrong. Use server-assigned timestamps.
Hard-deleting instead of soft-deleting — can't sync deletions without a record of what was deleted
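The soft-delete point can be sketched with a tombstone field. This is an in-memory illustration with hypothetical names (`NoteRow`, `NoteStore`, `deletedAt`, `synced`); in a Room-backed app the same idea would be a nullable column plus queries that filter on it.

```kotlin
// Hypothetical local row; a non-null `deletedAt` marks the row as a tombstone.
data class NoteRow(
    val id: String,
    val text: String,
    val deletedAt: Long? = null,
    val synced: Boolean = false
)

class NoteStore {
    private val rows = mutableMapOf<String, NoteRow>()

    fun save(id: String, text: String) {
        rows[id] = NoteRow(id, text)
    }

    // Soft delete: keep the row, mark it deleted, and flag it for upload.
    fun delete(id: String, now: Long) {
        rows[id]?.let { rows[id] = it.copy(deletedAt = now, synced = false) }
    }

    // What the UI sees: tombstones are filtered out.
    fun visible(): List<NoteRow> = rows.values.filter { it.deletedAt == null }

    // What the sync engine uploads: tombstones are included.
    fun unsynced(): List<NoteRow> = rows.values.filter { !it.synced }

    // After the server acknowledges: mark live rows synced, purge acknowledged tombstones.
    fun markSynced(ids: Collection<String>) {
        for (id in ids) {
            val row = rows[id] ?: continue
            if (row.deletedAt != null) rows.remove(id) else rows[id] = row.copy(synced = true)
        }
    }
}
```

The row is physically removed only once the server has confirmed the deletion, so a delete made offline is never lost.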