docs: sync source refactor research — replace syncSource with device IDs

Comprehensive analysis of syncSource usage across all platforms, with
migration plan to device ID-based tracking and a new devices table.

+233
docs/sync-source-refactor.md
# Sync Source Refactor: Replace syncSource with Device IDs

Research doc, Feb 2026

## Problem

`syncSource` is overloaded with two jobs:

1. **Origin tracking**: where did this item come from?
2. **Sync gating**: should this item be pushed to the server?

This causes several issues:

- Items imported from browser history/tabs/bookmarks are silently excluded from sync (by convention, not by design)
- Items pulled from the server lose their original source (overwritten with `'server'`)
- The empty-string convention (`syncSource = ''` means "eligible for push") is implicit and fragile
- Desktop navigation-tracked URLs sync with `syncSource = ''`, same as manual saves, making them indistinguishable on other devices

The fix: **everything syncs**, using the deterministic timestamp-based algorithm we already have. Device IDs (already partially implemented) become the source of truth for item origin.

## Current State

### syncSource values in the wild

| Value | Set by | Meaning | Syncs? |
|-------|--------|---------|--------|
| `''` (empty) | Default on item creation | Locally created, never synced | Yes (first push) |
| `'server'` | Sync pull/push | Came from or was pushed to server | Only if modified after last sync |
| `'history'` | Browser extension | Imported from browser history | No (fails the empty-string check*) |
| `'tab'` | Browser extension | Imported from open tabs | No* |
| `'bookmark'` | Browser extension | Imported from bookmarks | No* |

*Note: Extension imports have a non-empty syncSource and `syncedAt = 0`, so they never match the incremental push query (`syncSource = '' OR (syncedAt > 0 AND updatedAt > syncedAt)`): the first condition fails, and the second fails too since `syncedAt = 0`. On full sync, only `syncSource = ''` items are pushed.
So **extension imports are effectively blocked from syncing**, which was the intent but is encoded implicitly.

### Where syncSource is read/written

**Sync algorithm (push filtering):**
- `backend/electron/sync.ts:515-525`: WHERE clause filters by `syncSource = ''`
- `sync/sync.js:102-107`: Same filter in unified sync engine
- `backend/tauri/src-tauri/src/sync.rs:634-665`: Same filter in Tauri desktop

**Set on push (marks as synced):**
- `backend/electron/sync.ts:619-620`: `syncSource = 'server'` after push
- `sync/sync.js:162-166`: Same
- `backend/tauri/src-tauri/src/sync.rs:799-800`: Same

**Set on pull (marks origin as server):**
- `backend/electron/sync.ts:393-394`: `syncSource = 'server'` on new items from server
- `sync/sync.js:399-400`: Same
- `backend/tauri/src-tauri/src/sync.rs:521-522`: Same

**Reset on server change:**
- `backend/electron/sync.ts:667-668`: Resets all to `syncSource = ''`
- `sync/sync.js:277-282`: Same

**Browser extension (sets import source):**
- `backend/extension/history.js:119`: `syncSource = 'history'`
- `backend/extension/tabs.js:138`: `syncSource = 'tab'`
- `backend/extension/bookmarks.js:84`: `syncSource = 'bookmark'`

**Datastore (gates device metadata):**
- `backend/electron/datastore.ts:2322-2324`: Skips `addDeviceMetadata()` if `syncSource` is present
- `backend/electron/datastore.ts:2404-2406`: Same for updates

**Sync status UI:**
- `backend/electron/sync.ts:773-775`: Counts pending items using syncSource filter
- Extension stats pages count items by syncSource value

### Device ID (already partially implemented)

`backend/electron/device.ts` generates device IDs and adds them to item metadata:

```
metadata._sync = {
  createdBy: "desktop-550e8400-e29b-41d4-a716-446655440000",
  createdAt: 1767023561234,
  modifiedBy: "desktop-550e8400-e29b-41d4-a716-446655440000",
  modifiedAt: 1767023561234
}
```

This metadata **survives sync** (it's inside the JSON blob), unlike `syncSource`, which gets overwritten.

### Browser import classifications

These are already stored redundantly, so **removing syncSource loses nothing**:

- **Tags**: `from:history`, `from:tab`, `from:bookmark`, auto-applied on import
- **Metadata JSON**: Rich browser-specific data (visit counts, tab properties, bookmark dates)

## Changes

### 1. Remove syncSource from sync algorithm

**Push query**: replace the syncSource filter with a timestamp-based one:

Before:
```sql
WHERE (deletedAt = 0 AND (syncSource = '' OR (syncedAt > 0 AND updatedAt > syncedAt)))
```

After:
```sql
WHERE (deletedAt = 0 AND (syncedAt = 0 OR updatedAt > syncedAt))
```

This means: push items that have never been synced (`syncedAt = 0`) or that were modified since the last sync. No reference to syncSource.

**After push**: only update `syncId` and `syncedAt`, and stop setting `syncSource`:

Before:
```sql
UPDATE items SET syncId = ?, syncSource = 'server', syncedAt = ? WHERE id = ?
```

After:
```sql
UPDATE items SET syncId = ?, syncedAt = ? WHERE id = ?
```

**On pull**: stop setting `syncSource = 'server'` on new items. Device metadata (`_sync.createdBy`) already tracks origin.

**On server change**: reset `syncedAt = 0` to force a full re-sync. No need to touch syncSource.

**Files to update:**
- `backend/electron/sync.ts`: lines 393, 515-525, 619-620, 667-668, 773-775
- `sync/sync.js`: lines 102-107, 162-166, 277-282, 399-400
- `backend/tauri/src-tauri/src/sync.rs`: lines 521-522, 634-665, 799-800, 894

### 2. Stop setting syncSource on item creation

**Browser extension**: remove the `syncSource` param from `addItem()` calls:
- `backend/extension/history.js:119`: remove `syncSource: 'history'`
- `backend/extension/tabs.js:138`: remove `syncSource: 'tab'`
- `backend/extension/bookmarks.js:84`: remove `syncSource: 'bookmark'`

Classifications are already preserved in tags (`from:*`) and metadata JSON.

**Datastore**: remove syncSource gating on device metadata:
- `backend/electron/datastore.ts:2322-2324`: always call `addDeviceMetadata()`, not just when `!options.syncSource`
- `backend/electron/datastore.ts:2404-2406`: same for updates

### 3. Device ID prefix removal

Device IDs currently use platform prefixes: `"desktop-{uuid}"` (Electron), `"extension-{uuid}"` (browser extension). With the one exception noted below, these prefixes are **not used in any conditional logic** (verified by codebase-wide search): no code checks `startsWith('desktop')` or otherwise parses the prefix.

**Change:** Generate plain UUIDs going forward. Platform/type info belongs in the devices table.

**Migration:** On startup, detect prefixed device IDs and strip the prefix:

```
"desktop-550e8400-e29b-41d4-a716-446655440000" → "550e8400-e29b-41d4-a716-446655440000"
```

This requires updating:
1. The stored device ID in `extension_settings`
2. All `metadata._sync.createdBy` and `metadata._sync.modifiedBy` references in existing items

**Detection logic:** Check whether the device ID matches the `/{prefix}-[0-9a-f]{8}-/` pattern. Known prefixes: `desktop-`, `extension-`. Strip the prefix, keep the UUID.

**One exception:** `backend/extension/environment.js:29` checks `parsed.startsWith('extension-')` for the extension device ID. This needs updating to handle both the old prefixed and the new plain UUID formats during the transition period.
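
The detection and stripping rule, together with a check that tolerates both formats during the transition, might look roughly like the sketch below. The function names (`stripDevicePrefix`, `isValidStoredDeviceId`) are hypothetical and not taken from the codebase; only the two known prefixes and the UUID shape come from this doc. The sketch validates the full UUID rather than just the first eight hex digits, which is a stricter choice than the pattern above.

```typescript
// Hypothetical helpers: names are illustrative, not from the codebase.
const KNOWN_PREFIXES = ["desktop-", "extension-"] as const;

// Full 8-4-4-4-12 hex UUID shape (stricter than matching only the first group).
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

/** Returns the plain UUID for a legacy prefixed ID, or the input unchanged. */
function stripDevicePrefix(deviceId: string): string {
  for (const prefix of KNOWN_PREFIXES) {
    if (deviceId.startsWith(prefix)) {
      const rest = deviceId.slice(prefix.length);
      // Only strip when what remains is a well-formed UUID.
      if (UUID_RE.test(rest)) return rest;
    }
  }
  return deviceId;
}

/** Accepts both the legacy "{prefix}-{uuid}" format and a plain UUID. */
function isValidStoredDeviceId(stored: string): boolean {
  return UUID_RE.test(stripDevicePrefix(stored));
}
```

Because `stripDevicePrefix` returns already-plain IDs unchanged, running the migration more than once should be harmless.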

**Files to update:**
- `backend/electron/device.ts:46`: `crypto.randomUUID()` (remove `desktop-` prefix)
- `backend/extension/environment.js`: update `startsWith('extension-')` check
- Add migration in `backend/electron/datastore.ts` (startup)
- Add migration in `backend/extension/` (startup)

### 4. New `devices` table

Replace the ad-hoc `extension_settings` K/V storage with a proper table.

```sql
CREATE TABLE IF NOT EXISTS devices (
  id TEXT PRIMARY KEY,          -- Plain UUID
  name TEXT DEFAULT '',         -- User-friendly: hostname, "iPhone", etc.
  platform TEXT NOT NULL,       -- 'electron', 'tauri', 'tauri-mobile', 'extension', 'server'
  metadata TEXT DEFAULT '{}',   -- JSON: OS, arch, app version, backend type, capabilities
  createdAt INTEGER NOT NULL,   -- First registration
  lastSeenAt INTEGER NOT NULL   -- Last activity (updated on sync, app launch, etc.)
);
```

**All clients register on startup.** Server nodes also register: they don't originate items today, but they may in the future (e.g., server-side feed fetching).

**The `metadata._sync.createdBy` / `modifiedBy` fields** reference `devices.id` (plain UUID).

**Synced table:** The devices table itself should sync across all nodes so every client knows about every device. This enables UI like "saved from MacBook" or "last synced from iPhone".

**Schema location:** Add to `schema/v1.json` (or v2 if we version it).

### 5. Mobile filtering (separate task)

Once syncSource cleanup is done and device IDs are reliable, mobile can filter the default list view:

- Items with `metadata._sync.createdBy` matching a device where `platform = 'electron'` AND tagged `from:history` or created by navigation tracking → hide from default list, show in search
- This is a frontend-only change on mobile (no schema changes needed)

## Migration & Backward Compatibility

### syncSource column

**Keep the column** in the schema, since removing it would break older clients. Just stop reading/writing it in new code. It becomes a no-op legacy field.

### Existing items

No data migration is needed for sync to work. The new push query (`syncedAt = 0 OR updatedAt > syncedAt`) handles all existing items correctly:
- Items with `syncSource = ''` and `syncedAt = 0` → pushed (same as before)
- Items with `syncSource = 'server'` and `syncedAt > 0` → only pushed if modified (same as before)
- Items with `syncSource = 'history'` and `syncedAt = 0` → now pushed (new: these were previously blocked)

The last point means browser imports will start syncing after this change. This is the desired behavior ("everything syncs").

### Device ID migration

Run on startup before any sync operations:

1. Read current device ID from `extension_settings`
2. If it matches the `{prefix}-{uuid}` pattern, strip the prefix
3. Update `extension_settings` with the plain UUID
4. Scan items table: `UPDATE items SET metadata = replace(metadata, old_device_id, new_device_id) WHERE metadata LIKE '%' || old_device_id || '%'`
5. Register in new `devices` table

### Rollback safety

If needed, the old sync algorithm still works; it just won't push browser imports (same as today). The syncSource column is preserved, so reverting the push query is a one-line change.
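
To make concrete what the new push query changes, and what a rollback would undo, the two WHERE clauses can be mirrored as in-memory predicates. This is only an illustrative sketch: the `SyncFields` interface and both function names are hypothetical, and the real filtering happens in SQL rather than in application code.

```typescript
// Hypothetical shape: only the four columns used by the push queries.
interface SyncFields {
  syncSource: string;
  syncedAt: number;
  updatedAt: number;
  deletedAt: number;
}

/** Mirrors the current (pre-refactor) push WHERE clause. */
function oldShouldPush(item: SyncFields): boolean {
  return (
    item.deletedAt === 0 &&
    (item.syncSource === "" ||
      (item.syncedAt > 0 && item.updatedAt > item.syncedAt))
  );
}

/** Mirrors the proposed timestamp-only push WHERE clause. */
function newShouldPush(item: SyncFields): boolean {
  return (
    item.deletedAt === 0 &&
    (item.syncedAt === 0 || item.updatedAt > item.syncedAt)
  );
}

// The three cases from "Existing items", with made-up timestamps.
const localNew: SyncFields = { syncSource: "", syncedAt: 0, updatedAt: 100, deletedAt: 0 };
const serverUnmodified: SyncFields = { syncSource: "server", syncedAt: 200, updatedAt: 100, deletedAt: 0 };
const historyImport: SyncFields = { syncSource: "history", syncedAt: 0, updatedAt: 100, deletedAt: 0 };

for (const [label, item] of [
  ["local", localNew],
  ["server", serverUnmodified],
  ["history", historyImport],
] as const) {
  console.log(label, { old: oldShouldPush(item), new: newShouldPush(item) });
}
```

Under these predicates the only case whose result differs is the browser import, which is the behavior change described under "Existing items".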

## Prod Impact

- **Low risk**: The sync algorithm change is additive (more items sync, none stop syncing)
- **Browser imports will start syncing**: This is intentional but may surprise users who had large history imports. Consider a one-time notification or a toggle.
- **No breaking API changes**: The server doesn't use syncSource for anything; it only stores and returns it
- **Device ID migration is local-only**: Each client migrates its own stored ID on startup. No server coordination needed.