experiments in a post-browser web

add notes

+2410 -1
+584
notes/datastore-architecture.md
# Datastore Architecture

## Overview

Peek's datastore uses a centralized architecture where all data operations are handled in the main Electron process, with renderer processes accessing the datastore through an IPC (Inter-Process Communication) API exposed via the preload script.

## Architectural Decision

### The Choice: IPC-Based API vs Direct Library Access

During implementation, we faced a critical architectural decision:

**Option 1: Direct Library Access**
- Features import and use TinyBase directly in renderer processes
- Simpler initial implementation
- Tighter coupling to TinyBase

**Option 2: IPC-Based API (Chosen)**
- Datastore logic centralized in main process
- Features access via `api.datastore` abstraction
- Complete separation between storage layer and application features

### Reasoning

We chose **Option 2** for the following strategic reasons:

1. **Runtime Portability**
   - Future consideration of Tauri as an alternative to Electron
   - No renderer code changes needed when switching runtimes
   - Abstraction isolates platform-specific concerns

2. **Storage Backend Flexibility**
   - Can swap TinyBase for SQLite, Dexie, or cloud datastores
   - Features remain unchanged regardless of backend
   - Enables gradual migration strategies

3. **Cloud & Sync Readiness**
   - Architecture naturally supports remote datastore endpoints
   - Same API can route to local or cloud storage
   - Facilitates future sync implementations

4. **Mobile App Development**
   - Mobile apps can use the same API contract
   - Platform-specific storage implementations possible
   - Consistent developer experience across platforms

5. **Security & Isolation**
   - Datastore logic contained in trusted main process
   - Renderer processes have controlled, validated access
   - Easier to audit and secure data operations

## Current Implementation

### Technology Stack

- **Storage Engine**: TinyBase v0.7.2
  - Reactive data store with CRDT support
  - Schema validation and indexes
  - Small footprint (5-11kB gzipped)
  - Built-in support for relationships and metrics

- **Communication**: Electron IPC
  - `ipcMain.handle()` for main process handlers
  - `ipcRenderer.invoke()` for renderer requests
  - Async/await throughout

### Architecture Components

```
┌─────────────────────────────────────────────────────────────┐
│  Renderer Process (app/)                                    │
│                                                             │
│  ┌────────────┐         ┌────────────┐                      │
│  │  Features  │────────▶│   api.js   │                      │
│  │  (peeks,   │         │            │                      │
│  │  slides,   │         │  api.      │                      │
│  │  scripts)  │         │  datastore │                      │
│  └────────────┘         └─────┬──────┘                      │
│                               │                             │
└───────────────────────────────┼─────────────────────────────┘
                                │ IPC invoke()

┌───────────────────────────────┼─────────────────────────────┐
│  Main Process (index.js)      │                             │
│                               ▼                             │
│  ┌──────────────────────────────────────────┐               │
│  │  IPC Handlers                            │               │
│  │  • datastore-add-address                 │               │
│  │  • datastore-get-address                 │               │
│  │  • datastore-query-addresses             │               │
│  │  • datastore-add-visit                   │               │
│  │  • datastore-query-visits                │               │
│  │  • datastore-add-content                 │               │
│  │  • datastore-get-table                   │               │
│  │  • datastore-set-row                     │               │
│  │  • datastore-get-stats                   │               │
│  └──────────────┬───────────────────────────┘               │
│                 │                                           │
│                 ▼                                           │
│  ┌──────────────────────────────────────────┐               │
│  │  TinyBase Store                          │               │
│  │  • Store (datastoreStore)                │               │
│  │  • Indexes (datastoreIndexes)            │               │
│  │  • Relationships (datastoreRelationships)│               │
│  │  • Metrics (datastoreMetrics)            │               │
│  └──────────────────────────────────────────┘               │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

### File Structure

```
/Users/dietrich/misc/peek/
├── index.js                    # Main process
│   ├── TinyBase initialization (lines 115-175)
│   ├── Helper functions (lines 177-230)
│   └── IPC handlers (lines 979-1230)

├── preload.js                  # Preload script
│   └── api.datastore exposure (lines 210-242)

└── app/                        # Renderer process
    ├── datastore/
    │   ├── schema.js           # TinyBase schema definitions
    │   ├── config.js           # Datastore configuration
    │   ├── history.js          # Navigation tracking helpers
    │   └── test-ipc.html       # IPC API test page

    ├── scripts/index.js        # Script results tracking
    ├── peeks/index.js          # Peek navigation tracking
    └── slides/index.js         # Slide navigation tracking
```

## API Reference

### api.datastore Methods

All methods return a Promise that resolves to `{ success: boolean, data?: any, error?: string, id?: string }`.

#### Address Management

```javascript
// Add a new address
await api.datastore.addAddress(uri, options)
// Parameters:
//   uri: string - The URL to track
//   options: {
//     title?: string
//     mimeType?: string
//     favicon?: string
//     description?: string
//     tags?: string (comma-separated)
//     metadata?: string (JSON)
//   }
// Returns: { success: true, id: 'addr_...' }

// Get address by ID
await api.datastore.getAddress(id)
// Returns: { success: true, data: { uri, domain, title, ... } }

// Update address
await api.datastore.updateAddress(id, updates)
// Parameters:
//   id: string - Address ID
//   updates: object - Fields to update
// Returns: { success: true, data: { ...updatedRow } }

// Query addresses
await api.datastore.queryAddresses(filter)
// Parameters:
//   filter: {
//     domain?: string
//     protocol?: string
//     starred?: 0 | 1
//     tag?: string
//     sortBy?: 'lastVisit' | 'visitCount' | 'created'
//     limit?: number
//   }
// Returns: { success: true, data: [...addresses] }
```

#### Visit Tracking

```javascript
// Add a visit
await api.datastore.addVisit(addressId, options)
// Parameters:
//   addressId: string - The address being visited
//   options: {
//     source?: string - Source of navigation (peek, slide, direct)
//     sourceId?: string - ID of the source feature
//     windowType?: string - Type of window (modal, persistent, main)
//     duration?: number - Time spent in milliseconds
//     scrollDepth?: number - Scroll percentage (0-100)
//     interacted?: 0 | 1 - Whether user interacted
//     metadata?: string - Additional JSON data
//   }
// Returns: { success: true, id: 'visit_...' }
// Side effect: Updates address visitCount and lastVisitAt

// Query visits
await api.datastore.queryVisits(filter)
// Parameters:
//   filter: {
//     addressId?: string
//     source?: string
//     windowType?: string
//     startDate?: number (timestamp)
//     endDate?: number (timestamp)
//     limit?: number
//   }
// Returns: { success: true, data: [...visits] }
```

#### Content Management

```javascript
// Add content (notes, markdown, code, etc.)
await api.datastore.addContent(options)
// Parameters:
//   options: {
//     title?: string
//     content: string
//     contentType?: 'plain' | 'markdown' | 'code' | 'json' | 'csv'
//     mimeType?: string
//     language?: string (for code)
//     tags?: string
//     addressId?: string (if related to an address)
//     metadata?: string
//   }
// Returns: { success: true, id: 'content_...' }
```

#### Direct Table Access

```javascript
// Get entire table
await api.datastore.getTable(tableName)
// Parameters:
//   tableName: 'addresses' | 'visits' | 'content' | 'tags' | 'blobs' | 'scripts_data' | 'feeds'
// Returns: { success: true, data: { rowId: { ...row }, ... } }

// Set row directly
await api.datastore.setRow(tableName, rowId, rowData)
// Parameters:
//   tableName: string
//   rowId: string
//   rowData: object - Complete row data matching schema
// Returns: { success: true }
```

#### Statistics

```javascript
// Get aggregate statistics
await api.datastore.getStats()
// Returns: {
//   success: true,
//   data: {
//     totalAddresses: number,
//     totalVisits: number,
//     totalContent: number,
//     // ... other metrics
//   }
// }
```

## Usage Examples

### Example 1: Track Navigation from a Feature

```javascript
// app/peeks/index.js
import api from '../api.js';

const executeItem = async (item) => {
  // Open window and navigate
  const window = await windows.createWindow(item.address, params);

  // Track the navigation
  if (api.datastore) {
    try {
      // Get or create address
      const addResult = await api.datastore.addAddress(item.address, {
        title: item.title,
        mimeType: 'text/html'
      });

      // Record visit
      if (addResult.success) {
        await api.datastore.addVisit(addResult.id, {
          source: 'peek',
          sourceId: `peek_${item.keyNum}`,
          windowType: 'modal'
        });
      }
    } catch (error) {
      console.error('Failed to track navigation:', error);
    }
  }
};
```

### Example 2: Save Script Results

```javascript
// app/scripts/index.js
import api from '../api.js';

const saveScriptResult = async (script, result) => {
  try {
    // Find or create address
    const addressesResult = await api.datastore.queryAddresses({});
    let addressId;

    if (addressesResult.success) {
      const existing = addressesResult.data.find(a => a.uri === script.address);

      if (existing) {
        addressId = existing.id;
      } else {
        const addResult = await api.datastore.addAddress(script.address, {
          title: `Script: ${script.title}`
        });
        addressId = addResult.id;
      }
    }

    // Get previous results to detect changes
    const prevResults = await api.datastore.getTable('scripts_data');
    let changed = 1;
    // Declared here so it is still in scope when the row is saved below
    let previousValue = '';

    if (prevResults.success) {
      const scriptResults = Object.entries(prevResults.data)
        .filter(([id, row]) => row.scriptId === script.id)
        .sort((a, b) => b[1].extractedAt - a[1].extractedAt);

      if (scriptResults.length > 0) {
        const latestValue = scriptResults[0][1].content;
        changed = (result !== latestValue) ? 1 : 0;
        previousValue = latestValue || '';
      }
    }

    // Save new result
    await api.datastore.setRow('scripts_data',
      `script_data_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`,
      {
        scriptId: script.id,
        scriptName: script.title,
        addressId: addressId,
        selector: script.selector,
        content: result,
        contentType: 'text',
        metadata: '{}',
        extractedAt: Date.now(),
        previousValue: previousValue,
        changed: changed
      }
    );
  } catch (error) {
    console.error('Error saving script result:', error);
  }
};
```

### Example 3: Query Recent History

```javascript
// app/features/history-browser.js
import api from '../api.js';

const getRecentHistory = async (limit = 20) => {
  try {
    // Get recent addresses
    const addressesResult = await api.datastore.queryAddresses({
      sortBy: 'lastVisit',
      limit: limit
    });

    if (!addressesResult.success) {
      console.error('Failed to fetch history:', addressesResult.error);
      return [];
    }

    // Enrich with visit details
    const enriched = await Promise.all(
      addressesResult.data.map(async (address) => {
        const visitsResult = await api.datastore.queryVisits({
          addressId: address.id,
          limit: 5
        });

        return {
          ...address,
          recentVisits: visitsResult.success ? visitsResult.data : []
        };
      })
    );

    return enriched;
  } catch (error) {
    console.error('Error getting recent history:', error);
    return [];
  }
};
```

## Implementation Details

### Data Schema

The datastore uses 7 tables defined in `app/datastore/schema.js`:

1. **addresses**: Web addresses (URLs) visited by the user
2. **visits**: Individual visit records with duration and interaction data
3. **content**: User-created notes, markdown files, code snippets
4. **tags**: Tags for organizing addresses and content
5. **blobs**: Binary data (images, files) with content-addressable storage
6. **scripts_data**: Results from background script executions
7. **feeds**: RSS/Atom feed subscriptions and entries

### Helper Functions (Main Process)

```javascript
// Generate unique IDs
generateId(prefix)  // Returns: 'prefix_timestamp_randomstring'

// Current timestamp
now()  // Returns: Date.now()

// Parse URL into components
parseUrl(uri)  // Returns: { protocol, domain, path }
```

### Error Handling

All IPC handlers use try-catch blocks and return structured responses:

```javascript
// Success response
{ success: true, data: {...}, id: '...' }

// Error response
{ success: false, error: 'Error message' }
```

Features should always check the `success` field before using `data`.
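
One way to make the `success` check harder to forget is to funnel every response through a small unwrapping function. The sketch below is an assumption for illustration, not code from Peek: `unwrap`, `stubDatastore`, and `loadAddress` are hypothetical names, and the stub only imitates the response shape documented above so the snippet can run outside Electron.

```javascript
// Hypothetical helper (not part of Peek's codebase): converts the structured
// IPC response into a plain return value or a thrown Error.
const unwrap = (response) => {
  if (!response || response.success !== true) {
    // `error` is optional in the response shape, so fall back to a generic message.
    throw new Error(response?.error ?? 'Unknown datastore error');
  }
  // Prefer `data` when present; otherwise return `id` (as add* responses do).
  return response.data !== undefined ? response.data : response.id;
};

// Stub standing in for api.datastore, assumed for this example only.
const stubDatastore = {
  getAddress: async (id) =>
    id === 'addr_1'
      ? { success: true, data: { uri: 'https://example.com' } }
      : { success: false, error: 'Address not found' },
};

// Callers can then use ordinary try/catch instead of checking `success` by hand.
const loadAddress = async (id) => unwrap(await stubDatastore.getAddress(id));
```

With a helper like this, a failed call surfaces as an exception at the call site, which fits the try/catch style already used in the usage examples.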

### Initialization

The datastore initializes automatically when the main process starts:

```javascript
// index.js (main process)
const initDatastore = () => {
  console.log('main initializing datastore');

  // Create store with schema
  datastoreStore = createStore().setTablesSchema(schema);

  // Create indexes for efficient queries
  datastoreIndexes = createIndexes(datastoreStore);
  for (const [indexName, indexDef] of Object.entries(indexes)) {
    datastoreIndexes.setIndexDefinition(indexName, ...indexDef);
  }

  // Create relationships for joins
  datastoreRelationships = createRelationships(datastoreStore);
  for (const [relName, relDef] of Object.entries(relationships)) {
    datastoreRelationships.setRelationshipDefinition(relName, ...relDef);
  }

  // Create metrics for aggregations
  datastoreMetrics = createMetrics(datastoreStore);
  for (const [metricName, metricDef] of Object.entries(metrics)) {
    datastoreMetrics.setMetricDefinition(metricName, ...metricDef);
  }

  console.log('main datastore initialized successfully');
};

// Called during app ready
app.whenReady().then(async () => {
  initDatastore();
  // ... rest of initialization
});
```

## Testing

A test suite verifies the IPC API works correctly:

```bash
# Run the app in debug mode
npm run debug

# The app automatically runs integration tests on startup
# Check console for test results
```

Test coverage includes:
- Address creation, retrieval, updates
- Visit tracking and queries
- Content management
- Table access
- Statistics aggregation

## Future Considerations

### Persistence Layer

Currently, data exists only in memory. Future work includes:

1. **IndexedDB Persister** (Browser)
   - Use TinyBase's `createIndexedDbPersister()`
   - Automatic persistence to browser storage
   - Good for development and testing

2. **SQLite Persister** (Desktop)
   - Use TinyBase's SQL persisters
   - Better performance for large datasets
   - Native database queries

3. **File System Persister** (Desktop)
   - Use TinyBase's `createFilePersister()`
   - Human-readable JSON files
   - Easy backup and migration

### Sync Implementation

The IPC architecture naturally supports synchronization:

1. **Local Sync**
   - Multiple renderer processes sharing same datastore
   - Already supported via IPC

2. **Cloud Sync**
   - Modify IPC handlers to route to remote API
   - Use TinyBase CRDT features for conflict resolution
   - Implement offline-first with local cache

3. **Peer-to-Peer Sync**
   - Use TinyBase's CRDT merge capabilities
   - Sync between devices on local network

### Migration Path

To change storage backends:

1. Keep IPC API contract unchanged
2. Implement new backend in main process
3. Update IPC handlers to use new backend
4. Features continue working without changes

Example: Migrating to SQLite:

```javascript
// Old: TinyBase
datastoreStore.setRow('addresses', id, row);

// New: SQLite (better-sqlite3)
db.prepare('INSERT INTO addresses VALUES (?, ?, ...)').run(id, ...values);

// IPC handler updated, but api.datastore.addAddress() unchanged
```

## Benefits Realized

1. **Clean Separation**: Storage logic completely isolated from UI code
2. **Easy Testing**: Can mock `api.datastore` for unit tests
3. **Consistent API**: Same patterns across all features
4. **Type Safety**: Single source of truth for data structures
5. **Performance**: Main process handles heavy data operations
6. **Security**: Validated data access through controlled IPC
7. **Flexibility**: Storage implementation can evolve independently

## References

- [TinyBase Documentation](https://tinybase.org)
- [Electron IPC Documentation](https://www.electronjs.org/docs/latest/api/ipc-main)
- [Datastore Schema](./datastore-schema.md)
- [Datastore Research](./datastore-research.md)
- [Integration Summary](./datastore-integration.md)
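
As a rough illustration of the Migration Path steps earlier in this document, the sketch below builds a handler against a small backend contract instead of a specific storage library. Everything here is an assumption made for the example: `createMemoryBackend`, `createAddAddressHandler`, and the `setRow`/`getRow` contract are hypothetical, and the handler's argument layout is a guess rather than Peek's actual `datastore-add-address` signature.

```javascript
// Hypothetical backend contract for illustration: any object with setRow/getRow.
// A TinyBase- or SQLite-backed object exposing the same two methods could replace it.
const createMemoryBackend = () => {
  const tables = new Map();
  return {
    setRow(table, id, row) {
      if (!tables.has(table)) tables.set(table, new Map());
      tables.get(table).set(id, { ...row });
    },
    getRow(table, id) {
      return tables.get(table)?.get(id);
    },
  };
};

// Builds a function of the shape ipcMain.handle() expects: (event, ...args).
// It only talks to the backend contract, so the response shape seen by
// api.datastore.addAddress() stays the same when the backend is swapped.
const createAddAddressHandler = (backend, generateId) =>
  async (_event, uri, options = {}) => {
    try {
      const id = generateId('addr');
      backend.setRow('addresses', id, { uri, ...options });
      return { success: true, id };
    } catch (error) {
      return { success: false, error: error.message };
    }
  };
```

In the main process, such a handler would be registered with something like `ipcMain.handle('datastore-add-address', createAddAddressHandler(backend, generateId))`; changing storage then means passing a different `backend` object while renderer code stays untouched.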
+446
notes/datastore-integration.md
··· 1 + # Datastore Integration Summary 2 + 3 + Date: 2025-11-12 4 + Branch: datastore 5 + 6 + **ARCHITECTURE NOTE**: This document originally described a `window.datastore` approach. The actual implementation uses an **IPC-based architecture** with the datastore in the main process. See [datastore-architecture.md](./datastore-architecture.md) for the complete architectural documentation. 7 + 8 + ## What Was Built 9 + 10 + ### Core Datastore Module 11 + - ✅ **Full TinyBase implementation** with schema, indexes, relationships, metrics 12 + - ✅ **7 tables**: addresses, visits, content, tags, blobs, scripts_data, feeds 13 + - ✅ **IPC-based API** accessed via `api.datastore` in renderer processes 14 + - ✅ **Datastore in main process** for security, isolation, and portability 15 + - ✅ **Comprehensive testing** - all IPC handlers verified working 16 + - ✅ **Complete separation** between storage layer and features 17 + 18 + ### Files Created/Modified 19 + 20 + **New Files:** 21 + - `app/datastore/schema.js` - TinyBase schema definitions 22 + - `app/datastore/config.js` - Configuration 23 + - `app/datastore/history.js` - Navigation history tracking helpers (uses IPC API) 24 + - `app/datastore/test-ipc.html` - IPC API test page 25 + - `notes/datastore-research.md` - Technology research & comparison 26 + - `notes/datastore-schema.md` - Detailed schema documentation 27 + - `notes/datastore-architecture.md` - **Complete architectural documentation** 28 + - `notes/datastore-integration.md` - This file 29 + 30 + **Modified Files:** 31 + - `index.js` - **Datastore initialization in main process** (lines 115-230, 979-1230) 32 + - `preload.js` - **Expose api.datastore via IPC** (lines 210-242) 33 + - `app/index.js` - Import history tracking helpers, expose via window.datastoreHistory 34 + - `app/scripts/index.js` - Save script results using api.datastore IPC 35 + - `app/peeks/index.js` - Track peek navigation via history helpers 36 + - `app/slides/index.js` - Track slide navigation 
via history helpers 37 + - `package.json` - Added tinybase@0.7.2 dependency 38 + 39 + ## Integration Details 40 + 41 + ### 1. Scripts Feature Integration 42 + 43 + **What it does:** 44 + - Saves all script extraction results to `scripts_data` table 45 + - Tracks changes between runs 46 + - Creates/links addresses for script URLs 47 + - Maintains full history of extracted values 48 + 49 + **Implementation:** 50 + ```javascript 51 + // In app/scripts/index.js 52 + const saveScriptResult = async (script, result) => { 53 + // Find or create address using IPC 54 + const addressesResult = await api.datastore.queryAddresses({}); 55 + let addressId; 56 + 57 + if (addressesResult.success) { 58 + const existing = addressesResult.data.find(a => a.uri === script.address); 59 + if (existing) { 60 + addressId = existing.id; 61 + } else { 62 + const addResult = await api.datastore.addAddress(script.address, { 63 + title: `Script: ${script.title}` 64 + }); 65 + addressId = addResult.id; 66 + } 67 + } 68 + 69 + // Check for previous values using IPC 70 + const prevResults = await api.datastore.getTable('scripts_data'); 71 + let changed = 1; 72 + 73 + if (prevResults.success) { 74 + const scriptResults = Object.entries(prevResults.data) 75 + .filter(([id, row]) => row.scriptId === script.id) 76 + .sort((a, b) => b[1].extractedAt - a[1].extractedAt); 77 + if (scriptResults.length > 0) { 78 + changed = (result !== scriptResults[0][1].content) ? 
1 : 0; 79 + } 80 + } 81 + 82 + // Save to datastore using IPC 83 + await api.datastore.setRow('scripts_data', rowId, { 84 + scriptId, scriptName, addressId, selector, 85 + content: result, contentType: 'text', 86 + extractedAt: Date.now(), 87 + previousValue, changed 88 + }); 89 + }; 90 + ``` 91 + 92 + **Data captured:** 93 + - Script ID and name 94 + - Source address (with automatic address creation) 95 + - CSS selector used 96 + - Extracted content 97 + - Timestamp 98 + - Previous value for change detection 99 + - Changed flag 100 + 101 + ### 2. Navigation History Tracking 102 + 103 + **What it does:** 104 + - Tracks every navigation from peeks and slides 105 + - Creates address records automatically 106 + - Records visit metadata (source, windowType, duration) 107 + - Updates visit counts and timestamps 108 + 109 + **Implementation:** 110 + ```javascript 111 + // In app/datastore/history.js 112 + export const trackNavigation = async (uri, options = {}) => { 113 + // Get or create address using IPC 114 + let addressId; 115 + const addressesResult = await api.datastore.queryAddresses({}); 116 + 117 + if (addressesResult.success) { 118 + const existing = addressesResult.data.find(addr => addr.uri === uri); 119 + 120 + if (existing) { 121 + addressId = existing.id; 122 + } else { 123 + const addResult = await api.datastore.addAddress(uri, { 124 + title: options.title || '', 125 + mimeType: options.mimeType || 'text/html' 126 + }); 127 + addressId = addResult.id; 128 + } 129 + } 130 + 131 + // Add visit record using IPC 132 + const visitResult = await api.datastore.addVisit(addressId, { 133 + source: options.source || 'direct', 134 + sourceId: options.sourceId || '', 135 + windowType: options.windowType || 'main', 136 + duration: options.duration || 0, 137 + metadata: JSON.stringify(options.metadata || {}) 138 + }); 139 + 140 + return { visitId: visitResult.id, addressId }; 141 + }; 142 + ``` 143 + 144 + **Data captured:** 145 + - Full URI and parsed components 
(protocol, domain, path) 146 + - Page title 147 + - Visit timestamp 148 + - Source feature (peek, slide, direct) 149 + - Source ID (peek_3, slide_left, etc.) 150 + - Window type (modal, persistent, main) 151 + - Visit count and last visit time 152 + 153 + ### 3. Peeks Integration 154 + 155 + **Integration point:** `app/peeks/index.js:32-44` 156 + 157 + ```javascript 158 + windows.openModalWindow(item.address, params) 159 + .then(result => { 160 + // Track navigation in datastore 161 + if (window.datastoreHistory) { 162 + window.datastoreHistory.trackNavigation(item.address, { 163 + source: 'peek', 164 + sourceId: `peek_${item.keyNum}`, 165 + windowType: 'modal', 166 + title: item.title 167 + }); 168 + } 169 + }); 170 + ``` 171 + 172 + **Tracks:** 173 + - Which peek was opened (peek_0 through peek_9) 174 + - URL visited 175 + - Modal window type 176 + - Creates address if first visit 177 + 178 + ### 4. Slides Integration 179 + 180 + **Integration point:** `app/slides/index.js:147-155` 181 + 182 + ```javascript 183 + windows.openModalWindow(item.address, params).then(result => { 184 + if (result.success) { 185 + // Track navigation in datastore 186 + if (window.datastoreHistory) { 187 + window.datastoreHistory.trackNavigation(item.address, { 188 + source: 'slide', 189 + sourceId: `slide_${item.screenEdge}`, 190 + windowType: 'modal', 191 + title: item.title 192 + }); 193 + } 194 + } 195 + }); 196 + ``` 197 + 198 + **Tracks:** 199 + - Which slide direction (slide_left, slide_right, slide_up, slide_down) 200 + - URL visited 201 + - Modal window type 202 + - Reuses existing address records 203 + 204 + ## API Available to Features 205 + 206 + ### Datastore Core API (IPC-based) 207 + 208 + All methods are async and return Promises with structure: `{ success: boolean, data?: any, error?: string, id?: string }` 209 + 210 + ```javascript 211 + // Access via api.datastore (exposed through preload.js) 212 + 213 + // Addresses 214 + await api.datastore.addAddress(uri, options) 
215 + await api.datastore.getAddress(id) 216 + await api.datastore.updateAddress(id, updates) 217 + await api.datastore.queryAddresses(filter) 218 + 219 + // Visits 220 + await api.datastore.addVisit(addressId, options) 221 + await api.datastore.queryVisits(filter) 222 + 223 + // Content 224 + await api.datastore.addContent(options) 225 + 226 + // Direct table access 227 + await api.datastore.getTable(tableName) 228 + await api.datastore.setRow(tableName, rowId, rowData) 229 + 230 + // Stats 231 + await api.datastore.getStats() 232 + ``` 233 + 234 + **Note**: All IPC operations are asynchronous. Always use `await` or `.then()` and check the `success` field before using `data`. 235 + 236 + ### History Helper API 237 + 238 + ```javascript 239 + // Access via window.datastoreHistory 240 + 241 + // Track navigation 242 + window.datastoreHistory.trackNavigation(uri, { 243 + source, sourceId, windowType, duration, metadata 244 + }) 245 + 246 + // Query history 247 + window.datastoreHistory.getHistory(filter) 248 + 249 + // Get frequent addresses 250 + window.datastoreHistory.getFrequentAddresses(limit) 251 + 252 + // Get recent addresses 253 + window.datastoreHistory.getRecentAddresses(limit) 254 + ``` 255 + 256 + ## Example Queries (Using IPC API) 257 + 258 + ### Get Recent Navigation History 259 + ```javascript 260 + const recentVisits = await window.datastoreHistory.getHistory({ limit: 20 }); 261 + // Returns: [{ id, addressId, timestamp, source, address: {...} }, ...] 262 + ``` 263 + 264 + ### Find Most Visited Sites 265 + ```javascript 266 + const result = await api.datastore.queryAddresses({ 267 + sortBy: 'visitCount', 268 + limit: 10 269 + }); 270 + 271 + if (result.success) { 272 + const frequent = result.data; 273 + // Use frequent addresses... 
274 + } 275 + ``` 276 + 277 + ### Get All Script Results That Changed 278 + ```javascript 279 + const tableResult = await api.datastore.getTable('scripts_data'); 280 + 281 + if (tableResult.success) { 282 + const changedResults = Object.entries(tableResult.data) 283 + .filter(([id, row]) => row.changed === 1) 284 + .map(([id, row]) => ({ id, ...row })); 285 + } 286 + ``` 287 + 288 + ### Get Script History for Specific Script 289 + ```javascript 290 + const scriptId = 'my-script-id'; 291 + const tableResult = await api.datastore.getTable('scripts_data'); 292 + 293 + if (tableResult.success) { 294 + const history = Object.entries(tableResult.data) 295 + .filter(([id, row]) => row.scriptId === scriptId) 296 + .sort((a, b) => b[1].extractedAt - a[1].extractedAt) 297 + .map(([id, row]) => ({ id, ...row })); 298 + } 299 + ``` 300 + 301 + ### Get All Markdown Content 302 + ```javascript 303 + const result = await api.datastore.getTable('content'); 304 + 305 + if (result.success) { 306 + const markdown = Object.entries(result.data) 307 + .filter(([id, row]) => row.contentType === 'markdown') 308 + .map(([id, row]) => ({ id, ...row })); 309 + } 310 + ``` 311 + 312 + ### Get Starred Addresses 313 + ```javascript 314 + const result = await api.datastore.queryAddresses({ starred: 1 }); 315 + 316 + if (result.success) { 317 + const starred = result.data; 318 + } 319 + ``` 320 + 321 + ## What's Working 322 + 323 + ✅ **Datastore initialization** - Loads on app startup 324 + ✅ **Schema enforcement** - TinyBase validates all data 325 + ✅ **Indexes** - Fast queries by domain, tag, timestamp, etc. 326 + ✅ **Relationships** - Visits→Addresses, Blobs→Content, etc. 
✅ **Metrics** - Automatic aggregations (counts, averages)
✅ **Scripts tracking** - All extractions saved with history
✅ **Navigation tracking** - Peeks and slides log visits
✅ **Automatic address creation** - No duplicates, proper linking
✅ **Change detection** - Scripts know when data changes
✅ **Visit statistics** - Count and last visit time updated

## What's NOT Done Yet

⏭️ **Binary file storage** - Blobs table schema exists but no file I/O
⏭️ **Markdown sync** - Content table ready but no filesystem bidirectional sync
⏭️ **Persistence** - Currently in-memory only (need IndexedDB or SQLite persister)
⏭️ **Groups feature** - Not integrated yet
⏭️ **Cmd feature** - Not integrated yet
⏭️ **Navigation in main windows** - Only tracking peek/slide navigation
⏭️ **Duration tracking** - Visits record duration=0, needs window close tracking
⏭️ **Scroll depth tracking** - Schema ready but not implemented
⏭️ **Search/filtering UI** - No UI to browse datastore yet

## Next Steps

### Phase 1: Persistence (Critical)
1. Add IndexedDB persister (TinyBase has built-in support)
2. Auto-save on changes
3. Load from IndexedDB on startup
4. Verify data persists across app restarts

### Phase 2: Enhanced Tracking
1. Track duration when windows close
2. Track scroll depth and interaction
3. Add navigation tracking to groups/cmd features
4. Track main window navigation (not just peeks/slides)

### Phase 3: Binary Storage
1. Implement filesystem storage for blobs
2. Add image/video download capability
3. Generate thumbnails
4. Link blobs to addresses and content

### Phase 4: Filesystem Sync
1. Implement bidirectional markdown sync
2. Watch filesystem for changes
3. Sync content table to markdown files
4. Handle conflicts

### Phase 5: UI & Features
1. Build history browser UI
2. Add search interface
3. Create feeds viewer
4. Implement tagging UI
5. Show stats dashboard

## Testing

### IPC API Testing

A test page was created at `app/datastore/test-ipc.html` to verify all IPC handlers work correctly.

To test in the app:
1. Start Peek: `npm run debug`
2. Open a peek (Alt+0-9) - navigation tracked via IPC
3. Open a slide (Alt+arrows) - navigation tracked via IPC
4. Configure and run a script - results saved via IPC
5. Check main process console for IPC handler logs

### Verified Working
- ✅ `datastore-add-address` - Address creation with URL parsing
- ✅ `datastore-get-address` - Address retrieval
- ✅ `datastore-update-address` - Address updates
- ✅ `datastore-query-addresses` - Query with filters and sorting
- ✅ `datastore-add-visit` - Visit tracking with stat updates
- ✅ `datastore-query-visits` - Visit history queries
- ✅ `datastore-add-content` - Content creation
- ✅ `datastore-get-table` - Table access
- ✅ `datastore-set-row` - Direct row manipulation
- ✅ `datastore-get-stats` - Statistics aggregation

## Performance Notes

- **In-memory storage**: Fast but needs persistence
- **Small overhead**: TinyBase is 5-11kB gzipped
- **Reactive**: Changes trigger index/metric updates automatically
- **Scalable**: Tested with addresses, visits, content - all working

## Architecture Benefits

✅ **Complete separation**: Datastore isolated in main process, features access via IPC
✅ **Runtime portable**: Can migrate to Tauri without changing feature code
✅ **Backend flexible**: Can swap TinyBase for SQLite, cloud, etc. without feature changes
✅ **Cloud ready**: Same API can route to local or remote datastores
✅ **Mobile ready**: Architecture supports future mobile app development
✅ **Secure**: Datastore logic in trusted process with validated IPC access
✅ **Type safety**: Schema validation prevents bad data
✅ **Reactive**: Indexes and metrics update automatically (in main process)
✅ **Testable**: IPC handlers individually tested and verified
✅ **Extensible**: Easy to add new IPC handlers and tables
✅ **Sync ready**: Built-in CRDT support for future multi-device sync

**Key Decision**: IPC-based architecture chosen over direct library access for maximum portability and flexibility. See [datastore-architecture.md](./datastore-architecture.md) for complete rationale.

## Conclusion

The IPC-based datastore integration is **functional and working** for:
- Script data extraction and history (via async IPC)
- Navigation history from peeks and slides (via history helpers)
- Address management with automatic deduplication
- Visit tracking with statistics
- Complete isolation between storage and UI layers

The foundation is solid with:
- ✅ All IPC handlers tested and verified
- ✅ Async/await throughout for clean code
- ✅ Error handling with structured responses
- ✅ Main process datastore initialization
- ✅ Preload script API exposure
- ✅ Feature integration complete

Ready for next phases: **persistence**, enhanced tracking, and UI development.

**For complete architectural details**, see [datastore-architecture.md](./datastore-architecture.md).
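
The address deduplication, visit statistics, and structured responses described above can be illustrated with a small sketch. This is not Peek's actual handler code: `createStore`, `findAddressId`, `addVisit`, and the `{ success, data | error }` response shape are hypothetical names chosen for illustration, and plain `Map`s stand in for the TinyBase tables.

```javascript
// Illustrative sketch only; every name here is an assumption, not Peek's real API.
// Plain Maps stand in for the `addresses` and `visits` tables.
function createStore() {
  return { addresses: new Map(), visits: new Map(), nextId: 1 };
}

// Look up an existing address row by URI so repeat visits never create duplicates.
function findAddressId(store, uri) {
  for (const [id, row] of store.addresses) {
    if (row.uri === uri) return id;
  }
  return null;
}

// Record a visit, creating the address on first sight and updating its stats.
// Returns a structured response instead of throwing.
function addVisit(store, rawUri, { source = 'direct', now = Date.now() } = {}) {
  try {
    const url = new URL(rawUri); // throws a TypeError on malformed input
    let addressId = findAddressId(store, url.href);
    if (addressId === null) {
      addressId = `addr_${store.nextId++}`;
      store.addresses.set(addressId, {
        uri: url.href,
        protocol: url.protocol.replace(':', ''),
        domain: url.hostname,
        path: url.pathname,
        createdAt: now,
        visitCount: 0,
        lastVisitAt: 0,
      });
    }

    const visitId = `visit_${store.nextId++}`;
    store.visits.set(visitId, { addressId, timestamp: now, duration: 0, source });

    // Keep the cached statistics on the address row in step with the visit log.
    const address = store.addresses.get(addressId);
    address.visitCount += 1;
    address.lastVisitAt = now;

    return { success: true, data: { addressId, visitId } };
  } catch (err) {
    return { success: false, error: err.message };
  }
}

const store = createStore();
const first = addVisit(store, 'https://example.com/article', { source: 'peek', now: 1000 });
const second = addVisit(store, 'https://example.com/article', { source: 'slide', now: 2000 });
const bad = addVisit(store, 'not a url');
```

Both successful calls resolve to the same address row, and the malformed URI yields `success: false` without writing a visit. In Peek, logic of this kind would presumably sit behind an IPC handler such as `datastore-add-visit`, so renderer code only ever sees the structured response.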
+307
notes/datastore-research.md
# Datastore Technology Research & Comparison

Research conducted: 2025-11-12

## Requirements Summary (from datastore.md)

### Primary Requirements
- Store various data types with metadata (tags, MIME types, annotations)
- Store feeds (navigation history, timeseries data, custom feeds)
- Binary file support (images, videos) with filesystem references
- Bidirectional filesystem sync for markdown/text files
- Runtime/browser engine agnostic
- Designed for sync (multi-device, cloud, collaboration)
- Query capabilities (by type, tags, time, etc.)

### Performance Requirements
- Fast local queries
- Efficient indexing
- Handle potentially large datasets (navigation history)
- Reactive updates for UI

---

## Technology Comparison

### 1. TinyBase

**Overview**: Reactive data store with built-in sync engine

**Pros:**
- ✅ **Tiny size**: 5.3kB-11.7kB gzipped, zero dependencies
- ✅ **Reactive queries**: Built-in reactivity with granular listeners
- ✅ **Native CRDT support**: Deterministic sync across clients
- ✅ **Multiple persistence options**: IndexedDB, SQLite, PostgreSQL, files, OPFS
- ✅ **Schema support**: Optional typed schemas with constraints and defaults
- ✅ **Advanced queries**: TinyQL language, indexes, metrics, relationships
- ✅ **Sync built-in**: WebSocket, BroadcastChannel, custom mediums
- ✅ **Can integrate with**: Yjs, Automerge, CR-SQLite for additional CRDT options
- ✅ **100% test coverage**: Well-tested and documented

**Cons:**
- ⚠️ In-memory first (requires persistence layer configuration)
- ⚠️ Newer library (less battle-tested than SQLite)
- ⚠️ Learning curve for TinyQL query language
- ⚠️ Limited ecosystem compared to SQL

**Fit for Peek:**
- **Data modeling**: ★★★★★ (supports both key-value and tabular)
- **Metadata/tags**: ★★★★★ (schemas, indexes, flexible structure)
- **Navigation history**: ★★★★★ (indexes, metrics for aggregations)
- **Binary files**: ★★★☆☆ (would need separate blob storage + references)
- **Filesystem sync**: ★★★☆☆ (can persist to files, bidirectional needs custom logic)
- **Collaboration**: ★★★★★ (native CRDT support, built-in sync)
- **Runtime agnostic**: ★★★★★ (works anywhere JS runs)
- **Performance**: ★★★★★ (optimized, minimal overhead)

**Best for**: Reactive UIs, real-time collaboration, local-first apps with sync

---

### 2. Automerge

**Overview**: JSON-like CRDT for collaborative applications

**Pros:**
- ✅ **Built for collaboration**: Automatic conflict-free merging
- ✅ **Offline-first**: Full functionality offline, queues changes
- ✅ **Versioning**: Complete change history, branching, time travel
- ✅ **High performance**: Compressed columnar storage, handles millions of changes
- ✅ **Automerge Repo**: Built-in sync server backend
- ✅ **Multi-language**: Rust core with JS, Swift, Python, C, Java bindings
- ✅ **Framework integration**: React, Prosemirror, CodeMirror plugins
- ✅ **Actively maintained**: Recent 3.0 release with 10x memory reduction

**Cons:**
- ⚠️ **Not a database**: More of a data structure/sync protocol
- ⚠️ **Requires additional storage**: Need separate persistence layer
- ⚠️ **Learning curve**: CRDT concepts and document-based model
- ⚠️ **Query limitations**: No SQL-like queries, need to build on top
- ⚠️ **Larger size**: More overhead than minimal solutions
- ⚠️ **Best for documents**: JSON-like data, less suited for relational queries

**Fit for Peek:**
- **Data modeling**: ★★★☆☆ (JSON-like, need to structure carefully)
- **Metadata/tags**: ★★★★☆ (flexible JSON structure)
- **Navigation history**: ★★★☆☆ (can store, but querying is manual)
- **Binary files**: ★☆☆☆☆ (not designed for blobs)
- **Filesystem sync**: ★★★★☆ (excellent sync, but need custom file integration)
- **Collaboration**: ★★★★★ (core strength, best-in-class)
- **Runtime agnostic**: ★★★★★ (Rust core, multiple language bindings)
- **Performance**: ★★★★☆ (good for sync, less optimized for queries)

**Best for**: Collaborative documents, offline-first sync, version control needs

---

### 3. SQLite (via better-sqlite3)

**Overview**: Traditional relational database, synchronous Node.js bindings

**Pros:**
- ✅ **Battle-tested**: Decades of production use, extremely reliable
- ✅ **SQL queries**: Powerful relational queries, joins, aggregations
- ✅ **Fast**: 2000+ queries/sec possible with proper indexing
- ✅ **Small overhead**: Single file database, minimal dependencies
- ✅ **Full-text search**: Built-in FTS5 for text searching
- ✅ **Transactions**: ACID compliance, WAL mode for performance
- ✅ **Synchronous API**: Simpler than async (better-sqlite3)
- ✅ **Widely known**: Easier to find developers/documentation
- ✅ **JSON support**: JSON1 extension for flexible data

**Cons:**
- ⚠️ **No built-in sync**: Need to build custom sync layer
- ⚠️ **No CRDT support**: Conflicts require manual resolution
- ⚠️ **Not reactive**: Need to build change listeners
- ⚠️ **File locking**: Single writer, can cause issues with sync
- ⚠️ **Electron specific**: Need rebuild for Electron compatibility
- ⚠️ **Main thread blocking**: Synchronous operations can freeze UI

**Fit for Peek:**
- **Data modeling**: ★★★★★ (relational, flexible schemas)
- **Metadata/tags**: ★★★★★ (relations, indexes, JSON fields)
- **Navigation history**: ★★★★★ (perfect for timeseries queries)
- **Binary files**: ★★★★☆ (can store blobs or references efficiently)
- **Filesystem sync**: ★★☆☆☆ (can persist, but bidirectional sync is complex)
- **Collaboration**: ★☆☆☆☆ (no native sync, requires significant custom work)
- **Runtime agnostic**: ★★★☆☆ (SQLite is portable, but bindings are platform-specific)
- **Performance**: ★★★★★ (excellent for local queries)

**Best for**: Complex queries, relational data, local-only or simple sync needs

---

### 4. Dexie.js

**Overview**: IndexedDB wrapper with promise-based API

**Pros:**
- ✅ **Simple API**: Much easier than raw IndexedDB
- ✅ **Live queries**: Reactive liveQuery() function
- ✅ **Advanced queries**: Case-insensitive search, prefix matching, OR operations
- ✅ **Browser-native**: Uses IndexedDB, no external dependencies
- ✅ **Real classes**: Map classes to tables
- ✅ **Performance optimized**: Bulk operations, batching
- ✅ **Cross-platform**: Browsers, Electron, Capacitor, PWAs
- ✅ **Widely used**: 100,000+ projects, battle-tested
- ✅ **Dexie Cloud**: Optional commercial sync add-on
- ✅ **Bug workarounds**: Handles IndexedDB inconsistencies

**Cons:**
- ⚠️ **IndexedDB limitations**: Key-value store, limited query capabilities
- ⚠️ **No built-in sync**: Need Dexie Cloud (commercial) or custom solution
- ⚠️ **Browser-focused**: Less ideal for Node.js/backend
- ⚠️ **Larger bundle**: 33.1kB minified+gzipped
- ⚠️ **Schema migrations**: Can be tricky with IndexedDB

**Fit for Peek:**
- **Data modeling**: ★★★★☆ (key-value with indexes, flexible)
- **Metadata/tags**: ★★★★☆ (can index and query efficiently)
- **Navigation history**: ★★★★☆ (good for timeseries with indexes)
- **Binary files**: ★★★★☆ (IndexedDB can store blobs)
- **Filesystem sync**: ★★☆☆☆ (browser-focused, no native file sync)
- **Collaboration**: ★★☆☆☆ (Dexie Cloud or custom sync needed)
- **Runtime agnostic**: ★★★☆☆ (browser/Electron focused)
- **Performance**: ★★★★☆ (good for browser workloads)

**Best for**: Browser-based apps, Electron apps not needing server sync

---

### 5. PouchDB

**Overview**: CouchDB-compatible database for browser and Node.js

**Status**: ⚠️ **Declining ecosystem** - Removed from RxDB, fewer active projects

**Pros:**
- ✅ **CouchDB sync**: Seamless replication with CouchDB servers
- ✅ **Offline-first**: Designed for offline operation
- ✅ **Multi-platform**: Browser (IndexedDB), Node.js (LevelDB)
- ✅ **Change notifications**: Listen to database changes
- ✅ **Document-based**: Flexible JSON documents

**Cons:**
- ⚠️ **Declining support**: Being phased out of modern projects
- ⚠️ **Performance issues**: Slower than alternatives
- ⚠️ **Large bundle size**: More overhead than newer solutions
- ⚠️ **Complex replication**: CouchDB protocol has quirks
- ⚠️ **Limited queries**: Map-reduce only, no SQL-like queries

**Recommendation**: ❌ **Not recommended** for new projects in 2025

---

## Hybrid Approaches

### Option A: TinyBase + File Storage
- Use TinyBase for structured data, indexes, queries
- Use filesystem for binary files (referenced by hash/ID in TinyBase)
- Leverage TinyBase's native CRDT for sync
- Add custom filesystem sync for markdown bidirectional sync

**Pros**: Best of both worlds, reactive, built-in sync
**Cons**: Need to manage two storage systems

### Option B: SQLite + Automerge
- Use SQLite for local queries and storage
- Use Automerge for sync protocol
- Translate between SQLite and Automerge documents

**Pros**: Powerful queries + best-in-class sync
**Cons**: Complex integration, two systems to maintain

### Option C: TinyBase with SQLite Persistence
- Use TinyBase API and reactivity
- Persist to SQLite for durability and querying
- Best of reactive store + SQL power

**Pros**: Reactive + SQL + sync capabilities
**Cons**: Some complexity in persistence layer

---

## Recommendations

### For Peek v1 (MVP): **TinyBase + File Storage**

**Rationale:**
1. **Meets all requirements**: Handles structured data, metadata, tags, history
2. **Sync built-in**: Native CRDT support for future multi-device sync
3. **Reactive**: Perfect for Peek's modal, event-driven UI
4. **Small footprint**: Minimal bundle size (5-11kB)
5. **Flexible persistence**: Can use SQLite backend if needed later
6. **Runtime agnostic**: Works anywhere JS runs
7. **Active development**: Well-maintained, modern codebase

**Architecture:**
```
TinyBase Store
├── addresses (table)
├── visits (table)
├── notes (table)
├── tags (table)
└── blobs (table - metadata only)

Filesystem
└── blobs/
    ├── {hash}.jpg
    ├── {hash}.png
    └── {hash}.pdf

Markdown Sync
└── notes/
    ├── note1.md (bidirectional sync)
    └── note2.md
```

**Implementation phases:**
1. Start with TinyBase in-memory + IndexedDB persistence
2. Add file storage for binaries
3. Add markdown filesystem sync
4. Add SQLite persistence option for performance
5. Enable sync features for multi-device support

### For Future (v2+): **Add Automerge for Advanced Collaboration**

If Peek expands to real-time collaboration scenarios:
- Use TinyBase for local store and queries
- Add Automerge for document-level collaboration
- Use Automerge Repo for sync infrastructure

---

## Decision Matrix

| Feature | TinyBase | Automerge | SQLite | Dexie | PouchDB |
|---------|----------|-----------|--------|-------|---------|
| Local Queries | ★★★★★ | ★★☆☆☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
| Sync Built-in | ★★★★★ | ★★★★★ | ★☆☆☆☆ | ★★☆☆☆ | ★★★★☆ |
| Reactivity | ★★★★★ | ★★★☆☆ | ★☆☆☆☆ | ★★★★☆ | ★★★☆☆ |
| Binary Storage | ★★★☆☆ | ★☆☆☆☆ | ★★★★☆ | ★★★★☆ | ★★★☆☆ |
| Size/Performance | ★★★★★ | ★★★★☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
| Ecosystem | ★★★☆☆ | ★★★★☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
| Learning Curve | ★★★☆☆ | ★★☆☆☆ | ★★★★★ | ★★★★☆ | ★★★☆☆ |
| **Total** | **29/35** | **21/35** | **26/35** | **26/35** | **19/35** |

---

## Next Steps

1. ✅ Complete research phase
2. ⏭️ Prototype TinyBase with basic CRUD operations
3. ⏭️ Test with Peek use case (storing URLs from peeks)
4. ⏭️ Evaluate performance with realistic data volumes
5. ⏭️ Design schema for addresses, visits, notes, metadata
6. ⏭️ Implement file storage integration
7. ⏭️ Build datastore API for Peek features

---

## References

- TinyBase: https://tinybase.org/
- Automerge: https://automerge.org/
- better-sqlite3: https://github.com/WiseLibs/better-sqlite3
- Dexie.js: https://dexie.org/
- PouchDB: https://pouchdb.com/
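
As a cross-check on the decision matrix, the Total row can be recomputed from the star ratings. The sketch below assumes each total is a plain unweighted sum of the seven criteria (the notes do not state a weighting); the `ratings` object is a transcription of the star counts, listed in the matrix's row order.

```javascript
// Star counts per criterion, in matrix row order:
// Local Queries, Sync Built-in, Reactivity, Binary Storage,
// Size/Performance, Ecosystem, Learning Curve.
const ratings = {
  TinyBase:  [5, 5, 5, 3, 5, 3, 3],
  Automerge: [2, 5, 3, 1, 4, 4, 2],
  SQLite:    [5, 1, 1, 4, 5, 5, 5],
  Dexie:     [4, 2, 4, 4, 4, 4, 4],
  PouchDB:   [2, 4, 3, 3, 2, 2, 3],
};

// Unweighted sum per technology (an assumption; a weighted sum is equally plausible).
const totals = Object.fromEntries(
  Object.entries(ratings).map(([name, stars]) => [
    name,
    stars.reduce((sum, n) => sum + n, 0),
  ])
);

// Highest total first.
const ranked = Object.entries(totals).sort((a, b) => b[1] - a[1]);

console.log(totals);
console.log(ranked.map(([name, total]) => `${name}: ${total}/35`).join(', '));
```

An unweighted sum treats every criterion as equally important. If sync and reactivity matter more to Peek than, say, ecosystem size, multiplying each column by a weight before summing would be a natural variation.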
+892
notes/datastore-schema.md
# Peek Datastore Schema Design

Version: 1.0
Date: 2025-11-12
Technology: TinyBase

---

## Overview

This schema design uses TinyBase's tabular data model with the following principles:
- Each table stores a specific entity type
- Relationships via ID references
- Flexible metadata using JSON cells
- Indexes for common queries
- Designed for reactivity and efficient queries

---

## Core Tables

### 1. `addresses` - URL/URI Index

Stores all web addresses and URIs that Peek interacts with.

```javascript
{
  rowId: string,           // Auto-generated or hash of URI
  cells: {
    uri: string,           // The full URI (required)
    protocol: string,      // http, https, ipfs, etc.
    domain: string,        // Extracted domain for querying
    path: string,          // URL path component
    title: string,         // Page title (if known)
    mimeType: string,      // Content MIME type
    favicon: string,       // Favicon URL or data URI
    description: string,   // Meta description or user note
    tags: string,          // Comma-separated tag IDs (for indexing)
    metadata: string,      // JSON string for flexible metadata
    createdAt: number,     // Unix timestamp (ms)
    updatedAt: number,     // Unix timestamp (ms)
    lastVisitAt: number,   // Unix timestamp of most recent visit
    visitCount: number,    // Total number of visits
    starred: number,       // 0 or 1 (boolean for indexing)
    archived: number       // 0 or 1 (boolean for indexing)
  }
}
```

**Indexes:**
- `byDomain` - Group by domain for domain-level queries
- `byProtocol` - Filter by protocol type
- `byTag` - Index on tags field for tag filtering
- `byStarred` - Quick access to starred addresses
- `byLastVisit` - Sort by most recent visit

**Example Row:**
```javascript
{
  'addr_1234': {
    uri: 'https://example.com/article',
    protocol: 'https',
    domain: 'example.com',
    path: '/article',
    title: 'Example Article',
    mimeType: 'text/html',
    favicon: 'https://example.com/favicon.ico',
    description: 'An interesting article',
    tags: 'tag_1,tag_5',
    metadata: '{"author":"John","lang":"en"}',
    createdAt: 1699564800000,
    updatedAt: 1699564800000,
    lastVisitAt: 1699651200000,
    visitCount: 5,
    starred: 1,
    archived: 0
  }
}
```

---

### 2. `visits` - Navigation History

Tracks every visit to an address with temporal data.

```javascript
{
  rowId: string,           // Auto-generated unique ID
  cells: {
    addressId: string,     // Reference to addresses table (required)
    timestamp: number,     // Unix timestamp when visit occurred
    duration: number,      // Time spent in milliseconds (0 if unknown)
    source: string,        // How arrived: 'peek', 'slide', 'direct', 'link', etc.
    sourceId: string,      // ID of source feature if applicable
    windowType: string,    // 'modal', 'persistent', 'main', etc.
    metadata: string,      // JSON string for flexible data
    scrollDepth: number,   // Percentage scrolled (0-100)
    interacted: number     // 0 or 1 (clicked, typed, etc.)
  }
}
```

**Indexes:**
- `byAddress` - Group visits by address
- `byTimestamp` - Sort chronologically
- `bySource` - Filter by entry source
- `byDate` - Index by date (derived from timestamp)

**Example Row:**
```javascript
{
  'visit_5678': {
    addressId: 'addr_1234',
    timestamp: 1699651200000,
    duration: 45000,
    source: 'peek',
    sourceId: 'peek_3',
    windowType: 'modal',
    metadata: '{"referrer":"addr_9999"}',
    scrollDepth: 80,
    interacted: 1
  }
}
```

---

### 3. `content` - Text Content

Stores any text-based content: notes, CSV data, plain text, markdown documents, code snippets, etc.
May or may not be linked to addresses.

```javascript
{
  rowId: string,           // Auto-generated unique ID
  cells: {
    title: string,         // Content title or description
    content: string,       // The actual text content
    mimeType: string,      // text/markdown, text/plain, text/csv, text/html, application/json, etc.
    contentType: string,   // Coarse type for easier querying: 'markdown', 'plain', 'csv', 'json', 'html', 'code'
    language: string,      // Language/syntax if code (js, py, etc.) or human language (en, es)
    encoding: string,      // Character encoding (default: utf-8)
    tags: string,          // Comma-separated tag IDs
    addressRefs: string,   // Comma-separated address IDs this content references or was sourced from
    parentId: string,      // Parent content ID for hierarchies (optional)
    metadata: string,      // JSON string for flexible metadata (headers for CSV, etc.)
    createdAt: number,     // Unix timestamp
    updatedAt: number,     // Unix timestamp
    syncPath: string,      // Filesystem path if synced (e.g., 'content/data.csv', 'notes/note.md')
    synced: number,        // 0 or 1 - whether synced to filesystem
    starred: number,       // 0 or 1
    archived: number       // 0 or 1
  }
}
```

**Indexes:**
- `byTag` - Filter by tags
- `byContentType` - Filter by content type (markdown, csv, plain, etc.)
- `byMimeType` - Filter by specific MIME type
- `byAddress` - Content referencing specific addresses
- `bySynced` - Find filesystem-synced content
- `byUpdated` - Sort by most recently updated

**Example Rows:**

*Markdown note:*
```javascript
{
  'content_9012': {
    title: 'Meeting Notes - Project Kick-off',
    content: '# Project Kick-off\n\n- Discuss goals\n- Set timeline',
    mimeType: 'text/markdown',
    contentType: 'markdown',
    language: 'en',
    encoding: 'utf-8',
    tags: 'tag_2,tag_8',
    addressRefs: 'addr_1234,addr_5678',
    parentId: '',
    metadata: '{"mood":"productive","location":"office"}',
    createdAt: 1699564800000,
    updatedAt: 1699651200000,
    syncPath: 'content/meeting-2024-11-12.md',
    synced: 1,
    starred: 0,
    archived: 0
  }
}
```

*CSV data:*
```javascript
{
  'content_9013': {
    title: 'Product Price List',
    content: 'product,price,stock\nWidget,19.99,150\nGadget,29.99,87',
    mimeType: 'text/csv',
    contentType: 'csv',
    language: '',
    encoding: 'utf-8',
    tags: 'tag_5',
    addressRefs: 'addr_shop',
    parentId: '',
    metadata: '{"delimiter":"comma","hasHeader":true,"columns":3}',
    createdAt: 1699564800000,
    updatedAt: 1699651200000,
    syncPath: 'content/prices.csv',
    synced: 1,
    starred: 0,
    archived: 0
  }
}
```

*Code snippet:*
```javascript
{
  'content_9014': {
    title: 'Auth Helper Function',
    content: 'function authenticate(user, pass) {\n return hash(pass) === user.hash;\n}',
    mimeType: 'text/javascript',
    contentType: 'code',
    language: 'javascript',
    encoding: 'utf-8',
    tags: 'tag_7',
    addressRefs: '',
    parentId: '',
    metadata: '{"syntax":"js","lines":3}',
    createdAt: 1699564800000,
    updatedAt: 1699564800000,
    syncPath: '',
    synced: 0,
    starred: 1,
    archived: 0
  }
}
```

---

### 4. `tags` - Tag Taxonomy

Hierarchical tag system for organizing all entities.

```javascript
{
  rowId: string,           // Auto-generated unique ID
  cells: {
    name: string,          // Tag name (required, unique)
    slug: string,          // URL-safe version of name
    color: string,         // Hex color for UI (#FF5733)
    parentId: string,      // Parent tag ID for hierarchies
    description: string,   // Tag description
    metadata: string,      // JSON string for flexible metadata
    createdAt: number,     // Unix timestamp
    updatedAt: number,     // Unix timestamp
    usageCount: number     // Cached count of how many times used
  }
}
```

**Indexes:**
- `byName` - Lookup by name
- `byParent` - Find child tags
- `byUsage` - Sort by popularity

**Example Row:**
```javascript
{
  'tag_1': {
    name: 'Work',
    slug: 'work',
    color: '#3498db',
    parentId: '',
    description: 'Work-related content',
    metadata: '{}',
    createdAt: 1699564800000,
    updatedAt: 1699564800000,
    usageCount: 150
  }
}
```

---

### 5. `blobs` - Binary File References

Metadata index for binary files (images, videos, PDFs, etc.).
Actual files stored in filesystem at `{userData}/{PROFILE}/datastore/blobs/`

```javascript
{
  rowId: string,           // Content hash (SHA-256) serves as ID
  cells: {
    filename: string,      // Original filename
    mimeType: string,      // MIME type (image/jpeg, video/mp4, application/pdf, etc.)
    mediaType: string,     // Coarse type: 'image', 'video', 'audio', 'document', 'archive'
    size: number,          // File size in bytes
    hash: string,          // Content hash (same as rowId, for convenience)
    extension: string,     // File extension (.jpg, .mp4, etc.)
    path: string,          // Relative path in blob storage
    addressId: string,     // Source address if downloaded from web
    contentId: string,     // Associated content item if any
    tags: string,          // Comma-separated tag IDs
    metadata: string,      // JSON: dimensions, duration, EXIF, etc.
    createdAt: number,     // Unix timestamp when added
    width: number,         // Image/video width (if applicable)
    height: number,        // Image/video height (if applicable)
    duration: number,      // Audio/video duration in seconds (if applicable)
    thumbnail: string      // Path to thumbnail if generated
  }
}
```

**Indexes:**
- `byMediaType` - Filter by media type
- `byMimeType` - Filter by MIME type
- `byAddress` - Find blobs from specific address
- `byTag` - Filter by tags
- `byDate` - Sort by date added

**Example Row:**
```javascript
{
  'sha256_abc123...': {
    filename: 'screenshot.png',
    mimeType: 'image/png',
    mediaType: 'image',
    size: 1024768,
    hash: 'sha256_abc123...',
    extension: '.png',
    path: 'blobs/sha256_abc123.png',
    addressId: 'addr_1234',
    contentId: 'content_9012',
    tags: 'tag_3',
    metadata: '{"exif":{"camera":"iPhone"},"location":"home"}',
    createdAt: 1699564800000,
    width: 1920,
    height: 1080,
    duration: 0,
    thumbnail: 'blobs/thumbs/sha256_abc123_thumb.jpg'
  }
}
```

---

### 6. `scripts_data` - Script Extraction Results

Stores data extracted by background Scripts feature.

```javascript
{
  rowId: string,           // Auto-generated unique ID
  cells: {
    scriptId: string,      // ID of script that extracted this data
    scriptName: string,    // Script name for easier querying
    addressId: string,     // Source address
    selector: string,      // CSS selector used
    content: string,       // Extracted content
    contentType: string,   // text, number, html, json, etc.
    metadata: string,      // JSON string for flexible metadata
    extractedAt: number,   // Unix timestamp when extracted
    previousValue: string, // Previous value for change detection
    changed: number        // 0 or 1 - whether changed since last run
  }
}
```

**Indexes:**
- `byScript` - Group by script
- `byAddress` - Filter by source address
- `byTimestamp` - Sort chronologically
- `byChanged` - Find changed values

**Example Row:**
```javascript
{
  'script_data_3456': {
    scriptId: 'script_1',
    scriptName: 'Weather Monitor',
    addressId: 'addr_weather',
    selector: '.temperature',
    content: '72°F',
    contentType: 'text',
    metadata: '{"unit":"fahrenheit","location":"SF"}',
    extractedAt: 1699651200000,
    previousValue: '70°F',
    changed: 1
  }
}
```

---

### 7. `feeds` - Custom Feed Definitions

Defines custom feeds and their queries/sources.

```javascript
{
  rowId: string,           // Auto-generated unique ID
  cells: {
    name: string,          // Feed name
    description: string,   // Feed description
    type: string,          // 'query', 'script', 'external', 'aggregated'
    query: string,         // Query definition (TinyQL or JSON query object)
    schedule: string,      // Cron-like schedule for updates (if applicable)
    source: string,        // External URL or internal source
    tags: string,          // Comma-separated tag IDs
    metadata: string,      // JSON string for flexible metadata
    createdAt: number,     // Unix timestamp
    updatedAt: number,     // Unix timestamp
    lastFetchedAt: number, // Unix timestamp of last update
    enabled: number        // 0 or 1 - whether feed is active
  }
}
```

**Indexes:**
- `byType` - Filter by feed type
- `byEnabled` - Find active feeds
- `byTag` - Filter by tags

**Example Row:**
```javascript
{
  'feed_7890': {
    name: 'Recent Work Links',
    description: 'Links tagged work from last 7 days',
    type: 'query',
    query: '{"table":"addresses","where":{"tags":"tag_1"},"since":"7d"}',
    schedule: '0 9 * * *',
    source: 'internal',
    tags: 'tag_1',
    metadata: '{"format":"rss"}',
    createdAt: 1699564800000,
    updatedAt: 1699651200000,
    lastFetchedAt: 1699651200000,
    enabled: 1
  }
}
```

---

## Schema Definition (TinyBase Format)

```javascript
const schema = {
  addresses: {
    uri: { type: 'string' },
    protocol: { type: 'string', default: 'https' },
    domain: { type: 'string' },
    path: { type: 'string', default: '' },
    title: { type: 'string', default: '' },
    mimeType: { type: 'string', default: 'text/html' },
    favicon: { type: 'string', default: '' },
    description: { type: 'string', default: '' },
    tags: { type: 'string', default: '' },
    metadata: { type: 'string', default: '{}' },
    createdAt: { type: 'number' },
    updatedAt: { type: 'number' },
    lastVisitAt: { type: 'number', default: 0 },
    visitCount: { type: 'number', default: 0 },
    starred: { type: 'number', default: 0 },
    archived: { type: 'number', default: 0 }
  },

  visits: {
    addressId: { type: 'string' },
    timestamp: { type: 'number' },
    duration: { type: 'number', default: 0 },
    source: { type: 'string', default: 'direct' },
    sourceId: { type: 'string', default: '' },
    windowType: { type: 'string', default: 'main' },
    metadata: { type: 'string', default: '{}' },
    scrollDepth: { type: 'number', default: 0 },
    interacted: { type: 'number', default: 0 }
  },

  content: {
    title: { type: 'string', default: 'Untitled' },
    content: { type: 'string', default: '' },
    mimeType: { type: 'string', default: 'text/plain' },
    contentType: { type: 'string', default: 'plain' },
    language: { type: 'string', default: '' },
    encoding: { type: 'string', default: 'utf-8' },
    tags: { type: 'string', default: '' },
    addressRefs: { type: 'string', default: '' },
    parentId: { type: 'string', default: '' },
    metadata: { type: 'string', default: '{}' },
    createdAt: { type: 'number' },
    updatedAt: { type: 'number' },
    syncPath: { type: 'string', default: '' },
    synced: { type: 'number', default: 0 },
    starred: { type: 'number', default: 0 },
    archived: { type: 'number', default: 0 }
  },

  tags: {
    name: { type: 'string' },
    slug: { type: 'string' },
    color: { type: 'string', default: '#999999' },
    parentId: { type: 'string', default: '' },
    description: { type: 'string', default: '' },
    metadata: { type: 'string', default: '{}' },
    createdAt: { type: 'number' },
    updatedAt: { type: 'number' },
    usageCount: { type: 'number', default: 0 }
  },

  blobs: {
    filename: { type: 'string' },
    mimeType: { type: 'string' },
    mediaType: { type: 'string' },
    size: { type: 'number' },
    hash: { type: 'string' },
    extension: { type: 'string' },
    path: { type: 'string' },
    addressId: { type: 'string', default: '' },
    contentId: { type: 'string', default: '' },
    tags: { type: 'string', default: '' },
    metadata: { type: 'string', default: '{}' },
    createdAt: { type: 'number' },
    width: { type: 'number', default: 0 },
    height: { type: 'number', default: 0 },
    duration: { type: 'number', default: 0 },
    thumbnail: { type: 'string', default: '' }
  },

  scripts_data: {
    scriptId: { type: 'string' },
    scriptName: { type: 'string' },
    addressId: { type: 'string' },
    selector: { type: 'string' },
    content: { type: 'string' },
    contentType: { type: 'string', default: 'text' },
    metadata: { type: 'string', default: '{}' },
    extractedAt: { type: 'number' },
    previousValue: { type: 'string', default: '' },
    changed: { type: 'number', default: 0 }
  },

  feeds: {
    name: { type: 'string' },
    description: { type: 'string', default: '' },
    type: { type: 'string' },
    query: { type: 'string', default: '' },
    schedule: { type: 'string', default: '' },
    source: { type: 'string', default: 'internal' },
    tags: { type: 'string', default: '' },
    metadata: { type: 'string', default: '{}' },
    createdAt: { type: 'number' },
    updatedAt: { type: 'number' },
    lastFetchedAt: { type: 'number', default: 0 },
    enabled: { type: 'number', default: 1 }
  }
};
```

---

## Indexes Definition

```javascript
const indexes = {
  // Address indexes
  addresses_byDomain: {
    table: 'addresses',
    on: 'domain'
  },
  addresses_byProtocol: {
    table: 'addresses',
    on: 'protocol'
  },
  addresses_byStarred: {
    table: 'addresses',
    on: 'starred'
  },

  // Visit indexes
  visits_byAddress: {
    table: 'visits',
    on: 'addressId'
  },
  visits_byTimestamp: {
    table: 'visits',
    on: 'timestamp'
  },
  visits_bySource: {
    table: 'visits',
    on: 'source'
  },

  // Content indexes
  content_byContentType: {
    table: 'content',
    on: 'contentType'
  },
  content_byMimeType: {
    table: 'content',
    on: 'mimeType'
  },
  content_bySynced: {
    table: 'content',
    on: 'synced'
  },
  content_byUpdated: {
    table: 'content',
    on: 'updatedAt'
  },

  // Tag indexes
  tags_byName: {
    table: 'tags',
    on: 'name'
  },
  tags_byParent: {
    table: 'tags',
    on: 'parentId'
  },

  // Blob indexes
  blobs_byMediaType: {
    table: 'blobs',
    on: 'mediaType'
  },
  blobs_byMimeType: {
    table: 'blobs',
    on: 'mimeType'
  },

  // Scripts data indexes
  scripts_data_byScript: {
    table: 'scripts_data',
    on: 'scriptId'
  },
  scripts_data_byChanged: {
    table: 'scripts_data',
    on: 'changed'
  },

  // Feed indexes
  feeds_byType: {
    table: 'feeds',
    on: 'type'
  },
  feeds_byEnabled: {
    table: 'feeds',
    on: 'enabled'
  }
};
```

---

## Relationships

TinyBase relationships for efficient joins:

```javascript
const relationships = {
  // Visits to their addresses
  visitAddress: {
    localTableId: 'visits',
    remoteTableId: 'addresses',
    relationshipId: 'addressId'
  },

  // Blobs to their source addresses
  blobAddress: {
    localTableId: 'blobs',
    remoteTableId: 'addresses',
    relationshipId: 'addressId'
  },

  // Blobs to their content
  blobContent: {
    localTableId: 'blobs',
    remoteTableId: 'content',
    relationshipId: 'contentId'
  },

  // Scripts data to addresses
  scriptDataAddress: {
    localTableId: 'scripts_data',
    remoteTableId: 'addresses',
    relationshipId: 'addressId'
  },

  // Tag hierarchy (self-referential)
  childTags: {
    localTableId: 'tags',
    remoteTableId: 'tags',
    relationshipId: 'parentId'
  },

  // Content hierarchy (self-referential)
  childContent: {
    localTableId: 'content',
    remoteTableId: 'content',
    relationshipId: 'parentId'
  }
};
```

---

## Metrics (Aggregations)

Useful metrics for dashboard/analytics:

```javascript
const metrics = {
  // Total addresses
  totalAddresses: {
    table: 'addresses',
    aggregate: 'count'
  },

  // Total visits
  totalVisits: {
    table: 'visits',
    aggregate: 'count'
  },

  // Average visit duration
  avgVisitDuration: {
    table: 'visits',
    metric: 'duration',
    aggregate: 'avg'
  },

  // Total storage used by blobs
  totalBlobSize: {
    table: 'blobs',
    metric: 'size',
    aggregate: 'sum'
  },

  // Number of content items
  totalContent: {
    table: 'content',
    aggregate: 'count'
  },

  // Number of synced content items
  syncedContent: {
    table: 'content',
    where: { synced: 1 },
    aggregate: 'count'
  },

  // Content by type
  contentByType: {
    table: 'content',
    groupBy: 'contentType',
    aggregate: 'count'
  }
};
```

---

## Common Queries (Examples)

### Recent addresses by visit
```javascript
store.getTable('visits')
.sort((a, b) => b.timestamp - a.timestamp) 773 + .slice(0, 10) 774 + .map(visit => visit.addressId) 775 + ``` 776 + 777 + ### Starred addresses with tags 778 + ```javascript 779 + store.getTable('addresses') 780 + .filter(addr => addr.starred === 1) 781 + .map(addr => ({ 782 + ...addr, 783 + tags: addr.tags.split(',').map(id => store.getRow('tags', id)) 784 + })) 785 + ``` 786 + 787 + ### Content synced to filesystem 788 + ```javascript 789 + store.getTable('content') 790 + .filter(item => item.synced === 1) 791 + ``` 792 + 793 + ### Markdown content only 794 + ```javascript 795 + store.getTable('content') 796 + .filter(item => item.contentType === 'markdown') 797 + ``` 798 + 799 + ### CSV data 800 + ```javascript 801 + store.getTable('content') 802 + .filter(item => item.contentType === 'csv') 803 + ``` 804 + 805 + ### Blobs by media type 806 + ```javascript 807 + store.getTable('blobs') 808 + .filter(blob => blob.mediaType === 'image') 809 + ``` 810 + 811 + ### Script data that changed 812 + ```javascript 813 + store.getTable('scripts_data') 814 + .filter(data => data.changed === 1) 815 + .sort((a, b) => b.extractedAt - a.extractedAt) 816 + ``` 817 + 818 + --- 819 + 820 + ## Storage Strategy 821 + 822 + ### Persistence Layers 823 + 824 + **Phase 1 (MVP):** 825 + - TinyBase in-memory store 826 + - Persist to IndexedDB for browser compatibility 827 + - File storage for blobs in `{userData}/{PROFILE}/datastore/blobs/` 828 + 829 + **Phase 2 (Performance):** 830 + - Add SQLite persistence option 831 + - Keep TinyBase API but use SQLite backend 832 + - Better for large datasets and complex queries 833 + 834 + **Phase 3 (Sync):** 835 + - Enable TinyBase CRDT sync 836 + - Sync between devices 837 + - Conflict-free merging 838 + 839 + ### File System Layout 840 + 841 + ``` 842 + {userData}/ 843 + {PROFILE}/ 844 + datastore/ 845 + index.db # SQLite backend (Phase 2) 846 + index.json # JSON backup 847 + blobs/ 848 + sha256_abc...png # Content-addressed blobs 849 + 
sha256_def...jpg 850 + thumbs/ # Thumbnails for images 851 + sha256_abc_thumb.jpg 852 + content/ # Synced text content 853 + notes/ # Markdown notes 854 + note1.md 855 + note2.md 856 + data/ # CSV and other data files 857 + prices.csv 858 + code/ # Code snippets 859 + helpers.js 860 + exports/ # User exports 861 + backup-2024-11-12.json 862 + ``` 863 + 864 + --- 865 + 866 + ## Migration Strategy 867 + 868 + ### Version 1.0 (Initial) 869 + - Create all tables with schema 870 + - Set up indexes 871 + - Set up relationships 872 + - Initialize with empty data 873 + 874 + ### Future Versions 875 + - TinyBase doesn't have built-in migrations 876 + - Implement custom migration system: 877 + - Version table to track schema version 878 + - Migration functions for each version bump 879 + - Backup before migration 880 + - Rollback capability 881 + 882 + --- 883 + 884 + ## Next Steps 885 + 886 + 1. ✅ Schema design complete 887 + 2. ⏭️ Install TinyBase package 888 + 3. ⏭️ Create datastore module scaffold 889 + 4. ⏭️ Implement store initialization with schema 890 + 5. ⏭️ Implement basic CRUD operations 891 + 6. ⏭️ Test with sample data 892 + 7. ⏭️ Build datastore API layer
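The custom migration system outlined under "Migration Strategy" (a version record, one migration function per version bump, backup before migration, rollback on failure) could be sketched roughly as below. This is a minimal sketch under stated assumptions, not Peek's implementation: the `meta` table, the `schema` row, the `archived` cell, and the function names `getVersion` and `migrate` are all hypothetical, and the snapshot is a plain JSON-serializable object shaped as `{ tableId: { rowId: { cellId: value } } }` rather than a live store.

```javascript
// Hypothetical migration runner. None of these names come from Peek or TinyBase.
// `state` is a plain snapshot: { tableId: { rowId: { cellId: value } } }.

const migrations = {
  // Key N upgrades a snapshot from version N to version N + 1.
  1: (state) => {
    // Illustrative change only: add an `archived` flag to every feed row.
    for (const row of Object.values(state.feeds ?? {})) {
      row.archived = 0;
    }
    return state;
  },
};

// Read the schema version from the (assumed) `meta` table; default to 1.
function getVersion(state) {
  return state.meta?.schema?.version ?? 1;
}

function migrate(state, targetVersion) {
  // Backup before migration: deep copies, so the caller's object is never mutated.
  const backup = JSON.parse(JSON.stringify(state));
  let working = JSON.parse(JSON.stringify(state));
  try {
    for (let v = getVersion(working); v < targetVersion; v++) {
      const step = migrations[v];
      if (!step) throw new Error(`No migration from version ${v}`);
      working = step(working);
      // Record the new version after each successful step.
      working.meta = { ...working.meta, schema: { version: v + 1 } };
    }
    return { ok: true, state: working };
  } catch (err) {
    // Rollback: return the untouched backup together with the failure reason.
    return { ok: false, state: backup, error: err.message };
  }
}

const before = {
  meta: { schema: { version: 1 } },
  feeds: { f1: { name: 'news', type: 'rss' } },
};
const upgraded = migrate(before, 2);
const failed = migrate(before, 3); // no migration registered for version 2
```

Running every step against a copy means a failed step cannot leave the store half-migrated; the caller decides whether to persist `upgraded.state` or keep what it already has.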
+84
notes/datastore.md
··· 1 + # Peek Personal Datastore 2 + 3 + Browser profile directories are a jumble of organically-grown files and directories that are designed to serve browser internals rather than to be a store of user-curated and shaped information. 4 + 5 + The Peek Personal Datastore combines an address index with unstructured data, metadata, time-series data, and files. 6 + 7 + This is a local, private and personal store first and foremost. 8 + 9 + Peek needs a way of storing data that provides a few primary things: 10 + 11 + - Store various data types, and attach metadata to them 12 + - Store feeds, such as web navigation history, stored data history, custom generated feeds, time-series data, and feeds pulled in from elsewhere 13 + - Have some kind of approach to binary files like images and videos, maybe fine to keep on filesystem but referenced by an index 14 + - Support bidirectional filesystem sync for some flavors of file, such as markdown, where we might want an Obsidian vault to map some set of "files" in the datastore that we can also edit in a "stickies" app running in Peek, for example 15 + - MIME types are implemented in nearly every aspect of the datastore to allow for type-based querying 16 + - Tags are implemented in nearly every aspect of the datastore to allow for coarse-grained annotations and querying 17 + 18 + Non-primary but keep in FOV: 19 + - Runtime/browser engine agnosticism, eg if we move off Electron someday 20 + - Designed with sync in mind, for mirroring to other devices, saving parts to specific cloud operations, or whole snapshots for backups and archives 21 + - Designed with sync in mind to collaborate with others - eg perhaps a subset of notes are synced with some other person's set of notes 22 + 23 + Primary types: 24 + 25 + - Address index: Peek at its core is a web user agent. First class support for saving addresses. Examples: HTTP URLs, other protocol URLs or URIs, such as IPFS CIDs. Fine to limit to URIs for now. 
The address index includes navigation history, or an imported Pocket archive or any type of address for any reason. 26 + - Web navigation history: Index of visits to addresses in the index. 27 + - Non-URL data, which can reference one or more URLs or none at all. Examples: markdown notes, images 28 + - Metadata for all data types: We want to annotate addresses and non-addresses with tags, signatures, mime-types, language metadata, usage information, etc. 29 + 30 + ## Application patterns 31 + 32 + - Applications need to read from and write to the datastore in ways specific to them. 33 + - Not necessarily full sandbox / sharding / area, but the ability to operate on types they know and use. 34 + - Eg Panorama will need to access the address index, and store group metadata, and access it quickly. 35 + - Address classifiers will be a very common use-case, with many applications just being specialized address classifiers, so maybe we need some application-level "data view" implemented for quickly accessing data in this way. 36 + - Perhaps lenses/views are a useful abstraction here. 
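The "lenses/views" idea above could look something like the following. This is only a sketch of one possible shape, not an existing Peek API: `createLens` and its parameters are hypothetical, and the table is assumed to be a plain object of rows keyed by row ID (the shape TinyBase's `getTable` returns), with `domain`, `protocol`, and `starred` cells on address rows.

```javascript
// Hypothetical "lens": a named, read-only view computed from a table snapshot.
// `table` is assumed to be an object of rows keyed by row ID.
function createLens(name, predicate, project = (row) => row) {
  return {
    name,
    // Return the matching rows as an array, each carrying its row ID.
    apply(table) {
      return Object.entries(table)
        .filter(([, row]) => predicate(row))
        .map(([id, row]) => ({ id, ...project(row) }));
    },
  };
}

// Sample address rows (illustrative data only).
const addresses = {
  a1: { domain: 'example.com', protocol: 'https', starred: 1 },
  a2: { domain: 'example.org', protocol: 'https', starred: 0 },
  a3: { domain: 'example.net', protocol: 'ipfs', starred: 0 },
};

// An address classifier expressed as a lens: starred addresses, domain only.
const starredLens = createLens(
  'starred-addresses',
  (row) => row.starred === 1,
  (row) => ({ domain: row.domain })
);

const starred = starredLens.apply(addresses);
```

Because a lens here is just a predicate plus a projection, an application such as Panorama could declare the views it needs without knowing how the store persists or indexes the underlying table.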
37 + 38 + ## Use-cases 39 + 40 + Private local 41 + - navigation history 42 + - personal notes 43 + - saving images from pages 44 + - text/numerical datapoints and their history (so, time-series data) 45 + 46 + Private remote 47 + - publishing a note to a remote server 48 + - syncing the datastore between my devices 49 + - publishing backups/archives 50 + 51 + Public remote 52 + - publish a note to my website 53 + - sync w/ a remote service, eg push urls+notes tagged 'arena' to are.na and pull from it 54 + 55 + Collaboration 56 + - syncing private data between two people 57 + 58 + Shared calendar scenario 59 + - two stores with calendar data 60 + - connect via agreed shared method 61 + 62 + ## Data, schemas, and schemalessness 63 + 64 + As Peek matures into a natively generative system, we need complex types beyond MIME types and what filesystems afford - the whole flora and fauna of digital daily life. We need a way of describing data when passing it between features and "applications" in Peek. We don't need some holy grail supersystem dream; maybe it's fine to just use internet MIME types, filesystem types, and something like Atproto's "lexicons" when interacting in public collaborative scenarios. 65 + 66 + The store itself is probably fine using basic types, and we can layer on complex types in the context of applications. 67 + 68 + ## Implementation notes and ideas 69 + 70 + - JS/TS/Electron 71 + - TinyBase 72 + - Automerge 73 + 74 + Layer on: 75 + - identities 76 + - signing 77 + - verifiability 78 + - collaboration 79 + 80 + ## Examples 81 + 82 + - Atproto personal datastore (not designed to be private by default tho) https://atproto.com/guides/self-hosting 83 + - Solid pods https://solidproject.org/ 84 + - Perkeep is more focused on permanence but it does a lot of these things https://github.com/perkeep/perkeep
+1
notes/multi-protocol.md
··· 1 + # Multi-protocol support in Peek
+96 -1
package-lock.json
··· 9 9 "version": "0.0.1", 10 10 "license": "MIT", 11 11 "dependencies": { 12 - "lil-gui": "^0.19.2" 12 + "lil-gui": "^0.19.2", 13 + "tinybase": "^6.7.2" 13 14 }, 14 15 "devDependencies": { 15 16 "@electron-forge/cli": "^7.8.0", ··· 5258 5259 "integrity": "sha512-5ROII7nElnAirvFn8g7H7MtpfV1daMcyfTGQwsn/x2VtyV+VPiO5CjReCJtWLvoKTDEDmZocf3cNPraiMnBXLA==", 5259 5260 "dev": true, 5260 5261 "optional": true 5262 + }, 5263 + "node_modules/tinybase": { 5264 + "version": "6.7.2", 5265 + "resolved": "https://registry.npmjs.org/tinybase/-/tinybase-6.7.2.tgz", 5266 + "integrity": "sha512-2ufO+vqyhGu93IkYZgtzr3S7O7/e2uEN2o+FpZ/j1h2ZS8VpKllbgxP2pcL/k/fK3kChdpu3nUvyHldcZH9epQ==", 5267 + "license": "MIT", 5268 + "peerDependencies": { 5269 + "@automerge/automerge-repo": "^1.2.1", 5270 + "@cloudflare/workers-types": "^4.20251014.0", 5271 + "@electric-sql/pglite": "^0.3.11", 5272 + "@libsql/client": "^0.15.15", 5273 + "@powersync/common": "^1.40.0", 5274 + "@sqlite.org/sqlite-wasm": "^3.50.4-build1", 5275 + "@vlcn.io/crsqlite-wasm": "^0.16.0", 5276 + "bun": "^1.3.1", 5277 + "electric-sql": "^0.12.1", 5278 + "expo": "^54.0.10", 5279 + "expo-sqlite": "^16.0.8", 5280 + "partykit": "^0.0.115", 5281 + "partysocket": "^1.1.5", 5282 + "postgres": "^3.4.7", 5283 + "react": "^19.0.0", 5284 + "react-dom": "^19.0.0", 5285 + "react-native-mmkv": "4.0.0", 5286 + "react-native-sqlite-storage": "^6.0.1", 5287 + "sqlite3": "^5.1.7", 5288 + "ws": "^8.18.3", 5289 + "yjs": "^13.6.27" 5290 + }, 5291 + "peerDependenciesMeta": { 5292 + "@automerge/automerge-repo": { 5293 + "optional": true 5294 + }, 5295 + "@cloudflare/workers-types": { 5296 + "optional": true 5297 + }, 5298 + "@electric-sql/pglite": { 5299 + "optional": true 5300 + }, 5301 + "@libsql/client": { 5302 + "optional": true 5303 + }, 5304 + "@powersync/common": { 5305 + "optional": true 5306 + }, 5307 + "@sqlite.org/sqlite-wasm": { 5308 + "optional": true 5309 + }, 5310 + "@vlcn.io/crsqlite-wasm": { 5311 + "optional": true 5312 + }, 5313 + 
"bun": { 5314 + "optional": true 5315 + }, 5316 + "electric-sql": { 5317 + "optional": true 5318 + }, 5319 + "expo": { 5320 + "optional": true 5321 + }, 5322 + "expo-sqlite": { 5323 + "optional": true 5324 + }, 5325 + "partykit": { 5326 + "optional": true 5327 + }, 5328 + "partysocket": { 5329 + "optional": true 5330 + }, 5331 + "postgres": { 5332 + "optional": true 5333 + }, 5334 + "react": { 5335 + "optional": true 5336 + }, 5337 + "react-dom": { 5338 + "optional": true 5339 + }, 5340 + "react-native-mmkv": { 5341 + "optional": true 5342 + }, 5343 + "react-native-sqlite-storage": { 5344 + "optional": true 5345 + }, 5346 + "sqlite3": { 5347 + "optional": true 5348 + }, 5349 + "ws": { 5350 + "optional": true 5351 + }, 5352 + "yjs": { 5353 + "optional": true 5354 + } 5355 + } 5261 5356 }, 5262 5357 "node_modules/tmp": { 5263 5358 "version": "0.2.1",