# Datastore Architecture
22+33+## Overview
44+55+Peek's datastore uses a centralized architecture where all data operations are handled in the main Electron process, with renderer processes accessing the datastore through an IPC (Inter-Process Communication) API exposed via the preload script.
66+77+## Architectural Decision
88+99+### The Choice: IPC-Based API vs Direct Library Access
1010+1111+During implementation, we faced a critical architectural decision:
1212+1313+**Option 1: Direct Library Access**
1414+- Features import and use TinyBase directly in renderer processes
1515+- Simpler initial implementation
1616+- Tighter coupling to TinyBase
1717+1818+**Option 2: IPC-Based API (Chosen)**
1919+- Datastore logic centralized in main process
2020+- Features access via `api.datastore` abstraction
2121+- Complete separation between storage layer and application features
2222+2323+### Reasoning
2424+2525+We chose **Option 2** for the following strategic reasons:
2626+2727+1. **Runtime Portability**
2828+ - Future consideration of Tauri as an alternative to Electron
2929+ - No renderer code changes needed when switching runtimes
3030+ - Abstraction isolates platform-specific concerns
3131+3232+2. **Storage Backend Flexibility**
3333+ - Can swap TinyBase for SQLite, Dexie, or cloud datastores
3434+ - Features remain unchanged regardless of backend
3535+ - Enables gradual migration strategies
3636+3737+3. **Cloud & Sync Readiness**
3838+ - Architecture naturally supports remote datastore endpoints
3939+ - Same API can route to local or cloud storage
4040+ - Facilitates future sync implementations
4141+4242+4. **Mobile App Development**
4343+ - Mobile apps can use the same API contract
4444+ - Platform-specific storage implementations possible
4545+ - Consistent developer experience across platforms
4646+4747+5. **Security & Isolation**
4848+ - Datastore logic contained in trusted main process
4949+ - Renderer processes have controlled, validated access
5050+ - Easier to audit and secure data operations
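
The portability and backend-flexibility arguments above rest on features depending only on a small method contract. A minimal sketch of that idea follows, assuming a hypothetical `createDatastoreApi` factory that accepts any `invoke(channel, ...args)` function. In Electron's preload script `invoke` would delegate to `ipcRenderer.invoke`; another runtime would pass its own bridge. The channel names are the ones listed in this document, but the argument layout per channel is an assumption, not a description of the actual preload code.

```javascript
// Hypothetical factory: builds the api.datastore surface on top of any transport.
// `invoke(channel, ...args)` is an assumption; in Electron's preload script it
// would delegate to ipcRenderer.invoke.
const createDatastoreApi = (invoke) => ({
  addAddress: (uri, options = {}) => invoke('datastore-add-address', uri, options),
  getAddress: (id) => invoke('datastore-get-address', id),
  updateAddress: (id, updates) => invoke('datastore-update-address', id, updates),
  queryAddresses: (filter = {}) => invoke('datastore-query-addresses', filter),
  addVisit: (addressId, options = {}) => invoke('datastore-add-visit', addressId, options),
  queryVisits: (filter = {}) => invoke('datastore-query-visits', filter),
  addContent: (options) => invoke('datastore-add-content', options),
  getTable: (tableName) => invoke('datastore-get-table', tableName),
  setRow: (tableName, rowId, rowData) => invoke('datastore-set-row', tableName, rowId, rowData),
  getStats: () => invoke('datastore-get-stats'),
});

// Stand-in transport that records calls instead of crossing a process boundary.
const recordedCalls = [];
const recordingInvoke = async (channel, ...args) => {
  recordedCalls.push({ channel, args });
  return { success: true };
};

const datastore = createDatastoreApi(recordingInvoke);
```

Because features only ever see the returned object, swapping the runtime or the storage backend would mean changing the `invoke` implementation or the handlers behind it, not the feature code.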
5151+5252+## Current Implementation
5353+5454+### Technology Stack
5555+5656+- **Storage Engine**: TinyBase v0.7.2
5757+ - Reactive data store with CRDT support
5858+ - Schema validation and indexes
5959+ - Small footprint (5-11kB gzipped)
6060+ - Built-in support for relationships and metrics
6161+6262+- **Communication**: Electron IPC
6363+ - `ipcMain.handle()` for main process handlers
6464+ - `ipcRenderer.invoke()` for renderer requests
6565+ - Async/await throughout
6666+6767+### Architecture Components

```
┌──────────────────────────────────────────────────────────────┐
│ Renderer Process (app/)                                      │
│                                                              │
│   ┌────────────┐         ┌────────────┐                      │
│   │  Features  │────────▶│   api.js   │                      │
│   │  (peeks,   │         │            │                      │
│   │  slides,   │         │    api.    │                      │
│   │  scripts)  │         │  datastore │                      │
│   └────────────┘         └─────┬──────┘                      │
│                                │                             │
└────────────────────────────────┼─────────────────────────────┘
                                 │ IPC invoke()
                                 │
┌────────────────────────────────┼─────────────────────────────┐
│ Main Process (index.js)        │                             │
│                                ▼                             │
│   ┌──────────────────────────────────────────┐               │
│   │ IPC Handlers                             │               │
│   │  • datastore-add-address                 │               │
│   │  • datastore-get-address                 │               │
│   │  • datastore-update-address              │               │
│   │  • datastore-query-addresses             │               │
│   │  • datastore-add-visit                   │               │
│   │  • datastore-query-visits                │               │
│   │  • datastore-add-content                 │               │
│   │  • datastore-get-table                   │               │
│   │  • datastore-set-row                     │               │
│   │  • datastore-get-stats                   │               │
│   └──────────────┬───────────────────────────┘               │
│                  │                                           │
│                  ▼                                           │
│   ┌──────────────────────────────────────────┐               │
│   │ TinyBase Store                           │               │
│   │  • Store (datastoreStore)                │               │
│   │  • Indexes (datastoreIndexes)            │               │
│   │  • Relationships (datastoreRelationships)│               │
│   │  • Metrics (datastoreMetrics)            │               │
│   └──────────────────────────────────────────┘               │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```
110110+111111+### File Structure

```
/Users/dietrich/misc/peek/
├── index.js                  # Main process
│   ├── TinyBase initialization (lines 115-175)
│   ├── Helper functions (lines 177-230)
│   └── IPC handlers (lines 979-1230)
│
├── preload.js                # Preload script
│   └── api.datastore exposure (lines 210-242)
│
└── app/                      # Renderer process
    ├── datastore/
    │   ├── schema.js         # TinyBase schema definitions
    │   ├── config.js         # Datastore configuration
    │   ├── history.js        # Navigation tracking helpers
    │   └── test-ipc.html     # IPC API test page
    │
    ├── scripts/index.js      # Script results tracking
    ├── peeks/index.js        # Peek navigation tracking
    └── slides/index.js       # Slide navigation tracking
```
134134+135135+## API Reference
136136+137137+### api.datastore Methods

All methods return a Promise that resolves to `{ success: boolean, data?: any, error?: string, id?: string }`.
140140+141141+#### Address Management
142142+143143+```javascript
144144+// Add a new address
145145+await api.datastore.addAddress(uri, options)
146146+// Parameters:
147147+// uri: string - The URL to track
148148+// options: {
149149+// title?: string
150150+// mimeType?: string
151151+// favicon?: string
152152+// description?: string
153153+// tags?: string (comma-separated)
154154+// metadata?: string (JSON)
155155+// }
156156+// Returns: { success: true, id: 'addr_...' }
157157+158158+// Get address by ID
159159+await api.datastore.getAddress(id)
160160+// Returns: { success: true, data: { uri, domain, title, ... } }
161161+162162+// Update address
163163+await api.datastore.updateAddress(id, updates)
164164+// Parameters:
165165+// id: string - Address ID
166166+// updates: object - Fields to update
167167+// Returns: { success: true, data: { ...updatedRow } }
168168+169169+// Query addresses
170170+await api.datastore.queryAddresses(filter)
171171+// Parameters:
172172+// filter: {
173173+// domain?: string
174174+// protocol?: string
175175+// starred?: 0 | 1
176176+// tag?: string
177177+// sortBy?: 'lastVisit' | 'visitCount' | 'created'
178178+// limit?: number
179179+// }
180180+// Returns: { success: true, data: [...addresses] }
181181+```
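
To make the `queryAddresses` filter semantics concrete, here is a rough sketch of how such a filter could be applied to a TinyBase-style table object (`{ rowId: row }`). It is not the actual handler. The function name `queryAddressRows` is hypothetical, the row fields `domain`, `protocol`, `starred` and `tags` are assumed to mirror the filter names, and `createdAt` is an assumed field name for the `'created'` sort. Only `visitCount` and `lastVisitAt` are named elsewhere in this document.

```javascript
// Maps the documented sortBy values to row fields. `createdAt` is an assumption.
const SORT_FIELDS = {
  lastVisit: 'lastVisitAt',
  visitCount: 'visitCount',
  created: 'createdAt',
};

// Hypothetical helper: filters, sorts (descending) and limits address rows.
const queryAddressRows = (table, filter = {}) => {
  let rows = Object.entries(table).map(([id, row]) => ({ id, ...row }));

  if (filter.domain) rows = rows.filter((row) => row.domain === filter.domain);
  if (filter.protocol) rows = rows.filter((row) => row.protocol === filter.protocol);
  if (filter.starred !== undefined) rows = rows.filter((row) => row.starred === filter.starred);
  if (filter.tag) {
    // Tags are stored as a comma-separated string
    rows = rows.filter((row) =>
      (row.tags || '').split(',').map((tag) => tag.trim()).includes(filter.tag));
  }

  const sortField = SORT_FIELDS[filter.sortBy];
  if (sortField) rows.sort((a, b) => (b[sortField] || 0) - (a[sortField] || 0));

  return filter.limit ? rows.slice(0, filter.limit) : rows;
};
```

The real implementation may instead use the TinyBase indexes created at startup; the sketch only illustrates how the filter fields combine.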
182182+183183+#### Visit Tracking
184184+185185+```javascript
186186+// Add a visit
187187+await api.datastore.addVisit(addressId, options)
188188+// Parameters:
189189+// addressId: string - The address being visited
190190+// options: {
191191+// source?: string - Source of navigation (peek, slide, direct)
192192+// sourceId?: string - ID of the source feature
193193+// windowType?: string - Type of window (modal, persistent, main)
194194+// duration?: number - Time spent in milliseconds
195195+// scrollDepth?: number - Scroll percentage (0-100)
196196+// interacted?: 0 | 1 - Whether user interacted
197197+// metadata?: string - Additional JSON data
198198+// }
199199+// Returns: { success: true, id: 'visit_...' }
200200+// Side effect: Updates address visitCount and lastVisitAt
201201+202202+// Query visits
203203+await api.datastore.queryVisits(filter)
204204+// Parameters:
205205+// filter: {
206206+// addressId?: string
207207+// source?: string
208208+// windowType?: string
209209+// startDate?: number (timestamp)
210210+// endDate?: number (timestamp)
211211+// limit?: number
212212+// }
213213+// Returns: { success: true, data: [...visits] }
214214+```
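
The `addVisit` side effect (incrementing `visitCount` and setting `lastVisitAt` on the address) can be sketched as follows. This uses a plain object in place of the TinyBase store, and `recordVisit` is a hypothetical name; the visit row fields beyond those in the options list above (`addressId`, `timestamp`) are assumptions.

```javascript
// Hypothetical sketch: tables is { addresses: {...}, visits: {...} }.
const recordVisit = (tables, addressId, options = {}, timestamp = Date.now()) => {
  const address = tables.addresses[addressId];
  if (!address) {
    return { success: false, error: `Unknown address: ${addressId}` };
  }

  // Store the visit row itself
  const id = `visit_${timestamp}_${Math.random().toString(36).slice(2, 11)}`;
  tables.visits[id] = {
    addressId,
    timestamp,
    source: options.source || 'direct',
    sourceId: options.sourceId || '',
    windowType: options.windowType || 'main',
    duration: options.duration || 0,
  };

  // Side effect described above: keep the address statistics in step
  address.visitCount = (address.visitCount || 0) + 1;
  address.lastVisitAt = timestamp;

  return { success: true, id };
};
```

Doing both writes inside one handler is what lets features call `addVisit` without separately updating the address.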
215215+216216+#### Content Management
217217+218218+```javascript
219219+// Add content (notes, markdown, code, etc.)
220220+await api.datastore.addContent(options)
221221+// Parameters:
222222+// options: {
223223+// title?: string
224224+// content: string
225225+// contentType?: 'plain' | 'markdown' | 'code' | 'json' | 'csv'
226226+// mimeType?: string
227227+// language?: string (for code)
228228+// tags?: string
229229+// addressId?: string (if related to an address)
230230+// metadata?: string
231231+// }
232232+// Returns: { success: true, id: 'content_...' }
233233+```
234234+235235+#### Direct Table Access
236236+237237+```javascript
238238+// Get entire table
239239+await api.datastore.getTable(tableName)
240240+// Parameters:
241241+// tableName: 'addresses' | 'visits' | 'content' | 'tags' | 'blobs' | 'scripts_data' | 'feeds'
242242+// Returns: { success: true, data: { rowId: { ...row }, ... } }
243243+244244+// Set row directly
245245+await api.datastore.setRow(tableName, rowId, rowData)
246246+// Parameters:
247247+// tableName: string
248248+// rowId: string
249249+// rowData: object - Complete row data matching schema
250250+// Returns: { success: true }
251251+```
252252+253253+#### Statistics
254254+255255+```javascript
256256+// Get aggregate statistics
257257+await api.datastore.getStats()
258258+// Returns: {
259259+// success: true,
260260+// data: {
261261+// totalAddresses: number,
262262+// totalVisits: number,
263263+// totalContent: number,
264264+// // ... other metrics
265265+// }
266266+// }
267267+```
268268+269269+## Usage Examples
270270+271271+### Example 1: Track Navigation from a Feature
272272+273273+```javascript
274274+// app/peeks/index.js
275275+import api from '../api.js';
276276+277277+const executeItem = async (item) => {
278278+ // Open window and navigate
279279+ const window = await windows.createWindow(item.address, params);
280280+281281+ // Track the navigation
282282+ if (api.datastore) {
283283+ try {
284284+ // Get or create address
285285+ const addResult = await api.datastore.addAddress(item.address, {
286286+ title: item.title,
287287+ mimeType: 'text/html'
288288+ });
289289+290290+ // Record visit
291291+ if (addResult.success) {
292292+ await api.datastore.addVisit(addResult.id, {
293293+ source: 'peek',
294294+ sourceId: `peek_${item.keyNum}`,
295295+ windowType: 'modal'
296296+ });
297297+ }
298298+ } catch (error) {
299299+ console.error('Failed to track navigation:', error);
300300+ }
301301+ }
302302+};
303303+```
304304+305305+### Example 2: Save Script Results

```javascript
// app/scripts/index.js
import api from '../api.js';

const saveScriptResult = async (script, result) => {
  try {
    // Find or create address
    const addressesResult = await api.datastore.queryAddresses({});
    let addressId;

    if (addressesResult.success) {
      const existing = addressesResult.data.find(a => a.uri === script.address);

      if (existing) {
        addressId = existing.id;
      } else {
        const addResult = await api.datastore.addAddress(script.address, {
          title: `Script: ${script.title}`
        });
        addressId = addResult.id;
      }
    }

    // Get previous results to detect changes.
    // previousValue is declared here so it is still in scope when the row is saved.
    const prevResults = await api.datastore.getTable('scripts_data');
    let previousValue = '';
    let changed = 1;

    if (prevResults.success) {
      const scriptResults = Object.entries(prevResults.data)
        .filter(([id, row]) => row.scriptId === script.id)
        .sort((a, b) => b[1].extractedAt - a[1].extractedAt);

      if (scriptResults.length > 0) {
        previousValue = scriptResults[0][1].content ?? '';
        changed = (result !== previousValue) ? 1 : 0;
      }
    }

    // Save new result
    await api.datastore.setRow('scripts_data',
      `script_data_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`,
      {
        scriptId: script.id,
        scriptName: script.title,
        addressId: addressId,
        selector: script.selector,
        content: result,
        contentType: 'text',
        metadata: '{}',
        extractedAt: Date.now(),
        previousValue: previousValue,
        changed: changed
      }
    );
  } catch (error) {
    console.error('Error saving script result:', error);
  }
};
```
366366+367367+### Example 3: Query Recent History
368368+369369+```javascript
370370+// app/features/history-browser.js
371371+import api from '../api.js';
372372+373373+const getRecentHistory = async (limit = 20) => {
374374+ try {
375375+ // Get recent addresses
376376+ const addressesResult = await api.datastore.queryAddresses({
377377+ sortBy: 'lastVisit',
378378+ limit: limit
379379+ });
380380+381381+ if (!addressesResult.success) {
382382+ console.error('Failed to fetch history:', addressesResult.error);
383383+ return [];
384384+ }
385385+386386+ // Enrich with visit details
387387+ const enriched = await Promise.all(
388388+ addressesResult.data.map(async (address) => {
389389+ const visitsResult = await api.datastore.queryVisits({
390390+ addressId: address.id,
391391+ limit: 5
392392+ });
393393+394394+ return {
395395+ ...address,
396396+ recentVisits: visitsResult.success ? visitsResult.data : []
397397+ };
398398+ })
399399+ );
400400+401401+ return enriched;
402402+ } catch (error) {
403403+ console.error('Error getting recent history:', error);
404404+ return [];
405405+ }
406406+};
407407+```
408408+409409+## Implementation Details
410410+411411+### Data Schema
412412+413413+The datastore uses 7 tables defined in `app/datastore/schema.js`:
414414+415415+1. **addresses**: Web addresses (URLs) visited by the user
416416+2. **visits**: Individual visit records with duration and interaction data
417417+3. **content**: User-created notes, markdown files, code snippets
418418+4. **tags**: Tags for organizing addresses and content
419419+5. **blobs**: Binary data (images, files) with content-addressable storage
420420+6. **scripts_data**: Results from background script executions
421421+7. **feeds**: RSS/Atom feed subscriptions and entries
422422+423423+### Helper Functions (Main Process)
424424+425425+```javascript
426426+// Generate unique IDs
427427+generateId(prefix) // Returns: 'prefix_timestamp_randomstring'
428428+429429+// Current timestamp
430430+now() // Returns: Date.now()
431431+432432+// Parse URL into components
433433+parseUrl(uri) // Returns: { protocol, domain, path }
434434+```
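
The three helpers are only described by signature here. One plausible way to write them, using the standard `URL` class, is sketched below. This is an illustration and not the code in `index.js`: the length of the random suffix, stripping the trailing `:` from the protocol, and the fallback for unparseable input are all assumptions.

```javascript
// Current timestamp in milliseconds
const now = () => Date.now();

// 'prefix_timestamp_randomstring'; the 9-character random suffix is an assumption
const generateId = (prefix) =>
  `${prefix}_${now()}_${Math.random().toString(36).slice(2, 11)}`;

// Split a URI into components; unparseable input yields empty protocol and domain (assumption)
const parseUrl = (uri) => {
  try {
    const url = new URL(uri);
    return {
      protocol: url.protocol.replace(/:$/, ''),
      domain: url.hostname,
      path: url.pathname,
    };
  } catch {
    return { protocol: '', domain: '', path: uri };
  }
};
```

For example, `parseUrl('https://example.com/a/b?x=1')` yields `{ protocol: 'https', domain: 'example.com', path: '/a/b' }` under these assumptions; the query string is not part of `path`.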
435435+436436+### Error Handling
437437+438438+All IPC handlers use try-catch blocks and return structured responses:
439439+440440+```javascript
441441+// Success response
442442+{ success: true, data: {...}, id: '...' }
443443+444444+// Error response
445445+{ success: false, error: 'Error message' }
446446+```
447447+448448+Features should always check the `success` field before using `data`.
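
The try-catch pattern can be factored into a single wrapper so that every handler produces the same response shape. The sketch below assumes a hypothetical `toHandler` helper; the real handlers may simply inline their own try-catch blocks. The listener signature `(event, ...args)` is the one `ipcMain.handle()` passes to its callback.

```javascript
// Hypothetical wrapper: turns an operation into an IPC listener that never throws.
// The operation returns the success fields (e.g. { data } or { id }) or throws.
const toHandler = (operation) => async (_event, ...args) => {
  try {
    const result = await operation(...args);
    return { success: true, ...result };
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : String(error),
    };
  }
};

// Example operation over a plain object of address rows (illustrative only)
const makeGetAddress = (addresses) => (id) => {
  if (!(id in addresses)) throw new Error(`Address not found: ${id}`);
  return { data: addresses[id] };
};

// In the main process this would be registered roughly as:
//   ipcMain.handle('datastore-get-address', toHandler(makeGetAddress(rows)));
```

Returning an error object instead of letting the handler reject keeps error details under the datastore's control and gives features one shape to check.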
449449+450450+### Initialization
451451+452452+The datastore initializes automatically when the main process starts:
453453+454454+```javascript
455455+// index.js (main process)
456456+const initDatastore = () => {
457457+ console.log('main initializing datastore');
458458+459459+ // Create store with schema
460460+ datastoreStore = createStore().setTablesSchema(schema);
461461+462462+ // Create indexes for efficient queries
463463+ datastoreIndexes = createIndexes(datastoreStore);
464464+ for (const [indexName, indexDef] of Object.entries(indexes)) {
465465+ datastoreIndexes.setIndexDefinition(indexName, ...indexDef);
466466+ }
467467+468468+ // Create relationships for joins
469469+ datastoreRelationships = createRelationships(datastoreStore);
470470+ for (const [relName, relDef] of Object.entries(relationships)) {
471471+ datastoreRelationships.setRelationshipDefinition(relName, ...relDef);
472472+ }
473473+474474+ // Create metrics for aggregations
475475+ datastoreMetrics = createMetrics(datastoreStore);
476476+ for (const [metricName, metricDef] of Object.entries(metrics)) {
477477+ datastoreMetrics.setMetricDefinition(metricName, ...metricDef);
478478+ }
479479+480480+ console.log('main datastore initialized successfully');
481481+};
482482+483483+// Called during app ready
484484+app.whenReady().then(async () => {
485485+ initDatastore();
486486+ // ... rest of initialization
487487+});
488488+```
489489+490490+## Testing
491491+492492+A test suite verifies the IPC API works correctly:
493493+494494+```bash
495495+# Run the app in debug mode
496496+npm run debug
497497+498498+# The app automatically runs integration tests on startup
499499+# Check console for test results
500500+```
501501+502502+Test coverage includes:
503503+- Address creation, retrieval, updates
504504+- Visit tracking and queries
505505+- Content management
506506+- Table access
507507+- Statistics aggregation
508508+509509+## Future Considerations
510510+511511+### Persistence Layer
512512+513513+Currently, data exists only in memory. Future work includes:
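
1. **IndexedDB Persister** (Browser)
   - Use TinyBase's `createIndexedDbPersister()`
   - Automatic persistence to browser storage
   - IndexedDB is a browser API and is not available in the Electron main process, so this option only applies to a store running in a renderer
   - Good for development and testing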
519519+520520+2. **SQLite Persister** (Desktop)
521521+ - Use TinyBase's SQL persisters
522522+ - Better performance for large datasets
523523+ - Native database queries
524524+525525+3. **File System Persister** (Desktop)
526526+ - Use TinyBase's `createFilePersister()`
527527+ - Human-readable JSON files
528528+ - Easy backup and migration
529529+530530+### Sync Implementation
531531+532532+The IPC architecture naturally supports synchronization:
533533+534534+1. **Local Sync**
535535+ - Multiple renderer processes sharing same datastore
536536+ - Already supported via IPC
537537+538538+2. **Cloud Sync**
539539+ - Modify IPC handlers to route to remote API
540540+ - Use TinyBase CRDT features for conflict resolution
541541+ - Implement offline-first with local cache
542542+543543+3. **Peer-to-Peer Sync**
544544+ - Use TinyBase's CRDT merge capabilities
545545+ - Sync between devices on local network
546546+547547+### Migration Path
548548+549549+To change storage backends:
550550+551551+1. Keep IPC API contract unchanged
552552+2. Implement new backend in main process
553553+3. Update IPC handlers to use new backend
554554+4. Features continue working without changes
555555+556556+Example: Migrating to SQLite:
557557+558558+```javascript
559559+// Old: TinyBase
560560+datastoreStore.setRow('addresses', id, row);
561561+562562+// New: SQLite (better-sqlite3)
563563+db.prepare('INSERT INTO addresses VALUES (?, ?, ...)').run(id, ...values);
564564+565565+// IPC handler updated, but api.datastore.addAddress() unchanged
566566+```
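
The migration steps work because handler logic can be written against a small backend interface rather than against TinyBase directly. The following sketch illustrates that with two interchangeable in-memory backends. All names here (`createObjectBackend`, `createMapBackend`, `createAddressHandlers`, and the `setRow`/`getRow` interface) are hypothetical and stand in for TinyBase and SQLite; nothing in it is taken from the actual `index.js`.

```javascript
// Backend A: nested plain objects, similar in shape to a TinyBase tables object
const createObjectBackend = () => {
  const tables = {};
  return {
    setRow: (table, id, row) => {
      tables[table] = tables[table] || {};
      tables[table][id] = { ...row };
    },
    getRow: (table, id) => tables[table]?.[id],
  };
};

// Backend B: a flat Map keyed by "table/id", standing in for a different engine
const createMapBackend = () => {
  const rows = new Map();
  return {
    setRow: (table, id, row) => { rows.set(`${table}/${id}`, { ...row }); },
    getRow: (table, id) => rows.get(`${table}/${id}`),
  };
};

// Handler logic depends only on the interface, so either backend can be passed in
const createAddressHandlers = (backend) => ({
  addAddress: (id, row) => {
    backend.setRow('addresses', id, row);
    return { success: true, id };
  },
  getAddress: (id) => {
    const data = backend.getRow('addresses', id);
    return data
      ? { success: true, data }
      : { success: false, error: `Address not found: ${id}` };
  },
});
```

With this split, step 3 of the migration becomes a change to which backend factory is called at startup.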
567567+568568+## Benefits Realized
569569+570570+1. **Clean Separation**: Storage logic completely isolated from UI code
571571+2. **Easy Testing**: Can mock `api.datastore` for unit tests
572572+3. **Consistent API**: Same patterns across all features
4. **Schema Consistency**: The schema in `app/datastore/schema.js` is the single source of truth for data structures
574574+5. **Performance**: Main process handles heavy data operations
575575+6. **Security**: Validated data access through controlled IPC
576576+7. **Flexibility**: Storage implementation can evolve independently
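
As an illustration of the testing benefit, a feature that receives the datastore as a parameter can be exercised against an in-memory stand-in with no Electron process involved. The names `createMockDatastore` and `countTrackedAddresses` below are hypothetical, and the mock covers only two of the `api.datastore` methods.

```javascript
// Hypothetical in-memory stand-in for api.datastore, for unit tests only
const createMockDatastore = () => {
  const addresses = {};
  let counter = 0;
  return {
    addAddress: async (uri, options = {}) => {
      counter += 1;
      const id = `addr_test_${counter}`;
      addresses[id] = { uri, ...options };
      return { success: true, id };
    },
    queryAddresses: async () => ({
      success: true,
      data: Object.entries(addresses).map(([id, row]) => ({ id, ...row })),
    }),
  };
};

// Example feature code: takes the datastore as an argument so a mock can be injected
const countTrackedAddresses = async (datastoreApi) => {
  const result = await datastoreApi.queryAddresses({});
  return result.success ? result.data.length : 0;
};
```

The mock returns the same `{ success, data, id }` shape as the IPC API, so the feature code under test does not need to know which one it is talking to.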
577577+578578+## References
579579+580580+- [TinyBase Documentation](https://tinybase.org)
581581+- [Electron IPC Documentation](https://www.electronjs.org/docs/latest/api/ipc-main)
582582+- [Datastore Schema](./datastore-schema.md)
583583+- [Datastore Research](./datastore-research.md)
584584+- [Integration Summary](./datastore-integration.md)
notes/datastore-integration.md
# Datastore Integration Summary
22+33+Date: 2025-11-12
44+Branch: datastore
55+66+**ARCHITECTURE NOTE**: This document originally described a `window.datastore` approach. The actual implementation uses an **IPC-based architecture** with the datastore in the main process. See [datastore-architecture.md](./datastore-architecture.md) for the complete architectural documentation.
77+88+## What Was Built
99+1010+### Core Datastore Module
1111+- ✅ **Full TinyBase implementation** with schema, indexes, relationships, metrics
1212+- ✅ **7 tables**: addresses, visits, content, tags, blobs, scripts_data, feeds
1313+- ✅ **IPC-based API** accessed via `api.datastore` in renderer processes
1414+- ✅ **Datastore in main process** for security, isolation, and portability
1515+- ✅ **Comprehensive testing** - all IPC handlers verified working
1616+- ✅ **Complete separation** between storage layer and features
1717+1818+### Files Created/Modified
1919+2020+**New Files:**
2121+- `app/datastore/schema.js` - TinyBase schema definitions
2222+- `app/datastore/config.js` - Configuration
2323+- `app/datastore/history.js` - Navigation history tracking helpers (uses IPC API)
2424+- `app/datastore/test-ipc.html` - IPC API test page
2525+- `notes/datastore-research.md` - Technology research & comparison
2626+- `notes/datastore-schema.md` - Detailed schema documentation
2727+- `notes/datastore-architecture.md` - **Complete architectural documentation**
2828+- `notes/datastore-integration.md` - This file
2929+3030+**Modified Files:**
3131+- `index.js` - **Datastore initialization in main process** (lines 115-230, 979-1230)
3232+- `preload.js` - **Expose api.datastore via IPC** (lines 210-242)
3333+- `app/index.js` - Import history tracking helpers, expose via window.datastoreHistory
3434+- `app/scripts/index.js` - Save script results using api.datastore IPC
3535+- `app/peeks/index.js` - Track peek navigation via history helpers
3636+- `app/slides/index.js` - Track slide navigation via history helpers
3737+- `package.json` - Added tinybase@0.7.2 dependency
3838+3939+## Integration Details
4040+4141+### 1. Scripts Feature Integration
4242+4343+**What it does:**
4444+- Saves all script extraction results to `scripts_data` table
4545+- Tracks changes between runs
4646+- Creates/links addresses for script URLs
4747+- Maintains full history of extracted values
4848+4949+**Implementation:**
```javascript
// In app/scripts/index.js
const saveScriptResult = async (script, result) => {
  // Find or create address using IPC
  const addressesResult = await api.datastore.queryAddresses({});
  let addressId;

  if (addressesResult.success) {
    const existing = addressesResult.data.find(a => a.uri === script.address);
    if (existing) {
      addressId = existing.id;
    } else {
      const addResult = await api.datastore.addAddress(script.address, {
        title: `Script: ${script.title}`
      });
      addressId = addResult.id;
    }
  }

  // Check for previous values using IPC
  const prevResults = await api.datastore.getTable('scripts_data');
  let previousValue = '';
  let changed = 1;

  if (prevResults.success) {
    const scriptResults = Object.entries(prevResults.data)
      .filter(([id, row]) => row.scriptId === script.id)
      .sort((a, b) => b[1].extractedAt - a[1].extractedAt);
    if (scriptResults.length > 0) {
      previousValue = scriptResults[0][1].content ?? '';
      changed = (result !== previousValue) ? 1 : 0;
    }
  }

  // Save to datastore using IPC
  const rowId = `script_data_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
  await api.datastore.setRow('scripts_data', rowId, {
    scriptId: script.id,
    scriptName: script.title,
    addressId,
    selector: script.selector,
    content: result,
    contentType: 'text',
    metadata: '{}',
    extractedAt: Date.now(),
    previousValue,
    changed
  });
};
```
9191+9292+**Data captured:**
9393+- Script ID and name
9494+- Source address (with automatic address creation)
9595+- CSS selector used
9696+- Extracted content
9797+- Timestamp
9898+- Previous value for change detection
9999+- Changed flag
100100+101101+### 2. Navigation History Tracking
102102+103103+**What it does:**
104104+- Tracks every navigation from peeks and slides
105105+- Creates address records automatically
106106+- Records visit metadata (source, windowType, duration)
107107+- Updates visit counts and timestamps
108108+109109+**Implementation:**
```javascript
// In app/datastore/history.js
export const trackNavigation = async (uri, options = {}) => {
  // Get or create address using IPC
  let addressId;
  const addressesResult = await api.datastore.queryAddresses({});

  if (addressesResult.success) {
    const existing = addressesResult.data.find(addr => addr.uri === uri);

    if (existing) {
      addressId = existing.id;
    } else {
      const addResult = await api.datastore.addAddress(uri, {
        title: options.title || '',
        mimeType: options.mimeType || 'text/html'
      });
      addressId = addResult.id;
    }
  }

  // Without an address there is nothing to attach the visit to
  if (!addressId) {
    return { visitId: null, addressId: null };
  }

  // Add visit record using IPC
  const visitResult = await api.datastore.addVisit(addressId, {
    source: options.source || 'direct',
    sourceId: options.sourceId || '',
    windowType: options.windowType || 'main',
    duration: options.duration || 0,
    metadata: JSON.stringify(options.metadata || {})
  });

  return { visitId: visitResult.id, addressId };
};
```
143143+144144+**Data captured:**
145145+- Full URI and parsed components (protocol, domain, path)
146146+- Page title
147147+- Visit timestamp
148148+- Source feature (peek, slide, direct)
149149+- Source ID (peek_3, slide_left, etc.)
150150+- Window type (modal, persistent, main)
151151+- Visit count and last visit time
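
Both `saveScriptResult` and `trackNavigation` repeat the same find-or-create lookup. A sketch of how it could be shared is given here, assuming a hypothetical `findOrCreateAddress` helper that takes any object with the `api.datastore` shape. Like the originals, it fetches all addresses and matches on `uri` in the renderer, which is simple but scales with the size of the table.

```javascript
// Hypothetical shared helper; `datastoreApi` is anything shaped like api.datastore.
// Returns { success, id, created } or passes the failed response through unchanged.
const findOrCreateAddress = async (datastoreApi, uri, options = {}) => {
  const existing = await datastoreApi.queryAddresses({});
  if (!existing.success) return existing;

  const match = existing.data.find((address) => address.uri === uri);
  if (match) return { success: true, id: match.id, created: false };

  const added = await datastoreApi.addAddress(uri, options);
  return added.success
    ? { success: true, id: added.id, created: true }
    : added;
};
```

A caller would check `success` once and then use `id` for `addVisit` or for the `addressId` field of a `scripts_data` row.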
152152+153153+### 3. Peeks Integration
154154+155155+**Integration point:** `app/peeks/index.js:32-44`
156156+157157+```javascript
158158+windows.openModalWindow(item.address, params)
159159+ .then(result => {
160160+ // Track navigation in datastore
161161+ if (window.datastoreHistory) {
162162+ window.datastoreHistory.trackNavigation(item.address, {
163163+ source: 'peek',
164164+ sourceId: `peek_${item.keyNum}`,
165165+ windowType: 'modal',
166166+ title: item.title
167167+ });
168168+ }
169169+ });
170170+```
171171+172172+**Tracks:**
173173+- Which peek was opened (peek_0 through peek_9)
174174+- URL visited
175175+- Modal window type
176176+- Creates address if first visit
177177+178178+### 4. Slides Integration
179179+180180+**Integration point:** `app/slides/index.js:147-155`
181181+182182+```javascript
183183+windows.openModalWindow(item.address, params).then(result => {
184184+ if (result.success) {
185185+ // Track navigation in datastore
186186+ if (window.datastoreHistory) {
187187+ window.datastoreHistory.trackNavigation(item.address, {
188188+ source: 'slide',
189189+ sourceId: `slide_${item.screenEdge}`,
190190+ windowType: 'modal',
191191+ title: item.title
192192+ });
193193+ }
194194+ }
195195+});
196196+```
197197+198198+**Tracks:**
199199+- Which slide direction (slide_left, slide_right, slide_up, slide_down)
200200+- URL visited
201201+- Modal window type
202202+- Reuses existing address records
203203+204204+## API Available to Features
205205+206206+### Datastore Core API (IPC-based)
207207+208208+All methods are async and return Promises with structure: `{ success: boolean, data?: any, error?: string, id?: string }`
209209+210210+```javascript
211211+// Access via api.datastore (exposed through preload.js)
212212+213213+// Addresses
214214+await api.datastore.addAddress(uri, options)
215215+await api.datastore.getAddress(id)
216216+await api.datastore.updateAddress(id, updates)
217217+await api.datastore.queryAddresses(filter)
218218+219219+// Visits
220220+await api.datastore.addVisit(addressId, options)
221221+await api.datastore.queryVisits(filter)
222222+223223+// Content
224224+await api.datastore.addContent(options)
225225+226226+// Direct table access
227227+await api.datastore.getTable(tableName)
228228+await api.datastore.setRow(tableName, rowId, rowData)
229229+230230+// Stats
231231+await api.datastore.getStats()
232232+```
233233+234234+**Note**: All IPC operations are asynchronous. Always use `await` or `.then()` and check the `success` field before using `data`.
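
One way to make the `success` check hard to forget is a small guard that throws on failure, so callers can rely on ordinary `try`/`catch`. The helper name `assertOk` is hypothetical and not part of the existing API.

```javascript
// Hypothetical guard: returns the response unchanged when it succeeded,
// otherwise throws an Error carrying the response's error message.
const assertOk = (result) => {
  if (!result || result.success !== true) {
    throw new Error((result && result.error) || 'Datastore request failed');
  }
  return result;
};

// Usage inside an async feature function (api.datastore as documented above):
//   const { data } = assertOk(await api.datastore.getAddress(id));
//   const { id: visitId } = assertOk(await api.datastore.addVisit(addressId, {}));
```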
235235+236236+### History Helper API
237237+238238+```javascript
239239+// Access via window.datastoreHistory
240240+241241+// Track navigation
242242+window.datastoreHistory.trackNavigation(uri, {
243243+ source, sourceId, windowType, duration, metadata
244244+})
245245+246246+// Query history
247247+window.datastoreHistory.getHistory(filter)
248248+249249+// Get frequent addresses
250250+window.datastoreHistory.getFrequentAddresses(limit)
251251+252252+// Get recent addresses
253253+window.datastoreHistory.getRecentAddresses(limit)
254254+```
255255+256256+## Example Queries (Using IPC API)
257257+258258+### Get Recent Navigation History
259259+```javascript
260260+const recentVisits = await window.datastoreHistory.getHistory({ limit: 20 });
261261+// Returns: [{ id, addressId, timestamp, source, address: {...} }, ...]
262262+```
263263+264264+### Find Most Visited Sites
265265+```javascript
266266+const result = await api.datastore.queryAddresses({
267267+ sortBy: 'visitCount',
268268+ limit: 10
269269+});
270270+271271+if (result.success) {
272272+ const frequent = result.data;
273273+ // Use frequent addresses...
274274+}
275275+```
276276+277277+### Get All Script Results That Changed
278278+```javascript
279279+const tableResult = await api.datastore.getTable('scripts_data');
280280+281281+if (tableResult.success) {
282282+ const changedResults = Object.entries(tableResult.data)
283283+ .filter(([id, row]) => row.changed === 1)
284284+ .map(([id, row]) => ({ id, ...row }));
285285+}
286286+```
287287+288288+### Get Script History for Specific Script
289289+```javascript
290290+const scriptId = 'my-script-id';
291291+const tableResult = await api.datastore.getTable('scripts_data');
292292+293293+if (tableResult.success) {
294294+ const history = Object.entries(tableResult.data)
295295+ .filter(([id, row]) => row.scriptId === scriptId)
296296+ .sort((a, b) => b[1].extractedAt - a[1].extractedAt)
297297+ .map(([id, row]) => ({ id, ...row }));
298298+}
299299+```
300300+301301+### Get All Markdown Content
302302+```javascript
303303+const result = await api.datastore.getTable('content');
304304+305305+if (result.success) {
306306+ const markdown = Object.entries(result.data)
307307+ .filter(([id, row]) => row.contentType === 'markdown')
308308+ .map(([id, row]) => ({ id, ...row }));
309309+}
310310+```
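
The `getTable` examples in this section share one pattern: check `success`, flatten the `{ rowId: row }` object, then filter. A sketch of a helper that captures it is shown here; `tableRows` is a hypothetical name, not an existing function.

```javascript
// Hypothetical helper: turns a getTable response into an array of { id, ...row },
// optionally filtered. A failed response yields an empty array.
const tableRows = (tableResult, predicate = () => true) => {
  if (!tableResult || !tableResult.success) return [];
  return Object.entries(tableResult.data)
    .map(([id, row]) => ({ id, ...row }))
    .filter(predicate);
};

// Usage with the documented API:
//   const markdown = tableRows(await api.datastore.getTable('content'),
//     (row) => row.contentType === 'markdown');
```

Note that every call still transfers the whole table over IPC; for large tables a dedicated query handler in the main process would avoid that cost.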
311311+312312+### Get Starred Addresses
313313+```javascript
314314+const result = await api.datastore.queryAddresses({ starred: 1 });
315315+316316+if (result.success) {
317317+ const starred = result.data;
318318+}
319319+```
320320+321321+## What's Working
322322+323323+✅ **Datastore initialization** - Loads on app startup
324324+✅ **Schema enforcement** - TinyBase validates all data
325325+✅ **Indexes** - Fast queries by domain, tag, timestamp, etc.
326326+✅ **Relationships** - Visits→Addresses, Blobs→Content, etc.
327327+✅ **Metrics** - Automatic aggregations (counts, averages)
328328+✅ **Scripts tracking** - All extractions saved with history
329329+✅ **Navigation tracking** - Peeks and slides log visits
330330+✅ **Automatic address creation** - No duplicates, proper linking
331331+✅ **Change detection** - Scripts know when data changes
332332+✅ **Visit statistics** - Count and last visit time updated
333333+334334+## What's NOT Done Yet
335335+336336+⏭️ **Binary file storage** - Blobs table schema exists but no file I/O
337337+⏭️ **Markdown sync** - Content table ready but no filesystem bidirectional sync
338338+⏭️ **Persistence** - Currently in-memory only (need IndexedDB or SQLite persister)
339339+⏭️ **Groups feature** - Not integrated yet
340340+⏭️ **Cmd feature** - Not integrated yet
341341+⏭️ **Navigation in main windows** - Only tracking peek/slide navigation
342342+⏭️ **Duration tracking** - Visits record duration=0, needs window close tracking
343343+⏭️ **Scroll depth tracking** - Schema ready but not implemented
344344+⏭️ **Search/filtering UI** - No UI to browse datastore yet
345345+346346+## Next Steps
347347+348348+### Phase 1: Persistence (Critical)
349349+1. Add IndexedDB persister (TinyBase has built-in support)
350350+2. Auto-save on changes
351351+3. Load from IndexedDB on startup
352352+4. Verify data persists across app restarts
353353+354354+### Phase 2: Enhanced Tracking
355355+1. Track duration when windows close
356356+2. Track scroll depth and interaction
357357+3. Add navigation tracking to groups/cmd features
358358+4. Track main window navigation (not just peeks/slides)
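
Item 1 of this phase (duration on window close) could be approached roughly as follows. This is only a sketch: `createDurationTracker`, `onWindowOpened`, and `onWindowClosed` are hypothetical names, not part of Peek's code, and the clock is injected so the logic can be exercised without real timers.

```javascript
// Hypothetical helper: tracks how long each window's visit lasted.
// `now` is injectable so the logic can be tested without real timers.
function createDurationTracker(now = () => Date.now()) {
  const openVisits = new Map(); // windowId -> { visitId, openedAt }

  return {
    // Call when a window starts showing an address.
    onWindowOpened(windowId, visitId) {
      openVisits.set(windowId, { visitId, openedAt: now() });
    },
    // Call when the window closes. Returns the visit ID and the duration in
    // milliseconds, or null if the window was never tracked.
    onWindowClosed(windowId) {
      const entry = openVisits.get(windowId);
      if (!entry) return null;
      openVisits.delete(windowId);
      return { visitId: entry.visitId, duration: now() - entry.openedAt };
    },
  };
}

// Usage with a fake clock
let fakeTime = 1000;
const tracker = createDurationTracker(() => fakeTime);
tracker.onWindowOpened('win_1', 'visit_5678');
fakeTime += 45000;
const closed = tracker.onWindowClosed('win_1');
```

The returned `duration` would then be written to the matching `visits` row through whatever update path the datastore API provides.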
359359+360360+### Phase 3: Binary Storage
361361+1. Implement filesystem storage for blobs
362362+2. Add image/video download capability
363363+3. Generate thumbnails
364364+4. Link blobs to addresses and content
365365+366366+### Phase 4: Filesystem Sync
367367+1. Implement bidirectional markdown sync
368368+2. Watch filesystem for changes
369369+3. Sync content table to markdown files
370370+4. Handle conflicts
371371+372372+### Phase 5: UI & Features
373373+1. Build history browser UI
374374+2. Add search interface
375375+3. Create feeds viewer
376376+4. Implement tagging UI
377377+5. Show stats dashboard
378378+379379+## Testing
380380+381381+### IPC API Testing
382382+383383+A test page was created at `app/datastore/test-ipc.html` to verify all IPC handlers work correctly.
384384+385385+To test in the app:
386386+1. Start Peek: `npm run debug`
387387+2. Open a peek (Alt+0-9) - navigation tracked via IPC
388388+3. Open a slide (Alt+arrows) - navigation tracked via IPC
389389+4. Configure and run a script - results saved via IPC
390390+5. Check main process console for IPC handler logs
391391+392392+### Verified Working
393393+- ✅ `datastore-add-address` - Address creation with URL parsing
394394+- ✅ `datastore-get-address` - Address retrieval
395395+- ✅ `datastore-update-address` - Address updates
396396+- ✅ `datastore-query-addresses` - Query with filters and sorting
397397+- ✅ `datastore-add-visit` - Visit tracking with stat updates
398398+- ✅ `datastore-query-visits` - Visit history queries
399399+- ✅ `datastore-add-content` - Content creation
400400+- ✅ `datastore-get-table` - Table access
401401+- ✅ `datastore-set-row` - Direct row manipulation
402402+- ✅ `datastore-get-stats` - Statistics aggregation
403403+404404+## Performance Notes
405405+406406+- **In-memory storage**: Fast but needs persistence
407407+- **Small overhead**: TinyBase is 5-11kB gzipped
408408+- **Reactive**: Changes trigger index/metric updates automatically
409409+- **Scalable**: Tested with addresses, visits, content - all working
410410+411411+## Architecture Benefits
412412+413413+✅ **Complete separation**: Datastore isolated in main process, features access via IPC
414414+✅ **Runtime portable**: Can migrate to Tauri without changing feature code
415415+✅ **Backend flexible**: Can swap TinyBase for SQLite, cloud, etc. without feature changes
416416+✅ **Cloud ready**: Same API can route to local or remote datastores
417417+✅ **Mobile ready**: Architecture supports future mobile app development
418418+✅ **Secure**: Datastore logic in trusted process with validated IPC access
419419+✅ **Type safety**: Schema validation prevents bad data
420420+✅ **Reactive**: Indexes and metrics update automatically (in main process)
421421+✅ **Testable**: IPC handlers individually tested and verified
422422+✅ **Extensible**: Easy to add new IPC handlers and tables
423423+✅ **Sync ready**: Built-in CRDT support for future multi-device sync
424424+425425+**Key Decision**: IPC-based architecture chosen over direct library access for maximum portability and flexibility. See [datastore-architecture.md](./datastore-architecture.md) for complete rationale.
426426+427427+## Conclusion
428428+429429+The IPC-based datastore integration is **functional and working** for:
430430+- Script data extraction and history (via async IPC)
431431+- Navigation history from peeks and slides (via history helpers)
432432+- Address management with automatic deduplication
433433+- Visit tracking with statistics
434434+- Complete isolation between storage and UI layers
435435+436436+The foundation is solid with:
437437+- ✅ All IPC handlers tested and verified
438438+- ✅ Async/await throughout for clean code
439439+- ✅ Error handling with structured responses
440440+- ✅ Main process datastore initialization
441441+- ✅ Preload script API exposure
442442+- ✅ Feature integration complete
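
The "structured responses" point can be illustrated with a small wrapper. The `success` and `data` fields match the usage shown earlier in this document; the `error` field and the `withStructuredResponse` name are assumptions made for this sketch, not the actual handler code.

```javascript
// Hypothetical wrapper: turns any handler into one that always resolves to
// { success: true, data } or { success: false, error } instead of throwing.
function withStructuredResponse(handler) {
  return async (...args) => {
    try {
      const data = await handler(...args);
      return { success: true, data };
    } catch (err) {
      // Returning the message as a plain string is an assumption about the shape.
      return {
        success: false,
        error: err instanceof Error ? err.message : String(err),
      };
    }
  };
}

// Illustrative handler with a single known address.
const getAddress = withStructuredResponse(async (id) => {
  if (id !== 'addr_1234') throw new Error(`Address not found: ${id}`);
  return { uri: 'https://example.com/article' };
});
```

Callers can then branch on `result.success` without wrapping every call in `try`/`catch`.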
443443+444444+Ready for next phases: **persistence**, enhanced tracking, and UI development.
445445+446446+**For complete architectural details**, see [datastore-architecture.md](./datastore-architecture.md).
+307
notes/datastore-research.md
···11+# Datastore Technology Research & Comparison
22+33+Research conducted: 2025-11-12
44+55+## Requirements Summary (from datastore.md)
66+77+### Primary Requirements
88+- Store various data types with metadata (tags, MIME types, annotations)
99+- Store feeds (navigation history, timeseries data, custom feeds)
1010+- Binary file support (images, videos) with filesystem references
1111+- Bidirectional filesystem sync for markdown/text files
1212+- Runtime/browser engine agnostic
1313+- Designed for sync (multi-device, cloud, collaboration)
1414+- Query capabilities (by type, tags, time, etc.)
1515+1616+### Performance Requirements
1717+- Fast local queries
1818+- Efficient indexing
1919+- Handle potentially large datasets (navigation history)
2020+- Reactive updates for UI
2121+2222+---
2323+2424+## Technology Comparison
2525+2626+### 1. TinyBase
2727+2828+**Overview**: Reactive data store with built-in sync engine
2929+3030+**Pros:**
3131+- ✅ **Tiny size**: 5.3kB-11.7kB gzipped, zero dependencies
3232+- ✅ **Reactive queries**: Built-in reactivity with granular listeners
3333+- ✅ **Native CRDT support**: Deterministic sync across clients
3434+- ✅ **Multiple persistence options**: IndexedDB, SQLite, PostgreSQL, files, OPFS
3535+- ✅ **Schema support**: Optional typed schemas with constraints and defaults
3636+- ✅ **Advanced queries**: TinyQL language, indexes, metrics, relationships
3737+- ✅ **Sync built-in**: WebSocket, BroadcastChannel, custom mediums
3838+- ✅ **Can integrate with**: Yjs, Automerge, CR-SQLite for additional CRDT options
3939+- ✅ **100% test coverage**: Well-tested and documented
4040+4141+**Cons:**
4242+- ⚠️ In-memory first (requires persistence layer configuration)
4343+- ⚠️ Newer library (less battle-tested than SQLite)
4444+- ⚠️ Learning curve for TinyQL query language
4545+- ⚠️ Limited ecosystem compared to SQL
4646+4747+**Fit for Peek:**
4848+- **Data modeling**: ★★★★★ (supports both key-value and tabular)
4949+- **Metadata/tags**: ★★★★★ (schemas, indexes, flexible structure)
5050+- **Navigation history**: ★★★★★ (indexes, metrics for aggregations)
5151+- **Binary files**: ★★★☆☆ (would need separate blob storage + references)
5252+- **Filesystem sync**: ★★★☆☆ (can persist to files, bidirectional needs custom logic)
5353+- **Collaboration**: ★★★★★ (native CRDT support, built-in sync)
5454+- **Runtime agnostic**: ★★★★★ (works anywhere JS runs)
5555+- **Performance**: ★★★★★ (optimized, minimal overhead)
5656+5757+**Best for**: Reactive UIs, real-time collaboration, local-first apps with sync
5858+5959+---
6060+6161+### 2. Automerge
6262+6363+**Overview**: JSON-like CRDT for collaborative applications
6464+6565+**Pros:**
6666+- ✅ **Built for collaboration**: Automatic conflict-free merging
6767+- ✅ **Offline-first**: Full functionality offline, queues changes
6868+- ✅ **Versioning**: Complete change history, branching, time travel
6969+- ✅ **High performance**: Compressed columnar storage, handles millions of changes
7070+- ✅ **Automerge Repo**: Built-in sync server backend
7171+- ✅ **Multi-language**: Rust core with JS, Swift, Python, C, Java bindings
7272+- ✅ **Framework integration**: React, Prosemirror, CodeMirror plugins
7373+- ✅ **Actively maintained**: Recent 3.0 release with 10x memory reduction
7474+7575+**Cons:**
7676+- ⚠️ **Not a database**: More of a data structure/sync protocol
7777+- ⚠️ **Requires additional storage**: Need separate persistence layer
7878+- ⚠️ **Learning curve**: CRDT concepts and document-based model
7979+- ⚠️ **Query limitations**: No SQL-like queries, need to build on top
8080+- ⚠️ **Larger size**: More overhead than minimal solutions
8181+- ⚠️ **Best for documents**: JSON-like data, less suited for relational queries
8282+8383+**Fit for Peek:**
8484+- **Data modeling**: ★★★☆☆ (JSON-like, need to structure carefully)
8585+- **Metadata/tags**: ★★★★☆ (flexible JSON structure)
8686+- **Navigation history**: ★★★☆☆ (can store, but querying is manual)
8787+- **Binary files**: ★☆☆☆☆ (not designed for blobs)
8888+- **Filesystem sync**: ★★★★☆ (excellent sync, but need custom file integration)
8989+- **Collaboration**: ★★★★★ (core strength, best-in-class)
9090+- **Runtime agnostic**: ★★★★★ (Rust core, multiple language bindings)
9191+- **Performance**: ★★★★☆ (good for sync, less optimized for queries)
9292+9393+**Best for**: Collaborative documents, offline-first sync, version control needs
9494+9595+---
9696+9797+### 3. SQLite (via better-sqlite3)
9898+9999+**Overview**: Traditional relational database, synchronous Node.js bindings
100100+101101+**Pros:**
102102+- ✅ **Battle-tested**: Decades of production use, extremely reliable
103103+- ✅ **SQL queries**: Powerful relational queries, joins, aggregations
104104+- ✅ **Fast**: 2000+ queries/sec possible with proper indexing
105105+- ✅ **Small overhead**: Single file database, minimal dependencies
106106+- ✅ **Full-text search**: Built-in FTS5 for text searching
107107+- ✅ **Transactions**: ACID compliance, WAL mode for performance
108108+- ✅ **Synchronous API**: Simpler than async (better-sqlite3)
109109+- ✅ **Widely known**: Easier to find developers/documentation
110110+- ✅ **JSON support**: JSON1 extension for flexible data
111111+112112+**Cons:**
113113+- ⚠️ **No built-in sync**: Need to build custom sync layer
114114+- ⚠️ **No CRDT support**: Conflicts require manual resolution
115115+- ⚠️ **Not reactive**: Need to build change listeners
116116+- ⚠️ **File locking**: Single writer, can cause issues with sync
117117+- ⚠️ **Electron-specific build**: Native module, must be rebuilt for Electron compatibility
118118+- ⚠️ **Main thread blocking**: Synchronous operations can freeze UI
119119+120120+**Fit for Peek:**
121121+- **Data modeling**: ★★★★★ (relational, flexible schemas)
122122+- **Metadata/tags**: ★★★★★ (relations, indexes, JSON fields)
123123+- **Navigation history**: ★★★★★ (perfect for timeseries queries)
124124+- **Binary files**: ★★★★☆ (can store blobs or references efficiently)
125125+- **Filesystem sync**: ★★☆☆☆ (can persist, but bidirectional sync is complex)
126126+- **Collaboration**: ★☆☆☆☆ (no native sync, requires significant custom work)
127127+- **Runtime agnostic**: ★★★☆☆ (SQLite is portable, but bindings are platform-specific)
128128+- **Performance**: ★★★★★ (excellent for local queries)
129129+130130+**Best for**: Complex queries, relational data, local-only or simple sync needs
131131+132132+---
133133+134134+### 4. Dexie.js
135135+136136+**Overview**: IndexedDB wrapper with promise-based API
137137+138138+**Pros:**
139139+- ✅ **Simple API**: Much easier than raw IndexedDB
140140+- ✅ **Live queries**: Reactive liveQuery() function
141141+- ✅ **Advanced queries**: Case-insensitive search, prefix matching, OR operations
142142+- ✅ **Browser-native**: Uses IndexedDB, no external dependencies
143143+- ✅ **Real classes**: Map classes to tables
144144+- ✅ **Performance optimized**: Bulk operations, batching
145145+- ✅ **Cross-platform**: Browsers, Electron, Capacitor, PWAs
146146+- ✅ **Widely used**: 100,000+ projects, battle-tested
147147+- ✅ **Dexie Cloud**: Optional commercial sync add-on
148148+- ✅ **Bug workarounds**: Handles IndexedDB inconsistencies
149149+150150+**Cons:**
151151+- ⚠️ **IndexedDB limitations**: Key-value store, limited query capabilities
152152+- ⚠️ **No built-in sync**: Need Dexie Cloud (commercial) or custom solution
153153+- ⚠️ **Browser-focused**: Less ideal for Node.js/backend
154154+- ⚠️ **Larger bundle**: 33.1kB minified+gzipped
155155+- ⚠️ **Schema migrations**: Can be tricky with IndexedDB
156156+157157+**Fit for Peek:**
158158+- **Data modeling**: ★★★★☆ (key-value with indexes, flexible)
159159+- **Metadata/tags**: ★★★★☆ (can index and query efficiently)
160160+- **Navigation history**: ★★★★☆ (good for timeseries with indexes)
161161+- **Binary files**: ★★★★☆ (IndexedDB can store blobs)
162162+- **Filesystem sync**: ★★☆☆☆ (browser-focused, no native file sync)
163163+- **Collaboration**: ★★☆☆☆ (Dexie Cloud or custom sync needed)
164164+- **Runtime agnostic**: ★★★☆☆ (browser/Electron focused)
165165+- **Performance**: ★★★★☆ (good for browser workloads)
166166+167167+**Best for**: Browser-based apps, Electron apps not needing server sync
168168+169169+---
170170+171171+### 5. PouchDB
172172+173173+**Overview**: CouchDB-compatible database for browser and Node.js
174174+175175+**Status**: ⚠️ **Declining ecosystem** - Removed from RxDB, fewer active projects
176176+177177+**Pros:**
178178+- ✅ **CouchDB sync**: Seamless replication with CouchDB servers
179179+- ✅ **Offline-first**: Designed for offline operation
180180+- ✅ **Multi-platform**: Browser (IndexedDB), Node.js (LevelDB)
181181+- ✅ **Change notifications**: Listen to database changes
182182+- ✅ **Document-based**: Flexible JSON documents
183183+184184+**Cons:**
185185+- ⚠️ **Declining support**: Being phased out of modern projects
186186+- ⚠️ **Performance issues**: Slower than alternatives
187187+- ⚠️ **Large bundle size**: More overhead than newer solutions
188188+- ⚠️ **Complex replication**: CouchDB protocol has quirks
189189+- ⚠️ **Limited queries**: Map-reduce only, no SQL-like queries
190190+191191+**Recommendation**: ❌ **Not recommended** for new projects in 2025
192192+193193+---
194194+195195+## Hybrid Approaches
196196+197197+### Option A: TinyBase + File Storage
198198+- Use TinyBase for structured data, indexes, queries
199199+- Use filesystem for binary files (referenced by hash/ID in TinyBase)
200200+- Leverage TinyBase's native CRDT for sync
201201+- Add custom filesystem sync for markdown bidirectional sync
202202+203203+**Pros**: Best of both worlds, reactive, built-in sync
204204+**Cons**: Need to manage two storage systems
205205+206206+### Option B: SQLite + Automerge
207207+- Use SQLite for local queries and storage
208208+- Use Automerge for sync protocol
209209+- Translate between SQLite and Automerge documents
210210+211211+**Pros**: Powerful queries + best-in-class sync
212212+**Cons**: Complex integration, two systems to maintain
213213+214214+### Option C: TinyBase with SQLite Persistence
215215+- Use TinyBase API and reactivity
216216+- Persist to SQLite for durability and querying
217217+- Best of reactive store + SQL power
218218+219219+**Pros**: Reactive + SQL + sync capabilities
220220+**Cons**: Some complexity in persistence layer
221221+222222+---
223223+224224+## Recommendations
225225+226226+### For Peek v1 (MVP): **TinyBase + File Storage**
227227+228228+**Rationale:**
229229+1. **Meets all requirements**: Handles structured data, metadata, tags, history
230230+2. **Sync built-in**: Native CRDT support for future multi-device sync
231231+3. **Reactive**: Perfect for Peek's modal, event-driven UI
232232+4. **Small footprint**: Minimal bundle size (5-11kB)
233233+5. **Flexible persistence**: Can use SQLite backend if needed later
234234+6. **Runtime agnostic**: Works anywhere JS runs
235235+7. **Active development**: Well-maintained, modern codebase
236236+237237+**Architecture:**
238238+```
239239+TinyBase Store
240240+├── addresses (table)
241241+├── visits (table)
242242+├── notes (table)
243243+├── tags (table)
244244+└── blobs (table - metadata only)
245245+246246+Filesystem
247247+└── blobs/
248248+ ├── {hash}.jpg
249249+ ├── {hash}.png
250250+ └── {hash}.pdf
251251+252252+Markdown Sync
253253+└── notes/
254254+ ├── note1.md (bidirectional sync)
255255+ └── note2.md
256256+```
257257+258258+**Implementation phases:**
259259+1. Start with TinyBase in-memory + IndexedDB persistence
260260+2. Add file storage for binaries
261261+3. Add markdown filesystem sync
262262+4. Add SQLite persistence option for performance
263263+5. Enable sync features for multi-device support
264264+265265+### For Future (v2+): **Add Automerge for Advanced Collaboration**
266266+267267+If Peek expands to real-time collaboration scenarios:
268268+- Use TinyBase for local store and queries
269269+- Add Automerge for document-level collaboration
270270+- Use Automerge Repo for sync infrastructure
271271+272272+---
273273+274274+## Decision Matrix
275275+276276+| Feature | TinyBase | Automerge | SQLite | Dexie | PouchDB |
277277+|---------|----------|-----------|--------|-------|---------|
278278+| Local Queries | ★★★★★ | ★★☆☆☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
279279+| Sync Built-in | ★★★★★ | ★★★★★ | ★☆☆☆☆ | ★★☆☆☆ | ★★★★☆ |
280280+| Reactivity | ★★★★★ | ★★★☆☆ | ★☆☆☆☆ | ★★★★☆ | ★★★☆☆ |
281281+| Binary Storage | ★★★☆☆ | ★☆☆☆☆ | ★★★★☆ | ★★★★☆ | ★★★☆☆ |
282282+| Size/Performance | ★★★★★ | ★★★★☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
283283+| Ecosystem | ★★★☆☆ | ★★★★☆ | ★★★★★ | ★★★★☆ | ★★☆☆☆ |
284284+| Learning Curve | ★★★☆☆ | ★★☆☆☆ | ★★★★★ | ★★★★☆ | ★★★☆☆ |
285285+| **Total** | **29/35** | **21/35** | **26/35** | **26/35** | **19/35** |
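
The totals can be re-derived mechanically by counting filled stars per column. A small sketch, where `scores` simply transcribes the star counts from the rows above (the variable names are illustrative):

```javascript
// Star counts transcribed from the matrix rows, in column order.
const columns = ['TinyBase', 'Automerge', 'SQLite', 'Dexie', 'PouchDB'];
const scores = {
  'Local Queries':    [5, 2, 5, 4, 2],
  'Sync Built-in':    [5, 5, 1, 2, 4],
  'Reactivity':       [5, 3, 1, 4, 3],
  'Binary Storage':   [3, 1, 4, 4, 3],
  'Size/Performance': [5, 4, 5, 4, 2],
  'Ecosystem':        [3, 4, 5, 4, 2],
  'Learning Curve':   [3, 2, 5, 4, 3],
};

// Sum each column across all feature rows.
const totals = Object.fromEntries(
  columns.map((name, i) => [
    name,
    Object.values(scores).reduce((sum, row) => sum + row[i], 0),
  ])
);

// Maximum possible score: 7 features x 5 stars.
const maxScore = Object.keys(scores).length * 5;
```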
286286+287287+---
288288+289289+## Next Steps
290290+291291+1. ✅ Complete research phase
292292+2. ⏭️ Prototype TinyBase with basic CRUD operations
293293+3. ⏭️ Test with Peek use case (storing URLs from peeks)
294294+4. ⏭️ Evaluate performance with realistic data volumes
295295+5. ⏭️ Design schema for addresses, visits, notes, metadata
296296+6. ⏭️ Implement file storage integration
297297+7. ⏭️ Build datastore API for Peek features
298298+299299+---
300300+301301+## References
302302+303303+- TinyBase: https://tinybase.org/
304304+- Automerge: https://automerge.org/
305305+- better-sqlite3: https://github.com/WiseLibs/better-sqlite3
306306+- Dexie.js: https://dexie.org/
307307+- PouchDB: https://pouchdb.com/
+892
notes/datastore-schema.md
···11+# Peek Datastore Schema Design
22+33+Version: 1.0
44+Date: 2025-11-12
55+Technology: TinyBase
66+77+---
88+99+## Overview
1010+1111+This schema design uses TinyBase's tabular data model with the following principles:
1212+- Each table stores a specific entity type
1313+- Relationships via ID references
1414+- Flexible metadata using JSON cells
1515+- Indexes for common queries
1616+- Designed for reactivity and efficient queries
1717+1818+---
1919+2020+## Core Tables
2121+2222+### 1. `addresses` - URL/URI Index
2323+2424+Stores all web addresses and URIs that Peek interacts with.
2525+2626+```javascript
2727+{
2828+ rowId: string, // Auto-generated or hash of URI
2929+ cells: {
3030+ uri: string, // The full URI (required)
3131+ protocol: string, // http, https, ipfs, etc.
3232+ domain: string, // Extracted domain for querying
3333+ path: string, // URL path component
3434+ title: string, // Page title (if known)
3535+ mimeType: string, // Content MIME type
3636+ favicon: string, // Favicon URL or data URI
3737+ description: string, // Meta description or user note
3838+ tags: string, // Comma-separated tag IDs (for indexing)
3939+ metadata: string, // JSON string for flexible metadata
4040+ createdAt: number, // Unix timestamp (ms)
4141+ updatedAt: number, // Unix timestamp (ms)
4242+ lastVisitAt: number, // Unix timestamp of most recent visit
4343+ visitCount: number, // Total number of visits
4444+ starred: number, // 0 or 1 (boolean for indexing)
4545+ archived: number // 0 or 1 (boolean for indexing)
4646+ }
4747+}
4848+```
4949+5050+**Indexes:**
5151+- `byDomain` - Group by domain for domain-level queries
5252+- `byProtocol` - Filter by protocol type
5353+- `byTag` - Index on tags field for tag filtering
5454+- `byStarred` - Quick access to starred addresses
5555+- `byLastVisit` - Sort by most recent visit
5656+5757+**Example Row:**
5858+```javascript
5959+{
6060+ 'addr_1234': {
6161+ uri: 'https://example.com/article',
6262+ protocol: 'https',
6363+ domain: 'example.com',
6464+ path: '/article',
6565+ title: 'Example Article',
6666+ mimeType: 'text/html',
6767+ favicon: 'https://example.com/favicon.ico',
6868+ description: 'An interesting article',
6969+ tags: 'tag_1,tag_5',
7070+ metadata: '{"author":"John","lang":"en"}',
7171+ createdAt: 1699564800000,
7272+ updatedAt: 1699564800000,
7373+ lastVisitAt: 1699651200000,
7474+ visitCount: 5,
7575+ starred: 1,
7676+ archived: 0
7777+ }
7878+}
7979+```
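
Several `addresses` cells (`protocol`, `domain`, `path`) can be derived from the URI itself. The following is a minimal sketch using the standard WHATWG `URL` class; `buildAddressCells` is a hypothetical helper name, and the real address-creation handler may differ.

```javascript
// Hypothetical helper: derives the `addresses` cells that can be computed
// from the URI alone. `URL` is a global in both Node.js and browsers.
function buildAddressCells(uri, now = Date.now()) {
  const url = new URL(uri); // throws a TypeError for an invalid URI
  return {
    uri,
    protocol: url.protocol.replace(/:$/, ''), // 'https:' -> 'https'
    domain: url.hostname,
    path: url.pathname,
    createdAt: now,
    updatedAt: now,
    lastVisitAt: 0,
    visitCount: 0,
    starred: 0,
    archived: 0,
  };
}

const cells = buildAddressCells('https://example.com/article', 1699564800000);
```

Cells not listed here (title, favicon, tags, and so on) would come from schema defaults or from the caller.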
8080+8181+---
8282+8383+### 2. `visits` - Navigation History
8484+8585+Tracks every visit to an address with temporal data.
8686+8787+```javascript
8888+{
8989+ rowId: string, // Auto-generated unique ID
9090+ cells: {
9191+ addressId: string, // Reference to addresses table (required)
9292+ timestamp: number, // Unix timestamp when visit occurred
9393+ duration: number, // Time spent in milliseconds (0 if unknown)
9494+ source: string, // How the visit originated: 'peek', 'slide', 'direct', 'link', etc.
9595+ sourceId: string, // ID of source feature if applicable
9696+ windowType: string, // 'modal', 'persistent', 'main', etc.
9797+ metadata: string, // JSON string for flexible data
9898+ scrollDepth: number, // Percentage scrolled (0-100)
9999+ interacted: number // 0 or 1 (clicked, typed, etc.)
100100+ }
101101+}
102102+```
103103+104104+**Indexes:**
105105+- `byAddress` - Group visits by address
106106+- `byTimestamp` - Sort chronologically
107107+- `bySource` - Filter by entry source
108108+- `byDate` - Index by date (derived from timestamp)
109109+110110+**Example Row:**
111111+```javascript
112112+{
113113+ 'visit_5678': {
114114+ addressId: 'addr_1234',
115115+ timestamp: 1699651200000,
116116+ duration: 45000,
117117+ source: 'peek',
118118+ sourceId: 'peek_3',
119119+ windowType: 'modal',
120120+ metadata: '{"referrer":"addr_9999"}',
121121+ scrollDepth: 80,
122122+ interacted: 1
123123+ }
124124+}
125125+```
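
Adding a visit also has to keep `visitCount` and `lastVisitAt` on the parent address in step. The sketch below shows that bookkeeping with plain objects standing in for the tables; `recordVisit` and the row ID scheme are illustrative assumptions, not the real store API.

```javascript
// Plain objects stand in for the `addresses` and `visits` tables here.
function recordVisit(tables, addressId, { timestamp, source = 'direct', windowType = 'main' }) {
  const address = tables.addresses[addressId];
  if (!address) throw new Error(`Unknown address: ${addressId}`);

  // Row ID scheme is illustrative only.
  const visitId = `visit_${Object.keys(tables.visits).length + 1}`;
  tables.visits[visitId] = {
    addressId,
    timestamp,
    duration: 0, // unknown until the window closes
    source,
    sourceId: '',
    windowType,
    metadata: '{}',
    scrollDepth: 0,
    interacted: 0,
  };

  // Keep the denormalized statistics on the address in step with the visit log.
  address.visitCount += 1;
  address.lastVisitAt = Math.max(address.lastVisitAt, timestamp);
  return visitId;
}

const tables = {
  addresses: { addr_1234: { visitCount: 4, lastVisitAt: 1699564800000 } },
  visits: {},
};
const visitId = recordVisit(tables, 'addr_1234', {
  timestamp: 1699651200000,
  source: 'peek',
  windowType: 'modal',
});
```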
126126+127127+---
128128+129129+### 3. `content` - Text Content
130130+131131+Stores any text-based content: notes, CSV data, plain text, markdown documents, code snippets, etc.
132132+May or may not be linked to addresses.
133133+134134+```javascript
135135+{
136136+ rowId: string, // Auto-generated unique ID
137137+ cells: {
138138+ title: string, // Content title or description
139139+ content: string, // The actual text content
140140+ mimeType: string, // text/markdown, text/plain, text/csv, text/html, application/json, etc.
141141+ contentType: string, // Coarse type for easier querying: 'markdown', 'plain', 'csv', 'json', 'html', 'code'
142142+ language: string, // Language/syntax if code (js, py, etc.) or human language (en, es)
143143+ encoding: string, // Character encoding (default: utf-8)
144144+ tags: string, // Comma-separated tag IDs
145145+ addressRefs: string, // Comma-separated address IDs this content references or was sourced from
146146+ parentId: string, // Parent content ID for hierarchies (optional)
147147+ metadata: string, // JSON string for flexible metadata (headers for CSV, etc.)
148148+ createdAt: number, // Unix timestamp
149149+ updatedAt: number, // Unix timestamp
150150+ syncPath: string, // Filesystem path if synced (e.g., 'content/data.csv', 'notes/note.md')
151151+ synced: number, // 0 or 1 - whether synced to filesystem
152152+ starred: number, // 0 or 1
153153+ archived: number // 0 or 1
154154+ }
155155+}
156156+```
157157+158158+**Indexes:**
159159+- `byTag` - Filter by tags
160160+- `byContentType` - Filter by content type (markdown, csv, plain, etc.)
161161+- `byMimeType` - Filter by specific MIME type
162162+- `byAddress` - Content referencing specific addresses
163163+- `bySynced` - Find filesystem-synced content
164164+- `byUpdated` - Sort by most recently updated
165165+166166+**Example Rows:**
167167+168168+*Markdown note:*
169169+```javascript
170170+{
171171+ 'content_9012': {
172172+ title: 'Meeting Notes - Project Kick-off',
173173+ content: '# Project Kick-off\n\n- Discuss goals\n- Set timeline',
174174+ mimeType: 'text/markdown',
175175+ contentType: 'markdown',
176176+ language: 'en',
177177+ encoding: 'utf-8',
178178+ tags: 'tag_2,tag_8',
179179+ addressRefs: 'addr_1234,addr_5678',
180180+ parentId: '',
181181+ metadata: '{"mood":"productive","location":"office"}',
182182+ createdAt: 1699564800000,
183183+ updatedAt: 1699651200000,
184184+ syncPath: 'content/meeting-2024-11-12.md',
185185+ synced: 1,
186186+ starred: 0,
187187+ archived: 0
188188+ }
189189+}
190190+```
191191+192192+*CSV data:*
193193+```javascript
194194+{
195195+ 'content_9013': {
196196+ title: 'Product Price List',
197197+ content: 'product,price,stock\nWidget,19.99,150\nGadget,29.99,87',
198198+ mimeType: 'text/csv',
199199+ contentType: 'csv',
200200+ language: '',
201201+ encoding: 'utf-8',
202202+ tags: 'tag_5',
203203+ addressRefs: 'addr_shop',
204204+ parentId: '',
205205+ metadata: '{"delimiter":"comma","hasHeader":true,"columns":3}',
206206+ createdAt: 1699564800000,
207207+ updatedAt: 1699651200000,
208208+ syncPath: 'content/prices.csv',
209209+ synced: 1,
210210+ starred: 0,
211211+ archived: 0
212212+ }
213213+}
214214+```
215215+216216+*Code snippet:*
217217+```javascript
218218+{
219219+ 'content_9014': {
220220+ title: 'Auth Helper Function',
221221+ content: 'function authenticate(user, pass) {\n return hash(pass) === user.hash;\n}',
222222+ mimeType: 'text/javascript',
223223+ contentType: 'code',
224224+ language: 'javascript',
225225+ encoding: 'utf-8',
226226+ tags: 'tag_7',
227227+ addressRefs: '',
228228+ parentId: '',
229229+ metadata: '{"syntax":"js","lines":3}',
230230+ createdAt: 1699564800000,
231231+ updatedAt: 1699564800000,
232232+ syncPath: '',
233233+ synced: 0,
234234+ starred: 1,
235235+ archived: 0
236236+ }
237237+}
238238+```
239239+240240+---
241241+242242+### 4. `tags` - Tag Taxonomy
243243+244244+Hierarchical tag system for organizing all entities.
245245+246246+```javascript
247247+{
248248+ rowId: string, // Auto-generated unique ID
249249+ cells: {
250250+ name: string, // Tag name (required, unique)
251251+ slug: string, // URL-safe version of name
252252+ color: string, // Hex color for UI (#FF5733)
253253+ parentId: string, // Parent tag ID for hierarchies
254254+ description: string, // Tag description
255255+ metadata: string, // JSON string for flexible metadata
256256+ createdAt: number, // Unix timestamp
257257+ updatedAt: number, // Unix timestamp
258258+ usageCount: number // Cached count of how many times used
259259+ }
260260+}
261261+```
262262+263263+**Indexes:**
264264+- `byName` - Lookup by name
265265+- `byParent` - Find child tags
266266+- `byUsage` - Sort by popularity
267267+268268+**Example Row:**
269269+```javascript
270270+{
271271+ 'tag_1': {
272272+ name: 'Work',
273273+ slug: 'work',
274274+ color: '#3498db',
275275+ parentId: '',
276276+ description: 'Work-related content',
277277+ metadata: '{}',
278278+ createdAt: 1699564800000,
279279+ updatedAt: 1699564800000,
280280+ usageCount: 150
281281+ }
282282+}
283283+```
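
Because the `tags` cell on other tables is a comma-separated string of tag IDs, reads and writes need a little care. A possible set of helpers is sketched below; the function names are hypothetical.

```javascript
// Tags are stored as a comma-separated string of tag IDs, e.g. 'tag_1,tag_5'.
const parseTags = (cell) => (cell ? cell.split(',').filter(Boolean) : []);

// Joins IDs back into a cell value, dropping duplicates.
const formatTags = (ids) => [...new Set(ids)].join(',');

// Exact-match membership test. A naive cell.includes('tag_1') would also
// match 'tag_10', so whole IDs are compared instead.
const hasTag = (cell, tagId) => parseTags(cell).includes(tagId);

const addTag = (cell, tagId) => formatTags([...parseTags(cell), tagId]);

const removeTag = (cell, tagId) =>
  formatTags(parseTags(cell).filter((id) => id !== tagId));
```

The same exact-match concern applies to any index or filter built on the raw string.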
284284+285285+---
286286+287287+### 5. `blobs` - Binary File References
288288+289289+Metadata index for binary files (images, videos, PDFs, etc.).
290290+Actual files are stored on the filesystem at `{userData}/{PROFILE}/datastore/blobs/`
291291+292292+```javascript
293293+{
294294+ rowId: string, // Content hash (SHA-256) serves as ID
295295+ cells: {
296296+ filename: string, // Original filename
297297+ mimeType: string, // MIME type (image/jpeg, video/mp4, application/pdf, etc.)
298298+ mediaType: string, // Coarse type: 'image', 'video', 'audio', 'document', 'archive'
299299+ size: number, // File size in bytes
300300+ hash: string, // Content hash (same as rowId, for convenience)
301301+ extension: string, // File extension (.jpg, .mp4, etc.)
302302+ path: string, // Relative path in blob storage
303303+ addressId: string, // Source address if downloaded from web
304304+ contentId: string, // Associated content item if any
305305+ tags: string, // Comma-separated tag IDs
306306+ metadata: string, // JSON: dimensions, duration, EXIF, etc.
307307+ createdAt: number, // Unix timestamp when added
308308+ width: number, // Image/video width (if applicable)
309309+ height: number, // Image/video height (if applicable)
310310+ duration: number, // Audio/video duration in seconds (if applicable)
311311+ thumbnail: string // Path to thumbnail if generated
312312+ }
313313+}
314314+```
315315+316316+**Indexes:**
317317+- `byMediaType` - Filter by media type
318318+- `byMimeType` - Filter by MIME type
319319+- `byAddress` - Find blobs from specific address
320320+- `byTag` - Filter by tags
321321+- `byDate` - Sort by date added
322322+323323+**Example Row:**
324324+```javascript
325325+{
326326+ 'sha256_abc123...': {
327327+ filename: 'screenshot.png',
328328+ mimeType: 'image/png',
329329+ mediaType: 'image',
330330+ size: 1024768,
331331+ hash: 'sha256_abc123...',
332332+ extension: '.png',
333333+ path: 'blobs/sha256_abc123.png',
334334+ addressId: 'addr_1234',
335335+ contentId: 'content_9012',
336336+ tags: 'tag_3',
337337+ metadata: '{"exif":{"camera":"iPhone"},"location":"home"}',
338338+ createdAt: 1699564800000,
339339+ width: 1920,
340340+ height: 1080,
341341+ duration: 0,
342342+ thumbnail: 'blobs/thumbs/sha256_abc123_thumb.jpg'
343343+ }
344344+}
345345+```
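
Some `blobs` cells follow directly from the hash, filename, and MIME type. A sketch under stated assumptions: `buildBlobCells` is a hypothetical helper, computing the SHA-256 hash is left to the caller, and the coarse `mediaType` mapping (anything that is not image, video, or audio becomes `'document'`) is a simplification that does not detect archives.

```javascript
// Hypothetical helper: fills in the cells that follow directly from the
// content hash, filename, and MIME type. Hashing the file is out of scope.
function buildBlobCells({ hash, filename, mimeType, size }) {
  const dot = filename.lastIndexOf('.');
  const extension = dot > 0 ? filename.slice(dot).toLowerCase() : '';

  // Coarse type: use the top-level MIME type where it matches directly.
  const top = mimeType.split('/')[0];
  const mediaType = ['image', 'video', 'audio'].includes(top) ? top : 'document';

  return {
    filename,
    mimeType,
    mediaType,
    size,
    hash,
    extension,
    path: `blobs/${hash}${extension}`, // relative path in blob storage
  };
}

const blob = buildBlobCells({
  hash: 'sha256_abc123',
  filename: 'screenshot.png',
  mimeType: 'image/png',
  size: 1024768,
});
```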
346346+347347+---
348348+349349+### 6. `scripts_data` - Script Extraction Results
351351+Stores data extracted by the background Scripts feature.
352352+353353+```javascript
354354+{
355355+ rowId: string, // Auto-generated unique ID
356356+ cells: {
357357+ scriptId: string, // ID of script that extracted this data
358358+ scriptName: string, // Script name for easier querying
359359+ addressId: string, // Source address
360360+ selector: string, // CSS selector used
361361+ content: string, // Extracted content
362362+ contentType: string, // text, number, html, json, etc.
363363+ metadata: string, // JSON string for flexible metadata
364364+ extractedAt: number, // Unix timestamp when extracted
365365+ previousValue: string, // Previous value for change detection
366366+ changed: number // 0 or 1 - whether changed since last run
367367+ }
368368+}
369369+```
370370+371371+**Indexes:**
372372+- `byScript` - Group by script
373373+- `byAddress` - Filter by source address
374374+- `byTimestamp` - Sort chronologically
375375+- `byChanged` - Find changed values
376376+377377+**Example Row:**
378378+```javascript
379379+{
380380+ 'script_data_3456': {
381381+ scriptId: 'script_1',
382382+ scriptName: 'Weather Monitor',
383383+ addressId: 'addr_weather',
384384+ selector: '.temperature',
385385+ content: '72°F',
386386+ contentType: 'text',
387387+ metadata: '{"unit":"fahrenheit","location":"SF"}',
388388+ extractedAt: 1699651200000,
389389+ previousValue: '70°F',
390390+ changed: 1
391391+ }
392392+}
393393+```
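
The `previousValue` and `changed` cells support change detection between runs. One way this might be computed is sketched below; `buildExtractionCells` is a hypothetical name, and treating the first run as unchanged is an assumption.

```javascript
// Builds the change-detection cells for one extraction run.
// `previousRow` is the most recent row for the same script and selector,
// or undefined on the first run.
function buildExtractionCells(previousRow, content, extractedAt) {
  const previousValue = previousRow ? previousRow.content : '';
  return {
    content,
    extractedAt,
    previousValue,
    // Assumption: the first run counts as "unchanged".
    changed: previousRow && previousRow.content !== content ? 1 : 0,
  };
}

const next = buildExtractionCells({ content: '70°F' }, '72°F', 1699651200000);
```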
394394+395395+---
396396+397397+### 7. `feeds` - Custom Feed Definitions
398398+399399+Defines custom feeds and their queries/sources.
400400+401401+```javascript
402402+{
403403+ rowId: string, // Auto-generated unique ID
404404+ cells: {
405405+ name: string, // Feed name
406406+ description: string, // Feed description
407407+ type: string, // 'query', 'script', 'external', 'aggregated'
408408+ query: string, // Query definition (TinyQL or JSON query object)
409409+ schedule: string, // Cron-like schedule for updates (if applicable)
410410+ source: string, // External URL or internal source
411411+ tags: string, // Comma-separated tag IDs
412412+ metadata: string, // JSON string for flexible metadata
413413+ createdAt: number, // Unix timestamp
414414+ updatedAt: number, // Unix timestamp
415415+ lastFetchedAt: number, // Unix timestamp of last update
416416+ enabled: number // 0 or 1 - whether feed is active
417417+ }
418418+}
419419+```
420420+421421+**Indexes:**
422422+- `byType` - Filter by feed type
423423+- `byEnabled` - Find active feeds
424424+- `byTag` - Filter by tags
425425+426426+**Example Row:**
427427+```javascript
428428+{
429429+ 'feed_7890': {
430430+ name: 'Recent Work Links',
431431+ description: 'Links tagged work from last 7 days',
432432+ type: 'query',
433433+ query: '{"table":"addresses","where":{"tags":"tag_1"},"since":"7d"}',
434434+ schedule: '0 9 * * *',
435435+ source: 'internal',
436436+ tags: 'tag_1',
437437+ metadata: '{"format":"rss"}',
438438+ createdAt: 1699564800000,
439439+ updatedAt: 1699651200000,
440440+ lastFetchedAt: 1699651200000,
441441+ enabled: 1
442442+ }
443443+}
444444+```
445445+446446+---
447447+448448+## Schema Definition (TinyBase Format)
449449+450450+```javascript
451451+const schema = {
452452+ addresses: {
453453+ uri: { type: 'string' },
454454+ protocol: { type: 'string', default: 'https' },
455455+ domain: { type: 'string' },
456456+ path: { type: 'string', default: '' },
457457+ title: { type: 'string', default: '' },
458458+ mimeType: { type: 'string', default: 'text/html' },
459459+ favicon: { type: 'string', default: '' },
460460+ description: { type: 'string', default: '' },
461461+ tags: { type: 'string', default: '' },
462462+ metadata: { type: 'string', default: '{}' },
463463+ createdAt: { type: 'number' },
464464+ updatedAt: { type: 'number' },
465465+ lastVisitAt: { type: 'number', default: 0 },
466466+ visitCount: { type: 'number', default: 0 },
467467+ starred: { type: 'number', default: 0 },
468468+ archived: { type: 'number', default: 0 }
469469+ },
470470+471471+ visits: {
472472+ addressId: { type: 'string' },
473473+ timestamp: { type: 'number' },
474474+ duration: { type: 'number', default: 0 },
475475+ source: { type: 'string', default: 'direct' },
476476+ sourceId: { type: 'string', default: '' },
477477+ windowType: { type: 'string', default: 'main' },
478478+ metadata: { type: 'string', default: '{}' },
479479+ scrollDepth: { type: 'number', default: 0 },
480480+ interacted: { type: 'number', default: 0 }
481481+ },
482482+483483+ content: {
484484+ title: { type: 'string', default: 'Untitled' },
485485+ content: { type: 'string', default: '' },
486486+ mimeType: { type: 'string', default: 'text/plain' },
487487+ contentType: { type: 'string', default: 'plain' },
488488+ language: { type: 'string', default: '' },
489489+ encoding: { type: 'string', default: 'utf-8' },
490490+ tags: { type: 'string', default: '' },
491491+ addressRefs: { type: 'string', default: '' },
492492+ parentId: { type: 'string', default: '' },
493493+ metadata: { type: 'string', default: '{}' },
494494+ createdAt: { type: 'number' },
495495+ updatedAt: { type: 'number' },
496496+ syncPath: { type: 'string', default: '' },
497497+ synced: { type: 'number', default: 0 },
498498+ starred: { type: 'number', default: 0 },
499499+ archived: { type: 'number', default: 0 }
500500+ },
501501+502502+ tags: {
503503+ name: { type: 'string' },
504504+ slug: { type: 'string' },
505505+ color: { type: 'string', default: '#999999' },
506506+ parentId: { type: 'string', default: '' },
507507+ description: { type: 'string', default: '' },
508508+ metadata: { type: 'string', default: '{}' },
509509+ createdAt: { type: 'number' },
510510+ updatedAt: { type: 'number' },
511511+ usageCount: { type: 'number', default: 0 }
512512+ },
513513+514514+ blobs: {
515515+ filename: { type: 'string' },
516516+ mimeType: { type: 'string' },
517517+ mediaType: { type: 'string' },
518518+ size: { type: 'number' },
519519+ hash: { type: 'string' },
520520+ extension: { type: 'string' },
521521+ path: { type: 'string' },
522522+ addressId: { type: 'string', default: '' },
523523+ contentId: { type: 'string', default: '' },
524524+ tags: { type: 'string', default: '' },
525525+ metadata: { type: 'string', default: '{}' },
526526+ createdAt: { type: 'number' },
527527+ width: { type: 'number', default: 0 },
528528+ height: { type: 'number', default: 0 },
529529+ duration: { type: 'number', default: 0 },
530530+ thumbnail: { type: 'string', default: '' }
531531+ },
532532+533533+ scripts_data: {
534534+ scriptId: { type: 'string' },
535535+ scriptName: { type: 'string' },
536536+ addressId: { type: 'string' },
537537+ selector: { type: 'string' },
538538+ content: { type: 'string' },
539539+ contentType: { type: 'string', default: 'text' },
540540+ metadata: { type: 'string', default: '{}' },
541541+ extractedAt: { type: 'number' },
542542+ previousValue: { type: 'string', default: '' },
543543+ changed: { type: 'number', default: 0 }
544544+ },
545545+546546+ feeds: {
547547+ name: { type: 'string' },
548548+ description: { type: 'string', default: '' },
549549+ type: { type: 'string' },
550550+ query: { type: 'string', default: '' },
551551+ schedule: { type: 'string', default: '' },
552552+ source: { type: 'string', default: 'internal' },
553553+ tags: { type: 'string', default: '' },
554554+ metadata: { type: 'string', default: '{}' },
555555+ createdAt: { type: 'number' },
556556+ updatedAt: { type: 'number' },
557557+ lastFetchedAt: { type: 'number', default: 0 },
558558+ enabled: { type: 'number', default: 1 }
559559+ }
560560+};
561561+```
562562+563563+---
564564+565565+## Indexes Definition
566566+567567+```javascript
568568+const indexes = {
569569+ // Address indexes
570570+ addresses_byDomain: {
571571+ table: 'addresses',
572572+ on: 'domain'
573573+ },
574574+ addresses_byProtocol: {
575575+ table: 'addresses',
576576+ on: 'protocol'
577577+ },
578578+ addresses_byStarred: {
579579+ table: 'addresses',
580580+ on: 'starred'
581581+ },
582582+583583+ // Visit indexes
584584+ visits_byAddress: {
585585+ table: 'visits',
586586+ on: 'addressId'
587587+ },
588588+ visits_byTimestamp: {
589589+ table: 'visits',
590590+ on: 'timestamp'
591591+ },
592592+ visits_bySource: {
593593+ table: 'visits',
594594+ on: 'source'
595595+ },
596596+597597+ // Content indexes
598598+ content_byContentType: {
599599+ table: 'content',
600600+ on: 'contentType'
601601+ },
602602+ content_byMimeType: {
603603+ table: 'content',
604604+ on: 'mimeType'
605605+ },
606606+ content_bySynced: {
607607+ table: 'content',
608608+ on: 'synced'
609609+ },
610610+ content_byUpdated: {
611611+ table: 'content',
612612+ on: 'updatedAt'
613613+ },
614614+615615+ // Tag indexes
616616+ tags_byName: {
617617+ table: 'tags',
618618+ on: 'name'
619619+ },
620620+ tags_byParent: {
621621+ table: 'tags',
622622+ on: 'parentId'
623623+ },
624624+625625+ // Blob indexes
626626+ blobs_byMediaType: {
627627+ table: 'blobs',
628628+ on: 'mediaType'
629629+ },
630630+ blobs_byMimeType: {
631631+ table: 'blobs',
632632+ on: 'mimeType'
633633+ },
634634+635635+ // Scripts data indexes
636636+ scripts_data_byScript: {
637637+ table: 'scripts_data',
638638+ on: 'scriptId'
639639+ },
640640+ scripts_data_byChanged: {
641641+ table: 'scripts_data',
642642+ on: 'changed'
643643+ },
644644+645645+ // Feed indexes
646646+ feeds_byType: {
647647+ table: 'feeds',
648648+ on: 'type'
649649+ },
650650+ feeds_byEnabled: {
651651+ table: 'feeds',
652652+ on: 'enabled'
653653+ }
654654+};
655655+```
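
The object above is a declarative description, so something has to turn it into actual index definitions. A minimal sketch, assuming a hypothetical `applyIndexDefinitions` helper that calls `setIndexDefinition(indexId, tableId, cellId)` on whatever it is given. A small recorder stands in for TinyBase here so the example runs on its own.

```javascript
// `indexesApi` is anything exposing setIndexDefinition(indexId, tableId, cellId).
function applyIndexDefinitions(indexesApi, definitions) {
  for (const [indexId, { table, on }] of Object.entries(definitions)) {
    indexesApi.setIndexDefinition(indexId, table, on);
  }
  return indexesApi;
}

// Stand-in recorder (not TinyBase) that just remembers each call.
const recordedIndexCalls = [];
const fakeIndexes = {
  setIndexDefinition: (...args) => {
    recordedIndexCalls.push(args);
  },
};

applyIndexDefinitions(fakeIndexes, {
  addresses_byDomain: { table: 'addresses', on: 'domain' },
  visits_byAddress: { table: 'visits', on: 'addressId' },
});
console.log(recordedIndexCalls);
```

Real usage would presumably pass the object returned by TinyBase's `createIndexes(store)` in place of the recorder. Note that a TinyBase index groups rows into slices by cell value, so indexes on `timestamp` or `updatedAt` would produce one slice per distinct value; for chronological ordering, `store.getSortedRowIds` may be a better fit.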
656656+657657+---
658658+659659+## Relationships
660660+661661+TinyBase relationships for efficient joins:
662662+663663+```javascript
664664+const relationships = {
665665+ // Visits to their addresses
666666+ visitAddress: {
667667+ localTableId: 'visits',
668668+ remoteTableId: 'addresses',
669669+ relationshipId: 'addressId'
670670+ },
671671+672672+ // Blobs to their source addresses
673673+ blobAddress: {
674674+ localTableId: 'blobs',
675675+ remoteTableId: 'addresses',
676676+ relationshipId: 'addressId'
677677+ },
678678+679679+ // Blobs to their content
680680+ blobContent: {
681681+ localTableId: 'blobs',
682682+ remoteTableId: 'content',
683683+ relationshipId: 'contentId'
684684+ },
685685+686686+ // Scripts data to addresses
687687+ scriptDataAddress: {
688688+ localTableId: 'scripts_data',
689689+ remoteTableId: 'addresses',
690690+ relationshipId: 'addressId'
691691+ },
692692+693693+ // Tag hierarchy (self-referential)
694694+ childTags: {
695695+ localTableId: 'tags',
696696+ remoteTableId: 'tags',
697697+ relationshipId: 'parentId'
698698+ },
699699+700700+ // Content hierarchy (self-referential)
701701+ childContent: {
702702+ localTableId: 'content',
703703+ remoteTableId: 'content',
704704+ relationshipId: 'parentId'
705705+ }
706706+};
707707+```
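
In the object above, each key reads as the relationship's ID, while the `relationshipId` field names the local cell that holds the remote row ID (`addressId`, `contentId`, `parentId`). Under that reading, a minimal sketch of wiring the definitions up could look like this. `applyRelationshipDefinitions` is a hypothetical helper, and a recorder stands in for TinyBase so the example is self-contained.

```javascript
// `relationshipsApi` is anything exposing
// setRelationshipDefinition(relationshipId, localTableId, remoteTableId, localCellId).
function applyRelationshipDefinitions(relationshipsApi, definitions) {
  for (const [id, def] of Object.entries(definitions)) {
    relationshipsApi.setRelationshipDefinition(
      id,                 // relationship ID, taken from the object key
      def.localTableId,   // table holding the reference
      def.remoteTableId,  // table being referenced
      def.relationshipId, // local cell containing the remote row ID
    );
  }
  return relationshipsApi;
}

// Stand-in recorder (not TinyBase) that remembers each call.
const recordedRelationshipCalls = [];
const fakeRelationships = {
  setRelationshipDefinition: (...args) => {
    recordedRelationshipCalls.push(args);
  },
};

applyRelationshipDefinitions(fakeRelationships, {
  visitAddress: { localTableId: 'visits', remoteTableId: 'addresses', relationshipId: 'addressId' },
  childTags: { localTableId: 'tags', remoteTableId: 'tags', relationshipId: 'parentId' },
});
console.log(recordedRelationshipCalls);
```

With TinyBase, the object returned by `createRelationships(store)` would take the recorder's place.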
708708+709709+---
710710+711711+## Metrics (Aggregations)
712712+713713+Useful metrics for dashboard/analytics:
714714+715715+```javascript
716716+const metrics = {
717717+ // Total addresses
718718+ totalAddresses: {
719719+ table: 'addresses',
720720+ aggregate: 'count'
721721+ },
722722+723723+ // Total visits
724724+ totalVisits: {
725725+ table: 'visits',
726726+ aggregate: 'count'
727727+ },
728728+729729+ // Average visit duration
730730+ avgVisitDuration: {
731731+ table: 'visits',
732732+ metric: 'duration',
733733+ aggregate: 'avg'
734734+ },
735735+736736+ // Total storage used by blobs
737737+ totalBlobSize: {
738738+ table: 'blobs',
739739+ metric: 'size',
740740+ aggregate: 'sum'
741741+ },
742742+743743+ // Number of content items
744744+ totalContent: {
745745+ table: 'content',
746746+ aggregate: 'count'
747747+ },
748748+749749+ // Number of synced content items
750750+ syncedContent: {
751751+ table: 'content',
752752+ where: { synced: 1 },
753753+ aggregate: 'count'
754754+ },
755755+756756+ // Content by type
757757+ contentByType: {
758758+ table: 'content',
759759+ groupBy: 'contentType',
760760+ aggregate: 'count'
761761+ }
762762+};
763763+```
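
The `where` and `groupBy` keys above read as Peek's own declarative format. A minimal sketch of how such definitions could be evaluated directly against a table object follows. `computeMetric` is a hypothetical helper, and it assumes tables shaped like the output of `store.getTable()`, that is `{ rowId: { cellId: value } }`.

```javascript
// Evaluates one declarative metric definition against a table object.
function computeMetric(table, def) {
  // Keep only rows matching every cell/value pair in `where` (if present).
  const rows = Object.values(table).filter((row) =>
    Object.entries(def.where ?? {}).every(([cell, value]) => row[cell] === value));

  const aggregate = (group) => {
    if (def.aggregate === 'count') return group.length;
    const values = group.map((row) => Number(row[def.metric] ?? 0));
    const sum = values.reduce((total, value) => total + value, 0);
    if (def.aggregate === 'sum') return sum;
    if (def.aggregate === 'avg') return values.length ? sum / values.length : 0;
    throw new Error(`Unsupported aggregate: ${def.aggregate}`);
  };

  if (!def.groupBy) return aggregate(rows);

  // Bucket rows by the groupBy cell, then aggregate each bucket.
  const groups = {};
  for (const row of rows) {
    const key = row[def.groupBy];
    if (!groups[key]) groups[key] = [];
    groups[key].push(row);
  }
  return Object.fromEntries(
    Object.entries(groups).map(([key, group]) => [key, aggregate(group)]));
}

const sampleContent = {
  c1: { contentType: 'markdown', synced: 1 },
  c2: { contentType: 'csv', synced: 0 },
  c3: { contentType: 'markdown', synced: 0 },
};
const sampleVisits = { v1: { duration: 10 }, v2: { duration: 30 } };

console.log(computeMetric(sampleContent, { table: 'content', groupBy: 'contentType', aggregate: 'count' }));
console.log(computeMetric(sampleVisits, { table: 'visits', metric: 'duration', aggregate: 'avg' }));
```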
764764+765765+---
766766+767767+## Common Queries (Examples)
768768+769769+### Recent addresses by visit
770770+```javascript
771771+Object.values(store.getTable('visits'))
772772+ .sort((a, b) => b.timestamp - a.timestamp)
773773+ .slice(0, 10)
774774+ .map(visit => visit.addressId)
775775+```
776776+777777+### Starred addresses with tags
778778+```javascript
779779+Object.values(store.getTable('addresses'))
780780+ .filter(addr => addr.starred === 1)
781781+ .map(addr => ({
782782+ ...addr,
783783+ tags: addr.tags.split(',').filter(Boolean).map(id => store.getRow('tags', id))
784784+ }))
785785+```
786786+787787+### Content synced to filesystem
788788+```javascript
789789+Object.values(store.getTable('content'))
790790+ .filter(item => item.synced === 1)
791791+```
792792+793793+### Markdown content only
794794+```javascript
795795+Object.values(store.getTable('content'))
796796+ .filter(item => item.contentType === 'markdown')
797797+```
798798+799799+### CSV data
800800+```javascript
801801+Object.values(store.getTable('content'))
802802+ .filter(item => item.contentType === 'csv')
803803+```
804804+805805+### Blobs by media type
806806+```javascript
807807+Object.values(store.getTable('blobs'))
808808+ .filter(blob => blob.mediaType === 'image')
809809+```
810810+811811+### Script data that changed
812812+```javascript
813813+Object.values(store.getTable('scripts_data'))
814814+ .filter(data => data.changed === 1)
815815+ .sort((a, b) => b.extractedAt - a.extractedAt)
816816+```
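
TinyBase's `getTable` returns a plain object keyed by row ID, so array-style queries like the ones above need the rows as a list first. A small sketch of two hypothetical helpers (neither is a TinyBase API): `rowsOf` keeps each row's ID alongside its cells, and `parseTagIds` turns the comma-separated `tags` cell into a clean list.

```javascript
// Converts { rowId: cells } into [{ id: rowId, ...cells }].
function rowsOf(table) {
  return Object.entries(table).map(([id, cells]) => ({ id, ...cells }));
}

// Splits a comma-separated tags cell; an empty cell yields an empty list.
function parseTagIds(tags) {
  return tags.split(',').map((id) => id.trim()).filter(Boolean);
}

// Sample data shaped like store.getTable('addresses').
const sampleAddresses = {
  addr_1: { uri: 'https://example.com', starred: 1, tags: 'tag_1,tag_2' },
  addr_2: { uri: 'https://example.org', starred: 0, tags: '' },
  addr_3: { uri: 'https://example.net', starred: 1, tags: '' },
};

const starredWithTags = rowsOf(sampleAddresses)
  .filter((addr) => addr.starred === 1)
  .map((addr) => ({ ...addr, tagIds: parseTagIds(addr.tags) }));
console.log(starredWithTags);
```

Keeping the row ID on each item matters for any query whose result is later used to update or delete rows.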
817817+818818+---
819819+820820+## Storage Strategy
821821+822822+### Persistence Layers
823823+824824+**Phase 1 (MVP):**
825825+- TinyBase in-memory store
826826+- Persist to IndexedDB for browser compatibility
827827+- File storage for blobs in `{userData}/{PROFILE}/datastore/blobs/`
828828+829829+**Phase 2 (Performance):**
830830+- Add SQLite persistence option
831831+- Keep TinyBase API but use SQLite backend
832832+- Better for large datasets and complex queries
833833+834834+**Phase 3 (Sync):**
835835+- Enable TinyBase CRDT sync
836836+- Sync between devices
837837+- Conflict-free merging
838838+839839+### File System Layout
840840+841841+```
842842+{userData}/
843843+ {PROFILE}/
844844+ datastore/
845845+ index.db # SQLite backend (Phase 2)
846846+ index.json # JSON backup
847847+ blobs/
848848+ sha256_abc...png # Content-addressed blobs
849849+ sha256_def...jpg
850850+ thumbs/ # Thumbnails for images
851851+ sha256_abc_thumb.jpg
852852+ content/ # Synced text content
853853+ notes/ # Markdown notes
854854+ note1.md
855855+ note2.md
856856+ data/ # CSV and other data files
857857+ prices.csv
858858+ code/ # Code snippets
859859+ helpers.js
860860+ exports/ # User exports
861861+ backup-2024-11-12.json
862862+```
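
The `blobs/` entries in the layout are content-addressed, so the file name can be derived from the bytes themselves. A minimal sketch using Node's built-in `crypto` module follows. `blobFileName` and `thumbFileName` are hypothetical helpers, and the exact naming (full hex digest, `_thumb.jpg` suffix) is an assumption based on the abbreviated names shown in the layout.

```javascript
const { createHash } = require('node:crypto');

// Hex-encoded SHA-256 digest of the blob's bytes.
function sha256Hex(bytes) {
  return createHash('sha256').update(bytes).digest('hex');
}

// Hypothetical helper: file name under datastore/blobs/.
function blobFileName(bytes, extension) {
  return `sha256_${sha256Hex(bytes)}.${extension}`;
}

// Hypothetical helper: file name under datastore/blobs/thumbs/.
function thumbFileName(bytes) {
  return `sha256_${sha256Hex(bytes)}_thumb.jpg`;
}

const sampleBytes = Buffer.from('example image bytes');
console.log(blobFileName(sampleBytes, 'png'));
console.log(thumbFileName(sampleBytes));
```

Because identical bytes always map to the same name, saving the same image twice would naturally deduplicate on disk, and the `hash` cell in the `blobs` table can double as the lookup key.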
863863+864864+---
865865+866866+## Migration Strategy
867867+868868+### Version 1.0 (Initial)
869869+- Create all tables with schema
870870+- Set up indexes
871871+- Set up relationships
872872+- Initialize with empty data
873873+874874+### Future Versions
875875+- TinyBase doesn't have built-in migrations
876876+- Implement custom migration system:
877877+ - Version table to track schema version
878878+ - Migration functions for each version bump
879879+ - Backup before migration
880880+ - Rollback capability
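
The custom migration system listed above can be sketched in a few lines. This is only an illustration under stated assumptions: `runMigrations` is a hypothetical helper, the data is a JSON-serializable tables object, and the schema version lives in a hypothetical `version` table with a single `schema` row.

```javascript
// Reads the schema version from the (hypothetical) version table; defaults to 1.
function getSchemaVersion(tables) {
  return tables.version?.schema?.value ?? 1;
}

// Applies pending migrations in order. On any failure, returns the backup
// taken before the first migration ran (rollback).
function runMigrations(tables, migrations) {
  const backup = JSON.parse(JSON.stringify(tables));  // backup before migration
  const working = JSON.parse(JSON.stringify(tables)); // never mutate the input
  const pending = migrations
    .filter((migration) => migration.version > getSchemaVersion(working))
    .sort((a, b) => a.version - b.version);
  const applied = [];
  try {
    for (const migration of pending) {
      migration.up(working);
      working.version = { schema: { value: migration.version } };
      applied.push(migration.version);
    }
    return { tables: working, applied, rolledBack: false };
  } catch (error) {
    return { tables: backup, applied: [], rolledBack: true, error };
  }
}

// Example migration: add a `priority` cell to every feed row.
const sampleMigrations = [
  {
    version: 2,
    up: (tables) => {
      for (const row of Object.values(tables.feeds ?? {})) row.priority = 0;
    },
  },
];

const beforeMigration = { feeds: { feed_1: { name: 'Recent Work Links', enabled: 1 } } };
const migrationResult = runMigrations(beforeMigration, sampleMigrations);
console.log(migrationResult.applied, getSchemaVersion(migrationResult.tables));
```

Working on a copy keeps the rollback path trivial: if any step throws, the caller simply keeps the untouched backup.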
881881+882882+---
883883+884884+## Next Steps
885885+886886+1. ✅ Schema design complete
887887+2. ⏭️ Install TinyBase package
888888+3. ⏭️ Create datastore module scaffold
889889+4. ⏭️ Implement store initialization with schema
890890+5. ⏭️ Implement basic CRUD operations
891891+6. ⏭️ Test with sample data
892892+7. ⏭️ Build datastore API layer
+84
notes/datastore.md
···11+# Peek Personal Datastore
22+33+Browser profile directories are a jumble of organically-grown files and directories that are designed to serve browser internals vs being a store of user-curated and shaped information.
44+55+The Peek Personal Datastore combines an address index with unstructured data, metadata, time-series data, and files.
66+77+This is a local, private and personal store first and foremost.
88+99+Peek needs a way of storing data that provides a few primary things:
1010+1111+- Store various data types, and attach metadata to them
1212+- Store feeds, such as web navigation history, stored data history, custom generated feeds, timeseries data, and feeds pulled in from elsewhere
1313+- Have some kind of approach to binary files like images and videos, maybe fine to keep on filesystem but referenced by an index
1414+- Support bidirectional filesystem sync for some flavors of file, such as markdown, where we might want an Obsidian vault to map some set of "files" in the datastore that we can also edit in a "stickies" app running in Peek, for example
1515+- Mime types are implemented in nearly every aspect of the datastore to allow for type-based querying
1616+- Tags are implemented in nearly every aspect of the datastore to allow for coarse-grained annotations and querying
1717+1818+Non-primary but keep in FOV:
1919+- Runtime/browser engine agnosticism, eg if we move off Electron someday
2020+- Designed with sync in mind, for mirroring to other devices, saving parts to specific cloud operations, or whole snapshots for backups and archives
2121+- Designed with sync in mind to collaborate with others - eg perhaps a subset of notes are synced with some other person's set of notes
2222+2323+Primary types:
2424+2525+- Address index: Peek at its core is a web user agent. First class support for saving addresses. Examples: HTTP URLs, other protocol URLs or URIs, such as IPFS CIDs. Fine to limit to URIs for now. The address index includes navigation history, or an imported Pocket archive or any type of address for any reason.
2626+- Web navigation history: Index of visits to addresses in the index.
2727+- Non-URL data, which can reference one or more URLs or none at all. Examples: markdown notes, images
2828+- Metadata for all data types: We want to annotate addresses and non-addresses with tags, signatures, mime-types, language metadata, usage information, etc.
2929+3030+## Application patterns
3131+3232+- Applications need to read from and write to the datastore in ways specific to them.
3333+- Not necessarily full sandbox / sharding / area, but the ability to operate on types they know and use.
3434+- Eg Panorama will need to access the address index, and store group metadata, and access it quickly.
3535+- Address classifiers will be a very common use-case, with many applications just being specialized address classifiers, so maybe we need some application-level "data view" implemented for quickly accessing data in this way.
3636+- Perhaps lenses/views are a useful abstraction here.
3737+3838+## Use-cases
3939+4040+Private local
4141+- navigation history
4242+- personal notes
4343+- saving images from pages
4444+- text/numerical datapoints and their history (i.e., time-series data)
4545+4646+Private remote
4747+- publishing a note to a remote server
4848+- syncing the datastore between my devices
4949+- publishing backups/archives
5050+5151+Public remote
5252+- publish a note to my website
5353+- sync w/ a remote service, eg push urls+notes tagged 'arena' to are.na and pull from it
5454+5555+Collaboration
5656+- syncing private data between two people
5757+5858+Shared calendar scenario
5959+- two stores with calendar data
6060+- connect via agreed shared method
6161+6262+## Data, schemas, and schemalessness
6363+6464+As Peek matures into a natively generative system, we need complex types beyond MIME types and what filesystems afford - the whole flora and fauna of digital daily life. We need a way of describing data when passing it between features and "applications" in Peek. We don't need some holy grail supersystem dream; maybe it's fine to just use internet MIME types, filesystem types, and something like Atproto's "lexicons" when interacting in public collaborative scenarios.
6565+6666+The store itself is probably fine using basic types, and we can layer on complex types in the context of applications.
6767+6868+## Implementation notes and ideas
6969+7070+- JS/TS/Electron
7171+- TinyBase
7272+- Automerge
7373+7474+Layer on:
7575+- identities
7676+- signing
7777+- verifiability
7878+- collaboration
7979+8080+## Examples
8181+8282+- Atproto personal datastore (not designed to be private by default tho) https://atproto.com/guides/self-hosting
8383+- Solid pods https://solidproject.org/
8484+- Perkeep is more focused on permanence but it does a lot of these things https://github.com/perkeep/perkeep