# Mast Sync Protocol v2.0 Specification

## Overview

This is a clean v2 specification that starts fresh from v1, incorporating lessons learned to solve critical issues while maintaining simplicity. It replaces the previous over-engineered v2 implementation with a focused, production-ready protocol.

## Core Design Principles

1. **Secure by Default**: All operations require authentication; rooms are private by default
2. **Simple & Clean**: Remove unnecessary complexity while maintaining reliability
3. **Payment Ready**: Built-in support for paid room creation with backwards compatibility
4. **CRDT-Native**: Leverage CR-SQLite's conflict resolution; no complex coordination needed
5. **Immediate Feedback**: Clients know their room status immediately after connection

## Authentication & Authorization

### Key Management

- **ECDSA P-256** public key cryptography
- **Automatic registration** on first connection (when auth enforcement is disabled, i.e. `REQUIRE_AUTH=false`)
- **Room-scoped permissions**: read, write, invite capabilities
- **Signature format**: `{type}:{data-json}`

### Environment-Based Enforcement

```bash
# Development/Testing (default)
REQUIRE_AUTH=false  # Auto-grant invite permissions to any connecting user

# Production (future)
REQUIRE_AUTH=true   # Only users in auth table can access rooms
```

## Connection Lifecycle

### 1. WebSocket Connection

```
URL: wss://server/sync?room={roomId}&publicKey={base64PublicKey}
```

### 2. Immediate Room Status

Upon connection, the server immediately sends the room status:

```json
{
  "type": "room_status",
  "access": "write|read|none|no_room"
}
```

**Access Levels**:

- `"write"`: Full read/write access to an existing room
- `"read"`: Read-only access to an existing room
- `"none"`: Room exists but no permissions have been granted (the user must be invited; the invite flow comes later)
- `"no_room"`: Room doesn't exist (payment may be required to create it)

### 3. Room Access Behavior

**When `REQUIRE_AUTH=false` (Development)**:

- Any room → Auto-grant invite permissions (read + write + invite)
- New rooms are created automatically
- User receives `"access": "write"`

**When `REQUIRE_AUTH=true` (Production)**:

- Only authenticated users can access rooms
- User must exist in the `room_keys` table
- User receives an access level based on stored permissions

## Core Sync Protocol

### Missing Changes Problem Solution

The critical flaw in v1 was using `MAX(db_version)` as the sync cursor, which caused intermediate versions to be skipped.

**Problem**: Client has versions [1,2,5,10] and requests `> 10`, so it never receives versions 3,4,6,7,8,9.

**Solution**: Track the highest contiguous version plus explicit missing ranges.

#### Version State Tracking

```javascript
const versionState = {
  contiguousUpTo: 2,          // Highest version with no gaps before it
  missingRanges: [            // Explicit gaps we know about
    {start: 3, end: 4},       // Missing versions 3-4
    {start: 6, end: 9}        // Missing versions 6-9
  ],
  maxVersionSeen: 10          // Highest version ever seen
};
```

### Change Synchronization

#### Sync Request (Client → Server)

```json
{
  "type": "sync_request",
  "site_id": "base64-site-id",
  "contiguous_up_to": 2,
  "missing_ranges": [
    {"start": 3, "end": 4},
    {"start": 6, "end": 9}
  ],
  "max_version_seen": 10,
  "publicKey": "base64-public-key"
}
```

#### Sync Response (Server → Client)

For the request above, `changes` carries the rows for the missing ranges 3-4 and 6-9 plus the new changes 11-15. One row is shown:

```json
{
  "type": "sync_response",
  "current_max_version": 15,
  "changes": [
    {
      "TableName": "todos",
      "PK": "base64-pk",
      "ColumnName": "description",
      "Value": "Task content",
      "ColVersion": 15,
      "DBVersion": 8,
      "SiteID": "base64-site-id",
      "CL": 1,
      "Seq": 1
    }
  ]
}
```

#### Change Push (Write Operations)

```json
{
  "type": "changes",
  "publicKey": "base64-public-key",
  "signature": "base64-signature",
  "data": [...changes...]
}
```

**Signature payload**: `changes:{JSON.stringify(data)}`

### Real-time Broadcasting

The server broadcasts changes to all authorized clients in the room (excluding the sender):

```json
{
  "type": "changes",
  "data": [...changes...]
}
```

## Server Architecture

### Essential Components

- ✅ ECDSA signature verification
- ✅ Room-based isolation
- ✅ Permission checking
- ✅ Auto room creation (with environment flag)
- ✅ Real-time broadcasting
- ✅ Connection cleanup on auth failure

### Database Schema

```sql
CREATE TABLE rooms (
  room_id TEXT PRIMARY KEY,
  created_at INTEGER NOT NULL
);

CREATE TABLE room_keys (
  room_id TEXT NOT NULL,
  public_key TEXT NOT NULL,
  can_read BOOLEAN NOT NULL DEFAULT 1,
  can_write BOOLEAN NOT NULL DEFAULT 1,
  can_invite BOOLEAN NOT NULL DEFAULT 0,
  created_at INTEGER NOT NULL,
  PRIMARY KEY (room_id, public_key)
);
```

### Server Version Query Logic

#### Multi-Range SQL Query

```sql
-- Get missing ranges + new changes
SELECT * FROM crsql_changes
WHERE site_id != ?
  AND (
    -- Missing range 3-4
    (db_version >= 3 AND db_version <= 4)
    OR
    -- Missing range 6-9
    (db_version >= 6 AND db_version <= 9)
    OR
    -- New changes 11-15
    (db_version > 10)
  )
ORDER BY db_version ASC
```

#### Client Version State Updates

```javascript
// removeMissingVersion and isVersionMissing are helpers over
// versionState.missingRanges; they are not defined in this specification.
function updateVersionState(changes) {
  for (const change of changes) {
    const version = change.DBVersion;

    // A jump past maxVersionSeen opens a new gap; record it before moving
    // the maximum, otherwise the contiguous pointer would skip over it.
    if (version > versionState.maxVersionSeen + 1) {
      versionState.missingRanges.push({
        start: versionState.maxVersionSeen + 1,
        end: version - 1
      });
    }

    // Update max seen
    versionState.maxVersionSeen = Math.max(versionState.maxVersionSeen, version);

    // Fill gaps and update contiguous
    fillGapsAndUpdateContiguous(version);
  }
}

function fillGapsAndUpdateContiguous(version) {
  // Remove version from missing ranges
  removeMissingVersion(version);

  // Extend contiguous if possible
  while (versionState.contiguousUpTo + 1 <= versionState.maxVersionSeen &&
         !isVersionMissing(versionState.contiguousUpTo + 1)) {
    versionState.contiguousUpTo++;
  }
}
```

## Auto-Registration Flow

### Connection Logic

1. Client connects with the `publicKey` parameter
2. Server checks room + key permissions
3. **When `REQUIRE_AUTH=false`**:
   - Auto-grant invite permissions to any user
   - Create the room if it doesn't exist
4. **When `REQUIRE_AUTH=true`**:
   - Only users in the auth table (`room_keys`) can access
   - No auto-registration

### Permission Granting Logic

```go
// Auto-grant invite permissions (read, write, and invite)
func autoGrantInvitePermissions(roomID, publicKey string) error {
    return GrantPermissions(roomID, publicKey, true, true, true) // read, write, invite
}
```

## Client Implementation

### Connection Management

```typescript
// Simple reconnection (no exponential backoff)
function attemptReconnection(room: string) {
  // Random delay between 3 and 9 seconds, to avoid a thundering herd
  const delayMs = 3000 + Math.random() * 6000;
  setTimeout(() => {
    connectWebSocket(room).catch(() => attemptReconnection(room));
  }, delayMs);
}
```

### Authentication Integration

```typescript
// Include publicKey in all requests.
// `ws`, `publicKey`, and `versionState` are expected to be in scope.
async function sendSyncRequest(connection) {
  const syncRequest = {
    type: "sync_request",
    site_id: connection.siteId,
    // v2 replaces v1's single last_version with the tracked version state
    contiguous_up_to: versionState.contiguousUpTo,
    missing_ranges: versionState.missingRanges,
    max_version_seen: versionState.maxVersionSeen,
    publicKey // Always included, on every request
  };

  ws.send(JSON.stringify(syncRequest));
}
```

### Room Status Handling

```typescript
// Handle immediate room status
case 'room_status':
  self.postMessage({
    type: 'room_status',
    dbname: dbname,
    access: msg.access,
    needsPayment: msg.access === 'no_room'
  });
  break;
```

## Server Implementation

### Core Logic

```go
func handleWebSocket(w http.ResponseWriter, r *http.Request) {
    roomID := r.URL.Query().Get("room")
    publicKey := r.URL.Query().Get("publicKey")

    // Check authentication/auto-grant permissions
    access := determineAccess(roomID, publicKey)

    // Upgrade WebSocket
    conn, err := upgrader.Upgrade(w, r, nil)
    if err != nil {
        return
    }

    // Send immediate room status
    sendRoomStatus(conn, access)

    // Handle sync protocol
    handleSyncProtocol(conn, roomID, publicKey)
}

func determineAccess(roomID, publicKey string) string {
    if !requireAuth {
        // Development mode - auto-grant invite permissions
        autoGrantInvitePermissions(roomID, publicKey)
        return "write"
    }

    // Production mode - check existing permissions
    return checkUserPermissions(roomID, publicKey)
}
```

## Error Handling

### Structured Error Responses

```json
{
  "type": "error",
  "code": "AUTH_FAILED|ROOM_NOT_FOUND|PAYMENT_REQUIRED|PERMISSION_DENIED",
  "message": "Human readable description"
}
```

### Authentication Failure Behavior

- **Invalid signature**: Close connection immediately
- **No read permission**: Close connection immediately
- **No write permission**: Reject change, keep connection open
- **No room**: Send `room_status` with `"access": "no_room"`

## Migration from v1

### Environment Variable Control

```bash
# Start with development mode
REQUIRE_AUTH=false

# Switch to production when ready
REQUIRE_AUTH=true
```

### Protocol Changes from v1

- Replace the `"pull"` message with `"sync_request"`
- Add version range tracking instead of a simple `last_version`
- Add an immediate `"room_status"` response
- Add `publicKey` to the WebSocket URL and all requests

### Seamless Transition

- v1 rooms continue working unchanged
- Auto-registration ensures no user disruption
- The environment variable provides a clean cutoff point

## Security Model

### Transport Security

- **WSS required** for production (TLS 1.3 minimum)
- **Certificate validation** on client

### Message Security

- **ECDSA P-256 signatures** for all write operations
- **Public key authentication** for all read operations
- **Room isolation**: no cross-room access

### Key Security

- **Web Crypto API** for key generation
- **Secure storage** (recommend upgrading from localStorage for production)
- **No private key transmission** (only public keys are sent to the server)

### CR-SQLite Benefits

- **Offline-first**: Changes work immediately without server
- **Conflict-free**: Automatic merge resolution
- **Consistent**: Guaranteed eventual consistency
- **Efficient**: Delta-only synchronization

## Implementation Priorities

### Phase 1: Core Protocol

1. Implement the missing changes solution with version range tracking through `sync_request` messages
   - Version state tracking on the client side
   - Version requests in `sync_request` messages
   - Server response containing exactly the changes requested
2. Add immediate `room_status` messages
3. Simple reconnection (randomized 3-9 second retry, no exponential backoff)
4. Room status handling in UI (SyncStatus component)

### Phase 2: Authorization

1. Environment-based auth enforcement (`REQUIRE_AUTH` flag)
2. Auto-registration for development mode
3. ECDSA signature verification (real implementation)
4. `publicKey` authentication for all operations
5. Connection cleanup on auth failure

## Configuration

### Server Configuration

```go
// Environment variables
var (
    requireAuth = os.Getenv("REQUIRE_AUTH") == "true"
)
```

### Client Configuration

```typescript
interface SyncConfig {
  room: string;
  endpoint: string;
  autoReconnect?: boolean; // Default: true
  reconnectDelay?: number; // Default: 3000ms
}
```
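
### Appendix: Version State Tracking Sketch

The version state tracking described under Core Sync Protocol relies on gap bookkeeping that this specification leaves undefined. The following is a minimal sketch of one way it could work, not the reference implementation. It assumes `db_version` values are integers starting at 1, and that `missingRanges` stays sorted in ascending order (which holds here because new gaps are only ever appended above the previous maximum). The names `VersionRange`, `VersionState`, `createVersionState`, `recordVersion`, and `buildSyncRequest` are hypothetical.

```typescript
// Hypothetical types mirroring the versionState example in this spec.
interface VersionRange { start: number; end: number; }

interface VersionState {
  contiguousUpTo: number;        // Highest version with no gaps before it
  missingRanges: VersionRange[]; // Known gaps, sorted ascending
  maxVersionSeen: number;        // Highest version ever seen
}

function createVersionState(): VersionState {
  return { contiguousUpTo: 0, missingRanges: [], maxVersionSeen: 0 };
}

// Record a single received db_version and update all three fields.
function recordVersion(state: VersionState, version: number): void {
  if (version <= state.contiguousUpTo) return; // Already covered

  if (version > state.maxVersionSeen) {
    // Everything between the old maximum and this version is a new gap.
    if (version > state.maxVersionSeen + 1) {
      state.missingRanges.push({ start: state.maxVersionSeen + 1, end: version - 1 });
    }
    state.maxVersionSeen = version;
  } else {
    // The version falls inside a known gap: split that range around it.
    state.missingRanges = state.missingRanges.flatMap((r) => {
      if (version < r.start || version > r.end) return [r];
      const parts: VersionRange[] = [];
      if (r.start < version) parts.push({ start: r.start, end: version - 1 });
      if (version < r.end) parts.push({ start: version + 1, end: r.end });
      return parts;
    });
  }

  // The contiguous prefix ends just before the lowest remaining gap.
  const first = state.missingRanges[0];
  state.contiguousUpTo = first ? first.start - 1 : state.maxVersionSeen;
}

// Build the sync_request message body from the tracked state.
function buildSyncRequest(state: VersionState, siteId: string, publicKey: string) {
  return {
    type: "sync_request",
    site_id: siteId,
    contiguous_up_to: state.contiguousUpTo,
    missing_ranges: state.missingRanges.map((r) => ({ ...r })),
    max_version_seen: state.maxVersionSeen,
    publicKey,
  };
}

// Usage: the client from the problem statement has seen versions 1, 2, 5, 10.
const state = createVersionState();
for (const v of [1, 2, 5, 10]) recordVersion(state, v);
const request = buildSyncRequest(state, "base64-site-id", "base64-public-key");
console.log(JSON.stringify(request, null, 2));
```

Deriving `contiguousUpTo` from the lowest remaining gap avoids a separate loop over individual versions. One open question this sketch does not address is how a client should treat a gap for which the server holds no rows (for example, versions that exist only for the client's own `site_id`, which the server query excludes).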