BlueSky & more on desktop lazurite.stormlightlabs.org/
tauri rust typescript bluesky appview atproto solid

docs: search and v0.2 plan

+343 -52
+105 -23
docs/specs/search.md
···
-# Search & Embeddings
+# Search

-Local full-text + semantic search over the authenticated user's saved and liked posts.
+Search has two scopes:

-## Data Pipeline
+1. **Network search**: server-side search via Bluesky APIs — no local indexing. Always available.
+2. **Local search**: full-text + semantic search over the **authenticated user's own** liked and bookmarked/saved posts, stored locally in SQLite.

-1. **Sync**: on login and periodically, fetch user's likes (`app.bsky.feed.getActorLikes`) and bookmarks. Paginate fully, store in SQLite.
-2. **Index FTS**: insert post text into SQLite FTS5 virtual table for keyword search.
-3. **Embed**: run post text through `fastembed` with `nomic-embed-text-v1.5` (768-dim). Store vectors in `sqlite-vec` virtual table.
-4. **Incremental**: track cursor/last-seen; only process new posts on subsequent syncs.
+Local semantic search (embeddings) is **opt-out**: enabled by default, but can be disabled in settings. When disabled, only local keyword (FTS) search is available and the embedding model is not downloaded.
+
+## Network Search (not indexed)
+
+Server-side Bluesky search APIs. These are thin wrappers — no local storage or indexing.
+
+### `app.bsky.feed.searchPosts`
+
+Search all public posts.
+
+| Parameter | Type | Required | Notes |
+| --------- | -------- | -------- | -------------------------------------------- |
+| `q` | string | yes | Query string. Supports `from:handle` syntax. |
+| `sort` | string | no | `top` (default) or `latest` |
+| `since` | string | no | ISO 8601 datetime, inclusive |
+| `until` | string | no | ISO 8601 datetime, exclusive |
+| `author` | string | no | Filter by DID or handle |
+| `lang` | string | no | Language code (e.g., `en`) |
+| `tag` | string[] | no | Hashtag filter (without `#`), repeatable |
+| `limit` | integer | no | 1–100, default 25 |
+| `cursor` | string | no | Pagination cursor from previous response |
+
+Returns `{ cursor?, hitsTotal?, posts: PostView[] }`. With auth the response includes `viewer` state.
+
+### `app.bsky.actor.searchActors`
+
+Search user profiles.
+
+| Parameter | Type | Required | Notes |
+| --------- | ------- | -------- | ----------------- |
+| `q` | string | yes | Query string |
+| `limit` | integer | no | 1–100, default 25 |
+| `cursor` | string | no | Pagination cursor |
+
+Returns `{ cursor?, actors: ProfileView[] }`.
+
+### `app.bsky.actor.searchActorsTypeahead`
+
+Lightweight actor search for autocomplete (already used in login flow).
+
+| Parameter | Type | Required | Notes |
+| --------- | ------- | -------- | ----------------- |
+| `q` | string | yes | Query string |
+| `limit` | integer | no | 1–100, default 10 |
+
+Returns `{ actors: ProfileViewBasic[] }`. No pagination.
+
+### `app.bsky.graph.searchStarterPacks`
+
+Search starter packs.
+
+| Parameter | Type | Required | Notes |
+| --------- | ------- | -------- | ----------------- |
+| `q` | string | yes | Query string |
+| `limit` | integer | no | 1–100, default 25 |
+| `cursor` | string | no | Pagination cursor |
+
+Returns `{ cursor?, starterPacks: StarterPackViewBasic[] }`.
+
+## Local Data Pipeline
+
+1. **Sync**: on login and periodically, fetch the authenticated user's own likes (`app.bsky.feed.getActorLikes`) and bookmarks. Paginate using the API cursor, store posts in SQLite.
+2. **Cursor persistence**: store the last-seen API cursor per `(did, source)` in the `sync_state` table. On subsequent syncs, resume from the stored cursor so we only fetch new posts — never re-fetch the full history.
+3. **Index FTS**: insert post text into SQLite FTS5 virtual table for keyword search (always active).
+4. **Embed** _(opt-out)_: run post text through `fastembed` with `nomic-embed-text-v1.5` (768-dim). Store vectors in `sqlite-vec` virtual table. Skipped when embeddings are disabled.
+5. **Reindex**: a manual "Reindex" action clears all embeddings from `posts_vec` and re-embeds every post. Useful after model updates or if the index becomes corrupted.

 ## SQLite Schema

 ```sql
--- Post storage
+-- Post storage (authenticated user's liked/bookmarked posts)
 CREATE TABLE posts (
     uri TEXT PRIMARY KEY,
     cid TEXT NOT NULL,
···
     created_at TEXT,
     indexed_at TEXT DEFAULT CURRENT_TIMESTAMP,
     json_record TEXT, -- full record JSON
-    source TEXT NOT NULL -- 'like', 'bookmark', 'own'
+    source TEXT NOT NULL -- 'like', 'bookmark'
 );

--- Full-text search
+-- Sync cursor tracking (avoids re-fetching on every sync)
+CREATE TABLE sync_state (
+    did TEXT NOT NULL,
+    source TEXT NOT NULL, -- 'like', 'bookmark'
+    cursor TEXT, -- last API cursor returned
+    last_synced_at TEXT,
+    PRIMARY KEY (did, source)
+);
+
+-- Full-text search (always active)
 CREATE VIRTUAL TABLE posts_fts USING fts5(text, uri UNINDEXED, content=posts, content_rowid=rowid);

--- Vector embeddings
+-- Vector embeddings (opt-out — only populated when embeddings enabled)
 CREATE VIRTUAL TABLE posts_vec USING vec0(
     uri TEXT PRIMARY KEY,
     embedding float[768]
···

 ## Search Modes

-| Mode | How |
-| -------- | ----------------------------------------------------------------------------------------- |
-| Keyword | `SELECT * FROM posts_fts WHERE posts_fts MATCH ?` |
-| Semantic | Embed query → `SELECT * FROM posts_vec WHERE embedding MATCH ? ORDER BY distance LIMIT k` |
-| Hybrid | Run both, merge results by reciprocal rank fusion |
+| Mode | Scope | How |
+| -------- | ------ | ----------------------------------------------------------------------------------------- |
+| Network | Remote | Server-side via Bluesky APIs (posts, actors, starter packs) — not indexed locally |
+| Keyword | Local | `SELECT * FROM posts_fts WHERE posts_fts MATCH ?` |
+| Semantic | Local | Embed query → `SELECT * FROM posts_vec WHERE embedding MATCH ? ORDER BY distance LIMIT k` |
+| Hybrid | Local | Run keyword + semantic, merge results by reciprocal rank fusion |

 ## Embedding Details

 - Model: `nomic-embed-text-v1.5` via `fastembed` (ONNX runtime, no GPU required)
 - Dimensions: 768 (or 256 with Matryoshka truncation for speed)
 - Batch embedding on sync; single embedding on search query
-- Model downloaded on first use, cached in Tauri app data dir
+- Model downloaded on first use, cached in Tauri app data dir (skipped entirely when embeddings disabled)

 ## Tauri Commands

 ```rs
+// Network search (not indexed — direct API calls)
+search_posts_network(query: String, sort: Option<String>, limit: Option<u32>, cursor: Option<String>) -> NetworkSearchResult
+search_actors(query: String, limit: Option<u32>, cursor: Option<String>) -> ActorSearchResult
+search_starter_packs(query: String, limit: Option<u32>, cursor: Option<String>) -> StarterPackSearchResult
+// Note: searchActorsTypeahead already exists in auth module
+
+// Local search (user's own likes/bookmarks)
 search_posts(query: String, mode: "keyword"|"semantic"|"hybrid", limit: u32) -> Vec<PostResult>
-sync_liked_posts(did: String) -> SyncStatus
+sync_posts(did: String, source: "like"|"bookmark") -> SyncStatus // resumes from stored cursor
 get_sync_status(did: String) -> SyncStatus
+reindex_embeddings() -> () // clears & re-embeds all posts
+set_embeddings_enabled(enabled: bool) -> () // opt-out toggle
 ```

 ## Keyboard Shortcuts

-| Key | Action |
-| -------- | ----------------------------------------------- |
-| `/` | Focus search bar from anywhere |
-| `Tab` | Cycle search mode (keyword → semantic → hybrid) |
-| `Escape` | Clear search / close results |
+| Key | Action |
+| -------- | --------------------------------------------------------- |
+| `/` | Focus search bar from anywhere |
+| `Tab` | Cycle search mode (network → keyword → semantic → hybrid) |
+| `Escape` | Clear search / close results |

 ## UX Polish

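Note on the Hybrid row of the Search Modes table: the spec names reciprocal rank fusion but does not pin down the merge. A minimal sketch of one way to do it, assuming each search returns post URIs ordered best-first with no duplicates within a list. The function name `reciprocal_rank_fusion` and the constant `k = 60.0` are illustrative assumptions, not part of this change.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion over ranked lists of post URIs (best match first).
/// Hypothetical helper: the name, signature, and `k` are assumptions for illustration.
pub fn reciprocal_rank_fusion(
    rankings: &[Vec<String>],
    k: f64,
    limit: usize,
) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, uri) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the top hit of a list contributes 1 / (k + 1).
            *scores.entry(uri.clone()).or_insert(0.0) += 1.0 / (k + index as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by URI so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused.truncate(limit);
    fused
}
```

Passing the keyword ranking and the semantic ranking as the two lists gives the hybrid ordering. A post that ranks reasonably well in both lists can outrank one that tops only a single list, which is the usual reason to prefer rank fusion over comparing raw FTS and vector scores directly.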
+166
docs/specs/v0.2.md
···
+---
+title: Beyond MVP 1 (v0.2.0)
+updated: 2026-03-29
+---
+
+The most useful "social power toolz" additions are the ones that answer:
+
+- **How is this account perceived or acted on by others?**
+- **What network-side artifacts affect me?**
+- **What context am I missing before I follow, reply, or trust this account?**
+
+## Social Diagnostics/Tools
+
+Tabbed panel with four buckets:
+
+1. **Reputation/exposure** - lists, labels, starter packs
+2. **Safety/boundaries** - blocking, blocked-by, moderation-related visibility
+3. **Context/provenance** - profile history, handle/DID/PDS changes, post references
+4. **Power-user protocol inspection** - raw records, backlinks, record graph, PDS explorer
+
+## Feature Mapping
+
+| Feature | Problem it Solves | Infra |
+| ----------------------------------- | ------------------------------------------------------- | ------------------------------------------------------------ |
+| Lists I’m on/this account is on | Why do people react strongly to this account? | Constellation over `app.bsky.graph.listitem` & |
+| | | hydrate list via Bluesky APIs |
+| Blocked by | Why can’t I interact with some accounts? | ClearSky-like indexing/graph analysis; |
+| | What is my exposure? | derived from public block relations where visible |
+| Blocking | Who has this account blocked? | Public block records/graph index |
+| | What kinds of accounts do they block? | |
+| Labels on account | Is there moderation metadata on this actor? | Bluesky moderation/label views where exposed by AppView APIs |
+| Starter packs containing account | How are people discovering this account? | Graph relationship indexing |
+| | | list/starter-pack hydration |
+| Profile/identity history | Did this account recently change handle/name/PDS? | Historical index snapshots |
+| OnPosts/posts involving account | Where does this account show up in discourse? | Link/backlink index over posts referencing DID/URI |
+| PDS/DID/repo provenance | What PDS is this on, and what repo identity is this? | Existing PDSls-style explorer |
+| | | Slingshot-style identity resolution |
+| List risk summary | Is this account heavily listed? in what categories? | Derived from list memberships |
+| | | list metadata |
+| Moderation visibility summary | Before I follow/reply, is there any moderation context? | Labels, lists, blocks, starter packs |
+| Network relationship diff over time | What changed about this account recently? | Historical snapshots and graph diffs |
+
+### 1. Lists
+
+Why it matters:
+
+- It explains hidden context around an account.
+- It helps users understand reputation, curation, and network placement.
+- It is legible to normal users.
+
+Implementation note:
+
+- Query list memberships via Constellation-style link indexing on `app.bsky.graph.listitem`.
+- Hydrate the parent list URI into title/owner/details using Bluesky list endpoints.
+  Bluesky documents public list hydration via `app.bsky.graph.getList`, and ClearSky visibly exposes Lists as a core account tab.
+
+### 2. Labels
+
+Lightweight but prominent.
+
+Why it matters:
+
+- It affects moderation and user expectations.
+- It can explain hidden visibility changes or warning states.
+
+### 3. Blocked-by visibility
+
+Needs careful UX.
+
+Why it matters:
+
+- It helps users understand social or moderation boundaries.
+- It can reduce confusion when interactions fail.
+
+Risk:
+
+- It can become voyeuristic or inflammatory if over-emphasized.
+
+### 4. Starter packs
+
+This is underrated and very useful.
+
+Why it matters:
+
+- It tells users how an account is being promoted or grouped.
+- It is often more benign and informative than list/blocking data.
+
+Best UI:
+
+- compact cards
+- title, creator, description
+- why you’re in this pack if derivable
+
+### 5. History
+
+Why it matters:
+
+- Handle changes, PDS changes, and identity churn can be meaningful.
+- It helps debug impersonation, migration, or account evolution.
+
+Best UI:
+
+- timeline with discrete events
+- profile snapshot diff
+- copy raw record action for power users
+
+## UFOs (Lexicons & NSIDs)
+
+| User-facing feature | What user problem it solves |
+| ------------------------------------- | ----------------------------------------------------------- |
+| Collection/lexicon explorer | What apps and schemas exist beyond core Bluesky? |
+| Collection activity charts | Is this feature/app active or growing? |
+| Sample records for NSID | What does this collection actually store? |
+| Related lexicons/adjacent collections | What other record types go with this app? |
+| App footprint for an account | What non-Bluesky collections does this actor appear to use? |
+| Trending/unusual collection activity | What new or unusual things are happening on the network? |
+
+### Apps & Collections tab on a profile
+
+Given a profile, infer or inspect which collections that actor has records in, then show:
+
+- core Bluesky collections used
+- third-party app collections used
+- rare / unusual app footprints
+- open samples for those NSIDs
+
+### NSID explorer
+
+A user can search an NSID prefix like sh.tangled, app.bsky, or your own lexicon namespace and see:
+
+- active collections
+- sample records
+- recent activity shape
+- related collections
+
+This is directly aligned with UFOs’ public purpose.
+
+### What is this thing? panel for unknown records
+
+When your client encounters a collection it does not recognize, UFOs can provide:
+
+- sample schema shape from real records
+- nearby lexicons
+- collection activity hints
+- rough understanding of whether it is niche, abandoned, or actively used
+
+That makes your client much better as an ATProto-native explorer.
+
+### Ecosystem radar / discovery dashboard
+
+Examples:
+
+- fastest-growing collections this week
+- newly seen lexicon prefixes
+- unusual spikes in specific collections
+- recently updated for app collections
+
+UFOs can serve recent records of any NSID it has seen.
+
+## Phased Breakdown
+
+1. Phase 1 - lists, labels, blocked-by, starter packs (via Constellation + Bluesky APIs)
+2. Phase 2 - profile history, apps & collections tab (via UFOs)
+3. Phase 3 - NSID explorer and collection insights (via UFOs)
+4. Phase 4 - network relationship diffs and provenance graphs (Constellation)
+5. Phase 5 - Discovery dashboard and ecosystem insights (UFOs)
+54 -18
docs/tasks/06-search.md
···

 Spec: [search.md](../specs/search.md)

-## Steps
+## Tasks
+
+### Backend
+
+#### Network Search

-- [ ] Create `src-tauri/src/search.rs`
+- [ ] Create
+  - `src-tauri/src/search.rs` for business logic
+  - `src-tauri/src/commands/search.rs`
+- [ ] Implement network search commands (not indexed - direct API calls):
+  - `search_posts_network(query, sort?, limit?, cursor?)` → `app.bsky.feed.searchPosts`
+  - `search_actors(query, limit?, cursor?)` → `app.bsky.actor.searchActors`
+  - `search_starter_packs(query, limit?, cursor?)` → `app.bsky.graph.searchStarterPacks`
+  - Note: `searchActorsTypeahead` already exists in auth module
+  - Always available - no local setup required
+
+#### Local Data Pipeline (Base)
+
+- [ ] Add `sync_state` table to migrations (stores cursor per `(did, source)`)
 - [ ] Implement `sync_posts(did: String, source: "like"|"bookmark")`:
-  - Paginate `app.bsky.feed.getActorLikes` (or bookmarks)
+  - Resume from stored cursor in `sync_state` (never re-fetch full history)
+  - Paginate `app.bsky.feed.getActorLikes` (or bookmarks) for the **authenticated user's own** likes/saves
   - Upsert into `posts` table
-  - Insert text into `posts_fts`
-  - Track sync cursor in `sync_state` table
-- [ ] Implement `embed_pending_posts()`:
+  - FTS index is maintained automatically via triggers
+  - Persist the new cursor back to `sync_state`
+
+#### Embeddings
+
+- [ ] Implement `embed_pending_posts()` *(opt-out - skip when embeddings disabled)*:
   - Query posts without embeddings
   - Batch through `fastembed` TextEmbedding model (`nomic-embed-text-v1.5`)
   - Insert into `posts_vec` via `zerocopy::AsBytes`
+- [ ] Implement `reindex_embeddings()`:
+  - Clear all rows from `posts_vec`
+  - Re-embed every post in `posts` table
+  - Triggered manually by user (reindex button in UI)
+- [ ] Implement `set_embeddings_enabled(enabled: bool)`:
+  - Persist preference; when disabled, skip model download + embedding on sync
+  - Keyword search remains fully functional regardless
+
+#### Search Result Context
+
 - [ ] Implement `search_posts(query, mode, limit)`:
-  - `keyword`: FTS5 MATCH query
-  - `semantic`: embed query string → vec similarity search
-  - `hybrid`: run both, merge via reciprocal rank fusion
-- [ ] `get_sync_status(did)` → last sync time, post counts
-- [ ] Model management: download `nomic-embed-text-v1.5` ONNX on first use to `app_data_dir/models/`
+  - `keyword`: FTS5 MATCH query (always available)
+  - `semantic`: embed query string → vec similarity search (requires embeddings enabled)
+  - `hybrid`: run both, merge via reciprocal rank fusion (falls back to keyword-only if embeddings disabled)
+- [ ] `get_sync_status(did)` → last sync time, post counts, cursor state
+- [ ] Model management: download `nomic-embed-text-v1.5` ONNX on first use to `<app_data_dir>/models/` (skipped when embeddings disabled)
 - [ ] Background sync: trigger after login, then every 15 min
-- [ ] **Frontend**: search bar (`/` to focus) with mode selector, `Motion` sliding indicator underline
-- [ ] **Frontend**: search results with staggered `Motion` fade-in, highlighted keyword matches
-- [ ] **Frontend**: sync status indicator with animated progress bar, `Presence` fade-out on complete
-- [ ] **Frontend**: model download progress bar (percentage + ETA) on first launch
-  - Splash/Preflight route should explain what the point of this is
-- [ ] **Frontend**: empty state illustration when no posts synced yet
-- [ ] **Frontend**: `Tab` cycles search mode, `Escape` clears
+
+### Frontend
+
+- [ ] search bar (`/` or `CTRL/CMD + F` to focus) with mode selector (network / keyword / semantic / hybrid), `Motion` sliding indicator underline
+- [ ] search results with staggered `Motion` fade-in, highlighted keyword matches
+- [ ] sync status indicator with animated progress bar, `Presence` fade-out on complete
+- [ ] reindex button: triggers `reindex_embeddings()`, shown in search settings or sync status area
+- [ ] embeddings opt-out toggle in settings (disables semantic search, skips model download)
+- [ ] model download progress bar (percentage + ETA) on first launch
+  - Enabled by default (opt-out)
+  - Splash/Preflight route should explain what semantic search provides
+- [ ] empty state illustration when no posts synced yet
+- [ ] `Tab` cycles search mode (network → keyword → semantic → hybrid), `Escape` clears
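Note on the mode rules in this task list: `Tab` cycles network → keyword → semantic → hybrid, semantic requires embeddings, and hybrid falls back to keyword-only when embeddings are disabled. A minimal sketch of those rules as a small state type. The `SearchMode` enum, its method names, and the wrap-around from hybrid back to network are assumptions for illustration, not code from this change.

```rust
/// Hypothetical model of the four search modes listed in the task file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchMode {
    Network,
    Keyword,
    Semantic,
    Hybrid,
}

impl SearchMode {
    /// Next mode in the `Tab` cycle. Wrapping from hybrid back to network is an assumption.
    pub fn next(self) -> Self {
        match self {
            SearchMode::Network => SearchMode::Keyword,
            SearchMode::Keyword => SearchMode::Semantic,
            SearchMode::Semantic => SearchMode::Hybrid,
            SearchMode::Hybrid => SearchMode::Network,
        }
    }

    /// Mode that can actually run for the current embeddings preference.
    /// Returns `None` when semantic search is requested while embeddings are disabled.
    pub fn effective(self, embeddings_enabled: bool) -> Option<Self> {
        match (self, embeddings_enabled) {
            (SearchMode::Semantic, false) => None,
            // Hybrid degrades to keyword-only instead of failing.
            (SearchMode::Hybrid, false) => Some(SearchMode::Keyword),
            (mode, _) => Some(mode),
        }
    }
}
```

Keeping the fallback in one place would let the frontend grey out the semantic tab and the backend reject a semantic query from the same rule, rather than each side re-deriving it.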
+11 -11
src-tauri/src/db.rs
···
+use super::error::AppError;
+use rusqlite::ffi::sqlite3_auto_extension;
+use rusqlite::{params, Connection, OpenFlags, OptionalExtension};
+use sqlite_vec::sqlite3_vec_init;
 use std::collections::HashSet;
 use std::ffi::{c_char, c_int};
 use std::fs;
 use std::path::PathBuf;
 use std::sync::{Arc, Mutex};
-
-use rusqlite::ffi::sqlite3_auto_extension;
-use rusqlite::{params, Connection, OpenFlags, OptionalExtension};
-use sqlite_vec::sqlite3_vec_init;
 use tauri::{AppHandle, Manager};

-use crate::error::AppError;
-
 pub type DbPool = Arc<Mutex<Connection>>;
+
 type SqliteVecInit = unsafe extern "C" fn();
+
 type SqliteAutoExtension = unsafe extern "C" fn(
     db: *mut rusqlite::ffi::sqlite3,
     pz_err_msg: *mut *mut c_char,
···
         include_str!("migrations/003_oauth_sessions_without_fk.sql"),
     ),
     Migration::new(4, "account_avatars", include_str!("migrations/004_account_avatars.sql")),
+    Migration::new(5, "sync_state", include_str!("migrations/005_sync_state.sql")),
 ];

 pub fn initialize_database(app: &AppHandle) -> Result<DbPool, AppError> {
···
         .query_row("SELECT vec_version()", [], |row| row.get(0))
         .optional()?;

-    if version.is_none() {
-        return Err(AppError::Validation(
+    match version.is_none() {
+        true => Err(AppError::Validation(
             "sqlite-vec extension did not report a version".to_string(),
-        ));
+        )),
+        false => Ok(()),
     }
-
-    Ok(())
 }

 #[cfg(test)]
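Note on the last hunk: matching on the `bool` from `version.is_none()` works, but the same check can be written directly on the `Option`. A minimal sketch of that alternative, assuming `version` is an `Option<String>`. The `AppError` enum below is a stand-in for the crate's real type, and `ensure_vec_version` is a hypothetical name.

```rust
/// Stand-in for the crate's `AppError`; only the variant used in the hunk is modeled.
#[derive(Debug, PartialEq)]
pub enum AppError {
    Validation(String),
}

/// Same outcome as `match version.is_none() { true => Err(..), false => Ok(()) }`.
/// Hypothetical helper: the name and the `Option<String>` type are assumptions.
pub fn ensure_vec_version(version: Option<String>) -> Result<(), AppError> {
    version
        // The version string itself is not needed, only its presence.
        .map(|_| ())
        .ok_or_else(|| {
            AppError::Validation("sqlite-vec extension did not report a version".to_string())
        })
}
```

`ok_or_else` builds the error string only on the failure path, and the function body stays a single expression, which is what the new `match` form in the hunk was also aiming for.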
+7
src-tauri/src/migrations/005_sync_state.sql
···
+CREATE TABLE IF NOT EXISTS sync_state (
+    did TEXT NOT NULL,
+    source TEXT NOT NULL,
+    cursor TEXT,
+    last_synced_at TEXT,
+    PRIMARY KEY (did, source)
+);
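Note on the composite key: `PRIMARY KEY (did, source)` means there is at most one row per account and source, so persisting a new cursor replaces the previous one instead of appending. A minimal in-memory sketch of that behaviour, using a `HashMap` in place of SQLite. `SyncState`, `SyncStateTable`, and the method names are hypothetical, and this is not the `rusqlite` code the migration would be used with.

```rust
use std::collections::HashMap;

/// One row of `sync_state`, minus the key columns.
#[derive(Debug, Clone, PartialEq)]
pub struct SyncState {
    pub cursor: Option<String>,
    pub last_synced_at: Option<String>,
}

/// In-memory stand-in for the table, keyed by `(did, source)` like the primary key.
#[derive(Debug, Default)]
pub struct SyncStateTable {
    rows: HashMap<(String, String), SyncState>,
}

impl SyncStateTable {
    /// Insert or replace the row for `(did, source)`.
    pub fn upsert(&mut self, did: &str, source: &str, cursor: Option<String>, synced_at: &str) {
        self.rows.insert(
            (did.to_string(), source.to_string()),
            SyncState {
                cursor,
                last_synced_at: Some(synced_at.to_string()),
            },
        );
    }

    /// Look up the stored state for `(did, source)`, if any.
    pub fn get(&self, did: &str, source: &str) -> Option<&SyncState> {
        self.rows.get(&(did.to_string(), source.to_string()))
    }

    /// Number of stored rows.
    pub fn len(&self) -> usize {
        self.rows.len()
    }
}
```

Likes and bookmarks for the same DID keep independent cursors because `source` is part of the key.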