GET /xrpc/app.bsky.actor.searchActorsTypeahead typeahead.waow.tech

fix handle NOT NULL constraint, limit validation, and tighten readme

- bind empty string instead of null for missing handles (fixes batch failures)
- use COALESCE(NULLIF(...)) to preserve existing handles on partial updates
- filter empty-handle actors from search results
- reject limit <= 0, NaN, and non-numeric values with 400
- restore SLINGSHOT_URL constant for /request-indexing
- trim readme to match project style

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
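
The `COALESCE(NULLIF(...))` pattern named in the commit message can be read as "treat an empty string as missing, then fall back to the stored value". Below is a minimal TypeScript sketch of that merge rule, not the project's code: `mergeField`, `applyUpsert`, and the `ActorRow` shape are hypothetical names used only for illustration, and the real merge happens inside SQLite.

```typescript
// Emulates SQL COALESCE(NULLIF(incoming, ''), existing) for one text column.
// NULLIF(x, '') yields NULL when x is the empty string; COALESCE then falls
// back to the value already stored in the row.
function mergeField(incoming: string, existing: string): string {
  const nullIfEmpty: string | null = incoming === "" ? null : incoming;
  return nullIfEmpty ?? existing;
}

// Hypothetical row shape, mirroring the text columns of the actors table.
interface ActorRow {
  did: string;
  handle: string;
  display_name: string;
  avatar_url: string;
}

// Applies the merge rule to every text column, as the upsert does on conflict.
function applyUpsert(existing: ActorRow, incoming: ActorRow): ActorRow {
  return {
    did: existing.did,
    handle: mergeField(incoming.handle, existing.handle),
    display_name: mergeField(incoming.display_name, existing.display_name),
    avatar_url: mergeField(incoming.avatar_url, existing.avatar_url),
  };
}

// A partial update that carries only an avatar keeps the stored handle.
const stored: ActorRow = {
  did: "did:plc:example",
  handle: "alice.example",
  display_name: "Alice",
  avatar_url: "",
};
const partial: ActorRow = {
  did: "did:plc:example",
  handle: "",
  display_name: "",
  avatar_url: "https://example.invalid/avatar.png",
};
const merged = applyUpsert(stored, partial);
console.log(merged.handle); // alice.example
```

One consequence of this rule is that an update can never clear a field back to empty, since an empty incoming value always loses to the stored one.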

+43 -50
+14 -29
README.md
···
 # typeahead

-community actor search for [atproto](https://atproto.com).
+community actor search for [atproto](https://atproto.com). drop-in replacement for `app.bsky.actor.searchActorsTypeahead`.

 **live:** https://typeahead.waow.tech
-
-## what it does
-
-drop-in replacement for Bluesky's `app.bsky.actor.searchActorsTypeahead` endpoint. any atproto app can point at this instead of `public.api.bsky.app` and get actor search without depending on Bluesky's infrastructure.

 ```bash
 curl "https://typeahead.waow.tech/xrpc/app.bsky.actor.searchActorsTypeahead?q=nate&limit=10"
 ```

-## how it works
+## stack

-1. **ingester** (zig) subscribes to [Jetstream](https://docs.bsky.app/blog/jetstream), filters identity and profile events, batches them to the worker
-2. **worker** (cloudflare) serves search via FTS5 full-text search over D1, with edge caching (60s) and rate limiting
-3. **backfill** fills index gaps by querying Bluesky's typeahead in the background — temporary bridge while the index catches up
+```
+jetstream → ingester (zig, fly.io) → worker (cloudflare)
+
+D1 (sqlite/FTS5)
+```

-## stack
+- **ingester**: [zig](https://ziglang.org) on [fly.io](https://fly.io) — streams identity + profile events via [jetstream](https://docs.bsky.app/blog/jetstream), batches to worker
+- **worker**: [cloudflare worker](https://workers.cloudflare.com) + D1 + KV + cache API — FTS5 prefix search, edge-cached (60s), rate-limited
+- **identity**: [slingshot](https://microcosm.blue) for on-demand handle resolution

-- **ingester**: zig on fly.io — streams jetstream, posts batches to worker
-- **worker**: cloudflare worker + D1 (sqlite/FTS5) + KV + cache API
-- **search**: FTS5 prefix matching, edge-cached per query
-
-## run locally
+## dev

 ```bash
 # worker
-npm install
-npx wrangler dev
+npm install && npx wrangler dev

 # ingester
 cd ingester && zig build run
···
 ## deploy

 ```bash
-npx wrangler deploy # worker to cloudflare
-cd ingester && fly deploy # ingester to fly.io
+npx wrangler deploy # worker to cloudflare
+cd ingester && fly deploy # ingester to fly.io
 ```
-
-## known limitations
-
-- **response shape**: returns `did`, `handle`, `displayName`, `avatar`. does not yet include `associated`, `labels`, `createdAt`, or other fields from [`profileViewBasic`](https://docs.bsky.app/docs/api/app-bsky-actor-search-actors-typeahead). apps that only destructure the core fields work fine; apps that depend on labels or associated metadata should be aware of this gap.
-- **no moderation filtering**: results are not filtered by moderation labels. unsafe accounts may appear in results.
-- **index coverage**: the index grows from jetstream events + backfill. it does not yet have full coverage of all atproto actors.
-
-## license
-
-MIT
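
Since the README describes the service as a drop-in replacement for `app.bsky.actor.searchActorsTypeahead`, a client mostly needs to swap the base URL. The sketch below shows one way a caller might do that in TypeScript. `buildTypeaheadUrl`, `searchActors`, and `DEFAULT_BASE` are hypothetical helpers, not part of this repository, and the `Actor` fields assume the response carries only `did`, `handle`, `displayName`, and `avatar`, as the worker code in this change suggests.

```typescript
// Hypothetical client helper; the host and endpoint path come from the README,
// everything else here is an illustrative assumption.
interface Actor {
  did: string;
  handle: string;
  displayName?: string;
  avatar?: string;
}

const DEFAULT_BASE = "https://typeahead.waow.tech";

// Builds the request URL, percent-encoding the search term.
function buildTypeaheadUrl(q: string, limit = 10, base = DEFAULT_BASE): string {
  const query = `q=${encodeURIComponent(q)}&limit=${limit}`;
  return `${base}/xrpc/app.bsky.actor.searchActorsTypeahead?${query}`;
}

// Performs the search and returns the actors array from the JSON body.
async function searchActors(q: string, limit = 10): Promise<Actor[]> {
  const res = await fetch(buildTypeaheadUrl(q, limit));
  if (!res.ok) {
    throw new Error(`typeahead request failed with status ${res.status}`);
  }
  const body = (await res.json()) as { actors: Actor[] };
  return body.actors;
}

console.log(buildTypeaheadUrl("nate", 10));
// https://typeahead.waow.tech/xrpc/app.bsky.actor.searchActorsTypeahead?q=nate&limit=10
```

Passing a different `base` (for example `https://public.api.bsky.app`) to `buildTypeaheadUrl` is how a caller would switch between this service and Bluesky's own endpoint under these assumptions.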
+1 -1
schema.sql
···
 CREATE TABLE IF NOT EXISTS actors (
   did TEXT PRIMARY KEY,
-  handle TEXT NOT NULL,
+  handle TEXT NOT NULL DEFAULT '',
   display_name TEXT DEFAULT '',
   avatar_url TEXT DEFAULT '',
   updated_at INTEGER NOT NULL DEFAULT (unixepoch())
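
In SQLite a column `DEFAULT` applies only when the column is omitted from the `INSERT`; binding an explicit `NULL` to a `NOT NULL` column still fails the constraint. That is presumably why this change also binds `''` instead of `null`. A minimal sketch of that normalization step follows; `IncomingActor` and `toBindValues` are hypothetical names, not identifiers from the repository.

```typescript
// Hypothetical input shape: every field except did may be missing.
interface IncomingActor {
  did: string;
  handle?: string | null;
  displayName?: string | null;
  avatar?: string | null;
}

// Returns positional bind values for ?1..?4, replacing missing text with ''
// so that no NULL ever reaches the NOT NULL handle column.
function toBindValues(a: IncomingActor): [string, string, string, string] {
  return [a.did, a.handle || "", a.displayName || "", a.avatar || ""];
}

console.log(toBindValues({ did: "did:plc:example", handle: null }));
// [ 'did:plc:example', '', '', '' ]
```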
+28 -20
src/index.ts
···
 `INSERT INTO actors (did, handle, display_name, avatar_url, updated_at)
 VALUES (?1, ?2, ?3, ?4, unixepoch())
 ON CONFLICT(did) DO UPDATE SET
-  handle = COALESCE(?2, actors.handle),
+  handle = COALESCE(NULLIF(?2, ''), actors.handle),
   display_name = COALESCE(NULLIF(?3, ''), actors.display_name),
   avatar_url = COALESCE(NULLIF(?4, ''), actors.avatar_url),
   updated_at = unixepoch()`
 ).bind(
   a.did,
-  a.handle || null,
-  a.displayName || null,
-  a.avatar || null
+  a.handle || '',
+  a.displayName || '',
+  a.avatar || ''
 )
 );

···
 ): Promise<Response> {
   const url = new URL(request.url);
   const q = url.searchParams.get("q") || url.searchParams.get("term") || "";
-  const limitParam = parseInt(url.searchParams.get("limit") || "10", 10);
+  const limitRaw = url.searchParams.get("limit");
+  const limitParam = limitRaw !== null ? parseInt(limitRaw, 10) : 10;

   if (!q.trim()) {
     return json({ error: "InvalidRequest", message: "Error: Params must have the property \"q\"" }, 400);
   }
-  if (limitParam > 100) {
-    return json({ error: "InvalidRequest", message: "Error: limit must be <= 100" }, 400);
+  if (isNaN(limitParam) || limitParam < 1 || limitParam > 100) {
+    return json({ error: "InvalidRequest", message: "Error: limit must be between 1 and 100" }, 400);
   }

-  const limit = Math.max(1, limitParam || 10);
+  const limit = limitParam;
   const term = sanitize(q);
   if (!term) {
     return json({ actors: [] });
···
   .bind(ftsQuery, limit)
   .all<ActorRow>();

-const actors = (results || []).map((r) => ({
-  did: r.did,
-  handle: r.handle,
-  ...(r.display_name ? { displayName: r.display_name } : {}),
-  ...(r.avatar_url ? { avatar: r.avatar_url } : {}),
-}));
+const actors = (results || [])
+  .filter((r) => r.handle)
+  .map((r) => ({
+    did: r.did,
+    handle: r.handle,
+    ...(r.display_name ? { displayName: r.display_name } : {}),
+    ...(r.avatar_url ? { avatar: r.avatar_url } : {}),
+  }));

 // --- backfill: remove this block once at parity with Bluesky ---
 const hasGaps = actors.length < limit || actors.some((a) => !a.avatar);
···
 `INSERT INTO actors (did, handle, display_name, avatar_url, updated_at)
 VALUES (?1, ?2, ?3, ?4, unixepoch())
 ON CONFLICT(did) DO UPDATE SET
-  handle = COALESCE(?2, actors.handle),
-  display_name = COALESCE(?3, actors.display_name),
+  handle = COALESCE(NULLIF(?2, ''), actors.handle),
+  display_name = COALESCE(NULLIF(?3, ''), actors.display_name),
   avatar_url = COALESCE(NULLIF(?4, ''), actors.avatar_url),
   updated_at = unixepoch()`
 ).bind(
   e.did,
-  e.handle || null,
-  e.display_name || null,
-  avatarUrl
+  e.handle || '',
+  e.display_name || '',
+  avatarUrl || ''
 );
 });

-await env.DB.batch(stmts);
+try {
+  await env.DB.batch(stmts);
+} catch (e: any) {
+  console.log(JSON.stringify({ event: "ingest_error", error: e?.message, count: events.length }));
+  return json({ error: e?.message || "db batch failed" }, 500);
+}

 if (cursor !== undefined) {
   try {
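
The new `limit` handling in `src/index.ts` can be isolated into a small function for testing. The sketch below mirrors the checks in the diff (`parseInt`, then a NaN and range check) under the hypothetical name `parseLimit`, which returns `null` where the worker would respond with 400. Note that `parseInt` accepts a leading numeric prefix, so a value like `10abc` still parses as 10; `parseLimitStrict` is an assumed stricter variant, not something this change implements.

```typescript
// Mirrors the validation in the diff: default to 10 when the parameter is
// absent, otherwise parse base-10 and require 1 <= limit <= 100.
// Returns null where the handler would answer with a 400.
function parseLimit(raw: string | null): number | null {
  const parsed = raw !== null ? parseInt(raw, 10) : 10;
  if (isNaN(parsed) || parsed < 1 || parsed > 100) {
    return null;
  }
  return parsed;
}

// Hypothetical stricter variant: only plain decimal digits are accepted,
// so inputs with trailing garbage or a fractional part are rejected.
function parseLimitStrict(raw: string | null): number | null {
  if (raw !== null && !/^\d+$/.test(raw)) {
    return null;
  }
  return parseLimit(raw);
}

console.log(parseLimit(null));          // 10
console.log(parseLimit("25"));          // 25
console.log(parseLimit("0"));           // null
console.log(parseLimit("abc"));         // null
console.log(parseLimit("10abc"));       // 10
console.log(parseLimitStrict("10abc")); // null
```

Whether the lenient `parseInt` behavior matters depends on how closely the service wants to match the upstream endpoint's parameter validation; the diff itself keeps the lenient form.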