See the best posts from any Bluesky account

Add design spec for skystar.social

Initial design for a favstar.fm-style site for Bluesky: type a handle,
see that user's most-liked and most-reposted posts. Documents the
approved architecture (Adonis + Lucid/SQLite + ClickHouse), data model
with per-post watermark reconciliation, ingest/backfill/read flows,
routes, testing strategy, and a single-docker-compose deployment shape.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Tao Bojlén 93823015

+797
docs/superpowers/specs/2026-04-11-skystar-bluesky-design.md
# Skystar.social: Design Spec

**Date:** 2026-04-11
**Status:** Approved, ready for implementation planning
**One-line:** A favstar.fm-style site for Bluesky: type a handle, see that user's most-liked and most-reposted posts.

---

## 1. Goal and scope

Skystar.social lets a visitor type any Bluesky handle and see that user's
top-25 most-liked or most-reposted posts, filterable by time window. The
experience is modeled on the original favstar.fm for Twitter (2009-2018):
unauthenticated, instant on repeat visits, focused on a single greatest-hits
view per user.

### v1 scope

- Per-user profile page with the top-25 posts ranked by likes or reposts.
- Two time lenses: all-time (default) and "last month".
- Handle resolution with canonical-URL redirects (handle changes are healed
  transparently).
- Synchronous backfill on first lookup, capped at 10,000 most recent posts.
- Live ingest of engagement events for tracked users from the Bluesky
  Jetstream firehose.
- Honoring of post deletions, account deletions, and account takedowns.

### Explicitly out of scope for v1

- Authentication, accounts, or any login flow.
- Quote-post tracking. The schema reserves space for it; no code touches it.
- Global leaderboards ("today's best", "this week's best").
- Per-user trophies, awards, or notifications.
- Email, RSS, or any push surface.
- Pagination beyond top-25.
- Backfill of historical individual like records (only aggregate counts at
  backfill time, plus live deltas thereafter).
- Reconciliation of count drift after backfill.
- Tracking of unlike / unrepost events (counts may drift up slightly over
  time; accepted).
- Smoke / E2E / load / browser tests.
- Metrics, APM, error tracking, log shipping.

---

## 2. Constraints and key research findings

The Bluesky AppView (`api.bsky.app` / `public.api.bsky.app`) does **not**
expose any endpoint that returns a user's posts sorted by like count or
repost count. `app.bsky.feed.searchPosts?sort=top` exists but uses an opaque
relevance ranking, not raw counts, and pagination is not guaranteed to
completion. There is no third-party public API that exposes per-user
engagement leaderboards either.

Therefore: **we must build our own index.** The two cheap data sources are:

1. **`getPosts`**: hydrates up to 25 post URIs per call, returning aggregate
   `likeCount` and `repostCount`. This is how we get the historical baseline
   for a user's posts at backfill time.
2. **Jetstream** (`wss://jetstream2.us-east.bsky.network/subscribe`): a
   JSON-over-WebSocket translation of the firehose that supports server-side
   filtering by `wantedCollections` and emits `app.bsky.feed.like`,
   `app.bsky.feed.repost`, and `app.bsky.feed.post` create/delete events
   that we use for live deltas.

Jetstream's replay window is short (days, not years) and there is no public
API that returns historical individual like records cheaply, so we cannot
reconstruct full per-user engagement history. The v1 design accepts this
and uses a snapshot baseline + live delta model.

DIDs (`did:plc:...`) are stable across PDS migrations and handle changes,
so they are the only identifier the system tracks internally. Handles are
mutable display labels.

Honoring deletions is non-negotiable: Bluesky's ToS expects appviews and
clients to stop serving deleted records. The worker handles delete events
on `app.bsky.feed.post` and `kind: "account"` events with `status: "deleted"
| "takendown"`.

---

## 3. Stack

- **Language and framework:** TypeScript on Node.js 24, AdonisJS v6.
- **Templating:** Edge (Adonis's first-party SSR template engine).
- **Read-side ORM:** Lucid (Adonis's first-party Knex-based ORM), used only
  for SQLite.
- **Engagement store:** ClickHouse, accessed via `@clickhouse/client` from a
  shared `packages/clickhouse` package. No ORM.
- **Metadata store:** SQLite (file-backed, WAL mode), via Lucid.
- **Worker:** Adonis Ace command (`node ace jetstream:consume`) with
  `staysAlive = true`. Same project, same image, same code as the web app,
  with a different entrypoint.
- **Atproto client:** `@atproto/api` (the official TypeScript SDK), wrapped
  in `packages/atproto` for centralized rate-limit handling and parsing.
- **Containerization:** Single `Dockerfile` (Node 24 alpine multi-stage with
  tini), single `docker-compose.yml` with three services: `clickhouse`,
  `web`, `worker`.

The web and worker are deliberately separate processes that share code but
not memory. They communicate only through ClickHouse and SQLite. This
isolates failure domains and allows independent restarts.

### Why ClickHouse, not Postgres

The hot query is "for one author, give me the top-25 posts by sum of
engagement events." This is the canonical columnar OLAP workload. ClickHouse
handles 1B+ event rows on a single small box at single-digit-millisecond
query time, with compression ratios of ~10-15× on like-event data (DIDs and
URI prefixes repeat heavily). The append-only model also removes a class of
write-contention bugs we'd otherwise hit on Postgres counter UPDATEs.

SQLite holds the ~kilobyte-scale relational state (users, jetstream cursor,
backfill jobs) where ClickHouse would be wrong. Splitting them means we can
rebuild ClickHouse from Jetstream + re-backfill without losing operational
state, and vice versa.

### Why AdonisJS, not Hono / Nuxt

The site is small now but expected to grow auth, possibly accounts, and
other features later. Adonis's batteries-included structure (controllers,
sessions, validators, mailer, Ace commands) pays for itself the first time
we add a feature that would otherwise need 5 separate libraries glued
together. At 100 req/sec peak, the framework overhead is invisible relative
to the database round-trip. The `staysAlive` Ace command pattern is also a
particularly clean fit for hosting the Jetstream worker in the same project.

---

## 4. Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Hetzner-class VPS (4 vCPU, 8 GB, 160 GB NVMe)              │
│                                                             │
│   ┌────────────────────┐         ┌────────────────────┐     │
│   │  web container     │         │  worker container  │     │
│   │  node bin/server   │         │  node ace          │     │
│   │  Adonis HTTP       │         │   jetstream:consume│     │
│   │  + Edge SSR        │         │  (Ace command,     │     │
│   │  + Lucid           │         │   staysAlive=true) │     │
│   └─────────┬──────────┘         └──────────┬─────────┘     │
│             │                               │               │
│             │ reads + writes                │ writes        │
│             ▼                               ▼               │
│   ┌────────────────────────────────────────────────────┐    │
│   │  shared TS package: packages/clickhouse            │    │
│   │  (@clickhouse/client wrapper, schema, query funcs) │    │
│   └────────────────────────┬───────────────────────────┘    │
│                            ▼                                │
│            ┌──────────────────┐  ┌──────────────────────┐   │
│            │  ClickHouse      │  │  SQLite (Lucid)      │   │
│            │  engagement      │  │  metadata:           │   │
│            │  events +        │  │  - users (did,handle)│   │
│            │  post snapshots  │  │  - cursor checkpoint │   │
│            │  (append-only)   │  │  - backfill_jobs     │   │
│            └──────────────────┘  └──────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
          ▲                              ▲
          │ wss://                       │ HTTPS
          │                              │
  Bluesky Jetstream              Bluesky AppView API
  (jetstream2.us-east...)        (public.api.bsky.app)
```

### Repository layout

```
skystar/
├── apps/
│   └── web/                          # Adonis project root
│       ├── app/
│       │   ├── controllers/          # ProfileController, SearchController
│       │   ├── models/               # User (Lucid)
│       │   └── services/             # HandleResolver, BackfillRunner
│       ├── commands/
│       │   └── jetstream_consume.ts  # Ace staysAlive worker
│       ├── database/
│       │   ├── migrations/           # SQLite migrations (Adonis)
│       │   └── clickhouse/           # ordered .sql files for CH migrations
│       ├── resources/views/          # Edge templates
│       └── start/                    # routes.ts, kernel, etc.
├── packages/
│   ├── clickhouse/                   # CH client wrapper, schema, queries
│   └── atproto/                      # @atproto/api wrapper, parsing, rate limits
├── docker-compose.yml
└── Dockerfile
```

Both packages are imported by the Adonis app (controllers, services, the
Ace command) via TypeScript path aliases. There is one published image.

### Key invariants

1. The web and worker processes never communicate directly. All coordination
   is via SQLite and ClickHouse.
2. The shared `packages/clickhouse` package is the only place that knows
   ClickHouse SQL. Everywhere else uses its query functions.
3. Every per-request operation is a constant number of SQL/ClickHouse
   queries, regardless of how many users are tracked.

---

## 5. Data model

### ClickHouse: engagement data

Two tables. The reconciliation between historical snapshots (from
`getPosts`) and live event deltas (from Jetstream) is handled by a
**per-post watermark**, not a per-user one; this is the only
correctness-critical detail in the data layer.

#### `post_snapshots`

One row per post we have ever seen, capturing what the AppView reported at
backfill time.

```sql
CREATE TABLE post_snapshots (
  post_uri          String,
  post_author_did   String,
  post_text         String,
  post_created_at   DateTime64(6),
  snapshot_likes    UInt32,
  snapshot_reposts  UInt32,
  snapshot_quotes   UInt32,         -- always 0 in v1, reserved for v2
  snapshot_taken_at DateTime64(6),  -- per-post watermark
  is_deleted        UInt8 DEFAULT 0
) ENGINE = ReplacingMergeTree(snapshot_taken_at)
ORDER BY (post_author_did, post_uri)
PARTITION BY toYYYYMM(post_created_at);
```

`ReplacingMergeTree` means re-running backfill or post-delete tombstones for
the same `(post_author_did, post_uri)` collapses to the latest version on
merge.

#### `engagement_events`

Append-only log of every like/repost we have seen via Jetstream **for
tracked users only**. The worker filters out engagement events targeting
untracked authors before insert.

```sql
CREATE TABLE engagement_events (
  post_uri         String,
  post_author_did  String,                  -- denormalized for fast author filter
  actor_did        String,
  rkey             String,                  -- the engagement record's rkey
  kind             LowCardinality(String),  -- 'like' | 'repost' (| 'quote' v2)
  event_created_at DateTime64(6),           -- when the actor created the engagement
  ingested_at      DateTime64(6) DEFAULT now64(6)
) ENGINE = ReplacingMergeTree(ingested_at)
ORDER BY (post_author_did, kind, post_uri, actor_did, rkey)
PARTITION BY toYYYYMM(event_created_at);
```

`ReplacingMergeTree` keyed on the natural unique identifier
`(post_author_did, kind, post_uri, actor_did, rkey)` makes inserts
idempotent: replaying the same event on Jetstream reconnect collides with
itself and collapses on merge. We do **not** use `FINAL` in queries; the
brief transient overcount during the merge window is accepted (see §10).
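
To make the idempotency argument concrete, the following TypeScript sketch simulates what a merge does to replayed rows: rows that share the sort key collapse to the one with the highest `ingested_at`. The `EventRow` type and the `sortKey` and `collapseOnMerge` functions are hypothetical names used only for illustration; real merges run in the background inside ClickHouse, and nothing here is part of the spec's code.

```typescript
// Hypothetical in-memory model of an engagement_events row (illustration only).
interface EventRow {
  postUri: string;
  postAuthorDid: string;
  actorDid: string;
  rkey: string;
  kind: "like" | "repost";
  eventCreatedAtUs: number; // microseconds since epoch
  ingestedAtUs: number; // plays the role of the version column
}

// Mirrors the table's ORDER BY tuple, which acts as the deduplication key.
function sortKey(row: EventRow): string {
  return [row.postAuthorDid, row.kind, row.postUri, row.actorDid, row.rkey].join("\u0000");
}

// Simulates a merge: for each key, keep the row with the largest version.
function collapseOnMerge(rows: EventRow[]): EventRow[] {
  const latest = new Map<string, EventRow>();
  for (const row of rows) {
    const key = sortKey(row);
    const seen = latest.get(key);
    if (seen === undefined || row.ingestedAtUs >= seen.ingestedAtUs) {
      latest.set(key, row);
    }
  }
  return Array.from(latest.values());
}

// The same like delivered twice: once live, once replayed after a reconnect.
const firstDelivery: EventRow = {
  postUri: "at://did:plc:author/app.bsky.feed.post/3kabc",
  postAuthorDid: "did:plc:author",
  actorDid: "did:plc:liker",
  rkey: "3kxyz",
  kind: "like",
  eventCreatedAtUs: 1000000,
  ingestedAtUs: 1000500,
};
const replayedDelivery: EventRow = { ...firstDelivery, ingestedAtUs: 3000500 };

const beforeMerge = [firstDelivery, replayedDelivery];
const afterMerge = collapseOnMerge(beforeMerge);
```

Before the merge a plain count sees two rows for one like, which is the transient overcount the text accepts; after the merge only the replayed row remains.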

The order key also makes "top posts for one author, one kind" a sequential
scan over a tight slab of disk, supporting single-digit-millisecond queries.

### How counts are computed

The total engagement of one kind on one post is:

```
total(post, kind) = snapshot_count(post, kind)
                  + count(events for post where kind matches
                          AND event_created_at > snapshot_taken_at(post))
```

The watermark is **per-post**, not per-user, because the backfill makes
~hundreds of `getPosts` calls spread across several seconds: each call
captures a snapshot at a different moment, and a single user-level watermark
would either double-count or miss events depending on its value.

The top-25 query is one `LEFT JOIN` across the two tables filtered to the
single author's slab, computed as:

```sql
SELECT
  s.post_uri,
  any(s.post_text) AS text,
  any(s.post_created_at) AS created_at,
  s.snapshot_likes
    + countIf(e.kind='like' AND e.event_created_at > s.snapshot_taken_at)
    AS likes,
  s.snapshot_reposts
    + countIf(e.kind='repost' AND e.event_created_at > s.snapshot_taken_at)
    AS reposts
FROM post_snapshots s
LEFT JOIN engagement_events e
  ON e.post_uri = s.post_uri
  AND e.post_author_did = s.post_author_did
WHERE s.post_author_did = ?
  AND s.is_deleted = 0
  AND (? IS NULL OR s.post_created_at >= ?) -- ?after= filter
GROUP BY s.post_uri, s.snapshot_likes, s.snapshot_reposts, s.snapshot_taken_at
ORDER BY {likes|reposts} DESC, created_at DESC
LIMIT 25;
```

ClickHouse pushes the `post_author_did = ?` predicate down both sides of the
join because both tables are ordered by `post_author_did` first.
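
The watermark rule is easier to check against a worked example. The sketch below restates the formula in TypeScript for a single post; the `Snapshot` and `Engagement` types and the `totalCount` function are hypothetical names for illustration, and the timestamps are small made-up integers rather than real microsecond values.

```typescript
type Kind = "like" | "repost";

// Hypothetical shapes mirroring the columns the formula needs.
interface Snapshot {
  postUri: string;
  snapshotLikes: number;
  snapshotReposts: number;
  snapshotTakenAtUs: number; // the per-post watermark
}

interface Engagement {
  postUri: string;
  kind: Kind;
  eventCreatedAtUs: number;
}

// total(post, kind) = snapshot count + events strictly after the watermark.
function totalCount(snapshot: Snapshot, events: Engagement[], kind: Kind): number {
  const base = kind === "like" ? snapshot.snapshotLikes : snapshot.snapshotReposts;
  let delta = 0;
  for (const e of events) {
    const samePost = e.postUri === snapshot.postUri;
    const afterWatermark = e.eventCreatedAtUs > snapshot.snapshotTakenAtUs;
    if (samePost && e.kind === kind && afterWatermark) {
      delta += 1;
    }
  }
  return base + delta;
}

const examplePostUri = "at://did:plc:example/app.bsky.feed.post/abc";
const exampleSnapshot: Snapshot = {
  postUri: examplePostUri,
  snapshotLikes: 10,
  snapshotReposts: 2,
  snapshotTakenAtUs: 1000,
};
const exampleEvents: Engagement[] = [
  { postUri: examplePostUri, kind: "like", eventCreatedAtUs: 900 }, // before watermark: already in the snapshot
  { postUri: examplePostUri, kind: "like", eventCreatedAtUs: 1000 }, // at the watermark: not counted (strict >)
  { postUri: examplePostUri, kind: "like", eventCreatedAtUs: 1500 }, // live delta
  { postUri: examplePostUri, kind: "repost", eventCreatedAtUs: 1600 }, // live delta
];

const exampleLikes = totalCount(exampleSnapshot, exampleEvents, "like"); // 10 + 1 = 11
const exampleReposts = totalCount(exampleSnapshot, exampleEvents, "repost"); // 2 + 1 = 3
```

Because the comparison uses each post's own `snapshot_taken_at`, two posts snapshotted seconds apart during the same backfill each get the correct cut-off.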

### SQLite: metadata (Lucid models)

```sql
CREATE TABLE users (
  did              TEXT PRIMARY KEY,  -- stable identity
  handle           TEXT NOT NULL,     -- current handle (mutable)
  display_name     TEXT,
  avatar_url       TEXT,
  first_seen_at    INTEGER NOT NULL,
  backfilled_at    INTEGER,           -- NULL = never backfilled
  last_searched_at INTEGER,
  deleted_at       INTEGER            -- set on account deletion/takedown
);
CREATE INDEX users_handle ON users(handle);

CREATE TABLE jetstream_cursor (
  id         INTEGER PRIMARY KEY CHECK (id = 1),
  cursor_us  INTEGER NOT NULL,
  updated_at INTEGER NOT NULL
);

CREATE TABLE backfill_jobs (
  did           TEXT PRIMARY KEY,
  started_at    INTEGER NOT NULL,
  finished_at   INTEGER,
  total_posts   INTEGER,
  fetched_posts INTEGER NOT NULL DEFAULT 0,
  state         TEXT NOT NULL,  -- 'running' | 'done' | 'failed'
  error         TEXT
);
```

The `users.backfilled_at` column is informational only; it is not used in
the query math. Per-post `snapshot_taken_at` carries the load-bearing
correctness signal.

---

## 6. Data flow

### Flow 1: Ingest (worker, always running)

The worker subscribes to Jetstream with
`wantedCollections=app.bsky.feed.like,app.bsky.feed.repost,app.bsky.feed.post`
and `compress=true`. We receive every event in those collections from the
whole network; server-side filtering by *author* is not possible for our
case (likes are authored by likers, not post authors), so we filter in the
consumer.

On each event:

1. Parse JSON.
2. Branch on collection.
3. For likes/reposts: extract `subject.uri`, parse the post author DID from
   it (`at://did:plc:THIS_PART/app.bsky.feed.post/RKEY`).
4. Check the post author DID against the in-memory `Set<string>` of tracked
   DIDs. If absent, drop the event (>99% of events). The lookup is ~50ns
   per event; the filter is effectively free.
5. If present, append the row to an in-memory buffer.
6. The buffer flushes to ClickHouse every 500ms or 1000 rows, whichever
   comes first. ClickHouse is much happier with a couple of large inserts
   per second than with ~1,500 single-row inserts per second.
7. After a successful flush, update `jetstream_cursor.cursor_us` to the
   cursor of the last event in that batch.

For `app.bsky.feed.post` events: if the post's author is tracked, insert a
row into `post_snapshots` with `snapshot_likes=0, snapshot_reposts=0,
snapshot_taken_at=now()`, so future engagement on it has somewhere to live.

For `app.bsky.feed.post` *delete* events on tracked authors: write a
tombstone row to `post_snapshots` with `is_deleted=1` and a fresh
`snapshot_taken_at`. ReplacingMergeTree collapses to the tombstone version.

For `kind: "account"` events with `status: "deleted" | "takendown"` on a
tracked DID: set `users.deleted_at = now()`, remove the DID from the
in-memory tracked set, and schedule a worker task that tombstones all
of that user's `post_snapshots` rows.

Every 1 second, the worker re-reads new entries from `users` and adds any
newly-seen DIDs to the in-memory tracked set. There is no handshake back to
the web process; this is a one-way refresh.

Every 1 second, the worker also writes the latest cursor checkpoint to
SQLite. (If the worker crashes between checkpoints, the natural replay on
reconnect (see "Reconnect handling") produces duplicate inserts that
ReplacingMergeTree collapses on merge.)

#### Reconnect handling

On WebSocket disconnect, the worker reads `jetstream_cursor.cursor_us` and
reconnects with `cursor=<that value>`. **No subtraction, no overlap window.**
Jetstream cursors are monotonic microsecond timestamps; resuming from
exactly the last-checkpointed cursor delivers strictly subsequent events.
The natural replay window is the events received-but-not-yet-checkpointed
at the moment of crash (at most ~one batch, ~500ms wide), which collapses
naturally on `ReplacingMergeTree` merges.

### Flow 2: Backfill (web process, on first user search)

Triggered when `GET /profile/:handle/likes` (or `/reposts`) hits a user
whose `users.backfilled_at IS NULL`.

1. Resolve handle → DID via `com.atproto.identity.resolveHandle` against
   `public.api.bsky.app`.
2. `INSERT OR IGNORE` into `users(did, handle, first_seen_at)`. If a
   parallel request already inserted, the second request sees the same row
   and the same `backfill_jobs` state.
3. `INSERT OR IGNORE` into `backfill_jobs(did, state='running', started_at)`.
4. **Fire off the backfill as an unawaited background task**
   (`void runBackfill(did)`) and return a static "Indexing @handle…" page
   with `<meta http-equiv="refresh" content="2">`. No SSE, no JS, no
   progress bar.
5. The browser auto-refreshes every 2 seconds. Each refresh re-runs the
   same controller, which sees `backfill_jobs.state` and either renders the
   loading page again or, once `state='done'`, renders the actual top-25
   page.
6. The background backfill loop:
   a. `getAuthorFeed(DID, cursor)` paginates the user's posts in reverse
      chronological order, ~100 posts per page.
   b. For each batch of 25 URIs, `getPosts(uris)` returns aggregate counts.
   c. Insert one row per post into `post_snapshots` with
      `snapshot_taken_at = now()` recorded *at the moment that batch's
      response landed*.
   d. Update `backfill_jobs.fetched_posts` for the loading page.
   e. Continue until cursor exhausted or 10,000 posts reached
      (`BACKFILL_MAX_POSTS`).
   f. `UPDATE users SET backfilled_at = now()`, `UPDATE backfill_jobs SET
      state='done', finished_at = now()`.

The worker is, in parallel, polling the `users` table every 1 second and
will pick up the new DID within ~1 second of insertion, beginning live
ingest. There is a small window (≤1 second) at the start of the backfill
during which likes targeting the new user's earliest backfilled posts may
be lost: they happen after the snapshot was taken (so not in the snapshot)
but before the worker picked the user up (so not in events). This drift is
bounded to a handful of likes per first-search per user, only affects the
most-recent posts in the first batch, and is **explicitly accepted**
(see §10).

If the backfill crashes mid-flight, the `backfill_jobs` row is left in
`state='running'` and the user's loading page never resolves. This is a
known limitation of v1 and is recovered manually by deleting the row.

### Flow 3: Read (web process, every subsequent visit)

`GET /profile/:handle/likes?after=2026-03-11`

1. `SELECT did, handle, deleted_at FROM users WHERE handle = ?`
2. If `deleted_at IS NOT NULL`, render 410 Gone.
3. If the URL handle differs from the canonical handle for that DID
   (handle change since last visit), 301 to the canonical URL.
4. Run the top-25 ClickHouse query (per §5) with the `?after=` filter and
   the kind from the path segment (`/likes` or `/reposts`).
5. Render the Edge template with the 25 cards.
6. `Cache-Control: public, max-age=60`, since top posts of all time don't
   change meaningfully second to second.

Each post card renders directly from `post_snapshots.post_text` plus the
joined live counts. **No callbacks to the AppView at request time.** Avatar
images link directly to the Bluesky CDN URL stored on the user row.

---

## 7. Routes and URLs

| Method | Route | Purpose |
|---|---|---|
| GET | `/` | Landing page with search box |
| GET | `/search?q=:handle` | Resolve, 302 → canonical `/profile/:handle/likes` |
| GET | `/profile/:handle` | 301 → `/profile/:handle/likes` |
| GET | `/profile/:handle/likes` | Profile page, top by likes |
| GET | `/profile/:handle/reposts` | Profile page, top by reposts |
| GET | `/about` | Static "what is this" + ATProto credits |
| GET | `/healthz` | Liveness check |

### Canonicalization rules

Every profile view has **exactly one canonical URL**. All non-canonical
forms 301 to it.

- `/profile/dril` → 301 → `/profile/dril.bsky.social/likes`
- `/profile/dril.bsky.social` → 301 → `/profile/dril.bsky.social/likes`
- `/profile/dril.bsky.social/likes` → 200 (canonical)
- `/profile/oldhandle.bsky.social/likes` → 301 →
  `/profile/newhandle.bsky.social/likes` (handle change healing)
- `/profile/btao.org/likes` → 200 (custom domain handles are already
  canonical)

The search box on the landing page resolves through canonicalization
server-side so a user typing "dril" lands directly on
`/profile/dril.bsky.social/likes` with one redirect, not two.

A `<link rel="canonical">` tag in the page `<head>` points at the canonical
URL as belt-and-braces.

### Query parameters

- `?after=YYYY-MM-DD`: filter to posts created on or after the given UTC
  date. Omitted parameter means "all time". The UI dropdown computes the
  date when generating the link, so a "Last month" link is a permalink to
  that exact range.

The UI dropdown still shows friendly labels ("All time", "Last month").
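
The `?after=` handling can be sketched in a few lines of TypeScript. The function names `parseAfter` and `lastMonthLink` are hypothetical, and two behaviors are assumptions rather than anything the spec states: an invalid or missing value is treated as "all time" (`null`), and "last month" means the same day of the previous calendar month in UTC, clamped to that month's last day.

```typescript
// Parses a YYYY-MM-DD string as a UTC date. Returns null when the parameter
// is missing or malformed (assumption: callers treat null as "all time").
function parseAfter(raw: string | undefined): Date | null {
  if (raw === undefined) return null;
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(raw);
  if (match === null) return null;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC silently normalizes impossible dates (e.g. Feb 30); reject those.
  const roundTrips =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return roundTrips ? date : null;
}

// Builds the "Last month" permalink at render time, so the link keeps
// pointing at the same fixed range when it is shared later.
function lastMonthLink(handle: string, kind: "likes" | "reposts", now: Date): string {
  // First day of the previous month; a negative month index rolls the year back.
  const first = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() - 1, 1));
  // Day 0 of the following month is the last day of the target month.
  const daysInMonth = new Date(
    Date.UTC(first.getUTCFullYear(), first.getUTCMonth() + 1, 0),
  ).getUTCDate();
  const day = Math.min(now.getUTCDate(), daysInMonth);
  const after = new Date(Date.UTC(first.getUTCFullYear(), first.getUTCMonth(), day));
  const ymd = after.toISOString().slice(0, 10);
  return "/profile/" + encodeURIComponent(handle) + "/" + kind + "?after=" + ymd;
}

const linkOnSpecDate = lastMonthLink("dril.bsky.social", "likes", new Date(Date.UTC(2026, 3, 11)));
const linkOnMonthEnd = lastMonthLink("dril.bsky.social", "reposts", new Date(Date.UTC(2026, 2, 31)));
```

On the spec's own date (2026-04-11) this yields the `?after=2026-03-11` URL used in the Flow 3 example.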

---

## 8. UI

### Landing (`/`)

- App name and one-line tagline.
- Single search box.
- Four example handles to seed curiosity.
- One short paragraph of explanation.
- No nav, no login, no footer cruft.

### Profile page

- Header: avatar, display name, handle, optional bio line.
- Two controls:
  - **Kind toggle:** "Most liked" / "Most reposted", implemented as links
    between `/profile/:handle/likes` and `/profile/:handle/reposts`.
  - **Lens dropdown:** "All time" / "Last month", which sets the `?after=`
    parameter.
- 25 post cards in ranked order. Each card shows: like count, repost count,
  post text (with line breaks preserved), date, "view on Bluesky" link.

### Loading page (during first-visit backfill)

A static page with "Indexing @handle…" copy and a meta-refresh tag. No
progress bar, no SSE, no JS. The browser refreshes every 2 seconds; each
refresh re-runs the controller which decides whether to keep showing the
loading page or render the real one.

### Error states

- Handle does not resolve → 404 with "We can't find @whatever on Bluesky.
  Did you typo?"
- User has zero public posts → render the profile header with "@user
  hasn't posted anything yet".
- User has fewer than 25 posts → render whatever they have.
- Account deleted/taken down → 410 Gone.
- ClickHouse query fails → 503 with "we're having a moment, try again in a
  sec".
- AppView rate-limited mid-backfill → backoff and resume silently. The
  loading page continues to refresh; the user is not told why it is slow.
- Backfill cap reached (10,000 posts) → render the page normally. (No UI
  callout; v1 keeps the page clean.)

---

## 9. Configuration

Complete env-var surface:

| Variable | Purpose |
|---|---|
| `NODE_ENV` | `production` or `development` |
| `PORT` | HTTP port (web only) |
| `APP_KEY` | Adonis encryption key |
| `DB_CONNECTION` | Lucid connection name (`sqlite`) |
| `SQLITE_PATH` | Path to SQLite file |
| `CLICKHOUSE_URL` | ClickHouse HTTP endpoint |
| `CLICKHOUSE_DB` / `CLICKHOUSE_USER` / `CLICKHOUSE_PASSWORD` | Credentials |
| `JETSTREAM_URL` | Jetstream WebSocket URL (worker only) |
| `BACKFILL_MAX_POSTS` | Defaults to 10000 |
| `LOG_LEVEL` | Defaults to `info` |

No config files baked into the image. No service discovery. No secrets
manager.

---

## 10. Accepted error budgets

The design deliberately accepts several small sources of count drift in
exchange for radical simplicity:

1. **Sub-second AppView/Jetstream race during backfill.** A like that
   arrives via Jetstream within ~100-500ms of a `getPosts` call may end up
   on the wrong side of `snapshot_taken_at`, causing a single missing or
   double-counted like. Bounded to a few likes per backfill per active post.
2. **First-second worker lag on new users.** Likes happening between the
   moment a user is first inserted and the moment the worker picks them up
   on its 1-second poll may be lost: they happen after the snapshot was
   taken (for the earliest backfill batches) but before the worker is
   tracking. Bounded to a handful of likes per first-search per user, only
   affects the user's most-recent posts, only happens once per user ever.
3. **Unlikes / unreposts not tracked.** Counts may drift up over time. The
   alternative is a multi-billion-row rkey-to-post mapping table, which is
   not worth the storage cost for v1.
4. **Transient over-count during ReplacingMergeTree merge windows.** After
   a Jetstream reconnect, duplicate event rows exist briefly until a
   background merge collapses them. Bounded to seconds.

For a "greatest-hits ranking" UX where the top posts are old viral content
with thousands of likes, all four of these drifts are invisible.

There is no reconciliation job in v1.

---

## 11. Testing strategy

Two layers. No smoke / E2E / load / browser tests.

### Unit tests (Japa)

- `packages/clickhouse/queries.ts`: top-25 query builder, fixtures seeded
  into a real ClickHouse. Tests cover: all-time and `?after=` ordering,
  per-post watermark math, ties broken by `post_created_at`, posts with
  zero engagement excluded, tombstoned posts excluded.
- `packages/atproto/parse.ts`: pure functions for AT-URI parsing,
  Jetstream event JSON → internal shape, `getPosts` response → internal
  shape.
- `apps/web/app/services/handle_resolver.ts`: handle normalization,
  custom domain detection, invalid handle rejection.
- The worker's filter logic: given an event and a tracked-DID set, does
  it correctly keep or drop.
- Date parsing for `?after=`.

### Integration tests (Japa with real services)

- Full backfill flow with a stub `@atproto/api` client.
- Jetstream consumer loop with a fake WebSocket: verify only tracked
  users' events land in ClickHouse, post deletes tombstone correctly,
  cursor checkpoint advances.
- Canonical URL redirect flows (bare handle, kind suffix, handle change,
  custom domain).
- Concurrent first-search dedup.

### Test infrastructure

- One `docker-compose.yml` at the repo root (also used in development).
  Runs ClickHouse. SQLite is handled natively by Adonis's test runner.
- CI uses the same Compose file. Per-test ClickHouse database isolation
  via `CREATE DATABASE test_<uuid>`.
- Test data builders (`aPost(...)`, `aLikeEvent(...)`) for readable setups.

### Coverage targets

No percentage. The bugs that matter are concentrated in:

1. The watermark/snapshot reconciliation math.
2. The Jetstream event filter.
3. The handle resolution / canonical URL flow.
4. The deletion-honoring path.

---

## 12. Deployment

A single `docker-compose.yml` at the repo root, three services. The
application's only deployment concerns are: building a Docker image,
logging to stdout, reading config from env vars.

### `Dockerfile`

Multi-stage, `node:24-alpine` base, tini as the entrypoint (Node-as-PID-1
does not reap zombies or forward signals correctly; tini fixes this with
one line).

```dockerfile
FROM node:24-alpine AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN node ace build
RUN npm prune --omit=dev

FROM node:24-alpine AS runtime
RUN apk add --no-cache tini
WORKDIR /app
# Copy the compiled output into the working directory itself so that
# `node bin/server.js` and `node ace ...` resolve relative to /app.
COPY --from=build --chown=node:node /app/build ./
COPY --from=build --chown=node:node /app/node_modules ./node_modules
COPY --from=build --chown=node:node /app/package.json ./package.json
USER node
ENTRYPOINT ["/sbin/tini", "--"]
# CMD set per-service in docker-compose.yml
```

### `docker-compose.yml`

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    volumes:
      - clickhouse-data:/var/lib/clickhouse
    environment:
      CLICKHOUSE_DB: skystar
      CLICKHOUSE_USER: skystar
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
    ulimits:
      nofile: 262144

  web:
    build: .
    command: node bin/server.js
    depends_on: [clickhouse]
    volumes:
      - sqlite-data:/data
    environment:
      NODE_ENV: production
      PORT: 3333
      DB_CONNECTION: sqlite
      SQLITE_PATH: /data/skystar.sqlite
      CLICKHOUSE_URL: http://clickhouse:8123
      CLICKHOUSE_DB: skystar
      CLICKHOUSE_USER: skystar
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
      APP_KEY: ${APP_KEY}
    ports:
      - "3333:3333"

  worker:
    build: .
    command: node ace jetstream:consume
    depends_on: [clickhouse]
    volumes:
      - sqlite-data:/data
    environment:
      NODE_ENV: production
      DB_CONNECTION: sqlite
      SQLITE_PATH: /data/skystar.sqlite
      CLICKHOUSE_URL: http://clickhouse:8123
      CLICKHOUSE_DB: skystar
      CLICKHOUSE_USER: skystar
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
      JETSTREAM_URL: wss://jetstream2.us-east.bsky.network/subscribe

volumes:
  clickhouse-data:
  sqlite-data:
```

### Migrations

- `web` startup runs `node ace migration:run --force` (SQLite via Lucid;
  Adonis requires `--force` to run migrations in production) and
  `node ace clickhouse:migrate` (custom Ace command applying versioned
  `.sql` files in `database/clickhouse/`, tracked in a
  `_clickhouse_migrations` table).
- `worker` startup skips migrations and waits up to 5 seconds for the web
  to have run them, then proceeds.

### Logging

Every process writes to stdout. Adonis is configured with `LOG_LEVEL=info`
and a JSON formatter (Pino, Adonis's default). No file logging, no
rotation, no shipping. The orchestration layer collects logs.

### Graceful shutdown

Both `web` and `worker` handle SIGTERM:

- `web`: Adonis's HTTP server already handles this: it drains in-flight
  requests, then exits.
- `worker`: closes the Jetstream WebSocket, flushes the in-memory ClickHouse
  buffer one last time, writes the final cursor checkpoint, exits 0.

Tini ensures SIGTERM is forwarded correctly under `docker stop`.

---

## 13. Out of scope (recap)

To preserve focus, the following are explicitly **not** built in v1:

- Authentication, accounts, sessions.
- Quote-post tracking (schema reserves space, no code).
- Global leaderboards, daily/weekly "best of" pages.
- Trophies, awards, notifications.
- Email, RSS, push.
- Pagination beyond top-25.
- Backfill of historical individual like records.
- Reconciliation jobs.
- Unlike / unrepost tracking.
- Browser tests, E2E tests, smoke tests, load tests.
- Metrics, APM, error tracking, log shipping.
- Backup automation, deploy pipeline (handled by operator).
- Crashed-backfill auto-recovery (manual cleanup).
- IPC between web and worker beyond the shared SQLite and ClickHouse stores.

Each of these is a known forward-looking feature. The schema and
architecture are designed to absorb most of them without rework: quotes
fit into the existing `kind` enum, leaderboards are the existing query with
`WHERE post_author_did = ?` removed, and time windows are already
parameterized.