See the best posts from any Bluesky account
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

Run backfill jobs via @adonisjs/queue (SQLite)

Replace the "void runBackfill(did)" fire-and-forget pattern in the web
process with a proper queued BackfillJob handled by @adonisjs/queue using
its database adapter on SQLite. Adds a third Adonis process,
queue-worker, running 'node ace queue:work'.

Web request handler still does the dedup gate (INSERT OR IGNORE on
backfill_jobs, dispatch only if the insert actually happened) since
@adonisjs/queue does not provide dedup-by-key natively. backfill_jobs
remains the loading-page state surface that the queued job updates as
it runs; the queue's internal tables are an implementation detail.

Update the architecture diagram, repo layout, deployment compose file,
migrations notes, and graceful-shutdown notes accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

+124 -76
+124 -76
docs/superpowers/specs/2026-04-11-skystar-bluesky-design.md
··· 91 91 - **Engagement store:** ClickHouse, accessed via `@clickhouse/client` from a 92 92 shared `packages/clickhouse` package. No ORM. 93 93 - **Metadata store:** SQLite (file-backed, WAL mode), via Lucid. 94 - - **Worker:** Adonis Ace command (`node ace jetstream:consume`) with 95 - `staysAlive = true`. Same project, same image, same code as the web app — 96 - different entrypoint. 94 + - **Jetstream worker:** Adonis Ace command (`node ace jetstream:consume`) 95 + with `staysAlive = true`. Same project, same image, same code as the web 96 + app — different entrypoint. 97 + - **Background jobs:** `@adonisjs/queue` (the official AdonisJS queue 98 + package) with the **database adapter on SQLite**. Used for backfill jobs 99 + only. Run via `node ace queue:work` as a third process. 97 100 - **Atproto client:** `@atproto/api` (the official TypeScript SDK), wrapped 98 101 in `packages/atproto` for centralized rate-limit handling and parsing. 99 102 - **Containerization:** Single `Dockerfile` (Node 24 alpine multi-stage with 100 - tini), single `docker-compose.yml` with three services: `clickhouse`, 101 - `web`, `worker`. 103 + tini), single `docker-compose.yml` with four services: `clickhouse`, 104 + `web`, `jetstream-worker`, `queue-worker`. 102 105 103 - The web and worker are deliberately separate processes that share code but 104 - not memory. They communicate only through ClickHouse and SQLite. This 105 - isolates failure domains and allows independent restarts. 106 + The web, jetstream worker, and queue worker are deliberately separate 107 + processes that share code but not memory. They communicate only through 108 + ClickHouse and SQLite. This isolates failure domains and allows independent 109 + restarts: a crash in the queue worker does not interrupt live engagement 110 + ingest, and a crash in the jetstream worker does not interrupt running 111 + backfills. 106 112 107 113 ### Why ClickHouse, not Postgres 108 114 ··· 133 139 ## 4. Architecture 134 140 135 141 ``` 136 - ┌─────────────────────────────────────────────────────────────┐ 137 - │ Hetzner-class VPS (4 vCPU, 8 GB, 160 GB NVMe) │ 138 - │ │ 139 - │ ┌────────────────────┐ ┌────────────────────┐ │ 140 - │ │ web container │ │ worker container │ │ 141 - │ │ node bin/server │ │ node ace │ │ 142 - │ │ Adonis HTTP │ │ jetstream:consume│ │ 143 - │ │ + Edge SSR │ │ (Ace command, │ │ 144 - │ │ + Lucid │ │ staysAlive=true) │ │ 145 - │ └─────────┬──────────┘ └──────────┬─────────┘ │ 146 - │ │ │ │ 147 - │ │ reads + writes │ writes │ 148 - │ ▼ ▼ │ 149 - │ ┌────────────────────────────────────────────────────┐ │ 150 - │ │ shared TS package: packages/clickhouse │ │ 151 - │ │ (@clickhouse/client wrapper, schema, query funcs) │ │ 152 - │ └────────────────────────┬───────────────────────────┘ │ 153 - │ ▼ │ 154 - │ ┌──────────────────┐ ┌──────────────────────┐ │ 155 - │ │ ClickHouse │ │ SQLite (Lucid) │ │ 156 - │ │ engagement │ │ metadata: │ │ 157 - │ │ events + │ │ - users (did,handle)│ │ 158 - │ │ post snapshots │ │ - cursor checkpoint │ │ 159 - │ │ (append-only) │ │ - backfill_jobs │ │ 160 - │ └──────────────────┘ └──────────────────────┘ │ 161 - └─────────────────────────────────────────────────────────────┘ 162 - ▲ ▲ 163 - │ wss:// │ HTTPS 164 - │ │ 165 - Bluesky Jetstream Bluesky AppView API 166 - (jetstream2.us-east...) (public.api.bsky.app) 142 + ┌─────────────────────────────────────────────────────────────────┐ 143 + │ Hetzner-class VPS (4 vCPU, 8 GB, 160 GB NVMe) │ 144 + │ │ 145 + │ ┌──────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ 146 + │ │ web │ │ jetstream-worker │ │ queue-worker │ │ 147 + │ │ Adonis HTTP │ │ node ace │ │ node ace │ │ 148 + │ │ + Edge SSR │ │ jetstream: │ │ queue:work │ │ 149 + │ │ + Lucid │ │ consume │ │ (runs │ │ 150 + │ │ │ │ (staysAlive) │ │ BackfillJob) │ │ 151 + │ └──────┬───────┘ └────────┬─────────┘ └────────┬─────────┘ │ 152 + │ │ │ │ │ 153 + │ │ reads │ writes │ reads+writes│ 154 + │ ▼ ▼ ▼ │ 155 + │ ┌────────────────────────────────────────────────────────┐ │ 156 + │ │ shared TS package: packages/clickhouse │ │ 157 + │ │ (@clickhouse/client wrapper, schema, query funcs) │ │ 158 + │ └────────────────────────┬───────────────────────────────┘ │ 159 + │ ▼ │ 160 + │ ┌──────────────────┐ ┌──────────────────────┐ │ 161 + │ │ ClickHouse │ │ SQLite (Lucid) │ │ 162 + │ │ engagement │ │ metadata: │ │ 163 + │ │ events + │ │ - users │ │ 164 + │ │ post snapshots │ │ - cursor checkpoint │ │ 165 + │ │ (append-only) │ │ - backfill_jobs │ │ 166 + │ │ │ │ - adonis_jobs │ │ 167 + │ │ │ │ (queue internal) │ │ 168 + │ └──────────────────┘ └──────────────────────┘ │ 169 + └─────────────────────────────────────────────────────────────────┘ 170 + ▲ ▲ 171 + │ wss:// │ HTTPS 172 + │ │ 173 + Bluesky Jetstream Bluesky AppView API 174 + (jetstream2.us-east...) (public.api.bsky.app) 167 175 ``` 168 176 169 177 ### Repository layout ··· 174 182 │ └── web/ # Adonis project root 175 183 │ ├── app/ 176 184 │ │ ├── controllers/ # ProfileController, SearchController 177 - │ │ ├── models/ # User (Lucid) 185 + │ │ ├── models/ # User, BackfillJob row (Lucid) 186 + │ │ ├── jobs/ # BackfillJob (@adonisjs/queue Job) 178 187 │ │ └── services/ # HandleResolver, BackfillRunner 179 188 │ ├── commands/ 180 189 │ │ └── jetstream_consume.ts # Ace staysAlive worker ··· 463 472 at the moment of crash (at most ~one batch, ~500ms wide), which collapses 464 473 naturally on `ReplacingMergeTree` merges. 465 474 466 - ### Flow 2 — Backfill (web process, on first user search) 475 + ### Flow 2 — Backfill (web process dispatches, queue worker runs) 467 476 468 477 Triggered when `GET /profile/:handle/likes` (or `/reposts`) hits a user 469 478 whose `users.backfilled_at IS NULL`. 470 479 480 + **In the web request handler:** 481 + 471 482 1. Resolve handle → DID via `com.atproto.identity.resolveHandle` against 472 483 `public.api.bsky.app`. 473 484 2. `INSERT OR IGNORE` into `users(did, handle, first_seen_at)`. If a 474 - parallel request already inserted, the second request sees the same row 475 - and the same `backfill_jobs` state. 476 - 3. `INSERT OR IGNORE` into `backfill_jobs(did, state='running', started_at)`. 477 - 4. **Fire off the backfill as an unawaited background task** 478 - (`void runBackfill(did)`) and return a static "Indexing @handle…" page 479 - with `<meta http-equiv="refresh" content="2">`. No SSE, no JS, no 480 - progress bar. 485 + parallel request already inserted, the second request sees the same 486 + row. 487 + 3. `INSERT OR IGNORE` into `backfill_jobs(did, state='running', 488 + started_at)`. **Only if the insert actually happened** (i.e. this is the 489 + first request for this handle, not a parallel duplicate), dispatch the 490 + queued job: `await BackfillJob.dispatch({did})`. The conditional 491 + dispatch is our deduplication mechanism — `@adonisjs/queue` does not 492 + provide dedup-by-key natively. 493 + 4. Return a static "Indexing @handle…" page with `<meta http-equiv="refresh" 494 + content="2">`. No SSE, no JS, no progress bar. 481 495 5. The browser auto-refreshes every 2 seconds. Each refresh re-runs the 482 - same controller, which sees `backfill_jobs.state` and either renders the 483 - loading page again or, once `state='done'`, renders the actual top-25 484 - page. 485 - 6. The background backfill loop: 486 - a. `getAuthorFeed(DID, cursor)` paginates the user's posts in reverse 487 - chronological order, ~100 posts per page. 488 - b. For each batch of 25 URIs, `getPosts(uris)` returns aggregate counts. 489 - c. Insert one row per post into `post_snapshots` with 490 - `snapshot_likes`, `snapshot_reposts`, and `snapshot_quotes` copied 491 - directly from the `getPosts` response, and `snapshot_taken_at = 492 - now()` recorded *at the moment that batch's response landed*. 493 - d. Update `backfill_jobs.fetched_posts` for the loading page. 494 - e. Continue until cursor exhausted or 10,000 posts reached 495 - (`BACKFILL_MAX_POSTS`). 496 - f. `UPDATE users SET backfilled_at = now()`, `UPDATE backfill_jobs SET 497 - state='done', finished_at = now()`. 496 + same controller, which checks `backfill_jobs.state` and either renders 497 + the loading page again or, once `state='done'`, renders the actual 498 + top-25 page. 499 + 500 + **In the queue worker process** (`node ace queue:work`), `BackfillJob` 501 + runs: 502 + 503 + 1. `getAuthorFeed(DID, cursor)` paginates the user's posts in reverse 504 + chronological order, ~100 posts per page. 505 + 2. For each batch of 25 URIs, `getPosts(uris)` returns aggregate counts. 506 + 3. Insert one row per post into `post_snapshots` with `snapshot_likes`, 507 + `snapshot_reposts`, and `snapshot_quotes` copied directly from the 508 + `getPosts` response, and `snapshot_taken_at = now()` recorded *at the 509 + moment that batch's response landed*. 510 + 4. Update `backfill_jobs.fetched_posts` for the loading page. 511 + 5. Continue until cursor exhausted or 10,000 posts reached 512 + (`BACKFILL_MAX_POSTS`). 513 + 6. `UPDATE users SET backfilled_at = now()`, `UPDATE backfill_jobs SET 514 + state='done', finished_at = now()`. 515 + 516 + If the job throws, `@adonisjs/queue` retries with exponential backoff 517 + (default settings). On final failure, the job's `failed()` hook updates 518 + `backfill_jobs.state = 'failed'` with the error message, and the loading 519 + page can render a "we couldn't index this user — try again later" message 520 + on the next meta-refresh. 498 521 499 522 The worker is, in parallel, polling the `users` table every 1 second and 500 523 will pick up the new DID within ~1 second of insertion, beginning live ··· 506 529 most-recent posts in the first batch, and is **explicitly accepted** 507 530 (see §10). 508 531 509 - If the backfill crashes mid-flight, the `backfill_jobs` row is left in 510 - `state='running'` and the user's loading page never resolves. This is a 511 - known limitation of v1 and is recovered manually by deleting the row. 532 + If the queue worker process is killed mid-job (not a thrown error, but a 533 + hard crash or `kill -9`), the `backfill_jobs` row is left in 534 + `state='running'` and the user's loading page never resolves. The queued 535 + job itself may or may not be retried by `@adonisjs/queue` depending on 536 + visibility-timeout configuration; for v1 we recover manually by deleting 537 + the row. Throw-based failures are handled cleanly by the job's `failed()` 538 + hook (above). 512 539 513 540 ### Flow 3 — Read (web process, every subsequent visit) 514 541 ··· 723 750 724 751 ## 12. Deployment 725 752 726 - A single `docker-compose.yml` at the repo root, three services. The 753 + A single `docker-compose.yml` at the repo root, four services. The 727 754 application's only deployment concerns are: building a Docker image, 728 755 logging to stdout, reading config from env vars. 729 756 ··· 787 814 ports: 788 815 - "3333:3333" 789 816 790 - worker: 817 + jetstream-worker: 791 818 build: . 792 819 command: node ace jetstream:consume 793 - depends_on: [clickhouse] 820 + depends_on: [clickhouse, web] 794 821 volumes: 795 822 - sqlite-data:/data 796 823 environment: ··· 803 830 CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 804 831 JETSTREAM_URL: wss://jetstream2.us-east.bsky.network/subscribe 805 832 833 + queue-worker: 834 + build: . 835 + command: node ace queue:work 836 + depends_on: [clickhouse, web] 837 + volumes: 838 + - sqlite-data:/data 839 + environment: 840 + NODE_ENV: production 841 + DB_CONNECTION: sqlite 842 + SQLITE_PATH: /data/skystar.sqlite 843 + CLICKHOUSE_URL: http://clickhouse:8123 844 + CLICKHOUSE_DB: skystar 845 + CLICKHOUSE_USER: skystar 846 + CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 847 + 806 848 volumes: 807 849 clickhouse-data: 808 850 sqlite-data: ··· 810 852 811 853 ### Migrations 812 854 813 - - `web` startup runs `node ace migration:run` (SQLite via Lucid) and 855 + - `web` startup runs `node ace migration:run` (SQLite via Lucid, which 856 + also creates the `@adonisjs/queue` internal tables) and 814 857 `node ace clickhouse:migrate` (custom Ace command applying versioned 815 858 `.sql` files in `database/clickhouse/`, tracked in a 816 859 `_clickhouse_migrations` table). 817 - - `worker` startup skips migrations and waits up to 5 seconds for the web 818 - to have run them, then proceeds. 860 + - `jetstream-worker` and `queue-worker` skip migrations and wait up to 5 861 + seconds for `web` to have run them, then proceed. Compose `depends_on` 862 + ordering ensures `web` starts first. 819 863 820 864 ### Logging 821 865 ··· 825 869 826 870 ### Graceful shutdown 827 871 828 - Both `web` and `worker` handle SIGTERM: 872 + All three Adonis processes handle SIGTERM: 829 873 830 874 - `web`: Adonis's HTTP server already handles this — drains in-flight 831 875 requests, then exits. 832 - - `worker`: closes the Jetstream WebSocket, flushes the in-memory ClickHouse 833 - buffer one last time, writes the final cursor checkpoint, exits 0. 876 + - `jetstream-worker`: closes the Jetstream WebSocket, flushes the in-memory 877 + ClickHouse buffer one last time, writes the final cursor checkpoint, 878 + exits 0. 879 + - `queue-worker`: `@adonisjs/queue`'s `queue:work` command handles SIGTERM 880 + by waiting for the in-flight job (if any) to complete or hit its 881 + visibility timeout, then exits cleanly. 834 882 835 883 Tini ensures SIGTERM is forwarded correctly under `docker stop`. 836 884