Prepare repo for open-sourcing · btao.org/favs.blue@a8d5bf5

+19 -5

.env.example

···
  APP_KEY=
  APP_URL=http://${HOST}:${PORT}

+ # Build / version (set at deploy time; surfaced via /health/* and to the OTEL service.version attr)
+ GIT_SHA=
+
  # Session
  SESSION_DRIVER=cookie

···
  # served but not discoverable via describeFeedGenerator.
  FEED_PUBLISHER_DID=

+ # Firehose virality webhooks — POSTed when a post crosses the 1k / 10k
+ # engagement threshold. Both optional; unset => no webhook is sent.
+ FIREHOSE_WEBHOOK_URL_1K=
+ FIREHOSE_WEBHOOK_URL_10K=
+
+ # Health check token — required by /health/ready via the x-monitoring-secret
+ # header. Unset => /health/ready returns 401. /health/live is always public.
+ HEALTH_CHECK_TOKEN=
+
  # PostHog (analytics)
  # Tracking is disabled when POSTHOG_API_KEY is unset.
+ # POSTHOG_HOST defaults to https://us.i.posthog.com; set to https://eu.i.posthog.com
+ # for the EU region or to your own reverse proxy.
  POSTHOG_API_KEY=
- POSTHOG_HOST=https://ph.btao.org
+ POSTHOG_HOST=https://us.i.posthog.com

- # OpenTelemetry (traces to Axiom)
+ # OpenTelemetry (any OTLP-compatible backend — Axiom, Honeycomb, Grafana Cloud, ...)
  # Tracing is disabled when OTEL_EXPORTER_OTLP_ENDPOINT is unset.
- # OTEL_SERVICE_NAME is set per-process in docker-compose.yml (web/worker/jetstream).
- OTEL_EXPORTER_OTLP_ENDPOINT=https://api.axiom.co
- OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <AXIOM_API_TOKEN>,X-Axiom-Dataset=<AXIOM_DATASET>
+ # OTEL_SERVICE_NAME is set per-process in docker-compose.yml (web/worker/jetstream/scheduler).
+ OTEL_EXPORTER_OTLP_ENDPOINT=
+ OTEL_EXPORTER_OTLP_HEADERS=
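The `HEALTH_CHECK_TOKEN` comment in this hunk describes a simple gate on `/health/ready`. A minimal sketch of that rule follows, assuming a hypothetical `readinessStatus` helper; the repository's real handler is not part of this diff. Only the "unset means 401" behaviour and the `x-monitoring-secret` header name come from the comment. The 200 on success and the 401 on a wrong token are assumptions.

```typescript
// Hypothetical helper, not taken from the repository.
// Maps request headers plus the configured token to an HTTP status.
function readinessStatus(
  headers: Record<string, string | undefined>,
  configuredToken: string | undefined
): 200 | 401 {
  // An unset (or empty) HEALTH_CHECK_TOKEN closes the endpoint to everyone.
  if (!configuredToken) return 401

  const presented = headers['x-monitoring-secret']
  if (!presented) return 401

  // Plain equality keeps the sketch dependency-free; a real handler
  // would likely prefer a constant-time comparison.
  return presented === configuredToken ? 200 : 401
}
```

With an empty `HEALTH_CHECK_TOKEN=` line, as shipped in `.env.example`, this sketch rejects every request, which matches the documented default.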

+2

.gitignore

···
  # TypeScript
  *.tsbuildinfo
  .worktrees
+
+ docs/superpowers/

+16 -14

AGENTS.md

···

  ## Stack

- - **TypeScript + AdonisJS v7** (web + workers, single repo, no monorepo — `apps/web` was flattened to root in commit c46cdbb)
- - **ClickHouse** — append-only store for engagement events + post snapshots
- - **SQLite** (via Lucid) — metadata: tracked users, jetstream cursor, backfill jobs
- - **Node 24**, **pnpm 10** (pinned in `mise.toml`; the repo migrated from npm in c46cdbb)
- - **Edge.js** templates, **Alpine.js (CSP build)** + **Vite** on the frontend
- - Ships as one Docker image with three process entrypoints.
+ - **TypeScript + AdonisJS v7** (web + workers, single repo, no monorepo).
+ - **ClickHouse** — append-only store for engagement events + post snapshots.
+ - **SQLite** (via Lucid) — metadata: tracked users, jetstream cursor, backfill jobs.
+ - **Node 24**, **pnpm 10** (pinned in `mise.toml`).
+ - **Edge.js** templates, **Alpine.js (CSP build)** + **Vite** on the frontend.
+ - Ships as one Docker image with four process entrypoints (web, jetstream, queue, scheduler).

  ## Development

···
  ## Commands

  ```bash
- pnpm dev # node ace serve --hmr
+ pnpm dev # runs web + jetstream + queue + scheduler concurrently with hot reload
  pnpm test # node ace test (Japa)
  pnpm lint # eslint .
  pnpm typecheck # tsc --noEmit
···
  2. **Jetstream consumer** (`node ace jetstream:consume`) — `app/services/jetstream_consumer.ts` connects to the Bluesky Jetstream WebSocket, filters for tracked DIDs, writes engagement events to ClickHouse, and persists cursor position to SQLite via `jetstream_cursor_io.ts`. The consumer takes a `WebSocketLike` factory so tests can inject fakes.
  3. **Queue worker** (`node ace queue:work`, `@adonisjs/queue`) — runs `app/jobs/backfill_job.ts` which fetches historical posts/likes/reposts from the AppView on first lookup of a new handle.

+ A scheduler (`node ace scheduler:run`) handles periodic jobs like virality threshold scans.
+
  ### Data flow

- - Handle → `HandleResolver` (SQLite lookup or AppView resolution) → persisted to `users` table (DID is the tracked identity)
- - First time we see a DID: enqueue a backfill job; Jetstream consumer also starts tracking it for live events
- - Queries for top-N posts read from ClickHouse (`app/lib/clickhouse/store.ts`) joining engagement events with post snapshots
- - Cursor checkpoint so Jetstream resumes after restart
+ - Handle → `HandleResolver` (SQLite lookup or AppView resolution) → persisted to `users` table (DID is the tracked identity).
+ - First time we see a DID: enqueue a backfill job; Jetstream consumer also starts tracking it for live events.
+ - Queries for top-N posts read from ClickHouse (`app/lib/clickhouse/store.ts`) joining engagement events with post snapshots.
+ - Cursor checkpoint so Jetstream resumes after restart.

  ### Key modules

- - `app/lib/atproto/` — AppView client + Jetstream event parsers. Import via `#lib/atproto/index` (path alias). Previously lived in `packages/atproto`, flattened in commit 0cea00e.
- - `app/lib/clickhouse/` — ClickHouse client wrapper + query store. Import via `#lib/clickhouse/index`. **Tests must drain query streams** — see commit c46cdbb; undrained streams hang the test process.
+ - `app/lib/atproto/` — AppView client + Jetstream event parsers. Import via `#lib/atproto/index` (path alias).
+ - `app/lib/clickhouse/` — ClickHouse client wrapper + query store. Import via `#lib/clickhouse/index`. **Tests must drain query streams** — undrained streams hang the test process.
  - `database/schema.ts` + `database/schema_rules.ts` — Lucid schema definitions; `database/migrations/` for SQLite; `database/clickhouse/*.sql` for ClickHouse DDL (numbered, applied in order).
  - `config/` — standard AdonisJS config (app, database, queue, session, shield, static, vite). `start/env.ts` defines required env vars via Vine schema.

···

  ## Edge templates gotcha

- Edge uses `{{ handle }}` for interpolation, **not** `{{{ handle }}}`. There was a recent bug (ac30713) where `@{{ handle }}` rendered as literal `@<handle>` because of Edge's `@` tag parsing — if you're rendering a handle preceded by `@`, escape the `@` or use `{{ '@' + handle }}`.
+ Edge uses `{{ handle }}` for interpolation, **not** `{{{ handle }}}`. If you're rendering a handle preceded by `@`, escape the `@` or use `{{ '@' + handle }}` — Edge parses `@` as a tag prefix and a literal `@{{ handle }}` will render as `@<handle>`.

  ## Alpine.js CSP build

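The AGENTS.md text in this hunk notes that the Jetstream consumer takes a `WebSocketLike` factory so tests can inject fakes. A rough sketch of that pattern follows. The `WebSocketLike` shape, the `FakeWebSocket` class and the `collectTrackedEvents` stand-in are all assumptions made for illustration, since the real interface in `app/services/jetstream_consumer.ts` does not appear in this diff. The top-level `did` field on events is also assumed.

```typescript
// Assumed minimal shape; the repository's real `WebSocketLike` may differ.
interface WebSocketLike {
  onmessage: ((event: { data: string }) => void) | null
  close(): void
}

type WebSocketFactory = (url: string) => WebSocketLike

// Test double: lets a test push messages in and observe close().
class FakeWebSocket implements WebSocketLike {
  onmessage: ((event: { data: string }) => void) | null = null
  closed = false

  emit(payload: unknown): void {
    this.onmessage?.({ data: JSON.stringify(payload) })
  }

  close(): void {
    this.closed = true
  }
}

// Hypothetical stand-in for the consumer: records DIDs of events whose
// author is in the tracked set and ignores everything else.
function collectTrackedEvents(factory: WebSocketFactory, url: string, tracked: Set<string>) {
  const seen: string[] = []
  const socket = factory(url)
  socket.onmessage = (event) => {
    const parsed = JSON.parse(event.data) as { did?: string }
    if (parsed.did && tracked.has(parsed.did)) seen.push(parsed.did)
  }
  return { seen, stop: () => socket.close() }
}
```

Because the socket arrives through a factory, a test can hand in `() => fake` and drive the consumer logic without any network access.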

+51 -25

README.md

···
- # favs.blue
+ # ❤️ favs.blue

- favs.blue is a favstar.fm-style web app for Bluesky: type any handle and see that user's top-25 most-liked or most-reposted posts, filterable by "all time" or "last month." It works by subscribing to the Bluesky Jetstream firehose for live engagement events, and running a one-time backfill on first lookup against the Bluesky AppView API. The stack is TypeScript / AdonisJS v7 (web + workers), ClickHouse for the append-only engagement store, and SQLite for metadata. Everything ships as a single Docker image with three process entrypoints: an HTTP web server, a Jetstream WebSocket consumer, and a queue worker that runs backfill jobs.
+ A favstar.fm-style web app for Bluesky: type any handle and see that user's most-liked or most-reposted posts, filterable by "all time" or "last month."
+
+ Live engagement comes from the Bluesky [Jetstream](https://github.com/bluesky-social/jetstream) firehose. The first time a handle is looked up, a backfill job pulls historical posts from the Bluesky AppView API.
+
+ **Stack:** TypeScript / [AdonisJS](https://adonisjs.com), [ClickHouse](https://clickhouse.com) for engagement events, SQLite for metadata, [Edge.js](https://edgejs.dev) templates with [Alpine.js (CSP build)](https://alpinejs.dev/advanced/csp) on the frontend. Ships as one Docker image with four process entrypoints (web, jetstream consumer, queue worker, scheduler).

  ## Local development

- **Prerequisites:** Node 24+, Docker.
+ **Prerequisites:** [Mise](https://mise.jdx.dev/), Docker.

  ```bash
- # 1. Copy env template and fill in APP_KEY at minimum
- cp .env.example apps/web/.env
- node -e "console.log(require('crypto').randomBytes(32).toString('base64url'))" # paste as APP_KEY
+ # 1. Install Node/pnpm
+ mise trust && mise install
+
+ # 1. Copy the env template and generate an APP_KEY
+ cp .env.example .env
+ node -e "console.log(require('crypto').randomBytes(32).toString('base64url'))"
+ # paste the output into APP_KEY in .env

  # 2. Start ClickHouse
  docker compose up clickhouse -d

  # 3. Install dependencies
- npm install
+ pnpm install

- # 4. Run migrations
- cd apps/web && node ace migration:run
+ # 4. Run migrations (SQLite + ClickHouse) and start all four processes
+ pnpm dev
+ ```

- # 5. Start the dev server (with hot reload)
- node ace serve --hmr
+ `pnpm dev` runs the web server, jetstream consumer, queue worker, and scheduler concurrently with hot-reload. The app is at http://localhost:3333.

- # 6. (Optional) Start the Jetstream worker in a second terminal
- node ace jetstream:consume
+ The first profile lookup for a new handle takes a minute or two while backfill runs. Live engagement updates flow in once the jetstream consumer has been running.

- # 7. (Optional) Start the queue worker in a third terminal
- node ace queue:work
- ```
+ ### Useful commands

- The app will be available at http://localhost:3333.
+ ```bash
+ pnpm test # Japa test runner
+ pnpm lint # eslint
+ pnpm typecheck # tsc --noEmit
+ pnpm format # prettier --write
+ pnpm build # production build

- ### Tests
+ # Run a single test file
+ node ace test --files "tests/unit/clickhouse_store.spec.ts"

- ```bash
- cd apps/web && node ace test
+ # Apply ClickHouse migrations only
+ node ace clickhouse:migrate
  ```

- ### Production build (Docker)
+ ### Production (Docker Swarm)

- ```bash
- docker compose build
- docker compose up
- ```
+ The repository ships a `docker-compose.yml` configured for `docker stack deploy`.
+
+ ## Architecture
+
+ Three process entrypoints:
+
+ - **HTTP web server** (`bin/server.ts`) — controllers in `app/controllers/`, routes in `start/routes.ts`. Profile pages at `/profile/:handle/likes|reposts` are the main product surface.
+ - **Jetstream consumer** (`node ace jetstream:consume`) — connects to the Bluesky Jetstream WebSocket, filters for tracked DIDs, writes engagement events to ClickHouse, and persists cursor position so it can resume after restart.
+ - **Queue worker** (`node ace queue:work`) — runs backfill jobs that fetch historical posts/likes/reposts from the AppView on first lookup of a new handle.
+
+ A scheduler process (`node ace scheduler:run`) handles periodic jobs like virality threshold scans.
+
+ ## Configuration
+
+ See `.env.example` for the full list of environment variables. `start/env.ts` is the source of truth — variables are validated there at boot.
+
+ ## License
+
+ See [LICENSE](LICENSE).

+1 -3

app/services/posthog.ts

···
    if (!apiKey) return null

    client = new PostHog(apiKey, {
-     host: process.env.POSTHOG_HOST || 'https://ph.btao.org',
-     // Disable geoip on PostHog's side — we send $ip from Cloudflare's
-     // CF-Connecting-IP header so PostHog can geolocate from that.
+     host: process.env.POSTHOG_HOST || 'https://us.i.posthog.com',
      disableGeoip: false,
    })


+5 -3

config/shield.ts

···
  import app from '@adonisjs/core/services/app'
  import { defineConfig } from '@adonisjs/shield'

+ const posthogHost = process.env.POSTHOG_HOST || 'https://us.i.posthog.com'
+
  const shieldConfig = defineConfig({
    csp: {
      enabled: true,
      directives: {
        defaultSrc: [`'self'`],
-       scriptSrc: [`'self'`, '@nonce', 'https://ph.btao.org'],
+       scriptSrc: [`'self'`, '@nonce', posthogHost],
        styleSrc: [
          `'self'`,
          `'unsafe-inline'`,
···
          'https://video.cdn.bsky.app',
        ],
        connectSrc: app.inDev
-         ? [`'self'`, '@viteUrl', 'ws://localhost:*', 'https://ph.btao.org']
-         : [`'self'`, 'https://ph.btao.org'],
+         ? [`'self'`, '@viteUrl', 'ws://localhost:*', posthogHost]
+         : [`'self'`, posthogHost],
        fontSrc: [`'self'`, 'https://cdn.jsdelivr.net', 'https://fonts.bunny.net'],
        workerSrc: [`'self'`, 'blob:', 'data:'],
        objectSrc: [`'none'`],
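This commit repeats the same `POSTHOG_HOST` fallback in `app/services/posthog.ts`, `config/shield.ts` and `docker-compose.yml`. One way to keep the TypeScript call sites in sync is a small shared resolver, sketched below. `DEFAULT_POSTHOG_HOST`, `resolvePosthogHost` and `buildConnectSrc` are hypothetical names and are not part of the commit; the directive values mirror the `connectSrc` branches shown in the diff.

```typescript
// Hypothetical shared default; the commit inlines this string at each call site.
const DEFAULT_POSTHOG_HOST = 'https://us.i.posthog.com'

// Mirrors `process.env.POSTHOG_HOST || '<default>'`: an unset or empty
// value falls back to the default host.
function resolvePosthogHost(env: Record<string, string | undefined>): string {
  return env.POSTHOG_HOST || DEFAULT_POSTHOG_HOST
}

// Builds the connect-src list the same way the diff does for dev and prod.
function buildConnectSrc(inDev: boolean, posthogHost: string): string[] {
  return inDev
    ? [`'self'`, '@viteUrl', 'ws://localhost:*', posthogHost]
    : [`'self'`, posthogHost]
}
```

Whichever form is used, the host placed in the CSP has to match the host the PostHog client actually talks to, otherwise the browser blocks the analytics requests.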

+1 -1

docker-compose.yml

···
  QUEUE_DRIVER: database
  QUEUE_CONCURRENCY: ${QUEUE_CONCURRENCY:-5}
  POSTHOG_API_KEY: ${POSTHOG_API_KEY:-}
- POSTHOG_HOST: ${POSTHOG_HOST:-https://ph.btao.org}
+ POSTHOG_HOST: ${POSTHOG_HOST:-https://us.i.posthog.com}
  OTEL_EXPORTER_OTLP_ENDPOINT: ${OTEL_EXPORTER_OTLP_ENDPOINT}
  BACKFILL_MAX_POSTS: ${BACKFILL_MAX_POSTS:-10000}
  OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS}

-185

docs/superpowers/specs/2026-04-11-backfill-sse-progress-design.md

···
- # Backfill Progress via SSE — Design
-
- ## Problem
-
- The loading page (`resources/views/pages/profile/loading.edge`) shown while a
- user's first-time backfill runs uses a 2-second `<meta http-equiv="refresh">`.
- There is no progress indication — users see a spinner-less message claiming
- "about 10 seconds" while a 100K-post backfill can take much longer.
-
- We want a real progress bar on the loading page, driven by Server-Sent Events
- per AdonisJS's SSE support
- (https://docs.adonisjs.com/guides/digging-deeper/server-sent-events#overview).
-
- ## Scope
-
- - Add a progress bar to `loading.edge` driven by SSE.
- - Emit `fetched_posts / total_posts` and terminal `done` / `failed` events.
- - Populate `backfill_jobs.total_posts` from the user's Bluesky profile
-   `postsCount` at dispatch time.
- - Keep the existing `<meta refresh>` as a fallback for environments where SSE
-   is blocked.
-
- Out of scope: pub/sub, websockets, any change to the Jetstream consumer, any
- change to existing backfill job semantics beyond using a real denominator.
-
- ## Design
-
- ### Total-posts denominator
-
- We don't currently know a user's post count up-front, so the progress bar has
- no denominator. Fix: fetch the user's Bluesky profile once at dispatch time.
-
- - Add `getProfile(did)` to `AtprotoClient` in `app/lib/atproto/`. Calls
-   `app.bsky.actor.getProfile` on the AppView. Returns at least `postsCount`.
- - In `ProfileController.#ensureBackfillStarted`, after resolving DID and
-   before inserting the `BackfillJobRow`, call `getProfile` and pass
-   `postsCount` as `totalPosts` into `BackfillJobRow.create`.
- - Effective denominator displayed to the user is
-   `Math.min(totalPosts, BACKFILL_MAX_POSTS)` — if a user has 400K posts we'll
-   still cap at 100K and the bar will fill to 100% at the cap.
- - If `getProfile` fails, the dispatcher propagates the error via the same
-   paths that already handle `HandleResolver` failures:
-   `BlueskyRateLimitedError` → 503 error page, `HandleNotFoundError` → 404
-   error page. The BackfillJob row is only ever created with a concrete
-   `totalPosts`, so the progress bar always has a real denominator.
-
- ### SSE endpoint
-
- New route: `GET /profile/:handle/backfill/stream`, handled by a new controller
- method (`ProfileController.backfillStream` or a dedicated
- `BackfillStreamController` — controller file gets chosen during implementation
- based on what keeps `profile_controller.ts` small).
-
- Handler flow:
-
- 1. Canonicalize the handle (reuse `HandleResolver.normalize`).
- 2. Look up `User` by handle → DID. If missing, respond 404.
- 3. Look up `BackfillJobRow` by DID. If missing, respond 404 (client falls
-    back to meta-refresh reload).
- 4. Open the SSE stream by writing raw headers directly to the underlying
-    Node response (`response.response`). `@adonisjs/transmit` is deliberately
-    NOT used — it's channel-based pub/sub with an in-memory store that
-    doesn't bridge across the web and queue processes without adding Redis.
-    Raw SSE keeps the infra surface unchanged.
-    Headers: `Content-Type: text/event-stream`,
-    `Cache-Control: no-cache, no-transform`, `Connection: keep-alive`,
-    `X-Accel-Buffering: no`. Call `response.response.flushHeaders()` to
-    flush, then write events using the
-    `event: <name>\ndata: <json>\n\n` framing.
- 5. Emit one `progress` event immediately with the current row state (so
-    reconnects feel instant).
- 6. Enter a polling loop: every ~500ms re-read the row. Emit a new `progress`
-    event only when `fetched_posts` or `state` changed since the last emit.
- 7. When `state` transitions to `done` or `failed`, emit a terminal event of
-    that name with the final payload and close the stream.
- 8. On client disconnect (request close), exit the loop and release the
-    connection.
-
- Event payloads (all JSON):
-
- ```
- event: progress
- data: {"fetched": 1234, "total": 8500, "state": "running"}
- ```
-
- ```
- event: done
- data: {"fetched": 8500, "total": 8500}
- ```
-
- ```
- event: failed
- data: {"error": "<message>"}
- ```
-
- ### Polling vs. pub/sub
-
- The backfill job already persists `fetched_posts` to SQLite every 25-URI
- batch. Rather than introducing an in-memory pub/sub (which doesn't work
- cross-process anyway — web and queue run in separate processes), the SSE
- handler polls the same SQLite row. SQLite reads are cheap, one client per
- new-user lookup is low volume, and a 500ms poll interval is imperceptible to
- users.
-
- ### Reconnection semantics
-
- `EventSource` auto-reconnects on network drops. Our handler is stateless per
- connection — on reconnect it reads the current row and emits one immediate
- event, so the client recovers without `Last-Event-ID` or any server-side
- session state. The job's `fetched_posts` only grows monotonically and
- terminal states are sticky, so repeated emits are idempotent.
-
- ### Loading page
-
- Changes to `resources/views/pages/profile/loading.edge`:
-
- - Remove the 2-second `<meta refresh>`, replace with a 15-second fallback
-   meta-refresh so SSE-blocked users still eventually land on the profile.
- - Render a `<progress>` element and a live counter line:
-   "Indexed `<fetched>` of `<total>` posts".
- - Wrap the block in an Alpine component `x-data="backfillProgress(...)"`
-   initialized with `{ handle, total }` from the Edge template.
-
- Alpine component lives in `resources/js/app.js` alongside the existing
- `alert` component:
-
- ```js
- Alpine.data('backfillProgress', function (initial) {
-   return {
-     handle: initial.handle,
-     fetched: 0,
-     total: initial.total, // may be null
-     state: 'running',
-     error: null,
-     init() {
-       const es = new EventSource(`/profile/${this.handle}/backfill/stream`)
-       es.addEventListener('progress', (e) => {
-         const d = JSON.parse(e.data)
-         this.fetched = d.fetched
-         this.total = d.total
-       })
-       es.addEventListener('done', () => {
-         es.close()
-         window.location.href = `/profile/${this.handle}/likes`
-       })
-       es.addEventListener('failed', (e) => {
-         this.state = 'failed'
-         this.error = JSON.parse(e.data).error
-         es.close()
-       })
-     },
-   }
- })
- ```
-
- ### Controller data passed to the view
-
- `#show` currently calls `view.render('pages/profile/loading', { handle })`.
- Extend it to also pass `totalPosts` (read from the row it just created /
- found during `#ensureBackfillStarted`) so the initial render has a
- denominator before SSE connects.
-
- ## Testing
-
- - Unit: `AtprotoClient.getProfile` — mock HTTP, asserts `postsCount` parsed.
- - Unit: `#ensureBackfillStarted` — populates `totalPosts` from profile;
-   propagates `BlueskyRateLimitedError` / `HandleNotFoundError` from
-   `getProfile` without creating the BackfillJob row.
- - Functional: `GET /profile/:handle/backfill/stream` — creates a BackfillJob
-   row in each state (`running` with progress increments, `done`, `failed`,
-   missing), asserts the right events are emitted and the stream closes on
-   terminal state.
- - Functional: loading page renders the Alpine component with the expected
-   initial `total` value.
-
- No changes to existing ClickHouse tests; no changes to backfill job
- semantics.
-
- ## Non-goals / deliberate omissions
-
- - No pub/sub or LISTEN/NOTIFY — SQLite polling is sufficient.
- - No progress emitted more often than once per 25-URI batch (the job only
-   writes that often). Finer granularity would require changing the job.
- - No changes to `BACKFILL_MAX_POSTS` behavior.
- - No retry UI on `failed` — existing error page flow handles that on reload.
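The removed spec describes three mechanics that are easy to get subtly wrong: the `event: <name>\ndata: <json>\n\n` framing, the capped denominator, and the "emit only on change" polling rule. A minimal sketch of those pieces follows. `JobRow`, `formatSseEvent`, `effectiveTotal` and `nextFrame` are hypothetical names, and applying the cap on the server side is an assumption; the spec only says the capped value is what the user sees. Unlike step 5 of the handler flow, this sketch returns the terminal frame directly instead of emitting a `progress` event first.

```typescript
type BackfillState = 'running' | 'done' | 'failed'

// Assumed row shape for illustration; field names are not from the repo.
interface JobRow {
  fetchedPosts: number
  totalPosts: number
  state: BackfillState
  error?: string
}

// One SSE frame: event name, JSON payload, blank-line terminator.
function formatSseEvent(name: string, payload: unknown): string {
  return `event: ${name}\ndata: ${JSON.stringify(payload)}\n\n`
}

// Denominator shown to the user: never larger than the backfill cap.
function effectiveTotal(totalPosts: number, maxPosts: number): number {
  return Math.min(totalPosts, maxPosts)
}

// Decides what (if anything) to write after one poll of the row.
function nextFrame(previous: JobRow | null, current: JobRow, maxPosts: number): string | null {
  const total = effectiveTotal(current.totalPosts, maxPosts)
  if (current.state === 'done') {
    return formatSseEvent('done', { fetched: current.fetchedPosts, total })
  }
  if (current.state === 'failed') {
    return formatSseEvent('failed', { error: current.error ?? 'unknown error' })
  }
  // Skip the write when neither the counter nor the state moved.
  const unchanged =
    previous !== null &&
    previous.fetchedPosts === current.fetchedPosts &&
    previous.state === current.state
  if (unchanged) return null
  return formatSseEvent('progress', { fetched: current.fetchedPosts, total, state: current.state })
}
```

Passing `null` as `previous` on a fresh connection always yields a frame, which is what lets a reconnecting `EventSource` recover without `Last-Event-ID`.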

-198

docs/superpowers/specs/2026-04-11-post-embeds-design.md

···
- # Post embeds (images, video, external link) — design
-
- **Status:** Draft
- **Date:** 2026-04-11
-
- ## Problem
-
- The profile page renders post text only. Posts that consist of an image, a video, or a link card show up as empty or misleading tiles — e.g. `https://bsky.app/profile/did:plc:vc7f4oafdgxsihk4cry2xpze/post/3lc52lahzgc24` is an image post with no text, which currently renders as a blank card.
-
- We need to display images (with alt text), video thumbnails, and external link cards alongside post text.
-
- ## Scope
-
- In scope (rendered):
-
- - `app.bsky.embed.images` (1–4 images)
- - `app.bsky.embed.video` (single video, as thumbnail + link-out; no inline playback)
- - `app.bsky.embed.external` (link card)
-
- **Excluded at ingest time** (post is not stored in `post_snapshots` at all):
-
- - `app.bsky.embed.record` (pure quote post)
- - `app.bsky.embed.recordWithMedia` (quote + media)
-
- Rationale: quoted posts can be deleted after we cache them, and Bluesky expects clients to respect deletion. Propagating deletes (either by scanning `embed_json` on every `post-delete` or by hydrating at render time) is too expensive for the benefit, so we simply don't track posts whose primary purpose is to quote another post. Quote posts with media get dropped along with non-media quote posts — the quote context is lost anyway, so rendering the media half alone is misleading.
-
- Also out of scope:
-
- - Inline HLS video playback
- - GIFs/Tenor treated as anything special — they arrive as `external` embeds and render as link cards like any other
-
- ## Architecture
-
- ### Data shape
-
- New tagged union in `app/lib/atproto/types.ts`:
-
- ```ts
- export type PostEmbed = ImagesEmbed | VideoEmbed | ExternalEmbed
-
- export interface ImagesEmbed {
-   type: 'images'
-   items: Array<{
-     thumb: string
-     fullsize: string
-     alt: string
-     aspectRatio?: { width: number; height: number }
-   }>
- }
-
- export interface VideoEmbed {
-   type: 'video'
-   thumbnail: string
-   alt: string
-   aspectRatio?: { width: number; height: number }
- }
-
- export interface ExternalEmbed {
-   type: 'external'
-   uri: string
-   title: string
-   description: string
-   thumb: string | null
- }
- ```
-
- `PostSnapshot` gains `embed: PostEmbed | null`.
-
- ### ClickHouse schema
-
- New column on `post_snapshots`:
-
- ```sql
- ALTER TABLE post_snapshots ADD COLUMN embed_json String DEFAULT ''
- ```
-
- Migration file: `database/clickhouse/NNN_add_embed_to_post_snapshots.sql` (next sequential number). User will drop and re-run migrations locally — no deployed data to migrate.
-
- Empty string (`''`) is the canonical "no embed" value; the store maps it to `null` on read. Stored JSON is produced with `JSON.stringify(embed)` and parsed with `JSON.parse` on read.
-
- ### CDN URL construction (jetstream path)
-
- Jetstream records contain raw blob CIDs, not hydrated URLs. The jetstream parser constructs URLs from `(did, cid)` pairs using these formats (verified empirically against `public.api.bsky.app` responses on 2026-04-11):
-
- - **Image thumb:** `https://cdn.bsky.app/img/feed_thumbnail/plain/{did}/{cid}`
- - **Image fullsize:** `https://cdn.bsky.app/img/feed_fullsize/plain/{did}/{cid}`
- - **External thumb:** same as image thumb (pulled from `external.thumb` blob on the post author's DID)
- - **Video thumbnail:** `https://video.bsky.app/watch/{url-encoded-did}/{cid}/thumbnail.jpg`
-   - The DID is URL-encoded (e.g. `did%3Aplc%3Az72i7hdynmk6r22z27h6tvur`) via `encodeURIComponent`, not raw
-
- The backfill path (getAuthorFeed) never constructs these — the AppView returns fully hydrated URLs in the `#view` union, and the parser reads them directly.
-
- ### Quote-post filtering
-
- Both ingest paths detect `app.bsky.embed.record` and `app.bsky.embed.recordWithMedia` and signal "skip this post entirely". The snapshot is never created. Backfill drops it from the returned array; jetstream consumer doesn't buffer it. Deleted-quote propagation simply doesn't exist as a problem because we never store anything referencing the quoted post.
-
- ## Components
-
- ### 1. `app/lib/atproto/parsers/get_author_feed.ts`
-
- Extend the existing parser. Per feed item:
-
- 1. If `post.embed.$type` is `app.bsky.embed.record#view` or `app.bsky.embed.recordWithMedia#view` → **skip this feed item** (don't push a snapshot).
- 2. Otherwise, parse `post.embed` (the hydrated `#view` union) into a `PostEmbed`:
-    - `app.bsky.embed.images#view` → `ImagesEmbed` (uses `thumb`, `fullsize`, `alt`, `aspectRatio` straight from the response)
-    - `app.bsky.embed.video#view` → `VideoEmbed` (uses `thumbnail`, `alt`, `aspectRatio`; `playlist` ignored)
-    - `app.bsky.embed.external#view` → `ExternalEmbed` (reads `external.{uri,title,description,thumb}`; thumb is already a full URL)
-    - No embed → `embed: null`
- 3. Malformed but recognized embed → `embed: null` and log. Do not throw — other snapshots in the same batch must still be inserted.
-
- ### 2. `app/lib/atproto/parsers/jetstream.ts`
-
- New exported function:
-
- ```ts
- export function parsePostEmbed(
-   record: unknown,
-   authorDid: string
- ): { skip: true } | { skip: false; embed: PostEmbed | null }
- ```
-
- Called by `jetstream_consumer.ts` **only** when `isTrackedAuthor === true`, so we don't waste CPU parsing embeds on the firehose for untracked posts.
-
- - `record.embed.$type === 'app.bsky.embed.record'` or `'app.bsky.embed.recordWithMedia'` → `{ skip: true }`
- - `app.bsky.embed.images` → `ImagesEmbed` built from blob CIDs + the author's DID
- - `app.bsky.embed.video` → `VideoEmbed` built from the blob CID + URL-encoded DID
- - `app.bsky.embed.external` → `ExternalEmbed` built from the blob CID (if any) + URL-encoded DID for the thumb
- - No embed → `{ skip: false, embed: null }`
- - Malformed → `{ skip: false, embed: null }` and log. Do not throw.
-
- ### 3. `app/services/jetstream_consumer.ts`
-
- In `handlePostEvent`, modify Part A (the snapshot-insert branch gated on `isTrackedAuthor`):
-
- 1. Call `parsePostEmbed(record, authorDid)`.
- 2. If the result is `{ skip: true }` → do NOT push a snapshot and do NOT advance the cursor. This matches the existing pattern where `advancePendingCursor` is only called when something actually gets buffered; fully untracked events follow the same pattern today. Fall through to Part B — if the skipped post quotes a tracked user, Part B will still fire and advance the cursor on its own.
- 3. Otherwise, push a snapshot with `embed` attached and advance the cursor, exactly as today.
-
- **Part B (quote engagement detection) is untouched.** It already applies to ALL post events regardless of whether the post's author is tracked, and it writes to `engagement_events`, not `post_snapshots`. A post that we skip in Part A (because the tracked author is quoting someone) can still trigger a quote event in Part B if the quoted author is *also* tracked. These are orthogonal.
-
- No changes to `flushBuffer`. No `getPosts` hydration call.
-
- ### 4. `app/lib/clickhouse/store.ts`
-
- - `insertPostSnapshots`: add `embed_json: s.embed ? JSON.stringify(s.embed) : ''` to the value row.
- - `getTopPosts`: add `s.embed_json` to `SELECT_COLUMNS` and `GROUP BY`. Map to `embed: row.embed_json ? JSON.parse(row.embed_json) as PostEmbed : null` in the result.
- - `tombstonePost` / `tombstoneUserSnapshots`: `embed_json: ''` for tombstone rows; `tombstoneUserSnapshots` INSERT SELECT passes `''` in the positional column list.
- - `TopPostsResult` type in `app/lib/clickhouse/types.ts` gains `embed: PostEmbed | null`.
-
- ### 5. `app/controllers/profile_controller.ts`
-
- Pass `embed` through in the `postsWithUrl` mapping — just `embed: p.embed`. No HTML escaping needed; Edge escapes string interpolation by default, and URLs/text fields are safely interpolated through `{{ }}`.
-
- ### 6. `resources/views/pages/profile/show.edge`
-
- Below the existing `<p>{{{ post.postTextSafe }}}</p>`, add a conditional block scoped to `post.embed`:
-
- ```edge
- @if(post.embed)
-   @if(post.embed.type === 'images')
-     {{-- Grid: 1 = single, 2 = two cols, 3 = 1 big + 2 small, 4 = 2x2 --}}
-   @elseif(post.embed.type === 'video')
-     {{-- Thumbnail + ▶ overlay, wrapped in <a> to post.bskyUrl --}}
-   @elseif(post.embed.type === 'external')
-     {{-- Bordered card: thumb + title + description, <a> to uri --}}
-   @endif
- @endif
- ```
-
- - **Image rendering:** `<img src="{{ img.thumb }}" alt="{{ img.alt }}" loading="lazy">` wrapped in `<a href="{{ img.fullsize }}" target="_blank" rel="noopener">`. Aspect ratio applied via inline `style` when known. **Alt text is only in the `alt` attribute — no visible figcaption.**
- - **Video rendering:** `<img src="{{ video.thumbnail }}" alt="{{ video.alt }}">` wrapped in `<a href="{{ post.bskyUrl }}">` with a `▶` CSS-positioned overlay badge.
- - **External rendering:** Flex card with optional thumb on the left, title (bold) + description (muted) on the right, wrapped in `<a href="{{ post.embed.uri }}" target="_blank" rel="noopener">`.
-
- All three branches live inline in the template — no new Edge components. Styles follow the existing inline-style pattern on `show.edge`.
-
- ## Testing
-
- - **Parser unit tests**
-   - `tests/unit/atproto/get_author_feed_parser.spec.ts`: extend with fixtures for `images`, `video`, `external`, `record` (must be skipped), `recordWithMedia` (must be skipped), and no-embed. Assert output shape and that malformed recognized embeds produce `embed: null` without throwing.
-   - New `tests/unit/atproto/jetstream_embed_parser.spec.ts`: raw-record `images`, `video`, `external`, `record` (skip), `recordWithMedia` (skip), no-embed, malformed. Assert CID→URL construction including DID URL-encoding for the video path.
- - **ClickHouse round-trip**
-   - Extend `tests/unit/clickhouse_store.spec.ts` with cases that insert a snapshot carrying each of `images`/`video`/`external` + null, read back via `getTopPosts`, assert structural equality. Cover the tombstone path (`embed_json = ''` survives the merge).
- - **Jetstream consumer**
-   - Extend `tests/unit/jetstream_consumer.spec.ts`:
-     - Tracked-author post with an `images` embed → snapshot has `embed` populated.
-     - Tracked-author post with a `record` embed → **no snapshot buffered**; cursor still advances.
-     - Tracked-author post with a `recordWithMedia` embed → **no snapshot buffered**; cursor still advances.
- - **No template tests.** Manual browser check against `/profile/jcsalterego.bsky.social/likes` plus a handful of profiles chosen to exercise each embed type.
-
- ## Open questions
-
- None.
-
- ## Non-goals / deferred
-
- - Dev-ergonomic drop-and-rebackfill command — user handles this manually for this change.
- - Re-rendering embeds on snapshot refresh (snapshots are take-once at backfill/jetstream time; if an image is replaced on the author's PDS, we keep the old URL — acceptable).
- - Any future support for quote posts (would require a deletion-propagation strategy we're explicitly not building).
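The removed spec's "CDN URL construction" and "ClickHouse schema" sections boil down to a few string templates and an empty-string convention. The sketch below follows the URL formats listed under "CDN URL construction (jetstream path)" and the `''`-means-no-embed rule. The function names are hypothetical, and the CID in the usage note is a placeholder rather than a real blob CID.

```typescript
// Image URLs use the raw DID, per the spec's format list.
function imageThumbUrl(did: string, cid: string): string {
  return `https://cdn.bsky.app/img/feed_thumbnail/plain/${did}/${cid}`
}

function imageFullsizeUrl(did: string, cid: string): string {
  return `https://cdn.bsky.app/img/feed_fullsize/plain/${did}/${cid}`
}

// The video thumbnail path URL-encodes the DID (":" becomes "%3A").
function videoThumbnailUrl(did: string, cid: string): string {
  return `https://video.bsky.app/watch/${encodeURIComponent(did)}/${cid}/thumbnail.jpg`
}

// '' is the stored "no embed" value; anything else is JSON.
function serializeEmbed(embed: object | null): string {
  return embed ? JSON.stringify(embed) : ''
}

function deserializeEmbed<T>(embedJson: string): T | null {
  return embedJson ? (JSON.parse(embedJson) as T) : null
}
```

For example, `videoThumbnailUrl('did:plc:z72i7hdynmk6r22z27h6tvur', 'bafkreiexample')` produces a path beginning `https://video.bsky.app/watch/did%3Aplc%3Az72i7hdynmk6r22z27h6tvur/`, matching the encoded form quoted in the spec.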

-903

docs/superpowers/specs/2026-04-11-skystar-bluesky-design.md

··· 1 - # favs.blue — Design Spec 2 - 3 - **Date:** 2026-04-11 4 - **Status:** Approved, ready for implementation planning 5 - **One-line:** A favstar.fm-style site for Bluesky — type a handle, see that user's most-liked and most-reposted posts. 6 - 7 - --- 8 - 9 - ## 1. Goal and scope 10 - 11 - favs.blue lets a visitor type any Bluesky handle and see that user's 12 - top-25 most-liked or most-reposted posts, filterable by time window. The 13 - experience is modeled on the original favstar.fm for Twitter (2009-2018): 14 - unauthenticated, instant on repeat visits, focused on a single greatest-hits 15 - view per user. 16 - 17 - ### v1 scope 18 - 19 - - Per-user profile page with the top-25 posts ranked by likes or reposts. 20 - - Two time lenses: all-time (default) and "last month". 21 - - Handle resolution with canonical-URL redirects (handle changes are healed 22 - transparently). 23 - - Synchronous backfill on first lookup, capped at 10,000 most recent posts. 24 - - Live ingest of engagement events for tracked users from the Bluesky 25 - Jetstream firehose. 26 - - Honoring of post deletions, account deletions, and account takedowns. 27 - 28 - ### Explicitly out of scope for v1 29 - 30 - - Authentication, accounts, or any login flow. 31 - - Quote-post tracking in the UI. (Quote data **is** captured in the data 32 - layer from v1 onward — both `snapshot_quotes` from `getPosts` and live 33 - `kind='quote'` events from Jetstream — but no v1 route, query, or UI 34 - element exposes it.) 35 - - Global leaderboards ("today's best", "this week's best"). 36 - - Per-user trophies, awards, or notifications. 37 - - Email, RSS, or any push surface. 38 - - Pagination beyond top-25. 39 - - Backfill of historical individual like records (only aggregate counts at 40 - backfill time, plus live deltas thereafter). 41 - - Reconciliation of count drift after backfill. 42 - - Tracking of unlike / unrepost events (counts may drift up slightly over 43 - time; accepted). 
44 - - Smoke / E2E / load / browser tests. 45 - - Metrics, APM, error tracking, log shipping. 46 - 47 - --- 48 - 49 - ## 2. Constraints and key research findings 50 - 51 - The Bluesky AppView (`api.bsky.app` / `public.api.bsky.app`) does **not** 52 - expose any endpoint that returns a user's posts sorted by like count or 53 - repost count. `app.bsky.feed.searchPosts?sort=top` exists but uses an opaque 54 - relevance ranking, not raw counts, and pagination is not guaranteed to 55 - completion. There is no third-party public API that exposes per-user 56 - engagement leaderboards either. 57 - 58 - Therefore: **we must build our own index.** The two cheap data sources are: 59 - 60 - 1. **`getPosts`** — hydrates up to 25 post URIs per call, returning aggregate 61 - `likeCount` and `repostCount`. This is how we get the historical baseline 62 - for a user's posts at backfill time. 63 - 2. **Jetstream** (`wss://jetstream2.us-east.bsky.network/subscribe`) — JSON- 64 - over-WebSocket translation of the firehose, supports server-side 65 - filtering by `wantedCollections`, and emits `app.bsky.feed.like`, 66 - `app.bsky.feed.repost`, and `app.bsky.feed.post` create/delete events 67 - that we use for live deltas. 68 - 69 - Jetstream's replay window is short (days, not years) and there is no public 70 - API that returns historical individual like records cheaply, so we cannot 71 - reconstruct full per-user engagement history. The v1 design accepts this 72 - and uses a snapshot baseline + live delta model. 73 - 74 - DIDs (`did:plc:...`) are stable across PDS migrations and handle changes, 75 - so they are the only identifier the system tracks internally. Handles are 76 - mutable display labels. 77 - 78 - Honoring deletions is non-negotiable: Bluesky's ToS expects appviews and 79 - clients to stop serving deleted records. The worker handles delete events 80 - on `app.bsky.feed.post` and `kind: "account"` events with `status: "deleted" 81 - | "takendown"`. 
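Two constraints from this section lend themselves to a small sketch: the 25-URI ceiling on `getPosts`, and the account events that must trigger deletion handling. The helpers below are illustrative only; `chunkUris`, `requiresPurge`, and the `JetstreamAccountEvent` shape are assumed names and fields, loosely following the `kind: "account"` / `status` description above rather than any code in the repo.

```typescript
// Hypothetical helpers; names and the event shape are assumptions for illustration.

/** Split post URIs into batches no larger than the getPosts limit (25 per call). */
function chunkUris(uris: readonly string[], size = 25): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let i = 0; i < uris.length; i += size) {
    batches.push(uris.slice(i, i + size));
  }
  return batches;
}

/** Assumed minimal shape of a Jetstream account event. */
interface JetstreamAccountEvent {
  did: string;
  kind: "account";
  account: { active: boolean; status?: string };
}

/** True when the event means this DID's posts must no longer be served. */
function requiresPurge(event: JetstreamAccountEvent): boolean {
  const status = event.account.status;
  return status === "deleted" || status === "takendown";
}
```

Under these assumptions, a 10,000-post backfill works out to 400 `getPosts` calls, which is why the per-call batch size matters for rate limiting.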
82 - 83 - --- 84 - 85 - ## 3. Stack 86 - 87 - - **Language and framework:** TypeScript on Node.js 24, AdonisJS v7 88 - (the current scaffolder produces v7; the spec originally said v6 based 89 - on stale knowledge — v7 is functionally equivalent for our needs). 90 - - **Templating:** Edge (Adonis's first-party SSR template engine). 91 - - **Read-side ORM:** Lucid (Adonis's first-party Knex-based ORM), used only 92 - for SQLite. 93 - - **Engagement store:** ClickHouse, accessed via `@clickhouse/client`. No ORM. 94 - - **Metadata store:** SQLite (file-backed, WAL mode), via Lucid. 95 - - **Jetstream worker:** Adonis Ace command (`node ace jetstream:consume`) 96 - with `staysAlive = true`. Same project, same image, same code as the web 97 - app — different entrypoint. 98 - - **Background jobs:** `@adonisjs/queue` (the official AdonisJS queue 99 - package) with the **database adapter on SQLite**. Used for backfill jobs 100 - only. Run via `node ace queue:work` as a third process. 101 - - **Atproto client:** `@atproto/api` (the official TypeScript SDK), with rate-limit handling and parsing in `app/lib/atproto/`. 102 - - **Containerization:** Single `Dockerfile` (Node 24 alpine multi-stage with 103 - tini), single `docker-compose.yml` with four services: `clickhouse`, 104 - `web`, `jetstream-worker`, `queue-worker`. 105 - 106 - The web, jetstream worker, and queue worker are deliberately separate 107 - processes that share code but not memory. They communicate only through 108 - ClickHouse and SQLite. This isolates failure domains and allows independent 109 - restarts: a crash in the queue worker does not interrupt live engagement 110 - ingest, and a crash in the jetstream worker does not interrupt running 111 - backfills. 112 - 113 - ### Why ClickHouse, not Postgres 114 - 115 - The hot query is "for one author, give me the top-25 posts by sum of 116 - engagement events." This is the canonical columnar OLAP workload. 
ClickHouse 117 - handles 1B+ event rows on a single small box at single-digit-millisecond 118 - query time, with compression ratios of ~10-15× on like-event data (DIDs and 119 - URI prefixes repeat heavily). The append-only model also removes a class of 120 - write-contention bugs we'd otherwise hit on Postgres counter UPDATEs. 121 - 122 - SQLite holds the ~kilobyte-scale relational state (users, jetstream cursor, 123 - backfill jobs) where ClickHouse would be wrong. Splitting them means we can 124 - rebuild ClickHouse from Jetstream + re-backfill without losing operational 125 - state, and vice versa. 126 - 127 - ### Why AdonisJS, not Hono / Nuxt 128 - 129 - The site is small now but expected to grow auth, possibly accounts, and 130 - other features later. Adonis's batteries-included structure (controllers, 131 - sessions, validators, mailer, Ace commands) pays for itself the first time 132 - we add a feature that would otherwise need 5 separate libraries glued 133 - together. At 100 req/sec peak, the framework overhead is invisible relative 134 - to the database round-trip. The `staysAlive` Ace command pattern is also a 135 - particularly clean fit for hosting the Jetstream worker in the same project. 136 - 137 - --- 138 - 139 - ## 4. 
Architecture 140 - 141 - ``` 142 - ┌─────────────────────────────────────────────────────────────────┐ 143 - │ Hetzner-class VPS (4 vCPU, 8 GB, 160 GB NVMe) │ 144 - │ │ 145 - │ ┌──────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ 146 - │ │ web │ │ jetstream-worker │ │ queue-worker │ │ 147 - │ │ Adonis HTTP │ │ node ace │ │ node ace │ │ 148 - │ │ + Edge SSR │ │ jetstream: │ │ queue:work │ │ 149 - │ │ + Lucid │ │ consume │ │ (runs │ │ 150 - │ │ │ │ (staysAlive) │ │ BackfillJob) │ │ 151 - │ └──────┬───────┘ └────────┬─────────┘ └────────┬─────────┘ │ 152 - │ │ │ │ │ 153 - │ │ reads │ writes │ reads+writes│ 154 - │ ▼ ▼ ▼ │ 155 - │ (via app/lib/clickhouse/) │ 156 - │ ▼ │ 157 - │ ┌──────────────────┐ ┌──────────────────────┐ │ 158 - │ │ ClickHouse │ │ SQLite (Lucid) │ │ 159 - │ │ engagement │ │ metadata: │ │ 160 - │ │ events + │ │ - users │ │ 161 - │ │ post snapshots │ │ - cursor checkpoint │ │ 162 - │ │ (append-only) │ │ - backfill_jobs │ │ 163 - │ │ │ │ - adonis_jobs │ │ 164 - │ │ │ │ (queue internal) │ │ 165 - │ └──────────────────┘ └──────────────────────┘ │ 166 - └─────────────────────────────────────────────────────────────────┘ 167 - ▲ ▲ 168 - │ wss:// │ HTTPS 169 - │ │ 170 - Bluesky Jetstream Bluesky AppView API 171 - (jetstream2.us-east...) 
(public.api.bsky.app) 172 - ``` 173 - 174 - ### Repository layout 175 - 176 - ``` 177 - favs.blue/ 178 - ├── app/ 179 - │ ├── controllers/ # ProfileController, SearchController 180 - │ ├── jobs/ # BackfillJob (@adonisjs/queue Job) 181 - │ ├── lib/ # atproto/ and clickhouse/ helpers 182 - │ ├── models/ # User, BackfillJob row (Lucid) 183 - │ └── services/ # HandleResolver, JetstreamConsumer 184 - ├── commands/ 185 - │ └── jetstream_consume.ts # Ace staysAlive worker 186 - ├── database/ 187 - │ ├── migrations/ # SQLite migrations (Adonis) 188 - │ └── clickhouse/ # ordered .sql files for CH migrations 189 - ├── resources/views/ # Edge templates 190 - ├── docker-compose.yml 191 - └── Dockerfile 192 - ``` 193 - 194 - There is one published image. 195 - 196 - ### Key invariants 197 - 198 - 1. The web and worker processes never communicate directly. All coordination 199 - is via SQLite and ClickHouse. 200 - 2. The `app/lib/clickhouse/` directory is the only place that knows 201 - ClickHouse SQL. Every controller, job, and service imports from `#lib/clickhouse` 202 - — there's no inline SQL anywhere else. 203 - 3. Every per-request operation is a constant number of SQL/ClickHouse 204 - queries, regardless of how many users are tracked. 205 - 206 - --- 207 - 208 - ## 5. Data model 209 - 210 - ### ClickHouse — engagement data 211 - 212 - Two tables. The reconciliation between historical snapshots (from 213 - `getPosts`) and live event deltas (from Jetstream) is handled by a 214 - **per-post watermark**, not a per-user one — this is the only correctness- 215 - critical detail in the data layer. 216 - 217 - #### `post_snapshots` 218 - 219 - One row per post we have ever seen, capturing what the AppView reported at 220 - backfill time. 
221 - 222 - ```sql 223 - CREATE TABLE IF NOT EXISTS post_snapshots ( 224 - post_uri String, 225 - post_author_did LowCardinality(String), 226 - post_text String, 227 - post_created_at DateTime64(6), 228 - snapshot_likes UInt32, 229 - snapshot_reposts UInt32, 230 - snapshot_quotes UInt32, -- populated from getPosts.quoteCount 231 - snapshot_taken_at DateTime64(6), -- per-post watermark 232 - is_deleted UInt8 DEFAULT 0, 233 - INDEX idx_created post_created_at TYPE minmax GRANULARITY 4 234 - ) ENGINE = ReplacingMergeTree(snapshot_taken_at, is_deleted) 235 - ORDER BY (post_author_did, post_uri) 236 - PARTITION BY tuple(); 237 - ``` 238 - 239 - `ReplacingMergeTree` collapses rows with the same ORDER BY tuple 240 - `(post_author_did, post_uri)` to the latest version (largest 241 - `snapshot_taken_at`). The two-argument engine form enables native deletion 242 - semantics: a tombstone row written with `is_deleted=1` causes ClickHouse to 243 - physically remove the row on the next merge, and `FINAL` queries skip 244 - deleted rows at read time. `is_deleted` is **not** part of the order key — 245 - if it were, tombstones would have a different key tuple from the originals 246 - they replace and the collapse would never happen. 247 - 248 - `LowCardinality(String)` on `post_author_did` turns every column read into 249 - a dictionary lookup; tens of thousands of distinct authors is well within 250 - the LowCardinality sweet spot. 251 - 252 - The `idx_created` minmax skip index lets the optional `?days=` filter 253 - prune granules without putting a microsecond timestamp in the sort key 254 - (which would destroy the sparse primary index). 255 - 256 - `PARTITION BY tuple()` (i.e. no partitioning) is deliberate. Backfill 257 - inserts ~10,000 posts per user spanning years of history in batches of 25; 258 - monthly partitioning would fan each batch across many partitions and 259 - trigger the "too many parts" failure mode. 
At our total data size (~50 GB 260 - lifetime) and with no query that filters by date alone, partitioning buys 261 - us nothing. 262 - 263 - #### `engagement_events` 264 - 265 - Append-only log of every like/repost we have seen via Jetstream **for 266 - tracked users only**. The worker filters out engagement events targeting 267 - untracked authors before insert. 268 - 269 - ```sql 270 - CREATE TABLE IF NOT EXISTS engagement_events ( 271 - post_uri String, 272 - post_author_did LowCardinality(String), -- denormalized for fast author filter 273 - actor_did String, 274 - rkey String, -- the engagement record's rkey 275 - kind LowCardinality(String), -- 'like' | 'repost' | 'quote' 276 - event_created_at DateTime64(6), -- when the actor created the engagement 277 - ingested_at DateTime64(6) DEFAULT now64(6) 278 - ) ENGINE = ReplacingMergeTree(ingested_at) 279 - ORDER BY (post_author_did, kind, post_uri, actor_did, rkey) 280 - PARTITION BY tuple(); 281 - ``` 282 - 283 - `ReplacingMergeTree` keyed on the natural unique identifier 284 - `(post_author_did, kind, post_uri, actor_did, rkey)` makes inserts 285 - idempotent: replaying the same event on Jetstream reconnect collides with 286 - itself and collapses on merge. We do **not** use `FINAL` on 287 - `engagement_events` — `countIf` over briefly-duplicated rows produces a 288 - slight transient overcount, which is within the accepted error budget (see 289 - §10). This is different from `post_snapshots`, where duplicate rows would 290 - cause GROUP BY bucket splitting that returns the same post twice; on 291 - `post_snapshots` we do use `FINAL`. 292 - 293 - The order key also makes "top posts for one author, one kind" a sequential 294 - scan over a tight slab of disk, supporting single-digit-millisecond queries. 295 - `post_author_did` is `LowCardinality(String)` for the same dictionary-lookup 296 - benefits as on `post_snapshots`. `PARTITION BY tuple()` for the same 297 - "too many parts" reasons. 
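To make the idempotency argument concrete, here is a small in-memory model of what `ReplacingMergeTree(ingested_at)` does on merge: rows that share the full order key collapse to the one with the largest `ingested_at`. It is only a sketch of the semantics, not how ClickHouse implements merges; `EngagementRow`, `orderKey`, and `collapse` are hypothetical names, and timestamps are simplified to plain numbers.

```typescript
// Hypothetical in-memory model of ReplacingMergeTree collapse semantics.
interface EngagementRow {
  post_uri: string;
  post_author_did: string;
  actor_did: string;
  rkey: string;
  kind: "like" | "repost" | "quote";
  ingested_at: number; // simplified stand-in for DateTime64(6)
}

// Mirrors ORDER BY (post_author_did, kind, post_uri, actor_did, rkey).
function orderKey(r: EngagementRow): string {
  return JSON.stringify([r.post_author_did, r.kind, r.post_uri, r.actor_did, r.rkey]);
}

// Keep one row per order key, preferring the largest ingested_at
// (ties go to the row seen last, as with the most recent insert).
function collapse(rows: readonly EngagementRow[]): EngagementRow[] {
  const latest = new Map<string, EngagementRow>();
  for (const row of rows) {
    const key = orderKey(row);
    const seen = latest.get(key);
    if (seen === undefined || row.ingested_at >= seen.ingested_at) {
      latest.set(key, row);
    }
  }
  return Array.from(latest.values());
}
```

A like replayed after a reconnect therefore exists as two rows before the merge and one row after it, which is why a `countIf` without `FINAL` can overcount only transiently.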
298 - 299 - ### How counts are computed 300 - 301 - The total engagement of one kind on one post is: 302 - 303 - ``` 304 - total(post, kind) = snapshot_count(post, kind) 305 - + count(events for post where kind matches 306 - AND event_created_at > snapshot_taken_at(post)) 307 - ``` 308 - 309 - The watermark is **per-post**, not per-user, because the backfill makes 310 - ~hundreds of `getPosts` calls spread across several seconds — each call 311 - captures a snapshot at a different moment, and a single user-level watermark 312 - would either double-count or miss events depending on its value. 313 - 314 - The top-25 query is one `LEFT JOIN` across the two tables filtered to the 315 - single author's slab, computed as: 316 - 317 - ```sql 318 - SELECT 319 - s.post_uri, 320 - s.post_text, 321 - s.post_created_at, 322 - s.snapshot_likes 323 - + countIf(e.kind='like' AND e.event_created_at > s.snapshot_taken_at) 324 - AS likes, 325 - s.snapshot_reposts 326 - + countIf(e.kind='repost' AND e.event_created_at > s.snapshot_taken_at) 327 - AS reposts 328 - FROM post_snapshots AS s FINAL 329 - LEFT JOIN engagement_events AS e 330 - ON e.post_uri = s.post_uri 331 - AND e.post_author_did = s.post_author_did 332 - WHERE s.post_author_did = ? 333 - AND s.is_deleted = 0 334 - AND (? IS NULL OR s.post_created_at >= now() - INTERVAL ? DAY) -- ?days= filter 335 - GROUP BY s.post_uri, s.post_text, s.post_created_at, 336 - s.snapshot_likes, s.snapshot_reposts, s.snapshot_taken_at 337 - ORDER BY {likes|reposts} DESC, s.post_created_at DESC 338 - LIMIT 25; 339 - ``` 340 - 341 - `FROM post_snapshots FINAL` is necessary for correctness: between merges, 342 - multiple snapshot versions of the same `post_uri` can coexist (re-backfill, 343 - tombstones), and a naive GROUP BY would split them into separate buckets 344 - and return the same post twice. `FINAL` deduplicates at query time and 345 - applies the `is_deleted` marker. 
Per the ClickHouse docs, `FINAL` is cheap 346 - when the query filters on the leading order-key column — we always filter 347 - on `post_author_did` first, so `FINAL` only has to scan that one author's 348 - tight slab (a few granules at most for ~10k posts). 349 - 350 - We do **not** use `FINAL` on `engagement_events`. `countIf` over briefly- 351 - duplicated rows from Jetstream replays produces a small transient overcount 352 - which is within the accepted error budget (see §10). 353 - 354 - ClickHouse pushes the `post_author_did = ?` predicate down both sides of 355 - the join because both tables are ordered by `post_author_did` first. 356 - 357 - ### SQLite — metadata (Lucid models) 358 - 359 - ```sql 360 - CREATE TABLE users ( 361 - did TEXT PRIMARY KEY, -- stable identity 362 - handle TEXT NOT NULL, -- current handle (mutable) 363 - display_name TEXT, 364 - avatar_url TEXT, 365 - first_seen_at INTEGER NOT NULL, 366 - backfilled_at INTEGER, -- NULL = never backfilled 367 - last_searched_at INTEGER, 368 - deleted_at INTEGER -- set on account deletion/takedown 369 - ); 370 - CREATE INDEX users_handle ON users(handle); 371 - 372 - CREATE TABLE jetstream_cursor ( 373 - id INTEGER PRIMARY KEY CHECK (id = 1), 374 - cursor_us INTEGER NOT NULL, 375 - updated_at INTEGER NOT NULL 376 - ); 377 - 378 - CREATE TABLE backfill_jobs ( 379 - did TEXT PRIMARY KEY, 380 - started_at INTEGER NOT NULL, 381 - finished_at INTEGER, 382 - total_posts INTEGER, 383 - fetched_posts INTEGER NOT NULL DEFAULT 0, 384 - state TEXT NOT NULL, -- 'running' | 'done' | 'failed' 385 - error TEXT 386 - ); 387 - ``` 388 - 389 - The `users.backfilled_at` column is informational only; it is not used in 390 - the query math. Per-post `snapshot_taken_at` carries the load-bearing 391 - correctness signal. 392 - 393 - --- 394 - 395 - ## 6. 
Data flow 396 - 397 - ### Flow 1 — Ingest (worker, always running) 398 - 399 - The worker subscribes to Jetstream with 400 - `wantedCollections=app.bsky.feed.like,app.bsky.feed.repost,app.bsky.feed.post` 401 - and `compress=true`. We receive every event in those collections from the 402 - whole network — server-side filtering by *author* is not possible for our 403 - case (likes are authored by likers, not post authors), so we filter in the 404 - consumer. 405 - 406 - On each event: 407 - 408 - 1. Parse JSON. 409 - 2. Branch on collection. 410 - 3. For likes/reposts: extract `subject.uri`, parse the post author DID from 411 - it (`at://did:plc:THIS_PART/app.bsky.feed.post/RKEY`). 412 - 4. Check the post author DID against the in-memory `Set<string>` of tracked 413 - DIDs. If absent, drop the event (>99% of events). The lookup is ~50ns 414 - per event; the filter is effectively free. 415 - 5. If present, append the row to an in-memory buffer. 416 - 6. The buffer flushes to ClickHouse every 500ms or 1000 rows, whichever 417 - comes first. ClickHouse is much happier with ~100 large inserts/sec than 418 - 1500 single-row inserts/sec. 419 - 7. After a successful flush, update `jetstream_cursor.cursor_us` to the 420 - cursor of the last event in that batch. 421 - 422 - For `app.bsky.feed.post` events: if the post's author is tracked, insert a 423 - row into `post_snapshots` with `snapshot_likes=0, snapshot_reposts=0, 424 - snapshot_quotes=0, snapshot_taken_at=now()`, so future engagement on it has 425 - somewhere to live. 426 - 427 - Additionally, for every `app.bsky.feed.post` event (regardless of whether 428 - the post's author is tracked), inspect `record.embed`. If it is an 429 - `app.bsky.embed.record` or `app.bsky.embed.recordWithMedia`, parse the 430 - embedded post's URI and extract its author DID. 
If that author is in the 431 - tracked set, insert a row into `engagement_events` with `kind='quote'`, 432 - `post_uri = embedded post URI`, `actor_did = quoter's DID`. Quote events 433 - are stored from v1 forward but not surfaced in the v1 UI — the data 434 - accumulates silently until quote support ships in a future version. 435 - This avoids the irrecoverable loss of per-event quote history that would 436 - otherwise happen between v1 and v2 (Jetstream cannot be replayed beyond 437 - its ~few-day retention window). 438 - 439 - For `app.bsky.feed.post` *delete* events on tracked authors: write a 440 - tombstone row to `post_snapshots` with `is_deleted=1` and a fresh 441 - `snapshot_taken_at`. ReplacingMergeTree collapses to the tombstone version. 442 - 443 - For `kind: "account"` events with `status: "deleted" | "takendown"` on a 444 - tracked DID: set `users.deleted_at = now()`, remove the DID from the 445 - in-memory tracked set, and schedule a worker task that tombstones all 446 - of that user's `post_snapshots` rows. 447 - 448 - Every 1 second, the worker re-reads new entries from `users` and adds any 449 - newly-seen DIDs to the in-memory tracked set. There is no handshake back to 450 - the web process — this is a one-way refresh. 451 - 452 - Every 1 second, the worker also writes the latest cursor checkpoint to 453 - SQLite. (If the worker crashes between checkpoints, the natural replay on 454 - reconnect — see "Reconnect handling" — produces duplicate inserts that 455 - ReplacingMergeTree collapses on merge.) 456 - 457 - #### Reconnect handling 458 - 459 - On WebSocket disconnect, the worker reads `jetstream_cursor.cursor_us` and 460 - reconnects with `cursor=<that value>`. **No subtraction, no overlap window.** 461 - Jetstream cursors are monotonic microsecond timestamps; resuming from 462 - exactly the last-checkpointed cursor delivers strictly subsequent events. 
463 - The natural replay window is the events received-but-not-yet-checkpointed 464 - at the moment of crash (at most ~one batch, ~500ms wide), which collapses 465 - naturally on `ReplacingMergeTree` merges. 466 - 467 - ### Flow 2 — Backfill (web process dispatches, queue worker runs) 468 - 469 - Triggered when `GET /profile/:handle/likes` (or `/reposts`) hits a user 470 - whose `users.backfilled_at IS NULL`. 471 - 472 - **In the web request handler:** 473 - 474 - 1. Resolve handle → DID via `com.atproto.identity.resolveHandle` against 475 - `public.api.bsky.app`. 476 - 2. `INSERT OR IGNORE` into `users(did, handle, first_seen_at)`. If a 477 - parallel request already inserted, the second request sees the same 478 - row. 479 - 3. `INSERT OR IGNORE` into `backfill_jobs(did, state='running', 480 - started_at)`. **Only if the insert actually happened** (i.e. this is the 481 - first request for this handle, not a parallel duplicate), dispatch the 482 - queued job: `await BackfillJob.dispatch({did})`. The conditional 483 - dispatch is our deduplication mechanism — `@adonisjs/queue` does not 484 - provide dedup-by-key natively. 485 - 4. Return a static "Indexing @handle…" page with `<meta http-equiv="refresh" 486 - content="2">`. No SSE, no JS, no progress bar. 487 - 5. The browser auto-refreshes every 2 seconds. Each refresh re-runs the 488 - same controller, which checks `backfill_jobs.state` and either renders 489 - the loading page again or, once `state='done'`, renders the actual 490 - top-25 page. 491 - 492 - **In the queue worker process** (`node ace queue:work`), `BackfillJob` 493 - runs: 494 - 495 - 1. `getAuthorFeed(DID, cursor)` paginates the user's posts in reverse 496 - chronological order, ~100 posts per page. 497 - 2. For each batch of 25 URIs, `getPosts(uris)` returns aggregate counts. 498 - 3. 
Insert one row per post into `post_snapshots` with `snapshot_likes`, 499 - `snapshot_reposts`, and `snapshot_quotes` copied directly from the 500 - `getPosts` response, and `snapshot_taken_at = now()` recorded *at the 501 - moment that batch's response landed*. 502 - 4. Update `backfill_jobs.fetched_posts` for the loading page. 503 - 5. Continue until cursor exhausted or 10,000 posts reached 504 - (`BACKFILL_MAX_POSTS`). 505 - 6. `UPDATE users SET backfilled_at = now()`, `UPDATE backfill_jobs SET 506 - state='done', finished_at = now()`. 507 - 508 - If the job throws, `@adonisjs/queue` retries with exponential backoff 509 - (default settings). On final failure, the job's `failed()` hook updates 510 - `backfill_jobs.state = 'failed'` with the error message, and the loading 511 - page can render a "we couldn't index this user — try again later" message 512 - on the next meta-refresh. 513 - 514 - The worker is, in parallel, polling the `users` table every 1 second and 515 - will pick up the new DID within ~1 second of insertion, beginning live 516 - ingest. There is a small window (≤1 second) at the start of the backfill 517 - during which likes targeting the new user's earliest backfilled posts may 518 - be lost — they happen after the snapshot was taken (so not in the snapshot) 519 - but before the worker picked the user up (so not in events). This drift is 520 - bounded to a handful of likes per first-search per user, only affects the 521 - most-recent posts in the first batch, and is **explicitly accepted** 522 - (see §10). 523 - 524 - If the queue worker process is killed mid-job (not a thrown error, but a 525 - hard crash or `kill -9`), the `backfill_jobs` row is left in 526 - `state='running'` and the user's loading page never resolves. The queued 527 - job itself may or may not be retried by `@adonisjs/queue` depending on 528 - visibility-timeout configuration; for v1 we recover manually by deleting 529 - the row. 
Throw-based failures are handled cleanly by the job's `failed()` 530 - hook (above). 531 - 532 - ### Flow 3 — Read (web process, every subsequent visit) 533 - 534 - `GET /profile/:handle/likes?days=30` 535 - 536 - 1. `SELECT did, handle, deleted_at FROM users WHERE handle = ?` 537 - 2. If `deleted_at IS NOT NULL`, render 410 Gone. 538 - 3. If the URL handle differs from the canonical handle for that DID 539 - (handle change since last visit), 301 to the canonical URL. 540 - 4. Run the top-25 ClickHouse query (per §5) with the `?days=` filter and 541 - the kind from the path segment (`/likes` or `/reposts`). 542 - 5. Render the Edge template with the 25 cards. 543 - 6. `Cache-Control: public, max-age=60` — top posts of all time don't change 544 - meaningfully second to second. 545 - 546 - Each post card renders directly from `post_snapshots.post_text` plus the 547 - joined live counts. **No callbacks to the AppView at request time.** Avatar 548 - images link directly to the Bluesky CDN URL stored on the user row. 549 - 550 - --- 551 - 552 - ## 7. Routes and URLs 553 - 554 - | Method | Route | Purpose | 555 - |---|---|---| 556 - | GET | `/` | Landing page with search box | 557 - | GET | `/search?q=:handle` | Resolve, 302 → canonical `/profile/:handle/likes` | 558 - | GET | `/profile/:handle` | 301 → `/profile/:handle/likes` | 559 - | GET | `/profile/:handle/likes` | Profile page, top by likes | 560 - | GET | `/profile/:handle/reposts` | Profile page, top by reposts | 561 - | GET | `/about` | Static "what is this" + ATProto credits | 562 - 563 - ### Canonicalization rules 564 - 565 - Every profile view has **exactly one canonical URL**. All non-canonical 566 - forms 301 to it. 
567 - 568 - - `/profile/dril` → 301 → `/profile/dril.bsky.social/likes` 569 - - `/profile/dril.bsky.social` → 301 → `/profile/dril.bsky.social/likes` 570 - - `/profile/dril.bsky.social/likes` → 200 (canonical) 571 - - `/profile/oldhandle.bsky.social/likes` → 301 → 572 - `/profile/newhandle.bsky.social/likes` (handle change healing) 573 - - `/profile/btao.org/likes` → 200 (custom domain handles are already 574 - canonical) 575 - 576 - The search box on the landing page resolves through canonicalization 577 - server-side so a user typing "dril" lands directly on 578 - `/profile/dril.bsky.social/likes` with one redirect, not two. 579 - 580 - A `<link rel="canonical">` tag in the page `<head>` points at the canonical 581 - URL as belt-and-braces. 582 - 583 - ### Query parameters 584 - 585 - - `?days=N` — filter to posts created within the last `N` days, computed 586 - at query time relative to the current wall clock. `N` is a positive 587 - integer, validated against a sanity cap (3650). Omitted parameter means 588 - "all time". This shape was chosen so that bookmarking a "Last month" 589 - page yields a permalink that always means "the most recent 30 days," 590 - not a fixed date range that drifts backward over time. 591 - 592 - The UI dropdown shows friendly labels ("All time", "Last month") and emits 593 - the corresponding `?days=` value (or omits it for all time). 594 - 595 - --- 596 - 597 - ## 8. UI 598 - 599 - ### Landing (`/`) 600 - 601 - - App name and one-line tagline. 602 - - Single search box. 603 - - Four example handles to seed curiosity. 604 - - One short paragraph of explanation. 605 - - No nav, no login, no footer cruft. 606 - 607 - ### Profile page 608 - 609 - - Header: avatar, display name, handle, optional bio line. 610 - - Two controls: 611 - - **Kind toggle:** "Most liked" / "Most reposted" — implemented as links 612 - between `/profile/:handle/likes` and `/profile/:handle/reposts`. 
613 - - **Lens dropdown:** "All time" / "Last month" — sets the `?days=` 614 - parameter. 615 - - 25 post cards in ranked order. Each card shows: like count, repost count, 616 - post text (with line breaks preserved), date, "view on Bluesky" link. 617 - 618 - ### Loading page (during first-visit backfill) 619 - 620 - A static page with "Indexing @handle…" copy and a meta-refresh tag. No 621 - progress bar, no SSE, no JS. The browser refreshes every 2 seconds; each 622 - refresh re-runs the controller which decides whether to keep showing the 623 - loading page or render the real one. 624 - 625 - ### Error states 626 - 627 - - Handle does not resolve → 404 with "We can't find @whatever on Bluesky. 628 - Did you typo?" 629 - - User has zero public posts → render the profile header with "@user 630 - hasn't posted anything yet". 631 - - User has fewer than 25 posts → render whatever they have. 632 - - Account deleted/taken down → 410 Gone. 633 - - ClickHouse query fails → 503 with "we're having a moment, try again in a 634 - sec". 635 - - AppView rate-limited mid-backfill → backoff and resume silently. The 636 - loading page continues to refresh; the user is not told why it is slow. 637 - - Backfill cap reached (10,000 posts) → render the page normally. (No UI 638 - callout; v1 keeps the page clean.) 639 - 640 - --- 641 - 642 - ## 9. Configuration 643 - 644 - Complete env-var surface: 645 - 646 - | Variable | Purpose | 647 - |---|---| 648 - | `NODE_ENV` | `production` or `development` | 649 - | `PORT` | HTTP port (web only) | 650 - | `APP_KEY` | Adonis encryption key | 651 - | `SQLITE_PATH` | Path to SQLite file | 652 - | `CLICKHOUSE_URL` | ClickHouse HTTP endpoint | 653 - | `CLICKHOUSE_DB` / `CLICKHOUSE_USER` / `CLICKHOUSE_PASSWORD` | Credentials | 654 - | `JETSTREAM_URL` | Jetstream WebSocket URL (worker only) | 655 - | `BACKFILL_MAX_POSTS` | Defaults to 10000 | 656 - | `LOG_LEVEL` | Defaults to `info` | 657 - 658 - No config files baked into the image. 
No service discovery. No secrets 659 - manager. 660 - 661 - --- 662 - 663 - ## 10. Accepted error budgets 664 - 665 - The design deliberately accepts several small sources of count drift in 666 - exchange for radical simplicity: 667 - 668 - 1. **Sub-second AppView/Jetstream race during backfill.** A like that 669 - arrives via Jetstream within ~100-500ms of a `getPosts` call may end up 670 - on the wrong side of `snapshot_taken_at`, causing a single missing or 671 - double-counted like. Bounded to a few likes per backfill per active post. 672 - 2. **First-second worker lag on new users.** Likes happening between the 673 - moment a user is first inserted and the moment the worker picks them up 674 - on its 1-second poll may be lost — they happen after the snapshot was 675 - taken (for the earliest backfill batches) but before the worker is 676 - tracking. Bounded to a handful of likes per first-search per user, only 677 - affects the user's most-recent posts, only happens once per user ever. 678 - 3. **Unlikes / unreposts not tracked.** Counts may drift up over time. The 679 - alternative is a multi-billion-row rkey-to-post mapping table, which is 680 - not worth the storage cost for v1. 681 - 4. **Transient over-count during ReplacingMergeTree merge windows.** After 682 - a Jetstream reconnect, duplicate event rows exist briefly until a 683 - background merge collapses them. Bounded to seconds. 684 - 685 - For a "greatest-hits ranking" UX where the top posts are old viral content 686 - with thousands of likes, all four of these drifts are invisible. 687 - 688 - There is no reconciliation job in v1. 689 - 690 - --- 691 - 692 - ## 11. Testing strategy 693 - 694 - Two layers. No smoke / E2E / load / browser tests. 695 - 696 - ### Unit tests (Japa) 697 - 698 - - `app/lib/clickhouse/store.ts` — top-25 query builder, fixtures seeded 699 - into a real ClickHouse. 
Tests cover: all-time and `?days=` ordering, 700 - per-post watermark math, ties broken by `post_created_at`, posts with 701 - zero engagement excluded, tombstoned posts excluded. 702 - - `app/lib/atproto/parsers/` — pure functions for AT-URI parsing, 703 - Jetstream event JSON → internal shape, `getPosts` response → internal 704 - shape. 705 - - `app/services/handle_resolver.ts` — handle normalization, 706 - custom domain detection, invalid handle rejection. 707 - - The worker's filter logic — given an event and a tracked-DID set, does 708 - it correctly keep or drop. 709 - - The worker's quote-detection logic — given a post event, correctly 710 - identifies `app.bsky.embed.record` and `app.bsky.embed.recordWithMedia` 711 - embeds, parses the embedded post's author DID, ignores other embed 712 - types, and only emits `kind='quote'` engagement rows for tracked 713 - embedded authors. 714 - - Integer parsing for `?days=`. 715 - 716 - ### Integration tests (Japa with real services) 717 - 718 - - Full backfill flow with a stub `@atproto/api` client. 719 - - Jetstream consumer loop with a fake WebSocket — verify only tracked 720 - users' events land in ClickHouse, post deletes tombstone correctly, 721 - cursor checkpoint advances. 722 - - Canonical URL redirect flows (bare handle, kind suffix, handle change, 723 - custom domain). 724 - - Concurrent first-search dedup. 725 - 726 - ### Test infrastructure 727 - 728 - - One `docker-compose.yml` at the repo root (also used in development). 729 - Runs ClickHouse. SQLite is handled natively by Adonis's test runner. 730 - - CI uses the same Compose file. Per-test ClickHouse database isolation 731 - via `CREATE DATABASE test_<uuid>`. 732 - - Test data builders (`aPost(...)`, `aLikeEvent(...)`) for readable setups. 733 - 734 - ### Coverage targets 735 - 736 - No percentage. The bugs that matter are concentrated in: 737 - 738 - 1. The watermark/snapshot reconciliation math. 739 - 2. The Jetstream event filter. 740 - 3. 
The handle resolution / canonical URL flow. 741 - 4. The deletion-honoring path. 742 - 743 - --- 744 - 745 - ## 12. Deployment 746 - 747 - A single `docker-compose.yml` at the repo root, four services. The 748 - application's only deployment concerns are: building a Docker image, 749 - logging to stdout, reading config from env vars. 750 - 751 - ### `Dockerfile` 752 - 753 - Multi-stage, `node:24-alpine` base, tini as the entrypoint (Node-as-PID-1 754 - does not reap zombies or forward signals correctly; tini fixes this with 755 - one line). 756 - 757 - ```dockerfile 758 - FROM node:24-alpine AS build 759 - WORKDIR /app 760 - COPY package.json package-lock.json ./ 761 - RUN npm ci 762 - COPY . . 763 - RUN node ace build 764 - RUN npm prune --production 765 - 766 - FROM node:24-alpine AS runtime 767 - RUN apk add --no-cache tini 768 - WORKDIR /app 769 - COPY --from=build --chown=node:node /app/build ./build 770 - COPY --from=build --chown=node:node /app/node_modules ./node_modules 771 - COPY --from=build --chown=node:node /app/package.json ./package.json 772 - USER node 773 - ENTRYPOINT ["/sbin/tini", "--"] 774 - # CMD set per-service in docker-compose.yml 775 - ``` 776 - 777 - ### `docker-compose.yml` 778 - 779 - ```yaml 780 - services: 781 - clickhouse: 782 - image: clickhouse/clickhouse-server:latest 783 - volumes: 784 - - clickhouse-data:/var/lib/clickhouse 785 - environment: 786 - CLICKHOUSE_DB: favs 787 - CLICKHOUSE_USER: favs 788 - CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 789 - ulimits: 790 - nofile: 262144 791 - 792 - web: 793 - build: . 
794 - command: node bin/server.js 795 - depends_on: [clickhouse] 796 - volumes: 797 - - sqlite-data:/data 798 - environment: 799 - NODE_ENV: production 800 - PORT: 3333 801 - DB_CONNECTION: sqlite 802 - SQLITE_PATH: /data/favs.sqlite 803 - CLICKHOUSE_URL: http://clickhouse:8123 804 - CLICKHOUSE_DB: favs 805 - CLICKHOUSE_USER: favs 806 - CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 807 - APP_KEY: ${APP_KEY} 808 - ports: 809 - - "3333:3333" 810 - 811 - jetstream-worker: 812 - build: . 813 - command: node ace jetstream:consume 814 - depends_on: [clickhouse, web] 815 - volumes: 816 - - sqlite-data:/data 817 - environment: 818 - NODE_ENV: production 819 - DB_CONNECTION: sqlite 820 - SQLITE_PATH: /data/favs.sqlite 821 - CLICKHOUSE_URL: http://clickhouse:8123 822 - CLICKHOUSE_DB: favs 823 - CLICKHOUSE_USER: favs 824 - CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 825 - JETSTREAM_URL: wss://jetstream2.us-east.bsky.network/subscribe 826 - 827 - queue-worker: 828 - build: . 829 - command: node ace queue:work 830 - depends_on: [clickhouse, web] 831 - volumes: 832 - - sqlite-data:/data 833 - environment: 834 - NODE_ENV: production 835 - DB_CONNECTION: sqlite 836 - SQLITE_PATH: /data/favs.sqlite 837 - CLICKHOUSE_URL: http://clickhouse:8123 838 - CLICKHOUSE_DB: favs 839 - CLICKHOUSE_USER: favs 840 - CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD} 841 - 842 - volumes: 843 - clickhouse-data: 844 - sqlite-data: 845 - ``` 846 - 847 - ### Migrations 848 - 849 - - `web` startup runs `node ace migration:run` (SQLite via Lucid, which 850 - also creates the `@adonisjs/queue` internal tables) and 851 - `node ace clickhouse:migrate` (custom Ace command applying versioned 852 - `.sql` files in `database/clickhouse/`, tracked in a 853 - `_clickhouse_migrations` table). 854 - - `jetstream-worker` and `queue-worker` skip migrations and wait up to 5 855 - seconds for `web` to have run them, then proceed. Compose `depends_on` 856 - ordering ensures `web` starts first. 
857 - 858 - ### Logging 859 - 860 - Every process writes to stdout. Adonis is configured with `LOG_LEVEL=info` 861 - and a JSON formatter (Pino, Adonis's default). No file logging, no 862 - rotation, no shipping. The orchestration layer collects logs. 863 - 864 - ### Graceful shutdown 865 - 866 - All three Adonis processes handle SIGTERM: 867 - 868 - - `web`: Adonis's HTTP server already handles this — drains in-flight 869 - requests, then exits. 870 - - `jetstream-worker`: closes the Jetstream WebSocket, flushes the in-memory 871 - ClickHouse buffer one last time, writes the final cursor checkpoint, 872 - exits 0. 873 - - `queue-worker`: `@adonisjs/queue`'s `queue:work` command handles SIGTERM 874 - by waiting for the in-flight job (if any) to complete or hit its 875 - visibility timeout, then exits cleanly. 876 - 877 - Tini ensures SIGTERM is forwarded correctly under `docker stop`. 878 - 879 - --- 880 - 881 - ## 13. Out of scope (recap) 882 - 883 - To preserve focus, the following are explicitly **not** built in v1: 884 - 885 - - Authentication, accounts, sessions. 886 - - Quote-post tracking in the UI (data is captured in v1 but not surfaced). 887 - - Global leaderboards, daily/weekly "best of" pages. 888 - - Trophies, awards, notifications. 889 - - Email, RSS, push. 890 - - Pagination beyond top-25. 891 - - Backfill of historical individual like records. 892 - - Reconciliation jobs. 893 - - Unlike / unrepost tracking. 894 - - Browser tests, E2E tests, smoke tests, load tests. 895 - - Metrics, APM, error tracking, log shipping. 896 - - Backup automation, deploy pipeline (handled by operator). 897 - - Crashed-backfill auto-recovery (manual cleanup). 898 - - IPC between web and worker beyond shared SQLite. 899 - 900 - Each of these is a known forward-looking feature. 
The schema and 901 - architecture are designed to absorb most of them without rework — quotes 902 - fit into the existing `kind` enum, leaderboards are `WHERE author_did = ?` 903 - removed from the existing query, time windows are already parameterized.

-121

docs/superpowers/specs/2026-04-13-bluesky-oauth-design.md

··· 1 - # Bluesky OAuth Sign-In 2 - 3 - ## Goal 4 - 5 - Allow users to sign in with their Bluesky account so they can quickly view their own top posts and like/repost posts inline when viewing other profiles. 6 - 7 - ## Packages 8 - 9 - - `@adonisjs/auth` — session guard, `ctx.auth`, auth middleware, `auth` in Edge templates 10 - - `@atproto/oauth-client-node` — handles DPoP, PAR, PKCE, token refresh 11 - 12 - ## OAuth Scopes (minimum required) 13 - 14 - - `atproto` (required base scope) 15 - - `repo:app.bsky.feed.like?action=create&action=delete` 16 - - `repo:app.bsky.feed.repost?action=create&action=delete` 17 - 18 - ## Database Changes 19 - 20 - ### Rename `users` → `tracked_profiles` 21 - 22 - Rename the table and update all references (model class becomes `TrackedProfile`, path alias, imports, queries, controllers, etc.). 23 - 24 - ### New `accounts` table (SQLite) 25 - 26 - | Column | Type | Notes | 27 - |--------|------|-------| 28 - | `did` | TEXT | Primary key | 29 - | `handle` | TEXT | | 30 - | `session_data` | TEXT | JSON blob — atproto SDK token/DPoP storage | 31 - | `created_at` | INTEGER | | 32 - | `updated_at` | INTEGER | | 33 - 34 - Display name and avatar are resolved the same way as any other profile (via `tracked_profiles` / AppView lookup), not duplicated here. 35 - 36 - ### New `auth_states` table (SQLite) 37 - 38 - Ephemeral storage for in-flight OAuth flows. Rows are deleted after callback completes. 
39 - 40 - | Column | Type | Notes | 41 - |--------|------|-------| 42 - | `key` | TEXT | Primary key | 43 - | `state_data` | TEXT | JSON blob | 44 - | `created_at` | INTEGER | | 45 - 46 - ## Auth Architecture 47 - 48 - ### AdonisJS Auth 49 - 50 - - Session guard configured against the `Account` model (primary key: `did`) 51 - - `initialize_auth_middleware` in the global middleware stack (so `ctx.auth` is always available, even on public pages) 52 - - Named `auth` middleware for protecting API routes 53 - 54 - ### Atproto OAuth Service 55 - 56 - Singleton AdonisJS service: `app/services/atproto_oauth_service.ts` 57 - 58 - Implements the SDK's `NodeSavedState` and `NodeSavedSession` storage interfaces backed by SQLite (`auth_states` table and `session_data` column on `accounts`). 59 - 60 - Key methods: 61 - - `authorize(handle)` → redirect URL 62 - - `callback(params)` → account DID 63 - - `getAgent(did)` → authenticated AT Protocol agent for API calls 64 - 65 - ### Client Metadata 66 - 67 - Served at `GET /oauth/client-metadata.json`. The `client_id` in the OAuth flow is this URL. Must be publicly accessible. 68 - 69 - ## OAuth Flow 70 - 71 - 1. User clicks "Sign in" → `GET /oauth/login` → service resolves their PDS and initiates PAR → redirect to Bluesky authorization server 72 - 2. User approves → callback to `GET /oauth/callback` → service exchanges code for tokens, stores session data on account → `auth.use('web').login(account)` → redirect to original page (or own profile if they were on landing page) 73 - 3. Subsequent requests: AdonisJS session cookie identifies the user. When authenticated API calls are needed (like/repost), the service restores an atproto agent from the stored session data. 74 - 4. Logout: `POST /oauth/logout` → `auth.use('web').logout()` + clear stored atproto session data. 
75 - 76 - ## UI Changes 77 - 78 - ### Header 79 - 80 - - Signed out: "Sign in" link in the top right (alongside search bar and dark mode toggle) 81 - - Signed in: "My profile" link (goes to their own profile page) and "Sign out" button 82 - 83 - ### Landing Page 84 - 85 - - When signed in and on the landing page, after sign-in redirect to their own profile 86 - - No special CTA — sign in lives in the header only 87 - 88 - ### Profile Page Post Cards 89 - 90 - - Signed out: no change 91 - - Signed in: each post card gets like (heart) and repost icons 92 - - Icons reflect current state: filled/highlighted if the viewer has already liked/reposted 93 - - Clicking toggles the action inline with optimistic UI (toggle immediately, revert on error) 94 - - Like/repost counts update accordingly (+1/-1) 95 - 96 - ## Fetching Viewer State 97 - 98 - When a signed-in user views a profile page, after fetching top-25 posts from ClickHouse, the controller calls `app.bsky.feed.getPosts` with the viewer's authenticated agent. The response includes `viewer.like` and `viewer.repost` fields. This data is passed to the template alongside existing post data. 99 - 100 - One API call, max 25 URIs (within the API limit), only for authenticated visitors. 101 - 102 - ## API Routes for Like/Repost 103 - 104 - All protected by auth middleware. 105 - 106 - | Method | Route | Request | Response | 107 - |--------|-------|---------|----------| 108 - | POST | `/api/posts/:uri/like` | — | `{ uri: "<like-record-uri>" }` | 109 - | DELETE | `/api/posts/:uri/like` | `{ uri: "<like-record-uri>" }` | `{}` | 110 - | POST | `/api/posts/:uri/repost` | — | `{ uri: "<repost-record-uri>" }` | 111 - | DELETE | `/api/posts/:uri/repost` | `{ uri: "<repost-record-uri>" }` | `{}` | 112 - 113 - The `:uri` param is the post's AT URI (e.g. `at://did:plc:abc/app.bsky.feed.post/123`). 
Since it contains slashes, it must be URL-encoded in the path, or alternatively we can use a query parameter or a wildcard route param (`*`). 114 - 115 - ## Alpine.js Interactions 116 - 117 - Each post card gets an `Alpine.data()` component (CSP build — no inline expressions) that manages: 118 - - Like/repost toggled state and record URIs 119 - - Optimistic count updates 120 - - API calls to the routes above 121 - - Revert on error
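
The OAuth spec above notes that the `:uri` route param contains slashes and must be URL-encoded. A minimal sketch of the client-side encoding, assuming the path-segment option is chosen over a query parameter or wildcard; the helper name `likeRoutePath` is hypothetical:

```typescript
// Build the like route for a post, percent-encoding the AT-URI so its
// slashes and colons cannot be mistaken for path separators.
function likeRoutePath(postUri: string): string {
  return `/api/posts/${encodeURIComponent(postUri)}/like`
}

// The server side would reverse the encoding before using the URI.
function decodePostUri(encoded: string): string {
  return decodeURIComponent(encoded)
}
```

Whether a given router or reverse proxy leaves `%2F` encoded until the param is read is worth verifying before relying on this form; the query-parameter and wildcard alternatives mentioned in the spec avoid that question.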

-382

docs/superpowers/specs/2026-04-14-firehose-virality-webhook-design.md

··· 1 - # Firehose Virality Webhook — Design 2 - 3 - ## Problem 4 - 5 - We want to know in near real-time when any post on Bluesky crosses a like 6 - threshold (1,000 and 10,000 likes), and push a Discord-compatible webhook 7 - to a configured URL — one URL per threshold, so the two signal levels 8 - can be routed to different Discord channels — each time a post first 9 - crosses each threshold. This is 10 - distinct from the existing per-user tracking: we are counting likes on 11 - arbitrary posts across the whole network, not only posts by tracked profiles. 12 - 13 - Constraints: 14 - 15 - - Must run on the existing 8 GB / 4 vCPU Hetzner box alongside the web, 16 - jetstream, and queue processes. 17 - - Must not store unbounded data — the design works over a rolling window. 18 - - Must survive process restarts. 19 - - Must handle unlikes correctly, to prevent the threshold being gamed by 20 - a user rapidly liking/unliking the same post. Unlikes cancel their 21 - matching like, so a like/unlike/like cycle from one user nets to a 22 - single like regardless of how many cycles happen. 23 - - Must not fire for deleted posts. 24 - 25 - ## Scope 26 - 27 - - New `firehose-worker` process that subscribes to the unfiltered 28 - `app.bsky.feed.like` jetstream and writes like / unlike deltas into 29 - ClickHouse. 30 - - Two new ClickHouse tables: a lookup table mapping `like_uri → subject_uri` 31 - and a per-day per-post counts table. 32 - - Cursor persistence for the firehose worker (SQLite), mirroring the existing 33 - jetstream consumer pattern. 34 - - New scheduled queue job (via `adonisjs-scheduler` + the existing 35 - AdonisJS queue) that scans for posts crossing thresholds and fires 36 - Discord webhooks. 37 - - New SQLite table `notified_thresholds` for per-(post, threshold) dedup. 38 - - ClickHouse memory caps configured explicitly. 39 - - `queue-worker` container memory trimmed to leave room for `firehose-worker`. 
40 - 41 - Out of scope: 42 - 43 - - Multi-subscriber webhooks, per-user thresholds, auth. Single global 44 - Discord URL configured via env. 45 - - Reposts, replies, or any engagement signal other than likes. 46 - - Backfilling pre-existing like counts from the AppView. We count only 47 - what we see on the firehose from startup onward. 48 - - Exact replay idempotency. Jetstream reconnect replay can cause tiny 49 - overcounts; accepted as design-level noise. Escape hatch documented 50 - under "Deferred work". 51 - - Per-day historical queries against `like_counts_daily`. The schema only 52 - supports "total likes in the last N days" aggregation. 53 - 54 - ## Design 55 - 56 - ### Counting model 57 - 58 - We treat the system as counting net likes (likes minus unlikes) per post 59 - over a rolling 7-day window. A post "crosses" a threshold the first time 60 - its 7-day net like count is observed at or above the threshold value. 61 - 62 - Counts are approximate by design: 63 - 64 - - We count only likes observed on the jetstream from process start onward. 65 - A post that was already popular when we started watching won't be counted 66 - against its true total. 67 - - Jetstream delete events may arrive slightly out of order or be replayed 68 - briefly on reconnect, causing small drift. 69 - - The 7-day window means a post which accumulates likes slowly over weeks 70 - never crosses a threshold. This is intentional; we're detecting virality, 71 - not cumulative popularity. 
72 - 73 - ### ClickHouse schema 74 - 75 - Two tables, both `MergeTree` family: 76 - 77 - ```sql 78 - CREATE TABLE like_events_lookup ( 79 - like_uri String, 80 - subject_uri String, 81 - created_at DateTime 82 - ) 83 - ENGINE = MergeTree 84 - ORDER BY like_uri 85 - TTL created_at + INTERVAL 8 DAY; 86 - 87 - CREATE TABLE like_counts_daily ( 88 - subject_uri String, 89 - day Date, 90 - count Int64 91 - ) 92 - ENGINE = SummingMergeTree 93 - ORDER BY subject_uri 94 - PARTITION BY day 95 - TTL day + INTERVAL 8 DAY; 96 - ``` 97 - 98 - `like_events_lookup` exists so that when an unlike arrives — Jetstream 99 - delete events give us `(did, collection, rkey)` but not the subject of the 100 - original like — we can look up the subject URI by primary-key scan. 101 - `ORDER BY like_uri` makes this a cheap sparse-index lookup 102 - (sub-millisecond cached, single-digit ms cold). 103 - 104 - `like_counts_daily` is the aggregation target for the threshold poll. 105 - `PARTITION BY day` means TTL drops whole partitions cheaply rather than 106 - doing row-level mutations. Within each daily partition, `SummingMergeTree` 107 - collapses all `(subject_uri, day)` deltas into a single row per post. 108 - 8-day retention is 7 days for the query window plus one day of buffer. 109 - 110 - TTL is 8 days on both tables; the query window is 7 days, giving one day 111 - of slack for merge lag and clock skew. 112 - 113 - ### Write path (firehose-worker) 114 - 115 - The worker subscribes to jetstream with a collection filter of 116 - `app.bsky.feed.like` and no DID filter. 
For each commit: 117 - 118 - - **Create** (new like): 119 - - `INSERT INTO like_events_lookup VALUES (like_uri, subject_uri, created_at)` 120 - - `INSERT INTO like_counts_daily VALUES (subject_uri, today(), +1)` 121 - 122 - - **Delete** (unlike): 123 - - `SELECT subject_uri FROM like_events_lookup WHERE like_uri = ?` (PK lookup) 124 - - If found: `INSERT INTO like_counts_daily VALUES (subject_uri, today(), -1)` 125 - - If not found (original like aged out or never seen): drop the event 126 - 127 - Inserts are batched in the worker: collect events for up to 2 seconds or 128 - 5,000 events, whichever comes first, then issue one batched `INSERT` per 129 - table using ClickHouse async inserts. 130 - 131 - Cursor checkpointing mirrors `app/services/jetstream_cursor_io.ts`: the 132 - worker persists the last processed jetstream cursor to SQLite every few 133 - seconds, resumes from there on restart. 134 - 135 - ### Threshold poll (scheduled queue job) 136 - 137 - We use [`adonisjs-scheduler`](https://packages.adonisjs.com/packages/adonisjs-scheduler) 138 - to trigger a queue job on a cron schedule. The scheduler runs as part of 139 - the `queue-worker` process (or a dedicated scheduler entrypoint if that 140 - package recommends it during install — to be confirmed during 141 - implementation). Every 60 seconds it dispatches a `ThresholdScanJob` 142 - onto the existing AdonisJS queue, which runs inside `queue-worker`. 143 - 144 - The job does: 145 - 146 - 1. Query ClickHouse for candidate posts: 147 - 148 - ```sql 149 - SELECT subject_uri, sum(count) AS c 150 - FROM like_counts_daily 151 - WHERE day >= today() - 7 152 - GROUP BY subject_uri 153 - HAVING c >= :min_threshold 154 - ``` 155 - 156 - `:min_threshold` is the smallest configured threshold (1,000), so we 157 - never need to consider posts below that. 158 - 159 - 2. For each candidate, look up SQLite `notified_thresholds` by 160 - `(subject_uri, threshold)` to find the largest threshold already fired. 
161 - Determine which thresholds the post has newly crossed (e.g. `c=10_400`, 162 - last fired `1000` → fire `10000`). 163 - 164 - 3. For the set of (post, threshold) pairs to fire, batch an AppView 165 - `app.bsky.feed.getPosts` call (up to 25 URIs per call) to fetch 166 - author handle and post text. 167 - 168 - Outcomes per post: 169 - 170 - - **Post/author resolved cleanly:** fire the webhook for the 171 - relevant threshold(s). 172 - - **Post or author deleted** (AppView returns not-found/tombstone 173 - for the post, or the post exists but the author handle can't be 174 - resolved because the account was deleted): skip firing, insert 175 - dedup row(s) so we don't re-check on the next poll. Do **not** 176 - log an error — this is expected. 177 - - **Any other enrichment failure** (network error, AppView 5xx, 178 - unexpected shape): skip firing, do **not** insert a dedup row 179 - (so the next poll retries), and log an error to PostHog with 180 - the subject URI and failure reason. 181 - 182 - 4. For each post selected to fire: POST to the Discord webhook URL for 183 - the corresponding threshold, then 184 - `INSERT INTO notified_thresholds VALUES (subject_uri, threshold, now)`. 185 - 186 - Webhook HTTP failures retry up to 3 times with exponential backoff. 187 - On final failure, log to PostHog and still insert the dedup row — 188 - we don't want a broken Discord URL to cause the system to re-try 189 - forever. 190 - 191 - ### Discord webhook payload 192 - 193 - Two env vars, one per threshold: 194 - 195 - - `FIREHOSE_WEBHOOK_URL_1K` — Discord webhook for posts crossing 1,000 likes. 196 - - `FIREHOSE_WEBHOOK_URL_10K` — Discord webhook for posts crossing 10,000 likes. 197 - 198 - This keeps the two signal levels routable to different Discord channels. 199 - Both are optional — if a URL is absent, posts crossing that threshold 200 - still get a dedup row written (so we don't re-process every poll) but 201 - no webhook is sent. 
202 - 203 - Payload shape: 204 - 205 - ```json 206 - { 207 - "username": "favs.blue firehose", 208 - "embeds": [{ 209 - "author": { "name": "@joe.bsky.social" }, 210 - "title": "Post crossed 1,000 likes", 211 - "url": "https://bsky.app/profile/did:plc:.../post/3k...", 212 - "description": "the post text, truncated to ~500 chars if needed", 213 - "color": 3447003, 214 - "timestamp": "2026-04-14T12:34:56Z", 215 - "fields": [ 216 - { "name": "Estimated likes", "value": "1,037", "inline": true }, 217 - { "name": "Threshold", "value": "1,000", "inline": true } 218 - ] 219 - }] 220 - } 221 - ``` 222 - 223 - AT-URI to bsky.app web URL conversion: 224 - `at://{did}/app.bsky.feed.post/{rkey}` → `https://bsky.app/profile/{did}/post/{rkey}`. 225 - 226 - Enrichment failure handling is described in the threshold poll section: 227 - if the post or author was deleted, skip silently; for any other 228 - enrichment failure, skip, log to PostHog, and do not write a dedup row 229 - (retry on next poll). 230 - 231 - ### SQLite schema 232 - 233 - ``` 234 - notified_thresholds 235 - subject_uri TEXT NOT NULL 236 - threshold INTEGER NOT NULL 237 - fired_at INTEGER NOT NULL -- ms since epoch 238 - PRIMARY KEY (subject_uri, threshold) 239 - ``` 240 - 241 - No TTL needed. Bounded by (unique viral posts) × (number of thresholds). 242 - Even at Bluesky scale over a year, a few tens of thousands of rows max. 243 - 244 - ### Process and container changes 245 - 246 - New `firehose-worker` entry in docker-compose: 247 - 248 - - Command: `node ace.js firehose:watch` 249 - - Memory limit: 384M 250 - - CPU limit: 0.5 251 - - Same volume mounts as jetstream-worker 252 - - Same restart/healthcheck policies as jetstream-worker 253 - - Env: `JETSTREAM_URL` (same as the existing consumer, or its own unfiltered 254 - endpoint if different) 255 - 256 - Existing `queue-worker` memory trimmed from 1G → 512M. 
The queue worker 257 - runs backfill jobs which are I/O bound; 1 G is generous and the threshold 258 - poll work is cheap. 259 - 260 - ClickHouse internal memory caps added to a new 261 - `clickhouse/config.d/memory.xml` file mounted into the ClickHouse 262 - container: 263 - 264 - ```xml 265 - <clickhouse> 266 - <max_server_memory_usage>3221225472</max_server_memory_usage> 267 - <profiles> 268 - <default> 269 - <max_memory_usage>1073741824</max_memory_usage> 270 - </default> 271 - </profiles> 272 - </clickhouse> 273 - ``` 274 - 275 - (3 GiB server cap, 1 GiB per-query cap.) This ensures ClickHouse fails 276 - queries gracefully rather than getting OOM-killed by Docker. 277 - 278 - Resulting memory budget: 279 - 280 - | Service | Limit | Notes | 281 - |------------------|-------|--------------------------------| 282 - | clickhouse | 4 G | Internal cap set to 3 G | 283 - | web | 1.5 G | | 284 - | jetstream-worker | 512 M | | 285 - | queue-worker | 512 M | Trimmed from 1 G | 286 - | firehose-worker | 384 M | New | 287 - | **Total** | ~6.9 G | Leaves ~1.1 G for OS/overhead | 288 - 289 - ### Ace commands and code layout 290 - 291 - - `commands/firehose_watch.ts` — the `node ace firehose:watch` entrypoint. 292 - - `app/services/firehose_consumer.ts` — modeled on 293 - `app/services/jetstream_consumer.ts`. Takes an injected `WebSocketLike` 294 - factory; subscribes without DID filter; filters for the `.like` 295 - collection only. 296 - - `app/services/firehose_cursor_io.ts` — cursor persistence, mirrors 297 - `jetstream_cursor_io.ts`. 298 - - `app/lib/clickhouse/firehose_writes.ts` — batched insert helpers for 299 - `like_events_lookup` and `like_counts_daily`. 300 - - `app/jobs/threshold_scan_job.ts` — the queue job dispatched by the 301 - scheduler that runs the threshold poll, does AppView enrichment, 302 - fires webhooks. 303 - - Scheduler configuration — cron entry (via `adonisjs-scheduler`) that 304 - dispatches `ThresholdScanJob` every 60 seconds. 
305 - - `app/services/discord_webhook.ts` — builds the payload and POSTs with 306 - retry/backoff. 307 - - `database/migrations/<timestamp>_create_notified_thresholds.ts` — 308 - SQLite table. 309 - - `database/clickhouse/<N>_firehose_tables.sql` — ClickHouse DDL. 310 - - `config/clickhouse.xml.d/memory.xml` — ClickHouse memory caps (actual 311 - path depends on how we mount config overrides today). 312 - 313 - ### Configuration 314 - 315 - New env vars in `start/env.ts` (all via Vine): 316 - 317 - - `FIREHOSE_WEBHOOK_URL_1K` — Discord webhook URL for 1,000-like 318 - crossings. Optional. 319 - - `FIREHOSE_WEBHOOK_URL_10K` — Discord webhook URL for 10,000-like 320 - crossings. Optional. 321 - 322 - Thresholds themselves (`[1000, 10000]`) are hardcoded in the job — they 323 - are tied 1:1 to webhook URLs, so there's no sensible way to configure 324 - them independently. 325 - 326 - ### Testing 327 - 328 - Unit tests (`tests/unit/`): 329 - 330 - - `firehose_consumer.spec.ts` — with injected fake WebSocket, verify 331 - like/unlike events produce the right ClickHouse inserts; verify 332 - unlike with no matching lookup drops silently; verify cursor is 333 - persisted. 334 - - `discord_webhook.spec.ts` — payload shape, truncation, URL conversion, 335 - retry/backoff behavior, graceful degradation when enrichment fails. 336 - - `threshold_scan_job.spec.ts` — given a fake ClickHouse result set and 337 - fake `notified_thresholds`, verify the correct (post, threshold) pairs 338 - fire and dedup rows are written. 339 - 340 - Functional tests (`tests/functional/`, real ClickHouse): 341 - 342 - - Insert synthetic like/unlike events, run the threshold query, verify 343 - only posts above threshold come back and counts match expected net. 344 - - Verify unlikes correctly decrement counts when the like is still in 345 - the lookup window, and drop silently when it isn't. 
346 - - Verify partition-level TTL: insert events dated 9 days ago, run the 347 - `OPTIMIZE TABLE` flow, check old partitions are dropped. 348 - - Remember to drain query streams (per CLAUDE.md). 349 - 350 - ### Rollout 351 - 352 - 1. Land schema migrations first. The `firehose:watch` process can be 353 - deployed but kept scaled to zero until we're confident. 354 - 2. Start `firehose-worker` at 1 replica, observe jetstream throughput 355 - and ClickHouse ingest rate for a few hours before enabling the 356 - threshold job. 357 - 3. Enable the threshold job with the two `FIREHOSE_WEBHOOK_URL_*` 358 - env vars initially pointing to test Discord channels; confirm 359 - realistic firings. 360 - 4. Swap to the production channels. 361 - 362 - ### Deferred work 363 - 364 - Things explicitly not built now, with escape hatches: 365 - 366 - - **Exact idempotency under replay.** If jetstream reconnect replays 367 - cause user-visible double-firing, switch `like_events_lookup` to 368 - `ReplacingMergeTree(created_at)` and gate the counts `+1` insert on 369 - "was this like_uri already in the lookup table?". Adds ~1 PK lookup 370 - per like event. 371 - - **Unique-liker filtering (bot resistance).** Switch 372 - `like_counts_daily` to `AggregatingMergeTree` storing 373 - `uniqState(liker_did)`, query with `uniqMerge`. Catches situations 374 - where a single bot account mass-likes to fake virality. Costs more 375 - storage and query memory. 376 - - **RocksDB-backed lookup.** If point-lookup latency on 377 - `like_events_lookup` becomes a bottleneck in production, swap its 378 - engine to `EmbeddedRocksDB`. Requires a manual cleanup job to 379 - replace the TTL. 380 - - **Auto-reposting from a Bluesky account.** The current design just 381 - fires a Discord webhook. Layering an auto-repost action on top is 382 - straightforward once this proves out.
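
Step 2 of the threshold poll in the spec above (deciding which thresholds a post has newly crossed) can be sketched as a small pure function. The names `THRESHOLDS` and `newlyCrossed` are hypothetical, and the set of already-fired thresholds is assumed to come from the `notified_thresholds` rows for that post:

```typescript
// Thresholds are hardcoded in the job per the spec: 1,000 and 10,000 likes.
const THRESHOLDS = [1000, 10000] as const

// Return every threshold the 7-day net count has reached that has no
// dedup row yet. `alreadyFired` holds thresholds recorded for this post.
function newlyCrossed(count: number, alreadyFired: ReadonlySet<number>): number[] {
  return THRESHOLDS.filter((t) => count >= t && !alreadyFired.has(t))
}
```

With the spec's own example, a count of 10,400 and a prior firing at 1,000 yields only the 10,000 threshold.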

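The AT-URI to bsky.app conversion rule given in the firehose webhook spec (`at://{did}/app.bsky.feed.post/{rkey}` → `https://bsky.app/profile/{did}/post/{rkey}`) might look like this in code. The function name `atUriToBskyUrl` is hypothetical, and returning `null` for non-post URIs is an assumption about how callers would want to handle bad input:

```typescript
// Convert a post AT-URI into its bsky.app web URL.
// Returns null for anything that is not an app.bsky.feed.post URI.
function atUriToBskyUrl(atUri: string): string | null {
  const match = /^at:\/\/([^/]+)\/app\.bsky\.feed\.post\/([^/]+)$/.exec(atUri)
  if (!match) return null
  const [, did, rkey] = match
  return `https://bsky.app/profile/${did}/post/${rkey}`
}
```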
+2 -1

providers/posthog_provider.ts

··· 9 9 // Eagerly initialise the client so it's ready when the first request arrives 10 10 getPostHogClient() 11 11 12 - // Expose the API key to Edge templates for client-side posthog-js 12 + // Expose the API key + host to Edge templates for client-side posthog-js 13 13 edge.global('posthogApiKey', process.env.POSTHOG_API_KEY || '') 14 + edge.global('posthogHost', process.env.POSTHOG_HOST || 'https://us.i.posthog.com') 14 15 } 15 16 16 17 async shutdown() {

+2 -1

resources/js/app.js

··· 8 8 // PostHog client-side analytics 9 9 var phKey = document.querySelector('meta[name="posthog-api-key"]') 10 10 if (phKey) { 11 + var phHost = document.querySelector('meta[name="posthog-host"]') 11 12 posthog.init(phKey.getAttribute('content'), { 12 - api_host: 'https://ph.btao.org', 13 + api_host: phHost ? phHost.getAttribute('content') : 'https://us.i.posthog.com', 13 14 capture_pageview: true, 14 15 capture_pageleave: true, 15 16 })

+1

resources/views/components/layout.edge

··· 20 20 <meta name="description" content="{{ $props.get('description', 'See the most popular posts from any Bluesky account.') }}" /> 21 21 @if(posthogApiKey) 22 22 <meta name="posthog-api-key" content="{{ posthogApiKey }}"> 23 + <meta name="posthog-host" content="{{ posthogHost }}"> 23 24 @if(auth && auth.isAuthenticated) 24 25 <meta name="posthog-distinct-id" content="{{ auth.user.did }}"> 25 26 <meta name="posthog-handle" content="{{ auth.user.handle }}">
