declarative relay deployment on hetzner relay-eval.waow.tech
atproto relay

docs: update zlay backfill section with implementation details

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz 6fd799ff 266b2fa5

+5 -5
docs/architecture.md
@@ -94,11 +94,11 @@
 
 ### collection index backfill
 
-the collection index is live-only — it indexes `create` ops as they flow through the firehose. historical data requires a backfill. recommended approaches:
+the collection index is live-only — it indexes `create` ops as they flow through the firehose. historical data is backfilled by importing from a source relay (bsky.network) via `com.atproto.sync.listReposByCollection`.
 
-1. **import from bsky.network** (fastest): paginate `listReposByCollection` on the reference relay for each collection, bulk-insert pairs into RocksDB. no PDS crawling, no rate limits. `addCollection` is idempotent.
-2. **describeRepo crawl** (independent): crawl the host table, calling `listRepos` + `describeRepo` per PDS. same rate limit gotchas as indigo collectiondir — see [backfill.md](backfill.md).
-3. **hybrid** (recommended): import from reference relay for immediate parity, then live indexing keeps current. optionally add a slow background verify-crawl later.
+the backfiller discovers collections from two sources (lexicon garden llms.txt + RocksDB scan), then pages through each collection on the source relay, adding DIDs to RocksDB. progress is tracked in postgres for crash-resumability. triggered via `POST /admin/backfill-collections`, status via `GET`.
+
+see the [zlay backfill docs](https://tangled.org/zzstoatzz.io/zlay/tree/main/docs/backfill.md) for full details, or use `scripts/backfill-status` in this repo.
 
 ### verification
 
@@ -111,7 +111,7 @@
 | metric | value |
 |--------|-------|
 | connected PDS hosts | ~2,749 |
-| collection index DIDs | ~497K (live-only, no backfill) |
+| collection index DIDs | ~13.6M+ (backfill in progress from bsky.network) |
 | memory request | 512 MiB |
 | memory limit | 8 GiB |
 | PVC | 20 GiB |
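
The paging loop described in the new text (walk one collection on the source relay, collect DIDs, keep a resumable cursor) can be sketched roughly as follows. This is an illustration only, not zlay's implementation: `fetch_page`, `iter_pages`, and the `https://bsky.network` base URL are assumptions made for the example, and the response shape (`repos` entries with a `did`, plus an optional `cursor`) is assumed from the `com.atproto.sync.listReposByCollection` lexicon.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, List, Optional, Tuple

# Assumption: the source relay serves XRPC under this base URL.
SOURCE_RELAY = "https://bsky.network"


def fetch_page(collection: str, cursor: Optional[str], limit: int = 500) -> dict:
    """Fetch one page of listReposByCollection from the source relay (hypothetical helper)."""
    params = {"collection": collection, "limit": str(limit)}
    if cursor:
        params["cursor"] = cursor
    url = (
        f"{SOURCE_RELAY}/xrpc/com.atproto.sync.listReposByCollection?"
        + urllib.parse.urlencode(params)
    )
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_pages(
    collection: str,
    fetch: Callable[[str, Optional[str]], dict] = fetch_page,
    start_cursor: Optional[str] = None,
) -> Iterator[Tuple[List[str], Optional[str]]]:
    """Yield (dids, next_cursor) per page until the relay stops returning a cursor.

    Yielding the cursor with each page lets the caller persist progress
    after every batch, so an interrupted run can pass it back as start_cursor.
    """
    cursor = start_cursor
    while True:
        page = fetch(collection, cursor)
        dids = [repo["did"] for repo in page.get("repos", [])]
        cursor = page.get("cursor")
        yield dids, cursor
        if not cursor:
            break
```

A caller would write each batch of DIDs to its index and store `next_cursor` before requesting the next page. Passing a stub as `fetch` makes the loop testable without network access.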