atproto relay implementation in zig zlay.waow.tech

docs: rewrite readme — drop stale numbers table, fix config, link to spec

the hard-coded metrics table was already stale and duplicated what the
public grafana dashboard shows. replaced the endpoints table with a
link to the sync spec. fixed RELAY_HTTP_PORT → RELAY_METRICS_PORT and
added RESOLVER_THREADS + VALIDATOR_CACHE_SIZE.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz c1f99c2c 55e541cc

+15 -26
README.md
@@ -1,31 +1,22 @@
 # zlay
 
-an [AT Protocol](https://atproto.com/) relay in zig. subscribes to every PDS on the network, verifies commit signatures, and serves the merged event stream to downstream consumers via `com.atproto.sync.subscribeRepos`.
+an [AT Protocol](https://atproto.com/) relay in zig. subscribes to every PDS on the network, verifies commit signatures, and serves the merged event stream to downstream consumers.
 
-**live instance**: [zlay.waow.tech](https://zlay.waow.tech/_health) — [metrics](https://zlay-metrics.waow.tech)
+**live instance**: [zlay.waow.tech](https://zlay.waow.tech/_health) — [metrics dashboard](https://zlay-metrics.waow.tech)
 
 ## design
 
-- **direct PDS crawl** — the bootstrap relay (`bsky.network`) is called once at startup for the host list via `listHosts`, then all data flows directly from each PDS.
+- **direct PDS crawl** — the bootstrap relay is called once at startup for the host list via `listHosts`, then all data flows directly from each PDS.
 
-- **optimistic signature validation** — on signing key cache miss, the frame passes through immediately and the DID is queued for background resolution. the first commit from an unknown account is unvalidated; all subsequent commits are verified against the cached key. >99.9% cache hit rate after warmup.
+- **optimistic signature validation** — on signing key cache miss, the frame passes through immediately and the DID is queued for background resolution. all subsequent commits are verified against the cached key. the cache caps at a configurable size and evicts the oldest 10% by resolve time when full.
 
-- **inline collection index** — indexes `(DID, collection)` pairs directly in the event processing pipeline using [RocksDB](https://rocksdb.org/) with two column families: `rbc` for collection-to-DID lookups and `cbr` for DID-to-collection cleanup. serves `listReposByCollection` from the relay process — no sidecar. the index design draws on [fig](https://tangled.org/microcosm.blue)'s work on [lightrail](https://tangled.org/microcosm.blue/lightrail), which uses adjacent keys from CAR slices to enumerate collections.
+- **inline collection index** — indexes `(DID, collection)` pairs in the event processing pipeline using RocksDB. serves `listReposByCollection` from the relay process — no sidecar. the index design draws on [fig](https://tangled.org/microcosm.blue)'s work on [lightrail](https://tangled.org/microcosm.blue/lightrail).
 
-- **one OS thread per PDS** — predictable memory, no garbage collector. ~2,750 threads is fine; most are blocked on websocket reads. thread stacks are set to 2 MB (zig's default is 16 MB).
+- **one OS thread per PDS** — predictable memory, no garbage collector. thread stacks are set to 2 MB (zig's default is 16 MB).
 
-## endpoints
+## spec compliance
 
-| endpoint | method |
-|---|---|
-| `com.atproto.sync.subscribeRepos` | WebSocket |
-| `com.atproto.sync.listRepos` | GET |
-| `com.atproto.sync.getRepoStatus` | GET |
-| `com.atproto.sync.getLatestCommit` | GET |
-| `com.atproto.sync.listReposByCollection` | GET |
-| `com.atproto.sync.listHosts` | GET |
-| `com.atproto.sync.getHostStatus` | GET |
-| `com.atproto.sync.requestCrawl` | POST |
+implements the [AT Protocol sync spec](https://atproto.com/specs/sync) — `subscribeRepos`, `listRepos`, `getRepoStatus`, `getLatestCommit`, `listReposByCollection`, `listHosts`, `getHostStatus`, and `requestCrawl`.
 
 ## dependencies
 
@@ -34,7 +25,7 @@
 | [zat](https://tangled.org/zzstoatzz.io/zat) | AT Protocol primitives (CBOR, CAR, signatures, DID resolution) |
 | [websocket.zig](https://github.com/nicholasgasior/websocket.zig) | WebSocket client/server |
 | [pg.zig](https://github.com/karlseguin/pg.zig) | PostgreSQL driver |
-| [rocksdb-zig](https://github.com/Syndica/rocksdb-zig) | [RocksDB](https://rocksdb.org/) bindings |
+| [rocksdb-zig](https://github.com/Syndica/rocksdb-zig) | RocksDB bindings |
 
 ## build
 
@@ -50,21 +41,19 @@
 
 | variable | default | description |
 |---|---|---|
-| `RELAY_PORT` | `3000` | WebSocket firehose port |
-| `RELAY_HTTP_PORT` | `3001` | HTTP API port |
+| `RELAY_PORT` | `3000` | firehose + API port |
+| `RELAY_METRICS_PORT` | `3001` | prometheus metrics port |
 | `RELAY_UPSTREAM` | `bsky.network` | bootstrap relay for initial host list |
 | `RELAY_DATA_DIR` | `data/events` | event log storage |
 | `RELAY_RETENTION_HOURS` | `72` | event retention window |
 | `COLLECTION_INDEX_DIR` | `data/collection-index` | RocksDB collection index path |
 | `DATABASE_URL` | — | PostgreSQL connection string |
 | `RELAY_ADMIN_PASSWORD` | — | bearer token for admin endpoints |
+| `RESOLVER_THREADS` | `4` | background DID resolution threads |
+| `VALIDATOR_CACHE_SIZE` | `500000` | max cached signing keys before eviction |
 
 see [docs/deployment.md](docs/deployment.md) for production deployment and [docs/backfill.md](docs/backfill.md) for collection index backfill.
 
-## numbers
+## license
 
-| metric | value |
-|---|---|
-| connected PDS hosts | ~2,750 |
-| memory | ~2.9 GiB |
-| throughput | ~600 events/sec typical |
+MIT
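
the optimistic validation and eviction behavior described in the new readme text can be sketched roughly as follows. this is an illustrative python model, not zlay's zig implementation: the class `SigningKeyCache`, its method names, and the `verify` callback are all hypothetical. only the `VALIDATOR_CACHE_SIZE` default of `500000` and the "evict the oldest 10% by resolve time when full" rule come from the diff above.

```python
import os
from collections import deque


class SigningKeyCache:
    """Hypothetical model of an optimistic signing-key cache (not zlay's code)."""

    def __init__(self, max_size=None, evict_fraction=0.10):
        if max_size is None:
            # Default taken from the README config table.
            max_size = int(os.environ.get("VALIDATOR_CACHE_SIZE", "500000"))
        self.max_size = max_size
        self.evict_fraction = evict_fraction
        self.entries = {}        # did -> (key, resolved_at)
        self.pending = deque()   # DIDs waiting for background resolution
        self._queued = set()     # avoids queueing the same DID twice

    def check(self, did, verify):
        """Return 'unvalidated' on a cache miss, else 'verified' or 'rejected'."""
        entry = self.entries.get(did)
        if entry is None:
            # Cache miss: let the frame through and resolve the DID later.
            if did not in self._queued:
                self._queued.add(did)
                self.pending.append(did)
            return "unvalidated"
        key, _resolved_at = entry
        return "verified" if verify(key) else "rejected"

    def store(self, did, key, resolved_at):
        """Called by a resolver once a DID's signing key is known."""
        self._queued.discard(did)
        if did not in self.entries and len(self.entries) >= self.max_size:
            self._evict_oldest()
        self.entries[did] = (key, resolved_at)

    def _evict_oldest(self):
        # Drop the oldest fraction of entries, ordered by resolve time.
        count = max(1, int(len(self.entries) * self.evict_fraction))
        oldest = sorted(self.entries, key=lambda d: self.entries[d][1])[:count]
        for did in oldest:
            del self.entries[did]
```

in this sketch a resolver thread would pop DIDs from `pending`, resolve them, and call `store`. sorting on every eviction is O(n log n), which is tolerable here because one eviction frees 10% of the cache before the next one is needed. a real implementation may track resolve order differently.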