search for standard sites pub-search.waow.tech
search zig blog atproto

docs: evaluate hydrant as potential tap replacement

cursor-based replay and runtime filters are appealing, but tap's
pain points are already worked around and hydrant adds operational
complexity (no docker image, unstable storage, single maintainer).
not adopting now; revisit if we need multiple relay sources or
runtime collection management.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

+77
docs/hydrant.md
# hydrant: potential tap replacement

evaluated 2026-03-22. **decision: not adopting now**, but worth revisiting.

## what is hydrant

[hydrant](https://tangled.org/did:plc:dfl62fgb7wtjj3fcbb72naae/hydrant) is a Rust-based
ATProto indexer/sync tool by [ptr.pet](https://90008.leaflet.pub/3mhp3t4kuw22e). it handles
firehose consumption, record persistence, backfill, and event streaming: a superset of what
tap does for us today.

## what we'd gain

- **cursor-based event replay**: clients own the cursor, no server-side ACK state. eliminates
  the class of bugs that led to our ProcessQueue workaround in `tap.zig` (ACK blocking, outbox
  growth, drop-on-overflow)
- **runtime filter management**: `PATCH /filter` to add/remove collections without redeploying
  (tap's filters are fixed at startup)
- **multiple firehose sources**: subscribe to multiple relays simultaneously; we're currently
  locked to a single relay
- **ephemeral mode**: `HYDRANT_EPHEMERAL=true` with a TTL keeps only a rolling window of
  events, avoiding permanent disk accumulation. fits our use case since turso is our source of
  truth, not hydrant's local store
- **~3x throughput**: ~60k records/sec vs tap's 22-34k (network-bound at 100Mbps)

## why not now

- **we've already worked around tap's pain points**: the ProcessQueue pattern works, memory is
  managed with `TAP_RESYNC_PARALLELISM=1` and 2GB RAM, the ACK model is stable
- **hydrant is more than we need**: it persists records in fjall (LSM-tree), implements XRPC
  queries, stores blocks as CBOR. even in ephemeral mode that's a lot of machinery to relay
  events. tap is simpler for our use case
- **no Docker image**: builds via Nix only. we'd need to create and maintain our own Dockerfile
  for Fly.io. tap is `ghcr.io/bluesky-social/indigo/tap:latest` with zero build effort
- **single maintainer, explicitly unstable DB format**: fjall dependency is patched, breaking
  changes expected. tap is maintained by the Bluesky team (indigo)
- **`tap.zig` rewrite**: hydrant's WebSocket message format differs from tap's. our ~450-line
  consumer would need rewriting. not huge, but not free
- **throughput is irrelevant for us**: we index a tiny slice of the network (5 collection types),
  not the full firehose

## when to revisit

- if we need **multiple relay sources** (partial relays, PDS-direct connections)
- if we need **runtime collection management** (adding new platforms without redeploying tap)
- if tap's maintenance slows down or indigo deprecates it
- if hydrant gets **Docker images** and a **stable storage format**
- if hydrant's **XRPC record queries** could replace our reconciler's PDS-direct lookups

## integration sketch (for future reference)

if we do adopt, the simplest path:

1. run hydrant on Fly.io with ephemeral mode and our collection filters
2. rewrite `tap.zig` consumer to connect to hydrant's `/stream?cursor=N` WebSocket
3. persist cursor locally (e.g. in turso or a file) for crash recovery
4. remove ProcessQueue, since cursor replay makes it unnecessary
5. keep everything else (extractor, indexer, embedder, reconciler) as-is

hydrant's message format:
```json
{
  "id": 12345,
  "type": "record",
  "record": {
    "live": true,
    "did": "did:plc:abc123",
    "collection": "site.standard.document",
    "rkey": "3mhp3t4kuw22e",
    "action": "create",
    "record": { ... },
    "cid": "bafyrei..."
  }
}
```

the `action` field maps directly to our existing create/update/delete dispatch in `tap.zig`.
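
a rough illustration of that dispatch, written in Python rather than Zig to keep it short. it is a sketch under assumptions, not hydrant's client API or our `tap.zig` code: `dispatch_event` and the handler signature are hypothetical names, the top-level `id` is assumed to be the value that feeds `/stream?cursor=N`, and delete events are assumed to possibly omit the nested `record` body. whether resuming needs `id` or `id + 1` would have to be checked against hydrant itself.

```python
import json
from typing import Callable, Dict, Optional

# Hypothetical handler signature: (did, collection, rkey, record_body_or_None).
Handler = Callable[[str, str, str, Optional[dict]], None]


def dispatch_event(raw: str, handlers: Dict[str, Handler], cursor: int = 0) -> int:
    """Parse one stream message, dispatch on `action`, return the updated cursor.

    Assumes the message shape shown above. The returned value is the highest
    event id seen so far, which the caller would persist for crash recovery.
    """
    msg = json.loads(raw)
    event_id = msg["id"]

    # Only "record" events carry an action; other types just advance the cursor.
    if msg.get("type") == "record":
        rec = msg["record"]
        handler = handlers.get(rec["action"])
        if handler is not None:
            # `record` may be absent (assumed for deletes), so use .get().
            handler(rec["did"], rec["collection"], rec["rkey"], rec.get("record"))

    return max(cursor, event_id)
```

the caller would persist the returned cursor (turso or a file, per step 3) after each successfully handled event, so a restart can reconnect from the last processed position instead of relying on server-side ACK state.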