very fast at protocol indexer with flexible filtering, xrpc queries, cursor-backed event stream, and more, built on fjall
rust fjall at-protocol atproto indexer
58
fork

Configure Feed

Select the types of activity you want to include in your feed.

[docs] clarify comments, add links to vs-tap sections in readme

dawn cef95082 fc4121cc

+17 -8
+7 -1
README.md
··· 1 1 #### table-of-contents 2 2 3 3 -> [hydrant](#hydrant)</br> 4 - -> [vs tap](#vs-tap)</br> 4 + -> [vs tap](#vs-tap) | [stream](#stream-behavior) | [multi-relay](#multiple-relay-support) | [crawler sources](#crawler-sources)</br> 5 5 -> [configuration](#configuration)</br> 6 6 -> [rest api](#rest-api) | [filter](#filter-management) | [ingestion](#ingestion-control) | [crawler](#crawler-management) | [firehose](#firehose-management) | [repos](#repository-management)</br> 7 7 -> [xrpc api](#data-access-xrpc) | [backlinks](#bluemicrocosmlinks) | [atproto](#comatproto) | [custom](#systemsgazehydrant) ··· 40 40 41 41 ### stream behavior 42 42 43 + <small>[<- back to toc](#table-of-contents)</small> 44 + 43 45 the `WS /stream` (hydrant) and `WS /channel` (tap) endpoints have different designs: 44 46 45 47 | aspect | `tap` (`/channel`) | `hydrant` (`/stream`) | ··· 52 54 53 55 ### multiple relay support 54 56 57 + <small>[<- back to toc](#table-of-contents)</small> 58 + 55 59 `hydrant` supports connecting to multiple relays simultaneously for firehose 56 60 ingestion. when `RELAY_HOSTS` is configured with multiple URLs: 57 61 ··· 64 68 relay-side account takedowns or if relays set the `time` field. 65 69 66 70 ### crawler sources 71 + 72 + <small>[<- back to toc](#table-of-contents)</small> 67 73 68 74 the crawler is configured separately from the firehose via `CRAWLER_URLS`. each 69 75 source is a `[mode::]url` entry where the mode prefix is optional and defaults
+7 -7
src/control/mod.rs
··· 557 557 /// 558 558 /// - if `cursor` is `None`, streaming starts from the current head (live tail only). 559 559 /// - if `cursor` is `Some(id)`, all persisted `record` events from that ID onward are 560 - /// replayed first, then live events follow seamlessly. 560 + /// replayed first, then the stream will switch to live tailing. 561 561 /// 562 - /// `identity` and `account` events are ephemeral and are never replayed from a cursor - 563 - /// only live occurrences are delivered. use [`ReposControl::get`] to fetch current 564 - /// identity/account state for a specific DID. 562 + /// `identity` and `account` events are ephemeral and are never replayed from a cursor, 563 + /// only live ones are delivered. use [`ReposControl::info`] to fetch current state for 564 + /// a specific repository. 565 565 /// 566 566 /// multiple concurrent subscribers each receive a full independent copy of the stream. 567 567 /// the stream ends when the `EventStream` is dropped. ··· 651 651 652 652 /// returns a future that runs the debug HTTP API server on `127.0.0.1:{port}`. 653 653 /// 654 - /// exposes internal inspection endpoints (`/debug/get`, `/debug/iter`, etc.) 655 - /// that are not safe to expose publicly. binds only to loopback. 654 + /// exposes internal inspection endpoints (`/debug/get`, `/debug/iter`, etc.). 655 + /// binds only to loopback. 656 656 pub fn serve_debug(&self, port: u16) -> impl Future<Output = Result<()>> { 657 657 let state = self.state.clone(); 658 658 async move { crate::api::serve_debug(state, port).await } ··· 734 734 735 735 /// train zstd compression dictionaries for the `repos`, `blocks`, and `events` keyspaces. 736 736 /// 737 - /// dictionaries are written to `dict_{name}.bin` files next to the database. 737 + /// dictionaries are written to `dict_{name}.bin` files inside the database folder. 738 738 /// a restart is required to apply them. training samples data blocks from the 739 739 /// existing database, so the database must have a reasonable amount of data first. 740 740 pub async fn train_dicts(&self) -> Result<()> {
+3
src/control/repos.rs
··· 130 130 /// note that they may not immediately start backfilling if: 131 131 /// - other repos already filled the backfill concurrency limit, 132 132 /// - or there are many repos pending already. 133 + /// 134 + /// this will also clear any error state the repo may have been in, 135 + /// allowing it to resync again. 133 136 pub async fn resync( 134 137 &self, 135 138 dids: impl IntoIterator<Item = Did<'_>>,