···
+#### table-of-contents
+
+-> [hydrant](#hydrant)</br>
+-> [vs tap](#vs-tap)</br>
+-> [configuration](#configuration)</br>
+-> [rest api](#rest-api) | [filter](#filter-management) | [ingestion](#ingestion-control) | [crawler](#crawler-management) | [firehose](#firehose-management) | [repos](#repository-management)</br>
+-> [xrpc api](#data-access-xrpc) | [backlinks](#bluemicrocosmlinks) | [atproto](#comatproto) | [custom](#systemsgazehydrant)
+
# hydrant

`hydrant` is an AT Protocol indexer built on the `fjall` database. it's built to
···
you don't mind losing your existing backfilled data in hydrant if you already
processed them.).

-## vs `tap`
+## vs tap
+
+<small>[<- back to toc](#table-of-contents)</small>

while [`tap`](https://github.com/bluesky-social/indigo/tree/main/cmd/tap) is
designed as a firehose consumer and simply propagates events while handling
···

## configuration

+<small>[<- back to toc](#table-of-contents)</small>
+
`hydrant` is configured via environment variables. all variables are prefixed
with `HYDRANT_` (except `RUST_LOG`). if a `.env` file exists in the working
directory, it will also be loaded automatically.
···

## REST api

+<small>[<- back to toc](#table-of-contents)</small>
+
+### event stream
+
+- `GET /stream`: subscribe to the event stream.
+  - query parameters:
+    - `cursor` (optional): start streaming from a specific event ID.
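
as a rough sketch, a client could build the `/stream` url like this. the base url `http://localhost:3000`, the helper name `stream_url`, and the assumption that event IDs are integers are all illustrative and not taken from hydrant itself:

```python
from typing import Optional
from urllib.parse import urlencode, urljoin

# hypothetical base url; point this at wherever your hydrant instance listens
BASE_URL = "http://localhost:3000"


def stream_url(cursor: Optional[int] = None, base: str = BASE_URL) -> str:
    """build the `GET /stream` url, adding `cursor` only when resuming."""
    url = urljoin(base, "/stream")
    if cursor is not None:
        # assumption: event ids are integers and are passed as a plain query value
        url += "?" + urlencode({"cursor": cursor})
    return url


fresh = stream_url()
resumed = stream_url(cursor=42)
```

omitting `cursor` leaves the query string off entirely, so the server decides where the stream starts.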
+
+### stats
+
+- `GET /stats`: get stats about the database:
+  - `counts`: counts of repos, records, events, and errors, etc.
+  - `sizes`: sizes of the database keyspaces on disk, in bytes.
+
### filter management
+
+<small>[<- back to toc](#table-of-contents)</small>

- `GET /filter`: get the current filter configuration.
- `PATCH /filter`: update the filter configuration.
···

### ingestion control

+<small>[<- back to toc](#table-of-contents)</small>
+
- `GET /ingestion`: get the current ingestion status.
  - returns `{ "crawler": bool, "firehose": bool, "backfill": bool }`.
- `PATCH /ingestion`: enable or disable ingestion components at runtime without
···
  finishes processing the current message). they resume immediately when
  re-enabled.
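
a minimal sketch of preparing a `PATCH /ingestion` request. it assumes the request body uses the same keys as the `GET /ingestion` response and that a partial object is accepted; neither is confirmed by the text above. the base url and the helper name `ingestion_patch` are hypothetical:

```python
import json
import urllib.request

# hypothetical base url; point this at wherever your hydrant instance listens
BASE_URL = "http://localhost:3000"

# component names taken from the `GET /ingestion` response shape
COMPONENTS = {"crawler", "firehose", "backfill"}


def ingestion_patch(base: str = BASE_URL, **components: bool) -> urllib.request.Request:
    """prepare (but do not send) a PATCH /ingestion request."""
    unknown = set(components) - COMPONENTS
    if unknown:
        raise ValueError(f"unknown ingestion components: {sorted(unknown)}")
    # assumption: the body mirrors the GET response keys, partial updates allowed
    body = json.dumps(components).encode("utf-8")
    return urllib.request.Request(
        f"{base}/ingestion",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


req = ingestion_patch(firehose=False)
# urllib.request.urlopen(req) would actually send it
```

rejecting unknown keys on the client side catches typos such as `firehouse` before they reach the server.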

-### crawler source management
+### crawler management
+
+<small>[<- back to toc](#table-of-contents)</small>

- `GET /crawler/sources`: list all currently active crawler sources.
  - returns a JSON array of `{ "url": string, "mode": "relay" | "by_collection", "persisted": bool }`.
···
  the source to restart from the beginning when re-added.
  - returns `200 OK` if the source was found and removed, `404 Not Found` otherwise.

-### firehose source management
+### firehose management
+
+<small>[<- back to toc](#table-of-contents)</small>

- `GET /firehose/sources`: list all currently active firehose relay sources.
  - returns a JSON array of `{ "url": string, "persisted": bool }`.
···
  the relay to restart from the beginning when re-added.
  - returns `200 OK` if the relay was found and removed, `404 Not Found` otherwise.

-### database operations
-
-- `POST /db/train`: train zstd compression dictionaries for the `repos`,
-  `blocks`, and `events` keyspaces. dictionaries are written to disk; a restart
-  is required to apply them. the crawler, firehose, and backfill worker are
-  paused for the duration and restored on completion.
-- `POST /db/compact`: trigger a full major compaction of all database keyspaces
-  in parallel. the crawler, firehose, and backfill worker are paused for the
-  duration and restored on completion.
-- `DELETE /cursors`: reset all stored cursors for a given URL. body: `{ "key": "..." }`
-  where key is a URL. clears both the firehose cursor and the relay crawler cursor,
-  as well as any by-collection cursors associated with that URL. causes the next
-  firehose connection and crawler pass to restart from the beginning.
-
### repository management

+<small>[<- back to toc](#table-of-contents)</small>
+
- `GET /repos`: get an NDJSON stream of repositories and their sync status. supports pagination and filtering:
  - `limit`: max results (default 100, max 1000)
  - `cursor`: opaque key for paginating.
···
- `PUT /repos`: explicitly track repositories. accepts an NDJSON body of `{"did": "..."}` (or JSON array of the same).
- `DELETE /repos`: untrack repositories. accepts an NDJSON body of `{"did": "..."}` (or JSON array of the same).
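
to illustrate the NDJSON shape used by the `/repos` endpoints, here is a small sketch that encodes a body for `PUT /repos` or `DELETE /repos` and decodes an NDJSON response line by line. the helper names and the placeholder DIDs are made up for the example, and the fields of each response object are not assumed:

```python
import json
from typing import Iterable, Iterator


def encode_ndjson(dids: Iterable[str]) -> bytes:
    """one `{"did": "..."}` object per line, newline-terminated."""
    return "".join(json.dumps({"did": did}) + "\n" for did in dids).encode("utf-8")


def decode_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """parse NDJSON one line at a time, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# placeholder DIDs, purely illustrative
body = encode_ndjson(["did:plc:example1", "did:plc:example2"])
roundtrip = list(decode_ndjson(body.splitlines()))
```

decoding per line means a client can start handling repositories before the whole `GET /repos` response has arrived.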

-### event stream
-
-- `GET /stream`: subscribe to the event stream.
-  - query parameters:
-    - `cursor` (optional): start streaming from a specific event ID.
-
-### stats
+### database operations

-- `GET /stats`: get stats about the database:
-  - `counts`: counts of repos, records, events, and errors, etc.
-  - `sizes`: sizes of the database keyspaces on disk, in bytes.
+- `POST /db/train`: train zstd compression dictionaries for the `repos`,
+  `blocks`, and `events` keyspaces. dictionaries are written to disk; a restart
+  is required to apply them. the crawler, firehose, and backfill worker are
+  paused for the duration and restored on completion.
+- `POST /db/compact`: trigger a full major compaction of all database keyspaces
+  in parallel. the crawler, firehose, and backfill worker are paused for the
+  duration and restored on completion.
+- `DELETE /cursors`: reset all stored cursors for a given URL. body: `{ "key": "..." }`
+  where key is a URL. clears both the firehose cursor and the relay crawler cursor,
+  as well as any by-collection cursors associated with that URL. causes the next
+  firehose connection and crawler pass to restart from the beginning.
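
a hedged sketch of preparing the `DELETE /cursors` request described above. only the `{ "key": "..." }` body comes from the text; the base url, the relay url, the JSON content type, and the helper name `reset_cursors_request` are assumptions:

```python
import json
import urllib.request

# hypothetical base url; point this at wherever your hydrant instance listens
BASE_URL = "http://localhost:3000"


def reset_cursors_request(source_url: str, base: str = BASE_URL) -> urllib.request.Request:
    """prepare (but do not send) a DELETE /cursors request for one source url."""
    body = json.dumps({"key": source_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/cursors",
        data=body,
        method="DELETE",
        # assumption: the server expects a JSON content type
        headers={"Content-Type": "application/json"},
    )


# placeholder relay url, purely illustrative
req = reset_cursors_request("wss://relay.example.com")
# urllib.request.urlopen(req) would actually send it
```

since this resets the firehose, relay crawler, and by-collection cursors for that url together, it is worth double-checking `source_url` before sending.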

## data access (xrpc)
+
+<small>[<- back to toc](#table-of-contents)</small>

`hydrant` implements the following XRPC endpoints under `/xrpc/`:

### com.atproto.*

+<small>[<- back to toc](#table-of-contents)</small>
+
+these are standard atproto endpoints. you can look at [the atproto api reference](https://docs.bsky.app/docs/category/http-reference) for more info.
+
the following are implemented currently:
- `com.atproto.repo.getRecord`
- `com.atproto.repo.listRecords`
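
as an example of how such a call is addressed, the sketch below builds a `com.atproto.repo.getRecord` url under `/xrpc/`. the `repo`, `collection`, and `rkey` query parameters are the standard atproto ones; the base url, the helper name `get_record_url`, and the sample values are hypothetical:

```python
from urllib.parse import urlencode

# hypothetical base url; point this at wherever your hydrant instance listens
BASE_URL = "http://localhost:3000"


def get_record_url(repo: str, collection: str, rkey: str, base: str = BASE_URL) -> str:
    """build the XRPC query url for com.atproto.repo.getRecord."""
    # urlencode percent-escapes characters such as ':' in DIDs
    query = urlencode({"repo": repo, "collection": collection, "rkey": rkey})
    return f"{base}/xrpc/com.atproto.repo.getRecord?{query}"


# placeholder identifiers, purely illustrative
url = get_record_url("did:plc:example", "app.bsky.feed.post", "3kabc")
```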

### systems.gaze.hydrant.*
+
+<small>[<- back to toc](#table-of-contents)</small>

these are some non-standard XRPCs that might be useful.

···
returns `{ count }`.

### blue.microcosm.links.*
+
+<small>[<- back to toc](#table-of-contents)</small>

hydrant implements a subset of [microcosm constellation](https://constellation.microcosm.blue/)
when it's built with the `backlinks` cargo feature (`cargo build --features backlinks`).