···

## REST api

-### management
+### filter management

- `GET /filter`: get the current filter configuration.
- `PATCH /filter`: update the filter configuration.

-#### ingestion control
+#### filter mode
+
+the `mode` field controls what gets indexed:
+
+| mode | behaviour |
+| :--- | :--- |
+| `filter` | auto-discovers and backfills any account whose firehose commit touches a collection matching one of the `signals` patterns. you can also explicitly track individual repositories via the `/repos` endpoint regardless of matching signals. |
+| `full` | index the entire network. `signals` are ignored for discovery, but `excludes` and `collections` still apply. |
+
+#### fields
+
+| field | type | description |
+| :--- | :--- | :--- |
+| `mode` | `"filter"` \| `"full"` | indexing mode (see above). |
+| `signals` | set update | NSID patterns (e.g. `app.bsky.feed.post` or `app.bsky.*`) that trigger auto-discovery in `filter` mode. |
+| `collections` | set update | NSID patterns used to filter which records are stored. if empty, all collections are stored. applies in all modes. |
+| `excludes` | set update | set of DIDs to always skip, regardless of mode. checked before any other filter logic. |
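
as a rough illustration of the fields table, the sketch below assembles a JSON body for `PATCH /filter`. it assumes the endpoint takes a JSON object keyed by these field names and leaves omitted fields unchanged; neither is confirmed by the text above, and `build_filter_patch` is a hypothetical helper, not part of the project.

```python
import json

VALID_MODES = ("filter", "full")


def build_filter_patch(mode=None, signals=None, collections=None, excludes=None):
    """Assemble a PATCH /filter body (hypothetical helper).

    Assumption: fields left as None are omitted from the body and the
    server keeps their current value.
    """
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    fields = {
        "mode": mode,
        "signals": signals,
        "collections": collections,
        "excludes": excludes,
    }
    # Drop anything the caller did not set.
    return {name: value for name, value in fields.items() if value is not None}


# Switch to filter mode, replace the signals set, and add one excluded DID.
body = build_filter_patch(
    mode="filter",
    signals=["app.bsky.feed.*"],
    excludes={"did:plc:abc": True},
)
print(json.dumps(body))
```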
+
+#### set updates
+
+each set field accepts one of two forms:
+
+- **replace**: an array replaces the entire set, eg. `["did:plc:abc", "did:web:example.org"]`
+- **patch**: an object maps items to `true` (add) or `false` (remove), eg. `{"did:plc:abc": true, "did:web:example.org": false}`
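
the two forms can be sketched as a small function over an in-memory set. this is only a model of the described semantics, not the server's implementation; `apply_set_update` is a hypothetical name, and treating removal of a missing item as a no-op is an assumption.

```python
def apply_set_update(current, update):
    """Return the set that results from applying a set update (sketch).

    - list: replace form, the result is exactly the listed items
    - dict: patch form, True adds an item and False removes it
    """
    if isinstance(update, list):
        return set(update)
    if isinstance(update, dict):
        result = set(current)
        for item, keep in update.items():
            if keep:
                result.add(item)
            else:
                # Assumption: removing an item that is absent is not an error.
                result.discard(item)
        return result
    raise TypeError("a set update must be a list (replace) or a dict (patch)")


excludes = {"did:web:example.org"}

# Patch form: add one DID and remove another.
patched = apply_set_update(
    excludes, {"did:plc:abc": True, "did:web:example.org": False}
)

# Replace form: the previous contents are discarded.
replaced = apply_set_update(excludes, ["did:plc:abc", "did:web:example.org"])
```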
+
+#### NSID patterns
+
+`signals` and `collections` support an optional `.*` suffix to match an entire namespace:
+
+- `app.bsky.feed.post`: exact match only
+- `app.bsky.feed.*`: matches any collection under `app.bsky.feed`
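
a minimal sketch of the matching rule, under two assumptions the text does not settle: a `.*` pattern also matches more deeply nested names, and it does not match the bare namespace itself. `nsid_matches` and `is_stored` are hypothetical names. `is_stored` models the `collections` rule from the fields table, where an empty set stores everything.

```python
def nsid_matches(pattern, nsid):
    """Check one NSID against one pattern (sketch of the rule above)."""
    if pattern.endswith(".*"):
        # Keep the trailing dot so "app.bsky.feed.*" does not match
        # a sibling namespace such as "app.bsky.feedx.post".
        prefix = pattern[:-1]
        return nsid.startswith(prefix)
    return nsid == pattern


def is_stored(collections, nsid):
    """Apply the `collections` rule: an empty set stores every collection."""
    if not collections:
        return True
    return any(nsid_matches(pattern, nsid) for pattern in collections)


print(nsid_matches("app.bsky.feed.post", "app.bsky.feed.post"))
print(nsid_matches("app.bsky.feed.*", "app.bsky.feed.like"))
print(nsid_matches("app.bsky.feed.*", "app.bsky.graph.follow"))
print(is_stored(set(), "app.bsky.graph.follow"))
```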
+
+### ingestion control

- `GET /ingestion`: get the current ingestion status.
 - returns `{ "crawler": bool, "firehose": bool, "backfill": bool }`.
···
 finishes processing the current message). they resume immediately when
 re-enabled.

-#### crawler source management
+### crawler source management

- `GET /crawler/sources`: list all currently active crawler sources.
 - returns a JSON array of `{ "url": string, "mode": "relay" | "by_collection", "persisted": bool }`.
···
 the source to restart from the beginning when re-added.
 - returns `200 OK` if the source was found and removed, `404 Not Found` otherwise.

-#### firehose source management
+### firehose source management

- `GET /firehose/sources`: list all currently active firehose relay sources.
 - returns a JSON array of `{ "url": string, "persisted": bool }`.
···
 the relay to restart from the beginning when re-added.
 - returns `200 OK` if the relay was found and removed, `404 Not Found` otherwise.

-#### database operations
+### database operations

- `POST /db/train`: train zstd compression dictionaries for the `repos`,
 `blocks`, and `events` keyspaces. dictionaries are written to disk; a restart
···
 where key is a URL. clears both the firehose cursor and the relay crawler cursor,
 as well as any by-collection cursors associated with that URL. causes the next
 firehose connection and crawler pass to restart from the beginning.
-
-#### filter mode
-
-the `mode` field controls what gets indexed:
-
-| mode | behaviour |
-| :--- | :--- |
-| `filter` | auto-discovers and backfills any account whose firehose commit touches a collection matching one of the `signals` patterns. you can also explicitly track individual repositories via the `/repos` endpoint regardless of matching signals. |
-| `full` | index the entire network. `signals` are ignored for discovery, but `excludes` and `collections` still apply. |
-
-#### fields
-
-| field | type | description |
-| :--- | :--- | :--- |
-| `mode` | `"filter"` \| `"full"` | indexing mode (see above). |
-| `signals` | set update | NSID patterns (e.g. `app.bsky.feed.post` or `app.bsky.*`) that trigger auto-discovery in `filter` mode. |
-| `collections` | set update | NSID patterns used to filter which records are stored. if empty, all collections are stored. applies in all modes. |
-| `excludes` | set update | set of DIDs to always skip, regardless of mode. checked before any other filter logic. |
-
-#### set updates
-
-each set field accepts one of two forms:
-
-- **replace**: an array replaces the entire set, eg. `["did:plc:abc", "did:web:example.org"]`
-- **patch**: an object maps items to `true` (add) or `false` (remove), eg. `{"did:plc:abc": true, "did:web:example.org": false}`
-
-#### NSID patterns
-
-`signals` and `collections` support an optional `.*` suffix to match an entire namespace:
-
-- `app.bsky.feed.post`: exact match only
-- `app.bsky.feed.*`: matches any collection under `app.bsky.feed`

### repository management
