very fast at protocol indexer with flexible filtering, xrpc queries, cursor-backed event stream, and more, built on fjall
rust
fjall
at-protocol
atproto
indexer
---
title: crawler management
---

## GET /crawler/sources

list all currently active crawler sources. returns a JSON array of `{ "url": string, "mode": "relay" | "by_collection", "persisted": bool }`.

`persisted: true` means the source was added via the API and is stored in the database; it will survive a restart. `persisted: false` means the source came from `CRAWLER_URLS` and is not written to the database.
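
as an illustration of the response shape, here is a minimal sketch that parses such an array and separates API-added sources from `CRAWLER_URLS` ones. the `CrawlerSource` class, the helper names, and the example URLs are all hypothetical; only the three field names come from the description above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlerSource:
    url: str
    mode: str        # "relay" or "by_collection"
    persisted: bool  # True: stored in the database; False: came from CRAWLER_URLS


def parse_sources(body: str) -> list[CrawlerSource]:
    """Parse the JSON array returned by GET /crawler/sources."""
    return [
        CrawlerSource(url=item["url"], mode=item["mode"], persisted=item["persisted"])
        for item in json.loads(body)
    ]


def env_sources(sources: list[CrawlerSource]) -> list[CrawlerSource]:
    """Sources that came from CRAWLER_URLS and are not written to the database."""
    return [s for s in sources if not s.persisted]


# hypothetical response body, for illustration only
SAMPLE_BODY = """[
  {"url": "https://relay.example.com", "mode": "relay", "persisted": true},
  {"url": "https://collections.example.com", "mode": "by_collection", "persisted": false}
]"""
```

calling `env_sources(parse_sources(SAMPLE_BODY))` returns only the second entry, since it is the one with `persisted: false`.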

## POST /crawler/sources

add a crawler source at runtime.

| field | description |
| :--- | :--- |
| `url` | URL of the crawler source |
| `mode` | `"relay"` or `"by_collection"` |

the source is written to the database before the producer task is started, so it is safe to add sources and then immediately restart without losing them.

if a source with the same URL already exists (whether from `CRAWLER_URLS` or a previous `POST`), it is replaced: the running task is stopped and a new one is started with the new mode. any cursor state for that URL is preserved.

returns `201 Created` on success.
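
a minimal client sketch for this endpoint, using only the python standard library. two things here are assumptions rather than documented behaviour: `BASE_URL` is a placeholder for wherever the API listens, and the fields are sent as a JSON request body (this page does not say whether the fields travel in the body or the query string).

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: placeholder address of the API


def build_add_source(url: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) a POST /crawler/sources request."""
    if mode not in ("relay", "by_collection"):
        raise ValueError(f"unknown mode: {mode!r}")
    payload = json.dumps({"url": url, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/crawler/sources",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumption: JSON body
    )


def add_source(url: str, mode: str) -> bool:
    """Send the request; True if the server answered 201 Created."""
    with urllib.request.urlopen(build_add_source(url, mode)) as resp:
        return resp.status == 201
```

because an existing source with the same URL is replaced, calling `add_source` again with a different `mode` should be enough to switch modes while keeping the cursor state.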

## DELETE /crawler/sources

remove a crawler source at runtime.

| field | description |
| :--- | :--- |
| `url` | URL of the source to remove |

the producer task is stopped immediately.

if the source was added via the API (`persisted: true`), it is removed from the database and will not reappear on restart. if it came from `CRAWLER_URLS` (`persisted: false`), only the running task is stopped; the source will reappear on the next restart since `CRAWLER_URLS` is re-applied at startup.

cursor state is not cleared. use `DELETE /crawler/cursors` separately if you want the source to restart from the beginning when re-added.

returns `200 OK` if the source was found and removed, `404 Not Found` otherwise.
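
the two documented outcomes can be handled as in the sketch below. as before, `BASE_URL` and the JSON request body are assumptions; if the server expects `url` as a query parameter instead, only `build_remove_source` would need to change. the helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: placeholder address of the API


def build_remove_source(url: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /crawler/sources request."""
    return urllib.request.Request(
        f"{BASE_URL}/crawler/sources",
        data=json.dumps({"url": url}).encode("utf-8"),  # assumption: JSON body
        method="DELETE",
        headers={"Content-Type": "application/json"},
    )


def describe_removal(status: int) -> str:
    """Map the documented status codes to a short outcome string."""
    if status == 200:
        return "removed"
    if status == 404:
        return "not found"
    return f"unexpected status {status}"


def remove_source(url: str) -> str:
    """Send the request and report the outcome."""
    try:
        with urllib.request.urlopen(build_remove_source(url)) as resp:
            return describe_removal(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses, including 404
        return describe_removal(err.code)
```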

## DELETE /crawler/cursors

reset stored cursors for a given crawler URL.

| field | description |
| :--- | :--- |
| `key` | URL of the crawler source to reset |

clears the list-repos crawler cursor as well as any by-collection cursors associated with that URL, so the next crawler pass for that URL restarts from the beginning.
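
putting the endpoints together, one way to make a source start over is to remove it, reset its cursors, and add it again. the sketch below only builds the three requests in that order and sends nothing; `BASE_URL`, the JSON request bodies, and the helper names are assumptions, not part of the documented API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: placeholder address of the API


def build_request(method: str, path: str, payload: dict) -> urllib.request.Request:
    """Build a request with a JSON body (assumption: fields are sent as JSON)."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def plan_fresh_restart(url: str, mode: str) -> list[urllib.request.Request]:
    """Requests that remove a source, reset its cursors, then re-add it."""
    return [
        build_request("DELETE", "/crawler/sources", {"url": url}),
        # note: the cursor endpoint takes the source URL under `key`, not `url`
        build_request("DELETE", "/crawler/cursors", {"key": url}),
        build_request("POST", "/crawler/sources", {"url": url, "mode": mode}),
    ]
```

each request in the returned list can be passed to `urllib.request.urlopen` in order. stopping the source before resetting its cursors follows the note under `DELETE /crawler/sources`; whether cursors can be safely reset while the task is still running is not stated here.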