···
 - [ ] resync short-circuit: tiny repos may actually return their entire CAR for getRecord
 - [ ] commit CAR handling: generate a list of keys with gaps noted, to reliably detect missing adjacent keys
 - [ ] repo-stream: drop record block contents with processor fn
-
+- [ ] meta/metrics keyspace for general stats
+ - [ ] total repos (hyperloglog estimate?)
+ - [ ] resync queue size

 very much still todo but i'm getting tired
 - [x] config: add a `--heavy` mode that always uses `getRepo` and never `describeRepo`
 - [x] config: db mem limit `--fjall-cache-mb`
 - [x] config: per-host request rate self-throttling `--crawl-qps` (name from collectiondir)
 - [ ] resync: estimate CAR size from `getRecord` mst height; `getRepo` if it's likely very small
-- [ ] multi-relay subscriber
-- [ ] special did:web behaviour to keep reusing a stale resolution on failure
+- [ ] special did:web ident cache behaviour to keep reusing a stale resolution on failure
 - [ ] admin view of backfill state etc
 - [ ] vanity stats for optimizations, like how many in-flight repos were saved from resync due to high-water-mark firehose cursor persistence
 - [ ] if the upstream is a PDS (check with describeServer?) then make only accept events for DIDs that have it as their PDS
···
 - [ ] combine the throttled http client instance, the db, and the admin info into an appstate fineeeee
 - [ ] bad word filtering? (collectiondir has it)
 - [ ] check response headers and adjust self-throttling rate limits per-host if present
+- [ ] make backfill go _really fast_
+
+going to be annoying but doable
+- [ ] multi-relay subscriber


 ### special-casing
···
 ## some choices

 - tokio for async runtime: works good
-- jacquard almost everywhere: makes things *so much* easier
+- jacquard almost everywhere: works good
 - repo-stream for CAR processing
-- fjall: workload is write-heavy so LSM is a good fit, space efficiency also very desirable
+- fjall: workload is write-heavy so LSM works good, space efficiency also very nice


 ## resync: getting a repo's full collection list