assessing rocksdb's space efficiency

1. grab a lot of car files (i downloaded all of morel)

2. load them into rocks. i did 33% sampling, for ~1% of the total network or so. (probably even better to sample when grabbing the cars)

```bash
RUST_LOG=info cargo run --release -- \
  --db-dir /path/to/new/rocks/db \
  --sample 0.33 \
  /path/to/cars
```

3. we can skip running `sample.rs` since we already sampled at step 2

4. get stats (may as well do this before the sweep since it doesn't take too long)

```bash
cargo run --release --example stats -- \
  /path/to/new/rocks/db > stats.json
```

and then copy the stats over to the html file.

5. do the sweeps (takes forever)

```bash
cargo run --release --example sweep -- \
  /path/to/new/rocks/db --output sweep-results.csv
```

sweep reads the existing output file and skips any config that's already in it. delete a config's line if you actually do want to rerun it.

copy the csv into the html.
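
a rough sketch of how that skip-if-already-done behaviour could work, in case it helps when editing the csv by hand. this is not the actual `sweep` code: the column layout (header row, config name in the first column) and the config names are made up for illustration.

```rust
use std::collections::HashSet;

/// Collect the config names that already have a row in the results csv.
/// ASSUMPTION: a header row, and the config name in the first column.
fn completed_configs(csv_text: &str) -> HashSet<String> {
    csv_text
        .lines()
        .skip(1) // skip the (assumed) header row
        .filter_map(|line| line.split(',').next())
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect()
}

/// Keep only the configs that have no row in the results yet.
fn pending<'a>(all: &[&'a str], done: &HashSet<String>) -> Vec<&'a str> {
    all.iter().copied().filter(|c| !done.contains(*c)).collect()
}

fn main() {
    // hypothetical results file contents; the real columns may differ
    let results = "config,size_bytes,seconds\nnone,1000,1.5\nzstd-l2,400,9.0\n";
    let done = completed_configs(results);
    let todo = pending(&["none", "zstd-l2", "lz4-all"], &done);
    println!("{:?}", todo); // prints ["lz4-all"]
}
```

under these assumptions, deleting the `zstd-l2` row puts it back in the pending list on the next run.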

#### notes

- take all timing measurements with a heap of salt: this is not rigorous benchmarking, just first-order vibes
  - only one timing measurement is taken for each config
  - i'm not running in any kind of controlled environment (doing other things while it goes)
  - my disk doesn't have a lot of free space, which probably messes with the timings unpredictably
  - each run has to work with the previous run's config when recompacting! most of the work is on the write side, but the runs aren't fully independent.
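
if the single-measurement problem ever matters, one cheap improvement would be to time each config a few times and report the median. a minimal sketch of that idea follows; the helper names are made up, the workload is a stand-in, and none of this exists in the sweep today. it also does nothing about the runs not being independent.

```rust
use std::time::{Duration, Instant};

/// Run `work` `runs` times and return each wall-clock duration.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Median of the samples (upper middle for even counts), None if empty.
fn median(mut samples: Vec<Duration>) -> Option<Duration> {
    samples.sort();
    samples.get(samples.len() / 2).copied()
}

fn main() {
    // stand-in workload; a real run would recompact the db with one config
    let samples = time_runs(5, || {
        let total: u64 = (0..1_000_000u64).sum();
        std::hint::black_box(total);
    });
    println!("median: {:?}", median(samples));
}
```

the median is less sensitive than the mean to a single slow run caused by something else hitting the disk.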


--------

old stuff:

### first full run: zstd for layers 2+