A public mirror for the whole atmosphere hubble.microcosm.blue

space efficiency check#

assessing rocksdb's space efficiency

  1. grab a lot of car files (i downloaded all of morel)

  2. load them into rocks, i did 33% sampling for ~1% total network or something. (probably even better to sample when grabbing cars)

RUST_LOG=info cargo run --release -- \
  --db-dir /path/to/new/rocks/db \
  --sample 0.33 \
  /path/to/cars

  3. we can skip running sample.rs since we already sampled while loading in step 2

  4. get stats (may as well do this before sweep since it doesn't take too long)

cargo run --release --example stats -- \
  /path/to/new/rocks/db > stats.json

and then copy them over to the html file.

  5. do the sweeps (takes forever)

cargo run --release --example sweep -- \
  /path/to/new/rocks/db --output sweep-results.csv

sweep reads the output file first so it can skip configs that already have results. delete a config's line if you actually do want to rerun it.

copy the csv into the html.
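a minimal sketch of the skip-what's-done idea, in case it helps to see it spelled out. this is not the actual sweep example: the `SweepConfig` fields, the label format, and the csv layout (header row, config label in the first column) are all assumptions made up for illustration.

```rust
use std::collections::HashSet;

/// One sweep configuration. Hypothetical fields; the real sweep may track more.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SweepConfig {
    pub block_kib: u32,
    pub dict_kib: u32,
    pub zstd_level: i32,
}

impl SweepConfig {
    /// Stable label, assumed to be the first CSV column. Contains no commas.
    pub fn label(&self) -> String {
        format!(
            "block={}k dict={}k zstd={}",
            self.block_kib, self.dict_kib, self.zstd_level
        )
    }
}

/// Build the full grid as the cartesian product of the given settings.
pub fn grid(blocks: &[u32], dicts: &[u32], levels: &[i32]) -> Vec<SweepConfig> {
    let mut out = Vec::new();
    for &block_kib in blocks {
        for &dict_kib in dicts {
            for &zstd_level in levels {
                out.push(SweepConfig { block_kib, dict_kib, zstd_level });
            }
        }
    }
    out
}

/// Labels of configs that already have a row in the CSV text.
/// Assumes one header row, then one row per finished config.
pub fn completed_labels(csv: &str) -> HashSet<String> {
    csv.lines()
        .skip(1) // header row
        .filter_map(|line| line.split(',').next())
        .map(|label| label.trim().to_string())
        .filter(|label| !label.is_empty())
        .collect()
}

/// Configs from the grid that still need to run.
pub fn pending(grid: &[SweepConfig], csv: &str) -> Vec<SweepConfig> {
    let done = completed_labels(csv);
    grid.iter()
        .filter(|cfg| !done.contains(&cfg.label()))
        .cloned()
        .collect()
}
```

with this shape, deleting a row from the csv is enough to put that config back into `pending(...)` on the next run.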

notes#

  • take all timing measurements with a heap of salt: this is not rigorous benchmarking, just first-order vibes
    • only one timing measurement is taken for each config
    • i'm not running on any kind of controlled environment (doing other things while it goes)
    • my disk doesn't have a lot of free space, which probably messes with it unpredictably
    • each run has to work with the previous run's config when recompacting! most of the work is on the write side, but the runs aren't fully independent.

old stuff:

first full run: zstd for layers 2+#

format <did>/<repo path> => <cbor>

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
progress repos=490000 records=665231031 failed=366
workers finished elapsed=1457.296831042s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=69084296913 mib=65883
running manual compaction...
db size after compaction bytes=66225638581 mib=63157 compact_elapsed=751.790672875s
db size / car bytes ratio="0.269"
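the ratio at the end of that log is just post-compaction db bytes divided by total car bytes. here's a small std-only sketch for pulling `key=value` fields out of log lines like these and recomputing it. the function names are mine, not from the repo.

```rust
use std::collections::HashMap;

/// Parse whitespace-separated `key=value` tokens from one log line.
/// Tokens without `=` (the message text) are ignored; quotes are stripped.
pub fn fields(line: &str) -> HashMap<&str, &str> {
    line.split_whitespace()
        .filter_map(|tok| tok.split_once('='))
        .map(|(k, v)| (k, v.trim_matches('"')))
        .collect()
}

/// Read one field as an integer, if present and numeric.
pub fn u64_field(line: &str, key: &str) -> Option<u64> {
    fields(line).get(key)?.parse().ok()
}

/// db size / car bytes, the ratio printed at the end of each run.
pub fn size_ratio(db_bytes: u64, car_bytes: u64) -> f64 {
    db_bytes as f64 / car_bytes as f64
}

/// The inverse view: how many times smaller the db is than the raw cars.
pub fn compression_factor(db_bytes: u64, car_bytes: u64) -> f64 {
    car_bytes as f64 / db_bytes as f64
}
```

for the run above, `size_ratio(66225638581, 246501100813)` formats to `0.269` with three decimals, which is roughly 3.7x smaller than the raw car files.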

manually compact with bottommost zstd 6#

mb after: 62186 (65207279363) or another 1.5% only

manually compact with bottommost zstd 9#

mb after: 62173 (65193700507) or barely any better (30m to manually compact)

larger block size 4 -> 64KB (zstd 3)#

mb after: 50576 (53033071353) or 20ish% hey! (4.6x vs raw car) 521s

zstd dictionary (16k) (64KB block zstd 3)#

mb after: 50647 (53108254067) or slightly worse actually (4.6x vs raw car) 860s

zstd optimize for hits (64KB block zstd 3 no dict)#

mb after: 50576 (53033105629) or basically nothin more 429s

zstd key restart interval 64 (optimize hits, 64KB block, zstd3 no dict)#

mb after: 50237 (52677638685) (4.7x) 425s

zstd 32k dict w#

50237 52677524253 455s

128m sst#

just 1mb better, eeh 494s

more attempts at dict settings#

50236 497.587948917s

256KB blocks#

mb after: 49151 (51539349907) (4.8x)

rerunning from cars with new config#

  • no compression of L0
  • lz4 for L1
  • zstd below

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=1131.757322333s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=76787876531 mib=73230
running manual compaction...
db size after compaction bytes=51539292437 mib=49151 compact_elapsed=470.095283125s
db size / car bytes ratio="0.209"

  • 22% faster load
  • 11% bigger before compaction
  • 38% faster compaction
  • same size post-compaction

to try:

  • lz4 for L0
  • snappy for L1
  • zstd below

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=769.148841542s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=72627779170 mib=69263
running manual compaction...
db size after compaction bytes=51539427006 mib=49151 compact_elapsed=440.631236042s
db size / car bytes ratio="0.209"

  • 48% faster!
  • 5% bigger before compaction
  • 41% faster compaction
  • same size post

try:

  • snappy L0
  • zstd 3 below
  • zstd 6 bottom

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=706.999255833s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=74153258877 mib=70718
running manual compaction...
db size after compaction bytes=49509919781 mib=47216 compact_elapsed=736.388699625s
db size / car bytes ratio="0.201"

  • 51% faster
  • 7% bigger before compaction
  • 2% faster compaction
  • 4% smaller post! (allllmost 5x)

zstd 4 above bottom (snappy L0):

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=796.088198709s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=63662316943 mib=60713
running manual compaction...
db size after compaction bytes=49511141686 mib=47217 compact_elapsed=771.272409958s
db size / car bytes ratio="0.201"

  • 45% faster
  • 8% smaller before compaction
  • 2% slower compaction
  • 4% smaller post (allllllmost 5x still)

try: no zstd dictionaries at all

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=831.602000625s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=77854990237 mib=74248
running manual compaction...
db size after compaction bytes=49510481839 mib=47216 compact_elapsed=628.108967125s
db size / car bytes ratio="0.201"

  • 43% faster
  • 13% bigger
  • 16% faster manual compaction
  • 4% smaller post

==== so dictionaries are actually not worth it (at least for records) ====

try:

  • lz4 L0 then zstd 3 (default options, no bottommost setting)
  • optimize_level_style_compaction(2GB) replaces write_buffer_size, max_write_buffer_number

--

turn off unordered write:

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=1461.684272667s
import complete repos=490327 empty=19172 records=663121972 failed=367
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=85902581906 mib=81923
running manual compaction...
db size after compaction bytes=78751866099 mib=75103 compact_elapsed=340.036456667s
db size / car bytes ratio="0.319"

  • same speed
  • 25% bigger
  • 55% faster compact
  • 19% bigger post wtf

same but dictionary again:

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
progress repos=490000 records=665287800 failed=366
workers finished elapsed=1510.41171175s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=96642919507 mib=92165
running manual compaction...
db size after compaction bytes=78766420371 mib=75117 compact_elapsed=328.5108135s
db size / car bytes ratio="0.320"

what.

zstd 4 again (no dict):

2026-04-08T23:39:58.226569Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-08T23:40:01.813198Z INFO space_efficiency_check::work: progress repos=490000 records=665341081 failed=366
2026-04-08T23:40:02.553634Z INFO space_efficiency_check::work: workers finished elapsed=1444.968804209s
2026-04-08T23:40:02.553657Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-08T23:40:02.553660Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-08T23:40:02.553700Z INFO space_efficiency_check: db size before compaction bytes=87279007693 mib=83235
2026-04-08T23:40:02.553702Z INFO space_efficiency_check: running manual compaction...
2026-04-08T23:45:32.431196Z INFO space_efficiency_check: db size after compaction bytes=78823738786 mib=75172 compact_elapsed=329.875497584s
2026-04-08T23:45:32.442654Z INFO space_efficiency_check: db size / car bytes ratio="0.320"

zstd default including bottom (lz4 L0):

2026-04-09T01:16:41.254396Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T01:16:41.460409Z INFO space_efficiency_check::work: workers finished elapsed=1345.288276167s
2026-04-09T01:16:41.462956Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T01:16:41.462967Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T01:16:41.463007Z INFO space_efficiency_check: db size before compaction bytes=65930950758 mib=62876
2026-04-09T01:16:41.463010Z INFO space_efficiency_check: running manual compaction...
2026-04-09T01:22:07.400826Z INFO space_efficiency_check: db size after compaction bytes=51537661824 mib=49150 compact_elapsed=325.940944625s
2026-04-09T01:22:07.408148Z INFO space_efficiency_check: db size / car bytes ratio="0.209"

ok now, 256k => 64k block size:

2026-04-09T01:46:26.893956Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T01:46:31.098925Z INFO space_efficiency_check::work: progress repos=490000 records=665025090 failed=367
2026-04-09T01:46:33.994992Z INFO space_efficiency_check::work: workers finished elapsed=1355.049800208s
2026-04-09T01:46:33.997804Z INFO space_efficiency_check: import complete repos=490327 empty=19172 records=665612139 failed=367
2026-04-09T01:46:33.997811Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T01:46:33.997858Z INFO space_efficiency_check: db size before compaction bytes=75196160153 mib=71712
2026-04-09T01:46:33.997860Z INFO space_efficiency_check: running manual compaction...
2026-04-09T01:52:06.451996Z INFO space_efficiency_check: db size after compaction bytes=52672889189 mib=50232 compact_elapsed=332.454359958s
2026-04-09T01:52:06.466039Z INFO space_efficiency_check: db size / car bytes ratio="0.214"

128K block

2026-04-09T02:45:21.213202Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T02:45:22.317758Z INFO space_efficiency_check::work: progress repos=490000 records=665288131 failed=366
2026-04-09T02:45:22.885135Z INFO space_efficiency_check::work: workers finished elapsed=1323.177367625s
2026-04-09T02:45:22.885177Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T02:45:22.885186Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T02:45:22.885234Z INFO space_efficiency_check: db size before compaction bytes=80563021019 mib=76830
2026-04-09T02:45:22.885242Z INFO space_efficiency_check: running manual compaction...
2026-04-09T02:50:39.873626Z INFO space_efficiency_check: db size after compaction bytes=51708170815 mib=49312 compact_elapsed=316.988863208s
2026-04-09T02:50:39.885508Z INFO space_efficiency_check: db size / car bytes ratio="0.210"

16K block

2026-04-09T03:15:58.493554Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T03:15:58.540900Z INFO space_efficiency_check::work: workers finished elapsed=1398.781832625s
2026-04-09T03:15:58.545808Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T03:15:58.545820Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T03:15:58.545865Z INFO space_efficiency_check: db size before compaction bytes=86378449887 mib=82376
2026-04-09T03:15:58.545866Z INFO space_efficiency_check: running manual compaction...
2026-04-09T03:21:26.082768Z INFO space_efficiency_check: db size after compaction bytes=56222871034 mib=53618 compact_elapsed=327.471128792s
2026-04-09T03:21:26.136856Z INFO space_efficiency_check: db size / car bytes ratio="0.228"

8K block

2026-04-09T03:46:15.381506Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T03:46:15.520773Z INFO space_efficiency_check::work: workers finished elapsed=1367.07904175s
2026-04-09T03:46:15.527295Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T03:46:15.527307Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T03:46:15.527348Z INFO space_efficiency_check: db size before compaction bytes=78343679632 mib=74714
2026-04-09T03:46:15.527350Z INFO space_efficiency_check: running manual compaction...
2026-04-09T03:53:05.373778Z INFO space_efficiency_check: db size after compaction bytes=59702615786 mib=56936 compact_elapsed=409.843443917s
2026-04-09T03:53:05.457495Z INFO space_efficiency_check: db size / car bytes ratio="0.242"

8k block + 16k dictionary (256k training):

2026-04-09T04:29:27.446190Z INFO space_efficiency_check::work: enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
2026-04-09T04:29:27.613743Z INFO space_efficiency_check::work: workers finished elapsed=1496.465470458s
2026-04-09T04:29:27.616045Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T04:29:27.616057Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T04:29:27.616111Z INFO space_efficiency_check: db size before compaction bytes=88278600013 mib=84189
2026-04-09T04:29:27.616113Z INFO space_efficiency_check: running manual compaction...
2026-04-09T04:38:28.940272Z INFO space_efficiency_check: db size after compaction bytes=57737984223 mib=55063 compact_elapsed=541.320268792s
2026-04-09T04:38:29.004843Z INFO space_efficiency_check: db size / car bytes ratio="0.234"

16k block + 16k dict:

db size after compaction bytes=55663485496 mib=53084 compact_elapsed=204.063369459s

32k block + 16k dict:

db size after compaction bytes=53999703450 mib=51498 compact_elapsed=143.929752417s

4k block + 16k dict:

db size after compaction bytes=61098626429 mib=58268 compact_elapsed=136.135160458s

4k block + 0 dict:

db size after compaction bytes=65956690117 mib=62901 compact_elapsed=136.727676042s

32k block + 0 dict:

db size after compaction bytes=54041334794 mib=51537 compact_elapsed=98.191982917s

32k block + 0 dict, z6:

db size after compaction bytes=53352328748 mib=50880 compact_elapsed=162.159684083s

32k block + 16k dict, z6:

db size after compaction bytes=53021812545 mib=50565 compact_elapsed=203.065514958s

8k block + 16k dict, z6:

db size after compaction bytes=57114197996 mib=54468 compact_elapsed=220.894172333s

8k block + 0 dict, z6:

db size after compaction bytes=58704811352 mib=55985 compact_elapsed=177.26946475s

4k block + 0 dict, z6:

db size after compaction bytes=64943379652 mib=61934 compact_elapsed=170.830192667s

4k block + 16k dict, z6:

db size after compaction bytes=60537637714 mib=57733 compact_elapsed=242.577487375s


4k block + 16k dict, z3, restart 8:

db size after compaction bytes=61942405450 mib=59072 compact_elapsed=184.055436916s

4k block + 16k dict, z3, restart 16:

db size after compaction bytes=61527257614 mib=58676 compact_elapsed=181.6921695s

4k block + 16k dict, z3, restart 32:

db size after compaction bytes=61070969206 mib=58241 compact_elapsed=179.042228292s

// 64 we already have

4k block + 16k dict, z3, restart 128:

db size after compaction bytes=61079509127 mib=58249 compact_elapsed=180.653354292s

4k block + 0 dict, z3, restart 8:

db size after compaction bytes=66798905955 mib=63704 compact_elapsed=140.909784041s

32k block + 0 dict, z3, restart 8:

db size after compaction bytes=54771771953 mib=52234 compact_elapsed=102.226233584s

db size after compaction bytes=54765640022 mib=52228 compact_elapsed=130.654591209s

--=

4k block + 0 dict z3 restart 8:

db size after compaction bytes=54771844999 mib=52234 compact_elapsed=77.579734459s

enumerated car directory files=490694 car_bytes=246501100813 car_mib=235081
workers finished elapsed=1366.711822333s
import complete repos=490328 empty=19172 records=665928371 failed=366
total input size car_bytes=246501100813 car_mib=235081
db size before compaction bytes=69391838577 mib=66177
running manual compaction...
db size after compaction bytes=54771946443 mib=52234 compact_elapsed=292.145749542s
db size / car bytes ratio="0.222"

4k block + 8 dict:

4k block + 16 dict:

db size after compaction bytes=54771869920 mib=52234 compact_elapsed=83.646350833s

4k block + 32 dict:

db size after compaction bytes=54771873968 mib=52234 compact_elapsed=82.5698695s

4k block + 64 dict:

db size after compaction bytes=54771906488 mib=52234 compact_elapsed=83.8019415s

4k block + 128 dict:

db size after compaction bytes=54771932651 mib=52234 compact_elapsed=80.826865292s

4k block + 256 dict:

db size after compaction bytes=54771899715 mib=52234 compact_elapsed=81.088142667s

4k block + 512 dict:

db size after compaction bytes=54771905067 mib=52234 compact_elapsed=82.468925208s

4k block + 1k dict:

db size after compaction bytes=54771892224 mib=52234 compact_elapsed=85.311880917s

4k block + 2k dict:

db size after compaction bytes=54771897772 mib=52234 compact_elapsed=81.808258125s

4k block + 4k dict:

db size after compaction bytes=54771899652 mib=52234 compact_elapsed=81.196579125s

4k block + 8k dict:

db size after compaction bytes=54771894044 mib=52234 compact_elapsed=81.155235041s

4k block + 16k dict:

2026-04-09T14:40:02.228661Z INFO space_efficiency_check::work: workers finished elapsed=1516.877558667s
2026-04-09T14:40:02.260445Z INFO space_efficiency_check: import complete repos=490328 empty=19172 records=665928371 failed=366
2026-04-09T14:40:02.260455Z INFO space_efficiency_check: total input size car_bytes=246501100813 car_mib=235081
2026-04-09T14:40:02.260503Z INFO space_efficiency_check: db size before compaction bytes=83032996404 mib=79186
2026-04-09T14:40:02.260506Z INFO space_efficiency_check: running manual compaction...
2026-04-09T14:46:33.171340Z INFO space_efficiency_check: db size after compaction bytes=54739156417 mib=52203 compact_elapsed=390.910325s
2026-04-09T14:46:33.233832Z INFO space_efficiency_check: db size / car bytes ratio="0.222"

4k block + 16K dict (again, fresh run again):

db size after compaction bytes=54750108385 mib=52213 compact_elapsed=397.067736125s

after fixing compression:

db size after compaction bytes=54699871870 mib=52165 compact_elapsed=128.5052185s

and with no dictionary:

db size after compaction bytes=54772637044 mib=52235 compact_elapsed=85.942000375s

and then maybe:

  • snappy L0 L1
  • zstd below

and:

  • snappy L0
  • zstd below