···
 # repo-stream

-Fast and (aspirationally) robust atproto CAR file processing in rust
+Efficient and robust atproto CAR file processing in rust
+
+todo
+
+- [ ] get an *empty* car for the test suite
+- [ ] implement a max size on disk limit
+
+
+-----
+
+older stuff (to clean up):


 current car processing times (records processed into their length usize, phil's dev machine):
···
  -> yeah the commit is returned from init
 - [ ] spec compliance todos
  - [x] assert that keys are ordered and fail if not
- - [ ] verify node mst depth from key (possibly pending [interop test fixes](https://github.com/bluesky-social/atproto-interop-tests/issues/5))
+ - [x] verify node mst depth from key (possibly pending [interop test fixes](https://github.com/bluesky-social/atproto-interop-tests/issues/5))
 - [ ] performance todos
  - [x] consume the serialized nodes into a mutable efficient format
  - [ ] maybe customize the deserialize impl to do that directly?
+6
src/process.rs
···
 approximate total off-stack size of the type. (the on-stack size will be added
 automatically via `std::mem::get_size`).

+Note that it is **not guaranteed** that the `process` function will run on a
+block before storing it in memory or on disk: it's not possible to know if a
+block is a record without actually walking the MST, so the best we can do is
+apply `process` to any block that we know *cannot* be an MST node, and otherwise
+store the raw block bytes.
+
 Here's a silly processing function that just collects 'eyy's found in the raw
 record bytes
