STreaming ARchives: stricter, verifiable, deterministic, highly compressible alternatives to CAR files for atproto repositories.
atproto car

car conversion pseudo-code

phil 790008b6 1d884db8

+129 -63
star-lite/readme.md
···
 - serialized into runs of CAR-format blocks,
 - any other transformation

-Once the entire tree has been walked and frozen, the highest-layer MST node can finally be considered frozen to produce the root node CID, which match the CID in a STAR-lite file's header.
+Once the entire tree has been walked and frozen, the highest-layer MST node can finally be considered frozen to produce the root node CID, which must match the CID in a STAR-lite file's header.


 ### Archive verification
···
 #   link_record(key, cid)   appends an entry with a key and value link
 #   link_subtree(cid)       inserts a node link as the "left" child (empty node),
 #                           or as the right-most entry's "right"
-#   to_cbor() => bytes      bytes: canonical DAG-CBOR encoding of the MST node
+#   to_cbor() => bytes      canonical DAG-CBOR encoding of the MST node

 def reconstruct_root_cid(key_record_pairs):
     """Compute the MST root CID from repo contents
···
     stack: list[MstNode] = []
     prev_layer = -1

-    # the actual walk. everything left of the stack is finalized.
+    # the actual walk. everything to the left of the stack is finalized.
     # anything remaining in the stack gets rolled up at the end.
     for (key, record_cbor) in key_record_pairs:
         key_layer = compute_mst_layer(key)
···
             stack.append(MstNode())

         # finalize lower levels if this key is at a higher level than last.
-        # higher key means everything lower in the stack is to-our-left now.
+        # higher key means everything lower in the stack is left-of-us now.
         if key_layer > prev_layer:
             for node, parent in zip(stack[:key_layer], stack[1:]):
                 if node.is_empty():
···
                 node.reset_to_empty()

         # add a node entry for the current record
-        stack[key_layer].link_value(key, compute_cid(record_cbor))
+        stack[key_layer].link_record(key, compute_cid(record_cbor))

         prev_layer = key_layer

···

 Since our depth-first walk finalizes children before parents, and the final parent finalizes last, we must unfortunately buffer all serialized CAR frames while the tree is walked. The good news is that a disk-spill-friendly byte log works well for this buffering.

+#### pseudo-code

-#### some old intuition-y words that might go somewhere but not here now
+```python
+# MstNode interface changes:
+#   entries                          list of (key, cid, log position, right link)
+#   left, entries[].right            optional subtree link + stashed emit plan
+#   link_record(key, cid, log_pos)   stash the carv1 frame's byte log position
+#   link_subtree(cid, emit_plan)     stash an emit plan with the link

-Stream-ordered CARs (in "preorder traversal" block order) are a depth-first walk over the Merkle Search Tree, and keys encountered during a depth-first MST walk are in strict lexicographic order.
+def car_frame(data_bytes: bytes) -> tuple[Cid, bytes]:
+    """CARv1 block framing: [ varint | CID | data ]"""
+    cid = compute_cid(data_bytes)
+    data = cid.to_bytes() + data_bytes  # wire-encoded CID bytes, not the digest
+    return cid, varint_bytes(len(data)) + data

-There is a useful symmetry here:
+def frame_at(byte_log: bytes, position: int) -> bytes:
+    """Get a logged CARv1 frame from `position` using its own varint length"""
+    varint_len, payload_len = varint_read(byte_log, position)
+    frame_end = position + varint_len + payload_len
+    return byte_log[position:frame_end]

-- every subtree of an MST occupies a contiguous region of the stream-order serialized CAR
-- every subtree of an MST spans a contiguous range of lexicographically-ordered keys

-So, any subtree-spanning range of keys (and records) can be materialized directly into its stream-ordered sequence of CAR blocks, independent of the rest of the archive.
+def build_subtree_emit_plan(node: MstNode, node_frame_position):
+    """assemble the stream-ordered emit plan for a finalized subtree

+    this is the core of how we drive the CAR preorder traversal output!

-#### pseudo-code
+    node_frame_position: offset in the byte log of this node's own CARv1 frame

-```python
-# wip!
+    returns: ordered list of value-log indexes to serialized CARv1 frames
+    """
+    plan = []

-def to_stream_ordered_car(key_record_pairs):
-    stack = []
-    byte_log = []  # disk spilling omitted from this example
-    prev_key_layer = 0
+    # first: the (CBOR-encoded) parent node itself
+    plan.append(node_frame_position)

-    for (key, record) in key_record_pairs:
-        record_cid = compute_cid(record)
+    # next, the left sub-subtree, if present
+    if node.left:
+        plan.extend(node.left.subtree_emit_plan)

-        record_run = byte_log.append_car_frame(record_cid, record)
+    # finally, each value and entire value-right-subtree, in order:
+    for entry in node.entries:
+        # value first (always present in an MST entry)
+        plan.append(entry.frame_position)
+        # then after-value right sub-subtree (if present)
+        if entry.right:
+            plan.extend(entry.right.subtree_emit_plan)

-        key_layer = layer_of(key)
+    return plan

-        extend stack with empty slots until len(stack) >= key_layer + 1

-        # every layer below key_layer that has content gets frozen. Its
-        # node frame is appended to the byte log, and the resulting
-        # subtree's emit plan is propagated up to layer L+1.
-        for lower_layer in range(0, key_layer):
-            if node := stack.get(lower_layer):
-                (node_cid, node_bytes) = encode_mst_node(node)
-                node_run = byte_log.append_car_frame(node_cid, node_bytes)
-                subtree_emit_plan = build_emit_plan(node, node_run)
-                push_subtree_with_plan(stack[lower_layer + 1], node_cid, subtree_emit_plan)
-                stack[lower_layer] = None
+def to_stream_ordered_car_body(key_record_pairs):
+    """Get a stream-ordered atproto CAR body from repository contents

-        # bleh, None handling kind of sucks. we should actually check nodes for .empty() and push/extend where needed
+    returns (root_cid, output_bytes) -- does not write a CAR header or the
+    commit object's block, which must come first in the body for stream-order.

-        if stack.get(key_layer) is None:
-            stack[key_layer] = make_empty_node()  # blehhh
+    key_record_pairs must be in lexicographic key order (= depth-first mst walk)
+    """
+    stack: list[MstNode] = []
+    byte_log = bytearray()
+    prev_layer = -1

-        stack[key_layer].entries.append(WhatIsThis(
-            key=key,
-            cid=record_cid,
-            car_run=record_run,
-            right=None,
-            right_emit_plan=None,
-        ))
+    # the actual walk. everything to the left of the stack is finalized.
+    # anything remaining in the stack gets rolled up at the end.
+    # serialized CARv1 frames appended into byte_log as we go.
+    for (key, record_cbor) in key_record_pairs:
+        key_layer = compute_mst_layer(key)

-    # End of input: fold remaining stack bottom-up the same way.
-    node_cid, node_emit_plan = None, None
-    for node in stack:
-        if node_cid is not None:
-            push_subtree_with_plan(node, node_cid, node_emit_plan)
-            node_cid, node_emit_plan = None, None
-        if node is not empty:
-            (node_cid, node_bytes) = encode_mst_node(node)
-            node_run = byte_log.append_car_frame(node_cid, node_bytes)
-            node_emit_plan = build_emit_plan(node, node_run)
-            node_cid = node_cid
+        # grow the stack if needed, init with empty nodes.
+        while len(stack) <= key_layer:
+            stack.append(MstNode())

-    # Empty repo: emit the canonical empty MST node into the byte log.
-    if node_cid is None:
-        (node_cid, node_bytes) = encode_mst_node(empty stack-slot)
-        node_run = byte_log.append_car_frame(node_cid, node_bytes)
-        node_emit_plan = [node_run]
+        # finalize lower levels if this key is at a higher level than last.
+        # higher key means everything lower in the stack is left-of-us now.
+        if key_layer > prev_layer:
+            for node, parent in zip(stack[:key_layer], stack[1:]):
+                if node.is_empty():
+                    continue  # skip possible empty bottom-most nodes

-    output = []
-    for run in node_emit_plan:
-        output.extend(byte_log[run.what:run.whattt])
+                # put finalized (+serialized, CAR-framed) node into the byte log
+                frame_position = len(byte_log)
+                cid, framed = car_frame(node.to_cbor())
+                byte_log.extend(framed)

-    return node_cid, output
+                # link it from the parent node now it's finalized with a CID
+                node_emit_plan = build_subtree_emit_plan(node, frame_position)
+                parent.link_subtree(cid, node_emit_plan)
+                node.reset_to_empty()
+
+        # put the current record into the byte log
+        frame_position = len(byte_log)
+        record_cid, framed = car_frame(record_cbor)
+        byte_log.extend(framed)
+
+        # and link it from the MST node's entries at this layer
+        stack[key_layer].link_record(key, record_cid, frame_position)
+
+        prev_layer = key_layer
+
+    # finalize remaining stack
+    for node, parent in zip(stack[:-1], stack[1:]):
+        if node.is_empty():
+            continue
+
+        frame_position = len(byte_log)
+        cid, framed = car_frame(node.to_cbor())
+        byte_log.extend(framed)
+
+        node_emit_plan = build_subtree_emit_plan(node, frame_position)
+        parent.link_subtree(cid, node_emit_plan)
+        node.reset_to_empty()
+
+    # get the finished root node, finally.
+    if len(stack) > 0:
+        root = stack[-1]
+    else:
+        root = MstNode()  # empty repo: atproto CAR writes one single empty node
+
+    # frame the root and get it in the logggggggg
+    root_frame_position = len(byte_log)
+    root_cid, framed = car_frame(root.to_cbor())
+    byte_log.extend(framed)
+
+    # and pull together the final emit plan
+    root_emit_plan = build_subtree_emit_plan(root, root_frame_position)
+
+    # walk the plan into the final output!!!
+    output = bytearray()
+    for position in root_emit_plan:
+        output.extend(frame_at(byte_log, position))
+
+    return root_cid, output
 ```
+
+
+#### some old intuition-y words that might go somewhere but not here now
+
+Stream-ordered CARs (in "preorder traversal" block order) are a depth-first walk over the Merkle Search Tree, and keys encountered during a depth-first MST walk are in strict lexicographic order.
+
+There is a useful symmetry here:
+
+- every subtree of an MST occupies a contiguous region of the stream-order serialized CAR
+- every subtree of an MST spans a contiguous range of lexicographically-ordered keys
+
+So, any subtree-spanning range of keys (and records) can be materialized directly into its stream-ordered sequence of CAR blocks, independent of the rest of the archive.


 #### Empty repos
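
The pseudo-code in this diff calls `varint_bytes`, `varint_read` and `compute_cid` without defining them. What follows is a minimal sketch of those helpers, exercised on the empty-repo case, where the whole CAR body is a single framed empty MST node. Several things here are assumptions rather than anything taken from the readme: CIDs are handled as raw bytes instead of a `Cid` object (hence the name `compute_cid_bytes`), the CID prefix is CIDv1 + dag-cbor + sha2-256, and the empty node is the DAG-CBOR encoding of `{"e": [], "l": null}`.

```python
import hashlib

# Assumption: CIDv1 (0x01), dag-cbor codec (0x71), sha2-256 multihash (0x12), 32-byte digest (0x20)
CID_PREFIX = bytes([0x01, 0x71, 0x12, 0x20])

# Assumption: an MST node with no entries and no left link is {"e": [], "l": null},
# which encodes in DAG-CBOR as: map(2), "e", array(0), "l", null
EMPTY_MST_NODE = bytes.fromhex("a2616580616cf6")


def varint_bytes(n: int) -> bytes:
    """Encode a non-negative int as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_read(buf, position: int) -> tuple[int, int]:
    """Return (varint_len, value) for the varint starting at `position`."""
    value = 0
    length = 0
    while True:
        byte = buf[position + length]
        value |= (byte & 0x7F) << (7 * length)
        length += 1
        if not byte & 0x80:
            return length, value


def compute_cid_bytes(data: bytes) -> bytes:
    """Wire-encoded CID bytes for a dag-cbor block (hypothetical stand-in for compute_cid)."""
    return CID_PREFIX + hashlib.sha256(data).digest()


def car_frame(data_bytes: bytes) -> tuple[bytes, bytes]:
    """CARv1 block framing: [ varint | CID | data ]"""
    cid = compute_cid_bytes(data_bytes)
    payload = cid + data_bytes
    return cid, varint_bytes(len(payload)) + payload


def frame_at(byte_log, position: int) -> bytes:
    """Slice one logged frame back out, using its own varint length."""
    varint_len, payload_len = varint_read(byte_log, position)
    return bytes(byte_log[position:position + varint_len + payload_len])


# Empty repo: the byte log holds exactly one frame, and the emit plan is [0].
byte_log = bytearray()
root_position = len(byte_log)
root_cid, framed = car_frame(EMPTY_MST_NODE)
byte_log.extend(framed)
body = frame_at(byte_log, root_position)
```

Because every frame carries its own length prefix, the emit plan only needs to remember start offsets into the byte log, which is what keeps the plan itself small even when the log spills to disk.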