STAR: STreaming ARchive formats

Stricter, verifiable, deterministic, highly compressible alternatives to CAR files for atproto repositories.

CAR STAR-lite STAR-L0 STAR-L1
verifiable
existing tools
archive size worst best good near-best
streamable ❌^1 ✅^2 ✅ best
bounded memory ❌^1 ✅^2 ✅ best
speed worst^1 good/best^3 best better
complexity ✅ best ✅ best ok tricky
strict
deterministic
slices, sparse ❌^4
subtree

Read more:

  • CAR: best interoperability

    A standardized content-addressed block format

  • STAR-lite: best compression

    A flat key-record encoding with no MST

  • STAR-L0/L1: best for streaming verification

    A strictly-ordered block format with implicit CIDs and MST recovery at lower layers
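The "implicit CIDs" idea rests on the fact that a block's CID is a pure function of its bytes, so a reader can recompute it instead of reading it from the archive. The sketch below is not taken from the STAR specs; it only shows the standard CID derivation used for atproto DAG-CBOR blocks (CIDv1, `dag-cbor` codec, SHA-256, base32 string form). The function name `cid_for_block` is hypothetical.

```python
import base64
import hashlib

# Fixed CID prefix for atproto data blocks:
#   0x01 = CIDv1, 0x71 = dag-cbor codec, 0x12 = sha2-256, 0x20 = 32-byte digest
CID_PREFIX = bytes([0x01, 0x71, 0x12, 0x20])


def cid_for_block(block: bytes) -> str:
    """Recompute the string CID of an already-encoded DAG-CBOR block.

    Hypothetical helper for illustration; it does not validate that
    `block` is well-formed DAG-CBOR.
    """
    digest = hashlib.sha256(block).digest()
    raw = CID_PREFIX + digest
    # Multibase base32: lowercase, unpadded, with a leading "b".
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


if __name__ == "__main__":
    # 0xa0 is an empty map in CBOR; any block bytes work the same way.
    print(cid_for_block(b"\xa0"))
```

Because the CID can always be recomputed this way, storing it next to each block adds bytes without adding information, which is presumably part of why dropping it helps archive size.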


Notes:

  1. See this issue on the IETF atproto repo draft: in general, a CAR repo cannot be correctly treated as stream-ordered without knowing (out of band) that it was encoded that way, so parsers must buffer the entire repository. Disk spilling can bound memory usage, as repo-stream does, but requires many random I/O reads. Stream-ordered CARs are competitive with the STAR variants on some axes but, given the unresolved issues, are not considered in this comparison.

  2. STAR-lite streaming verification or conversion to CAR requires disk spilling to achieve bounded memory, but the I/O is optimized for a small number of one-time, in-order reads from disk.

  3. STAR-lite values can be emitted immediately and trivially from their encoded form with zero buffering. However, MST recovery (or pre-verification) requires either two passes or disk spilling, though it is still more efficient than CAR.

  4. STAR-lite could support MST slices and probably sparse MSTs, but this is not specified yet. MST slices in particular would be valuable.
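The buffering argument in note 1 can be illustrated with a toy model. This is a sketch under loud assumptions: the blocks are small JSON objects, the "CIDs" are plain hex SHA-256 digests, and `make_block` and `walk` are hypothetical names. It does not parse real CAR files. It only shows that the number of blocks a reader must hold depends on arrival order, which the reader cannot know in advance.

```python
import hashlib
import json


def make_block(payload, links):
    """Build a toy block; its 'CID' is the hex SHA-256 of the body (not a real CID)."""
    body = json.dumps({"payload": payload, "links": links}).encode()
    return hashlib.sha256(body).hexdigest(), body


def walk(blocks, root):
    """Consume (cid, body) pairs in arrival order.

    Returns (visit order, peak number of blocks held between arrivals).
    """
    expected = {root}  # CIDs referenced from the root so far, not yet seen
    pending = {}       # blocks that arrived before anything referenced them
    visited, peak = [], 0
    for cid, body in blocks:
        if hashlib.sha256(body).hexdigest() != cid:
            raise ValueError(f"block does not match its CID: {cid}")
        pending[cid] = body
        ready = [cid] if cid in expected else []
        while ready:
            current = ready.pop()
            if current not in pending:
                continue  # already processed via another path
            links = json.loads(pending.pop(current))["links"]
            expected.discard(current)
            visited.append(current)
            for link in links:
                if link in pending:
                    ready.append(link)  # buffered earlier, can be released now
                else:
                    expected.add(link)
        peak = max(peak, len(pending))
    return visited, peak


# Toy tree: root -> (a, b), a -> c
c = make_block("c", [])
a = make_block("a", [c[0]])
b = make_block("b", [])
root = make_block("root", [a[0], b[0]])

parent_first = [root, a, c, b]
children_first = [b, c, a, root]

if __name__ == "__main__":
    print("parent-first peak buffer:", walk(parent_first, root[0])[1])
    print("children-first peak buffer:", walk(children_first, root[0])[1])
```

In the parent-first order every block is already expected when it arrives, so nothing is held. In the children-first order nothing can be released until the root shows up, so the reader ends up holding every other block.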