docs: add correctness parity analysis and gap decomposition

traced indigo's full decode path to verify no correctness work is
skipped. added "correctness parity" and "where the ~20x comes from"
sections decomposing the gap into implementation factors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz 9e2b9749 8cf625f2

+37 -1
README.md
@@ -72,13 +72,49 @@
 | go (indigo) | `evt.Deserialize` → typed `RepoCommit` via code-gen CBOR → `car.NewBlockReader` (+ SHA-256 verify) → `cbornode.DecodeInto` per block |
 | python | `Frame.from_bytes` + `parse_subscribe_repos_message` → `CAR.from_bytes` (libipld decodes all blocks internally) |
 
+## correctness parity
+
+we traced the full decode path of every SDK to verify that no SDK is winning by skipping correctness work.
+
+**what both zat and indigo do per frame:**
+- decode full CBOR payload (all commit fields — repo, rev, ops, timestamp, etc.)
+- parse CAR header and all block sections
+- parse CID structure (version, codec, multihash) for each block
+- SHA-256 hash each block and compare against CID digest
+- decode every block as DAG-CBOR
+
+**what neither side does:**
+- DAG-CBOR deterministic encoding validation (sorted keys, minimal integers) — indigo's refmt doesn't check this either
+- signature verification — separate from decode, not measured here
+- MST validation — separate from decode, not measured here
+
+**the only asymmetry:** indigo enforces size limits on CBOR map lengths and a 2MB cap on the blocks field. these are integer comparisons — effectively free.
+
+the ~20x gap between zat and indigo (both with CID verification) is entirely implementation cost, not correctness differences.
+
+## where the ~20x comes from
+
+we traced indigo's decode path at the instruction level. the cost compounds from several architectural differences:
+
+| factor | indigo | zat | approx cost |
+|--------|--------|-----|-------------|
+| CBOR decode | refmt: token pump → reflection → `reflect.SetMapIndex` per entry | hand-written, direct dispatch | ~3-4x |
+| string/byte handling | Go `string` heap allocation per value (repo, rev, path, action, per-block keys) | zero-copy slices into input buffer | ~2-3x |
+| memory management | per-object GC'd heap allocation; every map, array, int is boxed | arena allocator, 24-byte `Value` union | ~2-3x |
+| CAR block reads | `make([]byte, section_len)` + copy per block; CID parsed twice (once to read, once to verify) | reads directly from input slice; CID parsed once | ~1.5x |
+| blocks field | `make([]uint8, len)` + `io.ReadFull` copies entire CAR payload | slices into input buffer | ~1.2x |
+
+these factors multiply. refmt's reflection overhead × per-value heap allocation × GC pressure × byte copying = ~20x on this workload.
+
+note: indigo's `cbor-gen` (code-generated unmarshal for the commit struct) is fast — the bottleneck is `cbornode.DecodeInto` (refmt/reflection) for the per-block DAG-CBOR decode, which runs ~10 times per frame.
+
 ## fairness notes
 
 - **CID verification**: only zat and indigo verify block hashes. this is ~2x overhead for zat (311k vs 630k fps). the decode-only table exists for architectural comparison, but the production-correct table is the one that matters for real-world use
 - **zig** and **rust (raw)** both use arena allocation + zero-copy string/byte decoding. the "alloc per frame" variants are the fair cross-language comparison; "arena reuse" shows the production pattern
 - **rust (jacquard)** is the real AT Protocol SDK that rust developers use. it pays for serde-based owned deserialization (`String`, `BTreeMap`), async CAR parsing (tokio poll/wake per block via iroh-car), and per-object heap allocation
 - **go (raw)** uses fxamacker/cbor (no reflection for known struct types), a hand-rolled sync CAR parser (no CID hash verification), and no indigo dependency. GC pressure remains the fundamental constraint — Go's experimental arena package (`GOEXPERIMENT=arenas`) is on hold and not recommended for production
-- **go (indigo)** — bluesky's own production relay — uses code-generated CBOR unmarshal (no reflection at the frame level) but pays for go-car's per-block CID hash verification and cbornode's reflection-based DAG-CBOR decode
+- **go (indigo)** — bluesky's own production relay — uses code-generated CBOR unmarshal (no reflection at the frame level) but pays for go-car's per-block CID hash verification and cbornode's reflection-based DAG-CBOR decode via the unmaintained refmt library
 - **python** is faster than jacquard despite being "Python" — its hot path is `libipld` (Rust via PyO3), which does the entire CAR parse + per-block DAG-CBOR decode in one synchronous C-extension call
 - **error handling**: all SDKs use infallible decode functions that never abort on failure — errors are counted and the frame is skipped
 - **capture coupling**: the corpus capture tool uses zat's CBOR decoder for the commit-with-ops header peek. this is standard CBOR parsing (not zat's typed firehose decoder), but it does mean frames that zat's CBOR decoder rejects won't appear in the corpus
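
The per-block steps the diff describes (parse the CID once, read the block without copying, SHA-256 the block and compare against the CID digest) can be sketched in a few lines. This is an illustration of the idea only, not code from zat or indigo: `make_cid`, `split_section`, and `verify_block` are hypothetical names, and the sketch assumes a binary CIDv1 with a sha2-256 multihash whose four prefix values each fit in a single varint byte.

```python
import hashlib

SHA2_256 = 0x12      # multihash code for SHA-256
DAG_CBOR = 0x71      # multicodec code for DAG-CBOR
DIGEST_LEN = 32      # SHA-256 digest size in bytes
CID_LEN = 4 + DIGEST_LEN  # version + codec + hash code + length + digest


def make_cid(block: bytes) -> bytes:
    """Build a binary CIDv1 (dag-cbor, sha2-256) for a block (test helper)."""
    digest = hashlib.sha256(block).digest()
    return bytes([0x01, DAG_CBOR, SHA2_256, DIGEST_LEN]) + digest


def split_section(section: memoryview) -> tuple[memoryview, memoryview]:
    """Split 'CID || block data' into two views that share the input buffer.

    Slicing a memoryview does not copy, which mirrors the 'slices into
    input buffer' approach in the table rather than allocate-and-copy.
    """
    return section[:CID_LEN], section[CID_LEN:]


def verify_block(cid: memoryview, block: memoryview) -> bool:
    """Parse the CID prefix once, then hash the block and compare digests."""
    if len(cid) != CID_LEN:
        return False
    # Assumption: each prefix value is < 0x80, so each varint is one byte.
    version, _codec, hash_code, digest_len = cid[0], cid[1], cid[2], cid[3]
    if (version, hash_code, digest_len) != (1, SHA2_256, DIGEST_LEN):
        return False
    return hashlib.sha256(block).digest() == bytes(cid[4:])
```

A caller would wrap the frame's bytes in a `memoryview` once and pass slices of it down; only the 32-byte digest comparison makes a small copy here.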