protobuf: proto3 default omission + fuzz + hostile + protoc interop
Fixes and extensions building on the scaffolding landed in fd396b81e.
- Encoder: proto3 scalar fields equal to their codec default are now
omitted from the wire. This fixes the first real interop bug: protoc
output for [Test1 {a = 0}] is empty, but we were emitting "0800".
[Message.encode_fields] now checks [v <> codec.default] before
writing required fields.
- Fuzz suite (fuzz/): 17 alcobar invariants covering scalar round-trip,
the kitchen-sink message (every scalar plus optional/repeated/packed/
nested), and decoder robustness against arbitrary bytes (must return
Ok or Error; never raise, loop, or allocate unboundedly).
- Hostile-input tests (test_hostile): eleven regressions for known
protobuf decoder CVE classes — huge length prefix DoS, over-long
varint, truncated tag, reserved tag 0, unsupported wire type, wire
type mismatch, empty input -> defaults, overrun rejected, length past
end, packed corrupt body, and many-repeated scaling. Depth-limited
recursion noted as a TODO (needs a Lazy-wrapped recursive codec and
an explicit depth bound in the decoder).
- Interop test against protoc (test/interop/protoc/): Python oracle
using grpcio-tools 1.73.0 + protobuf 6.31.0, generating two trace
CSVs for a Test1 message and an Everything message covering all 15
scalar types plus optional/repeated/packed/nested. The OCaml test
asserts byte-for-byte equality in both directions (encode matches
protoc, decode reproduces protoc's values). [dune build @regen-traces]
from the package root refreshes traces.
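
For reference, the default-omission rule from the encoder item can be
sketched as below. This is a minimal standalone illustration, not the
code in [Message.encode_fields]: [encode_varint], [encode_int_field],
[to_hex] and [encode_test1] are hypothetical helpers, and the field is
assumed to be a varint field numbered 1 with default 0, as in Test1.

```ocaml
(* Hypothetical stand-ins for illustration; not the library's API. *)

(* Base-128 varint, least significant group first. The OCaml int is
   treated as an unsigned 63-bit value. *)
let encode_varint buf n =
  let rec go n =
    if n land lnot 0x7f = 0 then Buffer.add_char buf (Char.chr n)
    else begin
      Buffer.add_char buf (Char.chr (0x80 lor (n land 0x7f)));
      go (n lsr 7)
    end
  in
  go n

(* Write a varint field (wire type 0) only when the value differs
   from its default; a default value produces no bytes at all. *)
let encode_int_field buf ~field_number ~default v =
  if v <> default then begin
    encode_varint buf ((field_number lsl 3) lor 0);
    encode_varint buf v
  end

(* Lowercase hex dump of a byte string. *)
let to_hex s =
  String.to_seq s
  |> Seq.map (fun c -> Printf.sprintf "%02x" (Char.code c))
  |> List.of_seq
  |> String.concat ""

(* Encode an assumed Test1 { a } with a as field 1, default 0. *)
let encode_test1 a =
  let buf = Buffer.create 8 in
  encode_int_field buf ~field_number:1 ~default:0 a;
  to_hex (Buffer.contents buf)
```

With this sketch, [encode_test1 0] yields the empty string rather than
"0800", and [encode_test1 150] yields "089601".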
Total test count: 38 unit + 17 fuzz + 2 interop (all passing). The
interop layer is the only one that checks our bytes against protoc;
the earlier tests only verified self-consistency.
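
As an illustration of the over-long varint and truncated-input cases
in the hostile-input list, a decoder that returns Ok or Error and never
raises might look like the sketch below. [decode_varint] and its error
strings are hypothetical and are not taken from this package's decoder.

```ocaml
(* Hypothetical sketch; not the package's decoder.
   Decode a base-128 varint from [s] starting at [pos]. Returns the
   value and the offset of the next byte, or an error. All indexing is
   bounds-checked first, so no exception escapes. *)
let decode_varint (s : string) (pos : int) : (int64 * int, string) result =
  let len = String.length s in
  let rec go acc shift i =
    if i < 0 || i >= len then Error "truncated varint"
    else if shift > 63 then
      (* A 64-bit varint needs at most 10 bytes (shifts 0 to 63). *)
      Error "over-long varint"
    else
      let b = Char.code s.[i] in
      let group = Int64.of_int (b land 0x7f) in
      (* Bits shifted past bit 63 in the tenth byte are dropped here;
         a stricter decoder could reject them instead. *)
      let acc = Int64.logor acc (Int64.shift_left group shift) in
      if b land 0x80 = 0 then Ok (acc, i + 1)
      else go acc (shift + 7) (i + 1)
  in
  go 0L 0 pos
```

Under these assumptions, eleven continuation bytes are rejected as
over-long, and input that ends mid-varint is reported as truncated
instead of reading past the end.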