protobuf: replace record-of-closures codec with a format-centric GADT

The previous [type 'a t = { wire_type; write_value; read_wire; ... }]
was a record of closures. Interpreters couldn't be added without
editing every combinator, the structure was opaque to tooling, and
the shape didn't match the ocaml-json / encodings-skill design.

Rewrite as a finally-tagged GADT whose constructors name protobuf's
wire-level alphabet:

    type _ t =
      | Varint : (int64, 'a) base -> 'a t
      | Fixed32_t : (int32, 'a) base -> 'a t
      | Fixed64_t : (int64, 'a) base -> 'a t
      | Length_delim : (string, 'a) base -> 'a t
      | Message : 'a message_spec -> 'a t
      | Rec : 'a t Lazy.t -> 'a t
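
A minimal sketch of how combinators might build these nodes. Only the
constructor names and the existence of [Sort.t] come from this commit;
the fields of [base] and [message_spec], the [proj] / [inj] names, the
three-constructor [Sort.t] and [sort_name] are assumptions made for
illustration:

```ocaml
(* Hypothetical shapes: the record fields below are guesses, not the
   definitions in protobuf.ml. *)
module Sort = struct
  type t = Int32 | Sint32 | String

  let to_string = function
    | Int32 -> "int32"
    | Sint32 -> "sint32"
    | String -> "string"
end

type ('wire, 'a) base = {
  sort : Sort.t;
  proj : 'a -> 'wire;  (* user value to wire value *)
  inj : 'wire -> 'a;   (* wire value to user value *)
}

type 'a message_spec = { msg_default : 'a }

type _ t =
  | Varint : (int64, 'a) base -> 'a t
  | Fixed32_t : (int32, 'a) base -> 'a t
  | Fixed64_t : (int64, 'a) base -> 'a t
  | Length_delim : (string, 'a) base -> 'a t
  | Message : 'a message_spec -> 'a t
  | Rec : 'a t Lazy.t -> 'a t

(* int32 travels as a sign-extended 64-bit varint. *)
let int32 : int32 t =
  Varint { sort = Sort.Int32; proj = Int64.of_int32; inj = Int64.to_int32 }

(* sint32 shares the Varint constructor but zigzag-encodes the value. *)
let sint32 : int32 t =
  let proj n =
    let n = Int64.of_int32 n in
    Int64.logxor (Int64.shift_left n 1) (Int64.shift_right n 63)
  and inj w =
    Int64.to_int32
      (Int64.logxor (Int64.shift_right_logical w 1)
         (Int64.neg (Int64.logand w 1L)))
  in
  Varint { sort = Sort.Sint32; proj; inj }

let string : string t =
  Length_delim { sort = Sort.String; proj = Fun.id; inj = Fun.id }

(* Scalars expose their sort; messages and recursive nodes do not. *)
let sort_name : type a. a t -> string option = function
  | Varint b -> Some (Sort.to_string b.sort)
  | Fixed32_t b -> Some (Sort.to_string b.sort)
  | Fixed64_t b -> Some (Sort.to_string b.sort)
  | Length_delim b -> Some (Sort.to_string b.sort)
  | Message _ | Rec _ -> None
```

In this sketch [int32] and [sint32] are both [Varint] nodes and differ
only in their sort and conversion functions, which is why the node
carries a sort in addition to the wire-level constructor.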

Each scalar codec now produces a typed GADT node carrying its
[Sort.t]: one of the 15 protobuf scalar types (int32, uint32, sint32,
fixed32, sfixed32, float, ..., bytes) or message. The sort feeds into
error messages: instead of "expected varint, got length-delimited"
the decoder now says "int32: expected wire type varint, got
length-delimited", which is what users want when a schema says
[int32 a = 1] but the wire carries a length-delimited value.
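
A small sketch of a sort-qualified wire-type check that yields the
message quoted above. The [wire] type and the [wire_mismatch] /
[check] helpers are hypothetical names, not the ones used in
protobuf.ml:

```ocaml
(* Hypothetical wire-type tags for the four kinds used in this sketch. *)
type wire = Varint_w | Fixed32_w | Fixed64_w | Length_delim_w

let wire_name = function
  | Varint_w -> "varint"
  | Fixed32_w -> "fixed32"
  | Fixed64_w -> "fixed64"
  | Length_delim_w -> "length-delimited"

(* [sort] is the schema-level type name, e.g. "int32". *)
let wire_mismatch ~sort ~expected ~got =
  Printf.sprintf "%s: expected wire type %s, got %s"
    sort (wire_name expected) (wire_name got)

(* Ok () when the wire type matches, otherwise the qualified message. *)
let check ~sort ~expected got =
  if got = expected then Ok ()
  else Error (wire_mismatch ~sort ~expected ~got)
```

For a field declared as [int32 a = 1], a length-delimited payload would
be reported with the schema-level name in front, so the reader can tell
which declared type was violated.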

[fix] switches from a mutable forwarding placeholder to a
[Lazy]-wrapped [Rec] node. This is cleaner: forcing the recursive
codec is now explicit in the GADT shape.
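
One way a [Lazy]-based [fix] can be written, as a sketch under
assumptions. Only [Rec : 'a t Lazy.t -> 'a t] matches this commit; the
reduced codec type with [Repeated] and [Map], the [tree] type and the
[show] walker are hypothetical stand-ins:

```ocaml
(* Reduced, hypothetical codec type; only [Rec] mirrors the commit. *)
type _ t =
  | Varint : int t
  | Repeated : 'a t -> 'a list t
  | Map : ('b -> 'a) * 'a t -> 'b t   (* encode-only view of a ['b] *)
  | Rec : 'a t Lazy.t -> 'a t

(* Tie the knot: [f] receives a [Rec] node pointing at its own result.
   [f] must not force that node while it is building the codec. *)
let fix (f : 'a t -> 'a t) : 'a t =
  let rec self = lazy (f (Rec self)) in
  Lazy.force self

(* A walker: the only place recursion is forced is the [Rec] case. *)
let rec show : type a. a t -> a -> string =
 fun codec v ->
  match codec with
  | Varint -> string_of_int v
  | Repeated c -> "[" ^ String.concat ";" (List.map (show c) v) ^ "]"
  | Map (proj, c) -> show c (proj v)
  | Rec c -> show (Lazy.force c) v

type tree = Node of tree list

let tree : tree t =
  fix (fun self -> Map ((fun (Node children) -> children), Repeated self))
```

Because the recursion is an ordinary constructor, every walker handles
it with a single [Rec c -> ... (Lazy.force c)] case and no mutable
state is involved.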

encode_value / decode_value are now [type a. a t -> ...] walkers that
pattern-match on the wire-level constructor. Adding a new interpreter
(schema printer, pp, diff) means adding a new walker alongside these,
with no change to the combinator call sites.
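
To illustrate the "new walker" point, here is a sketch with two
independent interpreters over the same constructors. The payload
records are simplified guesses (a string [sort], a [name] field on
[message_spec]), and [wire_type] / [describe] are hypothetical
walkers, not functions from this commit:

```ocaml
(* Simplified, hypothetical payloads; only the constructor names come
   from the commit message. *)
type ('wire, 'a) base = { sort : string; proj : 'a -> 'wire; inj : 'wire -> 'a }
type 'a message_spec = { name : string; msg_default : 'a }

type _ t =
  | Varint : (int64, 'a) base -> 'a t
  | Fixed32_t : (int32, 'a) base -> 'a t
  | Fixed64_t : (int64, 'a) base -> 'a t
  | Length_delim : (string, 'a) base -> 'a t
  | Message : 'a message_spec -> 'a t
  | Rec : 'a t Lazy.t -> 'a t

(* Interpreter 1: which wire type a codec occupies. *)
let rec wire_type : type a. a t -> string = function
  | Varint _ -> "varint"
  | Fixed32_t _ -> "fixed32"
  | Fixed64_t _ -> "fixed64"
  | Length_delim _ -> "length-delimited"
  | Message _ -> "length-delimited"
  | Rec c -> wire_type (Lazy.force c)

(* Interpreter 2: a schema-style description, added without touching
   the type or the codec values below. *)
let rec describe : type a. a t -> string = function
  | Varint b -> b.sort
  | Fixed32_t b -> b.sort
  | Fixed64_t b -> b.sort
  | Length_delim b -> b.sort
  | Message m -> "message " ^ m.name
  | Rec c -> describe (Lazy.force c)

let int64 : int64 t = Varint { sort = "int64"; proj = Fun.id; inj = Fun.id }
let point : (int64 * int64) t = Message { name = "Point"; msg_default = (0L, 0L) }
```

Neither [int64] nor [point] changes when [describe] is introduced; the
compiler's exhaustiveness check on each walker is what flags a missed
constructor.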

Message combinators (Message.required / optional / repeated / packed
and the [let*] chain) keep their user-facing shape; internally,
[Message.finish] now produces a [Message { encode_body; decode_body;
msg_default }] GADT node.

Still to do from the encodings skill:

- Split into separate [value.ml] / [codec.ml] / [error.ml] / [foo.ml]
  layer files (this commit keeps everything in protobuf.ml for
  minimal diff).
- Expose [Protobuf.Value.t] AST and [Cursor].
- Migrate errors to [Loc.Error.kind].
- Add six-verb API (of_string / to_string / of_reader / to_writer /
  decode / encode) with _exn twins.

All 40 unit + 17 fuzz + 2 protoc interop tests pass.