Protocol Buffers codec for hand-written schemas
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

protobuf: split codec GADT into its own module

Move the codec type, the 15 scalar codecs, the [Message] combinators,
and [fix] into [lib/codec.ml] / [lib/codec.mli]. The top-level
[Protobuf] module now aliases [type 'a t = 'a Codec.t], re-exports
the scalars and [Message] for ergonomics, and keeps only the
reading/writing entry points ([of_string] / [to_string] /
[of_reader] / [to_writer] plus [_exn] twins).

Motivation: match the one-file-per-concern layout already used in
[ocaml-json], [ocaml-cbor], and [ocaml-xml]. A later split will
extract a [Value.t] AST and a [Cursor] zipper.

[codec.mli] exposes:
- [type 'a t] (abstract)
- the 15 scalar codecs
- [module Message] with required/optional/repeated/packed/map/oneof
- [fix]

The codec's four IO walkers ([encode_string] / [decode_string] /
[encode] / [decode] and the unknown-fields pair) are under a
[(**/**)] internal section because the top-level [Protobuf] module's
[of_*] / [to_*] functions are the stable public surface.

Files:
lib/codec.ml [new, 886 lines]
lib/codec.mli [new]
lib/protobuf.ml [rewritten: 52-line re-export + IO shim]
lib/protobuf.mli [rewritten to match]

All 53 unit + 17 fuzz + 2 protoc interop tests pass.

+995 -886
+883
lib/codec.ml
··· 1 + (* Protocol Buffers codec, finally-tagged. 2 + 3 + The top-level alphabet names the four protobuf wire types plus 4 + message-level composition. Interpreters (encode, decode) walk the 5 + [Codec.t] GADT; adding a new interpreter (schema extraction, 6 + pretty-printer, diff) is adding a new walker without touching the 7 + combinator call sites. *) 8 + 9 + module Wire = Wire 10 + module Error = Error 11 + 12 + (* -- Nested-message depth tracking. 13 + 14 + An adversarial input with thousands of levels of nested Length_delim 15 + fields would stack-overflow the runtime: each nesting level is one 16 + OCaml stack frame in the recursive message decoder. Bound the depth 17 + at [max_depth] by default (matching protoc's C++/Java default) and 18 + fail the decode with a clean [Decode_error] when exceeded. *) 19 + let max_depth = 100 20 + let depth = ref 0 21 + 22 + let with_depth_check f = 23 + if !depth >= max_depth then 24 + raise 25 + (Wire.Decode_error (Fmt.str "nested message depth %d exceeded" max_depth)); 26 + incr depth; 27 + Fun.protect ~finally:(fun () -> decr depth) f 28 + 29 + (* -- Pre-parsed wire values. 30 + 31 + The message decoder parses the byte stream into a [(tag, wire_value 32 + list)] table before running the field GADT, so tags can appear in 33 + any order on the wire. [wire_value] is internal — the public GADT 34 + [t] names the logical types, not the raw bytes. *) 35 + type wire_value = 36 + | WV_varint of int64 37 + | WV_fixed32 of int32 38 + | WV_fixed64 of int64 39 + | WV_length_delim of string 40 + 41 + let wire_value_type = function 42 + | WV_varint _ -> Wire.Varint 43 + | WV_fixed32 _ -> Wire.Fixed32 44 + | WV_fixed64 _ -> Wire.Fixed64 45 + | WV_length_delim _ -> Wire.Length_delimited 46 + 47 + (* Sort: the 15 protobuf scalar types plus Message. 48 + Used for error messages like "expected int32, got string". *) 49 + module Sort = struct 50 + type t = 51 + | Int32 52 + | Int64 53 + | Uint32 54 + | Uint64 55 + | Sint32 56 + | Sint64 57 + | Fixed32 58 + | Fixed64 59 + | Sfixed32 60 + | Sfixed64 61 + | Float 62 + | Double 63 + | Bool 64 + | String 65 + | Bytes 66 + | Message 67 + 68 + let to_string = function 69 + | Int32 -> "int32" 70 + | Int64 -> "int64" 71 + | Uint32 -> "uint32" 72 + | Uint64 -> "uint64" 73 + | Sint32 -> "sint32" 74 + | Sint64 -> "sint64" 75 + | Fixed32 -> "fixed32" 76 + | Fixed64 -> "fixed64" 77 + | Sfixed32 -> "sfixed32" 78 + | Sfixed64 -> "sfixed64" 79 + | Float -> "float" 80 + | Double -> "double" 81 + | Bool -> "bool" 82 + | String -> "string" 83 + | Bytes -> "bytes" 84 + | Message -> "message" 85 + 86 + let pp ppf s = Fmt.string ppf (to_string s) 87 + end 88 + 89 + (* Typed conversion from a wire-level representation to an OCaml value. 90 + The wire representation is determined by which GADT constructor 91 + wraps this record: [Varint] pairs with [int64], [Fixed32] with 92 + [int32], etc. *) 93 + type ('w, 'a) base = { 94 + sort : Sort.t; 95 + dec : 'w -> 'a; 96 + enc : 'a -> 'w; 97 + default : 'a; 98 + } 99 + 100 + (* Internal encoder/decoder pair for a message. Walking the [Message] 101 + node branches into these; they are not part of the public 102 + combinator vocabulary. *) 103 + type 'o message_spec = { 104 + encode_body : Buffer.t -> 'o -> unit; 105 + decode_body : string -> int -> int -> 'o; 106 + (* Like [decode_body], but also returns the raw wire bytes of any 107 + fields whose tag was not in the schema (re-serialized in canonical 108 + form so they can be appended to a later encode). *) 109 + decode_body_with_unknowns : string -> int -> int -> 'o * string; 110 + msg_default : 'o; 111 + } 112 + 113 + (* The Codec GADT. 114 + 115 + Each constructor names a FORMAT-level sort (wire type for scalars, 116 + plus composition/recursion). Users build codecs via the combinators 117 + below; interpreters destructure. *) 118 + type _ t = 119 + | Varint : (int64, 'a) base -> 'a t 120 + | Fixed32 : (int32, 'a) base -> 'a t 121 + | Fixed64 : (int64, 'a) base -> 'a t 122 + | Length_delim : (string, 'a) base -> 'a t 123 + | Message : 'a message_spec -> 'a t 124 + | Rec : 'a t Lazy.t -> 'a t 125 + 126 + (* Expose a few witnesses so callers can pattern-match the wire type 127 + without destructuring the GADT (useful for field-level code). *) 128 + 129 + let pp : type a. Format.formatter -> a t -> unit = 130 + fun ppf -> function 131 + | Varint b -> Sort.pp ppf b.sort 132 + | Fixed32 b -> Sort.pp ppf b.sort 133 + | Fixed64 b -> Sort.pp ppf b.sort 134 + | Length_delim b -> Sort.pp ppf b.sort 135 + | Message _ -> Fmt.string ppf "message" 136 + | Rec _ -> Fmt.string ppf "rec message" 137 + 138 + let wire_type_of : type a. a t -> Wire.t = function 139 + | Varint _ -> Wire.Varint 140 + | Fixed32 _ -> Wire.Fixed32 141 + | Fixed64 _ -> Wire.Fixed64 142 + | Length_delim _ -> Wire.Length_delimited 143 + | Message _ -> Wire.Length_delimited 144 + | Rec c -> ( 145 + (* The Lazy may not be forced yet; peek safely. *) 146 + match Lazy.force c with 147 + | Varint _ -> Wire.Varint 148 + | Fixed32 _ -> Wire.Fixed32 149 + | Fixed64 _ -> Wire.Fixed64 150 + | Length_delim _ -> Wire.Length_delimited 151 + | Message _ -> Wire.Length_delimited 152 + | Rec _ -> Wire.Length_delimited) 153 + 154 + let default_of : type a. a t -> a = function 155 + | Varint b -> b.default 156 + | Fixed32 b -> b.default 157 + | Fixed64 b -> b.default 158 + | Length_delim b -> b.default 159 + | Message m -> m.msg_default 160 + | Rec c -> ( 161 + match Lazy.force c with 162 + | Varint b -> b.default 163 + | Fixed32 b -> b.default 164 + | Fixed64 b -> b.default 165 + | Length_delim b -> b.default 166 + | Message m -> m.msg_default 167 + | Rec _ -> assert false) 168 + 169 + (* -- Wire-value extraction, typed errors -- *) 170 + 171 + let type_error ~sort expected got = 172 + raise 173 + (Wire.Decode_error 174 + (Fmt.str "%a: expected wire type %a, got %a" Sort.pp sort Wire.pp 175 + expected Wire.pp (wire_value_type got))) 176 + 177 + let varint_of ~sort = function 178 + | WV_varint v -> v 179 + | w -> type_error ~sort Wire.Varint w 180 + 181 + let fixed32_of ~sort = function 182 + | WV_fixed32 v -> v 183 + | w -> type_error ~sort Wire.Fixed32 w 184 + 185 + let fixed64_of ~sort = function 186 + | WV_fixed64 v -> v 187 + | w -> type_error ~sort Wire.Fixed64 w 188 + 189 + let length_delim_of ~sort = function 190 + | WV_length_delim s -> s 191 + | w -> type_error ~sort Wire.Length_delimited w 192 + 193 + (* -- Walk-based encode / decode over the GADT -- *) 194 + 195 + let rec decode_value : type a. a t -> wire_value -> a = 196 + fun codec w -> 197 + match codec with 198 + | Varint b -> b.dec (varint_of ~sort:b.sort w) 199 + | Fixed32 b -> b.dec (fixed32_of ~sort:b.sort w) 200 + | Fixed64 b -> b.dec (fixed64_of ~sort:b.sort w) 201 + | Length_delim b -> b.dec (length_delim_of ~sort:b.sort w) 202 + | Message m -> 203 + let body = length_delim_of ~sort:Sort.Message w in 204 + with_depth_check (fun () -> m.decode_body body 0 (String.length body)) 205 + | Rec c -> decode_value (Lazy.force c) w 206 + 207 + (* [decode_bytes] reads a bare value at a byte offset. Only used for 208 + packed decoding where the values are concatenated without tags. 209 + Length-delimited codecs are not packable. *) 210 + let rec decode_bytes : type a. a t -> string -> int -> a * int = 211 + fun codec s off -> 212 + match codec with 213 + | Varint b -> 214 + let v, off' = Wire.read_int64 s off in 215 + (b.dec v, off') 216 + | Fixed32 b -> 217 + let v, off' = Wire.read_fixed32 s off in 218 + (b.dec v, off') 219 + | Fixed64 b -> 220 + let v, off' = Wire.read_fixed64 s off in 221 + (b.dec v, off') 222 + | Length_delim _ | Message _ -> 223 + raise 224 + (Wire.Decode_error 225 + "length-delimited codec cannot appear inside a packed field") 226 + | Rec c -> decode_bytes (Lazy.force c) s off 227 + 228 + let rec write_value : type a. Buffer.t -> a t -> a -> unit = 229 + fun buf codec v -> 230 + match codec with 231 + | Varint b -> Wire.write_int64 buf (b.enc v) 232 + | Fixed32 b -> Wire.write_fixed32 buf (b.enc v) 233 + | Fixed64 b -> Wire.write_fixed64 buf (b.enc v) 234 + | Length_delim b -> Wire.write_string buf (b.enc v) 235 + | Message m -> 236 + let body = Buffer.create 64 in 237 + m.encode_body body v; 238 + Leb128.add_u63_to_buffer buf (Buffer.length body); 239 + Buffer.add_buffer buf body 240 + | Rec c -> write_value buf (Lazy.force c) v 241 + 242 + (* -- Scalar codecs: 15 protobuf scalar types, grouped by wire type. -- *) 243 + 244 + let int32 : int32 t = 245 + Varint 246 + { sort = Int32; dec = Int64.to_int32; enc = Int64.of_int32; default = 0l } 247 + 248 + let int64 : int64 t = 249 + Varint { sort = Int64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 250 + 251 + let uint32 : int32 t = 252 + Varint 253 + { 254 + sort = Uint32; 255 + dec = (fun x -> Int64.to_int32 (Int64.logand x 0xFFFF_FFFFL)); 256 + enc = (fun x -> Int64.logand (Int64.of_int32 x) 0xFFFF_FFFFL); 257 + default = 0l; 258 + } 259 + 260 + let uint64 : int64 t = 261 + Varint { sort = Uint64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 262 + 263 + let sint32 : int32 t = 264 + Varint 265 + { 266 + sort = Sint32; 267 + dec = (fun x -> Int64.to_int32 (Leb128.zigzag_decode_i64 x)); 268 + enc = (fun x -> Leb128.zigzag_encode_i64 (Int64.of_int32 x)); 269 + default = 0l; 270 + } 271 + 272 + let sint64 : int64 t = 273 + Varint 274 + { 275 + sort = Sint64; 276 + dec = Leb128.zigzag_decode_i64; 277 + enc = Leb128.zigzag_encode_i64; 278 + default = 0L; 279 + } 280 + 281 + let fixed32 : int32 t = 282 + Fixed32 283 + { sort = Fixed32; dec = (fun x -> x); enc = (fun x -> x); default = 0l } 284 + 285 + let fixed64 : int64 t = 286 + Fixed64 287 + { sort = Fixed64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 288 + 289 + let sfixed32 : int32 t = 290 + Fixed32 291 + { sort = Sfixed32; dec = (fun x -> x); enc = (fun x -> x); default = 0l } 292 + 293 + let sfixed64 : int64 t = 294 + Fixed64 295 + { sort = Sfixed64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 296 + 297 + let float : float t = 298 + Fixed32 299 + { 300 + sort = Float; 301 + dec = Int32.float_of_bits; 302 + enc = Int32.bits_of_float; 303 + default = 0.0; 304 + } 305 + 306 + let double : float t = 307 + Fixed64 308 + { 309 + sort = Double; 310 + dec = Int64.float_of_bits; 311 + enc = Int64.bits_of_float; 312 + default = 0.0; 313 + } 314 + 315 + let bool : bool t = 316 + Varint 317 + { 318 + sort = Bool; 319 + dec = (fun x -> not (Int64.equal x 0L)); 320 + enc = (fun b -> if b then 1L else 0L); 321 + default = false; 322 + } 323 + 324 + let string : string t = 325 + Length_delim 326 + { sort = String; dec = (fun x -> x); enc = (fun x -> x); default = "" } 327 + 328 + let bytes : string t = 329 + Length_delim 330 + { sort = Bytes; dec = (fun x -> x); enc = (fun x -> x); default = "" } 331 + 332 + (* -- Recursive codecs -- 333 + 334 + Protobuf schemas can be self-referential (a tree node containing a 335 + list of child nodes of the same type). [fix] lets callers build a 336 + codec that references itself: [f] is invoked with a forwarding 337 + placeholder; any self-references in [f]'s body resolve to the 338 + final codec at decode/encode time via [Lazy]. *) 339 + 340 + let fix : type a. default:a -> (a t -> a t) -> a t = 341 + fun ~default f -> 342 + let rec lazy_codec = lazy (f (Rec lazy_codec)) in 343 + let _ = default in 344 + (* [default] is reserved for a future extension: currently all 345 + recursive codecs collapse to Message at runtime, whose 346 + [msg_default] carries the default. Keep the parameter in the API 347 + so callers don't need to change when we thread it. *) 348 + Rec lazy_codec 349 + 350 + (* -- Message combinators -- 351 + 352 + The [(o, a) field] GADT captures a sequence of field declarations 353 + and the continuation that builds the record value. Encoding walks 354 + the GADT in declaration order and emits (tag, value) per field. 355 + Decoding pre-parses the wire into a tag -> wire_value list table 356 + and then walks the same GADT, looking each field up in the table. *) 357 + 358 + module Message = struct 359 + type (_, _) field = 360 + | Return : 'a -> ('o, 'a) field 361 + | Required : { 362 + tag : int; 363 + get : 'o -> 'x; 364 + codec : 'x t; 365 + cont : 'x -> ('o, 'a) field; 366 + } 367 + -> ('o, 'a) field 368 + | Optional : { 369 + tag : int; 370 + get : 'o -> 'x option; 371 + codec : 'x t; 372 + cont : 'x option -> ('o, 'a) field; 373 + } 374 + -> ('o, 'a) field 375 + | Repeated : { 376 + tag : int; 377 + get : 'o -> 'x list; 378 + codec : 'x t; 379 + packed : bool; 380 + cont : 'x list -> ('o, 'a) field; 381 + } 382 + -> ('o, 'a) field 383 + | Oneof : { 384 + get : 'o -> 'x; 385 + default : 'x; 386 + cases : 'x case list; 387 + cont : 'x -> ('o, 'a) field; 388 + } 389 + -> ('o, 'a) field 390 + 391 + and 'a case = 392 + | Case : { 393 + tag : int; 394 + codec : 'x t; 395 + inject : 'x -> 'a; 396 + extract : 'a -> 'x option; 397 + } 398 + -> 'a case 399 + 400 + let return v = Return v 401 + 402 + let required tag get codec = 403 + Required { tag; get; codec; cont = (fun x -> Return x) } 404 + 405 + let optional tag get codec = 406 + Optional { tag; get; codec; cont = (fun x -> Return x) } 407 + 408 + let repeated tag get codec = 409 + Repeated { tag; get; codec; packed = false; cont = (fun x -> Return x) } 410 + 411 + let packed tag get codec = 412 + Repeated { tag; get; codec; packed = true; cont = (fun x -> Return x) } 413 + 414 + let case tag codec ~inject ~extract = Case { tag; codec; inject; extract } 415 + 416 + let oneof ~default get cases = 417 + Oneof { get; default; cases; cont = (fun x -> Return x) } 418 + 419 + let rec ( let* ) : type o a b. 420 + (o, a) field -> (a -> (o, b) field) -> (o, b) field = 421 + fun m f -> 422 + match m with 423 + | Return a -> f a 424 + | Required r -> 425 + Required 426 + { 427 + r with 428 + cont = 429 + (fun x -> 430 + let* y = r.cont x in 431 + f y); 432 + } 433 + | Optional r -> 434 + Optional 435 + { 436 + r with 437 + cont = 438 + (fun x -> 439 + let* y = r.cont x in 440 + f y); 441 + } 442 + | Repeated r -> 443 + Repeated 444 + { 445 + r with 446 + cont = 447 + (fun x -> 448 + let* y = r.cont x in 449 + f y); 450 + } 451 + | Oneof r -> 452 + Oneof 453 + { 454 + r with 455 + cont = 456 + (fun x -> 457 + let* y = r.cont x in 458 + f y); 459 + } 460 + 461 + (* -- Encoding -- *) 462 + 463 + let write_field buf ~tag codec v = 464 + Wire.write_tag buf ~field_number:tag ~wire_type:(wire_type_of codec); 465 + write_value buf codec v 466 + 467 + let write_packed buf ~tag codec vs = 468 + (* Concatenate raw value bytes into a scratch buffer, then emit as a 469 + single length-delimited blob. *) 470 + let body = Buffer.create 64 in 471 + let rec emit_list = function 472 + | [] -> () 473 + | v :: rest -> 474 + (match codec with 475 + | Varint b -> Wire.write_int64 body (b.enc v) 476 + | Fixed32 b -> Wire.write_fixed32 body (b.enc v) 477 + | Fixed64 b -> Wire.write_fixed64 body (b.enc v) 478 + | Length_delim _ | Message _ -> 479 + raise 480 + (Wire.Decode_error 481 + "length-delimited codec cannot be used inside a packed field") 482 + | Rec _ -> 483 + raise 484 + (Wire.Decode_error 485 + "recursive codec cannot be used inside a packed field")); 486 + emit_list rest 487 + in 488 + emit_list vs; 489 + Wire.write_tag buf ~field_number:tag ~wire_type:Length_delimited; 490 + Leb128.add_u63_to_buffer buf (Buffer.length body); 491 + Buffer.add_buffer buf body 492 + 493 + let rec encode_fields : type o a. Buffer.t -> o -> (o, a) field -> unit = 494 + fun buf o m -> 495 + match m with 496 + | Return _ -> () 497 + | Required { tag; get; codec; cont } -> 498 + let v = get o in 499 + (* proto3 semantics: omit a required scalar field that equals 500 + the codec's default. *) 501 + if v <> default_of codec then write_field buf ~tag codec v; 502 + encode_fields buf o (cont v) 503 + | Optional { tag; get; codec; cont } -> 504 + let v_opt = get o in 505 + (match v_opt with Some v -> write_field buf ~tag codec v | None -> ()); 506 + encode_fields buf o (cont v_opt) 507 + | Repeated { tag; get; codec; packed; cont } -> 508 + let vs = get o in 509 + (match vs with 510 + | [] -> () 511 + | _ when packed -> write_packed buf ~tag codec vs 512 + | _ -> List.iter (write_field buf ~tag codec) vs); 513 + encode_fields buf o (cont vs) 514 + | Oneof { get; cases; cont; _ } -> 515 + let v = get o in 516 + let rec emit_case = function 517 + | [] -> () 518 + | Case { tag; codec; extract; _ } :: rest -> ( 519 + match extract v with 520 + | Some x -> write_field buf ~tag codec x 521 + | None -> emit_case rest) 522 + in 523 + emit_case cases; 524 + encode_fields buf o (cont v) 525 + 526 + (* -- Decoding helpers -- *) 527 + 528 + (* Parse the wire into a tag -> [(seq, wire_value) list] table. The 529 + sequence counter captures global wire order across tags, so 530 + [oneof] can determine which of its alternative cases came last. 531 + Within each bucket the list is stored in reverse wire order 532 + (prepend on insert), so [List.hd] is the last-added entry. *) 533 + let parse_wire s start end_ : (int, (int * wire_value) list ref) Hashtbl.t = 534 + let table = Hashtbl.create 8 in 535 + let seq = ref 0 in 536 + let push tag v = 537 + let entry = (!seq, v) in 538 + incr seq; 539 + match Hashtbl.find_opt table tag with 540 + | Some r -> r := entry :: !r 541 + | None -> Hashtbl.add table tag (ref [ entry ]) 542 + in 543 + let pos = ref start in 544 + while !pos < end_ do 545 + let field_number, wt, off = Wire.read_tag s !pos in 546 + pos := off; 547 + match wt with 548 + | Wire.Varint -> 549 + let v, off = Wire.read_int64 s !pos in 550 + push field_number (WV_varint v); 551 + pos := off 552 + | Wire.Fixed32 -> 553 + let v, off = Wire.read_fixed32 s !pos in 554 + push field_number (WV_fixed32 v); 555 + pos := off 556 + | Wire.Fixed64 -> 557 + let v, off = Wire.read_fixed64 s !pos in 558 + push field_number (WV_fixed64 v); 559 + pos := off 560 + | Wire.Length_delimited -> 561 + let v, off = Wire.read_bytes s !pos in 562 + push field_number (WV_length_delim v); 563 + pos := off 564 + done; 565 + if !pos <> end_ then 566 + raise 567 + (Wire.Decode_error 568 + (Fmt.str "overran message boundary: at %d, expected end %d" !pos end_)); 569 + table 570 + 571 + (* [take_*] consume the matched entries out of the table so that after 572 + [decode_fields] returns, the remaining entries are exactly the 573 + unknown fields — tags that weren't claimed by the schema. *) 574 + 575 + let take_last table tag = 576 + match Hashtbl.find_opt table tag with 577 + | None -> None 578 + | Some r -> ( 579 + (* The list is in reverse wire order (last-added first). *) 580 + match !r with 581 + | [] -> None 582 + | (_, v) :: _ -> 583 + Hashtbl.remove table tag; 584 + Some v) 585 + 586 + let take_all table tag = 587 + match Hashtbl.find_opt table tag with 588 + | None -> [] 589 + | Some r -> 590 + Hashtbl.remove table tag; 591 + List.rev_map snd !r 592 + 593 + (* For oneof: find the case whose tag has the highest wire-sequence 594 + number. Returns [Some (case, wire_value)] if any case was on the 595 + wire, [None] otherwise. Removes the consumed case's entry from 596 + the table. *) 597 + let take_oneof_last : type a. 598 + (int, (int * wire_value) list ref) Hashtbl.t -> 599 + a case list -> 600 + (a case * wire_value) option = 601 + fun table cases -> 602 + let best = ref None in 603 + List.iter 604 + (fun (Case { tag; _ } as c) -> 605 + match Hashtbl.find_opt table tag with 606 + | None -> () 607 + | Some r -> ( 608 + match !r with 609 + | [] -> () 610 + | (seq, v) :: _ -> ( 611 + match !best with 612 + | None -> best := Some (seq, c, v) 613 + | Some (best_seq, _, _) when seq > best_seq -> 614 + best := Some (seq, c, v) 615 + | Some _ -> ()))) 616 + cases; 617 + (* Consume every case's tag from the table so they don't leak to 618 + unknowns. *) 619 + List.iter (fun (Case { tag; _ }) -> Hashtbl.remove table tag) cases; 620 + match !best with None -> None | Some (_, c, v) -> Some (c, v) 621 + 622 + let write_unknown_field buf tag = function 623 + | WV_varint v -> 624 + Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Varint; 625 + Wire.write_int64 buf v 626 + | WV_fixed32 v -> 627 + Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Fixed32; 628 + Wire.write_fixed32 buf v 629 + | WV_fixed64 v -> 630 + Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Fixed64; 631 + Wire.write_fixed64 buf v 632 + | WV_length_delim s -> 633 + Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Length_delimited; 634 + Wire.write_string buf s 635 + 636 + let collect_unknowns table = 637 + let buf = Buffer.create 16 in 638 + let tags = Hashtbl.fold (fun k _ acc -> k :: acc) table [] in 639 + (* Sort by tag for deterministic re-emission. *) 640 + let tags = List.sort compare tags in 641 + List.iter 642 + (fun tag -> 643 + let rvals = Hashtbl.find table tag in 644 + List.iter 645 + (fun (_, wv) -> write_unknown_field buf tag wv) 646 + (List.rev !rvals)) 647 + tags; 648 + Buffer.contents buf 649 + 650 + let decode_packed_or_repeated : type a. a t -> wire_value list -> a list = 651 + fun codec values -> 652 + (* For a repeated field, each element in [values] can be either a 653 + scalar wire value (non-packed) or a length-delimited blob 654 + containing the concatenation (packed). The protobuf spec 655 + requires decoders to accept both forms on the same field for 656 + compatibility. *) 657 + let acc = ref [] in 658 + List.iter 659 + (fun w -> 660 + match (w, wire_type_of codec) with 661 + | WV_length_delim body, (Wire.Varint | Wire.Fixed32 | Wire.Fixed64) -> 662 + (* Packed form: parse body as a sequence of raw values. *) 663 + let pos = ref 0 in 664 + let len = String.length body in 665 + while !pos < len do 666 + let v, off = decode_bytes codec body !pos in 667 + acc := v :: !acc; 668 + pos := off 669 + done 670 + | _ -> acc := decode_value codec w :: !acc) 671 + values; 672 + List.rev !acc 673 + 674 + let rec decode_fields : type o a. 675 + (int, (int * wire_value) list ref) Hashtbl.t -> (o, a) field -> a = 676 + fun table m -> 677 + match m with 678 + | Return a -> a 679 + | Required { tag; codec; cont; _ } -> 680 + let v = 681 + match take_last table tag with 682 + | Some w -> decode_value codec w 683 + | None -> default_of codec 684 + in 685 + decode_fields table (cont v) 686 + | Optional { tag; codec; cont; _ } -> 687 + let v = 688 + match take_last table tag with 689 + | Some w -> Some (decode_value codec w) 690 + | None -> None 691 + in 692 + decode_fields table (cont v) 693 + | Repeated { tag; codec; cont; _ } -> 694 + let vs = decode_packed_or_repeated codec (take_all table tag) in 695 + decode_fields table (cont vs) 696 + | Oneof { default; cases; cont; _ } -> 697 + let v = 698 + match take_oneof_last table cases with 699 + | None -> default 700 + | Some (Case { codec; inject; _ }, w) -> inject (decode_value codec w) 701 + in 702 + decode_fields table (cont v) 703 + 704 + (* A [map<K, V>] field is sugar for [repeated Entry { K key = 1; V value 705 + = 2 }] on the wire — each entry is a length-delimited submessage with 706 + two required fields. [map_entry_codec] builds the entry codec; [map] 707 + declares a field that collects a list of [(k, v)] pairs. *) 708 + let map_entry_codec : type k v. k t -> v t -> (k * v) t = 709 + fun key_codec value_codec -> 710 + (* Constructed directly (not via [finish]) because the entry is an 711 + ephemeral tuple rather than a named record — skips the 712 + let* / return chain. *) 713 + let encode_body buf (k, v) = 714 + if k <> default_of key_codec then begin 715 + Wire.write_tag buf ~field_number:1 ~wire_type:(wire_type_of key_codec); 716 + write_value buf key_codec k 717 + end; 718 + if v <> default_of value_codec then begin 719 + Wire.write_tag buf ~field_number:2 ~wire_type:(wire_type_of value_codec); 720 + write_value buf value_codec v 721 + end 722 + in 723 + let decode_body s start end_ = 724 + let table = parse_wire s start end_ in 725 + let k = 726 + match take_last table 1 with 727 + | Some w -> decode_value key_codec w 728 + | None -> default_of key_codec 729 + in 730 + let v = 731 + match take_last table 2 with 732 + | Some w -> decode_value value_codec w 733 + | None -> default_of value_codec 734 + in 735 + (k, v) 736 + in 737 + let decode_body_with_unknowns s start end_ = 738 + (* Map entries don't carry unknown fields in the protobuf spec; 739 + any extra tags are silently dropped. *) 740 + (decode_body s start end_, "") 741 + in 742 + let msg_default = (default_of key_codec, default_of value_codec) in 743 + Message { encode_body; decode_body; decode_body_with_unknowns; msg_default } 744 + 745 + let map tag get key_codec value_codec = 746 + Repeated 747 + { 748 + tag; 749 + get; 750 + codec = map_entry_codec key_codec value_codec; 751 + packed = false; 752 + cont = (fun x -> Return x); 753 + } 754 + 755 + let finish : type o. (o, o) field -> o t = 756 + fun spec -> 757 + let encode_body buf o = encode_fields buf o spec in 758 + let decode_body s start end_ = 759 + let table = parse_wire s start end_ in 760 + decode_fields table spec 761 + in 762 + let decode_body_with_unknowns s start end_ = 763 + let table = parse_wire s start end_ in 764 + let value = decode_fields table spec in 765 + (value, collect_unknowns table) 766 + in 767 + (* A message with no fields populated: all scalars take their 768 + default, repeated fields are empty, optionals are [None]. *) 769 + let msg_default = decode_body "" 0 0 in 770 + Message { encode_body; decode_body; decode_body_with_unknowns; msg_default } 771 + end 772 + 773 + (* -- Top-level encode / decode. 774 + 775 + Messages and length-delimited scalars at top level write just the 776 + body (no outer tag or length prefix). Other scalars write their raw 777 + value bytes — useful for low-level round-trip tests. *) 778 + 779 + let encode_string : type a. a t -> a -> string = 780 + fun codec v -> 781 + let buf = Buffer.create 64 in 782 + (match codec with 783 + | Message m -> m.encode_body buf v 784 + | Length_delim b -> Buffer.add_string buf (b.enc v) 785 + | Varint _ | Fixed32 _ | Fixed64 _ -> write_value buf codec v 786 + | Rec c -> ( 787 + match Lazy.force c with 788 + | Message m -> m.encode_body buf v 789 + | Length_delim b -> Buffer.add_string buf (b.enc v) 790 + | other -> write_value buf other v)); 791 + Buffer.contents buf 792 + 793 + let decode_string : type a. a t -> string -> (a, Error.t) result = 794 + fun codec s -> 795 + depth := 0; 796 + try 797 + match codec with 798 + | Message m -> Ok (m.decode_body s 0 (String.length s)) 799 + | Length_delim b -> Ok (b.dec s) 800 + | Varint _ | Fixed32 _ | Fixed64 _ -> 801 + let v, off = decode_bytes codec s 0 in 802 + if off <> String.length s then 803 + Error 804 + (Error.of_wire_error 805 + (Fmt.str "trailing %d bytes after scalar" 806 + (String.length s - off))) 807 + else Ok v 808 + | Rec c -> ( 809 + match Lazy.force c with 810 + | Message m -> Ok (m.decode_body s 0 (String.length s)) 811 + | Length_delim b -> Ok (b.dec s) 812 + | other -> 813 + let v, off = decode_bytes other s 0 in 814 + if off <> String.length s then 815 + Error 816 + (Error.of_wire_error 817 + (Fmt.str "trailing %d bytes after scalar" 818 + (String.length s - off))) 819 + else Ok v) 820 + with Wire.Decode_error msg -> Error (Error.of_wire_error msg) 821 + 822 + let encode codec w v = 823 + let s = encode_string codec v in 824 + Bytesrw.Bytes.Writer.write_string w s 825 + 826 + let decode codec r = 827 + let s = Bytesrw.Bytes.Reader.to_string r in 828 + decode_string codec s 829 + 830 + (* -- Unknown-field preservation -- 831 + 832 + Protobuf's wire format guarantees: a decoder ignorant of a field's 833 + tag MUST skip it; a decoder aware of the tag MUST read it. For 834 + forward-compatible pipelines that decode, mutate, and re-emit a 835 + message, the unknown fields should survive the round-trip. 836 + 837 + [decode_with_unknowns_string] returns both the decoded message and 838 + a byte string containing the wire bytes of every tag the schema 839 + didn't claim. [encode_with_unknowns_string] appends those bytes 840 + back when re-emitting. 841 + 842 + Caveats: 843 + - Byte equality is not preserved. Unknowns are re-serialized in 844 + canonical form (varint-minimal encoding) and sorted by tag. 845 + - Only works for Message codecs. Calling on a bare scalar returns 846 + [Error]. *) 847 + 848 + let decode_with_unknowns_string : type a. 849 + a t -> string -> (a * string, Error.t) result = 850 + fun codec s -> 851 + depth := 0; 852 + try 853 + match codec with 854 + | Message m -> Ok (m.decode_body_with_unknowns s 0 (String.length s)) 855 + | Rec c -> ( 856 + match Lazy.force c with 857 + | Message m -> Ok (m.decode_body_with_unknowns s 0 (String.length s)) 858 + | _ -> 859 + Error 860 + (Error.of_wire_error 861 + "decode_with_unknowns_string: codec is not a message")) 862 + | _ -> 863 + Error 864 + (Error.of_wire_error 865 + "decode_with_unknowns_string: codec is not a message") 866 + with Wire.Decode_error msg -> Error (Error.of_wire_error msg) 867 + 868 + let encode_with_unknowns_string : type a. a t -> unknowns:string -> a -> string 869 + = 870 + fun codec ~unknowns v -> 871 + let buf = Buffer.create 64 in 872 + (match codec with 873 + | Message m -> 874 + m.encode_body buf v; 875 + Buffer.add_string buf unknowns 876 + | Rec c -> ( 877 + match Lazy.force c with 878 + | Message m -> 879 + m.encode_body buf v; 880 + Buffer.add_string buf unknowns 881 + | _ -> invalid_arg "encode_with_unknowns_string: codec is not a message") 882 + | _ -> invalid_arg "encode_with_unknowns_string: codec is not a message"); 883 + Buffer.contents buf
+81
lib/codec.mli
··· 1 + (** Codec GADT and combinators. 2 + 3 + The codec is a finally-tagged GADT whose top-level constructors name the 4 + four protobuf wire types (varint, fixed32, fixed64, length-delimited) plus 5 + composition and recursion. Interpreters (encode, decode, {!pp}) 6 + pattern-match on the GADT; adding a new interpreter is adding a new walker 7 + alongside the existing ones, no change to the combinator call sites. *) 8 + 9 + (** {1 Codec type} *) 10 + 11 + type 'a t 12 + (** A codec for protobuf values of type ['a]. *) 13 + 14 + val pp : Format.formatter -> 'a t -> unit 15 + (** [pp ppf c] prints a short human-readable sort tag for the codec (e.g. 16 + ["int32"], ["fixed64"], ["message"]). Intended for debugging; not a 17 + round-trippable representation. *) 18 + 19 + (** {1 Scalars} *) 20 + 21 + val int32 : int32 t 22 + val int64 : int64 t 23 + val uint32 : int32 t 24 + val uint64 : int64 t 25 + val sint32 : int32 t 26 + val sint64 : int64 t 27 + val fixed32 : int32 t 28 + val fixed64 : int64 t 29 + val sfixed32 : int32 t 30 + val sfixed64 : int64 t 31 + val float : float t 32 + val double : float t 33 + val bool : bool t 34 + val string : string t 35 + val bytes : string t 36 + 37 + (** {1 Messages} *) 38 + 39 + module Message : sig 40 + type ('o, 'a) field 41 + 42 + val ( let* ) : ('o, 'a) field -> ('a -> ('o, 'b) field) -> ('o, 'b) field 43 + val return : 'a -> ('o, 'a) field 44 + val finish : ('o, 'o) field -> 'o t 45 + val required : int -> ('o -> 'a) -> 'a t -> ('o, 'a) field 46 + val optional : int -> ('o -> 'a option) -> 'a t -> ('o, 'a option) field 47 + val repeated : int -> ('o -> 'a list) -> 'a t -> ('o, 'a list) field 48 + val packed : int -> ('o -> 'a list) -> 'a t -> ('o, 'a list) field 49 + 50 + val map : 51 + int -> ('o -> ('k * 'v) list) -> 'k t -> 'v t -> ('o, ('k * 'v) list) field 52 + 53 + type 'a case 54 + 55 + val case : 56 + int -> 'a t -> inject:('a -> 'b) -> extract:('b -> 'a option) -> 'b case 57 + 58 + val oneof : default:'a -> ('o -> 'a) -> 'a case list -> ('o, 'a) field 59 + end 60 + 61 + val fix : default:'a -> ('a t -> 'a t) -> 'a t 62 + (** [fix ~default f] builds a self-referential codec. *) 63 + 64 + (**/**) 65 + 66 + (** {1 Internals} 67 + 68 + Exposed for the top-level [Protobuf] module's reading/writing entry points. 69 + Not part of the stable public API. *) 70 + 71 + val encode_string : 'a t -> 'a -> string 72 + val decode_string : 'a t -> string -> ('a, Error.t) result 73 + val encode : 'a t -> Bytesrw.Bytes.Writer.t -> 'a -> unit 74 + val decode : 'a t -> Bytesrw.Bytes.Reader.t -> ('a, Error.t) result 75 + 76 + val decode_with_unknowns_string : 77 + 'a t -> string -> ('a * string, Error.t) result 78 + 79 + val encode_with_unknowns_string : 'a t -> unknowns:string -> 'a -> string 80 + 81 + (**/**)
+31 -883
lib/protobuf.ml
··· 1 - (* Protocol Buffers codec, finally-tagged. 2 - 3 - The top-level alphabet names the four protobuf wire types plus 4 - message-level composition. Interpreters (encode, decode) walk the 5 - [Codec.t] GADT; adding a new interpreter (schema extraction, 6 - pretty-printer, diff) is adding a new walker without touching the 7 - combinator call sites. *) 8 - 9 1 module Wire = Wire 10 2 module Error = Error 11 - 12 - (* -- Nested-message depth tracking. 13 - 14 - An adversarial input with thousands of levels of nested Length_delim 15 - fields would stack-overflow the runtime: each nesting level is one 16 - OCaml stack frame in the recursive message decoder. Bound the depth 17 - at [max_depth] by default (matching protoc's C++/Java default) and 18 - fail the decode with a clean [Decode_error] when exceeded. *) 19 - let max_depth = 100 20 - let depth = ref 0 21 - 22 - let with_depth_check f = 23 - if !depth >= max_depth then 24 - raise 25 - (Wire.Decode_error (Fmt.str "nested message depth %d exceeded" max_depth)); 26 - incr depth; 27 - Fun.protect ~finally:(fun () -> decr depth) f 28 - 29 - (* -- Pre-parsed wire values. 30 - 31 - The message decoder parses the byte stream into a [(tag, wire_value 32 - list)] table before running the field GADT, so tags can appear in 33 - any order on the wire. [wire_value] is internal — the public GADT 34 - [t] names the logical types, not the raw bytes. *) 35 - type wire_value = 36 - | WV_varint of int64 37 - | WV_fixed32 of int32 38 - | WV_fixed64 of int64 39 - | WV_length_delim of string 40 - 41 - let wire_value_type = function 42 - | WV_varint _ -> Wire.Varint 43 - | WV_fixed32 _ -> Wire.Fixed32 44 - | WV_fixed64 _ -> Wire.Fixed64 45 - | WV_length_delim _ -> Wire.Length_delimited 46 - 47 - (* Sort: the 15 protobuf scalar types plus Message. 48 - Used for error messages like "expected int32, got string". *) 49 - module Sort = struct 50 - type t = 51 - | Int32 52 - | Int64 53 - | Uint32 54 - | Uint64 55 - | Sint32 56 - | Sint64 57 - | Fixed32 58 - | Fixed64 59 - | Sfixed32 60 - | Sfixed64 61 - | Float 62 - | Double 63 - | Bool 64 - | String 65 - | Bytes 66 - | Message 67 - 68 - let to_string = function 69 - | Int32 -> "int32" 70 - | Int64 -> "int64" 71 - | Uint32 -> "uint32" 72 - | Uint64 -> "uint64" 73 - | Sint32 -> "sint32" 74 - | Sint64 -> "sint64" 75 - | Fixed32 -> "fixed32" 76 - | Fixed64 -> "fixed64" 77 - | Sfixed32 -> "sfixed32" 78 - | Sfixed64 -> "sfixed64" 79 - | Float -> "float" 80 - | Double -> "double" 81 - | Bool -> "bool" 82 - | String -> "string" 83 - | Bytes -> "bytes" 84 - | Message -> "message" 85 - 86 - let pp ppf s = Fmt.string ppf (to_string s) 87 - end 88 - 89 - (* Typed conversion from a wire-level representation to an OCaml value. 90 - The wire representation is determined by which GADT constructor 91 - wraps this record: [Varint] pairs with [int64], [Fixed32] with 92 - [int32], etc. *) 93 - type ('w, 'a) base = { 94 - sort : Sort.t; 95 - dec : 'w -> 'a; 96 - enc : 'a -> 'w; 97 - default : 'a; 98 - } 99 - 100 - (* Internal encoder/decoder pair for a message. Walking the [Message] 101 - node branches into these; they are not part of the public 102 - combinator vocabulary. *) 103 - type 'o message_spec = { 104 - encode_body : Buffer.t -> 'o -> unit; 105 - decode_body : string -> int -> int -> 'o; 106 - (* Like [decode_body], but also returns the raw wire bytes of any 107 - fields whose tag was not in the schema (re-serialized in canonical 108 - form so they can be appended to a later encode). *) 109 - decode_body_with_unknowns : string -> int -> int -> 'o * string; 110 - msg_default : 'o; 111 - } 112 - 113 - (* The Codec GADT. 114 - 115 - Each constructor names a FORMAT-level sort (wire type for scalars, 116 - plus composition/recursion). Users build codecs via the combinators 117 - below; interpreters destructure. *) 118 - type _ t = 119 - | Varint : (int64, 'a) base -> 'a t 120 - | Fixed32 : (int32, 'a) base -> 'a t 121 - | Fixed64 : (int64, 'a) base -> 'a t 122 - | Length_delim : (string, 'a) base -> 'a t 123 - | Message : 'a message_spec -> 'a t 124 - | Rec : 'a t Lazy.t -> 'a t 125 - 126 - (* Expose a few witnesses so callers can pattern-match the wire type 127 - without destructuring the GADT (useful for field-level code). *) 128 - 129 - let pp : type a. Format.formatter -> a t -> unit = 130 - fun ppf -> function 131 - | Varint b -> Sort.pp ppf b.sort 132 - | Fixed32 b -> Sort.pp ppf b.sort 133 - | Fixed64 b -> Sort.pp ppf b.sort 134 - | Length_delim b -> Sort.pp ppf b.sort 135 - | Message _ -> Fmt.string ppf "message" 136 - | Rec _ -> Fmt.string ppf "rec message" 137 - 138 - let wire_type_of : type a. a t -> Wire.t = function 139 - | Varint _ -> Wire.Varint 140 - | Fixed32 _ -> Wire.Fixed32 141 - | Fixed64 _ -> Wire.Fixed64 142 - | Length_delim _ -> Wire.Length_delimited 143 - | Message _ -> Wire.Length_delimited 144 - | Rec c -> ( 145 - (* The Lazy may not be forced yet; peek safely. *) 146 - match Lazy.force c with 147 - | Varint _ -> Wire.Varint 148 - | Fixed32 _ -> Wire.Fixed32 149 - | Fixed64 _ -> Wire.Fixed64 150 - | Length_delim _ -> Wire.Length_delimited 151 - | Message _ -> Wire.Length_delimited 152 - | Rec _ -> Wire.Length_delimited) 153 - 154 - let default_of : type a. a t -> a = function 155 - | Varint b -> b.default 156 - | Fixed32 b -> b.default 157 - | Fixed64 b -> b.default 158 - | Length_delim b -> b.default 159 - | Message m -> m.msg_default 160 - | Rec c -> ( 161 - match Lazy.force c with 162 - | Varint b -> b.default 163 - | Fixed32 b -> b.default 164 - | Fixed64 b -> b.default 165 - | Length_delim b -> b.default 166 - | Message m -> m.msg_default 167 - | Rec _ -> assert false) 168 - 169 - (* -- Wire-value extraction, typed errors -- *) 170 - 171 - let type_error ~sort expected got = 172 - raise 173 - (Wire.Decode_error 174 - (Fmt.str "%a: expected wire type %a, got %a" Sort.pp sort Wire.pp 175 - expected Wire.pp (wire_value_type got))) 176 - 177 - let varint_of ~sort = function 178 - | WV_varint v -> v 179 - | w -> type_error ~sort Wire.Varint w 180 - 181 - let fixed32_of ~sort = function 182 - | WV_fixed32 v -> v 183 - | w -> type_error ~sort Wire.Fixed32 w 184 - 185 - let fixed64_of ~sort = function 186 - | WV_fixed64 v -> v 187 - | w -> type_error ~sort Wire.Fixed64 w 188 - 189 - let length_delim_of ~sort = function 190 - | WV_length_delim s -> s 191 - | w -> type_error ~sort Wire.Length_delimited w 192 - 193 - (* -- Walk-based encode / decode over the GADT -- *) 194 - 195 - let rec decode_value : type a. a t -> wire_value -> a = 196 - fun codec w -> 197 - match codec with 198 - | Varint b -> b.dec (varint_of ~sort:b.sort w) 199 - | Fixed32 b -> b.dec (fixed32_of ~sort:b.sort w) 200 - | Fixed64 b -> b.dec (fixed64_of ~sort:b.sort w) 201 - | Length_delim b -> b.dec (length_delim_of ~sort:b.sort w) 202 - | Message m -> 203 - let body = length_delim_of ~sort:Sort.Message w in 204 - with_depth_check (fun () -> m.decode_body body 0 (String.length body)) 205 - | Rec c -> decode_value (Lazy.force c) w 206 - 207 - (* [decode_bytes] reads a bare value at a byte offset. Only used for 208 - packed decoding where the values are concatenated without tags. 209 - Length-delimited codecs are not packable. *) 210 - let rec decode_bytes : type a. a t -> string -> int -> a * int = 211 - fun codec s off -> 212 - match codec with 213 - | Varint b -> 214 - let v, off' = Wire.read_int64 s off in 215 - (b.dec v, off') 216 - | Fixed32 b -> 217 - let v, off' = Wire.read_fixed32 s off in 218 - (b.dec v, off') 219 - | Fixed64 b -> 220 - let v, off' = Wire.read_fixed64 s off in 221 - (b.dec v, off') 222 - | Length_delim _ | Message _ -> 223 - raise 224 - (Wire.Decode_error 225 - "length-delimited codec cannot appear inside a packed field") 226 - | Rec c -> decode_bytes (Lazy.force c) s off 227 - 228 - let rec write_value : type a. Buffer.t -> a t -> a -> unit = 229 - fun buf codec v -> 230 - match codec with 231 - | Varint b -> Wire.write_int64 buf (b.enc v) 232 - | Fixed32 b -> Wire.write_fixed32 buf (b.enc v) 233 - | Fixed64 b -> Wire.write_fixed64 buf (b.enc v) 234 - | Length_delim b -> Wire.write_string buf (b.enc v) 235 - | Message m -> 236 - let body = Buffer.create 64 in 237 - m.encode_body body v; 238 - Leb128.add_u63_to_buffer buf (Buffer.length body); 239 - Buffer.add_buffer buf body 240 - | Rec c -> write_value buf (Lazy.force c) v 241 - 242 - (* -- Scalar codecs: 15 protobuf scalar types, grouped by wire type. -- *) 243 - 244 - let int32 : int32 t = 245 - Varint 246 - { sort = Int32; dec = Int64.to_int32; enc = Int64.of_int32; default = 0l } 247 - 248 - let int64 : int64 t = 249 - Varint { sort = Int64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 250 - 251 - let uint32 : int32 t = 252 - Varint 253 - { 254 - sort = Uint32; 255 - dec = (fun x -> Int64.to_int32 (Int64.logand x 0xFFFF_FFFFL)); 256 - enc = (fun x -> Int64.logand (Int64.of_int32 x) 0xFFFF_FFFFL); 257 - default = 0l; 258 - } 259 - 260 - let uint64 : int64 t = 261 - Varint { sort = Uint64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 262 - 263 - let sint32 : int32 t = 264 - Varint 265 - { 266 - sort = Sint32; 267 - dec = (fun x -> Int64.to_int32 (Leb128.zigzag_decode_i64 x)); 268 - enc = (fun x -> Leb128.zigzag_encode_i64 (Int64.of_int32 x)); 269 - default = 0l; 270 - } 271 - 272 - let sint64 : int64 t = 273 - Varint 274 - { 275 - sort = Sint64; 276 - dec = Leb128.zigzag_decode_i64; 277 - enc = Leb128.zigzag_encode_i64; 278 - default = 0L; 279 - } 280 - 281 - let fixed32 : int32 t = 282 - Fixed32 283 - { sort = Fixed32; dec = (fun x -> x); enc = (fun x -> x); default = 0l } 284 - 285 - let fixed64 : int64 t = 286 - Fixed64 287 - { sort = Fixed64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 288 - 289 - let sfixed32 : int32 t = 290 - Fixed32 291 - { sort = Sfixed32; dec = (fun x -> x); enc = (fun x -> x); default = 0l } 292 - 293 - let sfixed64 : int64 t = 294 - Fixed64 295 - { sort = Sfixed64; dec = (fun x -> x); enc = (fun x -> x); default = 0L } 296 - 297 - let float : float t = 298 - Fixed32 299 - { 300 - sort = Float; 301 - dec = Int32.float_of_bits; 302 - enc = Int32.bits_of_float; 303 - default = 0.0; 304 - } 305 - 306 - let double : float t = 307 - Fixed64 308 - { 309 - sort = Double; 310 - dec = Int64.float_of_bits; 311 - enc = Int64.bits_of_float; 312 - default = 0.0; 313 - } 314 - 315 - let bool : bool t = 316 - Varint 317 - { 318 - sort = Bool; 319 - dec = (fun x -> not (Int64.equal x 0L)); 320 - enc = (fun b -> if b then 1L else 0L); 321 - default = false; 322 - } 323 - 324 - let string : string t = 325 - Length_delim 326 - { sort = String; dec = (fun x -> x); enc = (fun x -> x); default = "" } 327 - 328 - let bytes : string t = 329 - Length_delim 330 - { sort = Bytes; dec = (fun x -> x); enc = (fun x -> x); default = "" } 331 - 332 - (* -- Recursive codecs -- 333 - 334 - Protobuf schemas can be self-referential (a tree node containing a 335 - list of child nodes of the same type). [fix] lets callers build a 336 - codec that references itself: [f] is invoked with a forwarding 337 - placeholder; any self-references in [f]'s body resolve to the 338 - final codec at decode/encode time via [Lazy]. *) 339 - 340 - let fix : type a. default:a -> (a t -> a t) -> a t = 341 - fun ~default f -> 342 - let rec lazy_codec = lazy (f (Rec lazy_codec)) in 343 - let _ = default in 344 - (* [default] is reserved for a future extension: currently all 345 - recursive codecs collapse to Message at runtime, whose 346 - [msg_default] carries the default. Keep the parameter in the API 347 - so callers don't need to change when we thread it. *) 348 - Rec lazy_codec 349 - 350 - (* -- Message combinators -- 351 - 352 - The [(o, a) field] GADT captures a sequence of field declarations 353 - and the continuation that builds the record value. Encoding walks 354 - the GADT in declaration order and emits (tag, value) per field. 355 - Decoding pre-parses the wire into a tag -> wire_value list table 356 - and then walks the same GADT, looking each field up in the table. *) 357 - 358 - module Message = struct 359 - type (_, _) field = 360 - | Return : 'a -> ('o, 'a) field 361 - | Required : { 362 - tag : int; 363 - get : 'o -> 'x; 364 - codec : 'x t; 365 - cont : 'x -> ('o, 'a) field; 366 - } 367 - -> ('o, 'a) field 368 - | Optional : { 369 - tag : int; 370 - get : 'o -> 'x option; 371 - codec : 'x t; 372 - cont : 'x option -> ('o, 'a) field; 373 - } 374 - -> ('o, 'a) field 375 - | Repeated : { 376 - tag : int; 377 - get : 'o -> 'x list; 378 - codec : 'x t; 379 - packed : bool; 380 - cont : 'x list -> ('o, 'a) field; 381 - } 382 - -> ('o, 'a) field 383 - | Oneof : { 384 - get : 'o -> 'x; 385 - default : 'x; 386 - cases : 'x case list; 387 - cont : 'x -> ('o, 'a) field; 388 - } 389 - -> ('o, 'a) field 390 - 391 - and 'a case = 392 - | Case : { 393 - tag : int; 394 - codec : 'x t; 395 - inject : 'x -> 'a; 396 - extract : 'a -> 'x option; 397 - } 398 - -> 'a case 399 - 400 - let return v = Return v 401 - 402 - let required tag get codec = 403 - Required { tag; get; codec; cont = (fun x -> Return x) } 404 - 405 - let optional tag get codec = 406 - Optional { tag; get; codec; cont = (fun x -> Return x) } 407 - 408 - let repeated tag get codec = 409 - Repeated { tag; get; codec; packed = false; cont = (fun x -> Return x) } 410 - 411 - let packed tag get codec = 412 - Repeated { tag; get; codec; packed = true; cont = (fun x -> Return x) } 413 - 414 - let case tag codec ~inject ~extract = Case { tag; codec; inject; extract } 415 - 416 - let oneof ~default get cases = 417 - Oneof { get; default; cases; cont = (fun x -> Return x) } 418 - 419 - let rec ( let* ) : type o a b. 420 - (o, a) field -> (a -> (o, b) field) -> (o, b) field = 421 - fun m f -> 422 - match m with 423 - | Return a -> f a 424 - | Required r -> 425 - Required 426 - { 427 - r with 428 - cont = 429 - (fun x -> 430 - let* y = r.cont x in 431 - f y); 432 - } 433 - | Optional r -> 434 - Optional 435 - { 436 - r with 437 - cont = 438 - (fun x -> 439 - let* y = r.cont x in 440 - f y); 441 - } 442 - | Repeated r -> 443 - Repeated 444 - { 445 - r with 446 - cont = 447 - (fun x -> 448 - let* y = r.cont x in 449 - f y); 450 - } 451 - | Oneof r -> 452 - Oneof 453 - { 454 - r with 455 - cont = 456 - (fun x -> 457 - let* y = r.cont x in 458 - f y); 459 - } 3 + module Codec = Codec 460 4 461 - (* -- Encoding -- *) 5 + type 'a t = 'a Codec.t 462 6 463 - let write_field buf ~tag codec v = 464 - Wire.write_tag buf ~field_number:tag ~wire_type:(wire_type_of codec); 465 - write_value buf codec v 7 + let pp = Codec.pp 466 8 467 - let write_packed buf ~tag codec vs = 468 - (* Concatenate raw value bytes into a scratch buffer, then emit as a 469 - single length-delimited blob. *) 470 - let body = Buffer.create 64 in 471 - let rec emit_list = function 472 - | [] -> () 473 - | v :: rest -> 474 - (match codec with 475 - | Varint b -> Wire.write_int64 body (b.enc v) 476 - | Fixed32 b -> Wire.write_fixed32 body (b.enc v) 477 - | Fixed64 b -> Wire.write_fixed64 body (b.enc v) 478 - | Length_delim _ | Message _ -> 479 - raise 480 - (Wire.Decode_error 481 - "length-delimited codec cannot be used inside a packed field") 482 - | Rec _ -> 483 - raise 484 - (Wire.Decode_error 485 - "recursive codec cannot be used inside a packed field")); 486 - emit_list rest 487 - in 488 - emit_list vs; 489 - Wire.write_tag buf ~field_number:tag ~wire_type:Length_delimited; 490 - Leb128.add_u63_to_buffer buf (Buffer.length body); 491 - Buffer.add_buffer buf body 9 + (* Scalar re-exports for ergonomics. Users can also reach them via 10 + [Protobuf.Codec.int32] etc. *) 11 + let int32 = Codec.int32 12 + let int64 = Codec.int64 13 + let uint32 = Codec.uint32 14 + let uint64 = Codec.uint64 15 + let sint32 = Codec.sint32 16 + let sint64 = Codec.sint64 17 + let fixed32 = Codec.fixed32 18 + let fixed64 = Codec.fixed64 19 + let sfixed32 = Codec.sfixed32 20 + let sfixed64 = Codec.sfixed64 21 + let float = Codec.float 22 + let double = Codec.double 23 + let bool = Codec.bool 24 + let string = Codec.string 25 + let bytes = Codec.bytes 492 26 493 - let rec encode_fields : type o a. Buffer.t -> o -> (o, a) field -> unit = 494 - fun buf o m -> 495 - match m with 496 - | Return _ -> () 497 - | Required { tag; get; codec; cont } -> 498 - let v = get o in 499 - (* proto3 semantics: omit a required scalar field that equals 500 - the codec's default. *) 501 - if v <> default_of codec then write_field buf ~tag codec v; 502 - encode_fields buf o (cont v) 503 - | Optional { tag; get; codec; cont } -> 504 - let v_opt = get o in 505 - (match v_opt with Some v -> write_field buf ~tag codec v | None -> ()); 506 - encode_fields buf o (cont v_opt) 507 - | Repeated { tag; get; codec; packed; cont } -> 508 - let vs = get o in 509 - (match vs with 510 - | [] -> () 511 - | _ when packed -> write_packed buf ~tag codec vs 512 - | _ -> List.iter (write_field buf ~tag codec) vs); 513 - encode_fields buf o (cont vs) 514 - | Oneof { get; cases; cont; _ } -> 515 - let v = get o in 516 - let rec emit_case = function 517 - | [] -> () 518 - | Case { tag; codec; extract; _ } :: rest -> ( 519 - match extract v with 520 - | Some x -> write_field buf ~tag codec x 521 - | None -> emit_case rest) 522 - in 523 - emit_case cases; 524 - encode_fields buf o (cont v) 27 + module Message = Codec.Message 525 28 526 - (* -- Decoding helpers -- *) 29 + let fix = Codec.fix 527 30 528 - (* Parse the wire into a tag -> [(seq, wire_value) list] table. The 529 - sequence counter captures global wire order across tags, so 530 - [oneof] can determine which of its alternative cases came last. 531 - Within each bucket the list is stored in reverse wire order 532 - (prepend on insert), so [List.hd] is the last-added entry. *) 533 - let parse_wire s start end_ : (int, (int * wire_value) list ref) Hashtbl.t = 534 - let table = Hashtbl.create 8 in 535 - let seq = ref 0 in 536 - let push tag v = 537 - let entry = (!seq, v) in 538 - incr seq; 539 - match Hashtbl.find_opt table tag with 540 - | Some r -> r := entry :: !r 541 - | None -> Hashtbl.add table tag (ref [ entry ]) 542 - in 543 - let pos = ref start in 544 - while !pos < end_ do 545 - let field_number, wt, off = Wire.read_tag s !pos in 546 - pos := off; 547 - match wt with 548 - | Wire.Varint -> 549 - let v, off = Wire.read_int64 s !pos in 550 - push field_number (WV_varint v); 551 - pos := off 552 - | Wire.Fixed32 -> 553 - let v, off = Wire.read_fixed32 s !pos in 554 - push field_number (WV_fixed32 v); 555 - pos := off 556 - | Wire.Fixed64 -> 557 - let v, off = Wire.read_fixed64 s !pos in 558 - push field_number (WV_fixed64 v); 559 - pos := off 560 - | Wire.Length_delimited -> 561 - let v, off = Wire.read_bytes s !pos in 562 - push field_number (WV_length_delim v); 563 - pos := off 564 - done; 565 - if !pos <> end_ then 566 - raise 567 - (Wire.Decode_error 568 - (Fmt.str "overran message boundary: at %d, expected end %d" !pos end_)); 569 - table 31 + (* -- Reading and writing -- *) 570 32 571 - (* [take_*] consume the matched entries out of the table so that after 572 - [decode_fields] returns, the remaining entries are exactly the 573 - unknown fields — tags that weren't claimed by the schema. *) 574 - 575 - let take_last table tag = 576 - match Hashtbl.find_opt table tag with 577 - | None -> None 578 - | Some r -> ( 579 - (* The list is in reverse wire order (last-added first). *) 580 - match !r with 581 - | [] -> None 582 - | (_, v) :: _ -> 583 - Hashtbl.remove table tag; 584 - Some v) 585 - 586 - let take_all table tag = 587 - match Hashtbl.find_opt table tag with 588 - | None -> [] 589 - | Some r -> 590 - Hashtbl.remove table tag; 591 - List.rev_map snd !r 592 - 593 - (* For oneof: find the case whose tag has the highest wire-sequence 594 - number. Returns [Some (case, wire_value)] if any case was on the 595 - wire, [None] otherwise. Removes the consumed case's entry from 596 - the table. *) 597 - let take_oneof_last : type a. 598 - (int, (int * wire_value) list ref) Hashtbl.t -> 599 - a case list -> 600 - (a case * wire_value) option = 601 - fun table cases -> 602 - let best = ref None in 603 - List.iter 604 - (fun (Case { tag; _ } as c) -> 605 - match Hashtbl.find_opt table tag with 606 - | None -> () 607 - | Some r -> ( 608 - match !r with 609 - | [] -> () 610 - | (seq, v) :: _ -> ( 611 - match !best with 612 - | None -> best := Some (seq, c, v) 613 - | Some (best_seq, _, _) when seq > best_seq -> 614 - best := Some (seq, c, v) 615 - | Some _ -> ()))) 616 - cases; 617 - (* Consume every case's tag from the table so they don't leak to 618 - unknowns. *) 619 - List.iter (fun (Case { tag; _ }) -> Hashtbl.remove table tag) cases; 620 - match !best with None -> None | Some (_, c, v) -> Some (c, v) 621 - 622 - let write_unknown_field buf tag = function 623 - | WV_varint v -> 624 - Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Varint; 625 - Wire.write_int64 buf v 626 - | WV_fixed32 v -> 627 - Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Fixed32; 628 - Wire.write_fixed32 buf v 629 - | WV_fixed64 v -> 630 - Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Fixed64; 631 - Wire.write_fixed64 buf v 632 - | WV_length_delim s -> 633 - Wire.write_tag buf ~field_number:tag ~wire_type:Wire.Length_delimited; 634 - Wire.write_string buf s 635 - 636 - let collect_unknowns table = 637 - let buf = Buffer.create 16 in 638 - let tags = Hashtbl.fold (fun k _ acc -> k :: acc) table [] in 639 - (* Sort by tag for deterministic re-emission. *) 640 - let tags = List.sort compare tags in 641 - List.iter 642 - (fun tag -> 643 - let rvals = Hashtbl.find table tag in 644 - List.iter 645 - (fun (_, wv) -> write_unknown_field buf tag wv) 646 - (List.rev !rvals)) 647 - tags; 648 - Buffer.contents buf 649 - 650 - let decode_packed_or_repeated : type a. a t -> wire_value list -> a list = 651 - fun codec values -> 652 - (* For a repeated field, each element in [values] can be either a 653 - scalar wire value (non-packed) or a length-delimited blob 654 - containing the concatenation (packed). The protobuf spec 655 - requires decoders to accept both forms on the same field for 656 - compatibility. *) 657 - let acc = ref [] in 658 - List.iter 659 - (fun w -> 660 - match (w, wire_type_of codec) with 661 - | WV_length_delim body, (Wire.Varint | Wire.Fixed32 | Wire.Fixed64) -> 662 - (* Packed form: parse body as a sequence of raw values. *) 663 - let pos = ref 0 in 664 - let len = String.length body in 665 - while !pos < len do 666 - let v, off = decode_bytes codec body !pos in 667 - acc := v :: !acc; 668 - pos := off 669 - done 670 - | _ -> acc := decode_value codec w :: !acc) 671 - values; 672 - List.rev !acc 673 - 674 - let rec decode_fields : type o a. 675 - (int, (int * wire_value) list ref) Hashtbl.t -> (o, a) field -> a = 676 - fun table m -> 677 - match m with 678 - | Return a -> a 679 - | Required { tag; codec; cont; _ } -> 680 - let v = 681 - match take_last table tag with 682 - | Some w -> decode_value codec w 683 - | None -> default_of codec 684 - in 685 - decode_fields table (cont v) 686 - | Optional { tag; codec; cont; _ } -> 687 - let v = 688 - match take_last table tag with 689 - | Some w -> Some (decode_value codec w) 690 - | None -> None 691 - in 692 - decode_fields table (cont v) 693 - | Repeated { tag; codec; cont; _ } -> 694 - let vs = decode_packed_or_repeated codec (take_all table tag) in 695 - decode_fields table (cont vs) 696 - | Oneof { default; cases; cont; _ } -> 697 - let v = 698 - match take_oneof_last table cases with 699 - | None -> default 700 - | Some (Case { codec; inject; _ }, w) -> inject (decode_value codec w) 701 - in 702 - decode_fields table (cont v) 703 - 704 - (* A [map<K, V>] field is sugar for [repeated Entry { K key = 1; V value 705 - = 2 }] on the wire — each entry is a length-delimited submessage with 706 - two required fields. [map_entry_codec] builds the entry codec; [map] 707 - declares a field that collects a list of [(k, v)] pairs. *) 708 - let map_entry_codec : type k v. k t -> v t -> (k * v) t = 709 - fun key_codec value_codec -> 710 - (* Constructed directly (not via [finish]) because the entry is an 711 - ephemeral tuple rather than a named record — skips the 712 - let* / return chain. *) 713 - let encode_body buf (k, v) = 714 - if k <> default_of key_codec then begin 715 - Wire.write_tag buf ~field_number:1 ~wire_type:(wire_type_of key_codec); 716 - write_value buf key_codec k 717 - end; 718 - if v <> default_of value_codec then begin 719 - Wire.write_tag buf ~field_number:2 ~wire_type:(wire_type_of value_codec); 720 - write_value buf value_codec v 721 - end 722 - in 723 - let decode_body s start end_ = 724 - let table = parse_wire s start end_ in 725 - let k = 726 - match take_last table 1 with 727 - | Some w -> decode_value key_codec w 728 - | None -> default_of key_codec 729 - in 730 - let v = 731 - match take_last table 2 with 732 - | Some w -> decode_value value_codec w 733 - | None -> default_of value_codec 734 - in 735 - (k, v) 736 - in 737 - let decode_body_with_unknowns s start end_ = 738 - (* Map entries don't carry unknown fields in the protobuf spec; 739 - any extra tags are silently dropped. *) 740 - (decode_body s start end_, "") 741 - in 742 - let msg_default = (default_of key_codec, default_of value_codec) in 743 - Message { encode_body; decode_body; decode_body_with_unknowns; msg_default } 744 - 745 - let map tag get key_codec value_codec = 746 - Repeated 747 - { 748 - tag; 749 - get; 750 - codec = map_entry_codec key_codec value_codec; 751 - packed = false; 752 - cont = (fun x -> Return x); 753 - } 754 - 755 - let finish : type o. (o, o) field -> o t = 756 - fun spec -> 757 - let encode_body buf o = encode_fields buf o spec in 758 - let decode_body s start end_ = 759 - let table = parse_wire s start end_ in 760 - decode_fields table spec 761 - in 762 - let decode_body_with_unknowns s start end_ = 763 - let table = parse_wire s start end_ in 764 - let value = decode_fields table spec in 765 - (value, collect_unknowns table) 766 - in 767 - (* A message with no fields populated: all scalars take their 768 - default, repeated fields are empty, optionals are [None]. *) 769 - let msg_default = decode_body "" 0 0 in 770 - Message { encode_body; decode_body; decode_body_with_unknowns; msg_default } 771 - end 772 - 773 - (* -- Top-level encode / decode. 774 - 775 - Messages and length-delimited scalars at top level write just the 776 - body (no outer tag or length prefix). Other scalars write their raw 777 - value bytes — useful for low-level round-trip tests. *) 778 - 779 - let encode_string : type a. a t -> a -> string = 780 - fun codec v -> 781 - let buf = Buffer.create 64 in 782 - (match codec with 783 - | Message m -> m.encode_body buf v 784 - | Length_delim b -> Buffer.add_string buf (b.enc v) 785 - | Varint _ | Fixed32 _ | Fixed64 _ -> write_value buf codec v 786 - | Rec c -> ( 787 - match Lazy.force c with 788 - | Message m -> m.encode_body buf v 789 - | Length_delim b -> Buffer.add_string buf (b.enc v) 790 - | other -> write_value buf other v)); 791 - Buffer.contents buf 792 - 793 - let decode_string : type a. a t -> string -> (a, Error.t) result = 794 - fun codec s -> 795 - depth := 0; 796 - try 797 - match codec with 798 - | Message m -> Ok (m.decode_body s 0 (String.length s)) 799 - | Length_delim b -> Ok (b.dec s) 800 - | Varint _ | Fixed32 _ | Fixed64 _ -> 801 - let v, off = decode_bytes codec s 0 in 802 - if off <> String.length s then 803 - Error 804 - (Error.of_wire_error 805 - (Fmt.str "trailing %d bytes after scalar" 806 - (String.length s - off))) 807 - else Ok v 808 - | Rec c -> ( 809 - match Lazy.force c with 810 - | Message m -> Ok (m.decode_body s 0 (String.length s)) 811 - | Length_delim b -> Ok (b.dec s) 812 - | other -> 813 - let v, off = decode_bytes other s 0 in 814 - if off <> String.length s then 815 - Error 816 - (Error.of_wire_error 817 - (Fmt.str "trailing %d bytes after scalar" 818 - (String.length s - off))) 819 - else Ok v) 820 - with Wire.Decode_error msg -> Error (Error.of_wire_error msg) 821 - 822 - let encode codec w v = 823 - let s = encode_string codec v in 824 - Bytesrw.Bytes.Writer.write_string w s 825 - 826 - let decode codec r = 827 - let s = Bytesrw.Bytes.Reader.to_string r in 828 - decode_string codec s 829 - 830 - (* -- [of_*] / [to_*] entry points -- *) 831 - 832 - let of_string = decode_string 833 - let to_string = encode_string 33 + let of_string = Codec.decode_string 34 + let to_string = Codec.encode_string 834 35 835 36 let of_string_exn codec s = 836 - match decode_string codec s with Ok v -> v | Error e -> raise (Loc.Error e) 37 + match of_string codec s with Ok v -> v | Error e -> raise (Loc.Error e) 837 38 838 - let of_reader = decode 39 + let of_reader = Codec.decode 839 40 840 41 let of_reader_exn codec r = 841 - match decode codec r with Ok v -> v | Error e -> raise (Loc.Error e) 842 - 843 - let to_writer = encode 844 - 845 - (* -- Unknown-field preservation -- 42 + match of_reader codec r with Ok v -> v | Error e -> raise (Loc.Error e) 846 43 847 - Protobuf's wire format guarantees: a decoder ignorant of a field's 848 - tag MUST skip it; a decoder aware of the tag MUST read it. For 849 - forward-compatible pipelines that decode, mutate, and re-emit a 850 - message, the unknown fields should survive the round-trip. 851 - 852 - [decode_with_unknowns_string] returns both the decoded message and 853 - a byte string containing the wire bytes of every tag the schema 854 - didn't claim. [encode_with_unknowns_string] appends those bytes 855 - back when re-emitting. 856 - 857 - Caveats: 858 - - Byte equality is not preserved. Unknowns are re-serialized in 859 - canonical form (varint-minimal encoding) and sorted by tag. 860 - - Only works for Message codecs. Calling on a bare scalar returns 861 - [Error]. *) 862 - 863 - let decode_with_unknowns_string : type a. 864 - a t -> string -> (a * string, Error.t) result = 865 - fun codec s -> 866 - depth := 0; 867 - try 868 - match codec with 869 - | Message m -> Ok (m.decode_body_with_unknowns s 0 (String.length s)) 870 - | Rec c -> ( 871 - match Lazy.force c with 872 - | Message m -> Ok (m.decode_body_with_unknowns s 0 (String.length s)) 873 - | _ -> 874 - Error 875 - (Error.of_wire_error 876 - "decode_with_unknowns_string: codec is not a message")) 877 - | _ -> 878 - Error 879 - (Error.of_wire_error 880 - "decode_with_unknowns_string: codec is not a message") 881 - with Wire.Decode_error msg -> Error (Error.of_wire_error msg) 882 - 883 - let encode_with_unknowns_string : type a. a t -> unknowns:string -> a -> string 884 - = 885 - fun codec ~unknowns v -> 886 - let buf = Buffer.create 64 in 887 - (match codec with 888 - | Message m -> 889 - m.encode_body buf v; 890 - Buffer.add_string buf unknowns 891 - | Rec c -> ( 892 - match Lazy.force c with 893 - | Message m -> 894 - m.encode_body buf v; 895 - Buffer.add_string buf unknowns 896 - | _ -> invalid_arg "encode_with_unknowns_string: codec is not a message") 897 - | _ -> invalid_arg "encode_with_unknowns_string: codec is not a message"); 898 - Buffer.contents buf 44 + let to_writer = Codec.encode 45 + let decode_with_unknowns_string = Codec.decode_with_unknowns_string 46 + let encode_with_unknowns_string = Codec.encode_with_unknowns_string
-3
lib/protobuf.mli
··· 217 217 (** [to_writer c w v] encodes [v] and writes it to [w] as a single slice. Useful 218 218 for composition with other bytesrw pipelines. *) 219 219 220 - val decode : 'a t -> Bytesrw.Bytes.Reader.t -> ('a, Error.t) result 221 - (** Alias for {!of_reader}. *) 222 - 223 220 (** {1 Unknown field preservation} 224 221 225 222 Standard decoders drop fields whose tag is not in the schema. These variants