Persistent store with Git semantics: lazy reads, delayed writes, content-addressing

irmin/schema: add link/inline child constructors

The child variant [`Link of hash | `Inline of block] already expresses
the per-child external-vs-internal choice, but callers were writing
the backtick polymorphic-variant tags by hand. Add two thin wrappers
so the intent reads as code:

S.inline bytes instead of `Inline bytes
S.link hash instead of `Link hash

Update irmin_git.ml to use them in tree_parse / entry_parse.

Also expand the child-type docstring and document the link/inline
distinction in README.md (what it means, when to pick which, why the
schema stays pure and heap writes happen at flush).

No dynamic size-based picker yet; that belongs in a separate chunker
layer above the schema so dec/enc stay side-effect-free.
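
As a rough illustration of what such a chunker layer might look like (it is not part of this commit), the sketch below picks between the two constructors by size. The names `pick`, `threshold`, `store` and `toy_store` are hypothetical, and `hash` and `block` are plain strings only for the sake of the example.

```ocaml
(* Stand-ins for H.hash and H.block; in lib/schema.ml these come from the
   functor argument. Plain strings are an assumption made for this sketch. *)
type hash = string
type block = string
type child = [ `Link of hash | `Inline of block ]

let link h : child = `Link h
let inline b : child = `Inline b

(* Hypothetical picker: blocks of at most [threshold] bytes stay inline,
   larger ones are passed to [store] and referenced by the returned hash.
   Any write happens inside [store], never inside dec/enc. *)
let pick ~threshold ~(store : block -> hash) (b : block) : child =
  if String.length b <= threshold then inline b else link (store b)

(* Toy store: remembers blocks in a list and names them by MD5 hex digest. *)
let stored : (hash * block) list ref = ref []

let toy_store (b : block) : hash =
  let h = Digest.to_hex (Digest.string b) in
  stored := (h, b) :: !stored;
  h
```

Keeping the side effect in `store` means the schema's `dec`/`enc` can remain pure functions over `children`, which is the separation the commit message argues for.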

+45 -4
+24
README.md
···
   | None -> None
 ```

+### Links vs inlines
+
+A child of a node is either *linked* (a separate content-addressed
+block, referenced by hash, deduplicated across the DAG) or *inlined*
+(bytes stored inside the parent block, not content-addressed on their
+own). The schema's `dec`/`enc` decides per child; the cursor walks
+both transparently.
+
+```ocaml
+(* A Git tree entry: the permission bits live inline in the
+   parent, the target blob lives as a separate Link. *)
+let entry_parse : S.dec = fun data ->
+  let entry = parse_entry data in
+  S.Named
+    [ ("mode", S.inline (perm_to_string entry.perm));
+      ("target", S.link (irmin_hash entry.hash)) ]
+```
+
+Rule of thumb: anything you want to share across blocks (deduplicated,
+independently fetchable, reachable from proofs) should be a `link`.
+Anything you always want to materialise with the parent (a small flag,
+a permission, a short tag) should be `inline`. The schema stays pure
+either way — heap writes happen at `flush`, not inside `enc`.
+
 ### Merge with typed strategies

 ```ocaml
+4 -3
lib/git/irmin_git.ml
···
       S.Named
         (Git.Tree.to_list tree
         |> List.map (fun (entry : Git.Tree.entry) ->
-               (entry.name, `Inline (Git.Tree.to_string (Git.Tree.v [ entry ])))))
+               (entry.name, S.inline (Git.Tree.to_string (Git.Tree.v [ entry ]))))
+        )
   | Error _ -> S.Named []

 let entry_parse : S.dec =
···
       | [ entry ] ->
           S.Named
             [
-              ("mode", `Inline (Git.Tree.perm_to_string entry.perm));
-              ("target", `Link (irmin_hash entry.hash));
+              ("mode", S.inline (Git.Tree.perm_to_string entry.perm));
+              ("target", S.link (irmin_hash entry.hash));
             ]
       | _ -> S.Named [])
   | Error _ -> S.Named []
+4
lib/schema.ml
···
 end) =
 struct
   type child = [ `Link of H.hash | `Inline of H.block ]
+
+  let link h : child = `Link h
+  let inline b : child = `Inline b
+
   type children = Named of (string * child) list | Indexed of child array
   type dec = H.block -> children
   type enc = children -> H.block
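
To see the wrappers in isolation, here is a minimal self-contained sketch modelled on the hunk above. The functor name `Make`, the reduced signature of `H`, the string-based instance and the `describe` helper are all assumptions made for illustration; the real module carries more than is shown here.

```ocaml
(* Cut-down version of the schema functor: only the pieces this commit
   touches. The name [Make] and the two-type signature are assumptions. *)
module Make (H : sig
  type hash
  type block
end) =
struct
  type child = [ `Link of H.hash | `Inline of H.block ]

  (* The [: child] annotation pins the result to the closed variant type,
     so callers get a [child] rather than an open [> `Link of _ ] type. *)
  let link h : child = `Link h
  let inline b : child = `Inline b

  type children = Named of (string * child) list | Indexed of child array
end

(* Toy instance: hashes and blocks are both plain strings. *)
module S = Make (struct
  type hash = string
  type block = string
end)

(* Building children with the wrappers instead of raw backtick tags. *)
let example : S.children =
  S.Named [ ("mode", S.inline "100644"); ("target", S.link "deadbeef") ]

(* Consumers still pattern-match on the underlying variant as before. *)
let describe : S.child -> string = function
  | `Link h -> "link:" ^ h
  | `Inline b -> "inline:" ^ b
```

Because the wrappers return the same polymorphic variant, existing code that matches on `` `Link `` and `` `Inline `` should keep working unchanged.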
+13 -1
lib/schema.mli
···
       an inline blob or a link to another block. *)

   type child = [ `Link of H.hash | `Inline of H.block ]
-  (** A child entry: a hash link to another block, or an inline value. *)
+  (** A child entry: a hash link to another block (external, hashed separately,
+      deduplicated across the DAG), or an inline value (bytes stored inside the
+      parent block, not content-addressed on their own).
+
+      The decoder chooses per child whether to tag it as a link or inline; the
+      same structural position can be either, and the cursor navigates both
+      transparently ({!val-step}, {!val-get}). *)
+
+  val link : H.hash -> child
+  (** [link h] is [`Link h]. *)
+
+  val inline : H.block -> child
+  (** [inline b] is [`Inline b]. *)

   type children =
     | Named of (string * child) list