Pure OCaml B-tree implementation for persistent storage
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

OCaml 98.4%
Dune 0.5%
Other 1.1%
43 1 0

Clone this repository

https://tangled.org/gazagnaire.org/ocaml-btree https://tangled.org/did:plc:jhift2vwcxhou52p3sewcrpx/ocaml-btree
git@git.recoil.org:gazagnaire.org/ocaml-btree git@git.recoil.org:did:plc:jhift2vwcxhou52p3sewcrpx/ocaml-btree

For self-hosted knots, clone URLs may differ based on your setup.

Download tar.gz
README.md

btree#

Pure OCaml B-tree storage engine with SQLite-compatible file format.

Overview#

A persistent, page-based B-tree that produces valid SQLite database files. You get a fast embedded storage engine whose files can be inspected and debugged with standard sqlite3 tools.

Two B-tree variants following the SQLite file format specification:

  • Table B-tree (B+tree) -- data lives only in leaves, interior nodes hold rowid keys and child pointers. Optimised for sequential scans.
  • Index B-tree -- keys stored in both interior and leaf nodes. Optimised for point lookups.

Features#

  • Page sizes 512 to 65536 (powers of 2)
  • Overflow pages for payloads exceeding U - 35 bytes
  • File-backed or purely in-memory operation
  • Configurable page cache

Installation#

Install with opam:

$ opam install btree

If opam cannot find the package, it may not yet be released in the public opam-repository. Add the overlay repository, then install it:

$ opam repo add samoht https://tangled.org/gazagnaire.org/opam-overlay.git
$ opam update
$ opam install btree

Usage#

Table: insert, find, iterate#

let demo () =
  let pager = Btree.Pager.mem ~page_size:4096 () in
  let table = Btree.Table.v pager in
  Btree.Table.insert table ~rowid:1L "Hello";
  Btree.Table.insert table ~rowid:2L "World";
  (match Btree.Table.find table 1L with
   | Some v -> Fmt.pr "1 -> %s@." v
   | None -> ());
  Btree.Table.iter table (fun rowid data ->
      Printf.printf "%Ld: %s\n" rowid data)

Index: add, membership, range scan#

let index_demo () =
  let pager = Btree.Pager.mem ~page_size:4096 () in
  let index = Btree.Index.v pager in
  Btree.Index.insert index "key";
  Btree.Index.mem index "key"

File-backed#

Point the pager at an Eio file and call Pager.sync to persist. The result is a file that sqlite3 can open:

let persist ~fs =
  Eio.Path.with_open_out
    ~create:(`If_missing 0o644) Eio.Path.(fs / "demo.db")
    (fun file ->
      let pager = Btree.Pager.v ~page_size:4096 file in
      let table = Btree.Table.v pager in
      Btree.Table.insert table ~rowid:1L "persisted";
      Btree.Pager.sync pager)

Reopen by constructing a pager on the same file and calling Btree.Table.open_ pager ~root_page; the root page number comes from Btree.Table.save_root (or 1 for a fresh file).

API#

Table#

  • v pager / open_ pager ~root_page
  • insert t ~rowid data / find t rowid / delete t rowid
  • iter t f / fold t ~init ~f
  • save_root t / restore_root t root -- persist the root for reopen

Index#

  • v pager / open_ pager ~root_page
  • insert t key / mem t key
  • Iteration and range scans mirror Table

Pager#

  • Pager.v ~page_size file -- file-backed
  • Pager.mem ~page_size () -- in-memory, sync is a no-op
  • Pager.sync t -- flush dirty pages
  • Pager.snapshot t / rollback t snap -- capture and restore for aborting a logical transaction

Internals#

Modules#

Module Purpose
Pager Page cache and file I/O (file-backed or in-memory)
Table B+tree for rowid-keyed records (int64 -> string)
Index B-tree for string key sets
Page Page header parsing, binary helpers
Cell Cell encoding/decoding (table leaf, interior, index)
Record SQLite record format (serial types, column values)
Varint Variable-length integer encoding

Page header (8 bytes leaf, 12 bytes interior)#

Offset Size Description
0 1 Page type: 0x0d leaf table, 0x05 interior table, 0x0a leaf index, 0x02 interior index
1 2 First freeblock offset (0 if none)
3 2 Cell count
5 2 Cell content area start (0 = 65536)
7 1 Fragmented free bytes (max 60)
8 4 Right-most child pointer (interior pages only)

Overflow#

When a cell payload exceeds X = U - 35 bytes (where U is usable page size), excess data is stored in a chain of overflow pages. Each overflow page has a 4-byte next-page pointer followed by up to U - 4 bytes of data. The local payload size is computed per the SQLite spec:

M = ((U - 12) * 32 / 255) - 23
K = M + ((P - M) mod (U - 4))
local = if K <= X then K else M

Design choices#

The SQLite file format is an implementation choice, not a limitation. It brings free tooling (sqlite3 CLI, DB Browser, etc.) and a spec with 20+ years of battle-testing. The user-facing API is generic -- persistent ordered map (Table) and persistent ordered set (Index).

What the format does not give you (compared to LMDB/sanakirja):

Feature SQLite format LMDB / sanakirja
Concurrency In-place updates + WAL/journal Copy-on-write (lock-free readers)
Range scans Via parent traversal Leaf sibling pointers
Crash safety Rollback journal or WAL Atomic root pointer swap

These are deliberate tradeoffs -- compatibility and tooling vs. raw throughput on concurrent workloads.

  • SQLite file format -- the specification this library implements
  • ocaml-sqlite -- database layer built on top of this library (KV API, named tables, schema)
  • LMDB -- C B+tree with memory-mapped COW (different tradeoffs)
  • sanakirja -- Rust COW B-tree, used by Pijul
  • bbolt -- Go B+tree, used by etcd
  • Limbo -- Rust SQLite reimplementation

Licence#

MIT. See LICENSE.md for details.