# sqlite Minimal SQLite key-value store for OCaml. ## Overview A pure OCaml key-value store that reads and writes the SQLite file format, backed by a B-tree implementation. Supports: - Namespaced tables for organizing data - WAL mode for concurrent access - Efficient batch operations - Eio-compatible synchronous API ## Installation Install with opam: ```sh $ opam install sqlite ``` If opam cannot find the package, it may not yet be released in the public `opam-repository`. Add the overlay repository, then install it: ```sh $ opam repo add samoht https://tangled.org/gazagnaire.org/opam-overlay.git $ opam update $ opam install sqlite ``` ## Usage ```ocaml let run ~fs = Eio.Switch.run @@ fun sw -> (* Open or create a database. *) let db = Sqlite.open_ ~sw Eio.Path.(fs / "data.db") in (* Basic key-value operations. *) Sqlite.put db "key1" "value1"; assert (Sqlite.find db "key1" = Some "value1"); (* Namespaced tables. *) let blocks = Sqlite.Table.create db ~name:"blocks" in Sqlite.Table.put blocks "cid1" "data1"; (* Sync to disk. *) Sqlite.sync db; (* Close when done. *) Sqlite.close db ``` ## API ### Database - `Sqlite.open_ ~sw ?create path` - Open or create a SQLite database at path - `Sqlite.find db key` - Get value for key, or None - `Sqlite.put db key value` - Store value at key - `Sqlite.delete db key` - Remove key - `Sqlite.mem db key` - Check if key exists - `Sqlite.iter db ~f` - Iterate over all entries - `Sqlite.fold db ~init ~f` - Fold over all entries - `Sqlite.sync db` - Flush to disk (WAL checkpoint) - `Sqlite.close db` - Close the database ### Namespaced Tables - `Sqlite.Table.create db ~name` - Create or open a named table - `Sqlite.Table.get`, `put`, `delete`, `mem`, `iter` - Same as database operations ## Related Work - [sqlite3-ocaml](https://github.com/mmottl/sqlite3-ocaml) - Low-level SQLite3 C bindings - [ezsqlite](https://opam.ocaml.org/packages/ezsqlite/) - Alternative SQLite bindings with extensions - [irmin](https://github.com/mirage/irmin) - Git-like distributed database (different use case) ## Pure OCaml Implementation This implementation is written in pure OCaml using a B-tree backend. This enables: - Unikernel deployment (MirageOS, Solo5) - Browser targets via js_of_ocaml - Full control over I/O with bytesrw streaming - Better debugging and error handling ### Research: Limbo (Rust) [Limbo](https://github.com/tursodatabase/limbo) is a Rust implementation of SQLite, providing a clean reference for pure-language SQLite implementations. Key design decisions from Limbo: - **Async-first**: Built on Rust async/await (we'd use Eio) - **Modular pager**: Separates page cache from storage backend - **Incremental parsing**: Streams large records without full buffering - **WAL-focused**: Prioritizes WAL mode over legacy rollback journal ### SQLite File Format The [SQLite file format](https://www.sqlite.org/fileformat2.html) is well-documented: **Database structure:** ``` ┌──────────────────────────────────────┐ │ Database Header (100 bytes) │ ← Page 1 (first 100 bytes) ├──────────────────────────────────────┤ │ Schema Table (sqlite_master B-tree) │ ← Page 1 (remaining) ├──────────────────────────────────────┤ │ User Tables & Indexes (B-trees) │ ← Pages 2..N ├──────────────────────────────────────┤ │ Freelist (unused pages) │ └──────────────────────────────────────┘ ``` **B-tree pages:** - Interior pages: keys + child page pointers (routing) - Leaf pages: keys + record data (storage) - Overflow pages: continuation for large records **Record format:** - Header: serial types for each column (varint-encoded) - Body: column values in declared order ### Implementation Approach **Phase 1: Read-only access** 1. Parse database header (page size, encoding, version) 2. Read B-tree pages (interior and leaf) 3. Traverse B-trees to find records 4. Decode record format (serial types → OCaml values) **Phase 2: Write support** 1. B-tree insertion with page splits 2. Freelist management 3. WAL mode implementation 4. Checkpointing **Phase 3: Eio integration** 1. bytesrw-based page I/O 2. Async file operations with Eio.File 3. LRU page cache with configurable size ### References - [SQLite File Format](https://www.sqlite.org/fileformat2.html) - Official specification - [Limbo](https://github.com/tursodatabase/limbo) - Rust implementation (primary inspiration) - [SQLite Database System](https://www.amazon.com/SQLite-Database-System-Design-Implementation/dp/1453861866) - Sibsankar Haldar's design book - [Architecture of SQLite](https://www.sqlite.org/arch.html) - Official architecture docs - [SQLite Source Code](https://sqlite.org/src/doc/trunk/README.md) - C reference implementation ## Licence MIT License. See [LICENSE.md](LICENSE.md) for details.