Minimal SQLite key-value store for OCaml
0
fork

Configure Feed

Select the types of activity you want to include in your feed.

at main 151 lines 5.3 kB view raw view rendered
1# sqlite 2 3Minimal SQLite key-value store for OCaml. 4 5## Overview 6 7A pure OCaml key-value store that reads and writes the SQLite file format, 8backed by a B-tree implementation. Supports: 9- Namespaced tables for organizing data 10- WAL mode for concurrent access 11- Efficient batch operations 12- Eio-compatible synchronous API 13 14## Installation 15 16Install with opam: 17 18<!-- $MDX skip --> 19```sh 20$ opam install sqlite 21``` 22 23If opam cannot find the package, it may not yet be released in the public 24`opam-repository`. Add the overlay repository, then install it: 25 26<!-- $MDX skip --> 27```sh 28$ opam repo add samoht https://tangled.org/gazagnaire.org/opam-overlay.git 29$ opam update 30$ opam install sqlite 31``` 32 33## Usage 34 35```ocaml 36let run ~fs = 37 Eio.Switch.run @@ fun sw -> 38 (* Open or create a database. *) 39 let db = Sqlite.open_ ~sw Eio.Path.(fs / "data.db") in 40 (* Basic key-value operations. *) 41 Sqlite.put db "key1" "value1"; 42 assert (Sqlite.find db "key1" = Some "value1"); 43 (* Namespaced tables. *) 44 let blocks = Sqlite.Table.create db ~name:"blocks" in 45 Sqlite.Table.put blocks "cid1" "data1"; 46 (* Sync to disk. *) 47 Sqlite.sync db; 48 (* Close when done. *) 49 Sqlite.close db 50``` 51 52## API 53 54### Database 55 56- `Sqlite.open_ ~sw ?create path` - Open or create a SQLite database at path 57- `Sqlite.find db key` - Get value for key, or None 58- `Sqlite.put db key value` - Store value at key 59- `Sqlite.delete db key` - Remove key 60- `Sqlite.mem db key` - Check if key exists 61- `Sqlite.iter db ~f` - Iterate over all entries 62- `Sqlite.fold db ~init ~f` - Fold over all entries 63- `Sqlite.sync db` - Flush to disk (WAL checkpoint) 64- `Sqlite.close db` - Close the database 65 66### Namespaced Tables 67 68- `Sqlite.Table.create db ~name` - Create or open a named table 69- `Sqlite.Table.get`, `put`, `delete`, `mem`, `iter` - Same as database operations 70 71## Related Work 72 73- [sqlite3-ocaml](https://github.com/mmottl/sqlite3-ocaml) - Low-level SQLite3 C bindings 74- [ezsqlite](https://opam.ocaml.org/packages/ezsqlite/) - Alternative SQLite bindings with extensions 75- [irmin](https://github.com/mirage/irmin) - Git-like distributed database (different use case) 76 77## Pure OCaml Implementation 78 79This implementation is written in pure OCaml using a B-tree backend. This enables: 80- Unikernel deployment (MirageOS, Solo5) 81- Browser targets via js_of_ocaml 82- Full control over I/O with bytesrw streaming 83- Better debugging and error handling 84 85### Research: Limbo (Rust) 86 87[Limbo](https://github.com/tursodatabase/limbo) is a Rust implementation of SQLite, 88providing a clean reference for pure-language SQLite implementations. 89 90Key design decisions from Limbo: 91- **Async-first**: Built on Rust async/await (we'd use Eio) 92- **Modular pager**: Separates page cache from storage backend 93- **Incremental parsing**: Streams large records without full buffering 94- **WAL-focused**: Prioritizes WAL mode over legacy rollback journal 95 96### SQLite File Format 97 98The [SQLite file format](https://www.sqlite.org/fileformat2.html) is well-documented: 99 100**Database structure:** 101``` 102┌──────────────────────────────────────┐ 103│ Database Header (100 bytes) │ ← Page 1 (first 100 bytes) 104├──────────────────────────────────────┤ 105│ Schema Table (sqlite_master B-tree) │ ← Page 1 (remaining) 106├──────────────────────────────────────┤ 107│ User Tables & Indexes (B-trees) │ ← Pages 2..N 108├──────────────────────────────────────┤ 109│ Freelist (unused pages) │ 110└──────────────────────────────────────┘ 111``` 112 113**B-tree pages:** 114- Interior pages: keys + child page pointers (routing) 115- Leaf pages: keys + record data (storage) 116- Overflow pages: continuation for large records 117 118**Record format:** 119- Header: serial types for each column (varint-encoded) 120- Body: column values in declared order 121 122### Implementation Approach 123 124**Phase 1: Read-only access** 1251. Parse database header (page size, encoding, version) 1262. Read B-tree pages (interior and leaf) 1273. Traverse B-trees to find records 1284. Decode record format (serial types → OCaml values) 129 130**Phase 2: Write support** 1311. B-tree insertion with page splits 1322. Freelist management 1333. WAL mode implementation 1344. Checkpointing 135 136**Phase 3: Eio integration** 1371. bytesrw-based page I/O 1382. Async file operations with Eio.File 1393. LRU page cache with configurable size 140 141### References 142 143- [SQLite File Format](https://www.sqlite.org/fileformat2.html) - Official specification 144- [Limbo](https://github.com/tursodatabase/limbo) - Rust implementation (primary inspiration) 145- [SQLite Database System](https://www.amazon.com/SQLite-Database-System-Design-Implementation/dp/1453861866) - Sibsankar Haldar's design book 146- [Architecture of SQLite](https://www.sqlite.org/arch.html) - Official architecture docs 147- [SQLite Source Code](https://sqlite.org/src/doc/trunk/README.md) - C reference implementation 148 149## Licence 150 151MIT License. See [LICENSE.md](LICENSE.md) for details.