Minimal SQLite key-value store for OCaml

Merge commit '8ea5d9d92d27cfb95024e1a69c9c514402ef3974'

+73
README.md
···
- [ezsqlite](https://opam.ocaml.org/packages/ezsqlite/) - Alternative SQLite bindings with extensions
- [irmin](https://github.com/mirage/irmin) - Git-like distributed database (different use case)

## Future: Pure OCaml Implementation

The current implementation uses C bindings via sqlite3-ocaml. A future pure OCaml
implementation would enable:
- Unikernel deployment (MirageOS, Solo5)
- Browser targets via js_of_ocaml
- Full control over I/O with bytesrw streaming
- Better debugging and error handling

### Research: Limbo (Rust)

[Limbo](https://github.com/tursodatabase/limbo) is a Rust implementation of SQLite,
providing a clean reference for pure-language SQLite implementations.

Key design decisions from Limbo:
- **Async-first**: Built on Rust async/await (we'd use Eio)
- **Modular pager**: Separates page cache from storage backend
- **Incremental parsing**: Streams large records without full buffering
- **WAL-focused**: Prioritizes WAL mode over legacy rollback journal

### SQLite File Format

The [SQLite file format](https://www.sqlite.org/fileformat2.html) is well-documented:

**Database structure:**
```
┌──────────────────────────────────────┐
│ Database Header (100 bytes)          │ ← Page 1 (first 100 bytes)
├──────────────────────────────────────┤
│ Schema Table (sqlite_master B-tree)  │ ← Page 1 (remaining)
├──────────────────────────────────────┤
│ User Tables & Indexes (B-trees)      │ ← Pages 2..N
├──────────────────────────────────────┤
│ Freelist (unused pages)              │
└──────────────────────────────────────┘
```

**B-tree pages:**
- Interior pages: keys + child page pointers (routing)
- Leaf pages: keys + record data (storage)
- Overflow pages: continuation for large records

**Record format:**
- Header: serial types for each column (varint-encoded)
- Body: column values in declared order

### Implementation Approach

**Phase 1: Read-only access**
1. Parse database header (page size, encoding, version)
2. Read B-tree pages (interior and leaf)
3. Traverse B-trees to find records
4. Decode record format (serial types → OCaml values)

**Phase 2: Write support**
1. B-tree insertion with page splits
2. Freelist management
3. WAL mode implementation
4. Checkpointing

**Phase 3: Eio integration**
1. bytesrw-based page I/O
2. Async file operations with Eio.File
3. LRU page cache with configurable size

### References

- [SQLite File Format](https://www.sqlite.org/fileformat2.html) - Official specification
- [Limbo](https://github.com/tursodatabase/limbo) - Rust implementation (primary inspiration)
- [SQLite Database System](https://www.amazon.com/SQLite-Database-System-Design-Implementation/dp/1453861866) - Sibsankar Haldar's design book
- [Architecture of SQLite](https://www.sqlite.org/arch.html) - Official architecture docs
- [SQLite Source Code](https://sqlite.org/src/doc/trunk/README.md) - C reference implementation

## License

MIT License. See [LICENSE.md](LICENSE.md) for details.
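
As a rough illustration of the Phase 1 steps listed under "Implementation Approach" (parsing the database header and reading the varints that encode serial types), here is a minimal sketch using only the OCaml standard library. The byte offsets follow the published SQLite file format; the names `header`, `parse_header`, and `read_varint` are hypothetical and not part of this repository, and the code assumes a 64-bit OCaml `int`.

```ocaml
(* Hypothetical sketch: none of these names exist in this repository. *)

type text_encoding = Utf8 | Utf16le | Utf16be

type header = {
  page_size : int;          (* bytes per page *)
  encoding : text_encoding; (* database text encoding *)
  page_count : int;         (* in-header database size, in pages *)
}

(* The first 16 bytes of every SQLite 3 database file. *)
let magic = "SQLite format 3\000"

(* Big-endian integer readers over an immutable string. *)
let u16_be s off = (Char.code s.[off] lsl 8) lor Char.code s.[off + 1]
let u32_be s off = (u16_be s off lsl 16) lor u16_be s (off + 2)

(* Parse the fields of the 100-byte header that Phase 1 needs. *)
let parse_header (s : string) : (header, string) result =
  if String.length s < 100 then Error "header shorter than 100 bytes"
  else if String.sub s 0 16 <> magic then Error "bad magic string"
  else
    (* Offset 16: page size; the stored value 1 means 65536. *)
    let page_size = match u16_be s 16 with 1 -> 65536 | n -> n in
    (* Offset 56: text encoding (1 = UTF-8, 2 = UTF-16le, 3 = UTF-16be). *)
    let encoding =
      match u32_be s 56 with
      | 1 -> Ok Utf8
      | 2 -> Ok Utf16le
      | 3 -> Ok Utf16be
      | n -> Error (Printf.sprintf "unknown text encoding %d" n)
    in
    (* Offset 28: database size in pages. *)
    Result.map
      (fun encoding -> { page_size; encoding; page_count = u32_be s 28 })
      encoding

(* Decode a SQLite varint starting at [pos]. Returns the value and the
   number of bytes consumed (1 to 9). The first eight bytes contribute
   7 bits each; a ninth byte, if reached, contributes all 8 bits. *)
let read_varint (s : string) (pos : int) : int64 * int =
  let rec go acc i =
    let b = Char.code s.[pos + i] in
    if i = 8 then
      (Int64.logor (Int64.shift_left acc 8) (Int64.of_int b), 9)
    else
      let acc =
        Int64.logor (Int64.shift_left acc 7) (Int64.of_int (b land 0x7f))
      in
      if b land 0x80 = 0 then (acc, i + 1) else go acc (i + 1)
  in
  go 0L 0
```

A caller would read the first 100 bytes of the file, pass them to `parse_header`, and then use `page_size` to locate pages. `read_varint` returns an `int64` because a 9-byte varint carries a full 64 bits, which does not fit in OCaml's native `int`.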