My aggregated monorepo of OCaml code, automaintained
# Monorepo Improvement Recommendations

Analysis date: 2026-01-19

This document captures opportunities to share code, simplify implementations, improve documentation, and take advantage of the different libraries in this monorepo.

---

## Executive Summary

This monorepo contains **35+ OCaml libraries and tools** with strong architectural consistency:
- **Jsont** is the universal JSON/codec standard
- **Eio** is the universal async foundation
- **Requests + Conpool** is the HTTP stack
- **Cmdliner** is the CLI framework

The primary opportunities are extracting shared code that is duplicated across multiple libraries.

---

## High-Priority Recommendations

### 1. Extract Shared Session/Profile Management

**Impact:** HIGH | **Effort:** Medium | **Libraries affected:** 3

Three protocol libraries independently implement identical session management:
- `/workspace/ocaml-atp/xrpc-auth/lib/xrpc_auth_session.ml`
- `/workspace/ocaml-apubt/lib/auth/apub_auth_session.ml`
- `/workspace/ocaml-matrix/lib/matrix_client/session.ml`

**Duplicated functionality:**
- XDG-based profile directory creation (`~/.config/<app>/profiles/<profile>/`)
- Profile save/load/list/clear operations
- Current profile selection and persistence
- Directory creation with proper permissions

**Recommendation:** Create `ocaml-protocol-session` library with:
```ocaml
module Profile_dir : sig
  val base_config_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> Eio.Fs.dir_ty Eio.Path.t
  val profile_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> ?profile:string -> unit -> Eio.Fs.dir_ty Eio.Path.t
  val list_profiles : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> string list
  val set_current_profile : ... -> unit
  val get_current_profile : ... -> string option
end
```

**Estimated savings:** ~300 lines of duplicated code
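
The path-resolution half of the shared session module could look roughly like the sketch below. It is an illustration under stated assumptions, not the existing code: it uses plain string paths and the standard library instead of `Eio.Path.t`, the injectable `getenv` parameter and the `"default"` profile name are hypothetical, and directory creation and permissions are left out.

```ocaml
(* Sketch only: resolves profile paths as strings. A real implementation
   would return Eio.Path.t values and create directories through Eio. *)

(* XDG_CONFIG_HOME if set and non-empty, otherwise $HOME/.config.
   [getenv] is injectable (an assumption of this sketch) so the lookup
   can be exercised without touching the process environment. *)
let config_home ?(getenv = Sys.getenv_opt) () =
  match getenv "XDG_CONFIG_HOME" with
  | Some dir when dir <> "" -> dir
  | _ ->
    let home = Option.value (getenv "HOME") ~default:"." in
    Filename.concat home ".config"

(* <config_home>/<app_name> *)
let base_config_dir ?getenv ~app_name () =
  Filename.concat (config_home ?getenv ()) app_name

(* <config_home>/<app_name>/profiles/<profile>.
   "default" is an assumed fallback profile name. *)
let profile_dir ?getenv ~app_name ?(profile = "default") () =
  List.fold_left Filename.concat
    (base_config_dir ?getenv ~app_name ())
    [ "profiles"; profile ]
```

Keeping the environment lookup injectable means each of the three protocol libraries could share the same resolution logic and still test it in isolation.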

---

### 2. Unify Codec Error Types

**Impact:** MEDIUM-HIGH | **Effort:** Low | **Libraries affected:** 5

`cbort`, `tomlt`, `init`, and others all define nearly identical error types:

```ocaml
(* From cbort *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string

(* From tomlt - identical semantics *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string
```

**Files with duplicate error handling:**
- `/workspace/ocaml-cbort/lib/cbort.mli` lines 59-123
- `/workspace/ocaml-tomlt/lib/tomlt.mli` lines 247-263
- `/workspace/ocaml-init/src/init.mli` lines 481-570

**Recommendation:** Create shared `codec-error` module:
```ocaml
module Codec_error : sig
  type path = string list
  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string
  type t = { path : path; kind : kind }
  val pp : Format.formatter -> t -> unit
end
```

---

### 3. Create API Client Base Module

**Impact:** MEDIUM | **Effort:** Medium | **Libraries affected:** 7

All API clients implement identical patterns:

**Clients:** karakeep, peertube, typesense, zulip, zotero, claudeio, owntracks

**Duplicated patterns:**

1. **Client factory:**
```ocaml
let create ~sw env ~base_url ~api_key =
  let session = Requests.create ~sw env in
  let session = Requests.set_auth session (Requests.Auth.bearer ~token:api_key) in
  { session; base_url }
```

2. **Query string building:**
```ocaml
let query_string params =
  match params with
  | [] -> ""
  | _ -> "?" ^ String.concat "&" (List.map (fun (k,v) ->
      Uri.pct_encode k ^ "=" ^ Uri.pct_encode v) params)
```

3. **JSON decode wrapper:**
```ocaml
let decode_json codec body_str =
  match Jsont_bytesrw.decode_string' codec body_str with
  | Ok v -> v
  | Error e -> raise (err (Json_error ...))
```

4. **Error handling with Eio.Exn.err:**
```ocaml
type error = Api_error of { status: int; message: string } | ...
type Eio.Exn.err += E of error
let err e = Eio.Exn.create (E e)
```

**Recommendation:** Create `ocaml-api-client` with these utilities.

**Estimated savings:** ~100 LOC per client (700 total)

---

### 4. Extract Bushel's Markdown Link System

**Impact:** MEDIUM | **Effort:** Medium | **Reusability:** High

The 650+ lines implementing custom link syntax could benefit other knowledge management tools:

**Files:**
- `/workspace/ocaml-bushel/lib/bushel_md.ml` (418 lines)
- `/workspace/ocaml-bushel/lib/bushel_link.ml` (235 lines)

**Custom syntax supported:**
- `:slug` - Entry references
- `@handle` - Contact references
- `##tag` - Tag grouping
- `###type` - Type filtering

**Key functions:**
```ocaml
let is_bushel_slug = String.starts_with ~prefix:":"
let is_tag_slug = String.starts_with ~prefix:"##"
let is_contact_slug = String.starts_with ~prefix:"@"
```

**Recommendation:** Extract to `ocaml-mdlink` library for reusable custom markdown extensions.

---

## Medium-Priority Recommendations

### 5. Add Standard Logging to All Protocol Libraries

**Impact:** MEDIUM | **Effort:** Low | **Libraries affected:** 7

Only `webfinger` properly uses the `logs` library:
```ocaml
(* /workspace/ocaml-webfinger/lib/webfinger.ml line 17 *)
let src = Logs.Src.create "webfinger" ~doc:"WebFinger Protocol"
module Log = (val Logs.src_log src : Logs.LOG)
```

**Recommendation:** Add consistent logging to all protocol libraries (atp, apubt, jmap, imap, matrix, mqtte).
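
Applied to another protocol library, the webfinger pattern might look like the sketch below. It assumes the `logs` library is available; the `"jmap"` source name and the `parse_response` function are hypothetical placeholders, not code from the monorepo.

```ocaml
(* Sketch: one log source per protocol library, mirroring webfinger.
   The "jmap" name and [parse_response] are illustrative only. *)
let src = Logs.Src.create "jmap" ~doc:"JMAP protocol"

module Log = (val Logs.src_log src : Logs.LOG)

(* Log at debug level on the normal path and warn on the suspicious one. *)
let parse_response body =
  Log.debug (fun m -> m "parsing response (%d bytes)" (String.length body));
  if body = "" then begin
    Log.warn (fun m -> m "empty response body");
    None
  end
  else Some body
```

With a dedicated source per library, an application can raise verbosity for one protocol only, for example with `Logs.Src.set_level src (Some Logs.Debug)`.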

---

### 6. Create TOML Config Helper Library

**Impact:** MEDIUM | **Effort:** Low | **Libraries affected:** 4+

Several tools repeat TOML config patterns:
- `/workspace/monopam/lib/config.ml` (175 lines)
- `/workspace/poe/lib/config.ml` (81 lines)
- `/workspace/ocaml-matrix/lib/matrix_client/session.ml` (TOML codec helpers)

**Common patterns:**
- XDG directory resolution with fallbacks
- Tilde expansion for paths
- Validation of absolute paths
- Per-codec helper functions (ptime_tomlt, uri_tomlt)

**Recommendation:** Extract shared config utilities.

---

### 7. CLI Logging Setup Extraction

**Impact:** LOW | **Effort:** Low | **Tools affected:** 4

Duplicate logging setup in CLI tools:
```ocaml
(* Identical in monopam/bin/main.ml and poe/bin/main.ml *)
let setup_logging style_renderer level =
  Fmt_tty.setup_std_outputs ?style_renderer ();
  Logs.set_level level;
  Logs.set_reporter (Logs_fmt.reporter ())

let logging_term =
  Term.(const setup_logging $ Fmt_cli.style_renderer () $ Logs_cli.level ())
```

**Recommendation:** Create shared CLI utilities module.

---

### 8. Standardize Object/Table Builder APIs

**Impact:** MEDIUM | **Effort:** High | **Libraries affected:** 3

Currently inconsistent patterns:
- `cbort` uses monadic style (`let*`)
- `tomlt` uses applicative pipeline (`|>`)
- `init` uses applicative pipeline (`|>`)

**Recommendation:** Provide both syntaxes in each library for user choice.

---

## Low-Priority Recommendations

### 9. Add Cmdliner to Srcsetter

**Impact:** LOW | **Effort:** Low | **Status:** Already in TODO

File: `/workspace/srcsetter/TODO.md` line 2

Currently uses bare argv parsing instead of cmdliner like other tools.

---

### 10. Add Alcotest Suite to PublicSuffix

**Impact:** LOW | **Effort:** Low

Currently only has CLI test tool (`/workspace/ocaml-publicsuffix/test/psl_test.ml`) but no automated Alcotest suite.

**Recommendation:** Create `/workspace/ocaml-publicsuffix/test/psl_alcotest.ml` with:
- Roundtrip tests
- Edge case testing for malformed domains
- Wildcard and exception rule testing

---

### 11. Add Mail-Flag JSON Serialization

**Impact:** LOW | **Effort:** Low

Missing JSON serialization for IMAP/JMAP wire formats in `/workspace/ocaml-mail-flag/`.

**Recommendation:** Add jsont codecs for keyword lists in both protocol formats.

---

## Documentation Gaps

| Library | Gap | Location |
|---------|-----|----------|
| **json-pointer** | README lacks JMAP extended pointers section | Code has it at mli lines 482-581 |
| **yamlt** | Missing architecture docs on Jsont.t integration | README.md |
| **Serialization libs** | No unified codec architecture guide | New document needed |
| **API clients** | No "OCaml HTTP client pattern" guide | New document needed |
| **Protocol libs** | No cross-library integration examples | New tutorials needed |

---

## Testing Gaps

| Library | Current State | Recommendation |
|---------|--------------|----------------|
| **publicsuffix** | CLI test tool only | Add Alcotest suite |
| **bushel** | No unit tests | Add roundtrip tests |
| **html5rw** | Excellent (html5lib conformance) | None needed |
| **crockford** | Comprehensive | None needed |
| **punycode** | Comprehensive (39+ cases) | None needed |
| **langdetect** | Very comprehensive (54 cases) | None needed |

---

## Architecture Notes

### Consistent Patterns (Do Not Change)

These patterns are working well across the monorepo:

1. **Jsont for JSON** - Universal, no variation needed
2. **Eio for async** - Consistent Switch-based resource management
3. **Requests + Conpool for HTTP** - Well-integrated
4. **Cmdliner for CLI** - Excellent man page generation
5. **Bytesrw for streaming I/O** - Used consistently

### Dependency Ecosystem

Common dependencies across all libraries:

| Dependency | Usage |
|------------|-------|
| `jsont` | JSON codec (universal) |
| `eio` | Async primitives (universal) |
| `requests` | HTTP client (5+ libraries) |
| `cmdliner` | CLI (all tools) |
| `fmt` | Pretty-printing (universal) |
| `ptime` | Timestamps (6+ libraries) |
| `uri` | URI handling (6+ libraries) |
| `tomlt` | TOML config (4+ libraries) |
| `xdge` | XDG directories (4+ libraries) |

---

## Implementation Priority Order

1. **Extract protocol-session module** - Highest ROI, affects 3 production libraries
2. **Unify codec error types** - Low effort, improves consistency
3. **Create api_client base** - Medium effort, high code savings
4. **Add logging to protocols** - Low effort, improves debuggability
5. **Documentation improvements** - Ongoing, no code changes
6. **Extract bushel link system** - If external reuse is desired
7. **CLI utility extraction** - Nice to have
8. **Testing improvements** - Ongoing maintenance

---

## Notes

- All file paths are absolute and verified as of analysis date
- Line numbers may shift as code evolves
- Some recommendations may be superseded by upstream library changes