# Monorepo Improvement Recommendations

Analysis date: 2026-01-19

This document captures opportunities to share code, simplify implementations, improve documentation, and take advantage of the different libraries in this monorepo.

---

## Executive Summary

This monorepo contains **35+ OCaml libraries and tools** with strong architectural consistency:
- **Jsont** is the universal JSON/codec standard
- **Eio** is the universal async foundation
- **Requests + Conpool** is the HTTP stack
- **Cmdliner** is the CLI framework

The primary opportunity is to extract code that is currently duplicated across multiple libraries into shared modules.

---

## High-Priority Recommendations

### 1. Extract Shared Session/Profile Management

**Impact:** HIGH | **Effort:** Medium | **Libraries affected:** 3

Three protocol libraries independently implement identical session management:
- `/workspace/ocaml-atp/xrpc-auth/lib/xrpc_auth_session.ml`
- `/workspace/ocaml-apubt/lib/auth/apub_auth_session.ml`
- `/workspace/ocaml-matrix/lib/matrix_client/session.ml`

**Duplicated functionality:**
- XDG-based profile directory creation (`~/.config/<app>/profiles/<profile>/`)
- Profile save/load/list/clear operations
- Current profile selection and persistence
- Directory creation with proper permissions

**Recommendation:** Create `ocaml-protocol-session` library with:
```ocaml
module Profile_dir : sig
  val base_config_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> Eio.Fs.dir_ty Eio.Path.t
  val profile_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> ?profile:string -> unit -> Eio.Fs.dir_ty Eio.Path.t
  val list_profiles : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> string list
  val set_current_profile : ... -> unit
  val get_current_profile : ... -> string option
end
```

**Estimated savings:** ~300 lines of duplicated code
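
The directory layout listed above (`~/.config/<app>/profiles/<profile>/`) can be sketched with the standard library alone. This is only an illustration of the path logic, not the proposed Eio-based API: the function names mirror the signature above, while the `"default"` profile name, the injectable `env` lookup, and the `"."` fallback when `HOME` is unset are assumptions made for the example.

```ocaml
(* Resolve the XDG config home: $XDG_CONFIG_HOME if set and non-empty,
   otherwise $HOME/.config. [env] is injectable so the logic can be tested
   without touching the real environment (an assumption of this sketch). *)
let config_home ?(env = Sys.getenv_opt) () =
  match env "XDG_CONFIG_HOME" with
  | Some dir when dir <> "" -> dir
  | _ ->
      (* Falling back to "." when HOME is unset is a placeholder choice. *)
      let home = Option.value (env "HOME") ~default:"." in
      Filename.concat home ".config"

(* Build <config_home>/<app_name>/profiles/<profile>.
   The "default" profile name is hypothetical. *)
let profile_dir ~config_home ~app_name ?(profile = "default") () =
  List.fold_left Filename.concat config_home [ app_name; "profiles"; profile ]
```

A shared library would presumably wrap the resulting path in an `Eio.Path.t` and create the directory with restrictive permissions; that part is left out here.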

---

### 2. Unify Codec Error Types

**Impact:** MEDIUM-HIGH | **Effort:** Low | **Libraries affected:** 5

`cbort`, `tomlt`, `init`, and others all define nearly identical error types:

```ocaml
(* From cbort *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string

(* From tomlt - identical semantics *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string
```

**Files with duplicate error handling:**
- `/workspace/ocaml-cbort/lib/cbort.mli` lines 59-123
- `/workspace/ocaml-tomlt/lib/tomlt.mli` lines 247-263
- `/workspace/ocaml-init/src/init.mli` lines 481-570

**Recommendation:** Create shared `codec-error` module:
```ocaml
module Codec_error : sig
  type path = string list
  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string
  type t = { path : path; kind : kind }
  val pp : Format.formatter -> t -> unit
end
```
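
One possible implementation of that signature is sketched below, using only `Format` from the standard library. The message wording, the dotted path rendering, and the assumption that `path` runs from the document root to the failing member are choices made for this example, not something taken from the existing libraries.

```ocaml
module Codec_error = struct
  (* Assumed ordering: outermost member first, e.g. ["server"; "port"]. *)
  type path = string list

  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string

  type t = { path : path; kind : kind }

  (* Render only the error kind; the wording is illustrative. *)
  let pp_kind ppf = function
    | Type_mismatch { expected; got } ->
        Format.fprintf ppf "expected %s but got %s" expected got
    | Missing_member m -> Format.fprintf ppf "missing member %S" m
    | Unknown_member m -> Format.fprintf ppf "unknown member %S" m
    | Invalid_value msg -> Format.fprintf ppf "invalid value: %s" msg

  (* Prefix the kind with the dotted path when the path is non-empty. *)
  let pp ppf { path; kind } =
    match path with
    | [] -> pp_kind ppf kind
    | _ -> Format.fprintf ppf "%s: %a" (String.concat "." path) pp_kind kind
end
```

Each codec library could then re-export `Codec_error.t` or wrap it in its own error constructor, so that callers handle decoding failures from CBOR, TOML, and INI input uniformly.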

---

### 3. Create API Client Base Module

**Impact:** MEDIUM | **Effort:** Medium | **Libraries affected:** 7

All API clients implement identical patterns:

**Clients:** karakeep, peertube, typesense, zulip, zotero, claudeio, owntracks

**Duplicated patterns:**

1. **Client factory:**
```ocaml
let create ~sw env ~base_url ~api_key =
  let session = Requests.create ~sw env in
  let session = Requests.set_auth session (Requests.Auth.bearer ~token:api_key) in
  { session; base_url }
```

2. **Query string building:**
```ocaml
let query_string params =
  match params with
  | [] -> ""
  | _ -> "?" ^ String.concat "&" (List.map (fun (k,v) ->
      Uri.pct_encode k ^ "=" ^ Uri.pct_encode v) params)
```

3. **JSON decode wrapper:**
```ocaml
let decode_json codec body_str =
  match Jsont_bytesrw.decode_string' codec body_str with
  | Ok v -> v
  | Error e -> raise (err (Json_error ...))
```

4. **Error handling with Eio.Exn.err:**
```ocaml
type error = Api_error of { status: int; message: string } | ...
type Eio.Exn.err += E of error
let err e = Eio.Exn.create (E e)
```

**Recommendation:** Create `ocaml-api-client` with these utilities.

**Estimated savings:** ~100 LOC per client (700 total)
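
To make the query-string pattern concrete without pulling in dependencies, here is a standalone sketch. The hand-written `pct_encode` is a stand-in for `Uri.pct_encode` used only so the example runs on the standard library; it encodes every byte outside the RFC 3986 unreserved set, which may be stricter than what the clients do today. A shared `ocaml-api-client` would presumably keep using `Uri`.

```ocaml
(* Percent-encode every byte except RFC 3986 unreserved characters.
   Stand-in for Uri.pct_encode, for illustration only. *)
let pct_encode s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '.' | '_' | '~' ->
          Buffer.add_char buf c
      | _ -> Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Build "?k1=v1&k2=v2" from key/value pairs, or "" when there are none. *)
let query_string = function
  | [] -> ""
  | params ->
      "?"
      ^ String.concat "&"
          (List.map (fun (k, v) -> pct_encode k ^ "=" ^ pct_encode v) params)
```

For example, `query_string [ ("q", "a b&c"); ("page", "2") ]` evaluates to `"?q=a%20b%26c&page=2"`.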

---

### 4. Extract Bushel's Markdown Link System

**Impact:** MEDIUM | **Effort:** Medium | **Reusability:** High

The 650+ lines implementing custom link syntax could benefit other knowledge management tools:

**Files:**
- `/workspace/ocaml-bushel/lib/bushel_md.ml` (418 lines)
- `/workspace/ocaml-bushel/lib/bushel_link.ml` (235 lines)

**Custom syntax supported:**
- `:slug` - Entry references
- `@handle` - Contact references
- `##tag` - Tag grouping
- `###type` - Type filtering

**Key functions:**
```ocaml
let is_bushel_slug = String.starts_with ~prefix:":"
(* Note: "##" is also a prefix of "###type" links. *)
let is_tag_slug = String.starts_with ~prefix:"##"
let is_contact_slug = String.starts_with ~prefix:"@"
```

**Recommendation:** Extract to `ocaml-mdlink` library for reusable custom markdown extensions.
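
As a rough idea of what an extracted classifier could look like, the sketch below maps a link target to a variant. The `link` type, its constructor names, and `classify` are hypothetical and not taken from Bushel. Because `##` is a prefix of `###`, the longer prefix is tested first.

```ocaml
(* Hypothetical classification of the custom link syntax. *)
type link =
  | Entry of string        (* :slug *)
  | Contact of string      (* @handle *)
  | Tag of string          (* ##tag *)
  | Type_filter of string  (* ###type *)
  | Plain of string        (* anything else, e.g. an ordinary URL *)

(* Drop [prefix] from the front of [s]; callers check the prefix first. *)
let strip ~prefix s =
  let n = String.length prefix in
  String.sub s n (String.length s - n)

let classify s =
  let has prefix = String.starts_with ~prefix s in
  (* "###" must be tested before "##", which it also matches. *)
  if has "###" then Type_filter (strip ~prefix:"###" s)
  else if has "##" then Tag (strip ~prefix:"##" s)
  else if has "@" then Contact (strip ~prefix:"@" s)
  else if has ":" then Entry (strip ~prefix:":" s)
  else Plain s
```

Returning a variant rather than a set of boolean predicates lets callers pattern-match exhaustively, which may make the library easier to extend with new link kinds.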

---

## Medium-Priority Recommendations

### 5. Add Standard Logging to All Protocol Libraries

**Impact:** MEDIUM | **Effort:** Low | **Libraries affected:** 7

Only `webfinger` properly uses the `logs` library:
```ocaml
(* /workspace/ocaml-webfinger/lib/webfinger.ml line 17 *)
let src = Logs.Src.create "webfinger" ~doc:"WebFinger Protocol"
module Log = (val Logs.src_log src : Logs.LOG)
```

**Recommendation:** Add consistent logging to all protocol libraries (atp, apubt, jmap, imap, matrix, mqtte).

---

### 6. Create TOML Config Helper Library

**Impact:** MEDIUM | **Effort:** Low | **Libraries affected:** 4+

Several tools repeat TOML config patterns:
- `/workspace/monopam/lib/config.ml` (175 lines)
- `/workspace/poe/lib/config.ml` (81 lines)
- `/workspace/ocaml-matrix/lib/matrix_client/session.ml` (TOML codec helpers)

**Common patterns:**
- XDG directory resolution with fallbacks
- Tilde expansion for paths
- Validation of absolute paths
- Per-codec helper functions (ptime_tomlt, uri_tomlt)

**Recommendation:** Extract shared config utilities.
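
Two of the common patterns above, tilde expansion and absolute-path validation, are small enough to sketch with the standard library. The function names and the error message are hypothetical, and `home` is passed explicitly so the helpers stay pure; the existing tools may handle edge cases such as `~user/` differently.

```ocaml
(* Expand a leading "~" or "~/" using the given home directory.
   Other forms (e.g. "~user/...") are returned unchanged in this sketch. *)
let expand_tilde ~home path =
  if path = "~" then home
  else if String.starts_with ~prefix:"~/" path then
    Filename.concat home (String.sub path 2 (String.length path - 2))
  else path

(* Accept only absolute paths; the error text is illustrative. *)
let require_absolute path =
  if Filename.is_relative path then
    Error (Printf.sprintf "path is not absolute: %s" path)
  else Ok path

(* Expand first, then validate, so "~/src" is accepted. *)
let resolve_path ~home path = require_absolute (expand_tilde ~home path)
```

With `~home:"/home/u"`, `resolve_path` turns `"~/src"` into `Ok "/home/u/src"` and rejects `"rel/dir"` with an error.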

---

### 7. CLI Logging Setup Extraction

**Impact:** LOW | **Effort:** Low | **Tools affected:** 4

Duplicate logging setup in CLI tools:
```ocaml
(* Identical in monopam/bin/main.ml and poe/bin/main.ml *)
let setup_logging style_renderer level =
  Fmt_tty.setup_std_outputs ?style_renderer ();
  Logs.set_level level;
  Logs.set_reporter (Logs_fmt.reporter ())

let logging_term =
  Term.(const setup_logging $ Fmt_cli.style_renderer () $ Logs_cli.level ())
```

**Recommendation:** Create shared CLI utilities module.

---

### 8. Standardize Object/Table Builder APIs

**Impact:** MEDIUM | **Effort:** High | **Libraries affected:** 3

Currently inconsistent patterns:
- `cbort` uses monadic style (`let*`)
- `tomlt` uses applicative pipeline (`|>`)
- `init` uses applicative pipeline (`|>`)

**Recommendation:** Provide both syntaxes in each library for user choice.
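
To show that the two styles can coexist over one decoder type, here is a toy sketch. Everything in it (the `dec` type, `field`, `obj`, `mem`, and the `server` record) is invented for illustration and does not reflect the actual `cbort`, `tomlt`, or `init` APIs.

```ocaml
(* A toy decoder over a string table; real codecs are far richer. *)
type tbl = (string * string) list
type 'a dec = tbl -> ('a, string) result

let field name : string dec =
 fun t ->
  match List.assoc_opt name t with
  | Some v -> Ok v
  | None -> Error ("missing member: " ^ name)

let return v : 'a dec = fun _ -> Ok v

(* Monadic style: each step can depend on earlier results. *)
let ( let* ) (d : 'a dec) (f : 'a -> 'b dec) : 'b dec =
 fun t -> match d t with Ok v -> f v t | Error e -> Error e

(* Pipeline style: feed members one by one into a constructor. *)
let obj ctor : 'a dec = return ctor

let mem name (d : (string -> 'a) dec) : 'a dec =
 fun t ->
  match (d t, field name t) with
  | Ok f, Ok v -> Ok (f v)
  | Error e, _ -> Error e
  | _, Error e -> Error e

type server = { host : string; port : string }

let monadic : server dec =
  let* host = field "host" in
  let* port = field "port" in
  return { host; port }

let pipeline : server dec =
  obj (fun host port -> { host; port }) |> mem "host" |> mem "port"
```

Both `monadic` and `pipeline` decode the same input to the same record, so offering both is mainly a matter of exposing two thin front ends over a shared core.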

---

## Low-Priority Recommendations

### 9. Add Cmdliner to Srcsetter

**Impact:** LOW | **Effort:** Low | **Status:** Already in TODO

File: `/workspace/srcsetter/TODO.md` line 2

Srcsetter currently uses bare argv parsing rather than Cmdliner, unlike the other tools.

---

### 10. Add Alcotest Suite to PublicSuffix

**Impact:** LOW | **Effort:** Low

PublicSuffix currently has only a CLI test tool (`/workspace/ocaml-publicsuffix/test/psl_test.ml`) and no automated Alcotest suite.

**Recommendation:** Create `/workspace/ocaml-publicsuffix/test/psl_alcotest.ml` with:
- Roundtrip tests
- Edge case testing for malformed domains
- Wildcard and exception rule testing

---

### 11. Add Mail-Flag JSON Serialization

**Impact:** LOW | **Effort:** Low

JSON serialization for the IMAP/JMAP wire formats is missing in `/workspace/ocaml-mail-flag/`.

**Recommendation:** Add jsont codecs for keyword lists in both protocol formats.

---

## Documentation Gaps

| Library | Gap | Location |
|---------|-----|----------|
| **json-pointer** | README lacks JMAP extended pointers section | Code has it at mli lines 482-581 |
| **yamlt** | Missing architecture docs on Jsont.t integration | README.md |
| **Serialization libs** | No unified codec architecture guide | New document needed |
| **API clients** | No "OCaml HTTP client pattern" guide | New document needed |
| **Protocol libs** | No cross-library integration examples | New tutorials needed |

---

## Testing Gaps

| Library | Current State | Recommendation |
|---------|--------------|----------------|
| **publicsuffix** | CLI test tool only | Add Alcotest suite |
| **bushel** | No unit tests | Add roundtrip tests |
| **html5rw** | Excellent (html5lib conformance) | None needed |
| **crockford** | Comprehensive | None needed |
| **punycode** | Comprehensive (39+ cases) | None needed |
| **langdetect** | Very comprehensive (54 cases) | None needed |

---

## Architecture Notes

### Consistent Patterns (Do Not Change)

These patterns are working well across the monorepo:

1. **Jsont for JSON** - Universal, no variation needed
2. **Eio for async** - Consistent Switch-based resource management
3. **Requests + Conpool for HTTP** - Well-integrated
4. **Cmdliner for CLI** - Excellent man page generation
5. **Bytesrw for streaming I/O** - Used consistently

### Dependency Ecosystem

Dependencies shared across the monorepo, with how widely each is used:

| Dependency | Usage |
|------------|-------|
| `jsont` | JSON codec (universal) |
| `eio` | Async primitives (universal) |
| `requests` | HTTP client (5+ libraries) |
| `cmdliner` | CLI (all tools) |
| `fmt` | Pretty-printing (universal) |
| `ptime` | Timestamps (6+ libraries) |
| `uri` | URI handling (6+ libraries) |
| `tomlt` | TOML config (4+ libraries) |
| `xdge` | XDG directories (4+ libraries) |

---

## Implementation Priority Order

1. **Extract protocol-session module** - Highest ROI, affects 3 production libraries
2. **Unify codec error types** - Low effort, improves consistency
3. **Create api_client base** - Medium effort, high code savings
4. **Add logging to protocols** - Low effort, improves debuggability
5. **Documentation improvements** - Ongoing, no code changes
6. **Extract bushel link system** - If external reuse is desired
7. **CLI utility extraction** - Nice to have
8. **Testing improvements** - Ongoing maintenance

---

## Notes

- All file paths are absolute and verified as of analysis date
- Line numbers may shift as code evolves
- Some recommendations may be superseded by upstream library changes