Monorepo Improvement Recommendations
Analysis date: 2026-01-19
This document captures opportunities to share code, simplify implementations, improve documentation, and take advantage of the different libraries in this monorepo.
Executive Summary
This monorepo contains 35+ OCaml libraries and tools with strong architectural consistency:
- Jsont is the universal JSON/codec standard
- Eio is the universal async foundation
- Requests + Conpool is the HTTP stack
- Cmdliner is the CLI framework
The primary opportunity is to extract code that is currently duplicated across multiple libraries into shared modules.
High-Priority Recommendations
1. Extract Shared Session/Profile Management
Impact: HIGH | Effort: Medium | Libraries affected: 3
Three protocol libraries independently implement identical session management:
- /workspace/ocaml-atp/xrpc-auth/lib/xrpc_auth_session.ml
- /workspace/ocaml-apubt/lib/auth/apub_auth_session.ml
- /workspace/ocaml-matrix/lib/matrix_client/session.ml
Duplicated functionality:
- XDG-based profile directory creation (~/.config/<app>/profiles/<profile>/)
- Profile save/load/list/clear operations
- Current profile selection and persistence
- Directory creation with proper permissions
Recommendation: Create ocaml-protocol-session library with:
module Profile_dir : sig
  val base_config_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> Eio.Fs.dir_ty Eio.Path.t
  val profile_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> ?profile:string -> unit -> Eio.Fs.dir_ty Eio.Path.t
  val list_profiles : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> string list
  val set_current_profile : ... -> unit
  val get_current_profile : ... -> string option
end
Estimated savings: ~300 lines of duplicated code
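The path logic that such a library would centralise can be sketched with the standard library alone. This is only an illustration: it works on plain strings rather than `Eio.Path.t`, and the `config_home` helper, the explicit `~home` and `~xdg_config_home` arguments, and the `default` fallback profile name are assumptions, not behaviour taken from the three existing implementations.

```ocaml
(* Hypothetical, stdlib-only sketch of the shared path logic.
   A real implementation would build on Eio.Path.t instead of strings. *)

(* Resolve the XDG config root: $XDG_CONFIG_HOME when set and non-empty,
   otherwise <home>/.config. Inputs are explicit so the function is pure. *)
let config_home ~home ~xdg_config_home =
  match xdg_config_home with
  | Some dir when dir <> "" -> dir
  | _ -> Filename.concat home ".config"

(* <config root>/<app> *)
let base_config_dir ~home ~xdg_config_home ~app_name =
  Filename.concat (config_home ~home ~xdg_config_home) app_name

(* <config root>/<app>/profiles/<profile>
   "default" is an assumed fallback profile name. *)
let profile_dir ~home ~xdg_config_home ~app_name ?(profile = "default") () =
  let base = base_config_dir ~home ~xdg_config_home ~app_name in
  Filename.concat (Filename.concat base "profiles") profile
```

A caller would typically pass `Sys.getenv_opt "XDG_CONFIG_HOME"` for `~xdg_config_home`; keeping the environment lookup outside the function makes the path logic easy to unit-test.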
2. Unify Codec Error Types
Impact: MEDIUM-HIGH | Effort: Low | Libraries affected: 5
cbort, tomlt, init, and others all define nearly identical error types:
(* From cbort *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string
(* From tomlt - identical semantics *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string
Files with duplicate error handling:
- /workspace/ocaml-cbort/lib/cbort.mli, lines 59-123
- /workspace/ocaml-tomlt/lib/tomlt.mli, lines 247-263
- /workspace/ocaml-init/src/init.mli, lines 481-570
Recommendation: Create shared codec-error module:
module Codec_error : sig
  type path = string list
  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string
  type t = { path : path; kind : kind }
  val pp : Format.formatter -> t -> unit
end
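One possible implementation of this signature is sketched below. The dotted path rendering and the wording of each message are assumptions made for illustration; the existing libraries may format their errors differently.

```ocaml
(* Sketch of an implementation for the proposed signature.
   Message wording and the "a.b.c" path rendering are assumptions. *)
module Codec_error = struct
  type path = string list

  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string

  type t = { path : path; kind : kind }

  let pp_kind ppf = function
    | Type_mismatch { expected; got } ->
        Format.fprintf ppf "expected %s but got %s" expected got
    | Missing_member name -> Format.fprintf ppf "missing member %S" name
    | Unknown_member name -> Format.fprintf ppf "unknown member %S" name
    | Invalid_value msg -> Format.fprintf ppf "invalid value: %s" msg

  (* Prefix the message with the path when there is one. *)
  let pp ppf { path; kind } =
    match path with
    | [] -> pp_kind ppf kind
    | _ -> Format.fprintf ppf "%s: %a" (String.concat "." path) pp_kind kind
end
```

Each codec library could then re-export `Codec_error.t` (or wrap it in its own exception) so that callers handling errors from several codecs match on a single type.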
3. Create API Client Base Module
Impact: MEDIUM | Effort: Medium | Libraries affected: 7
All API clients implement identical patterns:
Clients: karakeep, peertube, typesense, zulip, zotero, claudeio, owntracks
Duplicated patterns:
- Client factory:
  let create ~sw env ~base_url ~api_key =
    let session = Requests.create ~sw env in
    let session = Requests.set_auth session (Requests.Auth.bearer ~token:api_key) in
    { session; base_url }
- Query string building:
  let query_string params =
    match params with
    | [] -> ""
    | _ -> "?" ^ String.concat "&" (List.map (fun (k,v) ->
        Uri.pct_encode k ^ "=" ^ Uri.pct_encode v) params)
- JSON decode wrapper:
  let decode_json codec body_str =
    match Jsont_bytesrw.decode_string' codec body_str with
    | Ok v -> v
    | Error e -> raise (err (Json_error ...))
- Error handling with Eio.Exn.err:
  type error = Api_error of { status: int; message: string } | ...
  type Eio.Exn.err += E of error
  let err e = Eio.Exn.create (E e)
Recommendation: Create ocaml-api-client with these utilities.
Estimated savings: ~100 LOC per client (700 total)
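As an illustration of what one shared helper might look like, here is a stdlib-only sketch of the query string builder. The `pct_encode` function is a simplified stand-in for `Uri.pct_encode`, written out only so the example is self-contained; a real `ocaml-api-client` would presumably keep using `Uri`.

```ocaml
(* Hypothetical shared helper, stdlib-only for illustration.
   [pct_encode] is a simplified stand-in for Uri.pct_encode: it keeps the
   RFC 3986 unreserved characters and percent-escapes every other byte. *)
let pct_encode s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '.' | '_' | '~' ->
          Buffer.add_char buf c
      | _ -> Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Build "?k1=v1&k2=v2" from key/value pairs, or "" when there are none. *)
let query_string = function
  | [] -> ""
  | params ->
      "?"
      ^ String.concat "&"
          (List.map (fun (k, v) -> pct_encode k ^ "=" ^ pct_encode v) params)
```

With this sketch, `query_string [ ("q", "a b"); ("tag", "x&y") ]` evaluates to `"?q=a%20b&tag=x%26y"`.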
4. Extract Bushel's Markdown Link System
Impact: MEDIUM | Effort: Medium | Reusability: High
The 650+ lines implementing custom link syntax could benefit other knowledge management tools:
Files:
- /workspace/ocaml-bushel/lib/bushel_md.ml (418 lines)
- /workspace/ocaml-bushel/lib/bushel_link.ml (235 lines)
Custom syntax supported:
- :slug - Entry references
- @handle - Contact references
- ##tag - Tag grouping
- ###type - Type filtering
Key functions:
let is_bushel_slug = String.starts_with ~prefix:":"
let is_tag_slug = String.starts_with ~prefix:"##"
let is_contact_slug = String.starts_with ~prefix:"@"
Recommendation: Extract to ocaml-mdlink library for reusable custom markdown extensions.
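An extracted library could expose a single classifier instead of separate predicates. The sketch below is hypothetical: the `link` type, its constructor names, and `classify` are not taken from Bushel. It also shows one detail worth preserving in any extraction: every `###type` link also starts with `##`, so the longer prefix has to be tested first.

```ocaml
(* Hypothetical classifier for the four prefixes listed above.
   Requires OCaml 4.13+ for String.starts_with. *)
type link =
  | Entry of string        (* :slug *)
  | Contact of string      (* @handle *)
  | Tag of string          (* ##tag *)
  | Type_filter of string  (* ###type *)
  | Plain of string        (* anything else, e.g. an ordinary URL *)

(* Drop the first [n] characters of [s]. *)
let drop n s = String.sub s n (String.length s - n)

let classify s =
  (* "###" must be tested before "##": every "###type" also starts
     with "##" and would otherwise be classified as a tag. *)
  if String.starts_with ~prefix:"###" s then Type_filter (drop 3 s)
  else if String.starts_with ~prefix:"##" s then Tag (drop 2 s)
  else if String.starts_with ~prefix:":" s then Entry (drop 1 s)
  else if String.starts_with ~prefix:"@" s then Contact (drop 1 s)
  else Plain s
```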
Medium-Priority Recommendations
5. Add Standard Logging to All Protocol Libraries
Impact: MEDIUM | Effort: Low | Libraries affected: 7
Only webfinger properly uses the logs library:
(* /workspace/ocaml-webfinger/lib/webfinger.ml line 17 *)
let src = Logs.Src.create "webfinger" ~doc:"WebFinger Protocol"
module Log = (val Logs.src_log src : Logs.LOG)
Recommendation: Add consistent logging to all protocol libraries (atp, apubt, jmap, imap, matrix, mqtte).
6. Create TOML Config Helper Library
Impact: MEDIUM | Effort: Low | Libraries affected: 4+
Several tools repeat TOML config patterns:
- /workspace/monopam/lib/config.ml (175 lines)
- /workspace/poe/lib/config.ml (81 lines)
- /workspace/ocaml-matrix/lib/matrix_client/session.ml (TOML codec helpers)
Common patterns:
- XDG directory resolution with fallbacks
- Tilde expansion for paths
- Validation of absolute paths
- Per-codec helper functions (ptime_tomlt, uri_tomlt)
Recommendation: Extract shared config utilities.
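Two of the patterns listed above, tilde expansion and absolute-path validation, are small enough to sketch with the standard library. The function names and the error message are assumptions for illustration and are not taken from monopam or poe.

```ocaml
(* Hypothetical config helpers; [home] is passed explicitly so the
   functions stay pure. Requires OCaml 4.13+ for String.starts_with. *)

(* Expand a leading "~" or "~/" to the home directory.
   Forms such as "~user/..." are left untouched. *)
let expand_tilde ~home path =
  if path = "~" then home
  else if String.starts_with ~prefix:"~/" path then
    Filename.concat home (String.sub path 2 (String.length path - 2))
  else path

(* Expand the tilde, then require the result to be absolute. *)
let absolute_path ~home path =
  let expanded = expand_tilde ~home path in
  if Filename.is_relative expanded then
    Error (Printf.sprintf "%S is not an absolute path" path)
  else Ok expanded
```

Returning a `result` rather than raising lets each tool decide whether a bad path is fatal or should fall back to a default.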
7. CLI Logging Setup Extraction
Impact: LOW | Effort: Low | Tools affected: 4
Duplicate logging setup in CLI tools:
(* Identical in monopam/bin/main.ml and poe/bin/main.ml *)
let setup_logging style_renderer level =
  Fmt_tty.setup_std_outputs ?style_renderer ();
  Logs.set_level level;
  Logs.set_reporter (Logs_fmt.reporter ())

let logging_term =
  Term.(const setup_logging $ Fmt_cli.style_renderer () $ Logs_cli.level ())
Recommendation: Create shared CLI utilities module.
8. Standardize Object/Table Builder APIs
Impact: MEDIUM | Effort: High | Libraries affected: 3
Currently inconsistent patterns:
- cbort uses monadic style (let*)
- tomlt uses applicative pipeline (|>)
- init uses applicative pipeline (|>)
Recommendation: Provide both syntaxes in each library for user choice.
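To make the difference concrete, the toy decoder below offers both styles over the same underlying type. It is a sketch only: `dec`, `mem`, `field`, and `return` are hypothetical names, and the input is a simple association list rather than CBOR, TOML, or INI data, so none of this reflects the actual cbort, tomlt, or init APIs.

```ocaml
(* Toy decoder over an association list, offering both builder styles. *)
type 'a dec = (string * string) list -> ('a, string) result

let return v : 'a dec = fun _ -> Ok v

(* Decode one required member. *)
let mem name : string dec =
 fun fields ->
  match List.assoc_opt name fields with
  | Some v -> Ok v
  | None -> Error ("missing member: " ^ name)

(* Monadic style: sequence decoders with a binding operator. *)
let ( let* ) (d : 'a dec) (f : 'a -> 'b dec) : 'b dec =
 fun fields ->
  match d fields with
  | Ok v -> f v fields
  | Error e -> Error e

(* Pipeline style: feed one more member into a partially applied
   constructor. *)
let field name (df : (string -> 'b) dec) : 'b dec =
 fun fields ->
  match (df fields, mem name fields) with
  | Ok f, Ok v -> Ok (f v)
  | Error e, _ | _, Error e -> Error e

type user = { name : string; email : string }

let user_monadic : user dec =
  let* name = mem "name" in
  let* email = mem "email" in
  return { name; email }

let user_pipeline : user dec =
  return (fun name email -> { name; email })
  |> field "name"
  |> field "email"
```

Both decoders are built from the same `dec` type, which suggests that exposing the second style in each library could be a thin layer over the existing one rather than a parallel implementation.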
Low-Priority Recommendations
9. Add Cmdliner to Srcsetter
Impact: LOW | Effort: Low | Status: Already in TODO
File: /workspace/srcsetter/TODO.md line 2
Srcsetter currently parses argv by hand instead of using Cmdliner like the other tools.
10. Add Alcotest Suite to PublicSuffix
Impact: LOW | Effort: Low
PublicSuffix currently has only a CLI test tool (/workspace/ocaml-publicsuffix/test/psl_test.ml) and no automated Alcotest suite.
Recommendation: Create /workspace/ocaml-publicsuffix/test/psl_alcotest.ml with:
- Roundtrip tests
- Edge case testing for malformed domains
- Wildcard and exception rule testing
11. Add Mail-Flag JSON Serialization
Impact: LOW | Effort: Low
Missing JSON serialization for IMAP/JMAP wire formats in /workspace/ocaml-mail-flag/.
Recommendation: Add jsont codecs for keyword lists in both protocol formats.
Documentation Gaps
| Library | Gap | Location |
|---|---|---|
| json-pointer | README lacks JMAP extended pointers section | Code has it at mli lines 482-581 |
| yamlt | Missing architecture docs on Jsont.t integration | README.md |
| Serialization libs | No unified codec architecture guide | New document needed |
| API clients | No "OCaml HTTP client pattern" guide | New document needed |
| Protocol libs | No cross-library integration examples | New tutorials needed |
Testing Gaps
| Library | Current State | Recommendation |
|---|---|---|
| publicsuffix | CLI test tool only | Add Alcotest suite |
| bushel | No unit tests | Add roundtrip tests |
| html5rw | Excellent (html5lib conformance) | None needed |
| crockford | Comprehensive | None needed |
| punycode | Comprehensive (39+ cases) | None needed |
| langdetect | Very comprehensive (54 cases) | None needed |
Architecture Notes
Consistent Patterns (Do Not Change)
These patterns are working well across the monorepo:
- Jsont for JSON - Universal, no variation needed
- Eio for async - Consistent Switch-based resource management
- Requests + Conpool for HTTP - Well-integrated
- Cmdliner for CLI - Excellent man page generation
- Bytesrw for streaming I/O - Used consistently
Dependency Ecosystem
Common dependencies across all libraries:
| Dependency | Usage |
|---|---|
| jsont | JSON codec (universal) |
| eio | Async primitives (universal) |
| requests | HTTP client (5+ libraries) |
| cmdliner | CLI (all tools) |
| fmt | Pretty-printing (universal) |
| ptime | Timestamps (6+ libraries) |
| uri | URI handling (6+ libraries) |
| tomlt | TOML config (4+ libraries) |
| xdge | XDG directories (4+ libraries) |
Implementation Priority Order
1. Extract protocol-session module - Highest ROI, affects 3 production libraries
2. Unify codec error types - Low effort, improves consistency
3. Create api_client base - Medium effort, high code savings
4. Add logging to protocols - Low effort, improves debuggability
5. Documentation improvements - Ongoing, no code changes
6. Extract bushel link system - If external reuse is desired
7. CLI utility extraction - Nice to have
8. Testing improvements - Ongoing maintenance
Notes
- All file paths are absolute and were verified as of the analysis date
- Line numbers may shift as code evolves
- Some recommendations may be superseded by upstream library changes