My aggregated monorepo of OCaml code, automaintained

Monorepo Improvement Recommendations#

Analysis date: 2026-01-19

This document captures opportunities to share code, simplify implementations, improve documentation, and take advantage of the different libraries in this monorepo.


Executive Summary#

This monorepo contains 35+ OCaml libraries and tools with strong architectural consistency:

  • Jsont is the universal JSON/codec standard
  • Eio is the universal async foundation
  • Requests + Conpool is the HTTP stack
  • Cmdliner is the CLI framework

The primary opportunity is to extract code that is currently duplicated across multiple libraries into shared modules.


High-Priority Recommendations#

1. Extract Shared Session/Profile Management#

Impact: HIGH | Effort: Medium | Libraries affected: 3

Three protocol libraries independently implement identical session management:

  • /workspace/ocaml-atp/xrpc-auth/lib/xrpc_auth_session.ml
  • /workspace/ocaml-apubt/lib/auth/apub_auth_session.ml
  • /workspace/ocaml-matrix/lib/matrix_client/session.ml

Duplicated functionality:

  • XDG-based profile directory creation (~/.config/<app>/profiles/<profile>/)
  • Profile save/load/list/clear operations
  • Current profile selection and persistence
  • Directory creation with proper permissions

Recommendation: Create ocaml-protocol-session library with:

module Profile_dir : sig
  val base_config_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> Eio.Fs.dir_ty Eio.Path.t
  val profile_dir : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> ?profile:string -> unit -> Eio.Fs.dir_ty Eio.Path.t
  val list_profiles : Eio.Fs.dir_ty Eio.Path.t -> app_name:string -> string list
  val set_current_profile : ... -> unit
  val get_current_profile : ... -> string option
end

Estimated savings: ~300 lines of duplicated code
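
The path layout described above can be sketched with the standard library alone. This is a minimal illustration, not the proposed API: the module name Profile_path, the "default" profile name, and the explicit ~home and ?xdg_config_home arguments are assumptions made for the example, and the real library would work on Eio paths rather than plain strings.

```ocaml
(* Hypothetical, stdlib-only sketch of the profile path logic.
   The proposed Profile_dir module would use Eio.Path.t instead of strings. *)
module Profile_path = struct
  (* Assumed name for the profile used when none is given. *)
  let default_profile = "default"

  (* Base config directory for [app_name].
     [xdg_config_home] mirrors $XDG_CONFIG_HOME, [home] mirrors $HOME. *)
  let base_config_dir ?xdg_config_home ~home ~app_name () =
    let root =
      match xdg_config_home with
      | Some dir when dir <> "" -> dir
      | _ -> Filename.concat home ".config"
    in
    Filename.concat root app_name

  (* <config>/<app>/profiles/<profile> *)
  let profile_dir ?xdg_config_home ?(profile = default_profile) ~home
      ~app_name () =
    let base = base_config_dir ?xdg_config_home ~home ~app_name () in
    Filename.concat (Filename.concat base "profiles") profile
end
```

Keeping the path computation pure, with the environment passed in as arguments, would let each protocol library test it without touching the file system.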


2. Unify Codec Error Types#

Impact: MEDIUM-HIGH | Effort: Low | Libraries affected: 5

cbort, tomlt, init, and others all define nearly identical error types:

(* From cbort *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string

(* From tomlt - identical semantics *)
| Type_mismatch of { expected : string; got : string }
| Missing_member of string
| Unknown_member of string

Files with duplicate error handling:

  • /workspace/ocaml-cbort/lib/cbort.mli lines 59-123
  • /workspace/ocaml-tomlt/lib/tomlt.mli lines 247-263
  • /workspace/ocaml-init/src/init.mli lines 481-570

Recommendation: Create shared codec-error module:

module Codec_error : sig
  type path = string list
  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string
  type t = { path : path; kind : kind }
  val pp : Format.formatter -> t -> unit
end
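
A possible implementation of this signature, using only the standard library, might look like the following. The message wording and the use of "." to join path segments are assumptions for illustration; the existing libraries may format errors differently.

```ocaml
module Codec_error = struct
  type path = string list

  type kind =
    | Type_mismatch of { expected : string; got : string }
    | Missing_member of string
    | Unknown_member of string
    | Invalid_value of string

  type t = { path : path; kind : kind }

  (* Render the error kind on its own. *)
  let pp_kind ppf = function
    | Type_mismatch { expected; got } ->
        Format.fprintf ppf "expected %s but got %s" expected got
    | Missing_member name -> Format.fprintf ppf "missing member %S" name
    | Unknown_member name -> Format.fprintf ppf "unknown member %S" name
    | Invalid_value msg -> Format.fprintf ppf "invalid value: %s" msg

  (* Prefix the kind with the dotted path when one is available. *)
  let pp ppf { path; kind } =
    match path with
    | [] -> pp_kind ppf kind
    | _ -> Format.fprintf ppf "%s: %a" (String.concat "." path) pp_kind kind
end
```

Each codec library could then re-export Codec_error.t or wrap it in its own exception, so callers see one error vocabulary regardless of the wire format.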

3. Create API Client Base Module#

Impact: MEDIUM | Effort: Medium | Libraries affected: 7

All API clients implement identical patterns:

Clients: karakeep, peertube, typesense, zulip, zotero, claudeio, owntracks

Duplicated patterns:

  1. Client factory:
let create ~sw env ~base_url ~api_key =
  let session = Requests.create ~sw env in
  let session = Requests.set_auth session (Requests.Auth.bearer ~token:api_key) in
  { session; base_url }
  2. Query string building:
let query_string params =
  match params with
  | [] -> ""
  | _ -> "?" ^ String.concat "&" (List.map (fun (k,v) ->
      Uri.pct_encode k ^ "=" ^ Uri.pct_encode v) params)
  3. JSON decode wrapper:
let decode_json codec body_str =
  match Jsont_bytesrw.decode_string' codec body_str with
  | Ok v -> v
  | Error e -> raise (err (Json_error ...))
  4. Error handling with Eio.Exn.err:
type error = Api_error of { status: int; message: string } | ...
type Eio.Exn.err += E of error
let err e = Eio.Exn.create (E e)

Recommendation: Create ocaml-api-client with these utilities.

Estimated savings: ~100 LOC per client (700 total)
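
As an illustration of what one shared utility could look like, here is a self-contained sketch of the query string helper. To stay runnable without dependencies it swaps Uri.pct_encode for a small hand-written encoder that escapes everything outside the RFC 3986 unreserved set; the pct_encode and url names are hypothetical and not part of any existing client.

```ocaml
(* Percent-encode every byte except RFC 3986 unreserved characters.
   Stand-in for Uri.pct_encode so the sketch needs no dependencies. *)
let pct_encode s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '.' | '_' | '~' ->
          Buffer.add_char buf c
      | _ -> Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Build "?k1=v1&k2=v2", or the empty string when there are no parameters. *)
let query_string = function
  | [] -> ""
  | params ->
      "?"
      ^ String.concat "&"
          (List.map (fun (k, v) -> pct_encode k ^ "=" ^ pct_encode v) params)

(* Hypothetical convenience: join base URL, path, and query parameters. *)
let url ~base_url ~path params = base_url ^ path ^ query_string params
```

For example, `url ~base_url:"https://example.org" ~path:"/api/items" [ ("limit", "10") ]` evaluates to `"https://example.org/api/items?limit=10"`.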


4. Extract Bushel Link System#

Impact: MEDIUM | Effort: Medium | Reusability: High

The 650+ lines implementing custom link syntax could benefit other knowledge management tools:

Files:

  • /workspace/ocaml-bushel/lib/bushel_md.ml (418 lines)
  • /workspace/ocaml-bushel/lib/bushel_link.ml (235 lines)

Custom syntax supported:

  • :slug - Entry references
  • @handle - Contact references
  • ##tag - Tag grouping
  • ###type - Type filtering

Key functions:

let is_bushel_slug = String.starts_with ~prefix:":"
let is_tag_slug = String.starts_with ~prefix:"##"
let is_contact_slug = String.starts_with ~prefix:"@"

Recommendation: Extract to ocaml-mdlink library for reusable custom markdown extensions.
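
One detail an extracted library would need to handle is that the tag prefix (##) is itself a prefix of the type prefix (###), so the longer prefix has to be tested first. The sketch below shows one way to classify a link target under that assumption; the link_kind type and classify function are hypothetical and do not reflect how bushel currently structures this logic.

```ocaml
(* Hypothetical classification of custom link targets. *)
type link_kind =
  | Entry of string        (* :slug *)
  | Contact of string      (* @handle *)
  | Tag of string          (* ##tag *)
  | Type_filter of string  (* ###type *)
  | Plain of string        (* anything else, e.g. an ordinary URL *)

(* Drop [prefix] from the front of [s]; the caller has checked it is there. *)
let strip ~prefix s =
  let n = String.length prefix in
  String.sub s n (String.length s - n)

let classify link =
  (* "###" must be tested before "##", since every "###x" also starts with "##". *)
  if String.starts_with ~prefix:"###" link then
    Type_filter (strip ~prefix:"###" link)
  else if String.starts_with ~prefix:"##" link then
    Tag (strip ~prefix:"##" link)
  else if String.starts_with ~prefix:"@" link then
    Contact (strip ~prefix:"@" link)
  else if String.starts_with ~prefix:":" link then
    Entry (strip ~prefix:":" link)
  else Plain link
```

Returning a variant rather than a set of boolean predicates lets callers pattern match exhaustively when new link forms are added.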


Medium-Priority Recommendations#

5. Add Standard Logging to All Protocol Libraries#

Impact: MEDIUM | Effort: Low | Libraries affected: 7

Among the protocol libraries, only webfinger sets up a dedicated source with the logs library:

(* /workspace/ocaml-webfinger/lib/webfinger.ml line 17 *)
let src = Logs.Src.create "webfinger" ~doc:"WebFinger Protocol"
module Log = (val Logs.src_log src : Logs.LOG)

Recommendation: Add consistent logging to all protocol libraries (atp, apubt, jmap, imap, matrix, mqtte).


6. Create TOML Config Helper Library#

Impact: MEDIUM | Effort: Low | Libraries affected: 4+

Several tools repeat TOML config patterns:

  • /workspace/monopam/lib/config.ml (175 lines)
  • /workspace/poe/lib/config.ml (81 lines)
  • /workspace/ocaml-matrix/lib/matrix_client/session.ml (TOML codec helpers)

Common patterns:

  • XDG directory resolution with fallbacks
  • Tilde expansion for paths
  • Validation of absolute paths
  • Per-codec helper functions (ptime_tomlt, uri_tomlt)

Recommendation: Extract shared config utilities.
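
Two of the patterns listed above, tilde expansion and absolute path validation, are small enough to sketch with the standard library. The function names and the error message are assumptions for illustration, and the home directory is passed explicitly so the helpers stay pure.

```ocaml
(* Expand a leading "~" or "~/" using the given home directory.
   Other forms such as "~user/..." are left untouched. *)
let expand_tilde ~home path =
  if path = "~" then home
  else if String.starts_with ~prefix:"~/" path then
    Filename.concat home (String.sub path 2 (String.length path - 2))
  else path

(* Expand the path, then require the result to be absolute. *)
let absolute_path ~home path =
  let expanded = expand_tilde ~home path in
  if Filename.is_relative expanded then
    Error (Printf.sprintf "expected an absolute path, got %S" path)
  else Ok expanded
```

A config decoder could call absolute_path on each path-valued field and report the Error case through whatever error type the TOML codec uses.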


7. CLI Logging Setup Extraction#

Impact: LOW | Effort: Low | Tools affected: 4

Duplicate logging setup in CLI tools:

(* Identical in monopam/bin/main.ml and poe/bin/main.ml *)
let setup_logging style_renderer level =
  Fmt_tty.setup_std_outputs ?style_renderer ();
  Logs.set_level level;
  Logs.set_reporter (Logs_fmt.reporter ())

let logging_term =
  Term.(const setup_logging $ Fmt_cli.style_renderer () $ Logs_cli.level ())

Recommendation: Create shared CLI utilities module.


8. Standardize Object/Table Builder APIs#

Impact: MEDIUM | Effort: High | Libraries affected: 3

Currently inconsistent patterns:

  • cbort uses monadic style (let*)
  • tomlt uses applicative pipeline (|>)
  • init uses applicative pipeline (|>)

Recommendation: Provide both syntaxes in each library for user choice.
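
To make the two styles concrete, the following toy decoder over string key/value pairs exposes both a monadic (let*) and a pipeline (|>) interface on top of the same core. It is a sketch under simplified assumptions: the dec type, mem, obj, and field are hypothetical and are not the cbort, tomlt, or init APIs.

```ocaml
(* Toy decoder: reads members from an association list. *)
type 'a dec = (string * string) list -> ('a, string) result

let return (x : 'a) : 'a dec = fun _ -> Ok x

let bind (d : 'a dec) (f : 'a -> 'b dec) : 'b dec =
 fun kvs -> match d kvs with Ok v -> f v kvs | Error e -> Error e

(* Decode one required member. *)
let mem name : string dec =
 fun kvs ->
  match List.assoc_opt name kvs with
  | Some v -> Ok v
  | None -> Error ("missing member " ^ name)

(* Monadic style. *)
let ( let* ) = bind

(* Pipeline style: start from a constructor, feed it one member at a time. *)
let obj (ctor : 'a) : 'a dec = return ctor

let field name (d : (string -> 'b) dec) : 'b dec =
  bind d (fun f -> bind (mem name) (fun v -> return (f v)))

type person = { name : string; city : string }

let person_monadic : person dec =
  let* name = mem "name" in
  let* city = mem "city" in
  return { name; city }

let person_pipeline : person dec =
  obj (fun name city -> { name; city }) |> field "name" |> field "city"
```

Both decoders are built from the same bind, so offering the second syntax costs only a few extra definitions per library.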


Low-Priority Recommendations#

9. Add Cmdliner to Srcsetter#

Impact: LOW | Effort: Low | Status: Already in TODO

File: /workspace/srcsetter/TODO.md line 2

Srcsetter currently parses argv by hand instead of using cmdliner like the other tools.


10. Add Alcotest Suite to PublicSuffix#

Impact: LOW | Effort: Low

PublicSuffix currently has only a CLI test tool (/workspace/ocaml-publicsuffix/test/psl_test.ml) and no automated Alcotest suite.

Recommendation: Create /workspace/ocaml-publicsuffix/test/psl_alcotest.ml with:

  • Roundtrip tests
  • Edge case testing for malformed domains
  • Wildcard and exception rule testing

11. Add Mail-Flag JSON Serialization#

Impact: LOW | Effort: Low

Missing JSON serialization for IMAP/JMAP wire formats in /workspace/ocaml-mail-flag/.

Recommendation: Add jsont codecs for keyword lists in both protocol formats.


Documentation Gaps#

| Library | Gap | Location |
|---|---|---|
| json-pointer | README lacks JMAP extended pointers section | Code has it at mli lines 482-581 |
| yamlt | Missing architecture docs on Jsont.t integration | README.md |
| Serialization libs | No unified codec architecture guide | New document needed |
| API clients | No "OCaml HTTP client pattern" guide | New document needed |
| Protocol libs | No cross-library integration examples | New tutorials needed |

Testing Gaps#

| Library | Current State | Recommendation |
|---|---|---|
| publicsuffix | CLI test tool only | Add Alcotest suite |
| bushel | No unit tests | Add roundtrip tests |
| html5rw | Excellent (html5lib conformance) | None needed |
| crockford | Comprehensive | None needed |
| punycode | Comprehensive (39+ cases) | None needed |
| langdetect | Very comprehensive (54 cases) | None needed |

Architecture Notes#

Consistent Patterns (Do Not Change)#

These patterns are working well across the monorepo:

  1. Jsont for JSON - Universal, no variation needed
  2. Eio for async - Consistent Switch-based resource management
  3. Requests + Conpool for HTTP - Well-integrated
  4. Cmdliner for CLI - Excellent man page generation
  5. Bytesrw for streaming I/O - Used consistently

Dependency Ecosystem#

Dependencies shared across the monorepo, with their approximate reach:

| Dependency | Usage |
|---|---|
| jsont | JSON codec (universal) |
| eio | Async primitives (universal) |
| requests | HTTP client (5+ libraries) |
| cmdliner | CLI (all tools) |
| fmt | Pretty-printing (universal) |
| ptime | Timestamps (6+ libraries) |
| uri | URI handling (6+ libraries) |
| tomlt | TOML config (4+ libraries) |
| xdge | XDG directories (4+ libraries) |

Implementation Priority Order#

  1. Extract protocol-session module - Highest ROI, affects 3 production libraries
  2. Unify codec error types - Low effort, improves consistency
  3. Create api_client base - Medium effort, high code savings
  4. Add logging to protocols - Low effort, improves debuggability
  5. Documentation improvements - Ongoing, no code changes
  6. Extract bushel link system - If external reuse is desired
  7. CLI utility extraction - Nice to have
  8. Testing improvements - Ongoing maintenance

Notes#

  • All file paths are absolute and verified as of analysis date
  • Line numbers may shift as code evolves
  • Some recommendations may be superseded by upstream library changes