Interactive OCaml Tutorials — System Design#

Overview#

A system for authoring web-based, purely client-side interactive OCaml tutorials and exercises. Authors write .mld files in odoc's markup, run dune build @doc, and get HTML pages whose code blocks are live, editable, and executable, backed by a Web Worker running the OCaml toplevel.

The system supports three use cases:

  1. Scrollycode tutorials — step-by-step, scroll-driven code walkthroughs (already prototyped via odoc-scrollycode-extension)
  2. Exercises and assessment — editable skeleton code with immutable test cells, in the style of Jupyter/nbgrader (e.g., Cambridge "Foundations of Computer Science")
  3. Interactive widgets — reactive UI elements (sliders, plots, mini-apps) driven by an FRP library running in the Worker

There is no server-side component beyond serving static files over HTTP.

Architecture (Approach C — Thin Plugin + Smart WebComponent)#

The odoc plugin is deliberately thin. Its job is to translate {@ocaml ...} tagged code blocks into <x-ocaml> HTML elements, carrying the cell type in a mode attribute and the remaining settings in data-* attributes. All interactive behaviour lives in the x-ocaml WebComponent and the js_top_worker backend.

Author writes .mld          odoc plugin            x-ocaml + worker
─────────────────────  ──>  ────────────────  ──>  ─────────────────
{@ocaml exercise           <x-ocaml              WebComponent reads
 id=factorial               mode="exercise"       data attrs, manages
 [let rec facr n = ...]}    data-id="factorial">  UI, sends code to
                            ...                   worker for execution
                            </x-ocaml>

This approach means:

  • The plugin stays simple — pure data transformation
  • Interactive behaviour can be iterated by updating x-ocaml.js without re-running odoc
  • The system works outside odoc (hand-written HTML, other doc generators)

1. Universe Structure#

A universe is a self-consistent set of compiled OCaml packages, discoverable via a findlib_index.json file. OCaml requires that all libraries in a universe are built with exactly the same versions of all transitive dependencies — you cannot mix libraries from different build environments.

This constraint means a central host (e.g., ocaml.org) cannot serve a single universal set of libraries. Different tutorials may need different package combinations. The system supports both self-hosted universes and centrally-hosted ones.

Additionally, OCaml and OxCaml are fundamentally incompatible — each requires its own universe.

Directory layout#

universe/
├── findlib_index.json         # lists META paths for all packages
├── worker.js                  # js_top_worker compiled to JS
├── x-ocaml.js                 # WebComponent runtime
├── stdlib/
│   ├── META
│   ├── dynamic_cmis.json
│   ├── stdlib.cma.js
│   └── *.cmi
├── cmdliner/
│   ├── META
│   ├── dynamic_cmis.json
│   ├── cmdliner.cma.js
│   └── *.cmi
└── .../

Discovery mechanism#

The existing js_top_worker discovery mechanism (findlibish) is reused as-is at runtime, so no new protocol is needed:

  • findlib_index.json — JSON file listing META file paths and pointers to other universes
  • META files — standard findlib metadata with requires, archive, directory fields
  • dynamic_cmis.json — per-library file listing available modules and CMI file prefixes for on-demand loading
  • Universe linking — findlib_index.json can reference other universes via a universes field; the worker resolves dependencies transitively from META files
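The transitive resolution described above can be sketched as a depth-first walk over the requires fields. This is only an illustration, not the js_top_worker implementation: the in-memory map standing in for parsed META files, the function name load_order, and the package name my_tutorial are all assumptions, and linking across universes is not modelled.

```ocaml
module SMap = Map.Make (String)
module SSet = Set.Make (String)

exception Missing of string
exception Cycle of string

(* [metas] is a hypothetical in-memory view of parsed META files: each
   package name maps to the packages named in its [requires] field. *)
let load_order (metas : string list SMap.t) (roots : string list) :
    (string list, string) result =
  (* Depth-first walk; [path] holds the packages currently being expanded. *)
  let rec visit path (seen, order) pkg =
    if SSet.mem pkg seen then (seen, order)
    else if List.mem pkg path then raise (Cycle pkg)
    else
      match SMap.find_opt pkg metas with
      | None -> raise (Missing pkg)
      | Some requires ->
          let seen, order =
            List.fold_left (visit (pkg :: path)) (seen, order) requires
          in
          (* A package is emitted only after everything it requires. *)
          (SSet.add pkg seen, pkg :: order)
  in
  match List.fold_left (visit []) (SSet.empty, []) roots with
  | _, order -> Ok (List.rev order)
  | exception Missing p -> Error ("unknown package: " ^ p)
  | exception Cycle p -> Error ("dependency cycle through: " ^ p)

(* Hypothetical universe with one tutorial package and two dependencies. *)
let example_metas =
  SMap.of_seq
    (List.to_seq
       [ ("my_tutorial", [ "cmdliner"; "astring" ]);
         ("cmdliner", []);
         ("astring", []) ])

let example_order = load_order example_metas [ "my_tutorial" ]
```

Returning a result rather than raising lets the caller surface a missing package as a page-level error before any cell runs.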

Building a universe#

From an opam switch (common case): A CLI tool walks the switch and, for each installed package, copies its META file, compiles the .cma to .cma.js (via js_of_ocaml), and generates dynamic_cmis.json from the .cmi files. It then writes a findlib_index.json listing all packages. The pieces of this tooling partially exist and still need to be connected into one coherent tool.

From day10 (at scale): The CI pipeline builds universes across multiple compiler versions and OS targets, producing the same layout for each coherent package set. day10 manages many co-existing universes with its filesystem hierarchy.

ocaml.org integration#

The doc build pipeline produces both HTML docs and the JS/cmi artifacts needed for interactivity. Each package's tutorial gets a universe consisting of that package plus its transitive dependencies, all already compiled as part of the doc pipeline.

2. Authoring Format#

Page-level configuration#

Custom tags at the top of the .mld file configure the page:

@x-ocaml.universe https://ocaml.org/universe/5.3.0
@x-ocaml.requires cmdliner, astring
@x-ocaml.auto-execute false
@x-ocaml.merlin false

  • universe — URL where the findlib_index.json lives. Falls back to relative ./universe/ if omitted.
  • requires — packages to load at initialization, before any cells run.
  • auto-execute — whether cells run automatically on page load (default: true).
  • merlin — whether Merlin-based LSP feedback is enabled (default: true). Can be overridden per-cell.
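The defaults listed above can be made concrete with a small sketch that folds tag lines into a configuration record. In the real pipeline odoc parses custom tags itself, so the line-based parser here is purely illustrative, and the record type and function names are assumptions.

```ocaml
type page_config = {
  universe : string;       (* where findlib_index.json lives *)
  requires : string list;  (* packages loaded before any cell runs *)
  auto_execute : bool;
  merlin : bool;
}

(* Defaults as stated in the list above. *)
let default_config =
  { universe = "./universe/"; requires = []; auto_execute = true; merlin = true }

(* Split "key value..." at the first space. *)
let split_tag s =
  match String.index_opt s ' ' with
  | None -> (s, "")
  | Some i ->
      ( String.sub s 0 i,
        String.trim (String.sub s (i + 1) (String.length s - i - 1)) )

(* Apply one "@x-ocaml.key value" line; other lines leave [cfg] unchanged. *)
let apply_tag cfg line =
  let prefix = "@x-ocaml." in
  let plen = String.length prefix in
  let line = String.trim line in
  if String.length line < plen || String.sub line 0 plen <> prefix then cfg
  else
    let key, value =
      split_tag (String.sub line plen (String.length line - plen))
    in
    let bool_or default = Option.value (bool_of_string_opt value) ~default in
    match key with
    | "universe" -> { cfg with universe = value }
    | "requires" ->
        let pkgs =
          String.split_on_char ',' value
          |> List.map String.trim
          |> List.filter (fun p -> p <> "")
        in
        { cfg with requires = pkgs }
    | "auto-execute" -> { cfg with auto_execute = bool_or cfg.auto_execute }
    | "merlin" -> { cfg with merlin = bool_or cfg.merlin }
    | _ -> cfg

let parse_config lines = List.fold_left apply_tag default_config lines

(* The four tags from the example above. *)
let example =
  parse_config
    [ "@x-ocaml.universe https://ocaml.org/universe/5.3.0";
      "@x-ocaml.requires cmdliner, astring";
      "@x-ocaml.auto-execute false";
      "@x-ocaml.merlin false" ]
```

Unknown keys and malformed booleans are ignored in this sketch. Whether the plugin should instead report them as errors is a design choice the document leaves open.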

Cell types#

Code blocks use odoc's tagged code block syntax: {@ocaml <attributes> [...code...]}.

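| Attribute   | Purpose                         | Editable? | Visible? |
|-------------|---------------------------------|-----------|----------|
| interactive | Demo/example cell               | No        | Yes      |
| exercise    | Skeleton for student to fill in | Yes       | Yes      |
| test        | Immutable test assertions       | No        | Yes      |
| hidden      | Setup code, runs but not shown  | No        | No       |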

Additional per-cell attributes#

| Attribute | Purpose                                  |
|-----------|------------------------------------------|
| id=name   | Name this cell for explicit linking      |
| for=name  | Link a test cell to a specific exercise  |
| env=name  | Named execution environment              |
| merlin    | Override page-level merlin setting       |

Exercise linking#

Test cells are linked to exercise cells by two mechanisms:

  • Positional (default) — a test cell applies to the nearest preceding exercise cell.
  • Explicit — use id and for attributes when the test is distant or ambiguous.
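The two linking rules can be sketched as a single pass over the cells in document order. The actual resolution happens in the x-ocaml WebComponent, so this OCaml version is only a model of the rule. The cell record and the function name link_tests are assumptions.

```ocaml
type kind = Exercise | Test | Other

type cell = {
  kind : kind;
  id : string option;    (* from id=name *)
  for_ : string option;  (* from for=name *)
}

(* For every test cell, return its index paired with the index of the
   exercise it applies to, or [None] if no exercise can be found. *)
let link_tests (cells : cell list) : (int * int option) list =
  let indexed = List.mapi (fun i c -> (i, c)) cells in
  (* Explicit rule: look up an exercise by its id anywhere on the page. *)
  let find_by_id name =
    List.find_map
      (fun (i, c) ->
        if c.kind = Exercise && c.id = Some name then Some i else None)
      indexed
  in
  let _, links =
    List.fold_left
      (fun (last_exercise, acc) (i, c) ->
        match c.kind with
        | Exercise -> (Some i, acc)
        | Test ->
            let target =
              match c.for_ with
              | Some name -> find_by_id name
              (* Positional rule: nearest preceding exercise. *)
              | None -> last_exercise
            in
            (last_exercise, (i, target) :: acc)
        | Other -> (last_exercise, acc))
      (None, []) indexed
  in
  List.rev links

(* Two exercises, then a positional test and an explicit one. *)
let example =
  link_tests
    [ { kind = Exercise; id = Some "a"; for_ = None };
      { kind = Exercise; id = None; for_ = None };
      { kind = Test; id = None; for_ = None };
      { kind = Test; id = None; for_ = Some "a" } ]
```

A test whose for attribute names no existing exercise resolves to None here, which the runtime could report to the author instead of silently skipping the test.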

Example: assessment worksheet#

{@ocaml hidden [
(* Setup code the student doesn't see *)
let check_positive f =
  assert (f 0 = 1);
  assert (f 1 = 1)
]}

Write an OCaml function [facr] to compute the factorial by recursion.

{@ocaml exercise id=factorial [
let rec facr n =
    (* YOUR CODE HERE *)
    failwith "Not implemented"
]}

{@ocaml test for=factorial [
assert (facr 10 = 3628800);;
assert (facr 11 = 39916800);;
]}

What the odoc plugin emits#

The plugin maps attributes directly to HTML data attributes:

<x-ocaml mode="hidden">
(* Setup code the student doesn't see *)
let check_positive f = ...
</x-ocaml>

<p>Write an OCaml function <code>facr</code> to compute the factorial
by recursion.</p>

<x-ocaml mode="exercise" data-id="factorial">
let rec facr n =
    (* YOUR CODE HERE *)
    failwith "Not implemented"
</x-ocaml>

<x-ocaml mode="test" data-for="factorial">
assert (facr 10 = 3628800);;
assert (facr 11 = 39916800);;
</x-ocaml>

The plugin also injects a <script> tag for x-ocaml.js (once per page) and a <meta> tag for the universe URL.
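As an illustration of how direct this mapping is, the following sketch renders one cell to the markup shown above. It is not the plugin's code: the cell record and function names are assumptions, and escaping the code body for HTML is an assumption of the sketch rather than something the example output shows.

```ocaml
type mode = Interactive | Exercise | Test | Hidden

type cell = {
  mode : mode;
  id : string option;    (* from id=name *)
  for_ : string option;  (* from for=name *)
  code : string;
}

let string_of_mode = function
  | Interactive -> "interactive"
  | Exercise -> "exercise"
  | Test -> "test"
  | Hidden -> "hidden"

(* Escape characters that are special in HTML text and attribute values. *)
let escape_html s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '&' -> Buffer.add_string b "&amp;"
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '"' -> Buffer.add_string b "&quot;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Render an optional attribute, or nothing when it is absent. *)
let attr name = function
  | None -> ""
  | Some v -> Printf.sprintf " %s=\"%s\"" name (escape_html v)

let render_cell c =
  Printf.sprintf "<x-ocaml mode=\"%s\"%s%s>\n%s\n</x-ocaml>"
    (string_of_mode c.mode)
    (attr "data-id" c.id)
    (attr "data-for" c.for_)
    (escape_html c.code)

let example =
  render_cell
    { mode = Test;
      id = None;
      for_ = Some "factorial";
      code = "assert (facr 10 = 3628800);;" }
```

Because the function is pure string construction, the same output could be produced by hand or by another documentation generator, which is what keeps the runtime usable outside odoc.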

3. odoc Plugin#

The plugin is deliberately minimal:

  1. Parse {@ocaml <tags> [...]} blocks
  2. Emit <x-ocaml> HTML elements with tags mapped to data attributes
  3. Inject a <script> tag for x-ocaml.js (once per page)
  4. Emit <meta> tags for page-level configuration (@x-ocaml.* custom tags)

The plugin does not handle:

  • Exercise grouping logic (x-ocaml's job)
  • Universe discovery (WebComponent + worker)
  • Widget wiring (FRP bridge)
  • Cell execution or state management

4. Runtime (x-ocaml WebComponent)#

The x-ocaml WebComponent is where interactive behaviour lives. It:

  • Reads mode, data-id, data-for, data-requires etc. from its HTML attributes
  • Discovers the universe from <meta name="x-ocaml-universe"> or falls back to ./universe/
  • Manages the js_top_worker Web Worker lifecycle
  • Provides CodeMirror editing for exercise cells
  • Provides Merlin integration (configurable)
  • Handles cell dependency ordering and execution
  • Resolves test-to-exercise linking (positional + explicit)

Execution environments#

Cells sharing an env attribute see each other's definitions. By default, all cells on a page share one environment. Named environments allow isolation when needed.
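One way to picture this is to group the cells of a page by environment name while keeping document order inside each group. The sketch below does only that grouping. The cell record, the function name, and the name "default" for the shared page-wide environment are assumptions.

```ocaml
module SMap = Map.Make (String)

type cell = {
  env : string option;  (* from env=name; [None] means the shared default *)
  code : string;
}

(* Hypothetical name for the environment shared by cells without env=. *)
let default_env = "default"

(* Group cell sources by environment, preserving document order. *)
let group_by_env (cells : cell list) : string list SMap.t =
  List.fold_left
    (fun acc c ->
      let name = Option.value c.env ~default:default_env in
      SMap.update name
        (function
          | None -> Some [ c.code ]
          | Some codes -> Some (c.code :: codes))
        acc)
    SMap.empty cells
  |> SMap.map List.rev

let example =
  group_by_env
    [ { env = None; code = "let x = 1" };
      { env = Some "isolated"; code = "let x = 2" };
      { env = None; code = "let y = x + 1" } ]
```

In this example the third cell sees x = 1 because it shares the default environment with the first cell, while the definition in the isolated environment stays separate.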

5. Widget/FRP Bridge (Experimental)#

Architecture#

Backend-authoritative, inspired by Marimo's model. The Worker owns all state: view descriptions flow out via postMessage, and user events flow back in.

Worker                          Main Thread (x-ocaml)
┌──────────────────┐            ┌─────────────────────┐
│ OCaml code       │            │                     │
│   ↓              │  view      │                     │
│ FRP library      │  desc.     │  Render to DOM      │
│   ↓              │ ────────>  │   ↓                 │
│ Serializable     │            │  Real DOM           │
│ view description │  events    │   ↓                 │
│                  │ <────────  │  User interaction   │
└──────────────────┘            └─────────────────────┘
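The round trip in the diagram can be modelled without any FRP library as a step function that takes the current state and an incoming event and returns the new state plus the message to send back. This is a sketch of the message flow only. The message types, the handler numbering, and the plain string standing in for a serialized view description are all assumptions.

```ocaml
(* Message from the main thread to the Worker: a symbolic handler fired. *)
type to_worker = Event of { handler_id : int }

(* Message from the Worker to the main thread. The string is a stand-in
   for a serialized view description. *)
type to_main = Render of string

type state = { count : int }

let initial = { count = 0 }

let view s = Render (Printf.sprintf "count: %d" s.count)

(* The Worker is the only place where state changes. Handler 0 is the
   (hypothetical) increment button; other handlers are ignored. *)
let step s (Event { handler_id }) =
  let s = if handler_id = 0 then { count = s.count + 1 } else s in
  (s, view s)

(* Replay a list of events, returning the final state and last message. *)
let run events =
  List.fold_left (fun (s, _) ev -> step s ev) (initial, view initial) events

let example =
  run [ Event { handler_id = 0 }; Event { handler_id = 0 }; Event { handler_id = 7 } ]
```

The main thread never computes the next count itself. It forwards the event and renders whatever comes back, which is what makes the model backend-authoritative.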

Reactivity library (to be determined)#

Two candidates:

Lwd (Frédéric Bour): lightweight incremental computation. Lwd.var (mutable inputs) and Lwd.t (derived values) with monadic/applicative composition. Natural for building tree-structured view descriptions. Actively maintained (v0.4, May 2025). Multiple existing backends (terminal, web).

Note (Daniel Bünzli): classic FRP with events (E) and signals (S). Designed explicitly for js_of_ocaml (no weak references). Cleaner model for discrete user interactions.

View description format#

A custom serializable type — no closures, no JS object references. Event handlers represented as symbolic descriptors. Inspired by ocaml-vdom's pure Vdom module and TyXML's functorial architecture.

Optional: instantiate TyXML's functors over the serializable type for type-safe HTML construction.
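To make the constraints above concrete, here is one possible shape for such a type, with event handlers reduced to integer identifiers that the Worker can map back to callbacks. It is a sketch under assumptions, not a proposed format: the type, the field names, and the JSON encoding are all hypothetical, and the serializer is hand-written to keep the example free of dependencies.

```ocaml
(* A handler is a symbolic descriptor: event name plus an identifier that
   only the Worker knows how to resolve. No closures cross the boundary. *)
type handler = { event : string; handler_id : int }

type view =
  | Text of string
  | Element of {
      tag : string;
      attrs : (string * string) list;
      handlers : handler list;
      children : view list;
    }

(* Minimal JSON string escaping, sufficient for this sketch. *)
let json_string s =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (function
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      | '\n' -> Buffer.add_string b "\\n"
      | c when Char.code c < 0x20 ->
          Buffer.add_string b (Printf.sprintf "\\u%04x" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b

let rec to_json = function
  | Text s -> Printf.sprintf "{\"text\":%s}" (json_string s)
  | Element { tag; attrs; handlers; children } ->
      let attrs =
        attrs
        |> List.map (fun (k, v) -> json_string k ^ ":" ^ json_string v)
        |> String.concat ","
      in
      let handlers =
        handlers
        |> List.map (fun h ->
               Printf.sprintf "{\"event\":%s,\"id\":%d}" (json_string h.event)
                 h.handler_id)
        |> String.concat ","
      in
      let children = children |> List.map to_json |> String.concat "," in
      Printf.sprintf
        "{\"tag\":%s,\"attrs\":{%s},\"handlers\":[%s],\"children\":[%s]}"
        (json_string tag) attrs handlers children

(* A counter button whose click is routed to hypothetical handler 0. *)
let counter_button =
  Element
    { tag = "button";
      attrs = [ ("class", "x") ];
      handlers = [ { event = "click"; handler_id = 0 } ];
      children = [ Text "+1" ] }
```

Since the value contains only strings, integers, and lists, it can be sent through postMessage as text and rebuilt into DOM nodes on the main thread.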

Proposed experiments#

  1. Experiment A — Counter with Lwd: Minimal counter (button + display). Lwd in Worker produces serializable view, main thread renders, click events sent back.

  2. Experiment B — Counter with Note: Same counter using Note's signals and events. Compare code ergonomics.

  3. Experiment C — Richer widget: Multiple interacting controls (e.g., two sliders controlling a computed value). Tests composition in each library.

  4. Experiment D — TyXML integration: Instantiate TyXML's functors over a serializable backend. Evaluate whether the type safety is worth the complexity.

Evaluation criteria:

  • Code ergonomics
  • Serialization format (what does the view description look like on the wire?)
  • Event round-trip latency
  • Bundle size impact on worker.js

Summary of Components#

| Component                  | Status           | Purpose                                    |
|----------------------------|------------------|--------------------------------------------|
| js_top_worker              | Exists           | OCaml toplevel in a Web Worker             |
| x-ocaml                    | Exists           | WebComponent for interactive code cells    |
| odoc (fork)                | Exists           | Doc generator with extension plugin system |
| odoc-scrollycode-extension | Exists           | Scroll-driven code tutorials               |
| odoc-interactive-extension | Exists           | Thin plugin: tags → x-ocaml elements       |
| Universe builder (opam)    | Exists           | opam switch → hostable artifacts           |
| Universe builder (day10)   | Partially exists | At-scale universe management               |
| Widget/FRP bridge          | To design        | Experiments needed (Lwd vs Note)           |

Open Questions#

  • Exercise grouping: Should cells be explicitly grouped into named exercises, or is implicit grouping from document structure sufficient? Needs prototyping.
  • FRP library choice: Lwd vs Note — to be resolved by experiments.
  • View serialization format: Depends on FRP library choice. Virtual DOM diffs vs structured widget protocol vs something else.
  • TyXML integration: Worth the complexity? Experiment D will tell.