My aggregated monorepo of OCaml code, automaintained

Add plans for website improvements

Design sketches for four improvements:
- sherlodoc mld page indexing
- custom inline extensions in odoc
- @page-tags producer and consumer
- @figure native block-tag plugin

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

+528
+160
docs/plans/2026-04-15-native-figures.md
# Native `@figure` for .mld pages

**Status:** Planned
**Date:** 2026-04-15

## Problem

Blog posts currently embed figures as raw HTML:

```
{%html:
<figure>
  <a href="…"><img src="parseff.png" alt="…"></a>
  <figcaption><em>A screenshot…</em></figcaption>
</figure>
%}
```

See e.g. `site/blog/2026/04/weeknotes-2026-15.mld:66-71` and several
instances in
`site/blog/2026/04/odoc_and_ocaml_notebooks.mld`.

This is verbose, HTML-only, and loses odoc's semantic layer (captions
can't contain references, no consistent CSS hook, no alt-text
discipline).

Odoc's native `{image:…}` syntax (`odoc/src/parser/token.ml:42,45`)
renders bare `<img>`: no caption, no link wrapping.

## Goal

A block-level tag that produces a `<figure>` with caption, linkable
image, and sensible defaults:

```
@figure parseff.png "A screenshot of the parseff site"
https://jon.ludl.am/experiments/parseff
```

or a multi-line form where the body is the caption (so it can be
formatted):

```
@figure parseff.png
Produced by the {{:…}parseff plugin}. Click for the full site.
```

## Ground truth

- Block-tag extensions receive `Comment.nestable_block_element list`
  as the body (`odoc/src/extension_api/odoc_extension_api.ml:121-148`),
  so the caption **can** contain formatted inlines (bold, italic,
  links, references). This is the key advantage over a code-block
  extension, which gets a raw string
  (`odoc_extension_api.ml:174-185`).
- The existing admonition extension
  (`odoc-admonition-extension/src/admonition_extension.ml:125-146`)
  is the closest pattern: tag + formatted block body → rendered HTML
  with custom class.

## Design

### Syntax

```
@figure <src> [alt="…"] [link="…"] [class="…"]
<caption block: any nestable block content>
```

First line: the image source plus optional key=value attributes.
Rest of the tag body (until next `@tag` or end of section): the
caption. Attributes parsed with a tiny `key="value"` tokenizer.

Alternative: first paragraph of body is the caption; attributes pulled
from a `key=value` word list. Slightly more forgiving but harder to
write a parse error for. Going with explicit attrs on the tag line.

### Rendering

Output:

```html
<figure class="figure {class}">
  <a href="{link}"><img src="{src}" alt="{alt}"></a>  <!-- link optional -->
  <figcaption>{caption}</figcaption>
</figure>
```

If no `link`, emit bare `<img>`. If no `alt`, warn (accessibility).
If no caption, emit `<figure>` without `<figcaption>`.

The caption is rendered by feeding the body blocks back through
`Odoc_document` standard inline/block rendering, so there is no need
to reimplement formatting. Admonitions already do this.

### Asset resolution

The `src` is relative to the `.mld` file. Odoc's existing asset
handling (`{image:…}`) resolves paths against the page's location;
reuse that logic by parsing the src through the same helper rather
than inlining it as raw HTML. Check what admonition does with
references to see if this is straightforward.

### CSS

One block in `odoc_jons_plugins_css.ml`:

```css
figure.figure { margin: 1.5em 0; text-align: center; }
figure.figure img { max-width: 100%; height: auto; }
figure.figure figcaption { font-style: italic; color: #666; }
```

## Sequence

1. Add `Figure` module in
   `odoc-jons-plugins/src/odoc_jons_plugins.ml` using the block-tag
   extension pattern.
2. Write an attribute parser for `key="value"` pairs on the tag's
   first line.
3. Render via the block-tag `to_document` → construct
   `Odoc_document.Types` nodes (or emit raw `<figure>` wrapping
   around already-rendered inner content; see admonition for which
   idiom fits).
4. Resolve `src` through odoc's asset path logic so the generated
   `<img src>` matches what `{image:…}` would produce.
5. CSS.
6. Convert the April 2026 posts from `{%html: <figure>…%}` to
   `@figure`.
7. Warn (don't fail) when `alt` is missing.

## Effort

Small–medium. Most of the machinery already exists in
`admonition_extension.ml`. Biggest risk is getting asset path
resolution right without duplicating odoc internals. ~100–150 LOC.

## Gotchas

- **Attribute parsing in .mld.** Odoc's parser treats the first line
  after `@figure` as prose. The extension body is whatever lands
  between this tag and the next block-level sibling. Two options:
  (a) parse attrs from the first paragraph's raw text, (b) require
  attrs on separate marker lines like `@figure src=…`. (a) is
  friendlier. Prototype and see what the received AST looks like.
- **Link-wrapped images.** If the image links to itself at full
  size (common pattern), default `link` to `src` when absent? Or
  leave explicit. Prefer explicit: less magic.
- **Multiple images per figure.** Out of scope for v1.
- **Non-HTML backends.** Same concern as inline extensions. For now,
  HTML-only is fine; warn on other backends.

## Not doing

- **Full Pandoc-style figure syntax** (`![alt](src "title")`). Keep to
  odoc's `@tag` style.
- **Automatic width/height detection** from image dimensions. Let
  the browser handle it.
- **Gallery/lightbox JS.** Out of scope; can be added later as a
  separate plugin that enhances `.figure` elements.
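
The attribute tokenizer and the rendering rules from the Design section can be sketched in plain OCaml. This is only an illustration under stated assumptions: the `figure` record and the functions `tokenize`, `parse_tag_line`, and `render` are hypothetical names, the output is a raw HTML string rather than `Odoc_document` nodes, and the caption is assumed to arrive already rendered. Asset resolution is not modelled.

```ocaml
(* Hypothetical sketch: none of these names come from odoc or the plugin. *)

type figure = {
  src : string;
  alt : string option;
  link : string option;
  cls : string option;
}

(* Split a tag line into words, keeping double-quoted spans together.
   Quotes are dropped; escape sequences are not supported. *)
let tokenize (line : string) : string list =
  let buf = Buffer.create 16 in
  let words = ref [] in
  let in_quotes = ref false in
  let flush () =
    if Buffer.length buf > 0 then begin
      words := Buffer.contents buf :: !words;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c ->
      match c with
      | '"' -> in_quotes := not !in_quotes
      | (' ' | '\t') when not !in_quotes -> flush ()
      | c -> Buffer.add_char buf c)
    line;
  flush ();
  List.rev !words

(* First word is the image source; every later word must be key=value. *)
let parse_tag_line (line : string) : (figure, string) result =
  match tokenize line with
  | [] -> Error "@figure: missing image source"
  | src :: attrs ->
    let add acc word =
      match acc with
      | Error _ as e -> e
      | Ok fig -> (
        match String.index_opt word '=' with
        | None -> Error ("@figure: expected key=\"value\", got " ^ word)
        | Some i ->
          let key = String.sub word 0 i in
          let value = String.sub word (i + 1) (String.length word - i - 1) in
          match key with
          | "alt" -> Ok { fig with alt = Some value }
          | "link" -> Ok { fig with link = Some value }
          | "class" -> Ok { fig with cls = Some value }
          | _ -> Error ("@figure: unknown attribute " ^ key))
    in
    List.fold_left add (Ok { src; alt = None; link = None; cls = None }) attrs

(* Escape text for use inside an HTML attribute value. *)
let escape (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '&' -> Buffer.add_string buf "&amp;"
      | '<' -> Buffer.add_string buf "&lt;"
      | '>' -> Buffer.add_string buf "&gt;"
      | '"' -> Buffer.add_string buf "&quot;"
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf

(* Bare <img> when there is no link; no <figcaption> when there is no caption. *)
let render ?caption_html (fig : figure) : string =
  let cls =
    match fig.cls with None -> "figure" | Some c -> "figure " ^ escape c
  in
  let alt =
    match fig.alt with
    | None -> ""
    | Some a -> Printf.sprintf " alt=\"%s\"" (escape a)
  in
  let img = Printf.sprintf "<img src=\"%s\"%s>" (escape fig.src) alt in
  let body =
    match fig.link with
    | None -> img
    | Some href -> Printf.sprintf "<a href=\"%s\">%s</a>" (escape href) img
  in
  let caption =
    match caption_html with
    | None -> ""
    | Some c -> "<figcaption>" ^ c ^ "</figcaption>"
  in
  Printf.sprintf "<figure class=\"%s\">%s%s</figure>" cls body caption
```

A caller would check `fig.alt = None` after parsing to emit the accessibility warning. Returning `result` from the parser keeps the "warn, don't fail" decision with the extension rather than the tokenizer.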
+149
docs/plans/2026-04-15-odoc-custom-inlines.md
# Custom Inline Extensions for Odoc

**Status:** Planned
**Date:** 2026-04-15

## Problem

Odoc supports custom block-level tags (`@custom …`) and custom code
blocks (`{@name[ … ]}`) via a plugin registry. There is no equivalent
at the inline level. Authors who want marginal annotations, citations,
keyboard-key styling, or other small inline decorations must fall back
to `{%html: … %}` raw markup, which is verbose, HTML-only, and
bypasses odoc's semantic layer.

## Goal

Let plugins register inline-level handlers, so `.mld` authors can
write something like `{%margin:this is a side note}` and have a plugin
render it to arbitrary inline HTML.

## Core constraint

`inline_element` in `odoc/src/model/comment.ml:35-39` is a **closed
polymorphic variant**. The lexer (`odoc/src/parser/lexer.mll`) has a
hard-coded set of brace commands. There is no extension point at the
inline level today; block-level `@custom` works because the lexer
treats any `@name` as a generic `Custom` token dispatched through the
extension registry at
`odoc/src/extension_api/odoc_extension_api.ml:121-148`.

Adding inline extensibility therefore requires an odoc patch. We can
follow the shape of the block-level registry.

## Design

### 1. AST: `odoc/src/model/comment.ml`

Add an extension variant to `inline_element`:

```ocaml
| `Extension of string * string  (* name, raw payload *)
```

Add the mirror variant to the document IR at
`odoc/src/document/types.ml` so it can survive through to the
renderer.

### 2. Syntax: `{%name:payload}`

Chosen because:

- `{%` is already used only for `{%html: … %}` (raw markup, closed by
  `%}`). The new form uses `}` to close, so the lexer can disambiguate
  by lookahead: `{%html:` → raw markup path; `{%name:` where
  `name ≠ html` → inline extension.
- Does not collide with `{!ref}`, `{{:url} text}`, or any existing
  brace command.
- The `%` cues "injected content", matching the raw-markup
  convention.

Alternative considered: `{:name payload}`. Mirrors `@name` block
syntax nicely but visually close to the `{{:url}}` link form. Sticking
with `{%…}` unless feedback says otherwise.

### 3. Lexer + parser

- `odoc/src/parser/token.ml`: new token
  `Inline_extension of string * string`.
- `odoc/src/parser/lexer.mll`: recognise `{%name:content}` where
  `name` is `[a-z][a-z0-9_.-]*` and `name ≠ html`. Warn and recover on
  unknown-looking forms.
- `odoc/src/parser/syntax.ml`: consume the token inside
  `inline_element`, produce `` `Extension (name, payload) ``.

### 4. Document phase

- `odoc/src/document/comment.ml`: in the inline dispatcher, map AST
  `Extension` to the document IR's `Extension` variant.

### 5. Plugin API: `odoc/src/extension_api/odoc_extension_api.ml`

Add a module type alongside the block-level `Extension`:

```ocaml
module type Inline_Extension = sig
  val prefix : string
  val to_inline : string -> Inline.t
end
```

Add `Registry.register_inline` that stores handlers in a parallel
`Hashtbl` keyed by prefix.

Plugins return `Inline.t` directly (the document IR), so they can emit
styled text, links, raw HTML, or any mix, the same flexibility the
block-level API gives them.

### 6. HTML rendering

- `odoc/src/html/generator.ml:inline`: when the IR has
  `Extension (name, payload)`, look up the handler in the registry and
  splice its `Inline.t` result. Fall back to rendering the payload as
  plain text if no handler is registered (with a warning), so sites
  don't break when a plugin is missing.

### 7. Smoke test: `odoc-jons-plugins`

Register one small inline plugin (e.g. `{%margin:…}`) rendering to a
`<span class="margin-note">`, plus CSS in
`odoc_jons_plugins_css.ml`. Use it in a blog post.

## Sequence

1. AST variant in `model/comment.ml` + `document/types.ml`.
2. Token + lexer rule.
3. Parser wiring in `syntax.ml`.
4. Document-phase mapping in `document/comment.ml`.
5. Plugin API module type + `register_inline`.
6. HTML dispatcher + fallback.
7. Tests: parse, error recovery on unknown/malformed forms, roundtrip,
   handler lookup.
8. Ship `{%margin:…}` via `odoc-jons-plugins`.

## Effort

~150–200 LOC across 7 files. Roughly 2× the sherlodoc patch. The
parser changes are the only genuinely new territory; everything else
mirrors the block-level code path.

## Gotchas

- **Payload escaping.** The payload is "everything until `}`". Need a
  plan for `}` inside payload: either backslash-escape, or disallow
  (plugins can accept references to external content instead).
- **Non-HTML backends.** Odoc also renders to man pages / LaTeX. Either
  implement `Extension` in each backend (with a "plain text of
  payload" default) or restrict this feature to HTML for now and warn
  on other backends.
- **Upstream.** Worth proposing to odoc rather than carrying locally:
  inline extensibility is generally useful and patching the parser is
  costly to maintain out-of-tree.

## Not doing

- **Post-processing `Raw_markup` or `Link`** to fake inline
  extensibility without an odoc patch. Ugly syntax, fragile, and
  breaks non-HTML backends. Rejected.
- **Full attribute grammar** (`{%name attr=foo: payload}`). Start with
  `name + string payload`; plugins can parse the payload themselves.
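
The recognition rule from section 3 and the lookup-with-fallback from section 6 can be sketched outside odoc in plain OCaml. This is a stand-in, not the planned patch: `scan_inline_extension`, `handlers`, `register_inline`, and `render_inline` are hypothetical, handlers return an HTML string where the plan has `Inline.t`, and a `}` inside the payload simply ends it (the open question under "Payload escaping").

```ocaml
(* Hypothetical sketch; this is neither odoc's lexer nor its registry. *)

let is_name_start c = c >= 'a' && c <= 'z'

let is_name_char c =
  is_name_start c || (c >= '0' && c <= '9') || c = '_' || c = '.' || c = '-'

let for_all p s =
  let rec go i = i >= String.length s || (p s.[i] && go (i + 1)) in
  go 0

(* Try to read `{%name:payload}` at the start of [s].
   Returns (name, payload, number of characters consumed).
   `{%html:` is rejected so it can take the raw-markup path instead. *)
let scan_inline_extension (s : string) : (string * string * int) option =
  let len = String.length s in
  if len < 2 || s.[0] <> '{' || s.[1] <> '%' then None
  else
    match String.index_from_opt s 2 ':' with
    | None -> None
    | Some colon ->
      let name = String.sub s 2 (colon - 2) in
      let valid_name =
        name <> "" && name <> "html"
        && is_name_start name.[0]
        && for_all is_name_char name
      in
      if not valid_name then None
      else
        match String.index_from_opt s (colon + 1) '}' with
        | None -> None
        | Some close ->
          let payload = String.sub s (colon + 1) (close - colon - 1) in
          Some (name, payload, close + 1)

(* Handlers keyed by prefix, parallel to the block-level registry. *)
let handlers : (string, string -> string) Hashtbl.t = Hashtbl.create 8

let register_inline ~prefix handler = Hashtbl.replace handlers prefix handler

(* Unknown names fall back to the payload itself, with a warning. *)
let render_inline ~name ~payload =
  match Hashtbl.find_opt handlers name with
  | Some handler -> handler payload
  | None ->
    prerr_endline ("warning: no inline handler registered for " ^ name);
    payload
```

Returning the consumed length lets a surrounding lexer resume right after the closing brace. A real fallback would also need to escape the payload for the target backend.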
+110
docs/plans/2026-04-15-page-tags.md
# `@page-tags` for .mld pages

**Status:** Planned
**Date:** 2026-04-15

## Problem

Pages have no machine-readable tag metadata. Cross-linking related
posts today means hand-written "see also" lists, which rot. A
lightweight `@page-tags foo bar baz` tag would let us (a) surface a
visible tag chip row on each page and (b) power cross-page queries
("all pages tagged `ocaml`", "related posts by shared tags") the same
way `@recent-posts` already does.

## Goal

- Author syntax: `@page-tags ocaml odoc plugins` at the top of any
  `.mld`.
- Visible rendering: a small chip row near the page header.
- Programmatic consumption: a sibling extension (e.g.
  `@tagged-pages ocaml`) that enumerates matching pages.

## Ground truth

- Block-tag extensions register via
  `Odoc_extension_api.Registry.register` and receive
  `Comment.nestable_block_element list` (`odoc-jons-plugins/src/odoc_jons_plugins.ml:383-401`).
- Cross-page data is available at **link phase** via the `Env` API.
  `@recent-posts` uses `Api.Env.lookup_page_by_path` to pull other
  pages' content
  (`odoc-jons-plugins/src/odoc_jons_plugins.ml:687, 696`;
  `odoc/src/extension_api/odoc_extension_api.ml:116, 140-147`).

## Design

Two extensions, one for producing tags, one for consuming them.

### 1. `@page-tags`: producer

Register a block-tag extension with `prefix = "page-tags"`.

- Parse the block content into a flat list of tag tokens (split on
  whitespace; each tag `[a-z0-9][a-z0-9-]*`; warn on anything else).
- Render: a `<div class="page-tags">` with one
  `<a class="tag-chip">` per tag linking to `/tags/<tag>.html` (or
  wherever the index lives; see "Tag index page" below).
- Emit a small CSS block via `odoc_jons_plugins_css.ml`.

### 2. `@tagged-pages <tag>`: consumer

A link-phase extension (same shape as `recent-posts`) that:

- Walks the page tree via `Env`.
- For each page, looks at its raw `Comment.docs` for a top-level
  `@page-tags` block and extracts the tags.
- Emits a bulleted list of pages whose tags include the argument.

This is the mechanism for "quick referencing between pages". A
separate per-tag index page (`/tags/ocaml.mld`) can just be a thin
`.mld` containing `@tagged-pages ocaml`.

### 3. Tag normalisation

Lowercase, trim, dedupe at extraction. Same function used in both the
producer (for rendering) and the consumer (for matching), so that
`Ocaml` and `ocaml` are the same tag.

## Sequence

1. Add `Page_tags` module in
   `odoc-jons-plugins/src/odoc_jons_plugins.ml`: block-tag extension,
   parse + render.
2. Add CSS for `.page-tags` / `.tag-chip` in
   `odoc_jons_plugins_css.ml`.
3. Add `Tagged_pages` link-phase extension, mirroring the structure of
   `Recent_posts`.
4. Factor a shared `extract_tags : Comment.docs -> string list`
   helper so producer and consumer agree.
5. Optional: a `/tags/index.mld` listing every tag with counts; build
   with a third extension `@tag-cloud` or generate offline.
6. Add `@page-tags` to a handful of posts; link from the blog index.

## Effort

Small. Producer is ~30 LOC copying the `hidden_tag_extension`
pattern. Consumer is ~100 LOC copying `Recent_posts`. No odoc patch
needed.

## Gotchas

- **Tag discovery.** The consumer must know which tags exist. Either
  (a) walk all pages in the consumer and collect, or (b) persist tag
  index at build time via a hook. (a) is simpler; (b) is faster if
  the site grows. Start with (a).
- **Positioning in rendering.** Tags ideally render near the page
  header, not wherever the `@page-tags` block physically sits. The
  block-tag extension can't control position directly; simplest is to
  put `@page-tags` at the top of the file and accept the block
  renders in-place. If that's ugly, a post-render shell hook can
  relocate the node.
- **Anchor for "tagged by"**: each rendered chip links to a tag
  index. Decide the URL scheme up front (`/tags/<tag>`) so links
  don't need to be rewritten later.

## Not doing

- **YAML front-matter-style metadata.** Keeping with odoc's native
  `@tag` style so it parses with no syntax extension.
- **Full-text tag search integration with sherlodoc.** Handled
  separately once mld indexing is in.
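
The shared normalisation step from the Design section (split on whitespace, lowercase, validate against `[a-z0-9][a-z0-9-]*`, dedupe) can be sketched in plain OCaml. The sketch assumes tags arrive as one raw string; in the plan they would come from `Comment.docs` via `extract_tags`. The names `is_valid_tag`, `normalise_tags`, and `tagged_pages` are hypothetical, and the page list stands in for the `Env` walk.

```ocaml
(* Hypothetical sketch: tags are taken from a raw string, not Comment.docs. *)

(* A tag matches [a-z0-9][a-z0-9-]*. *)
let is_valid_tag (t : string) : bool =
  let ok_first c = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') in
  let ok_rest c = ok_first c || c = '-' in
  let rec go i = i >= String.length t || (ok_rest t.[i] && go (i + 1)) in
  t <> "" && ok_first t.[0] && go 1

(* Returns (valid tags, rejected tokens). Valid tags are lowercased and
   deduplicated, keeping first-seen order; rejected tokens are returned so
   the caller can warn about them. *)
let normalise_tags (raw : string) : string list * string list =
  let words =
    raw
    |> String.map (function '\t' | '\n' | '\r' -> ' ' | c -> c)
    |> String.split_on_char ' '
    |> List.filter (fun w -> w <> "")
    |> List.map String.lowercase_ascii
  in
  let valid, invalid = List.partition is_valid_tag words in
  let dedupe l =
    List.fold_left (fun acc t -> if List.mem t acc then acc else t :: acc) [] l
    |> List.rev
  in
  (dedupe valid, invalid)

(* Consumer side: keep the pages whose normalised tags include [tag].
   [pages] pairs a page path with its raw @page-tags text. *)
let tagged_pages ~(tag : string) (pages : (string * string) list) : string list =
  let wanted = String.lowercase_ascii (String.trim tag) in
  List.filter_map
    (fun (path, raw_tags) ->
      let tags, _ = normalise_tags raw_tags in
      if List.mem wanted tags then Some path else None)
    pages
```

Because both the producer and `tagged_pages` go through `normalise_tags`, `Ocaml` and `ocaml` resolve to the same tag on both sides.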
+109
docs/plans/2026-04-15-sherlodoc-mld-indexing.md
# Sherlodoc mld Page-Prose Indexing

**Status:** Planned
**Date:** 2026-04-15

## Problem

Sherlodoc does not index `.mld` page content. Its indexer explicitly
drops entries of kind `Doc | Page _ | Dir | Impl` before they reach the
full-text index (`odoc/sherlodoc/index/load_doc.ml:220-226`). Only API
items (values, types, modules) and their docstrings are searchable. A
user searching the site cannot find blog posts, tutorials, or narrative
pages by their content.

## Goal

Make headings (and optionally paragraphs/list items) from `.mld` pages
searchable via sherlodoc, with results that deep-link to the relevant
heading anchor.

## Scope

All changes live inside the vendored `odoc/` tree. No upstream
coordination required; regenerate `.db` after deployment.

Rough size: ~150 LOC across 4–5 files.

## Design

### 1. Extract prose entries: `odoc/src/index/skeleton.ml:338-343`

Today `from_page` emits one `Entry` per page. Walk
`p.content.elements` (`Comment.block_element`) recursively and emit
child entries for:

- `Heading`: reuse its existing `Identifier.Label.t` as the entry
  id (the same label already drives the HTML fragment anchor).
- `Paragraph` / list items: synthesize child ids from the parent
  label plus a counter. (Phase 2; see "Staging" below.)

Attach the new entries as children of the page node in the `Tree` so
hierarchy is preserved.

### 2. Entry kind: `odoc/src/index/entry.ml:45-62` and `odoc/sherlodoc/db/entry.ml:7-34`

Add a `Heading` variant. Reusing `Doc` would work but loses the ability
to badge results distinctly and tune ranking. Since sherlodoc is
vendored, the binary format bump is local to us.

### 3. Unblock the filter: `odoc/sherlodoc/index/load_doc.ml:220-226`

`is_pure_documentation` short-circuits before `register_entry`,
`register_doc`, and `register_full_name`. Remove `Heading` (and
optionally `Doc`) from that guard so prose reaches tokenization. Keep
`Page _` excluded (the whole-page entry is already in the tree) and
`Dir`/`Impl` (not prose).

### 4. URL fragments: `odoc/src/search/html.ml:7-29`

When the entry id is a `Label`, build `<page-url>#<label>`. Odoc
already emits matching `id=` anchors on headings, so no new anchor
logic is needed; just make sure the search URL picks up the fragment.

### 5. Result display: `odoc/sherlodoc/jsoo/odoc_html_frontend.ml:39-52`

Add a `kind_heading` badge. Compose `name` as `"Page title › Heading"`,
leave `rhs = None`, and put a short prose snippet in `doc_html`. All
rendering is server-side OCaml, so no JS changes.

### 6. Ranking: `odoc/sherlodoc/index/load_doc.ml:37-47`

Existing `cost_doc = 100` already ranks prose below API items.
Optionally add a small bonus for top-level headings (Title <
Section < Subsection) so page-level hits float up.

## Sequence

1. Add `Heading` kind in both `entry.ml`s.
2. Recurse in `skeleton.ml`; emit heading entries.
3. Unblock `Heading` in `load_doc.ml`.
4. Compose fragment URL in `search/html.ml`.
5. Add `kind_heading` constant in `odoc_html_frontend.ml`.
6. Regenerate `.db`, test search against the live site.

## Staging

- **Phase 1 (headings only):** smallest useful increment. Low noise,
  high value: users usually search for section titles.
- **Phase 2 (paragraph bodies):** gate behind `--index-prose`. Decide
  after Phase 1 ships whether the added recall outweighs the noise.
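
The extraction walk, the fragment URLs, and the Phase 1 / Phase 2 split can be sketched on a simplified page model. Everything here is an assumption made for illustration: the `block`, `kind`, and `entry` types are stand-ins for `Comment.block_element` and the real entry records, labels are plain strings rather than `Identifier.Label.t`, and `index_prose` mirrors the proposed `--index-prose` flag.

```ocaml
(* Hypothetical sketch on simplified types; not the odoc or sherlodoc API. *)

type block =
  | Heading of { label : string; title : string }
  | Paragraph of string
  | Items of block list list  (* list items, each a sequence of blocks *)

type kind = Kind_heading | Kind_doc

type entry = { kind : kind; id : string; name : string; url : string }

(* Walk a page's blocks in document order.
   Headings always produce an entry whose URL carries the label as fragment.
   Paragraphs (including those inside list items) produce entries only when
   [index_prose] is set; their ids are the nearest preceding heading label
   plus a counter, and they link to that heading. *)
let entries_of_page ?(index_prose = false) ~page_title ~page_url
    (blocks : block list) : entry list =
  let counter = ref 0 in
  let rec walk (parent, acc) block =
    match block with
    | Heading { label; title } ->
      let e =
        { kind = Kind_heading;
          id = label;
          name = page_title ^ " › " ^ title;
          url = page_url ^ "#" ^ label }
      in
      (Some label, e :: acc)
    | Paragraph text when index_prose ->
      incr counter;
      let base = Option.value parent ~default:"page" in
      let url =
        match parent with None -> page_url | Some l -> page_url ^ "#" ^ l
      in
      let e =
        { kind = Kind_doc;
          id = Printf.sprintf "%s-%d" base !counter;
          name = text;
          url }
      in
      (parent, e :: acc)
    | Paragraph _ -> (parent, acc)
    | Items items -> List.fold_left walk (parent, acc) (List.concat items)
  in
  let _, acc = List.fold_left walk (None, []) blocks in
  List.rev acc
```

With the default `index_prose = false` this yields one entry per heading, which matches the Phase 1 scope. Turning the flag on adds paragraph entries without changing the heading entries.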

## Gotchas

- **Anchor stability.** Auto-generated labels for unlabeled headings
  change when you reorder or rename them, so search results will rot.
  Consider requiring explicit `{1:label ...}` syntax on headings that
  should be indexed, or accept some churn.
- **Result flooding.** Indexing every paragraph easily drowns API hits.
  Headings-first avoids this.
- **DB format bump.** Any deployed `.db` must be rebuilt on upgrade.
- **Upstream.** Worth floating the design to the sherlodoc maintainer
  even while we carry the patch locally.

## Not doing

- Building a separate page-only search index (lunr/pagefind). Rejected
  because it splits the search UX.
- Rewriting sherlodoc's ranking model. The existing cost model is good
  enough for a first cut.