···
 A human-friendly DSL for ATProto Lexicons

-**This is a work in progress, things are subject to break and change**
+*This is a work in progress, things are subject to break and change*

 ## What it looks like

···
 };
 ```

-## Getting started
+## Installation

-### Install
+Right now you can only install mlf from source:

 ```bash
 # Install with all code generators (default: TypeScript, Go, Rust)
···
 cargo install --path mlf-cli --no-default-features
 ```

-### Generate code from MLF
-
-```bash
-# Generate TypeScript types
-mlf generate code -g typescript -i examples/**/*.mlf -o output/
-
-# Generate Go structs
-mlf generate code -g go -i examples/**/*.mlf -o output/
-
-# Generate Rust structs with serde
-mlf generate code -g rust -i examples/**/*.mlf -o output/
-
-# Generate JSON lexicons (always available)
-mlf generate code -g json -i examples/**/*.mlf -o output/
-# Or use the legacy command:
-mlf generate lexicon -i examples/**/*.mlf -o output/
-```
-
-### Validate MLF files
-
-```bash
-mlf check examples/app.bsky.feed.post.mlf
-```
-
-### Validate JSON records
-
-```bash
-mlf validate examples/app.bsky.feed.post.mlf record.json
-```
-
-### Convert JSON lexicons to MLF
-
-Convert existing ATProto JSON lexicons to MLF format:
+## Documentation

-```bash
-# Convert a single lexicon
-mlf generate mlf -i my-lexicon.json -o ./
+Visit the [MLF website](https://mlf.lol/docs) for comprehensive documentation, guides, and examples.

-# Convert multiple lexicons
-mlf generate mlf -i "dist/lexicons/**/*.json" -o src/lexicons/
-```
+## Architecture

-This is useful for:
-- Migrating existing JSON lexicons to the MLF format
-- Learning MLF syntax by comparing JSON and MLF
-- Working with lexicons from external sources
+Please review [ARCHITECTURE.md](ARCHITECTURE.md) for an overview of how the project is structured.

-## Project layout
+## License

-```
-mlf/
-├── mlf-cli/          # Command-line app
-├── mlf-lang/         # Parser and lexer (no_std compatible)
-├── mlf-codegen/      # Core code generation with plugin system
-├── codegen-plugins/  # Language-specific code generators
-│   ├── mlf-codegen-typescript/  # TypeScript generator
-│   ├── mlf-codegen-go/          # Go generator
-│   └── mlf-codegen-rust/        # Rust generator
-├── mlf-validation/   # Lexicon validation
-├── mlf-diagnostics/  # Fancy error reporting
-├── mlf-wasm/         # WASM bindings for browser use
-├── tree-sitter-mlf/  # Tree-sitter grammar for syntax highlighting
-└── website/          # Docs and playground
-    └── mlf-playground-wasm/     # Playground WASM with all generators
-```
-
-## Documentation
-
-Full documentation available at the [MLF website](https://mlf.lol) (or run `just serve` in `website/`).
-
-See [SPEC.md](SPEC.md) for the complete language specification.
+MIT

SPEC.md (deleted in this PR, -975 lines)
# MLF (Matt's Lexicon Format) Specification

## Overview

MLF is a domain-specific language (DSL) for writing ATProto Lexicons with 100% fidelity to the [AT Protocol Lexicon specification](https://atproto.com/specs/lexicon). It provides a more ergonomic, type-safe syntax for defining records, queries, procedures, and types.

## Design Goals

1. **100% ATProto Fidelity**: Every valid ATProto Lexicon can be represented in MLF
2. **Human-Readable**: Clear, concise syntax that's easy to read and write
3. **no_std Compatible**: Core parser can run in constrained environments
4. **Tooling-Friendly**: Enable validation, code generation, and formatting

## File Structure

### File Extension

- `.mlf` - MLF source files

### Shebang (Optional)

```mlf
#!/usr/bin/env mlf
```

The `#` character is reserved for shebangs only and is not used elsewhere in the syntax.

### File Naming Convention

The file path determines the lexicon NSID. Files should follow the lexicon NSID structure:

- `app.bsky.feed.post.mlf` → Lexicon NSID: `app.bsky.feed.post`
- `sh.tangled.repo.issue.mlf` → Lexicon NSID: `sh.tangled.repo.issue`

The lexicon NSID is derived solely from the filename, not from any internal namespace declarations.

## Core Concepts

### NSIDs (Namespaced Identifiers)

NSIDs use dotted notation:

```
app.bsky.feed.post
com.example.thing
sh.tangled.repo.issue
```

- Format: `authority.name(.name)*`
- Authority: Typically a reversed domain name
- Segments: Lowercase letters, numbers, hyphens (no underscores)
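
The segment rules above can be sketched as a small validator. This is an illustrative assumption for this spec, not mlf-lang's actual API, and it only encodes the rules stated here (lowercase letters, digits, hyphens, at least an authority plus one name segment):

```rust
// Sketch only: check one NSID segment against the stated character rules.
fn valid_nsid_segment(seg: &str) -> bool {
    !seg.is_empty()
        && seg
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}

// A full NSID needs a (typically multi-segment) authority plus a name,
// so in practice at least three dotted segments.
fn valid_nsid(nsid: &str) -> bool {
    let segments: Vec<&str> = nsid.split('.').collect();
    segments.len() >= 3 && segments.iter().all(|s| valid_nsid_segment(s))
}
```

The full NSID grammar (e.g. rules for the final name segment) is stricter; see the AT Protocol specification.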
### Lexicon Resolution

References to definitions can be:

1. **Local (same file)**: Just use the name
   ```mlf
   record myRecord {
     field: myType // References type in same file
   }

   def type myType = { /* ... */ }
   ```

2. **Cross-file (different lexicon)**: Use full dotted path
   ```mlf
   record myRecord {
     profile: app.bsky.actor.profile // References app/bsky/actor/profile.mlf
     author: com.example.user.author // References com/example/user/author.mlf
   }
   ```

**Note**: The `#` character is NOT used for references. All references use dotted notation.

### Syntax Rules

#### Semicolons

All definitions require semicolons:

- `record` definitions end with `};`
- `use` statements end with `;`
- `token` definitions end with `;`
- `inline type` definitions end with `;`
- `def type` definitions end with `;`
- `query` definitions end with `;`
- `procedure` definitions end with `;`
- `subscription` definitions end with `;`

#### Commas

Commas are **required** between items, with **trailing commas allowed**:

- **Record fields**: Commas required between fields, trailing comma allowed
  ```mlf
  record example {
    field1: string,
    field2: integer, // trailing comma allowed
  }
  ```

- **Constraints**: Commas required between constraint properties, trailing comma allowed
  ```mlf
  title: string constrained {
    maxLength: 200,
    minLength: 1, // trailing comma allowed
  }
  ```

- **Error definitions**: Commas required between errors, trailing comma allowed
  ```mlf
  query getThread(): thread | error {
    NotFound,
    BadRequest, // trailing comma allowed
  }
  ```

## Type System

### Primitive Types

```mlf
null    // Null value
boolean // True or false
integer // 64-bit integer
string  // UTF-8 string
bytes   // Byte array
```

**Note:** ATProto Lexicons do not support floating-point numbers. Only `integer` is available for numeric values.

### Special String Formats

Defined in `prelude.mlf` and available everywhere:

```mlf
Did          // Decentralized Identifier (did:*)
AtUri        // AT-URI (at://...)
AtIdentifier // Either a DID or Handle
Handle       // Handle identifier (domain name)
Datetime     // ISO 8601 datetime
Uri          // Generic URI
Cid          // Content Identifier
Nsid         // Namespaced Identifier
Tid          // Timestamp Identifier
RecordKey    // Record key
Language     // BCP 47 language code
```
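
As a rough illustration of what one of these format checks involves, a `Did` value must look like `did:<method>:<identifier>`. The sketch below is an assumption made for this document, not the prelude's actual validator, and real DID validation is considerably stricter:

```rust
// Hedged sketch: a loose shape check for Did values (did:<method>:<id>).
fn looks_like_did(s: &str) -> bool {
    let mut parts = s.splitn(3, ':');
    parts.next() == Some("did")
        && parts
            .next()
            .map_or(false, |m| !m.is_empty() && m.chars().all(|c| c.is_ascii_lowercase()))
        && parts.next().map_or(false, |id| !id.is_empty())
}
```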
### Blob Types

```mlf
blob // Generic blob
```

With constraints:

```mlf
avatar: blob constrained {
  accept: ["image/png", "image/jpeg"]
  maxSize: 1000000 // bytes
}
```

### Unknown Type

```mlf
unknown // Represents any value, used for forward compatibility
```

## Definitions

### Records

Records are the primary data structure, stored in repositories:

```mlf
record post {
  text!: string constrained {
    maxLength: 300
    maxGraphemes: 300
  }
  createdAt!: Datetime
  reply: replyRef // Optional field (default)
}
```

### Type Definitions

MLF supports two kinds of type definitions:

**Inline Types** - Expanded at the point of use, never appear in generated lexicon defs:

```mlf
inline type AtIdentifier = string constrained {
  format: "at-identifier"
};
```

**Def Types** - Become named definitions in the lexicon's defs block:

```mlf
def type ReplyRef = {
  root!: AtUri
  parent!: AtUri
};
```

Use `inline type` for type aliases that should be expanded inline (like primitive type wrappers). Use `def type` for types that should be referenced by name in the generated lexicon.

### Tokens

Tokens are named constants used in enums and unions:

```mlf
/// Open state
token open;

/// Closed state
token closed;

record issue {
  state!: string constrained {
    knownValues: [
      open   // References token defined above
      closed
    ]
    default: "open"
  }
}
```

Tokens must have doc comments describing their purpose.

### Queries

Queries are read-only HTTP endpoints (GET):

```mlf
/// Get a user profile
query getProfile(
  /// The actor's DID or handle
  actor!: AtIdentifier
  /// Optional viewer context (default)
  viewer: Did
): profileView | error {
  /// Profile not found
  ProfileNotFound
  /// Invalid request parameters
  BadRequest
};
```

### Procedures

Procedures are write operations (POST):

```mlf
/// Create a new post
procedure createPost(
  text!: string
  createdAt!: Datetime
): {
  uri!: AtUri
  cid!: Cid
} | error {
  /// Text exceeds maximum length
  TextTooLong
};
```

### Subscriptions

Subscriptions are WebSocket-based event streams that emit messages over time. They are used for real-time updates and event notifications.

```mlf
/// Subscribe to repository events
subscription subscribeRepos(
  /// Optional cursor for resuming from a specific point (default)
  cursor: integer
): commit | identity | handle | migrate | tombstone | info;
```

**Message definitions** for subscriptions are defined as def types or records:

```mlf
/// Commit message emitted by subscribeRepos
def type commit = {
  seq!: integer
  rebase!: boolean
  tooBig!: boolean
  repo!: Did
  commit!: Cid
  rev!: string
  since!: string
  blocks!: bytes
  ops!: repoOp[]
  blobs!: Cid[]
  time!: Datetime
};

/// Info message
def type info = {
  name!: string
  message: string // Optional (default)
};
```

**Subscription features:**

- Parameters: Like queries, subscriptions can have parameters
- Return type: A union of message types that can be emitted
- Each message type must be defined as a def type or record
- Message types can be local or imported from other lexicons
- Subscriptions are long-lived WebSocket connections
- No error block (errors are handled at the WebSocket protocol level)

**Example: Chat message subscription**

```mlf
/// Subscribe to chat messages for a stream
subscription subscribeChat(
  /// The DID of the streamer
  streamer!: Did
  /// Optional cursor to resume from (default)
  cursor: string
): message | delete | join | leave;

/// Chat message payload
def type message = {
  id!: string
  text!: string
  author!: Did
  createdAt!: Datetime
};

/// Delete event payload
def type delete = {
  id!: string
};

/// Join event payload
def type join = {
  user!: Did
};

/// Leave event payload
def type leave = {
  user!: Did
};
```

### Return Types

Queries and procedures can return:

1. **Simple success**: `(): returnType`
2. **Success with errors**: `(): successType | error { ErrorName, ... }`
   - Each error must have a doc comment describing it
3. **Unknown/empty**: `(): unknown`

## Type Modifiers

### Optional and Required Fields

Fields are **optional by default**. Use `!:` to mark a field as required:

```mlf
record example {
  optional: string  // Optional (default)
  required!: string // Required (marked with !)
}
```

### Arrays

```mlf
record example {
  tags: string[]
  items: string[] constrained {
    minLength: 1
    maxLength: 10
  }
}
```

### Unions

Use the pipe operator `|`. Unions are **open by default** (allowing unknown types):

```mlf
record example {
  // Open union (default, can include unknown types)
  content: text | image | video

  // Union of tokens (also open by default)
  state: open | closed | pending
}
```

Closed unions (only allowing listed types) use `| !`:

```mlf
record example {
  // Closed union (marked with !, only these types allowed)
  content: text | image | video | !
}
```

### References

Reference local or external definitions:

```mlf
// Local reference (same file)
record post {
  author: author // References 'def type author' in same file
}

// Cross-file reference
record post {
  profile: app.bsky.actor.profile // References app/bsky/actor/profile.mlf
}
```

## Constraints

Constraints refine types by adding additional restrictions. A key principle is that constraints can only make types **more restrictive**, never less restrictive. This ensures type safety and proper substitutability.

### Constraint Refinement Rules

When applying constraints, each constraint must be **at least as restrictive** as any parent constraint:

```mlf
// Valid: More restrictive constraints
def type shortString = string constrained {
  maxLength: 100
};

record post {
  // Can further constrain to 50 (more restrictive than 100)
  title: shortString constrained {
    maxLength: 50 // ✓ Valid: 50 ≤ 100
  }
}

// Invalid: Less restrictive constraints
record invalid {
  // ERROR: Cannot expand to 200 (less restrictive than 100)
  content: shortString constrained {
    maxLength: 200 // ✗ Invalid: 200 > 100
  }
}
```

**Refinement rules by constraint type:**

- **Numeric bounds**: `minimum` can only increase, `maximum` can only decrease
- **Length bounds**: `minLength`/`minGraphemes` can only increase, `maxLength`/`maxGraphemes` can only decrease
- **Enums**: Can only restrict to a subset of values
- **Known values**: Can add new values (extensible) but cannot remove specified ones
- **Format**: Cannot change once specified
- **Defaults**: Can be specified if not already set
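
The length-bound rules above reduce to a small comparison. This is a sketch under the stated rules, not the mlf-validation crate's real interface:

```rust
// Sketch: a child max bound refines a parent bound only if it is at
// least as restrictive. None means "no bound declared at this level".
fn refines_max(parent: Option<u64>, child: Option<u64>) -> bool {
    match (parent, child) {
        (Some(p), Some(c)) => c <= p, // may only tighten the ceiling
        (Some(_), None) => true,      // child inherits the parent bound
        (None, _) => true,            // unbounded parent accepts any bound
    }
}

// Min bounds go the other way: the child may only raise the floor.
fn refines_min(parent: Option<u64>, child: Option<u64>) -> bool {
    match (parent, child) {
        (Some(p), Some(c)) => c >= p,
        _ => true,
    }
}
```

With these helpers, the `shortString` example above checks out: refining `maxLength` from 100 to 50 passes, while expanding it to 200 fails.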
### String Constraints

```mlf
field: string constrained {
  minLength: 1      // Minimum byte length
  maxLength: 1000   // Maximum byte length
  minGraphemes: 1   // Minimum grapheme clusters
  maxGraphemes: 100 // Maximum grapheme clusters
  format: "uri"     // Format validation
  enum: ["a", "b", "c"] // Allowed values (closed set) - string literals
  knownValues: [    // Known values (extensible set) - string literals OR token references
    value1   // Token reference
    "value2" // String literal
  ]
  default: "defaultValue" // Default value
}
```

**Note**: `enum`, `knownValues`, and `default` can accept either:

- **Literals**: `"open"`, `42`, `true` (string, integer, or boolean)
- **References**: `open`, `myType` (references to tokens, records, types, etc.)

When using references, the identifier will be resolved to its string representation in the generated lexicon.

### Integer Constraints

```mlf
field: integer constrained {
  minimum: 0
  maximum: 100
  enum: [1, 2, 3]
  default: 1
}
```

### Array Constraints

```mlf
field: string[] constrained {
  minLength: 1
  maxLength: 10
}
```

### Blob Constraints

```mlf
field: blob constrained {
  accept: ["image/png", "image/jpeg"] // MIME types
  maxSize: 1000000 // Bytes
}
```

### Boolean Constraints

```mlf
field: boolean constrained {
  default: false
}
```

## Comments

### Documentation Comments

Use `///` for documentation (appears in generated docs/code):

```mlf
/// A user profile record
record profile {
  /// The user's display name
  displayName: string
}
```

### Regular Comments

Regular comments (`//`) are ignored when processing and will have no impact on any output.

## Annotations

Annotations use the `@` symbol and are metadata markers for external tooling. MLF itself assigns no semantic meaning to annotations; they are purely for tools, linters, code generators, and other processors to interpret.

### Annotation Syntax

Three forms of annotations are supported:

**1. Simple annotation:**

```mlf
@deprecated
record oldRecord {
  field: string
}
```

**2. Positional arguments:**

```mlf
@since(1, 2, 0)
@doc("https://example.com/docs")
record example {
  field: string
}
```

Arguments can be:

- Strings: `"value"`
- Numbers: `42`, `3.14`
- Booleans: `true`, `false`

**3. Named arguments:**

```mlf
@validate(min: 0, max: 100, strict: true)
@codegen(language: "rust", derive: "Debug, Clone")
record example {
  field: integer
}
```

### Annotation Placement

Annotations can be placed on:

- Records
- Inline Types
- Def Types
- Tokens
- Queries
- Procedures
- Subscriptions
- Fields within records/types

```mlf
/// A user profile
@table(name: "profiles", indexes: "did,handle")
record profile {
  /// User's DID
  @indexed
  did!: Did

  /// Display name (optional)
  @sensitive(pii: true)
  displayName: string
}
```

### Common Annotation Examples

```mlf
// Deprecation
@deprecated
@deprecated(since: "2.0.0", replacement: "newRecord")
record oldRecord { /* ... */ }

// Code generation hints
@derive("Debug, Clone, Serialize")
@table(name: "users")
record user { /* ... */ }

// Validation
@validate(custom: "validateEmail")
@range(min: 0, max: 100)
field: integer

// Documentation
@example("did:plc:abc123")
@see("https://atproto.com/specs/did")
field: Did

// Versioning
@since(1, 0, 0)
@unstable
record experimentalFeature { /* ... */ }
```

**Note:** The interpretation of annotations is entirely up to the tooling consuming the MLF. Different tools may support different annotation sets.

## Use Statements

Import definitions from other lexicons:

```mlf
// Named imports
use app.bsky.actor.{profile, profileView};
use sh.tangled.repo.issue.{issue, open, closed};

// Alias entire namespace
use app.bsky.actor as Actor;

// Wildcard import
use app.bsky.feed.*;

// Mixed
use sh.tangled.repo.issue.{issue as IssueRecord, open, closed};
```

After importing, use the short name:

```mlf
use app.bsky.actor.profile;

record myThing {
  author: profile // Instead of app.bsky.actor.profile
}
```

## Lexicon Discovery & Resolution

### File Discovery

Tools discover lexicons via explicit paths: a single file, a list of files, or a glob pattern.

```bash
mlf validate app.bsky.feed.post.mlf
mlf validate *.mlf
mlf validate "**/*.mlf"
```

### Resolution Order

When resolving cross-file references:

1. Current file (local definitions)
2. Explicitly imported lexicons (via `use`)
3. Configured lexicon paths
4. (Future) Remote fetch via ATProto

### File Path Convention

The lexicon NSID is determined by the file path. Lexicons can follow a directory structure matching their NSID:

```
lexicons/
  app/
    bsky/
      actor/
        profile.mlf → app.bsky.actor.profile
      feed/
        post.mlf → app.bsky.feed.post
  com/
    example/
      thing.mlf → com.example.thing
```

Or use a flat structure with dots in the filename:

```
lexicons/
  app.bsky.actor.profile.mlf
  app.bsky.feed.post.mlf
  com.example.thing.mlf
```

In both cases, the NSID is derived from the file path, not from internal declarations.
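
Both layouts reduce to the same derivation. A minimal sketch, using an assumed helper name rather than the CLI's actual code:

```rust
use std::path::Path;

// Sketch: derive a lexicon NSID from a file path relative to a lexicon
// root. Handles both nested (app/bsky/feed/post.mlf) and flat
// (app.bsky.feed.post.mlf) layouts.
fn nsid_from_path(root: &Path, file: &Path) -> Option<String> {
    let rel = file.strip_prefix(root).ok()?;
    let stem = rel.with_extension(""); // drop the .mlf extension
    let parts: Vec<String> = stem
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join(".")) // flat filenames already contain the dots
}
```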

## CLI Commands

```bash
# Generation
mlf generate code --input "**/*.mlf" --plugin rust src/*
mlf generate lexicon --input "**/*.mlf" lexicons/*
mlf generate example --input "**/*.mlf" --count 5 examples/*

# Convert JSON lexicons to MLF
mlf generate mlf --input "lexicons/**/*.json" --output ./mlf/

# Validate lexicons
mlf validate <files|globs>
mlf validate "**/*.mlf"

# Format lexicons
mlf fmt <files|globs>

# Validate a record against a lexicon
mlf check --input app.bsky.feed.post.mlf ./record.json
```

### JSON to MLF Conversion

The `mlf generate mlf` command converts ATProto JSON lexicons back to MLF format. This is useful for:

- **Migration**: Converting existing JSON lexicons to MLF
- **Interoperability**: Working with lexicons from external sources
- **Learning**: Seeing how JSON lexicons map to MLF syntax
- **Comparison**: Generating MLF from JSON to compare with hand-written MLF

The converter automatically:

- Converts format strings (did, datetime, handle) to prelude types (Did, Datetime, Handle)
- Properly formats required (`!`) and optional (default) fields
- Converts `namespace#name` references to `namespace.name` notation
- Generates clean, properly indented MLF with correct syntax

## Examples

### Complete Lexicon Example

```mlf
#!/usr/bin/env mlf

use app.bsky.actor.profile;

/// Open issue state
token open;

/// Closed issue state
token closed;

/// An issue in a repository
record issue {
  /// The repository this issue belongs to
  repo!: AtUri
  /// Issue title
  title!: string constrained {
    minGraphemes: 1
    maxGraphemes: 200
  }
  /// Issue body (markdown)
  body: string constrained {
    maxGraphemes: 10000
  }
  /// Issue state
  state!: string constrained {
    knownValues: [
      open
      closed
    ]
    default: "open"
  }
  /// Creation timestamp
  createdAt!: Datetime
}

/// A comment on an issue
record comment {
  /// The issue this comment belongs to
  issue!: AtUri
  /// Comment body (markdown)
  body!: string constrained {
    minGraphemes: 1
    maxGraphemes: 10000
  }
  /// Creation timestamp
  createdAt!: Datetime
  /// Optional reply target
  replyTo: AtUri
}

/// Get an issue by URI
query getIssue(
  /// Issue AT-URI
  uri!: AtUri
): issue | error {
  /// Issue not found
  NotFound
};

/// Create a new issue
procedure createIssue(
  repo!: AtUri
  title!: string
  body: string // Optional (default)
): {
  uri!: AtUri
  cid!: Cid
} | error {
  /// Repository not found
  RepoNotFound
  /// Title too long
  TitleTooLong
};
```

## ATProto Mapping

### MLF → JSON Lexicon

MLF compiles to standard ATProto JSON Lexicons:

**MLF:**

```mlf
record post {
  text!: string constrained {
    maxLength: 300
  }
  createdAt!: Datetime
}
```

**JSON:**

```json
{
  "lexicon": 1,
  "id": "app.bsky.feed.post",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "record": {
        "type": "object",
        "required": ["text", "createdAt"],
        "properties": {
          "text": {
            "type": "string",
            "maxLength": 300
          },
          "createdAt": {
            "type": "string",
            "format": "datetime"
          }
        }
      }
    }
  }
}
```

### Subscription Mapping

**MLF:**

```mlf
subscription subscribeRepos(
  cursor: integer // Optional (default)
): commit | identity;
```

**JSON:**

```json
{
  "lexicon": 1,
  "id": "com.atproto.sync.subscribeRepos",
  "defs": {
    "main": {
      "type": "subscription",
      "parameters": {
        "type": "params",
        "properties": {
          "cursor": {
            "type": "integer"
          }
        }
      },
      "message": {
        "schema": {
          "type": "union",
          "refs": ["#commit", "#identity"]
        }
      }
    },
    "commit": {
      "type": "object",
      "required": ["seq", "repo", "commit"],
      "properties": {
        "seq": { "type": "integer" },
        "repo": { "type": "string", "format": "did" },
        "commit": { "type": "string", "format": "cid" }
      }
    }
  }
}
```

## Future Considerations

### Potential Extensions

- **Version constraints**: Specify compatible lexicon versions in lexicon headers
- **Custom validation**: Pluggable validators beyond built-in constraints
- **Documentation generation**: Automatic API docs from MLF with annotation support
- **Standard annotation registry**: Common annotations like `@deprecated`, `@since`, `@internal`
- **Import resolution**: Remote lexicon fetching and caching
- **Type inference**: Automatic type inference for constrained types

### Versioning

Lexicons are versioned at the NSID level. MLF files should include version metadata in comments or future version declarations.

## Appendix

### Reserved Keywords

```
as, blob, boolean, bytes, constrained, def, error, inline, integer,
null, procedure, query, record, string, subscription, token,
type, unknown, use
```

### Reserved Names

The following names cannot be used as item names:

```
main, defs
```

### Raw Identifiers

To use a reserved keyword as an identifier, wrap it in backticks:

```mlf
def type `record` = {
  `record`: com.atproto.repo.strongRef
  `error`: string
};
```

This allows field names or type names to match reserved keywords when necessary for compatibility with existing schemas.

### Constraint Keywords

```
accept, default, enum, format, knownValues, maxGraphemes,
maxLength, maxSize, maximum, minGraphemes, minLength, minimum
```

### Format Values

```
at-identifier, at-uri, cid, datetime, did, handle, language,
nsid, record-key, tid, uri
```

mlf-cli/src/check.rs (+209, -49)
···3737 help: Option<String>,
3838 },
39394040- #[error("Failed to expand glob pattern")]
4141- #[diagnostic(code(mlf::check::glob_error))]
4242- GlobError {
4343- #[source]
4444- source: glob::GlobError,
4545- },
4646-4747- #[error("Invalid glob pattern: {pattern}")]
4848- #[diagnostic(code(mlf::check::invalid_glob))]
4949- InvalidGlob {
5050- pattern: String,
5151- #[source]
5252- source: glob::PatternError,
5353- },
54405541 #[error("Record validation failed")]
5642 #[diagnostic(code(mlf::check::record_validation))]
···6349 ConfigError(#[from] ConfigError),
6450}
65516666-pub fn run_check(input_patterns: Vec<String>) -> Result<(), CheckError> {
6767- // If no input patterns provided, use source directory from mlf.toml
6868- let patterns = if input_patterns.is_empty() {
6969- let current_dir = std::env::current_dir()
7070- .map_err(|e| CheckError::ReadFile {
7171- path: ".".to_string(),
7272- source: e,
7373- })?;
5252+pub fn run_check(input_paths: Vec<PathBuf>, explicit_root: Option<PathBuf>) -> Result<(), CheckError> {
5353+ let current_dir = std::env::current_dir()
5454+ .map_err(|e| CheckError::ReadFile {
5555+ path: ".".to_string(),
5656+ source: e,
5757+ })?;
74585959+ // Determine root directory and input paths
6060+ let (root_dir, file_paths) = if input_paths.is_empty() {
6161+ // No input provided: must use mlf.toml
7562 match find_project_root(¤t_dir) {
7663 Ok(project_root) => {
7764 let config_path = project_root.join("mlf.toml");
7865 let config = MlfConfig::load(&config_path)?;
7979- let source_pattern = format!("{}/**/*.mlf", config.source.directory);
6666+ let source_dir = project_root.join(&config.source.directory);
6767+ let root = explicit_root.unwrap_or_else(|| source_dir.clone());
8068 println!("Using source directory from mlf.toml: {}", config.source.directory);
8181- vec![source_pattern]
6969+7070+ // Collect all .mlf files from source directory
7171+ let files = collect_mlf_files(&source_dir)?;
7272+ (root, files)
8273 }
8374 Err(ConfigError::NotFound) => {
8475 return Err(CheckError::ValidationErrors {
···8879 Err(e) => return Err(CheckError::ConfigError(e)),
8980 }
9081 } else {
9191- input_patterns
9292- };
8282+ // Input provided: determine root
8383+ let root = if let Some(explicit) = explicit_root {
8484+ // --root flag takes precedence
8585+ explicit
8686+ } else if let Ok(project_root) = find_project_root(&current_dir) {
8787+ // Try to use mlf.toml source directory
8888+ let config_path = project_root.join("mlf.toml");
8989+ if let Ok(config) = MlfConfig::load(&config_path) {
9090+ project_root.join(&config.source.directory)
9191+ } else {
9292+ current_dir.clone()
9393+ }
9494+ } else {
9595+ // Fall back to current directory
9696+ current_dir.clone()
9797+ };
9398
9494- let mut file_paths = Vec::new();
9999+ // Collect files from input paths
100100+ let mut files = Vec::new();
101101+ for input_path in input_paths {
102102+ let path = if input_path.is_absolute() {
103103+ input_path
104104+ } else {
105105+ current_dir.join(input_path)
106106+ };
95107
9696- for pattern in patterns {
9797- if pattern.contains('*') || pattern.contains('?') {
9898- for entry in glob::glob(&pattern).map_err(|source| CheckError::InvalidGlob {
9999- pattern: pattern.clone(),
100100- source,
101101- })? {
102102- let path = entry.map_err(|source| CheckError::GlobError { source })?;
103103- file_paths.push(path);
108108+ if path.is_dir() {
109109+ files.extend(collect_mlf_files(&path)?);
110110+ } else if path.is_file() {
111111+ files.push(path);
112112+ } else {
113113+ return Err(CheckError::ReadFile {
114114+ path: path.display().to_string(),
115115+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Path not found"),
116116+ });
104117 }
105105- } else {
106106- file_paths.push(PathBuf::from(pattern));
107118 }
108108- }
119119+ (root, files)
120120+ };
109121
110122 // Try to load cached lexicons from .mlf directory
111123 let current_dir = std::env::current_dir()
···
148160 }
149161 };
150162
151151- let namespace = file_path
152152- .file_stem()
153153- .and_then(|s| s.to_str())
154154- .unwrap_or("unknown")
155155- .to_string();
163163+ let namespace = extract_namespace(&file_path, &root_dir)?;
156164
157165 if let Err(e) = workspace.add_module(namespace.clone(), lexicon.clone()) {
158158- let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), e);
166166+ let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), namespace.clone(), e);
159167 eprintln!("{:?}", miette::Report::new(diagnostic));
160168 had_parse_errors = true;
161169 continue;
162170 }
163171
164164- source_files.push((filename.clone(), source));
172172+ source_files.push((filename.clone(), namespace.clone(), source));
165173 println!("✓ {}: Parsed successfully", file_path.display());
166174 }
167175
···
172180 }
173181
174182 if let Err(e) = workspace.resolve() {
175175- // Show all errors from the first source file
176176- if let Some((filename, source)) = source_files.first() {
177177- let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), e);
178178- eprintln!("{:?}", miette::Report::new(diagnostic));
183183+ // Collect all modules that have errors
184184+ let mut modules_with_errors: std::collections::BTreeMap<String, (Option<String>, String)> = std::collections::BTreeMap::new();
185185+
186186+ // First, add all explicitly checked files
187187+ for (filename, namespace, source) in &source_files {
188188+ modules_with_errors.insert(namespace.clone(), (Some(filename.clone()), source.clone()));
179189 }
190190+
191191+ // Then, find any cached modules with errors and try to load their source
192192+ for error in &e.errors {
193193+ let error_namespace = mlf_diagnostics::get_error_module_namespace_str(error);
194194+ if !modules_with_errors.contains_key(error_namespace) {
195195+ let namespace_path = error_namespace.replace('.', "/");
196196+ let mut source_loaded = false;
197197+
198198+ // Try multiple locations for the source file
199199+ let mut possible_paths = vec![
200200+ // Check in lexicons/ directory (common structure)
201201+ current_dir.join("lexicons").join(format!("{}.mlf", namespace_path)),
202202+ // Check in source directory from config
203203+ current_dir.join("src").join(format!("{}.mlf", namespace_path)),
204204+ // Check relative to current directory
205205+ current_dir.join(format!("{}.mlf", namespace_path)),
206206+ ];
207207+
208208+ // Add cache directory if available (lexicons are in lexicons/mlf/ subdirectory)
209209+ if let Some(cache_dir) = &mlf_cache_dir {
210210+ possible_paths.push(cache_dir.join("lexicons").join("mlf").join(format!("{}.mlf", namespace_path)));
211211+ }
212212+
213213+ for path in possible_paths {
214214+ if let Ok(source) = std::fs::read_to_string(&path) {
215215+ modules_with_errors.insert(
216216+ error_namespace.to_string(),
217217+ (Some(path.display().to_string()), source)
218218+ );
219219+ source_loaded = true;
220220+ break;
221221+ }
222222+ }
223223+
224224+ if !source_loaded {
225225+ // Couldn't load source, add placeholder
226226+ modules_with_errors.insert(
227227+ error_namespace.to_string(),
228228+ (None, String::new())
229229+ );
230230+ }
231231+ }
232232+ }
233233+
234234+ // Show diagnostics for all modules with errors
235235+ for (namespace, (filename_opt, source)) in &modules_with_errors {
236236+ // Only show diagnostic if this module has errors
237237+ let has_errors = e.errors.iter().any(|error| {
238238+ mlf_diagnostics::get_error_module_namespace_str(error) == namespace
239239+ });
240240+
241241+ if has_errors {
242242+ if let Some(filename) = filename_opt {
243243+ // Have source file, show full diagnostic
244244+ let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), namespace.clone(), e.clone());
245245+ eprintln!("{:?}", miette::Report::new(diagnostic));
246246+ } else {
247247+ // No source available, just list the errors
248248+ let error_count = e.errors.iter()
249249+ .filter(|err| mlf_diagnostics::get_error_module_namespace_str(err) == namespace)
250250+ .count();
251251+ eprintln!("\n{}: {} error(s) (source not available)", namespace, error_count);
252252+ }
253253+ }
254254+ }
255255+
180256 return Err(CheckError::ValidationErrors {
181257 help: Some("Workspace validation failed".to_string()),
182258 });
···
235311 }
236312 }
237313}
314314+
315315+/// Recursively collect all .mlf files from a directory
316316+fn collect_mlf_files(dir: &std::path::Path) -> Result<Vec<PathBuf>, CheckError> {
317317+ let mut files = Vec::new();
318318+
319319+ if !dir.exists() {
320320+ return Err(CheckError::ReadFile {
321321+ path: dir.display().to_string(),
322322+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Directory not found"),
323323+ });
324324+ }
325325+
326326+ fn visit_dirs(dir: &std::path::Path, files: &mut Vec<PathBuf>) -> std::io::Result<()> {
327327+ if dir.is_dir() {
328328+ for entry in std::fs::read_dir(dir)? {
329329+ let entry = entry?;
330330+ let path = entry.path();
331331+ if path.is_dir() {
332332+ visit_dirs(&path, files)?;
333333+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
334334+ files.push(path);
335335+ }
336336+ }
337337+ }
338338+ Ok(())
339339+ }
340340+
341341+ visit_dirs(dir, &mut files).map_err(|source| CheckError::ReadFile {
342342+ path: dir.display().to_string(),
343343+ source,
344344+ })?;
345345+
346346+ Ok(files)
347347+}
348348+
349349+/// Extract namespace from file path relative to root directory
350350+/// e.g., root=/project/lexicons, file=/project/lexicons/com/example/foo.mlf -> com.example.foo
351351+fn extract_namespace(file_path: &std::path::Path, root_dir: &std::path::Path) -> Result<String, CheckError> {
352352+ // Get the canonical paths to handle . and .. correctly
353353+ let file_canonical = file_path.canonicalize().map_err(|source| CheckError::ReadFile {
354354+ path: file_path.display().to_string(),
355355+ source,
356356+ })?;
357357+
358358+ let root_canonical = root_dir.canonicalize().map_err(|source| CheckError::ReadFile {
359359+ path: root_dir.display().to_string(),
360360+ source,
361361+ })?;
362362+
363363+ // Get relative path from root to file
364364+ let relative_path = file_canonical.strip_prefix(&root_canonical)
365365+ .map_err(|_| CheckError::ValidationErrors {
366366+ help: Some(format!(
367367+ "File {} is not within root directory {}",
368368+ file_path.display(),
369369+ root_dir.display()
370370+ )),
371371+ })?;
372372+
373373+ // Convert path to namespace
374374+ let mut components = Vec::new();
375375+ for component in relative_path.components() {
376376+ if let std::path::Component::Normal(os_str) = component {
377377+ if let Some(s) = os_str.to_str() {
378378+ components.push(s);
379379+ }
380380+ }
381381+ }
382382+
383383+ // Remove .mlf extension from last component
384384+ if let Some(last) = components.last_mut() {
385385+ if let Some(stem) = last.strip_suffix(".mlf") {
386386+ *last = stem;
387387+ }
388388+ }
389389+
390390+ if components.is_empty() {
391391+ return Err(CheckError::ValidationErrors {
392392+ help: Some(format!("Could not extract namespace from path: {}", file_path.display())),
393393+ });
394394+ }
395395+396396+ Ok(components.join("."))
397397+}
+161-42
mlf-cli/src/generate/code.rs
···
2121 source: std::io::Error,
2222 },
2323
2424- #[error("Failed to expand glob pattern")]
2525- #[diagnostic(code(mlf::generate::glob_error))]
2626- GlobError {
2727- #[source]
2828- source: glob::GlobError,
2929- },
3030-
3131- #[error("Invalid glob pattern: {pattern}")]
3232- #[diagnostic(code(mlf::generate::invalid_glob))]
3333- InvalidGlob {
3434- pattern: String,
3535- #[source]
3636- source: glob::PatternError,
3737- },
3838-
3924 #[error("Generator '{name}' not found")]
4025 #[diagnostic(code(mlf::generate::generator_not_found))]
4126 #[help("Available generators: {}", available.join(", "))]
···
5136}
5237
5338pub fn run(
5454- generator_name: String,
5555- input_patterns: Vec<String>,
5656- output_dir: PathBuf,
3939+ generator_name: Option<String>,
4040+ input_paths: Vec<PathBuf>,
4141+ output_dir: Option<PathBuf>,
4242+ root: Option<PathBuf>,
5743 flat: bool,
5844) -> Result<(), GenerateError> {
4545+ let current_dir = std::env::current_dir().map_err(|source| GenerateError::WriteOutput {
4646+ path: "current directory".to_string(),
4747+ source,
4848+ })?;
4949+
5050+ // Load mlf.toml if available
5151+ let project_root = crate::config::find_project_root(&current_dir).ok();
5252+ let config = project_root
5353+ .as_ref()
5454+ .and_then(|root| {
5555+ let config_path = root.join("mlf.toml");
5656+ crate::config::MlfConfig::load(&config_path).ok()
5757+ });
5858+
5959+ // Determine generator name
6060+ let generator_name = if let Some(explicit) = generator_name {
6161+ explicit
6262+ } else if let Some(cfg) = &config {
6363+ // Find first non-lexicon, non-mlf output in mlf.toml
6464+ cfg.output
6565+ .iter()
6666+ .find(|o| o.r#type != "lexicon" && o.r#type != "mlf")
6767+ .map(|o| o.r#type.clone())
6868+ .ok_or_else(|| GenerateError::GeneratorNotFound {
6969+ name: "any".to_string(),
7070+ available: vec!["No code generator outputs configured in mlf.toml. Either add an output configuration or provide --generator flag.".to_string()],
7171+ })?
7272+ } else {
7373+ return Err(GenerateError::GeneratorNotFound {
7474+ name: "any".to_string(),
7575+ available: vec!["No mlf.toml found and no --generator flag provided. Either create a mlf.toml or provide --generator flag.".to_string()],
7676+ });
7777+ };
7878+
5979 // Find the generator
6080 let generators = mlf_codegen::plugin::generators();
6181 let generator = generators
···
7292 println!("Using generator: {} ({})", generator.name(), generator.description());
7393 println!("Output extension: {}\n", generator.file_extension());
7494
9595+ // Determine output directory
9696+ let output_dir = if let Some(explicit) = output_dir {
9797+ explicit
9898+ } else if let Some(cfg) = &config {
9999+ // Find output matching the generator type
100100+ cfg.output
101101+ .iter()
102102+ .find(|o| o.r#type == generator_name)
103103+ .map(|o| PathBuf::from(&o.directory))
104104+ .ok_or_else(|| GenerateError::WriteOutput {
105105+ path: "mlf.toml".to_string(),
106106+ source: std::io::Error::new(
107107+ std::io::ErrorKind::NotFound,
108108+ format!("No output configured for generator '{}' in mlf.toml", generator_name)
109109+ ),
110110+ })?
111111+ } else {
112112+ return Err(GenerateError::WriteOutput {
113113+ path: "mlf.toml".to_string(),
114114+ source: std::io::Error::new(
115115+ std::io::ErrorKind::NotFound,
116116+ "No mlf.toml found and no --output flag provided"
117117+ ),
118118+ });
119119+ };
120120+121121+ // Determine root directory
122122+ let root_dir = if let Some(explicit) = root {
123123+ explicit
124124+ } else if let Some(cfg) = &config {
125125+ project_root.as_ref().unwrap().join(&cfg.source.directory)
126126+ } else {
127127+ current_dir.clone()
128128+ };
129129+130130+ // Determine input paths
131131+ let input_paths = if input_paths.is_empty() {
132132+ if let Some(cfg) = &config {
133133+ vec![project_root.as_ref().unwrap().join(&cfg.source.directory)]
134134+ } else {
135135+ return Err(GenerateError::WriteOutput {
136136+ path: "input".to_string(),
137137+ source: std::io::Error::new(
138138+ std::io::ErrorKind::NotFound,
139139+ "No input files specified and no mlf.toml found"
140140+ ),
141141+ });
142142+ }
143143+ } else {
144144+ input_paths
145145+ };
146146+
75147 // Collect input files
76148 let mut file_paths = Vec::new();
7777- for pattern in input_patterns {
7878- if pattern.contains('*') || pattern.contains('?') {
7979- for entry in glob::glob(&pattern).map_err(|source| GenerateError::InvalidGlob {
8080- pattern: pattern.clone(),
8181- source,
8282- })? {
8383- let path = entry.map_err(|source| GenerateError::GlobError { source })?;
8484- file_paths.push(path);
8585- }
8686- } else {
8787- file_paths.push(PathBuf::from(pattern));
149149+ for path in input_paths {
150150+ if path.is_dir() {
151151+ file_paths.extend(collect_mlf_files(&path)?);
152152+ } else if path.is_file() && path.extension().and_then(|s| s.to_str()) == Some("mlf") {
153153+ file_paths.push(path);
88154 }
89155 }
90156
···
116182 }
117183 };
118184
119119- let namespace = extract_namespace(&file_path);
185185+ let namespace = match extract_namespace(&file_path, &root_dir) {
186186+ Ok(ns) => ns,
187187+ Err(e) => {
188188+ errors.push((
189189+ file_path.display().to_string(),
190190+ format!("Failed to extract namespace: {}", e),
191191+ ));
192192+ continue;
193193+ }
194194+ };
120195
121196 // Create workspace with standard library and .mlf cache
122197 let mlf_cache_dir = crate::config::find_project_root(&std::env::current_dir().unwrap())
···
220295 Ok(())
221296}
222297
223223-fn extract_namespace(file_path: &Path) -> String {
224224- // Extract namespace from path components
225225- // e.g., com/atproto/admin/defs.mlf -> com.atproto.admin.defs
298298+/// Collect all .mlf files recursively from a directory
299299+fn collect_mlf_files(dir: &Path) -> Result<Vec<PathBuf>, GenerateError> {
300300+ let mut files = Vec::new();
301301+
302302+ for entry in std::fs::read_dir(dir).map_err(|source| GenerateError::WriteOutput {
303303+ path: dir.display().to_string(),
304304+ source,
305305+ })? {
306306+ let entry = entry.map_err(|source| GenerateError::WriteOutput {
307307+ path: dir.display().to_string(),
308308+ source,
309309+ })?;
310310+
311311+ let path = entry.path();
312312+
313313+ if path.is_dir() {
314314+ files.extend(collect_mlf_files(&path)?);
315315+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
316316+ files.push(path);
317317+ }
318318+ }
319319+
320320+ Ok(files)
321321+}
226322
227227- let mut components = Vec::new();
323323+fn extract_namespace(file_path: &Path, root_dir: &Path) -> Result<String, std::io::Error> {
324324+ // Canonicalize both paths for comparison
325325+ let file_canonical = file_path.canonicalize()?;
326326+ let root_canonical = root_dir.canonicalize()?;
228327
229229- for component in file_path.components() {
328328+ // Get the relative path from root to file
329329+ let relative_path = file_canonical
330330+ .strip_prefix(&root_canonical)
331331+ .map_err(|_| {
332332+ std::io::Error::new(
333333+ std::io::ErrorKind::Other,
334334+ format!(
335335+ "File path {} is not under root directory {}",
336336+ file_path.display(),
337337+ root_dir.display()
338338+ ),
339339+ )
340340+ })?;
341341+
342342+ // Convert path components to namespace parts
343343+ let mut namespace_parts = Vec::new();
344344+
345345+ for component in relative_path.components() {
230346 match component {
231347 std::path::Component::Normal(os_str) => {
232348 if let Some(s) = os_str.to_str() {
233233- components.push(s);
349349+ namespace_parts.push(s);
234350 }
235351 }
236236- _ => continue, // Skip ., .., /, etc.
352352+ _ => continue,
237353 }
238354 }
239355240240- // Remove the .mlf extension from the last component if present
241241- if let Some(last) = components.last_mut() {
356356+ // Remove .mlf extension from the last component if present
357357+ if let Some(last) = namespace_parts.last_mut() {
242358 if let Some(stem) = last.strip_suffix(".mlf") {
243359 *last = stem;
244360 }
245361 }
246362
247247- if components.is_empty() {
248248- return "unknown".to_string();
363363+ if namespace_parts.is_empty() {
364364+ return Err(std::io::Error::new(
365365+ std::io::ErrorKind::Other,
366366+ format!("Could not extract namespace from path: {}", file_path.display()),
367367+ ));
249368 }
250369
251251- components.join(".")
370370+ Ok(namespace_parts.join("."))
252371}
+143-39
mlf-cli/src/generate/lexicon.rs
···
2929 source: std::io::Error,
3030 },
3131
3232- #[error("Failed to expand glob pattern")]
3333- #[diagnostic(code(mlf::generate::glob_error))]
3434- GlobError {
3535- #[source]
3636- source: glob::GlobError,
3737- },
3232+}
3333+
3434+pub fn run(input_paths: Vec<PathBuf>, output_dir: Option<PathBuf>, explicit_root: Option<PathBuf>, flat: bool) -> Result<(), GenerateError> {
3535+ let current_dir = std::env::current_dir()
3636+ .map_err(|e| GenerateError::WriteOutput {
3737+ path: ".".to_string(),
3838+ source: e,
3939+ })?;
4040+
4141+ // Load mlf.toml if available
4242+ let project_root = crate::config::find_project_root(&current_dir).ok();
4343+ let config = project_root
4444+ .as_ref()
4545+ .and_then(|root| {
4646+ let config_path = root.join("mlf.toml");
4747+ crate::config::MlfConfig::load(&config_path).ok()
4848+ });
4949+
5050+ // Determine output directory
5151+ let output_dir = if let Some(explicit) = output_dir {
5252+ explicit
5353+ } else if let Some(cfg) = &config {
5454+ // Find first lexicon output in mlf.toml
5555+ cfg.output
5656+ .iter()
5757+ .find(|o| o.r#type == "lexicon")
5858+ .map(|o| PathBuf::from(&o.directory))
5959+ .ok_or_else(|| GenerateError::ParseLexicon {
6060+ path: "mlf.toml".to_string(),
6161+ help: Some("No lexicon output configured in mlf.toml. Either add an output configuration or provide --output flag.".to_string()),
6262+ })?
6363+ } else {
6464+ return Err(GenerateError::ParseLexicon {
6565+ path: "mlf.toml".to_string(),
6666+ help: Some("No mlf.toml found and no --output flag provided. Either create a mlf.toml or provide --output flag.".to_string()),
6767+ });
6868+ };
6969+
7070+ // Determine root directory
7171+ let root_dir = if let Some(explicit) = explicit_root {
7272+ explicit
7373+ } else if let Some(cfg) = &config {
7474+ project_root.as_ref().unwrap().join(&cfg.source.directory)
7575+ } else {
7676+ current_dir.clone()
7777+ };
3878
3939- #[error("Invalid glob pattern: {pattern}")]
4040- #[diagnostic(code(mlf::generate::invalid_glob))]
4141- InvalidGlob {
4242- pattern: String,
4343- #[source]
4444- source: glob::PatternError,
4545- },
4646-}
7979+ // Determine input paths
8080+ let input_paths = if input_paths.is_empty() {
8181+ if let Some(cfg) = &config {
8282+ vec![project_root.as_ref().unwrap().join(&cfg.source.directory)]
8383+ } else {
8484+ return Err(GenerateError::ParseLexicon {
8585+ path: "input".to_string(),
8686+ help: Some("No input files specified and no mlf.toml found. Either provide input files or create a mlf.toml.".to_string()),
8787+ });
8888+ }
8989+ } else {
9090+ input_paths
9191+ };
4792
4848-pub fn run(input_patterns: Vec<String>, output_dir: PathBuf, flat: bool) -> Result<(), GenerateError> {
9393+ // Collect files from input paths
4994 let mut file_paths = Vec::new();
9595+ for input_path in input_paths {
9696+ let path = if input_path.is_absolute() {
9797+ input_path
9898+ } else {
9999+ current_dir.join(input_path)
100100+ };
50101
5151- for pattern in input_patterns {
5252- if pattern.contains('*') || pattern.contains('?') {
5353- for entry in glob::glob(&pattern).map_err(|source| GenerateError::InvalidGlob {
5454- pattern: pattern.clone(),
5555- source,
5656- })? {
5757- let path = entry.map_err(|source| GenerateError::GlobError { source })?;
5858- file_paths.push(path);
5959- }
102102+ if path.is_dir() {
103103+ file_paths.extend(collect_mlf_files(&path)?);
104104+ } else if path.is_file() {
105105+ file_paths.push(path);
60106 } else {
6161- file_paths.push(PathBuf::from(pattern));
107107+ return Err(GenerateError::ReadFile {
108108+ path: path.display().to_string(),
109109+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Path not found"),
110110+ });
62111 }
63112 }
64113
···
87136 }
88137 };
89138
9090- let namespace = extract_namespace(&file_path);
139139+ let namespace = extract_namespace(&file_path, &root_dir)?;
91140
92141 // Create workspace with standard library and .mlf cache for inline type resolution
93142 let mlf_cache_dir = crate::config::find_project_root(&std::env::current_dir().unwrap())
···
157206 Ok(())
158207}
159208
160160-fn extract_namespace(file_path: &Path) -> String {
161161- // Extract namespace from path components
162162- // e.g., com/atproto/admin/defs.mlf -> com.atproto.admin.defs
209209+/// Recursively collect all .mlf files from a directory
210210+fn collect_mlf_files(dir: &Path) -> Result<Vec<PathBuf>, GenerateError> {
211211+ let mut files = Vec::new();
163212
164164- let mut components = Vec::new();
213213+ if !dir.exists() {
214214+ return Err(GenerateError::ReadFile {
215215+ path: dir.display().to_string(),
216216+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Directory not found"),
217217+ });
218218+ }
165219
166166- for component in file_path.components() {
167167- match component {
168168- std::path::Component::Normal(os_str) => {
169169- if let Some(s) = os_str.to_str() {
170170- components.push(s);
220220+ fn visit_dirs(dir: &Path, files: &mut Vec<PathBuf>) -> std::io::Result<()> {
221221+ if dir.is_dir() {
222222+ for entry in std::fs::read_dir(dir)? {
223223+ let entry = entry?;
224224+ let path = entry.path();
225225+ if path.is_dir() {
226226+ visit_dirs(&path, files)?;
227227+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
228228+ files.push(path);
171229 }
172230 }
173173- _ => continue, // Skip ., .., /, etc.
231231+ }
232232+ Ok(())
233233+ }
234234+
235235+ visit_dirs(dir, &mut files).map_err(|source| GenerateError::ReadFile {
236236+ path: dir.display().to_string(),
237237+ source,
238238+ })?;
239239+
240240+ Ok(files)
241241+}
242242+
243243+/// Extract namespace from file path relative to root directory
244244+/// e.g., root=/project/lexicons, file=/project/lexicons/com/example/foo.mlf -> com.example.foo
245245+fn extract_namespace(file_path: &Path, root_dir: &Path) -> Result<String, GenerateError> {
246246+ // Get the canonical paths to handle . and .. correctly
247247+ let file_canonical = file_path.canonicalize().map_err(|source| GenerateError::ReadFile {
248248+ path: file_path.display().to_string(),
249249+ source,
250250+ })?;
251251+
252252+ let root_canonical = root_dir.canonicalize().map_err(|source| GenerateError::ReadFile {
253253+ path: root_dir.display().to_string(),
254254+ source,
255255+ })?;
256256+
257257+ // Get relative path from root to file
258258+ let relative_path = file_canonical.strip_prefix(&root_canonical)
259259+ .map_err(|_| GenerateError::ParseLexicon {
260260+ path: file_path.display().to_string(),
261261+ help: Some(format!(
262262+ "File {} is not within root directory {}",
263263+ file_path.display(),
264264+ root_dir.display()
265265+ )),
266266+ })?;
267267+
268268+ // Convert path to namespace
269269+ let mut components = Vec::new();
270270+ for component in relative_path.components() {
271271+ if let std::path::Component::Normal(os_str) = component {
272272+ if let Some(s) = os_str.to_str() {
273273+ components.push(s);
274274+ }
174275 }
175276 }
176277
177177- // Remove the .mlf extension from the last component if present
278278+ // Remove .mlf extension from last component
178279 if let Some(last) = components.last_mut() {
179280 if let Some(stem) = last.strip_suffix(".mlf") {
180281 *last = stem;
···
182283 }
183284
184285 if components.is_empty() {
185185- return "unknown".to_string();
286286+ return Err(GenerateError::ParseLexicon {
287287+ path: file_path.display().to_string(),
288288+ help: Some("Could not extract namespace from path".to_string()),
289289+ });
186290 }
187291
188188- components.join(".")
292292+ Ok(components.join("."))
189293}
+123-35
mlf-cli/src/generate/mlf.rs
···
5151 },
5252}
5353
5454-pub fn run(input_patterns: Vec<String>, output_dir: PathBuf) -> Result<(), MlfGenerateError> {
5454+pub fn run(input_patterns: Vec<String>, output_dir: Option<PathBuf>) -> Result<(), MlfGenerateError> {
5555+ let current_dir = std::env::current_dir().map_err(|source| MlfGenerateError::WriteOutput {
5656+ path: "current directory".to_string(),
5757+ source,
5858+ })?;
5959+
6060+ // Load mlf.toml if available
6161+ let project_root = crate::config::find_project_root(&current_dir).ok();
6262+ let config = project_root
6363+ .as_ref()
6464+ .and_then(|root| {
6565+ let config_path = root.join("mlf.toml");
6666+ crate::config::MlfConfig::load(&config_path).ok()
6767+ });
6868+
6969+ // Determine output directory
7070+ let output_dir = if let Some(explicit) = output_dir {
7171+ explicit
7272+ } else if let Some(cfg) = &config {
7373+ // Find first mlf output in mlf.toml
7474+ cfg.output
7575+ .iter()
7676+ .find(|o| o.r#type == "mlf")
7777+ .map(|o| PathBuf::from(&o.directory))
7878+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
7979+ message: "No mlf output configured in mlf.toml. Either add an output configuration or provide --output flag.".to_string(),
8080+ })?
8181+ } else {
8282+ return Err(MlfGenerateError::InvalidLexicon {
8383+ message: "No mlf.toml found and no --output flag provided. Either create a mlf.toml or provide --output flag.".to_string(),
8484+ });
8585+ };
8686+
5587 let mut file_paths = Vec::new();
5688
5789 for pattern in input_patterns {
···
179211 }
180212 })?;
181213
214214+ // Create a context to pass the current namespace to type generation
215215+ let ctx = ConversionContext {
216216+ current_namespace: nsid.to_string(),
217217+ };
218218+
182219 // Process all definitions
183220 for (name, def) in defs {
184221 let def_type = def.get("type").and_then(|v| v.as_str()).ok_or_else(|| {
···
189226
190227 match def_type {
191228 "record" => {
192192- let mlf = generate_record(name, def, last_segment)?;
229229+ let mlf = generate_record(name, def, last_segment, &ctx)?;
193230 output.push_str(&mlf);
194231 output.push('\n');
195232 }
196233 "query" => {
197197- let mlf = generate_query(name, def, last_segment)?;
234234+ let mlf = generate_query(name, def, last_segment, &ctx)?;
198235 output.push_str(&mlf);
199236 output.push('\n');
200237 }
201238 "procedure" => {
202202- let mlf = generate_procedure(name, def, last_segment)?;
239239+ let mlf = generate_procedure(name, def, last_segment, &ctx)?;
203240 output.push_str(&mlf);
204241 output.push('\n');
205242 }
206243 "subscription" => {
207207- let mlf = generate_subscription(name, def, last_segment)?;
244244+ let mlf = generate_subscription(name, def, last_segment, &ctx)?;
208245 output.push_str(&mlf);
209246 output.push('\n');
210247 }
···213250 output.push_str(&mlf);
214251 output.push('\n');
215252 }
216216- "object" => {
217217- let mlf = generate_def_type(name, def, last_segment)?;
253253+ _ => {
254254+ // All other types (object, string, array, union, etc.) are treated as def type
255255+ let mlf = generate_def_type(name, def, last_segment, &ctx)?;
218256 output.push_str(&mlf);
219257 output.push('\n');
220220- }
221221- _ => {
222222- // Unknown type, skip
223258 }
224259 }
225260 }
226261
227262 Ok(output)
263263+}
264264+
265265+struct ConversionContext {
266266+ current_namespace: String,
228267}
229268
230269/// Reserved words in MLF that need to be escaped
···
243282 }
244283}
245284
246246-fn generate_record(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
285285+fn generate_record(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
247286 let mut output = String::new();
248287
249288 // Add doc comment if present
···
253292 output.push_str(&format!("/// {}\n", line));
254293 }
255294 }
295295+ }
296296+
297297+ // Add @main annotation for "main" definitions
298298+ if name == "main" {
299299+ output.push_str("@main\n");
256300 }
257301
258302 // Use last segment of NSID for "main" definitions
···
301345 let is_required = required.contains(&field_name.as_str());
302346 let required_marker = if is_required { "!" } else { "" };
303347
304304- let field_type = generate_type(field_def)?;
348348+ let field_type = generate_type(field_def, ctx)?;
305349 let escaped_field_name = escape_name(field_name);
306350 output.push_str(&format!(
307351 " {}{}: {},\n",
···
313357 Ok(output)
314358}
315359
316316-fn generate_query(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
360360+fn generate_query(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
317361 let mut output = String::new();
318362
319363 // Add doc comment
···
325369 }
326370 }
327371
372372+ // Add @main annotation for "main" definitions
373373+ if name == "main" {
374374+ output.push_str("@main\n");
375375+ }
376376+
328377 let query_name = if name == "main" {
329378 escape_name(last_segment)
330379 } else {
···
352401 .map(|(param_name, param_def)| {
353402 let is_required = required.contains(&param_name.as_str());
354403 let required_marker = if is_required { "!" } else { "" };
355355- let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
404404+ let param_type = generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
356405 let escaped_param_name = escape_name(param_name);
357406
358407 // Add doc comment inline if present
···
377426 // Output type
378427 if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
379428 if let Some(schema) = output_obj.get("schema") {
380380- let return_type = generate_type(schema)?;
429429+ let return_type = generate_type(schema, ctx)?;
381430 output.push_str(&format!(": {}", return_type));
382431
383432 // Check for errors
···
400449 Ok(output)
401450}
402451
403403-fn generate_procedure(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
452452+fn generate_procedure(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
404453 let mut output = String::new();
405454
406455 // Add doc comment
···
412461 }
413462 }
414463
464464+ // Add @main annotation for "main" definitions
465465+ if name == "main" {
466466+ output.push_str("@main\n");
467467+ }
468468+
415469 let procedure_name = if name == "main" {
416470 escape_name(last_segment)
417471 } else {
···
441495 let is_required = required.contains(&param_name.as_str());
442496 let required_marker = if is_required { "!" } else { "" };
443497 let param_type =
444444- generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
498498+ generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
445499 let escaped_param_name = escape_name(param_name);
446500
447501 // Add doc comment inline if present
···470524 // Output type
471525 if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
472526 if let Some(schema) = output_obj.get("schema") {
473473- let return_type = generate_type(schema)?;
527527+ let return_type = generate_type(schema, ctx)?;
474528 output.push_str(&format!(": {}", return_type));
475529
476530 // Check for errors
···493547 Ok(output)
494548}
495549
496496-fn generate_subscription(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
550550+fn generate_subscription(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
497551 let mut output = String::new();
498552
499553 // Add doc comment
···505559 }
506560 }
507561
562562+ // Add @main annotation for "main" definitions
563563+ if name == "main" {
564564+ output.push_str("@main\n");
565565+ }
566566+
508567 let subscription_name = if name == "main" {
509568 escape_name(last_segment)
510569 } else {
···532591 .map(|(param_name, param_def)| {
533592 let is_required = required.contains(&param_name.as_str());
534593 let required_marker = if is_required { "!" } else { "" };
535535- let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
594594+ let param_type = generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
536595 let escaped_param_name = escape_name(param_name);
537596
538597 format!("{}{}: {}", escaped_param_name, required_marker, param_type)
···549608 // Message types
550609 if let Some(message) = def.get("message").and_then(|v| v.as_object()) {
551610 if let Some(schema) = message.get("schema") {
552552- let message_type = generate_type(schema)?;
611611+ let message_type = generate_type(schema, ctx)?;
553612 output.push_str(&format!(": {}", message_type));
554613 }
555614 }
···575634 Ok(output)
576635}
577636
578578-fn generate_def_type(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
637637+fn generate_def_type(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
579638 let mut output = String::new();
580639
640640+ // Add doc comment if present
641641+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
642642+ if !desc.is_empty() {
643643+ for line in desc.lines() {
644644+ output.push_str(&format!("/// {}\n", line));
645645+ }
646646+ }
647647+ }
648648+
649649+ // Add @main annotation for "main" definitions
650650+ if name == "main" {
651651+ output.push_str("@main\n");
652652+ }
653653+
581654 // Use last segment of NSID for "main" definitions
582655 let def_name = if name == "main" {
583656 escape_name(last_segment)
···586659 };
587660
588661 output.push_str(&format!("def type {} = ", def_name));
589589- let type_str = generate_type_with_indent(def, 0)?;
662662+ let type_str = generate_type_with_indent(def, 0, ctx)?;
590663 output.push_str(&type_str);
591664 output.push_str(";\n");
592665
593666 Ok(output)
594667}
595668
596596-fn generate_type_with_indent(type_def: &Value, indent_level: usize) -> Result<String, MlfGenerateError> {
669669+fn generate_type_with_indent(type_def: &Value, indent_level: usize, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
597670 let type_name = type_def.get("type").and_then(|v| v.as_str());
598671
599672 match type_name {
···631704
632705 let is_required = required.contains(&field_name.as_str());
633706 let required_marker = if is_required { "!" } else { "" };
634634- let field_type = generate_type_with_indent(field_def, indent_level + 1)?;
707707+ let field_type = generate_type_with_indent(field_def, indent_level + 1, ctx)?;
635708 let escaped_field_name = escape_name(field_name);
636709 output.push_str(&format!(
637710 "{}{}{}: {},\n",
···642715 output.push_str(&format!("{}}}", indent));
643716 Ok(output)
644717 }
645645- _ => generate_type(type_def),
718718+ _ => generate_type(type_def, ctx),
646719 }
647720}
648721
649649-fn generate_type(type_def: &Value) -> Result<String, MlfGenerateError> {
722722+fn generate_type(type_def: &Value, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
650723 let type_name = type_def.get("type").and_then(|v| v.as_str());
651724
652725 match type_name {
···735808 .unwrap_or("unknown")
736809 .to_string()
737810 } else {
738738- generate_type(items)?
811811+ generate_type(items, ctx)?
739812 };
740813
741814 let mut result = format!("{}[]", item_type);
···773846
774847 let is_required = required.contains(&field_name.as_str());
775848 let required_marker = if is_required { "!" } else { "" };
776776- let field_type = generate_type(field_def)?;
849849+ let field_type = generate_type(field_def, ctx)?;
777850 let escaped_field_name = escape_name(field_name);
778851 output.push_str(&format!(
779852 " {}{}: {},\n",
···793866
794867 let type_strs: Vec<String> = refs
795868 .iter()
796796- .map(|r| generate_type(r).unwrap_or_else(|_| "unknown".to_string()))
869869+ .map(|r| generate_type(r, ctx).unwrap_or_else(|_| "unknown".to_string()))
797870 .collect();
798871
799872 let mut result = type_strs.join(" | ");
···807880 }
808881 Some("ref") => {
809882 if let Some(ref_str) = type_def.get("ref").and_then(|v| v.as_str()) {
810810- // Convert refs: strip leading # and convert remaining # to .
811811- // "#audio" -> "audio" (local ref, just the name)
812812- // "com.example#foo" -> "com.example.foo" (external ref)
813813- let clean_ref = ref_str.trim_start_matches('#').replace('#', ".");
814814- Ok(clean_ref)
883883+ // Handle references:
884884+ // "#defName" -> "defName" (local reference, same file)
885885+ // "namespace.id#defName" -> Check if same namespace, if so use "defName", else use full path
886886+
887887+ if let Some(stripped) = ref_str.strip_prefix('#') {
888888+ // Local reference: #defName -> defName
889889+ Ok(stripped.to_string())
890890+ } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
891891+ // Check if this references the current namespace
892892+ if namespace == ctx.current_namespace {
893893+ // Same namespace - use just the def name
894894+ Ok(def_name.to_string())
895895+ } else {
896896+ // Different namespace - use full NSID format
897897+ Ok(format!("{}.{}", namespace, def_name))
898898+ }
899899+ } else {
900900+ // No # at all - shouldn't happen in valid lexicons, but handle gracefully
901901+ Ok(ref_str.to_string())
902902+ }
815903 } else {
816904 Err(MlfGenerateError::InvalidLexicon {
817905 message: "Missing 'ref' in ref type".to_string(),
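The reference-handling rules added in the hunk above (local `#defName` refs, same-namespace refs collapsed to the bare def name, cross-namespace refs expanded to a full NSID path) can be exercised in isolation. The sketch below mirrors that logic as a free function; `normalize_ref` is an illustrative name for this example, not an API exported by the crate:

```rust
// Hypothetical standalone sketch of the ref-normalization rules from the
// diff above. `normalize_ref` is an illustrative name, not the real API.
fn normalize_ref(ref_str: &str, current_namespace: &str) -> String {
    if let Some(stripped) = ref_str.strip_prefix('#') {
        // "#audio" -> "audio": local reference within the same file
        stripped.to_string()
    } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
        if namespace == current_namespace {
            // Same namespace: collapse to just the definition name
            def_name.to_string()
        } else {
            // Different namespace: expand to a full dotted NSID path
            format!("{}.{}", namespace, def_name)
        }
    } else {
        // No '#' at all: pass through unchanged
        ref_str.to_string()
    }
}

fn main() {
    assert_eq!(normalize_ref("#audio", "com.example"), "audio");
    assert_eq!(normalize_ref("com.example#foo", "com.example"), "foo");
    assert_eq!(
        normalize_ref("com.example#foo", "app.bsky.feed.post"),
        "com.example.foo"
    );
    println!("ok");
}
```

Note the graceful fallback for a ref with no `#`: it should not occur in valid lexicons, but the converter passes it through rather than erroring.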
+941
mlf-cli/src/generate/mlf.rs.backup
···11+use miette::Diagnostic;
22+use serde_json::Value;
33+use std::path::PathBuf;
44+use thiserror::Error;
55+66+#[derive(Error, Debug, Diagnostic)]
77+pub enum MlfGenerateError {
88+ #[error("Failed to read file: {path}")]
99+ #[diagnostic(code(mlf::generate::read_file))]
1010+ #[allow(dead_code)]
1111+ ReadFile {
1212+ path: String,
1313+ #[source]
1414+ source: std::io::Error,
1515+ },
1616+1717+ #[error("Failed to parse JSON: {path}")]
1818+ #[diagnostic(code(mlf::generate::parse_json))]
1919+ #[allow(dead_code)]
2020+ ParseJson {
2121+ path: String,
2222+ #[source]
2323+ source: serde_json::Error,
2424+ },
2525+2626+ #[error("Failed to write output: {path}")]
2727+ #[diagnostic(code(mlf::generate::write_output))]
2828+ WriteOutput {
2929+ path: String,
3030+ #[source]
3131+ source: std::io::Error,
3232+ },
3333+3434+ #[error("Invalid lexicon format: {message}")]
3535+ #[diagnostic(code(mlf::generate::invalid_lexicon))]
3636+ InvalidLexicon { message: String },
3737+3838+ #[error("Failed to expand glob pattern")]
3939+ #[diagnostic(code(mlf::generate::glob_error))]
4040+ GlobError {
4141+ #[source]
4242+ source: glob::GlobError,
4343+ },
4444+4545+ #[error("Invalid glob pattern: {pattern}")]
4646+ #[diagnostic(code(mlf::generate::invalid_glob))]
4747+ InvalidGlob {
4848+ pattern: String,
4949+ #[source]
5050+ source: glob::PatternError,
5151+ },
5252+}
5353+5454+pub fn run(input_patterns: Vec<String>, output_dir: PathBuf) -> Result<(), MlfGenerateError> {
5555+ let mut file_paths = Vec::new();
5656+5757+ for pattern in input_patterns {
5858+ if pattern.contains('*') || pattern.contains('?') {
5959+ for entry in glob::glob(&pattern).map_err(|source| MlfGenerateError::InvalidGlob {
6060+ pattern: pattern.clone(),
6161+ source,
6262+ })? {
6363+ let path = entry.map_err(|source| MlfGenerateError::GlobError { source })?;
6464+ file_paths.push(path);
6565+ }
6666+ } else {
6767+ file_paths.push(PathBuf::from(pattern));
6868+ }
6969+ }
7070+7171+ std::fs::create_dir_all(&output_dir).map_err(|source| MlfGenerateError::WriteOutput {
7272+ path: output_dir.display().to_string(),
7373+ source,
7474+ })?;
7575+7676+ let mut errors = Vec::new();
7777+ let mut success_count = 0;
7878+7979+ for file_path in file_paths {
8080+ let source = match std::fs::read_to_string(&file_path) {
8181+ Ok(s) => s,
8282+ Err(source) => {
8383+ errors.push((
8484+ file_path.display().to_string(),
8585+ format!("Failed to read file: {}", source),
8686+ ));
8787+ continue;
8888+ }
8989+ };
9090+9191+ let json: Value = match serde_json::from_str(&source) {
9292+ Ok(j) => j,
9393+ Err(source) => {
9494+ errors.push((
9595+ file_path.display().to_string(),
9696+ format!("Failed to parse JSON: {}", source),
9797+ ));
9898+ continue;
9999+ }
100100+ };
101101+102102+ let mlf_content = match generate_mlf_from_json(&json) {
103103+ Ok(content) => content,
104104+ Err(e) => {
105105+ errors.push((file_path.display().to_string(), format!("{:?}", e)));
106106+ continue;
107107+ }
108108+ };
109109+110110+ // Extract namespace from JSON "id" field
111111+ let namespace = json
112112+ .get("id")
113113+ .and_then(|v| v.as_str())
114114+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
115115+ message: "Missing 'id' field in lexicon".to_string(),
116116+ })?;
117117+118118+ // Create output path from namespace
119119+ let mut output_path = output_dir.clone();
120120+ for segment in namespace.split('.') {
121121+ output_path.push(segment);
122122+ }
123123+ if let Err(source) = std::fs::create_dir_all(&output_path.parent().unwrap()) {
124124+ errors.push((
125125+ file_path.display().to_string(),
126126+ format!("Failed to create directory: {}", source),
127127+ ));
128128+ continue;
129129+ }
130130+ output_path.set_extension("mlf");
131131+132132+ if let Err(source) = std::fs::write(&output_path, mlf_content) {
133133+ errors.push((
134134+ output_path.display().to_string(),
135135+ format!("Failed to write file: {}", source),
136136+ ));
137137+ continue;
138138+ }
139139+140140+ println!("Generated: {}", output_path.display());
141141+ success_count += 1;
142142+ }
143143+144144+ if !errors.is_empty() {
145145+ eprintln!(
146146+ "\n{} file(s) generated successfully, {} error(s) encountered:\n",
147147+ success_count,
148148+ errors.len()
149149+ );
150150+ for (path, error) in &errors {
151151+ eprintln!(" {} - {}", path, error);
152152+ }
153153+ eprintln!();
154154+ return Err(MlfGenerateError::InvalidLexicon {
155155+ message: format!("{} errors total", errors.len()),
156156+ });
157157+ }
158158+159159+ println!("\nSuccessfully generated {} file(s)", success_count);
160160+ Ok(())
161161+}
162162+163163+pub fn generate_mlf_from_json(json: &Value) -> Result<String, MlfGenerateError> {
164164+ let mut output = String::new();
165165+166166+ // Extract NSID to get the last segment for "main" definitions
167167+ let nsid = json
168168+ .get("id")
169169+ .and_then(|v| v.as_str())
170170+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
171171+ message: "Missing 'id' field in lexicon".to_string(),
172172+ })?;
173173+174174+ let last_segment = nsid.split('.').last().unwrap_or("main");
175175+176176+ let defs = json.get("defs").and_then(|v| v.as_object()).ok_or_else(|| {
177177+ MlfGenerateError::InvalidLexicon {
178178+ message: "Missing or invalid 'defs' field".to_string(),
179179+ }
180180+ })?;
181181+182182+ // Create a context to pass the current namespace to type generation
183183+ let ctx = ConversionContext {
184184+ current_namespace: nsid.to_string(),
185185+ };
186186+187187+ // Process all definitions
188188+ for (name, def) in defs {
189189+ let def_type = def.get("type").and_then(|v| v.as_str()).ok_or_else(|| {
190190+ MlfGenerateError::InvalidLexicon {
191191+ message: format!("Missing 'type' field for definition '{}'", name),
192192+ }
193193+ })?;
194194+195195+ match def_type {
196196+ "record" => {
197197+ let mlf = generate_record(name, def, last_segment, &ctx)?;
198198+ output.push_str(&mlf);
199199+ output.push('\n');
200200+ }
201201+ "query" => {
202202+ let mlf = generate_query(name, def, last_segment, &ctx)?;
203203+ output.push_str(&mlf);
204204+ output.push('\n');
205205+ }
206206+ "procedure" => {
207207+ let mlf = generate_procedure(name, def, last_segment, &ctx)?;
208208+ output.push_str(&mlf);
209209+ output.push('\n');
210210+ }
211211+ "subscription" => {
212212+ let mlf = generate_subscription(name, def, last_segment, &ctx)?;
213213+ output.push_str(&mlf);
214214+ output.push('\n');
215215+ }
216216+ "token" => {
217217+ let mlf = generate_token(name, def)?;
218218+ output.push_str(&mlf);
219219+ output.push('\n');
220220+ }
221221+ "object" => {
222222+ let mlf = generate_def_type(name, def, last_segment, &ctx)?;
223223+ output.push_str(&mlf);
224224+ output.push('\n');
225225+ }
226226+ _ => {
227227+ // Unknown type, skip
228228+ }
229229+ }
230230+ }
231231+232232+ Ok(output)
233233+}
234234+235235+struct ConversionContext {
236236+ current_namespace: String,
237237+}
238238+239239+/// Reserved words in MLF that need to be escaped
240240+const RESERVED_WORDS: &[&str] = &[
241241+ "main", "record", "query", "procedure", "subscription", "token", "def", "type", "use",
242242+ "pub", "alias", "namespace", "constrained", "error", "unit", "null", "boolean",
243243+ "integer", "string", "bytes", "blob", "unknown", "array", "object", "union", "ref",
244244+];
245245+246246+/// Escape a name if it's a reserved word
247247+fn escape_name(name: &str) -> String {
248248+ if RESERVED_WORDS.contains(&name) {
249249+ format!("`{}`", name)
250250+ } else {
251251+ name.to_string()
252252+ }
253253+}
254254+255255+fn generate_record(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
256256+ let mut output = String::new();
257257+258258+ // Add doc comment if present
259259+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
260260+ if !desc.is_empty() {
261261+ for line in desc.lines() {
262262+ output.push_str(&format!("/// {}\n", line));
263263+ }
264264+ }
265265+ }
266266+267267+ // Add @main annotation for "main" definitions
268268+ if name == "main" {
269269+ output.push_str("@main\n");
270270+ }
271271+272272+ // Use last segment of NSID for "main" definitions
273273+ let record_name = if name == "main" {
274274+ escape_name(last_segment)
275275+ } else {
276276+ escape_name(name)
277277+ };
278278+279279+ output.push_str(&format!("record {} {{\n", record_name));
280280+281281+ // Get the record object
282282+ let record_obj = def.get("record").and_then(|v| v.as_object()).ok_or_else(|| {
283283+ MlfGenerateError::InvalidLexicon {
284284+ message: format!("Missing 'record' field in record definition '{}'", name),
285285+ }
286286+ })?;
287287+288288+ let properties = record_obj
289289+ .get("properties")
290290+ .and_then(|v| v.as_object())
291291+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
292292+ message: format!("Missing 'properties' in record '{}'", name),
293293+ })?;
294294+295295+ let required = record_obj
296296+ .get("required")
297297+ .and_then(|v| v.as_array())
298298+ .map(|arr| {
299299+ arr.iter()
300300+ .filter_map(|v| v.as_str())
301301+ .collect::<Vec<_>>()
302302+ })
303303+ .unwrap_or_default();
304304+305305+ for (field_name, field_def) in properties {
306306+ // Add field doc comment
307307+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
308308+ if !desc.is_empty() {
309309+ for line in desc.lines() {
310310+ output.push_str(&format!(" /// {}\n", line));
311311+ }
312312+ }
313313+ }
314314+315315+ let is_required = required.contains(&field_name.as_str());
316316+ let required_marker = if is_required { "!" } else { "" };
317317+318318+ let field_type = generate_type(field_def)?;
319319+ let escaped_field_name = escape_name(field_name);
320320+ output.push_str(&format!(
321321+ " {}{}: {},\n",
322322+ escaped_field_name, required_marker, field_type
323323+ ));
324324+ }
325325+326326+ output.push_str("}\n");
327327+ Ok(output)
328328+}
329329+330330+fn generate_query(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
331331+ let mut output = String::new();
332332+333333+ // Add doc comment
334334+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
335335+ if !desc.is_empty() {
336336+ for line in desc.lines() {
337337+ output.push_str(&format!("/// {}\n", line));
338338+ }
339339+ }
340340+ }
341341+342342+ // Add @main annotation for "main" definitions
343343+ if name == "main" {
344344+ output.push_str("@main\n");
345345+ }
346346+347347+ let query_name = if name == "main" {
348348+ escape_name(last_segment)
349349+ } else {
350350+ escape_name(name)
351351+ };
352352+ output.push_str(&format!("query {}", query_name));
353353+354354+ // Parameters
355355+ output.push('(');
356356+ if let Some(params) = def.get("parameters").and_then(|v| v.as_object()) {
357357+ let properties = params.get("properties").and_then(|v| v.as_object());
358358+ let required = params
359359+ .get("required")
360360+ .and_then(|v| v.as_array())
361361+ .map(|arr| {
362362+ arr.iter()
363363+ .filter_map(|v| v.as_str())
364364+ .collect::<Vec<_>>()
365365+ })
366366+ .unwrap_or_default();
367367+368368+ if let Some(props) = properties {
369369+ let param_strs: Vec<String> = props
370370+ .iter()
371371+ .map(|(param_name, param_def)| {
372372+ let is_required = required.contains(&param_name.as_str());
373373+ let required_marker = if is_required { "!" } else { "" };
374374+ let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
375375+ let escaped_param_name = escape_name(param_name);
376376+377377+ // Add doc comment inline if present
378378+ let mut result = String::new();
379379+ if let Some(desc) = param_def.get("description").and_then(|v| v.as_str()) {
380380+ if !desc.is_empty() {
381381+ result.push_str(&format!("\n /// {}\n ", desc));
382382+ }
383383+ }
384384+ result.push_str(&format!("{}{}: {}", escaped_param_name, required_marker, param_type));
385385+ result
386386+ })
387387+ .collect();
388388+
389389+ if !param_strs.is_empty() {
390390+ output.push_str(&param_strs.join(","));
391391+ }
392392+ }
393393+ }
394394+ output.push(')');
395395+396396+ // Output type
397397+ if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
398398+ if let Some(schema) = output_obj.get("schema") {
399399+ let return_type = generate_type(schema)?;
400400+ output.push_str(&format!(": {}", return_type));
401401+402402+ // Check for errors
403403+ if let Some(errors) = output_obj.get("errors").and_then(|v| v.as_object()) {
404404+ output.push_str(" | error {\n");
405405+ for (error_name, error_def) in errors {
406406+ if let Some(desc) = error_def.get("description").and_then(|v| v.as_str()) {
407407+ if !desc.is_empty() {
408408+ output.push_str(&format!(" /// {}\n", desc));
409409+ }
410410+ }
411411+ output.push_str(&format!(" {},\n", error_name));
412412+ }
413413+ output.push('}');
414414+ }
415415+ }
416416+ }
417417+418418+ output.push_str(";\n");
419419+ Ok(output)
420420+}
421421+422422+fn generate_procedure(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
423423+ let mut output = String::new();
424424+425425+ // Add doc comment
426426+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
427427+ if !desc.is_empty() {
428428+ for line in desc.lines() {
429429+ output.push_str(&format!("/// {}\n", line));
430430+ }
431431+ }
432432+ }
433433+434434+ // Add @main annotation for "main" definitions
435435+ if name == "main" {
436436+ output.push_str("@main\n");
437437+ }
438438+439439+ let procedure_name = if name == "main" {
440440+ escape_name(last_segment)
441441+ } else {
442442+ escape_name(name)
443443+ };
444444+ output.push_str(&format!("procedure {}", procedure_name));
445445+446446+ // Input parameters
447447+ output.push('(');
448448+ if let Some(input) = def.get("input").and_then(|v| v.as_object()) {
449449+ if let Some(schema) = input.get("schema").and_then(|v| v.as_object()) {
450450+ let properties = schema.get("properties").and_then(|v| v.as_object());
451451+ let required = schema
452452+ .get("required")
453453+ .and_then(|v| v.as_array())
454454+ .map(|arr| {
455455+ arr.iter()
456456+ .filter_map(|v| v.as_str())
457457+ .collect::<Vec<_>>()
458458+ })
459459+ .unwrap_or_default();
460460+461461+ if let Some(props) = properties {
462462+ let param_strs: Vec<String> = props
463463+ .iter()
464464+ .map(|(param_name, param_def)| {
465465+ let is_required = required.contains(&param_name.as_str());
466466+ let required_marker = if is_required { "!" } else { "" };
467467+ let param_type =
468468+ generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
469469+ let escaped_param_name = escape_name(param_name);
470470+471471+ // Add doc comment inline if present
472472+ let mut result = String::new();
473473+ if let Some(desc) = param_def.get("description").and_then(|v| v.as_str()) {
474474+ if !desc.is_empty() {
475475+ result.push_str(&format!("\n /// {}\n ", desc));
476476+ }
477477+ }
478478+ result.push_str(&format!(
479479+ "{}{}: {}",
480480+ escaped_param_name, required_marker, param_type
481481+ ));
482482+ result
483483+ })
484484+ .collect();
485485+
486486+ if !param_strs.is_empty() {
487487+ output.push_str(&param_strs.join(","));
488488+ }
489489+ }
490490+ }
491491+ }
492492+ output.push(')');
493493+494494+ // Output type
495495+ if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
496496+ if let Some(schema) = output_obj.get("schema") {
497497+ let return_type = generate_type(schema)?;
498498+ output.push_str(&format!(": {}", return_type));
499499+500500+ // Check for errors
501501+ if let Some(errors) = output_obj.get("errors").and_then(|v| v.as_object()) {
502502+ output.push_str(" | error {\n");
503503+ for (error_name, error_def) in errors {
504504+ if let Some(desc) = error_def.get("description").and_then(|v| v.as_str()) {
505505+ if !desc.is_empty() {
506506+ output.push_str(&format!(" /// {}\n", desc));
507507+ }
508508+ }
509509+ output.push_str(&format!(" {},\n", error_name));
510510+ }
511511+ output.push('}');
512512+ }
513513+ }
514514+ }
515515+516516+ output.push_str(";\n");
517517+ Ok(output)
518518+}
519519+520520+fn generate_subscription(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
521521+ let mut output = String::new();
522522+523523+ // Add doc comment
524524+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
525525+ if !desc.is_empty() {
526526+ for line in desc.lines() {
527527+ output.push_str(&format!("/// {}\n", line));
528528+ }
529529+ }
530530+ }
531531+532532+ // Add @main annotation for "main" definitions
533533+ if name == "main" {
534534+ output.push_str("@main\n");
535535+ }
536536+537537+ let subscription_name = if name == "main" {
538538+ escape_name(last_segment)
539539+ } else {
540540+ escape_name(name)
541541+ };
542542+ output.push_str(&format!("subscription {}", subscription_name));
543543+544544+ // Parameters
545545+ output.push('(');
546546+ if let Some(params) = def.get("parameters").and_then(|v| v.as_object()) {
547547+ let properties = params.get("properties").and_then(|v| v.as_object());
548548+ let required = params
549549+ .get("required")
550550+ .and_then(|v| v.as_array())
551551+ .map(|arr| {
552552+ arr.iter()
553553+ .filter_map(|v| v.as_str())
554554+ .collect::<Vec<_>>()
555555+ })
556556+ .unwrap_or_default();
557557+558558+ if let Some(props) = properties {
559559+ let param_strs: Vec<String> = props
560560+ .iter()
561561+ .map(|(param_name, param_def)| {
562562+ let is_required = required.contains(&param_name.as_str());
563563+ let required_marker = if is_required { "!" } else { "" };
564564+ let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
565565+ let escaped_param_name = escape_name(param_name);
566566+567567+ format!("{}{}: {}", escaped_param_name, required_marker, param_type)
568568+ })
569569+ .collect();
570570+
571571+ if !param_strs.is_empty() {
572572+ output.push_str(&param_strs.join(", "));
573573+ }
574574+ }
575575+ }
576576+ output.push(')');
577577+578578+ // Message types
579579+ if let Some(message) = def.get("message").and_then(|v| v.as_object()) {
580580+ if let Some(schema) = message.get("schema") {
581581+ let message_type = generate_type(schema)?;
582582+ output.push_str(&format!(": {}", message_type));
583583+ }
584584+ }
585585+586586+ output.push_str(";\n");
587587+ Ok(output)
588588+}
589589+590590+fn generate_token(name: &str, def: &Value) -> Result<String, MlfGenerateError> {
591591+ let mut output = String::new();
592592+593593+ // Add doc comment
594594+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
595595+ if !desc.is_empty() {
596596+ for line in desc.lines() {
597597+ output.push_str(&format!("/// {}\n", line));
598598+ }
599599+ }
600600+ }
601601+602602+ let escaped_name = escape_name(name);
603603+ output.push_str(&format!("token {};\n", escaped_name));
604604+ Ok(output)
605605+}
606606+607607+fn generate_def_type(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
608608+ let mut output = String::new();
609609+610610+ // Add @main annotation for "main" definitions
611611+ if name == "main" {
612612+ output.push_str("@main\n");
613613+ }
614614+615615+ // Use last segment of NSID for "main" definitions
616616+ let def_name = if name == "main" {
617617+ escape_name(last_segment)
618618+ } else {
619619+ escape_name(name)
620620+ };
621621+622622+ output.push_str(&format!("def type {} = ", def_name));
623623+ let type_str = generate_type_with_indent(def, 0)?;
624624+ output.push_str(&type_str);
625625+ output.push_str(";\n");
626626+627627+ Ok(output)
628628+}
629629+630630+fn generate_type_with_indent(type_def: &Value, indent_level: usize, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
631631+ let type_name = type_def.get("type").and_then(|v| v.as_str());
632632+633633+ match type_name {
634634+ Some("object") => {
635635+ let indent = " ".repeat(indent_level);
636636+ let field_indent = " ".repeat(indent_level + 1);
637637+638638+ let mut output = String::from("{\n");
639639+ let properties = type_def
640640+ .get("properties")
641641+ .and_then(|v| v.as_object())
642642+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
643643+ message: "Missing 'properties' in object type".to_string(),
644644+ })?;
645645+646646+ let required = type_def
647647+ .get("required")
648648+ .and_then(|v| v.as_array())
649649+ .map(|arr| {
650650+ arr.iter()
651651+ .filter_map(|v| v.as_str())
652652+ .collect::<Vec<_>>()
653653+ })
654654+ .unwrap_or_default();
655655+656656+ for (field_name, field_def) in properties {
657657+ // Add field doc comment
658658+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
659659+ if !desc.is_empty() {
660660+ for line in desc.lines() {
661661+ output.push_str(&format!("{}/// {}\n", field_indent, line));
662662+ }
663663+ }
664664+ }
665665+666666+ let is_required = required.contains(&field_name.as_str());
667667+ let required_marker = if is_required { "!" } else { "" };
668668+ let field_type = generate_type_with_indent(field_def, indent_level + 1)?;
669669+ let escaped_field_name = escape_name(field_name);
670670+ output.push_str(&format!(
671671+ "{}{}{}: {},\n",
672672+ field_indent, escaped_field_name, required_marker, field_type
673673+ ));
674674+ }
675675+676676+ output.push_str(&format!("{}}}", indent));
677677+ Ok(output)
678678+ }
679679+ _ => generate_type(type_def),
680680+ }
681681+}
682682+683683+fn generate_type(type_def: &Value, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
684684+ let type_name = type_def.get("type").and_then(|v| v.as_str());
685685+686686+ match type_name {
687687+ Some("null") => Ok("null".to_string()),
688688+ Some("boolean") => Ok("boolean".to_string()),
689689+ Some("integer") => {
690690+ let mut result = "integer".to_string();
691691+ result = apply_constraints(result, type_def);
692692+ Ok(result)
693693+ }
694694+ Some("string") => {
695695+ // Check if this is a format string that maps to a prelude type
696696+ if let Some(format) = type_def.get("format").and_then(|v| v.as_str()) {
697697+ let prelude_type = match format {
698698+ "did" => "Did",
699699+ "at-uri" => "AtUri",
700700+ "at-identifier" => "AtIdentifier",
701701+ "handle" => "Handle",
702702+ "datetime" => "Datetime",
703703+ "uri" => "Uri",
704704+ "cid" => "Cid",
705705+ "nsid" => "Nsid",
706706+ "tid" => "Tid",
707707+ "record-key" => "RecordKey",
708708+ "language" => "Language",
709709+ _ => {
710710+ // Unknown format, fall through to normal string with constraints
711711+ let mut result = "string".to_string();
712712+ result = apply_constraints(result, type_def);
713713+ return Ok(result);
714714+ }
715715+ };
716716+ // If it's a known prelude type with only the format constraint, use the prelude type directly
717717+ // Check if there are other constraints besides format
718718+ let has_other_constraints = type_def.get("minLength").is_some()
719719+ || type_def.get("maxLength").is_some()
720720+ || type_def.get("minGraphemes").is_some()
721721+ || type_def.get("maxGraphemes").is_some()
722722+ || type_def.get("enum").is_some()
723723+ || type_def.get("knownValues").is_some()
724724+ || type_def.get("default").is_some();
725725+726726+ if !has_other_constraints {
727727+ return Ok(prelude_type.to_string());
728728+ }
729729+ }
730730+731731+ let mut result = "string".to_string();
732732+ result = apply_constraints(result, type_def);
733733+ Ok(result)
734734+ }
735735+ Some("bytes") => Ok("bytes".to_string()),
736736+ Some("blob") => {
737737+ let mut result = "blob".to_string();
738738+ result = apply_constraints(result, type_def);
739739+ Ok(result)
740740+ }
741741+ Some("unknown") => Ok("unknown".to_string()),
742742+ Some("array") => {
743743+ let items = type_def.get("items").ok_or_else(|| {
744744+ MlfGenerateError::InvalidLexicon {
745745+ message: "Missing 'items' in array type".to_string(),
746746+ }
747747+ })?;
748748+749749+ // Check if items have constraints
750750+ let items_obj = items.as_object();
751751+ let has_item_constraints = items_obj.map_or(false, |obj| {
752752+ obj.contains_key("minLength") ||
753753+ obj.contains_key("maxLength") ||
754754+ obj.contains_key("minGraphemes") ||
755755+ obj.contains_key("maxGraphemes") ||
756756+ obj.contains_key("minimum") ||
757757+ obj.contains_key("maximum") ||
758758+ obj.contains_key("enum") ||
759759+ obj.contains_key("knownValues") ||
760760+ obj.contains_key("default")
761761+ });
762762+763763+ let item_type = if has_item_constraints {
764764+ // If item has constraints, we need to wrap in parentheses to apply constraints before []
765765+ // For now, just generate the base type without item constraints
766766+ // TODO: Consider generating a type alias for complex constrained items
767767+ items.get("type")
768768+ .and_then(|t| t.as_str())
769769+ .unwrap_or("unknown")
770770+ .to_string()
771771+ } else {
772772+ generate_type(items)?
773773+ };
774774+775775+ let mut result = format!("{}[]", item_type);
776776+ result = apply_constraints(result, type_def);
777777+ Ok(result)
778778+ }
779779+ Some("object") => {
780780+ let mut output = String::from("{\n");
781781+ let properties = type_def
782782+ .get("properties")
783783+ .and_then(|v| v.as_object())
784784+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
785785+ message: "Missing 'properties' in object type".to_string(),
786786+ })?;
787787+788788+ let required = type_def
789789+ .get("required")
790790+ .and_then(|v| v.as_array())
791791+ .map(|arr| {
792792+ arr.iter()
793793+ .filter_map(|v| v.as_str())
794794+ .collect::<Vec<_>>()
795795+ })
796796+ .unwrap_or_default();
797797+798798+ for (field_name, field_def) in properties {
799799+ // Add field doc comment
800800+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
801801+ if !desc.is_empty() {
802802+ for line in desc.lines() {
803803+ output.push_str(&format!(" /// {}\n", line));
804804+ }
805805+ }
806806+ }
807807+808808+ let is_required = required.contains(&field_name.as_str());
809809+ let required_marker = if is_required { "!" } else { "" };
810810+ let field_type = generate_type(field_def)?;
811811+ let escaped_field_name = escape_name(field_name);
812812+ output.push_str(&format!(
813813+ " {}{}: {},\n",
814814+ escaped_field_name, required_marker, field_type
815815+ ));
816816+ }
817817+818818+ output.push_str(" }");
819819+ Ok(output)
820820+ }
821821+ Some("union") => {
822822+ let refs = type_def.get("refs").and_then(|v| v.as_array()).ok_or_else(|| {
823823+ MlfGenerateError::InvalidLexicon {
824824+ message: "Missing 'refs' in union type".to_string(),
825825+ }
826826+ })?;
827827+828828+ let type_strs: Vec<String> = refs
829829+ .iter()
830830+ // Union refs are plain strings ("#local" or "ns.id#defName"), not type
830830+ // objects; resolve them directly, since recursing into generate_type on
830830+ // a string would render every ref as "unknown".
830830+ .map(|r| match r.as_str() {
830830+ Some(s) => match s.split_once('#') {
830830+ Some(("", local)) => local.to_string(),
830830+ Some((ns, def)) => format!("{}.{}", ns, def),
830830+ None => s.to_string(),
830830+ },
830830+ None => "unknown".to_string(),
830830+ })
831831+ .collect();
832832+833833+ let mut result = type_strs.join(" | ");
834834+835835+ // Check if closed
836836+ if type_def.get("closed").and_then(|v| v.as_bool()).unwrap_or(false) {
837837+ result.push_str(" | !");
838838+ }
839839+840840+ Ok(result)
841841+ }
842842+ Some("ref") => {
843843+ if let Some(ref_str) = type_def.get("ref").and_then(|v| v.as_str()) {
844844+ // Handle references:
845845+ // "#defName" -> "defName" (local reference, same file)
846846+ // "namespace.id#defName" -> Check if same namespace, if so use "defName", else use full path
847847+848848+ if let Some(stripped) = ref_str.strip_prefix('#') {
849849+ // Local reference: #defName -> defName
850850+ Ok(stripped.to_string())
851851+ } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
852852+ // Check if this is the current namespace
853853+ // For now, we'll just use the def name if it's the same namespace
854854+ // Note: This requires passing context through, which we'll add
855855+ // For external refs, we keep the full NSID format
856856+ Ok(format!("{}.{}", namespace, def_name))
857857+ } else {
858858+ // No # at all - shouldn't happen in valid lexicons, but handle gracefully
859859+ Ok(ref_str.to_string())
860860+ }
861861+ } else {
862862+ Err(MlfGenerateError::InvalidLexicon {
863863+ message: "Missing 'ref' in ref type".to_string(),
864864+ })
865865+ }
866866+ }
867867+ _ => Ok("unknown".to_string()),
868868+ }
869869+}
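The `ref` branch above reduces to a small rewrite rule on the ref string. Isolated as a standalone sketch for clarity (`resolve_ref` is an illustrative name, not a function in this codebase):

```rust
// Sketch of the ref-resolution rule used by the "ref" branch above:
//   "#defName"             -> local reference, keep just "defName"
//   "ns.id#defName"        -> external reference, rendered as "ns.id.defName"
//   anything without a '#' -> passed through unchanged
fn resolve_ref(ref_str: &str) -> String {
    if let Some(stripped) = ref_str.strip_prefix('#') {
        stripped.to_string()
    } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
        format!("{}.{}", namespace, def_name)
    } else {
        ref_str.to_string()
    }
}

fn main() {
    assert_eq!(resolve_ref("#postMeta"), "postMeta");
    assert_eq!(resolve_ref("com.example.thread#main"), "com.example.thread.main");
    assert_eq!(resolve_ref("app.bsky.feed.post"), "app.bsky.feed.post");
    println!("ok");
}
```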
870870+871871+fn apply_constraints(mut type_str: String, type_def: &Value) -> String {
872872+ let mut constraints = Vec::new();
873873+874874+ if let Some(min_length) = type_def.get("minLength").and_then(|v| v.as_i64()) {
875875+ constraints.push(format!("minLength: {}", min_length));
876876+ }
877877+ if let Some(max_length) = type_def.get("maxLength").and_then(|v| v.as_i64()) {
878878+ constraints.push(format!("maxLength: {}", max_length));
879879+ }
880880+ if let Some(min_graphemes) = type_def.get("minGraphemes").and_then(|v| v.as_i64()) {
881881+ constraints.push(format!("minGraphemes: {}", min_graphemes));
882882+ }
883883+ if let Some(max_graphemes) = type_def.get("maxGraphemes").and_then(|v| v.as_i64()) {
884884+ constraints.push(format!("maxGraphemes: {}", max_graphemes));
885885+ }
886886+ if let Some(minimum) = type_def.get("minimum").and_then(|v| v.as_i64()) {
887887+ constraints.push(format!("minimum: {}", minimum));
888888+ }
889889+ if let Some(maximum) = type_def.get("maximum").and_then(|v| v.as_i64()) {
890890+ constraints.push(format!("maximum: {}", maximum));
891891+ }
892892+ if let Some(format) = type_def.get("format").and_then(|v| v.as_str()) {
893893+ constraints.push(format!("format: \"{}\"", format));
894894+ }
895895+ if let Some(enum_vals) = type_def.get("enum").and_then(|v| v.as_array()) {
896896+ let vals: Vec<String> = enum_vals
897897+ .iter()
898898+ .filter_map(|v| v.as_str())
899899+ .map(|s| format!("\"{}\"", s))
900900+ .collect();
901901+ constraints.push(format!("enum: [{}]", vals.join(", ")));
902902+ }
903903+ if let Some(known_vals) = type_def.get("knownValues").and_then(|v| v.as_array()) {
904904+ let vals: Vec<String> = known_vals
905905+ .iter()
906906+ .filter_map(|v| v.as_str())
907907+ .map(|s| format!("\"{}\"", s))
908908+ .collect();
909909+ constraints.push(format!("knownValues: [{}]", vals.join(", ")));
910910+ }
911911+ if let Some(accept) = type_def.get("accept").and_then(|v| v.as_array()) {
912912+ let mimes: Vec<String> = accept
913913+ .iter()
914914+ .filter_map(|v| v.as_str())
915915+ .map(|s| format!("\"{}\"", s))
916916+ .collect();
917917+ constraints.push(format!("accept: [{}]", mimes.join(", ")));
918918+ }
919919+ if let Some(max_size) = type_def.get("maxSize").and_then(|v| v.as_i64()) {
920920+ constraints.push(format!("maxSize: {}", max_size));
921921+ }
922922+ if let Some(default) = type_def.get("default") {
923923+ let default_str = match default {
924924+ Value::String(s) => format!("\"{}\"", s),
925925+ Value::Number(n) => n.to_string(),
926926+ Value::Bool(b) => b.to_string(),
927927+ _ => "null".to_string(),
928928+ };
929929+ constraints.push(format!("default: {}", default_str));
930930+ }
931931+932932+ if !constraints.is_empty() {
933933+ type_str.push_str(" constrained {\n");
934934+ for constraint in &constraints {
935935+ type_str.push_str(&format!(" {},\n", constraint));
936936+ }
937937+ type_str.push_str(" }");
938938+ }
939939+940940+ type_str
941941+}
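Taken together, `apply_constraints` collects `key: value` pairs and renders them as a trailing `constrained { ... }` block on the base type. A minimal self-contained sketch of that rendering step (the function name and exact indentation here are illustrative, not the generator's literal output):

```rust
// Sketch of the constraint rendering done by apply_constraints above:
// each collected "key: value" pair is appended inside `constrained { ... }`.
fn render_constrained(base: &str, constraints: &[(&str, &str)]) -> String {
    let mut out = base.to_string();
    if !constraints.is_empty() {
        out.push_str(" constrained {\n");
        for (key, value) in constraints {
            out.push_str(&format!("    {}: {},\n", key, value));
        }
        out.push_str("  }");
    }
    out
}

fn main() {
    // No constraints: the base type is returned untouched.
    assert_eq!(render_constrained("bytes", &[]), "bytes");

    let rendered = render_constrained("string", &[("maxLength", "200"), ("format", "\"uri\"")]);
    assert_eq!(
        rendered,
        "string constrained {\n    maxLength: 200,\n    format: \"uri\",\n  }"
    );
    println!("{}", rendered);
}
```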
···11+// Test imports - new syntax with dot
22+use com.example.forum.profile;
33+use com.example.thread.{ main };
44+use com.example.types.{ author, postRef };
55+use com.example.post.{ main as Post };
66+use com.example.user.{ main as User, userMeta };
77+88+// Test wildcard and alias
99+use com.example.forum.*;
1010+use com.example.types as Types;
1111+112/// A simple post record
213record post {
314 /// Post text
···718 },
819 /// Creation timestamp
920 createdAt!: Datetime,
2121+ author: profile,
2222+ thread: Post,
1023}
+107-6
website/content/docs/cli/02-configuration.md
···41414242This directory is used by:
4343- `mlf check` (when run without arguments)
4444-- `mlf generate` (when run without arguments)
4444+- `mlf generate` commands (when run without `--input` or `--root`)
4545+4646+The source directory also serves as the default **root** for namespace calculation. For example, a file at `./lexicons/com/example/thread.mlf` will have the namespace `com.example.thread`.
45474648### Output Configurations
4749···6365[[output]]
6466type = "rust"
6567directory = "./src/lexicons"
6868+6969+[[output]]
7070+type = "mlf"
7171+directory = "./converted"
6672```
67736874**Supported types:**
···7076- `"typescript"` - Generate TypeScript types
7177- `"go"` - Generate Go structs
7278- `"rust"` - Generate Rust structs
7979+- `"mlf"` - Convert JSON lexicons to MLF
73807481When you run `mlf generate` without arguments, it will generate all configured outputs.
7582···9610397104```bash
98105mlf check
9999-# Equivalent to: mlf check "./lexicons/**/*.mlf"
106106+# Uses: input=./lexicons, root=./lexicons
107107+```
108108+109109+You can override with explicit arguments:
110110+111111+```bash
112112+mlf check ./custom-lexicons --root ./custom-lexicons
100113```
101114102115### `mlf generate`
···108121# Runs all [[output]] configurations
109122```
110123124124+### `mlf generate lexicon`
125125+126126+When run without arguments, uses configuration defaults:
127127+128128+```bash
129129+mlf generate lexicon
130130+# Uses: input=./lexicons, output=first lexicon output, root=./lexicons
131131+```
132132+133133+Override with explicit arguments:
134134+135135+```bash
136136+mlf generate lexicon -i ./src -o ./dist --root ./src
137137+```
138138+139139+### `mlf generate code`
140140+141141+When run without arguments, uses configuration defaults:
142142+143143+```bash
144144+mlf generate code
145145+# Uses: generator=first non-lexicon output, input=./lexicons, output=matching directory
146146+```
147147+148148+Override with explicit arguments:
149149+150150+```bash
151151+mlf generate code -g typescript -i ./src -o ./types --root ./src
152152+```
153153+154154+### `mlf generate mlf`
155155+156156+When run without `--output`, uses configuration defaults:
157157+158158+```bash
159159+mlf generate mlf -i external.json
160160+# Uses: output=first mlf output
161161+```
162162+111163### `mlf fetch`
112164113165When run without arguments, fetches all dependencies:
···124176# Downloads lexicons AND adds to dependencies array
125177```
126178179179+## Understanding Namespace Calculation
180180+181181+The `source.directory` in `mlf.toml` acts as the **root** for namespace calculation. A file's path relative to this root becomes its namespace.
182182+183183+**Example:**
184184+185185+```toml
186186+[source]
187187+directory = "./lexicons"
188188+```
189189+190190+| File Path | Namespace |
191191+|-----------|-----------|
192192+| `./lexicons/com/example/thread.mlf` | `com.example.thread` |
193193+| `./lexicons/com/example/types/post.mlf` | `com.example.types.post` |
194194+| `./lexicons/app/bsky/feed/post.mlf` | `app.bsky.feed.post` |
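The mapping in this table can be sketched as: strip the root prefix, drop the `.mlf` extension, and join the remaining path components with dots. A simplified illustration in Rust (the `namespace_for` helper is hypothetical, not the CLI's actual implementation):

```rust
use std::path::Path;

// Simplified namespace calculation: the file path relative to the root,
// minus the .mlf extension, with separators replaced by dots.
fn namespace_for(root: &str, file: &str) -> Option<String> {
    let rel = Path::new(file).strip_prefix(root).ok()?;
    let parts: Vec<String> = rel
        .with_extension("")
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("."))
}

fn main() {
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/com/example/thread.mlf").as_deref(),
        Some("com.example.thread")
    );
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/com/example/types/post.mlf").as_deref(),
        Some("com.example.types.post")
    );
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/app/bsky/feed/post.mlf").as_deref(),
        Some("app.bsky.feed.post")
    );
    println!("ok");
}
```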
195195+196196+If your files are in a different location, use the `--root` flag:
197197+198198+```bash
199199+mlf generate lexicon -i ./src/schemas -o ./dist --root ./src/schemas
200200+```
201201+202202+Now `./src/schemas/com/example/thread.mlf` → namespace `com.example.thread`
203203+127204## Complete Example
128205129206Here's a complete `mlf.toml` for a TypeScript project using ATProto lexicons:
···1972741. **Commit `mlf.toml`** - Version control your configuration
1982752. **Don't commit `.mlf/`** - Let each developer fetch dependencies
1992763. **Use semantic namespaces** - Organize lexicons by domain
200200-4. **Multiple outputs** - Generate both lexicons and code simultaneously
201201-5. **CI/CD integration** - Run `mlf check` in your CI pipeline
277277+4. **Set a consistent root** - Keep your source directory as the root for namespace calculation
278278+5. **Multiple outputs** - Generate both lexicons and code simultaneously
279279+6. **CI/CD integration** - Run `mlf check` in your CI pipeline
202280203281## Override Configuration
204282···206284207285```bash
208286# Override source directory
209209-mlf check "./other-lexicons/**/*.mlf"
287287+mlf check ./other-lexicons --root ./other-lexicons
210288211289# Override output
212212-mlf generate lexicon -i custom.mlf -o ./custom-output/
290290+mlf generate lexicon -i custom.mlf -o ./custom-output/ --root ./
291291+292292+# Override generator
293293+mlf generate code -g go -i ./lexicons -o ./go-types --root ./lexicons
213294214295# Fetch specific namespace (ignoring dependencies list)
215296mlf fetch stream.place
216297```
298298+299299+## Multiple Projects
300300+301301+If you have multiple MLF projects, each can have its own `mlf.toml`:
302302+303303+```
304304+my-app/
305305+├── mlf.toml
306306+├── lexicons/
307307+│   └── com/example/app/...
308308+└── dist/
309309+310310+shared-lexicons/
311311+├── mlf.toml
312312+├── lexicons/
313313+│   └── com/example/shared/...
314314+└── dist/
315315+```
316316+317317+Commands always use the `mlf.toml` in the current directory or nearest parent directory.
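That lookup amounts to walking up the directory tree until an `mlf.toml` is found. A minimal sketch, assuming a hypothetical `find_config` helper rather than the CLI's real code:

```rust
use std::env;
use std::fs::{self, File};
use std::path::{Path, PathBuf};

// Sketch of "nearest mlf.toml" discovery: walk upward from a starting
// directory until one containing mlf.toml is found.
fn find_config(start: &Path) -> Option<PathBuf> {
    let mut dir = start;
    loop {
        let candidate = dir.join("mlf.toml");
        if candidate.is_file() {
            return Some(candidate);
        }
        dir = dir.parent()?;
    }
}

fn main() {
    // Build a throwaway project tree under the system temp directory.
    let root = env::temp_dir().join("mlf-config-demo");
    let nested = root.join("lexicons").join("com").join("example");
    fs::create_dir_all(&nested).unwrap();
    File::create(root.join("mlf.toml")).unwrap();

    // Starting deep inside the project still finds the project root's config.
    let found = find_config(&nested).unwrap();
    assert_eq!(found, root.join("mlf.toml"));
    println!("found: {}", found.display());
}
```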
-14
website/content/docs/cli/04-check.md
···168168169169**Solution:** Correct the field type to match the schema
170170171171-## Integration with CI/CD
172172-173173-Use `mlf check` in your continuous integration pipeline:
174174-175175-```yaml
176176-# GitHub Actions example
177177-- name: Validate MLF Lexicons
178178- run: |
179179- mlf fetch
180180- mlf check
181181-```
182182-183183-This ensures all lexicons remain valid as your project evolves.
184184-185171## Tips
1861721871731. **Use configuration** - Set up `mlf.toml` to avoid typing paths repeatedly
+159-32
website/content/docs/cli/06-generate.md
···1313mlf generate
14141515# Generate JSON lexicons
1616-mlf generate lexicon -i <INPUT> -o <OUTPUT>
1616+mlf generate lexicon [OPTIONS]
17171818# Generate code in a specific language
1919-mlf generate code -g <GENERATOR> -i <INPUT> -o <OUTPUT>
1919+mlf generate code [OPTIONS]
20202121# Convert JSON lexicons to MLF
2222-mlf generate mlf -i <INPUT> -o <OUTPUT>
2222+mlf generate mlf -i <INPUT> [OPTIONS]
2323```
24242525+All generate commands can use defaults from `mlf.toml`, making them easier to run without arguments.
2626+2527## Generate All Outputs
26282729When run without a subcommand, `mlf generate` uses your `mlf.toml` configuration to generate all specified outputs:
28302931```toml
3232+[source]
3333+directory = "./lexicons"
3434+3035[[output]]
3136type = "lexicon"
3237directory = "./dist/lexicons"
···6772Generate ATProto JSON lexicons from MLF files.
68736974```bash
7070-mlf generate lexicon -i <INPUT> -o <OUTPUT> [OPTIONS]
7575+mlf generate lexicon [OPTIONS]
7176```
72777378**Options:**
7474-- `-i, --input <INPUT>` - Input MLF files (glob patterns supported, can be specified multiple times)
7575-- `-o, --output <OUTPUT>` - Output directory (required)
7979+- `-i, --input <INPUT>` - Input MLF file(s) or directory (defaults to `source.directory` from mlf.toml)
8080+- `-o, --output <OUTPUT>` - Output directory (defaults to first `type = "lexicon"` output from mlf.toml)
8181+- `--root <ROOT>` - Root directory for namespace calculation (defaults to `source.directory` from mlf.toml)
7682- `--flat` - Use flat file structure (e.g., `com.example.thread.json`)
77837878-**Examples:**
8484+### Using mlf.toml Defaults
8585+8686+With this configuration:
8787+8888+```toml
8989+[source]
9090+directory = "./lexicons"
9191+9292+[[output]]
9393+type = "lexicon"
9494+directory = "./dist/lexicons"
9595+```
9696+9797+You can run:
9898+9999+```bash
100100+mlf generate lexicon
101101+# Uses: input=./lexicons, output=./dist/lexicons, root=./lexicons
102102+```
103103+104104+### Explicit Arguments
105105+106106+You can override defaults with explicit arguments:
7910780108```bash
81109# Generate with folder structure
8282-mlf generate lexicon -i thread.mlf -o lexicons/
110110+mlf generate lexicon -i thread.mlf -o lexicons/ --root ./
83111# Creates: lexicons/com/example/thread.json
8411285113# Generate with flat structure
8686-mlf generate lexicon -i thread.mlf -o lexicons/ --flat
114114+mlf generate lexicon -i thread.mlf -o lexicons/ --root ./ --flat
87115# Creates: lexicons/com.example.thread.json
881168989-# Generate from multiple files
9090-mlf generate lexicon -i thread.mlf -i reply.mlf -o lexicons/
117117+# Generate from directory
118118+mlf generate lexicon -i ./src/lexicons -o dist/lexicons/ --root ./src/lexicons
119119+```
911209292-# Generate from glob pattern
9393-mlf generate lexicon -i "src/**/*.mlf" -o dist/lexicons/
121121+### Understanding --root
122122+123123+The `--root` flag tells MLF which directory namespaces are calculated relative to. For example:
124124+125125+```bash
126126+# File: ./lexicons/com/example/thread.mlf
127127+mlf generate lexicon -i ./lexicons -o ./dist --root ./lexicons
128128+# Namespace: com.example.thread (relative to ./lexicons)
129129+130130+# File: ./src/lexicons/com/example/thread.mlf
131131+mlf generate lexicon -i ./src/lexicons -o ./dist --root ./src/lexicons
132132+# Namespace: com.example.thread (relative to ./src/lexicons)
94133```
95134135135+Without `--root`, it defaults to the `source.directory` from mlf.toml or the current directory.
136136+96137---
9713898139## Generate Code
···100141Generate code in various programming languages from MLF files.
101142102143```bash
103103-mlf generate code -g <GENERATOR> -i <INPUT> -o <OUTPUT> [OPTIONS]
144144+mlf generate code [OPTIONS]
104145```
105146106147**Options:**
107107-- `-g, --generator <GENERATOR>` - Generator to use (required): `json`, `typescript`, `go`, or `rust`
108108-- `-i, --input <INPUT>` - Input MLF files (glob patterns supported, can be specified multiple times)
109109-- `-o, --output <OUTPUT>` - Output directory (required)
148148+- `-g, --generator <GENERATOR>` - Generator to use: `typescript`, `go`, or `rust` (defaults to first non-lexicon output from mlf.toml)
149149+- `-i, --input <INPUT>` - Input MLF file(s) or directory (defaults to `source.directory` from mlf.toml)
150150+- `-o, --output <OUTPUT>` - Output directory (defaults to matching output from mlf.toml)
151151+- `--root <ROOT>` - Root directory for namespace calculation (defaults to `source.directory` from mlf.toml)
110152- `--flat` - Use flat file structure
111153112154**Available Generators:**
113155114156| Generator | Output | Features |
115157|-----------|--------|----------|
116116-| `json` | `.json` | AT Protocol JSON lexicons (always available) |
117158| `typescript` | `.ts` | TypeScript interfaces with JSDoc, optional fields with `?` |
118159| `go` | `.go` | Go structs with JSON tags, proper capitalization |
119160| `rust` | `.rs` | Rust structs with serde, `Option<T>` for optional fields |
120161162162+### Using mlf.toml Defaults
163163+164164+With this configuration:
165165+166166+```toml
167167+[source]
168168+directory = "./lexicons"
169169+170170+[[output]]
171171+type = "typescript"
172172+directory = "./src/types"
173173+```
174174+175175+You can run:
176176+177177+```bash
178178+mlf generate code
179179+# Uses: generator=typescript, input=./lexicons, output=./src/types
180180+```
181181+182182+### Explicit Arguments
183183+184184+```bash
185185+# Generate TypeScript
186186+mlf generate code -g typescript -i thread.mlf -o src/types/ --root ./
187187+188188+# Generate Go
189189+mlf generate code -g go -i ./lexicons -o pkg/models/ --root ./lexicons
190190+191191+# Generate Rust with flat structure
192192+mlf generate code -g rust -i ./src -o ./generated --root ./src --flat
193193+```
194194+121195### TypeScript Example
122196123197```bash
124124-mlf generate code -g typescript -i thread.mlf -o src/types/
198198+mlf generate code -g typescript -i thread.mlf -o src/types/ --root ./
125199```
126200127201**Input MLF:**
···160234### Go Example
161235162236```bash
163163-mlf generate code -g go -i thread.mlf -o pkg/models/
237237+mlf generate code -g go -i thread.mlf -o pkg/models/ --root ./
164238```
165239166240**Generated Go:**
···184258### Rust Example
185259186260```bash
187187-mlf generate code -g rust -i thread.mlf -o src/models/
261261+mlf generate code -g rust -i thread.mlf -o src/models/ --root ./
188262```
189263190264**Generated Rust:**
···215289Convert ATProto JSON lexicons back to MLF format.
216290217291```bash
218218-mlf generate mlf -i <INPUT> -o <OUTPUT>
292292+mlf generate mlf -i <INPUT> [OPTIONS]
219293```
220294221295**Options:**
222222-- `-i, --input <INPUT>` - Input JSON lexicon files (glob patterns supported, can be specified multiple times)
223223-- `-o, --output <OUTPUT>` - Output directory (required)
296296+- `-i, --input <INPUT>` - Input JSON lexicon files (required, can be specified multiple times)
297297+- `-o, --output <OUTPUT>` - Output directory (defaults to first `type = "mlf"` output from mlf.toml)
224298225225-**Examples:**
299299+### Using mlf.toml Defaults
300300+301301+With this configuration:
302302+303303+```toml
304304+[[output]]
305305+type = "mlf"
306306+directory = "./converted"
307307+```
308308+309309+You can run:
310310+311311+```bash
312312+mlf generate mlf -i external-lexicon.json
313313+# Uses: output=./converted
314314+```
315315+316316+### Examples
226317227318```bash
228319# Convert a single JSON lexicon to MLF
···232323# Convert multiple JSON lexicons
233324mlf generate mlf -i lexicon1.json -i lexicon2.json -o ./mlf/
234325235235-# Convert using glob pattern
236236-mlf generate mlf -i "dist/lexicons/**/*.json" -o ./src/
326326+# Convert a JSON lexicon from a nested directory
327327+mlf generate mlf -i dist/lexicons/com/example/thread.json -o ./src/
237328```
238329239330**Features:**
···275366276367**Generated MLF:**
277368```mlf
278278-record main {
369369+@main
370370+record thread {
279371 title!: string constrained {
280372 maxLength: 200,
281373 },
282374 createdAt!: Datetime,
283283-};
375375+}
284376```
285377286378---
···293385294386```mlf
295387/// This is a user profile
296296-def Profile = {
388388+def type Profile = {
297389 /// The user's display name
298390 displayName: string,
299391};
···333425334426---
335427428428+## Multiple Output Targets
429429+430430+You can configure multiple generators in `mlf.toml`:
431431+432432+```toml
433433+[source]
434434+directory = "./lexicons"
435435+436436+[[output]]
437437+type = "lexicon"
438438+directory = "./dist/lexicons"
439439+440440+[[output]]
441441+type = "typescript"
442442+directory = "./src/types"
443443+444444+[[output]]
445445+type = "go"
446446+directory = "./pkg/lexicons"
447447+448448+[[output]]
449449+type = "rust"
450450+directory = "./rust-client/src/lexicons"
451451+```
452452+453453+Then run:
454454+455455+```bash
456456+mlf generate
457457+```
458458+459459+This generates all four output types from the same MLF source files.
460460+461461+---
462462+336463## Tips
337464338338-1. **Use configuration** - Set up `mlf.toml` for multi-output generation
339339-2. **Commit generated code** - If it's part of your build artifacts
465465+1. **Use configuration** - Set up `mlf.toml` to avoid repetitive arguments
466466+2. **Set an explicit root** - Use `--root` when your file structure doesn't match your namespaces
3404673. **Regenerate often** - Run `mlf generate` after any lexicon changes
3414684. **Use flat mode** - For simpler directory structures
3424695. **Multiple generators** - Generate multiple languages from the same MLF files
+215-19
website/content/docs/language-guide/08-imports.md
···7788## Basic Import
991010-Import a definition from another file:
1010+Import a specific definition from another file:
11111212```mlf
1313use com.example.forum.profile;
···1717}
1818```
19192020-This imports the `profile` record from `com/example/forum/profile.mlf`.
2020+This imports the `profile` definition from `com/example/forum/profile.mlf`.
21212222## How Imports Work
23232424-The namespace matches the file path:
2424+The import path consists of the file's namespace plus the definition name:
25252626-| File Path | Namespace | Import Statement |
2626+| File Path | Definition | Import Statement |
2727|-----------|-----------|------------------|
2828-| `com/example/forum/user.mlf` | `com.example.forum.user` | `use com.example.forum.user;` |
2929-| `com/example/forum/post.mlf` | `com.example.forum.post` | `use com.example.forum.post;` |
2828+| `com/example/forum/user.mlf` | `record user { ... }` | `use com.example.forum.user;` |
2929+| `com/example/forum/post.mlf` | `def type postMeta = { ... }` | `use com.example.forum.post.postMeta;` |
3030+3131+**Key point:** You import specific definitions by their full path: `namespace.definitionName`
3232+3333+For example:
3434+- File `com/example/forum/post.mlf` has namespace `com.example.forum.post`
3535+- To import `postMeta` from that file: `use com.example.forum.post.postMeta;`
3636+- Or using the main definition: `use com.example.forum.post;` (imports the record named `post`)
30373138## What Can Be Imported
3239···61686269## Multiple Imports
63706464-Import multiple definitions with separate `use` statements:
7171+Import multiple definitions from the same namespace with a single statement using `.{ }`:
7272+7373+```mlf
7474+use com.example.forum.{ author, timestamp, location };
7575+7676+record post {
7777+ author: author,
7878+ createdAt: timestamp,
7979+ location?: location,
8080+}
8181+```
8282+8383+This works for any namespace, regardless of how many levels deep:
8484+8585+```mlf
8686+use com.example.forum.types.{ author, postRef };
8787+8888+record comment {
8989+ author: author,
9090+ replyTo: postRef,
9191+}
9292+```
9393+9494+Or import them separately:
65956696```mlf
6797use com.example.forum.author;
6898use com.example.forum.timestamp;
6999use com.example.forum.location;
100100+```
101101+102102+## Renaming Imports
103103+104104+Sometimes you need to rename an imported definition to avoid conflicts or improve clarity. Use the `as` keyword:
105105+106106+```mlf
107107+use com.example.types.author as ForumAuthor;
108108+use com.social.types.author as SocialAuthor;
109109+110110+record crossPost {
111111+ forumAuthor: ForumAuthor,
112112+ socialAuthor: SocialAuthor,
113113+}
114114+```
115115+116116+This is useful when:
117117+- Two imports have the same name
118118+- You want a shorter or clearer name
119119+- You're dealing with naming conflicts
120120+121121+You can rename multiple imports at once:
122122+123123+```mlf
124124+use com.example.types.{ author as ForumAuthor, postRef as PostReference };
125125+126126+record crossPost {
127127+ author: ForumAuthor,
128128+ ref: PostReference,
129129+}
130130+```
131131+132132+You can also rename when there's a local definition with the same name:
133133+134134+```mlf
135135+// Local definition
136136+def type thread = {
137137+ localId!: string,
138138+}
139139+140140+// Import with rename to avoid conflict
141141+use com.example.types.thread as ExternalThread;
7014271143record post {
7272- author: author,
7373- createdAt: timestamp,
7474- location?: location,
144144+ localThread: thread, // Local def
145145+ externalThread: ExternalThread, // Imported type
75146}
76147```
77148149149+## Importing Main Definitions
150150+151151+Every MLF file has a "main" definition - the primary export. You can import it using just the file's namespace:
152152+153153+```mlf
154154+use com.example.thread; // Imports the main definition, bound as "thread"
155155+```
156156+157157+This is shorthand for `use com.example.thread.{ main }`.
158158+159159+**Note:** The `@main` annotation is only needed when there's a naming conflict (see [Important Info](/docs/language-guide/important-info/#the-main-annotation)). Otherwise, the main definition is determined automatically.
160160+161161+### Example
162162+163163+**File: `com/example/thread.mlf`**
164164+```mlf
165165+/// The main thread record
166166+record thread {
167167+ title!: string,
168168+ body!: string,
169169+}
170170+171171+/// Thread metadata
172172+def type threadMeta = {
173173+ id!: string,
174174+ viewCount!: integer,
175175+}
176176+```
177177+178178+**File: `com/example/post.mlf`**
179179+```mlf
180180+// Import the main thread record
181181+use com.example.thread;
182182+183183+// Import the threadMeta definition
184184+use com.example.thread.threadMeta;
185185+186186+record post {
187187+ thread: thread, // The main record
188188+ meta: threadMeta, // The def type
189189+ text!: string,
190190+}
191191+```
192192+193193+### Explicit Main Import
194194+195195+You can also explicitly import main definitions using `.{ }`:
196196+197197+```mlf
198198+use com.example.thread.{ main };
199199+200200+record post {
201201+ thread: thread, // Bound as "thread" (the last segment of the namespace)
202202+}
203203+```
204204+205205+### Importing and Renaming Main
206206+207207+You can rename the main definition when importing:
208208+209209+```mlf
210210+use com.example.thread.{ main as Thread };
211211+212212+record post {
213213+ thread: Thread,
214214+}
215215+```
216216+217217+Or import both main and other definitions together:
218218+219219+```mlf
220220+use com.example.thread.{ main as Thread, threadMeta as Meta };
221221+222222+record post {
223223+ thread: Thread,
224224+ meta: Meta,
225225+}
226226+```
227227+228228+## Wildcard Imports
229229+230230+Import all definitions from a namespace with `.*`:
231231+232232+```mlf
233233+use com.example.forum.*;
234234+235235+record post {
236236+ author: author, // All definitions from com.example.forum
237237+ postRef: postRef, // are now available
238238+}
239239+```
240240+241241+**Note:** Wildcard imports bring all public definitions into scope, which can lead to naming conflicts. Use with caution.
242242+243243+## Namespace Aliasing
244244+245245+Alias an entire namespace for shorter references:
246246+247247+```mlf
248248+use com.example.forum as Forum;
249249+250250+record post {
251251+ author: Forum.author,
252252+ ref: Forum.postRef,
253253+}
254254+```
255255+256256+This is useful for:
257257+- Avoiding naming conflicts
258258+- Shortening long namespace paths
259259+- Making code more readable
260260+261261+## Import Syntax Summary
262262+263263+Here's what MLF supports for imports:
264264+265265+| Import Type | Syntax Example |
266266+|-------------|----------------|
267267+| **Single import** | `use com.example.forum.profile;` |
268268+| **Multiple imports** | `use com.example.forum.{ author, postRef };` |
269269+| **Main import** | `use com.example.thread;` or `use com.example.thread.{ main };` |
270270+| **With renaming** | `use com.example.post.{ main as Post };` |
271271+| **Mixed imports** | `use com.example.thread.{ main as Thread, threadMeta };` |
272272+| **Wildcard imports** | `use com.example.forum.*;` |
273273+| **Namespace aliasing** | `use com.example.forum as Forum;` |
274274+78275## Organizing Files
7927680277Common organization patterns:
···94291com/
95292 example/
96293 forum/
9797- author.mlf
9898- postRef.mlf
294294+ types/
295295+ author.mlf
296296+ postRef.mlf
99297 post.mlf
100298 comment.mlf
101299```
···145343146344Here's a well-organized multi-file lexicon:
147345148148-**File: `com/example/forum/author.mlf`**
346346+**File: `com/example/forum/types/author.mlf`**
149347```mlf
150348/// Basic author information
151349def type author = {
···155353};
156354```
157355158158-**File: `com/example/forum/postRef.mlf`**
356356+**File: `com/example/forum/types/postRef.mlf`**
159357```mlf
160358/// Reference to a post
161359def type postRef = {
···166364167365**File: `com/example/forum/post.mlf`**
168366```mlf
169169-use com.example.forum.author;
170170-use com.example.forum.postRef;
367367+use com.example.forum.types.{ author, postRef };
171368172369/// A forum post
173370record post {
···190387/// Get a post by URI
191388query getPost(
192389 uri: AtUri
193193-):post | error {
390390+): post | error {
194391 NotFound,
195392};
196393```
197394198395**File: `com/example/forum/comment.mlf`**
199396```mlf
200200-use com.example.forum.author;
201201-use com.example.forum.postRef;
397397+use com.example.forum.types.{ author, postRef };
202398203399/// A comment on a post
204400record comment {
+5-7
website/content/docs/language-guide/09-prelude.md
···
// Use fully qualified names
record myPost {
  reference: com.atproto.repo.strongRef,
-  labels: [com.atproto.label.label],
+  labels: [com.atproto.label.defs.label],
}

// Or import them
use com.atproto.repo.strongRef;
-use com.atproto.label.label;
+use com.atproto.label.defs.label;

record myPost {
  reference: strongRef,
···
}
```

-**`com.atproto.label.label`** - Content labels for moderation:
+**`com.atproto.label.defs.label`** - Content labels for moderation:
```mlf
record post {
-  labels: [com.atproto.label.label],
+  labels: [com.atproto.label.defs.label],
}
```
···
## What's Next?

-Read the [Important Info](/docs/language-guide/10-important-info/) section to understand how MLF maps to ATProto Lexicons, especially the rules for the `"main"` definition.
-
-Then check out the [Playground](/playground/) to experiment with MLF, or read the [CLI documentation](/docs/cli/) to learn how to compile your lexicons.
+Next, learn about [Annotations](/docs/language-guide/annotations/) to add metadata for code generators and tooling.
···
This section covers important details about how MLF maps to ATProto Lexicons.

+## Shebang Support
+
+MLF files can optionally include a shebang for direct execution:
+
+```mlf
+#!/usr/bin/env mlf
+
+record post {
+  text: string,
+}
+```
+
+The `#` character is **only** used for shebangs at the start of files. It has no other meaning in MLF syntax.
+
## The "main" Definition

In ATProto Lexicons, each lexicon has a `defs` object where definitions are stored. One special definition is called `"main"` - it's the primary definition for that lexicon.
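For orientation, a lexicon document with both a `"main"` definition and a named def looks roughly like the sketch below. This is a simplified illustration, not output from the MLF compiler; the `com.example.forum.post` id is made up and the field bodies are elided:

```json
{
  "lexicon": 1,
  "id": "com.example.forum.post",
  "defs": {
    "main": { "type": "record", "key": "tid", "record": { "...": "..." } },
    "author": { "type": "object", "properties": { "...": "..." } }
  }
}
```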
···
### Supporting Definitions

These are **never** `"main"` - they're always named defs:
-- `def type` definitions
- `token` definitions
- `inline type` definitions (don't appear in output at all)

···

When the NSID ends with `defs`, all items become named defs (no `"main"`).

+## The @main Annotation
+
+Sometimes you need both a main definition **and** a def with the same name. This happens when the name matches your namespace suffix.
+
+### Why Would You Need This?
+
+Consider `app.bsky.embed.external`. You might want:
+1. A **main record** called `external` (the primary export)
+2. A **def type** called `external` (metadata about externals)
+
+Normally, duplicate names aren't allowed. But when the name matches the namespace suffix (the last part), you can use `@main`:
+
+```mlf
+// File: app.bsky.embed.external.mlf
+
+/// The main external embed record
+@main
+record external {
+  external!: externalDetail,
+}
+
+/// External link details
+def type externalDetail = {
+  uri!: Uri,
+  title!: string,
+  description!: string,
+}
+```
+
+### Rules
+
+1. **Duplicates only allowed when name matches namespace suffix**
+   - ✅ `com.example.thread` can have two items named "thread"
+   - ❌ `com.example.post` cannot have two items named "thread"
+
+2. **Must use @main to disambiguate**
+   - One item must have `@main`
+   - Only one item can have `@main`
+
+3. **Works with records, queries, procedures, subscriptions + defs**
+   - ✅ `@main record thread` + `def type thread`
+   - ✅ `@main query getThread` + `def type thread`
+   - ❌ `inline type thread` + anything (inline types can't be main)
+
+### Example: Thread Types
+
+```mlf
+// File: com.example.thread.mlf
+
+/// The main thread record
+@main
+record thread {
+  title!: string,
+  body!: string,
+  author!: Did,
+  createdAt!: Datetime,
+}
+
+/// Thread metadata
+def type thread = {
+  id!: string,
+  viewCount!: integer,
+  replyCount!: integer,
+}
+
+record reply {
+  threadMeta: thread, // References the def type, not the record
+  text!: string,
+}
+```
+
+### When @main Isn't Needed
+
+If you only have one record/query/procedure/subscription in a file, it automatically becomes main:
+
+```mlf
+// No @main needed - this automatically becomes the main definition
+record post {
+  text!: string,
+  createdAt!: Datetime,
+}
+
+def type author = {
+  did!: Did,
+  handle!: Handle,
+}
+```
+
+### Error: Missing @main
+
+```mlf
+// ERROR: Which one is main?
+record thread {
+  title!: string,
+}
+
+def type thread = {
+  id!: string,
+}
+// This will fail - you must add @main to one of them
+```
+
## NSID and File Path Mapping

The file path **is** the NSID. MLF derives the lexicon NSID from the file path:
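As an illustration of this rule (not part of the MLF toolchain), the derivation can be sketched in a few lines of Python; `nsid_from_path` is a hypothetical helper name:

```python
from pathlib import PurePosixPath

def nsid_from_path(path: str) -> str:
    """Join the directory segments and the file stem with dots to form an NSID."""
    p = PurePosixPath(path)
    # ("com", "example", "forum") + "post" -> "com.example.forum.post"
    return ".".join([*p.parts[:-1], p.stem])

print(nsid_from_path("com/example/forum/post.mlf"))  # com.example.forum.post
```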
···
- **Single main-eligible item** → automatically becomes `"main"`
- **Name matches last NSID segment** → becomes `"main"`
- **Neither condition met** → all items become named defs
-- **Supporting definitions** (def type, token) → always named defs
+- **Supporting definitions** (token, inline type) → always named defs
- **File path** → determines the NSID
+- **Shebang support** → optional `#!/usr/bin/env mlf` at file start
+
+## What's Next?
+
+Finally, explore [Lexicon Mapping](/docs/language-guide/lexicon-mapping/) to see how MLF constructs map to ATProto Lexicon JSON format.
···
++++
+title = "Annotations"
+weight = 9
++++
+
+Annotations use the `@` symbol and provide metadata for external tooling. MLF itself assigns no semantic meaning to most annotations - they're purely for tools, linters, code generators, and other processors.
+
+## Annotation Syntax
+
+Three forms of annotations are supported:
+
+### Simple Annotation
+
+```mlf
+@deprecated
+record oldRecord {
+  field: string,
+}
+```
+
+### Positional Arguments
+
+```mlf
+@since(1, 2, 0)
+@doc("https://example.com/docs")
+record example {
+  field: string,
+}
+```
+
+Arguments can be:
+- **Strings**: `"value"`
+- **Numbers**: `42`, `3.14`
+- **Booleans**: `true`, `false`
+
+### Named Arguments
+
+```mlf
+@validate(min: 0, max: 100, strict: true)
+@codegen(language: "rust", derive: "Debug, Clone")
+record example {
+  field: integer,
+}
+```
+
+## Annotation Placement
+
+Annotations can be placed on:
+
+- Records
+- Def Types
+- Inline Types
+- Tokens
+- Queries
+- Procedures
+- Subscriptions
+- Fields within records/types
+
+**Example:**
+
+```mlf
+/// A user profile
+@table(name: "profiles", indexes: "did,handle")
+record profile {
+  /// User's DID
+  @indexed
+  did!: Did,
+
+  /// Display name (optional)
+  @sensitive(pii: true)
+  displayName: string,
+}
+```
+
+## MLF Annotations vs Generator Annotations
+
+MLF distinguishes between two categories:
+
+### 1. MLF Annotations
+
+Built into the MLF language and affect compilation/validation. These are **bare annotations** without any namespace prefix:
+
+**`@main`** - Marks an item as the main definition when there's ambiguity:
+
+```mlf
+// File: com/example/thread.mlf
+@main
+record thread {
+  title!: string,
+}
+
+// This def shares the same name but is not main
+def type thread = {
+  id!: string,
+  viewCount!: integer,
+}
+```
+
+See [Important Info](/docs/language-guide/important-info/#the-main-definition) for more details on the `@main` annotation.
+
+### 2. Generator Annotations
+
+Used by code generators and external tools. These have no effect on MLF compilation and **must** be namespaced with the generator name:
+
+```mlf
+@rust:derive("Debug, Clone, Serialize")
+@typescript:export
+@go:tag(json: "custom_name")
+record example {
+  field: string,
+}
+```
+
+**Generator namespacing rules:**
+- All generator annotations must have a namespace prefix (e.g., `@rust:foo`)
+- Use `@all:annotation` to apply an annotation to all generators
+- Bare annotations (without `:`) are reserved for MLF itself
+
+**Common generator namespaces:**
+- `@rust:*` - Rust code generator annotations
+- `@typescript:*` - TypeScript code generator annotations
+- `@go:*` - Go code generator annotations
+- `@python:*` - Python code generator annotations
+- `@all:*` - Applies to all generators
+
+## Custom Annotations
+
+You can define your own annotations for custom tooling:
+
+```mlf
+@myapp:cache(ttl: 3600)
+@myapp:permission("read:public")
+query getProfile(actor!: Did): profile;
+
+@myapp:audit_log
+@myapp:rate_limit(requests: 100, window: 60)
+procedure updateProfile(data!: profile): unit;
+```
+
+The interpretation is entirely up to your tooling.
+
+## Annotation Processing
+
+Annotations are preserved in the MLF AST and can be accessed by:
+
+- Code generators
+- Linters
+- Documentation generators
+- Build tools
+- Custom processors
+
+Each tool decides which annotations to support and how to interpret them.
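To suggest what such a processor might look like, here is a rough Python sketch that pulls annotations out of MLF source text with a regular expression. This is not the real MLF AST API; the helper name, the regex, and the tuple shape are all illustrative assumptions (a proper tool would use the parser, since this regex would also match `@` sequences inside strings or doc comments):

```python
import re

# Matches @name, @name(args), and namespaced @ns:name(args) per this guide.
ANNOTATION_RE = re.compile(r"@(?:(?P<ns>\w+):)?(?P<name>\w+)(?:\((?P<args>[^)]*)\))?")

def scan_annotations(source: str):
    """Return (namespace, name, raw_args) tuples for each annotation found."""
    return [(m.group("ns"), m.group("name"), m.group("args"))
            for m in ANNOTATION_RE.finditer(source)]

src = '@main\n@rust:derive("Debug, Clone")\nrecord thread { title!: string }'
print(scan_annotations(src))
# → [(None, 'main', None), ('rust', 'derive', '"Debug, Clone"')]
```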
+
+## Best Practices
+
+1. **Always namespace generator annotations** - Use `@generator:name` for all generator-specific annotations
+2. **Use `@all:` for cross-generator annotations** - When an annotation should apply to all generators
+3. **Document custom annotations** - Keep a registry of annotations your project uses
+4. **Be consistent** - Use the same annotation patterns across your codebase
+5. **Don't overuse** - Annotations should augment, not replace, good design
+
+## What's Next?
+
+Next, read the [Important Info](/docs/language-guide/important-info/) section to understand critical details about how MLF maps to ATProto Lexicons.