···
 A human-friendly DSL for ATProto Lexicons

-**This is a work in progress, things are subject to break and change**
+*This is a work in progress, things are subject to break and change*

 ## What it looks like

···
 };
 ```

-## Getting started
+## Installation

-### Install
+Right now you can only install mlf from source:

 ```bash
 # Install with all code generators (default: TypeScript, Go, Rust)
···
 cargo install --path mlf-cli --no-default-features
 ```

-### Generate code from MLF
-
-```bash
-# Generate TypeScript types
-mlf generate code -g typescript -i examples/**/*.mlf -o output/
-
-# Generate Go structs
-mlf generate code -g go -i examples/**/*.mlf -o output/
-
-# Generate Rust structs with serde
-mlf generate code -g rust -i examples/**/*.mlf -o output/
-
-# Generate JSON lexicons (always available)
-mlf generate code -g json -i examples/**/*.mlf -o output/
-# Or use the legacy command:
-mlf generate lexicon -i examples/**/*.mlf -o output/
-```
-
-### Validate MLF files
-
-```bash
-mlf check examples/app.bsky.feed.post.mlf
-```
-
-### Validate JSON records
-
-```bash
-mlf validate examples/app.bsky.feed.post.mlf record.json
-```
-
-### Convert JSON lexicons to MLF
-
-Convert existing ATProto JSON lexicons to MLF format:
+## Documentation

-```bash
-# Convert a single lexicon
-mlf generate mlf -i my-lexicon.json -o ./
+Visit the [MLF website](https://mlf.lol/docs) for comprehensive documentation, guides, and examples.

-# Convert multiple lexicons
-mlf generate mlf -i "dist/lexicons/**/*.json" -o src/lexicons/
-```
+## Architecture

-This is useful for:
-- Migrating existing JSON lexicons to the MLF format
-- Learning MLF syntax by comparing JSON and MLF
-- Working with lexicons from external sources
+Please review [ARCHITECTURE.md](ARCHITECTURE.md) for an overview of how the project is structured.

-## Project layout
+## License

-```
-mlf/
-├── mlf-cli/          # Command-line app
-├── mlf-lang/         # Parser and lexer (no_std compatible)
-├── mlf-codegen/      # Core code generation with plugin system
-├── codegen-plugins/  # Language-specific code generators
-│   ├── mlf-codegen-typescript/  # TypeScript generator
-│   ├── mlf-codegen-go/          # Go generator
-│   └── mlf-codegen-rust/        # Rust generator
-├── mlf-validation/   # Lexicon validation
-├── mlf-diagnostics/  # Fancy error reporting
-├── mlf-wasm/         # WASM bindings for browser use
-├── tree-sitter-mlf/  # Tree-sitter grammar for syntax highlighting
-└── website/          # Docs and playground
-    └── mlf-playground-wasm/     # Playground WASM with all generators
-```
-
-## Documentation
-
-Full documentation available at the [MLF website](https://mlf.lol) (or run `just serve` in `website/`).
-
-See [SPEC.md](SPEC.md) for the complete language specification.
+MIT

SPEC.md (deleted in this PR, -975 lines)
# MLF (Matt's Lexicon Format) Specification

## Overview

MLF is a domain-specific language (DSL) for writing ATProto Lexicons with 100% fidelity to the [AT Protocol Lexicon specification](https://atproto.com/specs/lexicon). It provides a more ergonomic, type-safe syntax for defining records, queries, procedures, and types.

## Design Goals

1. **100% ATProto Fidelity**: Every valid ATProto Lexicon can be represented in MLF
2. **Human-Readable**: Clear, concise syntax that's easy to read and write
3. **no_std Compatible**: Core parser can run in constrained environments
4. **Tooling-Friendly**: Enable validation, code generation, and formatting

## File Structure

### File Extension

- `.mlf` - MLF source files

### Shebang (Optional)

```mlf
#!/usr/bin/env mlf
```

The `#` character is reserved for shebangs only and is not used elsewhere in the syntax.

### File Naming Convention

The file path determines the lexicon NSID. Files should follow the lexicon NSID structure:

- `app.bsky.feed.post.mlf` → Lexicon NSID: `app.bsky.feed.post`
- `sh.tangled.repo.issue.mlf` → Lexicon NSID: `sh.tangled.repo.issue`

The lexicon NSID is derived solely from the filename, not from any internal namespace declarations.

## Core Concepts

### NSIDs (Namespaced Identifiers)

NSIDs use dotted notation:

```
app.bsky.feed.post
com.example.thing
sh.tangled.repo.issue
```

- Format: `authority.name(.name)*`
- Authority: Typically a reversed domain name
- Segments: Lowercase letters, numbers, hyphens (no underscores)
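
The segment rules above can be sketched as a small validator. This is an illustrative assumption for this spec, not mlf-lang's actual API, and it only encodes the rules stated here (lowercase letters, digits, hyphens, at least an authority plus one name segment):

```rust
// Sketch only: check one NSID segment against the stated character rules.
fn valid_nsid_segment(seg: &str) -> bool {
    !seg.is_empty()
        && seg
            .chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}

// A full NSID needs a (typically multi-segment) authority plus a name,
// so in practice at least three dotted segments.
fn valid_nsid(nsid: &str) -> bool {
    let segments: Vec<&str> = nsid.split('.').collect();
    segments.len() >= 3 && segments.iter().all(|s| valid_nsid_segment(s))
}
```

The full NSID grammar (e.g. rules for the final name segment) is stricter; see the AT Protocol specification.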
### Lexicon Resolution

References to definitions can be:

1. **Local (same file)**: Just use the name
   ```mlf
   record myRecord {
     field: myType // References type in same file
   }

   def type myType = { /* ... */ }
   ```

2. **Cross-file (different lexicon)**: Use full dotted path
   ```mlf
   record myRecord {
     profile: app.bsky.actor.profile // References app/bsky/actor/profile.mlf
     author: com.example.user.author // References com/example/user/author.mlf
   }
   ```

**Note**: The `#` character is NOT used for references. All references use dotted notation.

### Syntax Rules

#### Semicolons

All definitions require semicolons:

- `record` definitions end with `};`
- `use` statements end with `;`
- `token` definitions end with `;`
- `inline type` definitions end with `;`
- `def type` definitions end with `;`
- `query` definitions end with `;`
- `procedure` definitions end with `;`
- `subscription` definitions end with `;`

#### Commas

Commas are **required** between items, with **trailing commas allowed**:

- **Record fields**: Commas required between fields, trailing comma allowed
  ```mlf
  record example {
    field1: string,
    field2: integer, // trailing comma allowed
  }
  ```

- **Constraints**: Commas required between constraint properties, trailing comma allowed
  ```mlf
  title: string constrained {
    maxLength: 200,
    minLength: 1, // trailing comma allowed
  }
  ```

- **Error definitions**: Commas required between errors, trailing comma allowed
  ```mlf
  query getThread(): thread | error {
    NotFound,
    BadRequest, // trailing comma allowed
  }
  ```

## Type System

### Primitive Types

```mlf
null    // Null value
boolean // True or false
integer // 64-bit integer
string  // UTF-8 string
bytes   // Byte array
```

**Note:** ATProto Lexicons do not support floating-point numbers. Only `integer` is available for numeric values.

### Special String Formats

Defined in `prelude.mlf` and available everywhere:

```mlf
Did          // Decentralized Identifier (did:*)
AtUri        // AT-URI (at://...)
AtIdentifier // Either a DID or Handle
Handle       // Handle identifier (domain name)
Datetime     // ISO 8601 datetime
Uri          // Generic URI
Cid          // Content Identifier
Nsid         // Namespaced Identifier
Tid          // Timestamp Identifier
RecordKey    // Record key
Language     // BCP 47 language code
```
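
As a rough illustration of what one of these format checks involves, a `Did` value must look like `did:<method>:<identifier>`. The sketch below is an assumption made for this document, not the prelude's actual validator, and real DID validation is considerably stricter:

```rust
// Hedged sketch: a loose shape check for Did values (did:<method>:<id>).
fn looks_like_did(s: &str) -> bool {
    let mut parts = s.splitn(3, ':');
    parts.next() == Some("did")
        && parts
            .next()
            .map_or(false, |m| !m.is_empty() && m.chars().all(|c| c.is_ascii_lowercase()))
        && parts.next().map_or(false, |id| !id.is_empty())
}
```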
### Blob Types

```mlf
blob // Generic blob
```

With constraints:

```mlf
avatar: blob constrained {
  accept: ["image/png", "image/jpeg"]
  maxSize: 1000000 // bytes
}
```

### Unknown Type

```mlf
unknown // Represents any value, used for forward compatibility
```

## Definitions

### Records

Records are the primary data structure, stored in repositories:

```mlf
record post {
  text!: string constrained {
    maxLength: 300
    maxGraphemes: 300
  }
  createdAt!: Datetime
  reply: replyRef // Optional field (default)
}
```

### Type Definitions

MLF supports two kinds of type definitions:

**Inline Types** - Expanded at the point of use, never appear in generated lexicon defs:

```mlf
inline type AtIdentifier = string constrained {
  format: "at-identifier"
};
```

**Def Types** - Become named definitions in the lexicon's defs block:

```mlf
def type ReplyRef = {
  root!: AtUri
  parent!: AtUri
};
```

Use `inline type` for type aliases that should be expanded inline (like primitive type wrappers). Use `def type` for types that should be referenced by name in the generated lexicon.

### Tokens

Tokens are named constants used in enums and unions:

```mlf
/// Open state
token open;

/// Closed state
token closed;

record issue {
  state!: string constrained {
    knownValues: [
      open   // References token defined above
      closed
    ]
    default: "open"
  }
}
```

Tokens must have doc comments describing their purpose.

### Queries

Queries are read-only HTTP endpoints (GET):

```mlf
/// Get a user profile
query getProfile(
  /// The actor's DID or handle
  actor!: AtIdentifier
  /// Optional viewer context (default)
  viewer: Did
): profileView | error {
  /// Profile not found
  ProfileNotFound
  /// Invalid request parameters
  BadRequest
};
```

### Procedures

Procedures are write operations (POST):

```mlf
/// Create a new post
procedure createPost(
  text!: string
  createdAt!: Datetime
): {
  uri!: AtUri
  cid!: Cid
} | error {
  /// Text exceeds maximum length
  TextTooLong
};
```

### Subscriptions

Subscriptions are WebSocket-based event streams that emit messages over time. They are used for real-time updates and event notifications.

```mlf
/// Subscribe to repository events
subscription subscribeRepos(
  /// Optional cursor for resuming from a specific point (default)
  cursor: integer
): commit | identity | handle | migrate | tombstone | info;
```

**Message definitions** for subscriptions are defined as def types or records:

```mlf
/// Commit message emitted by subscribeRepos
def type commit = {
  seq!: integer
  rebase!: boolean
  tooBig!: boolean
  repo!: Did
  commit!: Cid
  rev!: string
  since!: string
  blocks!: bytes
  ops!: repoOp[]
  blobs!: Cid[]
  time!: Datetime
};

/// Info message
def type info = {
  name!: string
  message: string // Optional (default)
};
```

**Subscription features:**

- Parameters: Like queries, subscriptions can have parameters
- Return type: A union of message types that can be emitted
- Each message type must be defined as a def type or record
- Message types can be local or imported from other lexicons
- Subscriptions are long-lived WebSocket connections
- No error block (errors are handled at the WebSocket protocol level)

**Example: Chat message subscription**

```mlf
/// Subscribe to chat messages for a stream
subscription subscribeChat(
  /// The DID of the streamer
  streamer!: Did
  /// Optional cursor to resume from (default)
  cursor: string
): message | delete | join | leave;

/// Chat message payload
def type message = {
  id!: string
  text!: string
  author!: Did
  createdAt!: Datetime
};

/// Delete event payload
def type delete = {
  id!: string
};

/// Join event payload
def type join = {
  user!: Did
};

/// Leave event payload
def type leave = {
  user!: Did
};
```

### Return Types

Queries and procedures can return:

1. **Simple success**: `(): returnType`
2. **Success with errors**: `(): successType | error { ErrorName, ... }`
   - Each error must have a doc comment describing it
3. **Unknown/empty**: `(): unknown`

## Type Modifiers

### Optional and Required Fields

Fields are **optional by default**. Use `!:` to mark a field as required:

```mlf
record example {
  optional: string  // Optional (default)
  required!: string // Required (marked with !)
}
```

### Arrays

```mlf
record example {
  tags: string[]
  items: string[] constrained {
    minLength: 1
    maxLength: 10
  }
}
```

### Unions

Use the pipe operator `|`. Unions are **open by default** (allowing unknown types):

```mlf
record example {
  // Open union (default, can include unknown types)
  content: text | image | video

  // Union of tokens (also open by default)
  state: open | closed | pending
}
```

Closed unions (only allowing listed types) use `| !`:

```mlf
record example {
  // Closed union (marked with !, only these types allowed)
  content: text | image | video | !
}
```

### References

Reference local or external definitions:

```mlf
// Local reference (same file)
record post {
  author: author // References 'def type author' in same file
}

// Cross-file reference
record post {
  profile: app.bsky.actor.profile // References app/bsky/actor/profile.mlf
}
```

## Constraints

Constraints refine types by adding additional restrictions. A key principle is that constraints can only make types **more restrictive**, never less restrictive. This ensures type safety and proper substitutability.

### Constraint Refinement Rules

When applying constraints, each constraint must be **at least as restrictive** as any parent constraint:

```mlf
// Valid: More restrictive constraints
def type shortString = string constrained {
  maxLength: 100
};

record post {
  // Can further constrain to 50 (more restrictive than 100)
  title: shortString constrained {
    maxLength: 50 // ✓ Valid: 50 ≤ 100
  }
}

// Invalid: Less restrictive constraints
record invalid {
  // ERROR: Cannot expand to 200 (less restrictive than 100)
  content: shortString constrained {
    maxLength: 200 // ✗ Invalid: 200 > 100
  }
}
```

**Refinement rules by constraint type:**

- **Numeric bounds**: `minimum` can only increase, `maximum` can only decrease
- **Length bounds**: `minLength`/`minGraphemes` can only increase, `maxLength`/`maxGraphemes` can only decrease
- **Enums**: Can only restrict to a subset of values
- **Known values**: Can add new values (extensible) but cannot remove specified ones
- **Format**: Cannot change once specified
- **Defaults**: Can be specified if not already set
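
The length-bound rules above reduce to a small comparison. This is a sketch under the stated rules, not the mlf-validation crate's real interface:

```rust
// Sketch: a child max bound refines a parent bound only if it is at
// least as restrictive. None means "no bound declared at this level".
fn refines_max(parent: Option<u64>, child: Option<u64>) -> bool {
    match (parent, child) {
        (Some(p), Some(c)) => c <= p, // may only tighten the ceiling
        (Some(_), None) => true,      // child inherits the parent bound
        (None, _) => true,            // unbounded parent accepts any bound
    }
}

// Min bounds go the other way: the child may only raise the floor.
fn refines_min(parent: Option<u64>, child: Option<u64>) -> bool {
    match (parent, child) {
        (Some(p), Some(c)) => c >= p,
        _ => true,
    }
}
```

With these helpers, the `shortString` example above checks out: refining `maxLength` from 100 to 50 passes, while expanding it to 200 fails.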
### String Constraints

```mlf
field: string constrained {
  minLength: 1      // Minimum byte length
  maxLength: 1000   // Maximum byte length
  minGraphemes: 1   // Minimum grapheme clusters
  maxGraphemes: 100 // Maximum grapheme clusters
  format: "uri"     // Format validation
  enum: ["a", "b", "c"] // Allowed values (closed set) - string literals
  knownValues: [    // Known values (extensible set) - string literals OR token references
    value1   // Token reference
    "value2" // String literal
  ]
  default: "defaultValue" // Default value
}
```

**Note**: `enum`, `knownValues`, and `default` can accept either:

- **Literals**: `"open"`, `42`, `true` (string, integer, or boolean)
- **References**: `open`, `myType` (references to tokens, records, types, etc.)

When using references, the identifier will be resolved to its string representation in the generated lexicon.

### Integer Constraints

```mlf
field: integer constrained {
  minimum: 0
  maximum: 100
  enum: [1, 2, 3]
  default: 1
}
```

### Array Constraints

```mlf
field: string[] constrained {
  minLength: 1
  maxLength: 10
}
```

### Blob Constraints

```mlf
field: blob constrained {
  accept: ["image/png", "image/jpeg"] // MIME types
  maxSize: 1000000 // Bytes
}
```

### Boolean Constraints

```mlf
field: boolean constrained {
  default: false
}
```

## Comments

### Documentation Comments

Use `///` for documentation (appears in generated docs/code):

```mlf
/// A user profile record
record profile {
  /// The user's display name
  displayName: string
}
```

### Regular Comments

Regular comments (`//`) are ignored when processing and will have no impact on any output.

## Annotations

Annotations use the `@` symbol and are metadata markers for external tooling. MLF itself assigns no semantic meaning to annotations; they are purely for tools, linters, code generators, and other processors to interpret.

### Annotation Syntax

Three forms of annotations are supported:

**1. Simple annotation:**

```mlf
@deprecated
record oldRecord {
  field: string
}
```

**2. Positional arguments:**

```mlf
@since(1, 2, 0)
@doc("https://example.com/docs")
record example {
  field: string
}
```

Arguments can be:

- Strings: `"value"`
- Numbers: `42`, `3.14`
- Booleans: `true`, `false`

**3. Named arguments:**

```mlf
@validate(min: 0, max: 100, strict: true)
@codegen(language: "rust", derive: "Debug, Clone")
record example {
  field: integer
}
```

### Annotation Placement

Annotations can be placed on:

- Records
- Inline Types
- Def Types
- Tokens
- Queries
- Procedures
- Subscriptions
- Fields within records/types

```mlf
/// A user profile
@table(name: "profiles", indexes: "did,handle")
record profile {
  /// User's DID
  @indexed
  did!: Did

  /// Display name (optional)
  @sensitive(pii: true)
  displayName: string
}
```

### Common Annotation Examples

```mlf
// Deprecation
@deprecated
@deprecated(since: "2.0.0", replacement: "newRecord")
record oldRecord { /* ... */ }

// Code generation hints
@derive("Debug, Clone, Serialize")
@table(name: "users")
record user { /* ... */ }

// Validation
@validate(custom: "validateEmail")
@range(min: 0, max: 100)
field: integer

// Documentation
@example("did:plc:abc123")
@see("https://atproto.com/specs/did")
field: Did

// Versioning
@since(1, 0, 0)
@unstable
record experimentalFeature { /* ... */ }
```

**Note:** The interpretation of annotations is entirely up to the tooling consuming the MLF. Different tools may support different annotation sets.

## Use Statements

Import definitions from other lexicons:

```mlf
// Named imports
use app.bsky.actor.{profile, profileView};
use sh.tangled.repo.issue.{issue, open, closed};

// Alias entire namespace
use app.bsky.actor as Actor;

// Wildcard import
use app.bsky.feed.*;

// Mixed
use sh.tangled.repo.issue.{issue as IssueRecord, open, closed};
```

After importing, use the short name:

```mlf
use app.bsky.actor.profile;

record myThing {
  author: profile // Instead of app.bsky.actor.profile
}
```

## Lexicon Discovery & Resolution

### File Discovery

Tools discover lexicons via explicit paths: a single file, a list of files, or a glob pattern.

```bash
mlf validate app.bsky.feed.post.mlf
mlf validate *.mlf
mlf validate "**/*.mlf"
```

### Resolution Order

When resolving cross-file references:

1. Current file (local definitions)
2. Explicitly imported lexicons (via `use`)
3. Configured lexicon paths
4. (Future) Remote fetch via ATProto

### File Path Convention

The lexicon NSID is determined by the file path. Lexicons can follow a directory structure matching their NSID:

```
lexicons/
  app/
    bsky/
      actor/
        profile.mlf → app.bsky.actor.profile
      feed/
        post.mlf → app.bsky.feed.post
  com/
    example/
      thing.mlf → com.example.thing
```

Or use a flat structure with dots in the filename:

```
lexicons/
  app.bsky.actor.profile.mlf
  app.bsky.feed.post.mlf
  com.example.thing.mlf
```

In both cases, the NSID is derived from the file path, not from internal declarations.
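
Both layouts reduce to the same derivation. A minimal sketch, using an assumed helper name rather than the CLI's actual code:

```rust
use std::path::Path;

// Sketch: derive a lexicon NSID from a file path relative to a lexicon
// root. Handles both nested (app/bsky/feed/post.mlf) and flat
// (app.bsky.feed.post.mlf) layouts.
fn nsid_from_path(root: &Path, file: &Path) -> Option<String> {
    let rel = file.strip_prefix(root).ok()?;
    let stem = rel.with_extension(""); // drop the .mlf extension
    let parts: Vec<String> = stem
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join(".")) // flat filenames already contain the dots
}
```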

## CLI Commands

```bash
# Generation
mlf generate code --input "**/*.mlf" --plugin rust src/*
mlf generate lexicon --input "**/*.mlf" lexicons/*
mlf generate example --input "**/*.mlf" --count 5 examples/*

# Convert JSON lexicons to MLF
mlf generate mlf --input "lexicons/**/*.json" --output ./mlf/

# Validate lexicons
mlf validate <files|globs>
mlf validate "**/*.mlf"

# Format lexicons
mlf fmt <files|globs>

# Validate a record against a lexicon
mlf check --input app.bsky.feed.post.mlf ./record.json
```

### JSON to MLF Conversion

The `mlf generate mlf` command converts ATProto JSON lexicons back to MLF format. This is useful for:

- **Migration**: Converting existing JSON lexicons to MLF
- **Interoperability**: Working with lexicons from external sources
- **Learning**: Seeing how JSON lexicons map to MLF syntax
- **Comparison**: Generating MLF from JSON to compare with hand-written MLF

The converter automatically:

- Converts format strings (did, datetime, handle) to prelude types (Did, Datetime, Handle)
- Properly formats required (`!`) and optional (default) fields
- Converts `namespace#name` references to `namespace.name` notation
- Generates clean, properly indented MLF with correct syntax

## Examples

### Complete Lexicon Example

```mlf
#!/usr/bin/env mlf

use app.bsky.actor.profile;

/// Open issue state
token open;

/// Closed issue state
token closed;

/// An issue in a repository
record issue {
  /// The repository this issue belongs to
  repo!: AtUri
  /// Issue title
  title!: string constrained {
    minGraphemes: 1
    maxGraphemes: 200
  }
  /// Issue body (markdown)
  body: string constrained {
    maxGraphemes: 10000
  }
  /// Issue state
  state!: string constrained {
    knownValues: [
      open
      closed
    ]
    default: "open"
  }
  /// Creation timestamp
  createdAt!: Datetime
}

/// A comment on an issue
record comment {
  /// The issue this comment belongs to
  issue!: AtUri
  /// Comment body (markdown)
  body!: string constrained {
    minGraphemes: 1
    maxGraphemes: 10000
  }
  /// Creation timestamp
  createdAt!: Datetime
  /// Optional reply target
  replyTo: AtUri
}

/// Get an issue by URI
query getIssue(
  /// Issue AT-URI
  uri!: AtUri
): issue | error {
  /// Issue not found
  NotFound
};

/// Create a new issue
procedure createIssue(
  repo!: AtUri
  title!: string
  body: string // Optional (default)
): {
  uri!: AtUri
  cid!: Cid
} | error {
  /// Repository not found
  RepoNotFound
  /// Title too long
  TitleTooLong
};
```

## ATProto Mapping

### MLF → JSON Lexicon

MLF compiles to standard ATProto JSON Lexicons:

**MLF:**

```mlf
record post {
  text!: string constrained {
    maxLength: 300
  }
  createdAt!: Datetime
}
```

**JSON:**

```json
{
  "lexicon": 1,
  "id": "app.bsky.feed.post",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "record": {
        "type": "object",
        "required": ["text", "createdAt"],
        "properties": {
          "text": {
            "type": "string",
            "maxLength": 300
          },
          "createdAt": {
            "type": "string",
            "format": "datetime"
          }
        }
      }
    }
  }
}
```

### Subscription Mapping

**MLF:**

```mlf
subscription subscribeRepos(
  cursor: integer // Optional (default)
): commit | identity;
```

**JSON:**

```json
{
  "lexicon": 1,
  "id": "com.atproto.sync.subscribeRepos",
  "defs": {
    "main": {
      "type": "subscription",
      "parameters": {
        "type": "params",
        "properties": {
          "cursor": {
            "type": "integer"
          }
        }
      },
      "message": {
        "schema": {
          "type": "union",
          "refs": ["#commit", "#identity"]
        }
      }
    },
    "commit": {
      "type": "object",
      "required": ["seq", "repo", "commit"],
      "properties": {
        "seq": { "type": "integer" },
        "repo": { "type": "string", "format": "did" },
        "commit": { "type": "string", "format": "cid" }
      }
    }
  }
}
```

## Future Considerations

### Potential Extensions

- **Version constraints**: Specify compatible lexicon versions in lexicon headers
- **Custom validation**: Pluggable validators beyond built-in constraints
- **Documentation generation**: Automatic API docs from MLF with annotation support
- **Standard annotation registry**: Common annotations like `@deprecated`, `@since`, `@internal`
- **Import resolution**: Remote lexicon fetching and caching
- **Type inference**: Automatic type inference for constrained types

### Versioning

Lexicons are versioned at the NSID level. MLF files should include version metadata in comments or future version declarations.

## Appendix

### Reserved Keywords

```
as, blob, boolean, bytes, constrained, def, error, inline, integer,
null, procedure, query, record, string, subscription, token,
type, unknown, use
```

### Reserved Names

The following names cannot be used as item names:

```
main, defs
```

### Raw Identifiers

To use a reserved keyword as an identifier, wrap it in backticks:

```mlf
def type `record` = {
  `record`: com.atproto.repo.strongRef
  `error`: string
};
```

This allows field names or type names to match reserved keywords when necessary for compatibility with existing schemas.

### Constraint Keywords

```
accept, default, enum, format, knownValues, maxGraphemes,
maxLength, maxSize, maximum, minGraphemes, minLength, minimum
```

### Format Values

```
at-identifier, at-uri, cid, datetime, did, handle, language,
nsid, record-key, tid, uri
```

mlf-cli/src/check.rs (+209, -49)
···3737 help: Option<String>,
3838 },
39394040- #[error("Failed to expand glob pattern")]
4141- #[diagnostic(code(mlf::check::glob_error))]
4242- GlobError {
4343- #[source]
4444- source: glob::GlobError,
4545- },
4646-4747- #[error("Invalid glob pattern: {pattern}")]
4848- #[diagnostic(code(mlf::check::invalid_glob))]
4949- InvalidGlob {
5050- pattern: String,
5151- #[source]
5252- source: glob::PatternError,
5353- },
54405541 #[error("Record validation failed")]
5642 #[diagnostic(code(mlf::check::record_validation))]
···6349 ConfigError(#[from] ConfigError),
6450}
65516666-pub fn run_check(input_patterns: Vec<String>) -> Result<(), CheckError> {
6767- // If no input patterns provided, use source directory from mlf.toml
6868- let patterns = if input_patterns.is_empty() {
6969- let current_dir = std::env::current_dir()
7070- .map_err(|e| CheckError::ReadFile {
7171- path: ".".to_string(),
7272- source: e,
7373- })?;
5252+pub fn run_check(input_paths: Vec<PathBuf>, explicit_root: Option<PathBuf>) -> Result<(), CheckError> {
5353+ let current_dir = std::env::current_dir()
5454+ .map_err(|e| CheckError::ReadFile {
5555+ path: ".".to_string(),
5656+ source: e,
5757+ })?;
74585959+ // Determine root directory and input paths
6060+ let (root_dir, file_paths) = if input_paths.is_empty() {
6161+ // No input provided: must use mlf.toml
7562 match find_project_root(¤t_dir) {
7663 Ok(project_root) => {
7764 let config_path = project_root.join("mlf.toml");
7865 let config = MlfConfig::load(&config_path)?;
7979- let source_pattern = format!("{}/**/*.mlf", config.source.directory);
6666+ let source_dir = project_root.join(&config.source.directory);
6767+ let root = explicit_root.unwrap_or_else(|| source_dir.clone());
8068 println!("Using source directory from mlf.toml: {}", config.source.directory);
8181- vec![source_pattern]
6969+7070+ // Collect all .mlf files from source directory
7171+ let files = collect_mlf_files(&source_dir)?;
7272+ (root, files)
8273 }
8374 Err(ConfigError::NotFound) => {
8475 return Err(CheckError::ValidationErrors {
···8879 Err(e) => return Err(CheckError::ConfigError(e)),
8980 }
9081 } else {
9191- input_patterns
9292- };
8282+ // Input provided: determine root
8383+ let root = if let Some(explicit) = explicit_root {
8484+ // --root flag takes precedence
8585+ explicit
8686+ } else if let Ok(project_root) = find_project_root(&current_dir) {
8787+ // Try to use mlf.toml source directory
8888+ let config_path = project_root.join("mlf.toml");
8989+ if let Ok(config) = MlfConfig::load(&config_path) {
9090+ project_root.join(&config.source.directory)
9191+ } else {
9292+ current_dir.clone()
9393+ }
9494+ } else {
9595+ // Fall back to current directory
9696+ current_dir.clone()
9797+ };
9398
9494- let mut file_paths = Vec::new();
9999+ // Collect files from input paths
100100+ let mut files = Vec::new();
101101+ for input_path in input_paths {
102102+ let path = if input_path.is_absolute() {
103103+ input_path
104104+ } else {
105105+ current_dir.join(input_path)
106106+ };
95107
9696- for pattern in patterns {
9797- if pattern.contains('*') || pattern.contains('?') {
9898- for entry in glob::glob(&pattern).map_err(|source| CheckError::InvalidGlob {
9999- pattern: pattern.clone(),
100100- source,
101101- })? {
102102- let path = entry.map_err(|source| CheckError::GlobError { source })?;
103103- file_paths.push(path);
108108+ if path.is_dir() {
109109+ files.extend(collect_mlf_files(&path)?);
110110+ } else if path.is_file() {
111111+ files.push(path);
112112+ } else {
113113+ return Err(CheckError::ReadFile {
114114+ path: path.display().to_string(),
115115+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Path not found"),
116116+ });
104117 }
105105- } else {
106106- file_paths.push(PathBuf::from(pattern));
107118 }
108108- }
119119+ (root, files)
120120+ };
109121
110122 // Try to load cached lexicons from .mlf directory
111123 let current_dir = std::env::current_dir()
···
148160 }
149161 };
150162
151151- let namespace = file_path
152152- .file_stem()
153153- .and_then(|s| s.to_str())
154154- .unwrap_or("unknown")
155155- .to_string();
163163+ let namespace = extract_namespace(&file_path, &root_dir)?;
156164
157165 if let Err(e) = workspace.add_module(namespace.clone(), lexicon.clone()) {
158158- let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), e);
166166+ let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), namespace.clone(), e);
159167 eprintln!("{:?}", miette::Report::new(diagnostic));
160168 had_parse_errors = true;
161169 continue;
162170 }
163171
164164- source_files.push((filename.clone(), source));
172172+ source_files.push((filename.clone(), namespace.clone(), source));
165173 println!("✓ {}: Parsed successfully", file_path.display());
166174 }
167175
···
172180 }
173181
174182 if let Err(e) = workspace.resolve() {
175175- // Show all errors from the first source file
176176- if let Some((filename, source)) = source_files.first() {
177177- let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), e);
178178- eprintln!("{:?}", miette::Report::new(diagnostic));
183183+ // Collect all modules that have errors
184184+ let mut modules_with_errors: std::collections::BTreeMap<String, (Option<String>, String)> = std::collections::BTreeMap::new();
185185+
186186+ // First, add all explicitly checked files
187187+ for (filename, namespace, source) in &source_files {
188188+ modules_with_errors.insert(namespace.clone(), (Some(filename.clone()), source.clone()));
179189 }
190190+
191191+ // Then, find any cached modules with errors and try to load their source
192192+ for error in &e.errors {
193193+ let error_namespace = mlf_diagnostics::get_error_module_namespace_str(error);
194194+ if !modules_with_errors.contains_key(error_namespace) {
195195+ let namespace_path = error_namespace.replace('.', "/");
196196+ let mut source_loaded = false;
197197+
198198+ // Try multiple locations for the source file
199199+ let mut possible_paths = vec![
200200+ // Check in lexicons/ directory (common structure)
201201+ current_dir.join("lexicons").join(format!("{}.mlf", namespace_path)),
202202+ // Check in source directory from config
203203+ current_dir.join("src").join(format!("{}.mlf", namespace_path)),
204204+ // Check relative to current directory
205205+ current_dir.join(format!("{}.mlf", namespace_path)),
206206+ ];
207207+
208208+ // Add cache directory if available (lexicons are in lexicons/mlf/ subdirectory)
209209+ if let Some(cache_dir) = &mlf_cache_dir {
210210+ possible_paths.push(cache_dir.join("lexicons").join("mlf").join(format!("{}.mlf", namespace_path)));
211211+ }
212212+
213213+ for path in possible_paths {
214214+ if let Ok(source) = std::fs::read_to_string(&path) {
215215+ modules_with_errors.insert(
216216+ error_namespace.to_string(),
217217+ (Some(path.display().to_string()), source)
218218+ );
219219+ source_loaded = true;
220220+ break;
221221+ }
222222+ }
223223+
224224+ if !source_loaded {
225225+ // Couldn't load source, add placeholder
226226+ modules_with_errors.insert(
227227+ error_namespace.to_string(),
228228+ (None, String::new())
229229+ );
230230+ }
231231+ }
232232+ }
233233+
234234+ // Show diagnostics for all modules with errors
235235+ for (namespace, (filename_opt, source)) in &modules_with_errors {
236236+ // Only show diagnostic if this module has errors
237237+ let has_errors = e.errors.iter().any(|error| {
238238+ mlf_diagnostics::get_error_module_namespace_str(error) == namespace
239239+ });
240240+
241241+ if has_errors {
242242+ if let Some(filename) = filename_opt {
243243+ // Have source file, show full diagnostic
244244+ let diagnostic = ValidationDiagnostic::new(filename.clone(), source.clone(), namespace.clone(), e.clone());
245245+ eprintln!("{:?}", miette::Report::new(diagnostic));
246246+ } else {
247247+ // No source available, just list the errors
248248+ let error_count = e.errors.iter()
249249+ .filter(|err| mlf_diagnostics::get_error_module_namespace_str(err) == namespace)
250250+ .count();
251251+ eprintln!("\n{}: {} error(s) (source not available)", namespace, error_count);
252252+ }
253253+ }
254254+ }
255255+
180256 return Err(CheckError::ValidationErrors {
181257 help: Some("Workspace validation failed".to_string()),
182258 });
···
235311 }
236312 }
237313}
314314+
315315+/// Recursively collect all .mlf files from a directory
316316+fn collect_mlf_files(dir: &std::path::Path) -> Result<Vec<PathBuf>, CheckError> {
317317+ let mut files = Vec::new();
318318+
319319+ if !dir.exists() {
320320+ return Err(CheckError::ReadFile {
321321+ path: dir.display().to_string(),
322322+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Directory not found"),
323323+ });
324324+ }
325325+
326326+ fn visit_dirs(dir: &std::path::Path, files: &mut Vec<PathBuf>) -> std::io::Result<()> {
327327+ if dir.is_dir() {
328328+ for entry in std::fs::read_dir(dir)? {
329329+ let entry = entry?;
330330+ let path = entry.path();
331331+ if path.is_dir() {
332332+ visit_dirs(&path, files)?;
333333+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
334334+ files.push(path);
335335+ }
336336+ }
337337+ }
338338+ Ok(())
339339+ }
340340+
341341+ visit_dirs(dir, &mut files).map_err(|source| CheckError::ReadFile {
342342+ path: dir.display().to_string(),
343343+ source,
344344+ })?;
345345+
346346+ Ok(files)
347347+}
348348+
349349+/// Extract namespace from file path relative to root directory
350350+/// e.g., root=/project/lexicons, file=/project/lexicons/com/example/foo.mlf -> com.example.foo
351351+fn extract_namespace(file_path: &std::path::Path, root_dir: &std::path::Path) -> Result<String, CheckError> {
352352+ // Get the canonical paths to handle . and .. correctly
353353+ let file_canonical = file_path.canonicalize().map_err(|source| CheckError::ReadFile {
354354+ path: file_path.display().to_string(),
355355+ source,
356356+ })?;
357357+
358358+ let root_canonical = root_dir.canonicalize().map_err(|source| CheckError::ReadFile {
359359+ path: root_dir.display().to_string(),
360360+ source,
361361+ })?;
362362+
363363+ // Get relative path from root to file
364364+ let relative_path = file_canonical.strip_prefix(&root_canonical)
365365+ .map_err(|_| CheckError::ValidationErrors {
366366+ help: Some(format!(
367367+ "File {} is not within root directory {}",
368368+ file_path.display(),
369369+ root_dir.display()
370370+ )),
371371+ })?;
372372+
373373+ // Convert path to namespace
374374+ let mut components = Vec::new();
375375+ for component in relative_path.components() {
376376+ if let std::path::Component::Normal(os_str) = component {
377377+ if let Some(s) = os_str.to_str() {
378378+ components.push(s);
379379+ }
380380+ }
381381+ }
382382+
383383+ // Remove .mlf extension from last component
384384+ if let Some(last) = components.last_mut() {
385385+ if let Some(stem) = last.strip_suffix(".mlf") {
386386+ *last = stem;
387387+ }
388388+ }
389389+
390390+ if components.is_empty() {
391391+ return Err(CheckError::ValidationErrors {
392392+ help: Some(format!("Could not extract namespace from path: {}", file_path.display())),
393393+ });
394394+ }
395395+396396+ Ok(components.join("."))
397397+}
+161-42
mlf-cli/src/generate/code.rs
···
2121 source: std::io::Error,
2222 },
2323
2424- #[error("Failed to expand glob pattern")]
2525- #[diagnostic(code(mlf::generate::glob_error))]
2626- GlobError {
2727- #[source]
2828- source: glob::GlobError,
2929- },
3030-
3131- #[error("Invalid glob pattern: {pattern}")]
3232- #[diagnostic(code(mlf::generate::invalid_glob))]
3333- InvalidGlob {
3434- pattern: String,
3535- #[source]
3636- source: glob::PatternError,
3737- },
3838-
3924 #[error("Generator '{name}' not found")]
4025 #[diagnostic(code(mlf::generate::generator_not_found))]
4126 #[help("Available generators: {}", available.join(", "))]
···
5136}
5237
5338pub fn run(
5454- generator_name: String,
5555- input_patterns: Vec<String>,
5656- output_dir: PathBuf,
3939+ generator_name: Option<String>,
4040+ input_paths: Vec<PathBuf>,
4141+ output_dir: Option<PathBuf>,
4242+ root: Option<PathBuf>,
5743 flat: bool,
5844) -> Result<(), GenerateError> {
4545+ let current_dir = std::env::current_dir().map_err(|source| GenerateError::WriteOutput {
4646+ path: "current directory".to_string(),
4747+ source,
4848+ })?;
4949+
5050+ // Load mlf.toml if available
5151+ let project_root = crate::config::find_project_root(&current_dir).ok();
5252+ let config = project_root
5353+ .as_ref()
5454+ .and_then(|root| {
5555+ let config_path = root.join("mlf.toml");
5656+ crate::config::MlfConfig::load(&config_path).ok()
5757+ });
5858+
5959+ // Determine generator name
6060+ let generator_name = if let Some(explicit) = generator_name {
6161+ explicit
6262+ } else if let Some(cfg) = &config {
6363+ // Find first non-lexicon, non-mlf output in mlf.toml
6464+ cfg.output
6565+ .iter()
6666+ .find(|o| o.r#type != "lexicon" && o.r#type != "mlf")
6767+ .map(|o| o.r#type.clone())
6868+ .ok_or_else(|| GenerateError::GeneratorNotFound {
6969+ name: "any".to_string(),
7070+ available: vec!["No code generator outputs configured in mlf.toml. Either add an output configuration or provide --generator flag.".to_string()],
7171+ })?
7272+ } else {
7373+ return Err(GenerateError::GeneratorNotFound {
7474+ name: "any".to_string(),
7575+ available: vec!["No mlf.toml found and no --generator flag provided. Either create a mlf.toml or provide --generator flag.".to_string()],
7676+ });
7777+ };
7878+
5979 // Find the generator
6080 let generators = mlf_codegen::plugin::generators();
6181 let generator = generators
···
7292 println!("Using generator: {} ({})", generator.name(), generator.description());
7393 println!("Output extension: {}\n", generator.file_extension());
7494
9595+ // Determine output directory
9696+ let output_dir = if let Some(explicit) = output_dir {
9797+ explicit
9898+ } else if let Some(cfg) = &config {
9999+ // Find output matching the generator type
100100+ cfg.output
101101+ .iter()
102102+ .find(|o| o.r#type == generator_name)
103103+ .map(|o| PathBuf::from(&o.directory))
104104+ .ok_or_else(|| GenerateError::WriteOutput {
105105+ path: "mlf.toml".to_string(),
106106+ source: std::io::Error::new(
107107+ std::io::ErrorKind::NotFound,
108108+ format!("No output configured for generator '{}' in mlf.toml", generator_name)
109109+ ),
110110+ })?
111111+ } else {
112112+ return Err(GenerateError::WriteOutput {
113113+ path: "mlf.toml".to_string(),
114114+ source: std::io::Error::new(
115115+ std::io::ErrorKind::NotFound,
116116+ "No mlf.toml found and no --output flag provided"
117117+ ),
118118+ });
119119+ };
120120+121121+ // Determine root directory
122122+ let root_dir = if let Some(explicit) = root {
123123+ explicit
124124+ } else if let Some(cfg) = &config {
125125+ project_root.as_ref().unwrap().join(&cfg.source.directory)
126126+ } else {
127127+ current_dir.clone()
128128+ };
129129+130130+ // Determine input paths
131131+ let input_paths = if input_paths.is_empty() {
132132+ if let Some(cfg) = &config {
133133+ vec![project_root.as_ref().unwrap().join(&cfg.source.directory)]
134134+ } else {
135135+ return Err(GenerateError::WriteOutput {
136136+ path: "input".to_string(),
137137+ source: std::io::Error::new(
138138+ std::io::ErrorKind::NotFound,
139139+ "No input files specified and no mlf.toml found"
140140+ ),
141141+ });
142142+ }
143143+ } else {
144144+ input_paths
145145+ };
146146+
75147 // Collect input files
76148 let mut file_paths = Vec::new();
7777- for pattern in input_patterns {
7878- if pattern.contains('*') || pattern.contains('?') {
7979- for entry in glob::glob(&pattern).map_err(|source| GenerateError::InvalidGlob {
8080- pattern: pattern.clone(),
8181- source,
8282- })? {
8383- let path = entry.map_err(|source| GenerateError::GlobError { source })?;
8484- file_paths.push(path);
8585- }
8686- } else {
8787- file_paths.push(PathBuf::from(pattern));
149149+ for path in input_paths {
150150+ if path.is_dir() {
151151+ file_paths.extend(collect_mlf_files(&path)?);
152152+ } else if path.is_file() && path.extension().and_then(|s| s.to_str()) == Some("mlf") {
153153+ file_paths.push(path);
88154 }
89155 }
90156
···
116182 }
117183 };
118184
119119- let namespace = extract_namespace(&file_path);
185185+ let namespace = match extract_namespace(&file_path, &root_dir) {
186186+ Ok(ns) => ns,
187187+ Err(e) => {
188188+ errors.push((
189189+ file_path.display().to_string(),
190190+ format!("Failed to extract namespace: {}", e),
191191+ ));
192192+ continue;
193193+ }
194194+ };
120195
121196 // Create workspace with standard library and .mlf cache
122197 let mlf_cache_dir = crate::config::find_project_root(&std::env::current_dir().unwrap())
···
220295 Ok(())
221296}
222297
223223-fn extract_namespace(file_path: &Path) -> String {
224224- // Extract namespace from path components
225225- // e.g., com/atproto/admin/defs.mlf -> com.atproto.admin.defs
298298+/// Collect all .mlf files recursively from a directory
299299+fn collect_mlf_files(dir: &Path) -> Result<Vec<PathBuf>, GenerateError> {
300300+ let mut files = Vec::new();
301301+
302302+ for entry in std::fs::read_dir(dir).map_err(|source| GenerateError::WriteOutput {
303303+ path: dir.display().to_string(),
304304+ source,
305305+ })? {
306306+ let entry = entry.map_err(|source| GenerateError::WriteOutput {
307307+ path: dir.display().to_string(),
308308+ source,
309309+ })?;
310310+
311311+ let path = entry.path();
312312+
313313+ if path.is_dir() {
314314+ files.extend(collect_mlf_files(&path)?);
315315+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
316316+ files.push(path);
317317+ }
318318+ }
319319+
320320+ Ok(files)
321321+}
226322
227227- let mut components = Vec::new();
323323+fn extract_namespace(file_path: &Path, root_dir: &Path) -> Result<String, std::io::Error> {
324324+ // Canonicalize both paths for comparison
325325+ let file_canonical = file_path.canonicalize()?;
326326+ let root_canonical = root_dir.canonicalize()?;
228327
229229- for component in file_path.components() {
328328+ // Get the relative path from root to file
329329+ let relative_path = file_canonical
330330+ .strip_prefix(&root_canonical)
331331+ .map_err(|_| {
332332+ std::io::Error::new(
333333+ std::io::ErrorKind::Other,
334334+ format!(
335335+ "File path {} is not under root directory {}",
336336+ file_path.display(),
337337+ root_dir.display()
338338+ ),
339339+ )
340340+ })?;
341341+
342342+ // Convert path components to namespace parts
343343+ let mut namespace_parts = Vec::new();
344344+
345345+ for component in relative_path.components() {
230346 match component {
231347 std::path::Component::Normal(os_str) => {
232348 if let Some(s) = os_str.to_str() {
233233- components.push(s);
349349+ namespace_parts.push(s);
234350 }
235351 }
236236- _ => continue, // Skip ., .., /, etc.
352352+ _ => continue,
237353 }
238354 }
239355240240- // Remove the .mlf extension from the last component if present
241241- if let Some(last) = components.last_mut() {
356356+ // Remove .mlf extension from the last component if present
357357+ if let Some(last) = namespace_parts.last_mut() {
242358 if let Some(stem) = last.strip_suffix(".mlf") {
243359 *last = stem;
244360 }
245361 }
246362
247247- if components.is_empty() {
248248- return "unknown".to_string();
363363+ if namespace_parts.is_empty() {
364364+ return Err(std::io::Error::new(
365365+ std::io::ErrorKind::Other,
366366+ format!("Could not extract namespace from path: {}", file_path.display()),
367367+ ));
249368 }
250369
251251- components.join(".")
370370+ Ok(namespace_parts.join("."))
252371}
+143-39
mlf-cli/src/generate/lexicon.rs
···
2929 source: std::io::Error,
3030 },
3131
3232- #[error("Failed to expand glob pattern")]
3333- #[diagnostic(code(mlf::generate::glob_error))]
3434- GlobError {
3535- #[source]
3636- source: glob::GlobError,
3737- },
3232+}
3333+
3434+pub fn run(input_paths: Vec<PathBuf>, output_dir: Option<PathBuf>, explicit_root: Option<PathBuf>, flat: bool) -> Result<(), GenerateError> {
3535+ let current_dir = std::env::current_dir()
3636+ .map_err(|e| GenerateError::WriteOutput {
3737+ path: ".".to_string(),
3838+ source: e,
3939+ })?;
4040+
4141+ // Load mlf.toml if available
4242+ let project_root = crate::config::find_project_root(&current_dir).ok();
4343+ let config = project_root
4444+ .as_ref()
4545+ .and_then(|root| {
4646+ let config_path = root.join("mlf.toml");
4747+ crate::config::MlfConfig::load(&config_path).ok()
4848+ });
4949+
5050+ // Determine output directory
5151+ let output_dir = if let Some(explicit) = output_dir {
5252+ explicit
5353+ } else if let Some(cfg) = &config {
5454+ // Find first lexicon output in mlf.toml
5555+ cfg.output
5656+ .iter()
5757+ .find(|o| o.r#type == "lexicon")
5858+ .map(|o| PathBuf::from(&o.directory))
5959+ .ok_or_else(|| GenerateError::ParseLexicon {
6060+ path: "mlf.toml".to_string(),
6161+ help: Some("No lexicon output configured in mlf.toml. Either add an output configuration or provide --output flag.".to_string()),
6262+ })?
6363+ } else {
6464+ return Err(GenerateError::ParseLexicon {
6565+ path: "mlf.toml".to_string(),
6666+ help: Some("No mlf.toml found and no --output flag provided. Either create a mlf.toml or provide --output flag.".to_string()),
6767+ });
6868+ };
6969+
7070+ // Determine root directory
7171+ let root_dir = if let Some(explicit) = explicit_root {
7272+ explicit
7373+ } else if let Some(cfg) = &config {
7474+ project_root.as_ref().unwrap().join(&cfg.source.directory)
7575+ } else {
7676+ current_dir.clone()
7777+ };
3878
3939- #[error("Invalid glob pattern: {pattern}")]
4040- #[diagnostic(code(mlf::generate::invalid_glob))]
4141- InvalidGlob {
4242- pattern: String,
4343- #[source]
4444- source: glob::PatternError,
4545- },
4646-}
7979+ // Determine input paths
8080+ let input_paths = if input_paths.is_empty() {
8181+ if let Some(cfg) = &config {
8282+ vec![project_root.as_ref().unwrap().join(&cfg.source.directory)]
8383+ } else {
8484+ return Err(GenerateError::ParseLexicon {
8585+ path: "input".to_string(),
8686+ help: Some("No input files specified and no mlf.toml found. Either provide input files or create a mlf.toml.".to_string()),
8787+ });
8888+ }
8989+ } else {
9090+ input_paths
9191+ };
4792
4848-pub fn run(input_patterns: Vec<String>, output_dir: PathBuf, flat: bool) -> Result<(), GenerateError> {
9393+ // Collect files from input paths
4994 let mut file_paths = Vec::new();
9595+ for input_path in input_paths {
9696+ let path = if input_path.is_absolute() {
9797+ input_path
9898+ } else {
9999+ current_dir.join(input_path)
100100+ };
50101
5151- for pattern in input_patterns {
5252- if pattern.contains('*') || pattern.contains('?') {
5353- for entry in glob::glob(&pattern).map_err(|source| GenerateError::InvalidGlob {
5454- pattern: pattern.clone(),
5555- source,
5656- })? {
5757- let path = entry.map_err(|source| GenerateError::GlobError { source })?;
5858- file_paths.push(path);
5959- }
102102+ if path.is_dir() {
103103+ file_paths.extend(collect_mlf_files(&path)?);
104104+ } else if path.is_file() {
105105+ file_paths.push(path);
60106 } else {
6161- file_paths.push(PathBuf::from(pattern));
107107+ return Err(GenerateError::ReadFile {
108108+ path: path.display().to_string(),
109109+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Path not found"),
110110+ });
62111 }
63112 }
64113
···
87136 }
88137 };
89138
9090- let namespace = extract_namespace(&file_path);
139139+ let namespace = extract_namespace(&file_path, &root_dir)?;
91140
92141 // Create workspace with standard library and .mlf cache for inline type resolution
93142 let mlf_cache_dir = crate::config::find_project_root(&std::env::current_dir().unwrap())
···
157206 Ok(())
158207}
159208
160160-fn extract_namespace(file_path: &Path) -> String {
161161- // Extract namespace from path components
162162- // e.g., com/atproto/admin/defs.mlf -> com.atproto.admin.defs
209209+/// Recursively collect all .mlf files from a directory
210210+fn collect_mlf_files(dir: &Path) -> Result<Vec<PathBuf>, GenerateError> {
211211+ let mut files = Vec::new();
163212
164164- let mut components = Vec::new();
213213+ if !dir.exists() {
214214+ return Err(GenerateError::ReadFile {
215215+ path: dir.display().to_string(),
216216+ source: std::io::Error::new(std::io::ErrorKind::NotFound, "Directory not found"),
217217+ });
218218+ }
165219
166166- for component in file_path.components() {
167167- match component {
168168- std::path::Component::Normal(os_str) => {
169169- if let Some(s) = os_str.to_str() {
170170- components.push(s);
220220+ fn visit_dirs(dir: &Path, files: &mut Vec<PathBuf>) -> std::io::Result<()> {
221221+ if dir.is_dir() {
222222+ for entry in std::fs::read_dir(dir)? {
223223+ let entry = entry?;
224224+ let path = entry.path();
225225+ if path.is_dir() {
226226+ visit_dirs(&path, files)?;
227227+ } else if path.extension().and_then(|s| s.to_str()) == Some("mlf") {
228228+ files.push(path);
171229 }
172230 }
173173- _ => continue, // Skip ., .., /, etc.
231231+ }
232232+ Ok(())
233233+ }
234234+
235235+ visit_dirs(dir, &mut files).map_err(|source| GenerateError::ReadFile {
236236+ path: dir.display().to_string(),
237237+ source,
238238+ })?;
239239+
240240+ Ok(files)
241241+}
242242+
243243+/// Extract namespace from file path relative to root directory
244244+/// e.g., root=/project/lexicons, file=/project/lexicons/com/example/foo.mlf -> com.example.foo
245245+fn extract_namespace(file_path: &Path, root_dir: &Path) -> Result<String, GenerateError> {
246246+ // Get the canonical paths to handle . and .. correctly
247247+ let file_canonical = file_path.canonicalize().map_err(|source| GenerateError::ReadFile {
248248+ path: file_path.display().to_string(),
249249+ source,
250250+ })?;
251251+
252252+ let root_canonical = root_dir.canonicalize().map_err(|source| GenerateError::ReadFile {
253253+ path: root_dir.display().to_string(),
254254+ source,
255255+ })?;
256256+
257257+ // Get relative path from root to file
258258+ let relative_path = file_canonical.strip_prefix(&root_canonical)
259259+ .map_err(|_| GenerateError::ParseLexicon {
260260+ path: file_path.display().to_string(),
261261+ help: Some(format!(
262262+ "File {} is not within root directory {}",
263263+ file_path.display(),
264264+ root_dir.display()
265265+ )),
266266+ })?;
267267+
268268+ // Convert path to namespace
269269+ let mut components = Vec::new();
270270+ for component in relative_path.components() {
271271+ if let std::path::Component::Normal(os_str) = component {
272272+ if let Some(s) = os_str.to_str() {
273273+ components.push(s);
274274+ }
174275 }
175276 }
176277
177177- // Remove the .mlf extension from the last component if present
278278+ // Remove .mlf extension from last component
178279 if let Some(last) = components.last_mut() {
179280 if let Some(stem) = last.strip_suffix(".mlf") {
180281 *last = stem;
···
182283 }
183284
184285 if components.is_empty() {
185185- return "unknown".to_string();
286286+ return Err(GenerateError::ParseLexicon {
287287+ path: file_path.display().to_string(),
288288+ help: Some("Could not extract namespace from path".to_string()),
289289+ });
186290 }
187291
188188- components.join(".")
292292+ Ok(components.join("."))
189293}
+123-35
mlf-cli/src/generate/mlf.rs
···
5151 },
5252}
5353
5454-pub fn run(input_patterns: Vec<String>, output_dir: PathBuf) -> Result<(), MlfGenerateError> {
5454+pub fn run(input_patterns: Vec<String>, output_dir: Option<PathBuf>) -> Result<(), MlfGenerateError> {
5555+ let current_dir = std::env::current_dir().map_err(|source| MlfGenerateError::WriteOutput {
5656+ path: "current directory".to_string(),
5757+ source,
5858+ })?;
5959+
6060+ // Load mlf.toml if available
6161+ let project_root = crate::config::find_project_root(&current_dir).ok();
6262+ let config = project_root
6363+ .as_ref()
6464+ .and_then(|root| {
6565+ let config_path = root.join("mlf.toml");
6666+ crate::config::MlfConfig::load(&config_path).ok()
6767+ });
6868+
6969+ // Determine output directory
7070+ let output_dir = if let Some(explicit) = output_dir {
7171+ explicit
7272+ } else if let Some(cfg) = &config {
7373+ // Find first mlf output in mlf.toml
7474+ cfg.output
7575+ .iter()
7676+ .find(|o| o.r#type == "mlf")
7777+ .map(|o| PathBuf::from(&o.directory))
7878+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
7979+ message: "No mlf output configured in mlf.toml. Either add an output configuration or provide --output flag.".to_string(),
8080+ })?
8181+ } else {
8282+ return Err(MlfGenerateError::InvalidLexicon {
8383+ message: "No mlf.toml found and no --output flag provided. Either create a mlf.toml or provide --output flag.".to_string(),
8484+ });
8585+ };
8686+
5587 let mut file_paths = Vec::new();
5688
5789 for pattern in input_patterns {
···
179211 }
180212 })?;
181213
214214+ // Create a context to pass the current namespace to type generation
215215+ let ctx = ConversionContext {
216216+ current_namespace: nsid.to_string(),
217217+ };
218218+
182219 // Process all definitions
183220 for (name, def) in defs {
184221 let def_type = def.get("type").and_then(|v| v.as_str()).ok_or_else(|| {
···
189226
190227 match def_type {
191228 "record" => {
192192- let mlf = generate_record(name, def, last_segment)?;
229229+ let mlf = generate_record(name, def, last_segment, &ctx)?;
193230 output.push_str(&mlf);
194231 output.push('\n');
195232 }
196233 "query" => {
197197- let mlf = generate_query(name, def, last_segment)?;
234234+ let mlf = generate_query(name, def, last_segment, &ctx)?;
198235 output.push_str(&mlf);
199236 output.push('\n');
200237 }
201238 "procedure" => {
202202- let mlf = generate_procedure(name, def, last_segment)?;
239239+ let mlf = generate_procedure(name, def, last_segment, &ctx)?;
203240 output.push_str(&mlf);
204241 output.push('\n');
205242 }
206243 "subscription" => {
207207- let mlf = generate_subscription(name, def, last_segment)?;
244244+ let mlf = generate_subscription(name, def, last_segment, &ctx)?;
208245 output.push_str(&mlf);
209246 output.push('\n');
210247 }
···213250 output.push_str(&mlf);
214251 output.push('\n');
215252 }
216216- "object" => {
217217- let mlf = generate_def_type(name, def, last_segment)?;
253253+ _ => {
254254+ // All other types (object, string, array, union, etc.) are treated as def type
255255+ let mlf = generate_def_type(name, def, last_segment, &ctx)?;
218256 output.push_str(&mlf);
219257 output.push('\n');
220220- }
221221- _ => {
222222- // Unknown type, skip
223258 }
224259 }
225260 }
226261
227262 Ok(output)
263263+}
264264+
265265+struct ConversionContext {
266266+ current_namespace: String,
228267}
229268
230269/// Reserved words in MLF that need to be escaped
···
243282 }
244283}
245284
246246-fn generate_record(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
285285+fn generate_record(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
247286 let mut output = String::new();
248287
249288 // Add doc comment if present
···
253292 output.push_str(&format!("/// {}\n", line));
254293 }
255294 }
295295+ }
296296+
297297+ // Add @main annotation for "main" definitions
298298+ if name == "main" {
299299+ output.push_str("@main\n");
256300 }
257301
258302 // Use last segment of NSID for "main" definitions
···
301345 let is_required = required.contains(&field_name.as_str());
302346 let required_marker = if is_required { "!" } else { "" };
303347
304304- let field_type = generate_type(field_def)?;
348348+ let field_type = generate_type(field_def, ctx)?;
305349 let escaped_field_name = escape_name(field_name);
306350 output.push_str(&format!(
307351 " {}{}: {},\n",
···
313357 Ok(output)
314358}
315359
316316-fn generate_query(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
360360+fn generate_query(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
317361 let mut output = String::new();
318362
319363 // Add doc comment
···
325369 }
326370 }
327371
372372+ // Add @main annotation for "main" definitions
373373+ if name == "main" {
374374+ output.push_str("@main\n");
375375+ }
376376+
328377 let query_name = if name == "main" {
329378 escape_name(last_segment)
330379 } else {
···
352401 .map(|(param_name, param_def)| {
353402 let is_required = required.contains(&param_name.as_str());
354403 let required_marker = if is_required { "!" } else { "" };
355355- let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
404404+ let param_type = generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
356405 let escaped_param_name = escape_name(param_name);
357406
358407 // Add doc comment inline if present
···
377426 // Output type
378427 if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
379428 if let Some(schema) = output_obj.get("schema") {
380380- let return_type = generate_type(schema)?;
429429+ let return_type = generate_type(schema, ctx)?;
381430 output.push_str(&format!(": {}", return_type));
382431
383432 // Check for errors
···
400449 Ok(output)
401450}
402451
403403-fn generate_procedure(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
452452+fn generate_procedure(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
404453 let mut output = String::new();
405454
406455 // Add doc comment
···
412461 }
413462 }
414463
464464+ // Add @main annotation for "main" definitions
465465+ if name == "main" {
466466+ output.push_str("@main\n");
467467+ }
468468+
415469 let procedure_name = if name == "main" {
416470 escape_name(last_segment)
417471 } else {
···
441495 let is_required = required.contains(&param_name.as_str());
442496 let required_marker = if is_required { "!" } else { "" };
443497 let param_type =
444444- generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
498498+ generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
445499 let escaped_param_name = escape_name(param_name);
446500
447501 // Add doc comment inline if present
···470524 // Output type
471525 if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
472526 if let Some(schema) = output_obj.get("schema") {
473473- let return_type = generate_type(schema)?;
527527+ let return_type = generate_type(schema, ctx)?;
474528 output.push_str(&format!(": {}", return_type));
475529
476530 // Check for errors
···493547 Ok(output)
494548}
495549
496496-fn generate_subscription(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
550550+fn generate_subscription(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
497551 let mut output = String::new();
498552
499553 // Add doc comment
···505559 }
506560 }
507561
562562+ // Add @main annotation for "main" definitions
563563+ if name == "main" {
564564+ output.push_str("@main\n");
565565+ }
566566+
508567 let subscription_name = if name == "main" {
509568 escape_name(last_segment)
510569 } else {
···532591 .map(|(param_name, param_def)| {
533592 let is_required = required.contains(&param_name.as_str());
534593 let required_marker = if is_required { "!" } else { "" };
535535- let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
594594+ let param_type = generate_type(param_def, ctx).unwrap_or_else(|_| "unknown".to_string());
536595 let escaped_param_name = escape_name(param_name);
537596
538597 format!("{}{}: {}", escaped_param_name, required_marker, param_type)
···549608 // Message types
550609 if let Some(message) = def.get("message").and_then(|v| v.as_object()) {
551610 if let Some(schema) = message.get("schema") {
552552- let message_type = generate_type(schema)?;
611611+ let message_type = generate_type(schema, ctx)?;
553612 output.push_str(&format!(": {}", message_type));
554613 }
555614 }
···575634 Ok(output)
576635}
577636
578578-fn generate_def_type(name: &str, def: &Value, last_segment: &str) -> Result<String, MlfGenerateError> {
637637+fn generate_def_type(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
579638 let mut output = String::new();
580639
640640+ // Add doc comment if present
641641+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
642642+ if !desc.is_empty() {
643643+ for line in desc.lines() {
644644+ output.push_str(&format!("/// {}\n", line));
645645+ }
646646+ }
647647+ }
648648+
649649+ // Add @main annotation for "main" definitions
650650+ if name == "main" {
651651+ output.push_str("@main\n");
652652+ }
653653+
581654 // Use last segment of NSID for "main" definitions
582655 let def_name = if name == "main" {
583656 escape_name(last_segment)
···586659 };
587660
588661 output.push_str(&format!("def type {} = ", def_name));
589589- let type_str = generate_type_with_indent(def, 0)?;
662662+ let type_str = generate_type_with_indent(def, 0, ctx)?;
590663 output.push_str(&type_str);
591664 output.push_str(";\n");
592665
593666 Ok(output)
594667}
595668
596596-fn generate_type_with_indent(type_def: &Value, indent_level: usize) -> Result<String, MlfGenerateError> {
669669+fn generate_type_with_indent(type_def: &Value, indent_level: usize, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
597670 let type_name = type_def.get("type").and_then(|v| v.as_str());
598671
599672 match type_name {
···631704
632705 let is_required = required.contains(&field_name.as_str());
633706 let required_marker = if is_required { "!" } else { "" };
634634- let field_type = generate_type_with_indent(field_def, indent_level + 1)?;
707707+ let field_type = generate_type_with_indent(field_def, indent_level + 1, ctx)?;
635708 let escaped_field_name = escape_name(field_name);
636709 output.push_str(&format!(
637710 "{}{}{}: {},\n",
···642715 output.push_str(&format!("{}}}", indent));
643716 Ok(output)
644717 }
645645- _ => generate_type(type_def),
718718+ _ => generate_type(type_def, ctx),
646719 }
647720}
648721
649649-fn generate_type(type_def: &Value) -> Result<String, MlfGenerateError> {
722722+fn generate_type(type_def: &Value, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
650723 let type_name = type_def.get("type").and_then(|v| v.as_str());
651724
652725 match type_name {
···735808 .unwrap_or("unknown")
736809 .to_string()
737810 } else {
738738- generate_type(items)?
811811+ generate_type(items, ctx)?
739812 };
740813
741814 let mut result = format!("{}[]", item_type);
···773846
774847 let is_required = required.contains(&field_name.as_str());
775848 let required_marker = if is_required { "!" } else { "" };
776776- let field_type = generate_type(field_def)?;
849849+ let field_type = generate_type(field_def, ctx)?;
777850 let escaped_field_name = escape_name(field_name);
778851 output.push_str(&format!(
779852 " {}{}: {},\n",
···793866
794867 let type_strs: Vec<String> = refs
795868 .iter()
796796- .map(|r| generate_type(r).unwrap_or_else(|_| "unknown".to_string()))
869869+ .map(|r| generate_type(r, ctx).unwrap_or_else(|_| "unknown".to_string()))
797870 .collect();
798871
799872 let mut result = type_strs.join(" | ");
···807880 }
808881 Some("ref") => {
809882 if let Some(ref_str) = type_def.get("ref").and_then(|v| v.as_str()) {
810810- // Convert refs: strip leading # and convert remaining # to .
811811- // "#audio" -> "audio" (local ref, just the name)
812812- // "com.example#foo" -> "com.example.foo" (external ref)
813813- let clean_ref = ref_str.trim_start_matches('#').replace('#', ".");
814814- Ok(clean_ref)
883883+ // Handle references:
884884+ // "#defName" -> "defName" (local reference, same file)
885885+ // "namespace.id#defName" -> Check if same namespace, if so use "defName", else use full path
886886+
887887+ if let Some(stripped) = ref_str.strip_prefix('#') {
888888+ // Local reference: #defName -> defName
889889+ Ok(stripped.to_string())
890890+ } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
891891+ // Check if this references the current namespace
892892+ if namespace == ctx.current_namespace {
893893+ // Same namespace - use just the def name
894894+ Ok(def_name.to_string())
895895+ } else {
896896+ // Different namespace - use full NSID format
897897+ Ok(format!("{}.{}", namespace, def_name))
898898+ }
899899+ } else {
900900+ // No # at all - shouldn't happen in valid lexicons, but handle gracefully
901901+ Ok(ref_str.to_string())
902902+ }
815903 } else {
816904 Err(MlfGenerateError::InvalidLexicon {
817905 message: "Missing 'ref' in ref type".to_string(),
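The reference-handling rules added in the hunk above (local `#defName` refs, same-namespace refs collapsed to the bare def name, cross-namespace refs expanded to a full NSID path) can be exercised in isolation. The sketch below mirrors that logic as a free function; `normalize_ref` is an illustrative name for this example, not an API exported by the crate:

```rust
// Hypothetical standalone sketch of the ref-normalization rules from the
// diff above. `normalize_ref` is an illustrative name, not the real API.
fn normalize_ref(ref_str: &str, current_namespace: &str) -> String {
    if let Some(stripped) = ref_str.strip_prefix('#') {
        // "#audio" -> "audio": local reference within the same file
        stripped.to_string()
    } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
        if namespace == current_namespace {
            // Same namespace: collapse to just the definition name
            def_name.to_string()
        } else {
            // Different namespace: expand to a full dotted NSID path
            format!("{}.{}", namespace, def_name)
        }
    } else {
        // No '#' at all: pass through unchanged
        ref_str.to_string()
    }
}

fn main() {
    assert_eq!(normalize_ref("#audio", "com.example"), "audio");
    assert_eq!(normalize_ref("com.example#foo", "com.example"), "foo");
    assert_eq!(
        normalize_ref("com.example#foo", "app.bsky.feed.post"),
        "com.example.foo"
    );
    println!("ok");
}
```

Note the graceful fallback for a ref with no `#`: it should not occur in valid lexicons, but the converter passes it through rather than erroring.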
+941
mlf-cli/src/generate/mlf.rs.backup
···11+use miette::Diagnostic;
22+use serde_json::Value;
33+use std::path::PathBuf;
44+use thiserror::Error;
55+66+#[derive(Error, Debug, Diagnostic)]
77+pub enum MlfGenerateError {
88+ #[error("Failed to read file: {path}")]
99+ #[diagnostic(code(mlf::generate::read_file))]
1010+ #[allow(dead_code)]
1111+ ReadFile {
1212+ path: String,
1313+ #[source]
1414+ source: std::io::Error,
1515+ },
1616+1717+ #[error("Failed to parse JSON: {path}")]
1818+ #[diagnostic(code(mlf::generate::parse_json))]
1919+ #[allow(dead_code)]
2020+ ParseJson {
2121+ path: String,
2222+ #[source]
2323+ source: serde_json::Error,
2424+ },
2525+2626+ #[error("Failed to write output: {path}")]
2727+ #[diagnostic(code(mlf::generate::write_output))]
2828+ WriteOutput {
2929+ path: String,
3030+ #[source]
3131+ source: std::io::Error,
3232+ },
3333+3434+ #[error("Invalid lexicon format: {message}")]
3535+ #[diagnostic(code(mlf::generate::invalid_lexicon))]
3636+ InvalidLexicon { message: String },
3737+3838+ #[error("Failed to expand glob pattern")]
3939+ #[diagnostic(code(mlf::generate::glob_error))]
4040+ GlobError {
4141+ #[source]
4242+ source: glob::GlobError,
4343+ },
4444+4545+ #[error("Invalid glob pattern: {pattern}")]
4646+ #[diagnostic(code(mlf::generate::invalid_glob))]
4747+ InvalidGlob {
4848+ pattern: String,
4949+ #[source]
5050+ source: glob::PatternError,
5151+ },
5252+}
5353+5454+pub fn run(input_patterns: Vec<String>, output_dir: PathBuf) -> Result<(), MlfGenerateError> {
5555+ let mut file_paths = Vec::new();
5656+5757+ for pattern in input_patterns {
5858+ if pattern.contains('*') || pattern.contains('?') {
5959+ for entry in glob::glob(&pattern).map_err(|source| MlfGenerateError::InvalidGlob {
6060+ pattern: pattern.clone(),
6161+ source,
6262+ })? {
6363+ let path = entry.map_err(|source| MlfGenerateError::GlobError { source })?;
6464+ file_paths.push(path);
6565+ }
6666+ } else {
6767+ file_paths.push(PathBuf::from(pattern));
6868+ }
6969+ }
7070+7171+ std::fs::create_dir_all(&output_dir).map_err(|source| MlfGenerateError::WriteOutput {
7272+ path: output_dir.display().to_string(),
7373+ source,
7474+ })?;
7575+7676+ let mut errors = Vec::new();
7777+ let mut success_count = 0;
7878+7979+ for file_path in file_paths {
8080+ let source = match std::fs::read_to_string(&file_path) {
8181+ Ok(s) => s,
8282+ Err(source) => {
8383+ errors.push((
8484+ file_path.display().to_string(),
8585+ format!("Failed to read file: {}", source),
8686+ ));
8787+ continue;
8888+ }
8989+ };
9090+9191+ let json: Value = match serde_json::from_str(&source) {
9292+ Ok(j) => j,
9393+ Err(source) => {
9494+ errors.push((
9595+ file_path.display().to_string(),
9696+ format!("Failed to parse JSON: {}", source),
9797+ ));
9898+ continue;
9999+ }
100100+ };
101101+102102+ let mlf_content = match generate_mlf_from_json(&json) {
103103+ Ok(content) => content,
104104+ Err(e) => {
105105+ errors.push((file_path.display().to_string(), format!("{:?}", e)));
106106+ continue;
107107+ }
108108+ };
109109+110110+ // Extract namespace from JSON "id" field
111111+ let namespace = json
112112+ .get("id")
113113+ .and_then(|v| v.as_str())
114114+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
115115+ message: "Missing 'id' field in lexicon".to_string(),
116116+ })?;
117117+118118+ // Create output path from namespace
119119+ let mut output_path = output_dir.clone();
120120+ for segment in namespace.split('.') {
121121+ output_path.push(segment);
122122+ }
123123+ if let Err(source) = std::fs::create_dir_all(&output_path.parent().unwrap()) {
124124+ errors.push((
125125+ file_path.display().to_string(),
126126+ format!("Failed to create directory: {}", source),
127127+ ));
128128+ continue;
129129+ }
130130+ output_path.set_extension("mlf");
131131+132132+ if let Err(source) = std::fs::write(&output_path, mlf_content) {
133133+ errors.push((
134134+ output_path.display().to_string(),
135135+ format!("Failed to write file: {}", source),
136136+ ));
137137+ continue;
138138+ }
139139+140140+ println!("Generated: {}", output_path.display());
141141+ success_count += 1;
142142+ }
143143+144144+ if !errors.is_empty() {
145145+ eprintln!(
146146+ "\n{} file(s) generated successfully, {} error(s) encountered:\n",
147147+ success_count,
148148+ errors.len()
149149+ );
150150+ for (path, error) in &errors {
151151+ eprintln!(" {} - {}", path, error);
152152+ }
153153+ eprintln!();
154154+ return Err(MlfGenerateError::InvalidLexicon {
155155+ message: format!("{} errors total", errors.len()),
156156+ });
157157+ }
158158+159159+ println!("\nSuccessfully generated {} file(s)", success_count);
160160+ Ok(())
161161+}
162162+163163+pub fn generate_mlf_from_json(json: &Value) -> Result<String, MlfGenerateError> {
164164+ let mut output = String::new();
165165+166166+ // Extract NSID to get the last segment for "main" definitions
167167+ let nsid = json
168168+ .get("id")
169169+ .and_then(|v| v.as_str())
170170+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
171171+ message: "Missing 'id' field in lexicon".to_string(),
172172+ })?;
173173+174174+ let last_segment = nsid.split('.').last().unwrap_or("main");
175175+176176+ let defs = json.get("defs").and_then(|v| v.as_object()).ok_or_else(|| {
177177+ MlfGenerateError::InvalidLexicon {
178178+ message: "Missing or invalid 'defs' field".to_string(),
179179+ }
180180+ })?;
181181+182182+ // Create a context to pass the current namespace to type generation
183183+ let ctx = ConversionContext {
184184+ current_namespace: nsid.to_string(),
185185+ };
186186+187187+ // Process all definitions
188188+ for (name, def) in defs {
189189+ let def_type = def.get("type").and_then(|v| v.as_str()).ok_or_else(|| {
190190+ MlfGenerateError::InvalidLexicon {
191191+ message: format!("Missing 'type' field for definition '{}'", name),
192192+ }
193193+ })?;
194194+195195+ match def_type {
196196+ "record" => {
197197+ let mlf = generate_record(name, def, last_segment, &ctx)?;
198198+ output.push_str(&mlf);
199199+ output.push('\n');
200200+ }
201201+ "query" => {
202202+ let mlf = generate_query(name, def, last_segment, &ctx)?;
203203+ output.push_str(&mlf);
204204+ output.push('\n');
205205+ }
206206+ "procedure" => {
207207+ let mlf = generate_procedure(name, def, last_segment, &ctx)?;
208208+ output.push_str(&mlf);
209209+ output.push('\n');
210210+ }
211211+ "subscription" => {
212212+ let mlf = generate_subscription(name, def, last_segment, &ctx)?;
213213+ output.push_str(&mlf);
214214+ output.push('\n');
215215+ }
216216+ "token" => {
217217+ let mlf = generate_token(name, def)?;
218218+ output.push_str(&mlf);
219219+ output.push('\n');
220220+ }
221221+ "object" => {
222222+ let mlf = generate_def_type(name, def, last_segment, &ctx)?;
223223+ output.push_str(&mlf);
224224+ output.push('\n');
225225+ }
226226+ _ => {
227227+ // Unknown type, skip
228228+ }
229229+ }
230230+ }
231231+232232+ Ok(output)
233233+}
234234+235235+struct ConversionContext {
236236+ current_namespace: String,
237237+}
238238+239239+/// Reserved words in MLF that need to be escaped
240240+const RESERVED_WORDS: &[&str] = &[
241241+ "main", "record", "query", "procedure", "subscription", "token", "def", "type", "use",
242242+ "pub", "alias", "namespace", "constrained", "error", "unit", "null", "boolean",
243243+ "integer", "string", "bytes", "blob", "unknown", "array", "object", "union", "ref",
244244+];
245245+246246+/// Escape a name if it's a reserved word
247247+fn escape_name(name: &str) -> String {
248248+ if RESERVED_WORDS.contains(&name) {
249249+ format!("`{}`", name)
250250+ } else {
251251+ name.to_string()
252252+ }
253253+}
254254+255255+fn generate_record(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
256256+ let mut output = String::new();
257257+258258+ // Add doc comment if present
259259+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
260260+ if !desc.is_empty() {
261261+ for line in desc.lines() {
262262+ output.push_str(&format!("/// {}\n", line));
263263+ }
264264+ }
265265+ }
266266+267267+ // Add @main annotation for "main" definitions
268268+ if name == "main" {
269269+ output.push_str("@main\n");
270270+ }
271271+272272+ // Use last segment of NSID for "main" definitions
273273+ let record_name = if name == "main" {
274274+ escape_name(last_segment)
275275+ } else {
276276+ escape_name(name)
277277+ };
278278+279279+ output.push_str(&format!("record {} {{\n", record_name));
280280+281281+ // Get the record object
282282+ let record_obj = def.get("record").and_then(|v| v.as_object()).ok_or_else(|| {
283283+ MlfGenerateError::InvalidLexicon {
284284+ message: format!("Missing 'record' field in record definition '{}'", name),
285285+ }
286286+ })?;
287287+288288+ let properties = record_obj
289289+ .get("properties")
290290+ .and_then(|v| v.as_object())
291291+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
292292+ message: format!("Missing 'properties' in record '{}'", name),
293293+ })?;
294294+295295+ let required = record_obj
296296+ .get("required")
297297+ .and_then(|v| v.as_array())
298298+ .map(|arr| {
299299+ arr.iter()
300300+ .filter_map(|v| v.as_str())
301301+ .collect::<Vec<_>>()
302302+ })
303303+ .unwrap_or_default();
304304+305305+ for (field_name, field_def) in properties {
306306+ // Add field doc comment
307307+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
308308+ if !desc.is_empty() {
309309+ for line in desc.lines() {
310310+ output.push_str(&format!(" /// {}\n", line));
311311+ }
312312+ }
313313+ }
314314+315315+ let is_required = required.contains(&field_name.as_str());
316316+ let required_marker = if is_required { "!" } else { "" };
317317+318318+ let field_type = generate_type(field_def)?;
319319+ let escaped_field_name = escape_name(field_name);
320320+ output.push_str(&format!(
321321+ " {}{}: {},\n",
322322+ escaped_field_name, required_marker, field_type
323323+ ));
324324+ }
325325+326326+ output.push_str("}\n");
327327+ Ok(output)
328328+}
329329+330330+fn generate_query(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
331331+ let mut output = String::new();
332332+333333+ // Add doc comment
334334+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
335335+ if !desc.is_empty() {
336336+ for line in desc.lines() {
337337+ output.push_str(&format!("/// {}\n", line));
338338+ }
339339+ }
340340+ }
341341+342342+ // Add @main annotation for "main" definitions
343343+ if name == "main" {
344344+ output.push_str("@main\n");
345345+ }
346346+347347+ let query_name = if name == "main" {
348348+ escape_name(last_segment)
349349+ } else {
350350+ escape_name(name)
351351+ };
352352+ output.push_str(&format!("query {}", query_name));
353353+354354+ // Parameters
355355+ output.push('(');
356356+ if let Some(params) = def.get("parameters").and_then(|v| v.as_object()) {
357357+ let properties = params.get("properties").and_then(|v| v.as_object());
358358+ let required = params
359359+ .get("required")
360360+ .and_then(|v| v.as_array())
361361+ .map(|arr| {
362362+ arr.iter()
363363+ .filter_map(|v| v.as_str())
364364+ .collect::<Vec<_>>()
365365+ })
366366+ .unwrap_or_default();
367367+368368+ if let Some(props) = properties {
369369+ let param_strs: Vec<String> = props
370370+ .iter()
371371+ .map(|(param_name, param_def)| {
372372+ let is_required = required.contains(&param_name.as_str());
373373+ let required_marker = if is_required { "!" } else { "" };
374374+ let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
375375+ let escaped_param_name = escape_name(param_name);
376376+377377+ // Add doc comment inline if present
378378+ let mut result = String::new();
379379+ if let Some(desc) = param_def.get("description").and_then(|v| v.as_str()) {
380380+ if !desc.is_empty() {
381381+ result.push_str(&format!("\n /// {}\n ", desc));
382382+ }
383383+ }
384384+ result.push_str(&format!("{}{}: {}", escaped_param_name, required_marker, param_type));
385385+ result
386386+ })
387387+ .collect();
388388+
389389+ if !param_strs.is_empty() {
390390+ output.push_str(&param_strs.join(","));
391391+ }
392392+ }
393393+ }
394394+ output.push(')');
395395+396396+ // Output type
397397+ if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
398398+ if let Some(schema) = output_obj.get("schema") {
399399+ let return_type = generate_type(schema)?;
400400+ output.push_str(&format!(": {}", return_type));
401401+402402+ // Check for errors
403403+ if let Some(errors) = output_obj.get("errors").and_then(|v| v.as_object()) {
404404+ output.push_str(" | error {\n");
405405+ for (error_name, error_def) in errors {
406406+ if let Some(desc) = error_def.get("description").and_then(|v| v.as_str()) {
407407+ if !desc.is_empty() {
408408+ output.push_str(&format!(" /// {}\n", desc));
409409+ }
410410+ }
411411+ output.push_str(&format!(" {},\n", error_name));
412412+ }
413413+ output.push('}');
414414+ }
415415+ }
416416+ }
417417+418418+ output.push_str(";\n");
419419+ Ok(output)
420420+}
421421+422422+fn generate_procedure(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
423423+ let mut output = String::new();
424424+425425+ // Add doc comment
426426+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
427427+ if !desc.is_empty() {
428428+ for line in desc.lines() {
429429+ output.push_str(&format!("/// {}\n", line));
430430+ }
431431+ }
432432+ }
433433+434434+ // Add @main annotation for "main" definitions
435435+ if name == "main" {
436436+ output.push_str("@main\n");
437437+ }
438438+439439+ let procedure_name = if name == "main" {
440440+ escape_name(last_segment)
441441+ } else {
442442+ escape_name(name)
443443+ };
444444+ output.push_str(&format!("procedure {}", procedure_name));
445445+446446+ // Input parameters
447447+ output.push('(');
448448+ if let Some(input) = def.get("input").and_then(|v| v.as_object()) {
449449+ if let Some(schema) = input.get("schema").and_then(|v| v.as_object()) {
450450+ let properties = schema.get("properties").and_then(|v| v.as_object());
451451+ let required = schema
452452+ .get("required")
453453+ .and_then(|v| v.as_array())
454454+ .map(|arr| {
455455+ arr.iter()
456456+ .filter_map(|v| v.as_str())
457457+ .collect::<Vec<_>>()
458458+ })
459459+ .unwrap_or_default();
460460+461461+ if let Some(props) = properties {
462462+ let param_strs: Vec<String> = props
463463+ .iter()
464464+ .map(|(param_name, param_def)| {
465465+ let is_required = required.contains(&param_name.as_str());
466466+ let required_marker = if is_required { "!" } else { "" };
467467+ let param_type =
468468+ generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
469469+ let escaped_param_name = escape_name(param_name);
470470+471471+ // Add doc comment inline if present
472472+ let mut result = String::new();
473473+ if let Some(desc) = param_def.get("description").and_then(|v| v.as_str()) {
474474+ if !desc.is_empty() {
475475+ result.push_str(&format!("\n /// {}\n ", desc));
476476+ }
477477+ }
478478+ result.push_str(&format!(
479479+ "{}{}: {}",
480480+ escaped_param_name, required_marker, param_type
481481+ ));
482482+ result
483483+ })
484484+ .collect();
485485+
486486+ if !param_strs.is_empty() {
487487+ output.push_str(&param_strs.join(","));
488488+ }
489489+ }
490490+ }
491491+ }
492492+ output.push(')');
493493+494494+ // Output type
495495+ if let Some(output_obj) = def.get("output").and_then(|v| v.as_object()) {
496496+ if let Some(schema) = output_obj.get("schema") {
497497+ let return_type = generate_type(schema)?;
498498+ output.push_str(&format!(": {}", return_type));
499499+500500+ // Check for errors
501501+ if let Some(errors) = output_obj.get("errors").and_then(|v| v.as_object()) {
502502+ output.push_str(" | error {\n");
503503+ for (error_name, error_def) in errors {
504504+ if let Some(desc) = error_def.get("description").and_then(|v| v.as_str()) {
505505+ if !desc.is_empty() {
506506+ output.push_str(&format!(" /// {}\n", desc));
507507+ }
508508+ }
509509+ output.push_str(&format!(" {},\n", error_name));
510510+ }
511511+ output.push('}');
512512+ }
513513+ }
514514+ }
515515+516516+ output.push_str(";\n");
517517+ Ok(output)
518518+}
519519+520520+fn generate_subscription(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
521521+ let mut output = String::new();
522522+523523+ // Add doc comment
524524+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
525525+ if !desc.is_empty() {
526526+ for line in desc.lines() {
527527+ output.push_str(&format!("/// {}\n", line));
528528+ }
529529+ }
530530+ }
531531+532532+ // Add @main annotation for "main" definitions
533533+ if name == "main" {
534534+ output.push_str("@main\n");
535535+ }
536536+537537+ let subscription_name = if name == "main" {
538538+ escape_name(last_segment)
539539+ } else {
540540+ escape_name(name)
541541+ };
542542+ output.push_str(&format!("subscription {}", subscription_name));
543543+544544+ // Parameters
545545+ output.push('(');
546546+ if let Some(params) = def.get("parameters").and_then(|v| v.as_object()) {
547547+ let properties = params.get("properties").and_then(|v| v.as_object());
548548+ let required = params
549549+ .get("required")
550550+ .and_then(|v| v.as_array())
551551+ .map(|arr| {
552552+ arr.iter()
553553+ .filter_map(|v| v.as_str())
554554+ .collect::<Vec<_>>()
555555+ })
556556+ .unwrap_or_default();
557557+558558+ if let Some(props) = properties {
559559+ let param_strs: Vec<String> = props
560560+ .iter()
561561+ .map(|(param_name, param_def)| {
562562+ let is_required = required.contains(&param_name.as_str());
563563+ let required_marker = if is_required { "!" } else { "" };
564564+ let param_type = generate_type(param_def).unwrap_or_else(|_| "unknown".to_string());
565565+ let escaped_param_name = escape_name(param_name);
566566+567567+ format!("{}{}: {}", escaped_param_name, required_marker, param_type)
568568+ })
569569+ .collect();
570570+
571571+ if !param_strs.is_empty() {
572572+ output.push_str(&param_strs.join(", "));
573573+ }
574574+ }
575575+ }
576576+ output.push(')');
577577+578578+ // Message types
579579+ if let Some(message) = def.get("message").and_then(|v| v.as_object()) {
580580+ if let Some(schema) = message.get("schema") {
581581+ let message_type = generate_type(schema)?;
582582+ output.push_str(&format!(": {}", message_type));
583583+ }
584584+ }
585585+586586+ output.push_str(";\n");
587587+ Ok(output)
588588+}
589589+590590+fn generate_token(name: &str, def: &Value) -> Result<String, MlfGenerateError> {
591591+ let mut output = String::new();
592592+593593+ // Add doc comment
594594+ if let Some(desc) = def.get("description").and_then(|v| v.as_str()) {
595595+ if !desc.is_empty() {
596596+ for line in desc.lines() {
597597+ output.push_str(&format!("/// {}\n", line));
598598+ }
599599+ }
600600+ }
601601+602602+ let escaped_name = escape_name(name);
603603+ output.push_str(&format!("token {};\n", escaped_name));
604604+ Ok(output)
605605+}
606606+607607+fn generate_def_type(name: &str, def: &Value, last_segment: &str, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
608608+ let mut output = String::new();
609609+610610+ // Add @main annotation for "main" definitions
611611+ if name == "main" {
612612+ output.push_str("@main\n");
613613+ }
614614+615615+ // Use last segment of NSID for "main" definitions
616616+ let def_name = if name == "main" {
617617+ escape_name(last_segment)
618618+ } else {
619619+ escape_name(name)
620620+ };
621621+622622+ output.push_str(&format!("def type {} = ", def_name));
623623+ let type_str = generate_type_with_indent(def, 0)?;
624624+ output.push_str(&type_str);
625625+ output.push_str(";\n");
626626+627627+ Ok(output)
628628+}
629629+630630+fn generate_type_with_indent(type_def: &Value, indent_level: usize, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
631631+ let type_name = type_def.get("type").and_then(|v| v.as_str());
632632+633633+ match type_name {
634634+ Some("object") => {
635635+ let indent = " ".repeat(indent_level);
636636+ let field_indent = " ".repeat(indent_level + 1);
637637+638638+ let mut output = String::from("{\n");
639639+ let properties = type_def
640640+ .get("properties")
641641+ .and_then(|v| v.as_object())
642642+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
643643+ message: "Missing 'properties' in object type".to_string(),
644644+ })?;
645645+646646+ let required = type_def
647647+ .get("required")
648648+ .and_then(|v| v.as_array())
649649+ .map(|arr| {
650650+ arr.iter()
651651+ .filter_map(|v| v.as_str())
652652+ .collect::<Vec<_>>()
653653+ })
654654+ .unwrap_or_default();
655655+656656+ for (field_name, field_def) in properties {
657657+ // Add field doc comment
658658+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
659659+ if !desc.is_empty() {
660660+ for line in desc.lines() {
661661+ output.push_str(&format!("{}/// {}\n", field_indent, line));
662662+ }
663663+ }
664664+ }
665665+666666+ let is_required = required.contains(&field_name.as_str());
667667+ let required_marker = if is_required { "!" } else { "" };
668668+ let field_type = generate_type_with_indent(field_def, indent_level + 1)?;
669669+ let escaped_field_name = escape_name(field_name);
670670+ output.push_str(&format!(
671671+ "{}{}{}: {},\n",
672672+ field_indent, escaped_field_name, required_marker, field_type
673673+ ));
674674+ }
675675+676676+ output.push_str(&format!("{}}}", indent));
677677+ Ok(output)
678678+ }
679679+ _ => generate_type(type_def),
680680+ }
681681+}
682682+683683+fn generate_type(type_def: &Value, ctx: &ConversionContext) -> Result<String, MlfGenerateError> {
684684+ let type_name = type_def.get("type").and_then(|v| v.as_str());
685685+686686+ match type_name {
687687+ Some("null") => Ok("null".to_string()),
688688+ Some("boolean") => Ok("boolean".to_string()),
689689+ Some("integer") => {
690690+ let mut result = "integer".to_string();
691691+ result = apply_constraints(result, type_def);
692692+ Ok(result)
693693+ }
694694+ Some("string") => {
695695+ // Check if this is a format string that maps to a prelude type
696696+ if let Some(format) = type_def.get("format").and_then(|v| v.as_str()) {
697697+ let prelude_type = match format {
698698+ "did" => "Did",
699699+ "at-uri" => "AtUri",
700700+ "at-identifier" => "AtIdentifier",
701701+ "handle" => "Handle",
702702+ "datetime" => "Datetime",
703703+ "uri" => "Uri",
704704+ "cid" => "Cid",
705705+ "nsid" => "Nsid",
706706+ "tid" => "Tid",
707707+ "record-key" => "RecordKey",
708708+ "language" => "Language",
709709+ _ => {
710710+ // Unknown format, fall through to normal string with constraints
711711+ let mut result = "string".to_string();
712712+ result = apply_constraints(result, type_def);
713713+ return Ok(result);
714714+ }
715715+ };
716716+ // If it's a known prelude type with only the format constraint, use the prelude type directly
717717+ // Check if there are other constraints besides format
718718+ let has_other_constraints = type_def.get("minLength").is_some()
719719+ || type_def.get("maxLength").is_some()
720720+ || type_def.get("minGraphemes").is_some()
721721+ || type_def.get("maxGraphemes").is_some()
722722+ || type_def.get("enum").is_some()
723723+ || type_def.get("knownValues").is_some()
724724+ || type_def.get("default").is_some();
725725+726726+ if !has_other_constraints {
727727+ return Ok(prelude_type.to_string());
728728+ }
729729+ }
730730+731731+ let mut result = "string".to_string();
732732+ result = apply_constraints(result, type_def);
733733+ Ok(result)
734734+ }
735735+ Some("bytes") => Ok("bytes".to_string()),
736736+ Some("blob") => {
737737+ let mut result = "blob".to_string();
738738+ result = apply_constraints(result, type_def);
739739+ Ok(result)
740740+ }
741741+ Some("unknown") => Ok("unknown".to_string()),
742742+ Some("array") => {
743743+ let items = type_def.get("items").ok_or_else(|| {
744744+ MlfGenerateError::InvalidLexicon {
745745+ message: "Missing 'items' in array type".to_string(),
746746+ }
747747+ })?;
748748+749749+ // Check if items have constraints
750750+ let items_obj = items.as_object();
751751+ let has_item_constraints = items_obj.map_or(false, |obj| {
752752+ obj.contains_key("minLength") ||
753753+ obj.contains_key("maxLength") ||
754754+ obj.contains_key("minGraphemes") ||
755755+ obj.contains_key("maxGraphemes") ||
756756+ obj.contains_key("minimum") ||
757757+ obj.contains_key("maximum") ||
758758+ obj.contains_key("enum") ||
759759+ obj.contains_key("knownValues") ||
760760+ obj.contains_key("default")
761761+ });
762762+763763+ let item_type = if has_item_constraints {
764764+ // If item has constraints, we need to wrap in parentheses to apply constraints before []
765765+ // For now, just generate the base type without item constraints
766766+ // TODO: Consider generating a type alias for complex constrained items
767767+ items.get("type")
768768+ .and_then(|t| t.as_str())
769769+ .unwrap_or("unknown")
770770+ .to_string()
771771+ } else {
772772+ generate_type(items)?
773773+ };
774774+775775+ let mut result = format!("{}[]", item_type);
776776+ result = apply_constraints(result, type_def);
777777+ Ok(result)
778778+ }
779779+ Some("object") => {
780780+ let mut output = String::from("{\n");
781781+ let properties = type_def
782782+ .get("properties")
783783+ .and_then(|v| v.as_object())
784784+ .ok_or_else(|| MlfGenerateError::InvalidLexicon {
785785+ message: "Missing 'properties' in object type".to_string(),
786786+ })?;
787787+788788+ let required = type_def
789789+ .get("required")
790790+ .and_then(|v| v.as_array())
791791+ .map(|arr| {
792792+ arr.iter()
793793+ .filter_map(|v| v.as_str())
794794+ .collect::<Vec<_>>()
795795+ })
796796+ .unwrap_or_default();
797797+798798+ for (field_name, field_def) in properties {
799799+ // Add field doc comment
800800+ if let Some(desc) = field_def.get("description").and_then(|v| v.as_str()) {
801801+ if !desc.is_empty() {
802802+ for line in desc.lines() {
803803+ output.push_str(&format!(" /// {}\n", line));
804804+ }
805805+ }
806806+ }
807807+808808+ let is_required = required.contains(&field_name.as_str());
809809+ let required_marker = if is_required { "!" } else { "" };
810810+ let field_type = generate_type(field_def)?;
811811+ let escaped_field_name = escape_name(field_name);
812812+ output.push_str(&format!(
813813+ " {}{}: {},\n",
814814+ escaped_field_name, required_marker, field_type
815815+ ));
816816+ }
817817+818818+ output.push_str(" }");
819819+ Ok(output)
820820+ }
821821+ Some("union") => {
822822+ let refs = type_def.get("refs").and_then(|v| v.as_array()).ok_or_else(|| {
823823+ MlfGenerateError::InvalidLexicon {
824824+ message: "Missing 'refs' in union type".to_string(),
825825+ }
826826+ })?;
827827+828828+ let type_strs: Vec<String> = refs
829829+ .iter()
830830+ // Union refs are plain strings ("#local" or "ns.id#defName"), not type
830830+ // objects; resolve them directly, since recursing into generate_type on
830830+ // a string would render every ref as "unknown".
830830+ .map(|r| match r.as_str() {
830830+ Some(s) => match s.split_once('#') {
830830+ Some(("", local)) => local.to_string(),
830830+ Some((ns, def)) => format!("{}.{}", ns, def),
830830+ None => s.to_string(),
830830+ },
830830+ None => "unknown".to_string(),
830830+ })
831831+ .collect();
832832+833833+ let mut result = type_strs.join(" | ");
834834+835835+ // Check if closed
836836+ if type_def.get("closed").and_then(|v| v.as_bool()).unwrap_or(false) {
837837+ result.push_str(" | !");
838838+ }
839839+840840+ Ok(result)
841841+ }
842842+ Some("ref") => {
843843+ if let Some(ref_str) = type_def.get("ref").and_then(|v| v.as_str()) {
844844+ // Handle references:
845845+ // "#defName" -> "defName" (local reference, same file)
846846+ // "namespace.id#defName" -> Check if same namespace, if so use "defName", else use full path
847847+848848+ if let Some(stripped) = ref_str.strip_prefix('#') {
849849+ // Local reference: #defName -> defName
850850+ Ok(stripped.to_string())
851851+ } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
852852+ // Check if this is the current namespace
853853+ // For now, we'll just use the def name if it's the same namespace
854854+ // Note: This requires passing context through, which we'll add
855855+ // For external refs, we keep the full NSID format
856856+ Ok(format!("{}.{}", namespace, def_name))
857857+ } else {
858858+ // No # at all - shouldn't happen in valid lexicons, but handle gracefully
859859+ Ok(ref_str.to_string())
860860+ }
861861+ } else {
862862+ Err(MlfGenerateError::InvalidLexicon {
863863+ message: "Missing 'ref' in ref type".to_string(),
864864+ })
865865+ }
866866+ }
867867+ _ => Ok("unknown".to_string()),
868868+ }
869869+}
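The `ref` branch above reduces to a small rewrite rule on the ref string. Isolated as a standalone sketch for clarity (`resolve_ref` is an illustrative name, not a function in this codebase):

```rust
// Sketch of the ref-resolution rule used by the "ref" branch above:
//   "#defName"             -> local reference, keep just "defName"
//   "ns.id#defName"        -> external reference, rendered as "ns.id.defName"
//   anything without a '#' -> passed through unchanged
fn resolve_ref(ref_str: &str) -> String {
    if let Some(stripped) = ref_str.strip_prefix('#') {
        stripped.to_string()
    } else if let Some((namespace, def_name)) = ref_str.split_once('#') {
        format!("{}.{}", namespace, def_name)
    } else {
        ref_str.to_string()
    }
}

fn main() {
    assert_eq!(resolve_ref("#postMeta"), "postMeta");
    assert_eq!(resolve_ref("com.example.thread#main"), "com.example.thread.main");
    assert_eq!(resolve_ref("app.bsky.feed.post"), "app.bsky.feed.post");
    println!("ok");
}
```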
870870+871871+fn apply_constraints(mut type_str: String, type_def: &Value) -> String {
872872+ let mut constraints = Vec::new();
873873+874874+ if let Some(min_length) = type_def.get("minLength").and_then(|v| v.as_i64()) {
875875+ constraints.push(format!("minLength: {}", min_length));
876876+ }
877877+ if let Some(max_length) = type_def.get("maxLength").and_then(|v| v.as_i64()) {
878878+ constraints.push(format!("maxLength: {}", max_length));
879879+ }
880880+ if let Some(min_graphemes) = type_def.get("minGraphemes").and_then(|v| v.as_i64()) {
881881+ constraints.push(format!("minGraphemes: {}", min_graphemes));
882882+ }
883883+ if let Some(max_graphemes) = type_def.get("maxGraphemes").and_then(|v| v.as_i64()) {
884884+ constraints.push(format!("maxGraphemes: {}", max_graphemes));
885885+ }
886886+ if let Some(minimum) = type_def.get("minimum").and_then(|v| v.as_i64()) {
887887+ constraints.push(format!("minimum: {}", minimum));
888888+ }
889889+ if let Some(maximum) = type_def.get("maximum").and_then(|v| v.as_i64()) {
890890+ constraints.push(format!("maximum: {}", maximum));
891891+ }
892892+ if let Some(format) = type_def.get("format").and_then(|v| v.as_str()) {
893893+ constraints.push(format!("format: \"{}\"", format));
894894+ }
895895+ if let Some(enum_vals) = type_def.get("enum").and_then(|v| v.as_array()) {
896896+ let vals: Vec<String> = enum_vals
897897+ .iter()
898898+ .filter_map(|v| v.as_str())
899899+ .map(|s| format!("\"{}\"", s))
900900+ .collect();
901901+ constraints.push(format!("enum: [{}]", vals.join(", ")));
902902+ }
903903+ if let Some(known_vals) = type_def.get("knownValues").and_then(|v| v.as_array()) {
904904+ let vals: Vec<String> = known_vals
905905+ .iter()
906906+ .filter_map(|v| v.as_str())
907907+ .map(|s| format!("\"{}\"", s))
908908+ .collect();
909909+ constraints.push(format!("knownValues: [{}]", vals.join(", ")));
910910+ }
911911+ if let Some(accept) = type_def.get("accept").and_then(|v| v.as_array()) {
912912+ let mimes: Vec<String> = accept
913913+ .iter()
914914+ .filter_map(|v| v.as_str())
915915+ .map(|s| format!("\"{}\"", s))
916916+ .collect();
917917+ constraints.push(format!("accept: [{}]", mimes.join(", ")));
918918+ }
919919+ if let Some(max_size) = type_def.get("maxSize").and_then(|v| v.as_i64()) {
920920+ constraints.push(format!("maxSize: {}", max_size));
921921+ }
922922+ if let Some(default) = type_def.get("default") {
923923+ let default_str = match default {
924924+ Value::String(s) => format!("\"{}\"", s),
925925+ Value::Number(n) => n.to_string(),
926926+ Value::Bool(b) => b.to_string(),
927927+ _ => "null".to_string(),
928928+ };
929929+ constraints.push(format!("default: {}", default_str));
930930+ }
931931+932932+ if !constraints.is_empty() {
933933+ type_str.push_str(" constrained {\n");
934934+ for constraint in &constraints {
935935+ type_str.push_str(&format!(" {},\n", constraint));
936936+ }
937937+ type_str.push_str(" }");
938938+ }
939939+940940+ type_str
941941+}
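Taken together, `apply_constraints` collects `key: value` pairs and renders them as a trailing `constrained { ... }` block on the base type. A minimal self-contained sketch of that rendering step (the function name and exact indentation here are illustrative, not the generator's literal output):

```rust
// Sketch of the constraint rendering done by apply_constraints above:
// each collected "key: value" pair is appended inside `constrained { ... }`.
fn render_constrained(base: &str, constraints: &[(&str, &str)]) -> String {
    let mut out = base.to_string();
    if !constraints.is_empty() {
        out.push_str(" constrained {\n");
        for (key, value) in constraints {
            out.push_str(&format!("    {}: {},\n", key, value));
        }
        out.push_str("  }");
    }
    out
}

fn main() {
    // No constraints: the base type is returned untouched.
    assert_eq!(render_constrained("bytes", &[]), "bytes");

    let rendered = render_constrained("string", &[("maxLength", "200"), ("format", "\"uri\"")]);
    assert_eq!(
        rendered,
        "string constrained {\n    maxLength: 200,\n    format: \"uri\",\n  }"
    );
    println!("{}", rendered);
}
```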
···11+// Test imports - new syntax with dot
22+use com.example.forum.profile;
33+use com.example.thread.{ main };
44+use com.example.types.{ author, postRef };
55+use com.example.post.{ main as Post };
66+use com.example.user.{ main as User, userMeta };
77+88+// Test wildcard and alias
99+use com.example.forum.*;
1010+use com.example.types as Types;
1111+112/// A simple post record
213record post {
314 /// Post text
···718 },
819 /// Creation timestamp
920 createdAt!: Datetime,
2121+ author: profile,
2222+ thread: Post,
1023}
+107-6
website/content/docs/cli/02-configuration.md
···41414242This directory is used by:
4343- `mlf check` (when run without arguments)
4444-- `mlf generate` (when run without arguments)
4444+- `mlf generate` commands (when run without `--input` or `--root`)
4545+4646+The source directory also serves as the default **root** for namespace calculation. For example, a file at `./lexicons/com/example/thread.mlf` will have the namespace `com.example.thread`.
45474648### Output Configurations
4749···6365[[output]]
6466type = "rust"
6567directory = "./src/lexicons"
6868+6969+[[output]]
7070+type = "mlf"
7171+directory = "./converted"
6672```
67736874**Supported types:**
···7076- `"typescript"` - Generate TypeScript types
7177- `"go"` - Generate Go structs
7278- `"rust"` - Generate Rust structs
7979+- `"mlf"` - Convert JSON lexicons to MLF
73807481When you run `mlf generate` without arguments, it will generate all configured outputs.
7582···9610397104```bash
98105mlf check
9999-# Equivalent to: mlf check "./lexicons/**/*.mlf"
106106+# Uses: input=./lexicons, root=./lexicons
107107+```
108108+109109+You can override with explicit arguments:
110110+111111+```bash
112112+mlf check ./custom-lexicons --root ./custom-lexicons
100113```
101114102115### `mlf generate`
···108121# Runs all [[output]] configurations
109122```
110123124124+### `mlf generate lexicon`
125125+126126+When run without arguments, uses configuration defaults:
127127+128128+```bash
129129+mlf generate lexicon
130130+# Uses: input=./lexicons, output=first lexicon output, root=./lexicons
131131+```
132132+133133+Override with explicit arguments:
134134+135135+```bash
136136+mlf generate lexicon -i ./src -o ./dist --root ./src
137137+```
138138+139139+### `mlf generate code`
140140+141141+When run without arguments, uses configuration defaults:
142142+143143+```bash
144144+mlf generate code
145145+# Uses: generator=first non-lexicon output, input=./lexicons, output=matching directory
146146+```
147147+148148+Override with explicit arguments:
149149+150150+```bash
151151+mlf generate code -g typescript -i ./src -o ./types --root ./src
152152+```
153153+154154+### `mlf generate mlf`
155155+156156+When run without `--output`, uses configuration defaults:
157157+158158+```bash
159159+mlf generate mlf -i external.json
160160+# Uses: output=first mlf output
161161+```
162162+111163### `mlf fetch`
112164113165When run without arguments, fetches all dependencies:
···124176# Downloads lexicons AND adds to dependencies array
125177```
126178179179+## Understanding Namespace Calculation
180180+181181+The `source.directory` in `mlf.toml` acts as the **root** for namespace calculation. A file's path relative to this root becomes its namespace.
182182+183183+**Example:**
184184+185185+```toml
186186+[source]
187187+directory = "./lexicons"
188188+```
189189+190190+| File Path | Namespace |
191191+|-----------|-----------|
192192+| `./lexicons/com/example/thread.mlf` | `com.example.thread` |
193193+| `./lexicons/com/example/types/post.mlf` | `com.example.types.post` |
194194+| `./lexicons/app/bsky/feed/post.mlf` | `app.bsky.feed.post` |
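The mapping in this table can be sketched as: strip the root prefix, drop the `.mlf` extension, and join the remaining path components with dots. A simplified illustration in Rust (the `namespace_for` helper is hypothetical, not the CLI's actual implementation):

```rust
use std::path::Path;

// Simplified namespace calculation: the file path relative to the root,
// minus the .mlf extension, with separators replaced by dots.
fn namespace_for(root: &str, file: &str) -> Option<String> {
    let rel = Path::new(file).strip_prefix(root).ok()?;
    let parts: Vec<String> = rel
        .with_extension("")
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("."))
}

fn main() {
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/com/example/thread.mlf").as_deref(),
        Some("com.example.thread")
    );
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/com/example/types/post.mlf").as_deref(),
        Some("com.example.types.post")
    );
    assert_eq!(
        namespace_for("./lexicons", "./lexicons/app/bsky/feed/post.mlf").as_deref(),
        Some("app.bsky.feed.post")
    );
    println!("ok");
}
```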
195195+196196+If your files are in a different location, use the `--root` flag:
197197+198198+```bash
199199+mlf generate lexicon -i ./src/schemas -o ./dist --root ./src/schemas
200200+```
201201+202202+Now `./src/schemas/com/example/thread.mlf` → namespace `com.example.thread`
203203+127204## Complete Example
128205129206Here's a complete `mlf.toml` for a TypeScript project using ATProto lexicons:
···1972741. **Commit `mlf.toml`** - Version control your configuration
1982752. **Don't commit `.mlf/`** - Let each developer fetch dependencies
1992763. **Use semantic namespaces** - Organize lexicons by domain
200200-4. **Multiple outputs** - Generate both lexicons and code simultaneously
201201-5. **CI/CD integration** - Run `mlf check` in your CI pipeline
277277+4. **Set a consistent root** - Keep your source directory as the root for namespace calculation
278278+5. **Multiple outputs** - Generate both lexicons and code simultaneously
279279+6. **CI/CD integration** - Run `mlf check` in your CI pipeline
202280203281## Override Configuration
204282···206284207285```bash
208286# Override source directory
209209-mlf check "./other-lexicons/**/*.mlf"
287287+mlf check ./other-lexicons --root ./other-lexicons
210288211289# Override output
212212-mlf generate lexicon -i custom.mlf -o ./custom-output/
290290+mlf generate lexicon -i custom.mlf -o ./custom-output/ --root ./
291291+292292+# Override generator
293293+mlf generate code -g go -i ./lexicons -o ./go-types --root ./lexicons
213294214295# Fetch specific namespace (ignoring dependencies list)
215296mlf fetch stream.place
216297```
298298+299299+## Multiple Projects
300300+301301+If you have multiple MLF projects, each can have its own `mlf.toml`:
302302+303303+```
304304+my-app/
305305+├── mlf.toml
306306+├── lexicons/
307307+│   └── com/example/app/...
308308+└── dist/
309309+310310+shared-lexicons/
311311+├── mlf.toml
312312+├── lexicons/
313313+│   └── com/example/shared/...
314314+└── dist/
315315+```
316316+317317+Commands always use the `mlf.toml` in the current directory or nearest parent directory.
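That lookup amounts to walking up the directory tree until an `mlf.toml` is found. A minimal sketch, assuming a hypothetical `find_config` helper rather than the CLI's real code:

```rust
use std::env;
use std::fs::{self, File};
use std::path::{Path, PathBuf};

// Sketch of "nearest mlf.toml" discovery: walk upward from a starting
// directory until one containing mlf.toml is found.
fn find_config(start: &Path) -> Option<PathBuf> {
    let mut dir = start;
    loop {
        let candidate = dir.join("mlf.toml");
        if candidate.is_file() {
            return Some(candidate);
        }
        dir = dir.parent()?;
    }
}

fn main() {
    // Build a throwaway project tree under the system temp directory.
    let root = env::temp_dir().join("mlf-config-demo");
    let nested = root.join("lexicons").join("com").join("example");
    fs::create_dir_all(&nested).unwrap();
    File::create(root.join("mlf.toml")).unwrap();

    // Starting deep inside the project still finds the project root's config.
    let found = find_config(&nested).unwrap();
    assert_eq!(found, root.join("mlf.toml"));
    println!("found: {}", found.display());
}
```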
-14
website/content/docs/cli/04-check.md
···168168169169**Solution:** Correct the field type to match the schema
170170171171-## Integration with CI/CD
172172-173173-Use `mlf check` in your continuous integration pipeline:
174174-175175-```yaml
176176-# GitHub Actions example
177177-- name: Validate MLF Lexicons
178178- run: |
179179- mlf fetch
180180- mlf check
181181-```
182182-183183-This ensures all lexicons remain valid as your project evolves.
184184-185171## Tips
1861721871731. **Use configuration** - Set up `mlf.toml` to avoid typing paths repeatedly
+159-32
website/content/docs/cli/06-generate.md
···1313mlf generate
14141515# Generate JSON lexicons
1616-mlf generate lexicon -i <INPUT> -o <OUTPUT>
1616+mlf generate lexicon [OPTIONS]
17171818# Generate code in a specific language
1919-mlf generate code -g <GENERATOR> -i <INPUT> -o <OUTPUT>
1919+mlf generate code [OPTIONS]
20202121# Convert JSON lexicons to MLF
2222-mlf generate mlf -i <INPUT> -o <OUTPUT>
2222+mlf generate mlf -i <INPUT> [OPTIONS]
2323```
24242525+All generate commands can use defaults from `mlf.toml`, making them easier to run without arguments.
2626+2527## Generate All Outputs
26282729When run without a subcommand, `mlf generate` uses your `mlf.toml` configuration to generate all specified outputs:
28302931```toml
3232+[source]
3333+directory = "./lexicons"
3434+3035[[output]]
3136type = "lexicon"
3237directory = "./dist/lexicons"
···6772Generate ATProto JSON lexicons from MLF files.
68736974```bash
7070-mlf generate lexicon -i <INPUT> -o <OUTPUT> [OPTIONS]
7575+mlf generate lexicon [OPTIONS]
7176```
72777378**Options:**
7474-- `-i, --input <INPUT>` - Input MLF files (glob patterns supported, can be specified multiple times)
7575-- `-o, --output <OUTPUT>` - Output directory (required)
7979+- `-i, --input <INPUT>` - Input MLF file(s) or directory (defaults to `source.directory` from mlf.toml)
8080+- `-o, --output <OUTPUT>` - Output directory (defaults to first `type = "lexicon"` output from mlf.toml)
8181+- `--root <ROOT>` - Root directory for namespace calculation (defaults to `source.directory` from mlf.toml)
7682- `--flat` - Use flat file structure (e.g., `com.example.thread.json`)
77837878-**Examples:**
8484+### Using mlf.toml Defaults
8585+8686+With this configuration:
8787+8888+```toml
8989+[source]
9090+directory = "./lexicons"
9191+9292+[[output]]
9393+type = "lexicon"
9494+directory = "./dist/lexicons"
9595+```
9696+9797+You can run:
9898+9999+```bash
100100+mlf generate lexicon
101101+# Uses: input=./lexicons, output=./dist/lexicons, root=./lexicons
102102+```
103103+104104+### Explicit Arguments
105105+106106+You can override defaults with explicit arguments:
7910780108```bash
81109# Generate with folder structure
8282-mlf generate lexicon -i thread.mlf -o lexicons/
110110+mlf generate lexicon -i thread.mlf -o lexicons/ --root ./
83111# Creates: lexicons/com/example/thread.json
8411285113# Generate with flat structure
8686-mlf generate lexicon -i thread.mlf -o lexicons/ --flat
114114+mlf generate lexicon -i thread.mlf -o lexicons/ --root ./ --flat
87115# Creates: lexicons/com.example.thread.json
881168989-# Generate from multiple files
9090-mlf generate lexicon -i thread.mlf -i reply.mlf -o lexicons/
117117+# Generate from directory
118118+mlf generate lexicon -i ./src/lexicons -o dist/lexicons/ --root ./src/lexicons
119119+```
911209292-# Generate from glob pattern
9393-mlf generate lexicon -i "src/**/*.mlf" -o dist/lexicons/
121121+### Understanding --root
122122+123123+The `--root` flag tells MLF which directory namespaces are calculated relative to. For example:
124124+125125+```bash
126126+# File: ./lexicons/com/example/thread.mlf
127127+mlf generate lexicon -i ./lexicons -o ./dist --root ./lexicons
128128+# Namespace: com.example.thread (relative to ./lexicons)
129129+130130+# File: ./src/lexicons/com/example/thread.mlf
131131+mlf generate lexicon -i ./src/lexicons -o ./dist --root ./src/lexicons
132132+# Namespace: com.example.thread (relative to ./src/lexicons)
94133```
95134135135+Without `--root`, it defaults to the `source.directory` from mlf.toml or the current directory.
136136+96137---
9713898139## Generate Code
···100141Generate code in various programming languages from MLF files.
101142102143```bash
103103-mlf generate code -g <GENERATOR> -i <INPUT> -o <OUTPUT> [OPTIONS]
144144+mlf generate code [OPTIONS]
104145```
105146106147**Options:**
107107-- `-g, --generator <GENERATOR>` - Generator to use (required): `json`, `typescript`, `go`, or `rust`
108108-- `-i, --input <INPUT>` - Input MLF files (glob patterns supported, can be specified multiple times)
109109-- `-o, --output <OUTPUT>` - Output directory (required)
148148+- `-g, --generator <GENERATOR>` - Generator to use: `typescript`, `go`, or `rust` (defaults to first non-lexicon output from mlf.toml)
149149+- `-i, --input <INPUT>` - Input MLF file(s) or directory (defaults to `source.directory` from mlf.toml)
150150+- `-o, --output <OUTPUT>` - Output directory (defaults to matching output from mlf.toml)
151151+- `--root <ROOT>` - Root directory for namespace calculation (defaults to `source.directory` from mlf.toml)
110152- `--flat` - Use flat file structure
111153112154**Available Generators:**
113155114156| Generator | Output | Features |
115157|-----------|--------|----------|
116116-| `json` | `.json` | AT Protocol JSON lexicons (always available) |
117158| `typescript` | `.ts` | TypeScript interfaces with JSDoc, optional fields with `?` |
118159| `go` | `.go` | Go structs with JSON tags, proper capitalization |
119160| `rust` | `.rs` | Rust structs with serde, `Option<T>` for optional fields |
120161162162+### Using mlf.toml Defaults
163163+164164+With this configuration:
165165+166166+```toml
167167+[source]
168168+directory = "./lexicons"
169169+170170+[[output]]
171171+type = "typescript"
172172+directory = "./src/types"
173173+```
174174+175175+You can run:
176176+177177+```bash
178178+mlf generate code
179179+# Uses: generator=typescript, input=./lexicons, output=./src/types
180180+```
181181+182182+### Explicit Arguments
183183+184184+```bash
185185+# Generate TypeScript
186186+mlf generate code -g typescript -i thread.mlf -o src/types/ --root ./
187187+188188+# Generate Go
189189+mlf generate code -g go -i ./lexicons -o pkg/models/ --root ./lexicons
190190+191191+# Generate Rust with flat structure
192192+mlf generate code -g rust -i ./src -o ./generated --root ./src --flat
193193+```
194194+121195### TypeScript Example
122196123197```bash
124124-mlf generate code -g typescript -i thread.mlf -o src/types/
198198+mlf generate code -g typescript -i thread.mlf -o src/types/ --root ./
125199```
126200127201**Input MLF:**
···160234### Go Example
161235162236```bash
163163-mlf generate code -g go -i thread.mlf -o pkg/models/
237237+mlf generate code -g go -i thread.mlf -o pkg/models/ --root ./
164238```
165239166240**Generated Go:**
···184258### Rust Example
185259186260```bash
187187-mlf generate code -g rust -i thread.mlf -o src/models/
261261+mlf generate code -g rust -i thread.mlf -o src/models/ --root ./
188262```
189263190264**Generated Rust:**
···215289Convert ATProto JSON lexicons back to MLF format.
216290217291```bash
218218-mlf generate mlf -i <INPUT> -o <OUTPUT>
292292+mlf generate mlf -i <INPUT> [OPTIONS]
219293```
220294221295**Options:**
222222-- `-i, --input <INPUT>` - Input JSON lexicon files (glob patterns supported, can be specified multiple times)
223223-- `-o, --output <OUTPUT>` - Output directory (required)
296296+- `-i, --input <INPUT>` - Input JSON lexicon files (required, can be specified multiple times)
297297+- `-o, --output <OUTPUT>` - Output directory (defaults to first `type = "mlf"` output from mlf.toml)
224298225225-**Examples:**
299299+### Using mlf.toml Defaults
300300+301301+With this configuration:
302302+303303+```toml
304304+[[output]]
305305+type = "mlf"
306306+directory = "./converted"
307307+```
308308+309309+You can run:
310310+311311+```bash
312312+mlf generate mlf -i external-lexicon.json
313313+# Uses: output=./converted
314314+```
315315+316316+### Examples
226317227318```bash
228319# Convert a single JSON lexicon to MLF
···232323# Convert multiple JSON lexicons
233324mlf generate mlf -i lexicon1.json -i lexicon2.json -o ./mlf/
234325235235-# Convert using glob pattern
236236-mlf generate mlf -i "dist/lexicons/**/*.json" -o ./src/
326326+# Convert a JSON lexicon from a nested directory
327327+mlf generate mlf -i dist/lexicons/com/example/thread.json -o ./src/
237328```
238329239330**Features:**
···275366276367**Generated MLF:**
277368```mlf
278278-record main {
369369+@main
370370+record thread {
279371 title!: string constrained {
280372 maxLength: 200,
281373 },
282374 createdAt!: Datetime,
283283-};
375375+}
284376```
285377286378---
···293385294386```mlf
295387/// This is a user profile
296296-def Profile = {
388388+def type Profile = {
297389 /// The user's display name
298390 displayName: string,
299391};
···333425334426---
335427428428+## Multiple Output Targets
429429+430430+You can configure multiple generators in `mlf.toml`:
431431+432432+```toml
433433+[source]
434434+directory = "./lexicons"
435435+436436+[[output]]
437437+type = "lexicon"
438438+directory = "./dist/lexicons"
439439+440440+[[output]]
441441+type = "typescript"
442442+directory = "./src/types"
443443+444444+[[output]]
445445+type = "go"
446446+directory = "./pkg/lexicons"
447447+448448+[[output]]
449449+type = "rust"
450450+directory = "./rust-client/src/lexicons"
451451+```
452452+453453+Then run:
454454+455455+```bash
456456+mlf generate
457457+```
458458+459459+This generates all four output types from the same MLF source files.
460460+461461+---
462462+336463## Tips
337464338338-1. **Use configuration** - Set up `mlf.toml` for multi-output generation
339339-2. **Commit generated code** - If it's part of your build artifacts
465465+1. **Use configuration** - Set up `mlf.toml` to avoid repetitive arguments
466466+2. **Set an explicit root** - Use `--root` when your file structure doesn't match your namespaces
3404673. **Regenerate often** - Run `mlf generate` after any lexicon changes
3414684. **Use flat mode** - For simpler directory structures
3424695. **Multiple generators** - Generate multiple languages from the same MLF files
+215-19
website/content/docs/language-guide/08-imports.md
···7788## Basic Import
991010-Import a definition from another file:
1010+Import a specific definition from another file:
11111212```mlf
1313use com.example.forum.profile;
···1717}
1818```
19192020-This imports the `profile` record from `com/example/forum/profile.mlf`.
2020+This imports the `profile` definition from `com/example/forum/profile.mlf`.
21212222## How Imports Work
23232424-The namespace matches the file path:
2424+The import path consists of the file's namespace plus the definition name:
25252626-| File Path | Namespace | Import Statement |
2626+| File Path | Definition | Import Statement |
2727|-----------|-----------|------------------|
2828-| `com/example/forum/user.mlf` | `com.example.forum.user` | `use com.example.forum.user;` |
2929-| `com/example/forum/post.mlf` | `com.example.forum.post` | `use com.example.forum.post;` |
2828+| `com/example/forum/user.mlf` | `record user { ... }` | `use com.example.forum.user;` |
2929+| `com/example/forum/post.mlf` | `def type postMeta = { ... }` | `use com.example.forum.post.postMeta;` |
3030+3131+**Key point:** You import specific definitions by their full path: `namespace.definitionName`
3232+3333+For example:
3434+- File `com/example/forum/post.mlf` has namespace `com.example.forum.post`
3535+- To import `postMeta` from that file: `use com.example.forum.post.postMeta;`
3636+- Or using the main definition: `use com.example.forum.post;` (imports the record named `post`)
30373138## What Can Be Imported
3239···61686269## Multiple Imports
63706464-Import multiple definitions with separate `use` statements:
7171+Import multiple definitions from the same namespace with a single statement using `.{ }`:
7272+7373+```mlf
7474+use com.example.forum.{ author, timestamp, location };
7575+7676+record post {
7777+ author: author,
7878+ createdAt: timestamp,
7979+ location?: location,
8080+}
8181+```
8282+8383+This works for any namespace, regardless of how many levels deep:
8484+8585+```mlf
8686+use com.example.forum.types.{ author, postRef };
8787+8888+record comment {
8989+ author: author,
9090+ replyTo: postRef,
9191+}
9292+```
9393+9494+Or import them separately:
65956696```mlf
6797use com.example.forum.author;
6898use com.example.forum.timestamp;
6999use com.example.forum.location;
100100+```
101101+102102+## Renaming Imports
103103+104104+Sometimes you need to rename an imported definition to avoid conflicts or improve clarity. Use the `as` keyword:
105105+106106+```mlf
107107+use com.example.types.author as ForumAuthor;
108108+use com.social.types.author as SocialAuthor;
109109+110110+record crossPost {
111111+ forumAuthor: ForumAuthor,
112112+ socialAuthor: SocialAuthor,
113113+}
114114+```
115115+116116+This is useful when:
117117+- Two imports have the same name
118118+- You want a shorter or clearer name
119119+- You're dealing with naming conflicts
120120+121121+You can rename multiple imports at once:
122122+123123+```mlf
124124+use com.example.types.{ author as ForumAuthor, postRef as PostReference };
125125+126126+record crossPost {
127127+ author: ForumAuthor,
128128+ ref: PostReference,
129129+}
130130+```
131131+132132+You can also rename when there's a local definition with the same name:
133133+134134+```mlf
135135+// Local definition
136136+def type thread = {
137137+ localId!: string,
138138+}
139139+140140+// Import with rename to avoid conflict
141141+use com.example.types.thread as ExternalThread;
7014271143record post {
7272- author: author,
7373- createdAt: timestamp,
7474- location?: location,
144144+ localThread: thread, // Local def
145145+ externalThread: ExternalThread, // Imported type
75146}
76147```
77148149149+## Importing Main Definitions
150150+151151+Every MLF file has a "main" definition - the primary export. You can import it using just the file's namespace:
152152+153153+```mlf
154154+use com.example.thread; // Imports the main definition, bound as "thread"
155155+```
156156+157157+This is shorthand for `use com.example.thread.{ main }`.
158158+159159+**Note:** The `@main` annotation is only needed when there's a naming conflict (see [Important Info](/docs/language-guide/important-info/#the-main-annotation)). Otherwise, the main definition is determined automatically.
160160+161161+### Example
162162+163163+**File: `com/example/thread.mlf`**
164164+```mlf
165165+/// The main thread record
166166+record thread {
167167+ title!: string,
168168+ body!: string,
169169+}
170170+171171+/// Thread metadata
172172+def type threadMeta = {
173173+ id!: string,
174174+ viewCount!: integer,
175175+}
176176+```
177177+178178+**File: `com/example/post.mlf`**
179179+```mlf
180180+// Import the main thread record
181181+use com.example.thread;
182182+183183+// Import the threadMeta definition
184184+use com.example.thread.threadMeta;
185185+186186+record post {
187187+ thread: thread, // The main record
188188+ meta: threadMeta, // The def type
189189+ text!: string,
190190+}
191191+```
192192+193193+### Explicit Main Import
194194+195195+You can also explicitly import main definitions using `.{ }`:
196196+197197+```mlf
198198+use com.example.thread.{ main };
199199+200200+record post {
201201+ thread: thread, // Bound as "thread" (the last segment of the namespace)
202202+}
203203+```
204204+205205+### Importing and Renaming Main
206206+207207+You can rename the main definition when importing:
208208+209209+```mlf
210210+use com.example.thread.{ main as Thread };
211211+212212+record post {
213213+ thread: Thread,
214214+}
215215+```
216216+217217+Or import both main and other definitions together:
218218+219219+```mlf
220220+use com.example.thread.{ main as Thread, threadMeta as Meta };
221221+222222+record post {
223223+ thread: Thread,
224224+ meta: Meta,
225225+}
226226+```
227227+228228+## Wildcard Imports
229229+230230+Import all definitions from a namespace with `.*`:
231231+232232+```mlf
233233+use com.example.forum.*;
234234+235235+record post {
236236+ author: author, // All definitions from com.example.forum
237237+ postRef: postRef, // are now available
238238+}
239239+```
240240+241241+**Note:** Wildcard imports bring all public definitions into scope, which can lead to naming conflicts. Use with caution.
242242+243243+## Namespace Aliasing
244244+245245+Alias an entire namespace for shorter references:
246246+247247+```mlf
248248+use com.example.forum as Forum;
249249+250250+record post {
251251+ author: Forum.author,
252252+ ref: Forum.postRef,
253253+}
254254+```
255255+256256+This is useful for:
257257+- Avoiding naming conflicts
258258+- Shortening long namespace paths
259259+- Making code more readable
260260+261261+## Import Syntax Summary
262262+263263+Here's what MLF supports for imports:
264264+265265+| Import Type | Syntax Example |
266266+|-------------|----------------|
267267+| **Single import** | `use com.example.forum.profile;` |
268268+| **Multiple imports** | `use com.example.forum.{ author, postRef };` |
269269+| **Main import** | `use com.example.thread;` or `use com.example.thread.{ main };` |
270270+| **With renaming** | `use com.example.post.{ main as Post };` |
271271+| **Mixed imports** | `use com.example.thread.{ main as Thread, threadMeta };` |
272272+| **Wildcard imports** | `use com.example.forum.*;` |
273273+| **Namespace aliasing** | `use com.example.forum as Forum;` |
274274+78275## Organizing Files
7927680277Common organization patterns:
···94291com/
95292 example/
96293 forum/
9797- author.mlf
9898- postRef.mlf
294294+ types/
295295+ author.mlf
296296+ postRef.mlf
99297 post.mlf
100298 comment.mlf
101299```
···145343146344Here's a well-organized multi-file lexicon:
147345148148-**File: `com/example/forum/author.mlf`**
346346+**File: `com/example/forum/types/author.mlf`**
149347```mlf
150348/// Basic author information
151349def type author = {
···155353};
156354```
157355158158-**File: `com/example/forum/postRef.mlf`**
356356+**File: `com/example/forum/types/postRef.mlf`**
159357```mlf
160358/// Reference to a post
161359def type postRef = {
···166364167365**File: `com/example/forum/post.mlf`**
168366```mlf
169169-use com.example.forum.author;
170170-use com.example.forum.postRef;
367367+use com.example.forum.types.{ author, postRef };
171368172369/// A forum post
173370record post {
···190387/// Get a post by URI
191388query getPost(
192389 uri: AtUri
193193-):post | error {
390390+): post | error {
194391 NotFound,
195392};
196393```
197394198395**File: `com/example/forum/comment.mlf`**
199396```mlf
200200-use com.example.forum.author;
201201-use com.example.forum.postRef;
397397+use com.example.forum.types.{ author, postRef };
202398203399/// A comment on a post
204400record comment {
+5-7
website/content/docs/language-guide/09-prelude.md
···
// Use fully qualified names
record myPost {
  reference: com.atproto.repo.strongRef,
-  labels: [com.atproto.label.label],
+  labels: [com.atproto.label.defs.label],
}

// Or import them
use com.atproto.repo.strongRef;
-use com.atproto.label.label;
+use com.atproto.label.defs.label;

record myPost {
  reference: strongRef,
···
}
```

-**`com.atproto.label.label`** - Content labels for moderation:
+**`com.atproto.label.defs.label`** - Content labels for moderation:
```mlf
record post {
-  labels: [com.atproto.label.label],
+  labels: [com.atproto.label.defs.label],
}
```
···
## What's Next?

-Read the [Important Info](/docs/language-guide/10-important-info/) section to understand how MLF maps to ATProto Lexicons, especially the rules for the `"main"` definition.
-
-Then check out the [Playground](/playground/) to experiment with MLF, or read the [CLI documentation](/docs/cli/) to learn how to compile your lexicons.
+Next, learn about [Annotations](/docs/language-guide/annotations/) to add metadata for code generators and tooling.
···
This section covers important details about how MLF maps to ATProto Lexicons.

+## Shebang Support
+
+MLF files can optionally include a shebang for direct execution:
+
+```mlf
+#!/usr/bin/env mlf
+
+record post {
+  text: string,
+}
+```
+
+The `#` character is **only** used for shebangs at the start of files. It has no other meaning in MLF syntax.
+
## The "main" Definition

In ATProto Lexicons, each lexicon has a `defs` object where definitions are stored. One special definition is called `"main"` - it's the primary definition for that lexicon.
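For orientation, a lexicon document with both a `"main"` definition and a named def looks roughly like the sketch below. This is a simplified illustration, not output from the MLF compiler; the `com.example.forum.post` id is made up and the field bodies are elided:

```json
{
  "lexicon": 1,
  "id": "com.example.forum.post",
  "defs": {
    "main": { "type": "record", "key": "tid", "record": { "...": "..." } },
    "author": { "type": "object", "properties": { "...": "..." } }
  }
}
```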
···
### Supporting Definitions

These are **never** `"main"` - they're always named defs:
-- `def type` definitions
- `token` definitions
- `inline type` definitions (don't appear in output at all)

···

When the NSID ends with `defs`, all items become named defs (no `"main"`).

+## The @main Annotation
+
+Sometimes you need both a main definition **and** a def with the same name. This happens when the name matches your namespace suffix.
+
+### Why Would You Need This?
+
+Consider `app.bsky.embed.external`. You might want:
+1. A **main record** called `external` (the primary export)
+2. A **def type** called `external` (metadata about externals)
+
+Normally, duplicate names aren't allowed. But when the name matches the namespace suffix (the last part), you can use `@main`:
+
+```mlf
+// File: app.bsky.embed.external.mlf
+
+/// The main external embed record
+@main
+record external {
+  external!: externalDetail,
+}
+
+/// External link details
+def type externalDetail = {
+  uri!: Uri,
+  title!: string,
+  description!: string,
+}
+```
+
+### Rules
+
+1. **Duplicates only allowed when name matches namespace suffix**
+   - ✅ `com.example.thread` can have two items named "thread"
+   - ❌ `com.example.post` cannot have two items named "thread"
+
+2. **Must use @main to disambiguate**
+   - One item must have `@main`
+   - Only one item can have `@main`
+
+3. **Works with records, queries, procedures, subscriptions + defs**
+   - ✅ `@main record thread` + `def type thread`
+   - ✅ `@main query getThread` + `def type thread`
+   - ❌ `inline type thread` + anything (inline types can't be main)
+
+### Example: Thread Types
+
+```mlf
+// File: com.example.thread.mlf
+
+/// The main thread record
+@main
+record thread {
+  title!: string,
+  body!: string,
+  author!: Did,
+  createdAt!: Datetime,
+}
+
+/// Thread metadata
+def type thread = {
+  id!: string,
+  viewCount!: integer,
+  replyCount!: integer,
+}
+
+record reply {
+  threadMeta: thread, // References the def type, not the record
+  text!: string,
+}
+```
+
+### When @main Isn't Needed
+
+If you only have one record/query/procedure/subscription in a file, it automatically becomes main:
+
+```mlf
+// No @main needed - this automatically becomes the main definition
+record post {
+  text!: string,
+  createdAt!: Datetime,
+}
+
+def type author = {
+  did!: Did,
+  handle!: Handle,
+}
+```
+
+### Error: Missing @main
+
+```mlf
+// ERROR: Which one is main?
+record thread {
+  title!: string,
+}
+
+def type thread = {
+  id!: string,
+}
+// This will fail - you must add @main to one of them
+```
+
## NSID and File Path Mapping

The file path **is** the NSID. MLF derives the lexicon NSID from the file path:
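As an illustration of this rule (not part of the MLF toolchain), the derivation can be sketched in a few lines of Python; `nsid_from_path` is a hypothetical helper name:

```python
from pathlib import PurePosixPath

def nsid_from_path(path: str) -> str:
    """Join the directory segments and the file stem with dots to form an NSID."""
    p = PurePosixPath(path)
    # ("com", "example", "forum") + "post" -> "com.example.forum.post"
    return ".".join([*p.parts[:-1], p.stem])

print(nsid_from_path("com/example/forum/post.mlf"))  # com.example.forum.post
```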
···
- **Single main-eligible item** → automatically becomes `"main"`
- **Name matches last NSID segment** → becomes `"main"`
- **Neither condition met** → all items become named defs
-- **Supporting definitions** (def type, token) → always named defs
+- **Supporting definitions** (token, inline type) → always named defs
- **File path** → determines the NSID
+- **Shebang support** → optional `#!/usr/bin/env mlf` at file start
+
+## What's Next?
+
+Finally, explore [Lexicon Mapping](/docs/language-guide/lexicon-mapping/) to see how MLF constructs map to ATProto Lexicon JSON format.
···
++++
+title = "Annotations"
+weight = 9
++++
+
+Annotations use the `@` symbol and provide metadata for external tooling. MLF itself assigns no semantic meaning to most annotations - they're purely for tools, linters, code generators, and other processors.
+
+## Annotation Syntax
+
+Three forms of annotations are supported:
+
+### Simple Annotation
+
+```mlf
+@deprecated
+record oldRecord {
+  field: string,
+}
+```
+
+### Positional Arguments
+
+```mlf
+@since(1, 2, 0)
+@doc("https://example.com/docs")
+record example {
+  field: string,
+}
+```
+
+Arguments can be:
+- **Strings**: `"value"`
+- **Numbers**: `42`, `3.14`
+- **Booleans**: `true`, `false`
+
+### Named Arguments
+
+```mlf
+@validate(min: 0, max: 100, strict: true)
+@codegen(language: "rust", derive: "Debug, Clone")
+record example {
+  field: integer,
+}
+```
+
+## Annotation Placement
+
+Annotations can be placed on:
+
+- Records
+- Def Types
+- Inline Types
+- Tokens
+- Queries
+- Procedures
+- Subscriptions
+- Fields within records/types
+
+**Example:**
+
+```mlf
+/// A user profile
+@table(name: "profiles", indexes: "did,handle")
+record profile {
+  /// User's DID
+  @indexed
+  did!: Did,
+
+  /// Display name (optional)
+  @sensitive(pii: true)
+  displayName: string,
+}
+```
+
+## MLF Annotations vs Generator Annotations
+
+MLF distinguishes between two categories:
+
+### 1. MLF Annotations
+
+Built into the MLF language and affect compilation/validation. These are **bare annotations** without any namespace prefix:
+
+**`@main`** - Marks an item as the main definition when there's ambiguity:
+
+```mlf
+// File: com/example/thread.mlf
+@main
+record thread {
+  title!: string,
+}
+
+// This def shares the same name but is not main
+def type thread = {
+  id!: string,
+  viewCount!: integer,
+}
+```
+
+See [Important Info](/docs/language-guide/important-info/#the-main-definition) for more details on the `@main` annotation.
+
+### 2. Generator Annotations
+
+Used by code generators and external tools. These have no effect on MLF compilation and **must** be namespaced with the generator name:
+
+```mlf
+@rust:derive("Debug, Clone, Serialize")
+@typescript:export
+@go:tag(json: "custom_name")
+record example {
+  field: string,
+}
+```
+
+**Generator namespacing rules:**
+- All generator annotations must have a namespace prefix (e.g., `@rust:foo`)
+- Use `@all:annotation` to apply an annotation to all generators
+- Bare annotations (without `:`) are reserved for MLF itself
+
+**Common generator namespaces:**
+- `@rust:*` - Rust code generator annotations
+- `@typescript:*` - TypeScript code generator annotations
+- `@go:*` - Go code generator annotations
+- `@python:*` - Python code generator annotations
+- `@all:*` - Applies to all generators
+
+## Custom Annotations
+
+You can define your own annotations for custom tooling:
+
+```mlf
+@myapp:cache(ttl: 3600)
+@myapp:permission("read:public")
+query getProfile(actor!: Did): profile;
+
+@myapp:audit_log
+@myapp:rate_limit(requests: 100, window: 60)
+procedure updateProfile(data!: profile): unit;
+```
+
+The interpretation is entirely up to your tooling.
+
+## Annotation Processing
+
+Annotations are preserved in the MLF AST and can be accessed by:
+
+- Code generators
+- Linters
+- Documentation generators
+- Build tools
+- Custom processors
+
+Each tool decides which annotations to support and how to interpret them.
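To suggest what such a processor might look like, here is a rough Python sketch that pulls annotations out of MLF source text with a regular expression. This is not the real MLF AST API; the helper name, the regex, and the tuple shape are all illustrative assumptions (a proper tool would use the parser, since this regex would also match `@` sequences inside strings or doc comments):

```python
import re

# Matches @name, @name(args), and namespaced @ns:name(args) per this guide.
ANNOTATION_RE = re.compile(r"@(?:(?P<ns>\w+):)?(?P<name>\w+)(?:\((?P<args>[^)]*)\))?")

def scan_annotations(source: str):
    """Return (namespace, name, raw_args) tuples for each annotation found."""
    return [(m.group("ns"), m.group("name"), m.group("args"))
            for m in ANNOTATION_RE.finditer(source)]

src = '@main\n@rust:derive("Debug, Clone")\nrecord thread { title!: string }'
print(scan_annotations(src))
# → [(None, 'main', None), ('rust', 'derive', '"Debug, Clone"')]
```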
+
+## Best Practices
+
+1. **Always namespace generator annotations** - Use `@generator:name` for all generator-specific annotations
+2. **Use `@all:` for cross-generator annotations** - When an annotation should apply to all generators
+3. **Document custom annotations** - Keep a registry of annotations your project uses
+4. **Be consistent** - Use the same annotation patterns across your codebase
+5. **Don't overuse** - Annotations should augment, not replace, good design
+
+## What's Next?
+
+Next, read the [Important Info](/docs/language-guide/important-info/) section to understand critical details about how MLF maps to ATProto Lexicons.