A human-friendly DSL for ATProto Lexicons

Add mlf status and mlf diff read-only commands

Introduce mlf-cli/src/remote_state.rs, a shared pipeline that walks
the workspace, generates Lexicon JSON for every module, wraps each as
a com.atproto.lexicon.schema record, computes its DAG-CBOR CID, and
fetches whatever's currently published on the PDS. classify() sorts
every NSID into Unchanged/New/Changed/Removed; remote records outside
[package].name are left alone so shared-account-per-package setups
don't get stepped on. mlf status renders the summary plus per-authority
DNS readiness; mlf diff prints unified JSON diffs via the similar
crate for any NSID the status would touch.
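
The classification step described above can be sketched in isolation. This is a simplified illustration, not the code from `remote_state.rs`: it compares plain NSID-to-CID maps, the names `Class`, `in_scope`, and `classify_cids` are hypothetical, and the scope check assumes that "inside `[package].name`" means the NSID equals the package name or sits beneath it.

```rust
use std::collections::BTreeMap;

/// Outcome for one NSID, mirroring the four states in the commit message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Class {
    Unchanged,
    New,
    Changed { local_cid: String, remote_cid: String },
    Removed,
}

/// Assumption: an NSID is in scope when it equals the package name or
/// continues it with a further dot-separated segment.
fn in_scope(package: &str, nsid: &str) -> bool {
    nsid == package
        || nsid
            .strip_prefix(package)
            .map_or(false, |rest| rest.starts_with('.'))
}

/// Compare local and remote NSID -> CID maps.
pub fn classify_cids(
    package: &str,
    local: &BTreeMap<String, String>,
    remote: &BTreeMap<String, String>,
) -> BTreeMap<String, Class> {
    let mut out = BTreeMap::new();

    // Everything defined locally is New, Unchanged, or Changed.
    for (nsid, local_cid) in local {
        let class = match remote.get(nsid) {
            None => Class::New,
            Some(remote_cid) if remote_cid == local_cid => Class::Unchanged,
            Some(remote_cid) => Class::Changed {
                local_cid: local_cid.clone(),
                remote_cid: remote_cid.clone(),
            },
        };
        out.insert(nsid.clone(), class);
    }

    // Remote-only records count as Removed only when they are in scope;
    // anything else on the account is left out of the result entirely.
    for nsid in remote.keys() {
        if !local.contains_key(nsid) && in_scope(package, nsid) {
            out.insert(nsid.clone(), Class::Removed);
        }
    }

    out
}
```

Comparing CIDs rather than JSON text means formatting or key-order differences in the stored record do not by themselves count as a change, provided both sides hash the same canonical encoding.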

Also fixes a spec-compliance gap in mlf-atproto::identity: per the
ATProto lexicon spec, an NSID's publishing authority is the NSID
minus the final name segment (so com.example.forum.post and
com.example.forum.thread share _lexicon.forum.example.com).
authority_of_nsid + authority_dns_name + RealDnsResolver::resolve_authority_did
implement the correct drop-final rule used by the new status/diff
path; the fetcher's existing helpers are kept unchanged for
backwards compatibility.
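
As a worked example of the drop-final rule, the sketch below maps a batch of NSIDs to the distinct `_lexicon` TXT names that would need to be queried. It is a standalone illustration using only the standard library; `authority`, `lexicon_txt_name`, and `txt_names_for` are hypothetical names, and silently skipping NSIDs with fewer than three segments is a simplification (the committed code returns an error instead).

```rust
use std::collections::BTreeSet;

/// Drop the final name segment of an NSID. Returns `None` when the NSID
/// has fewer than three segments.
fn authority(nsid: &str) -> Option<String> {
    let parts: Vec<&str> = nsid.split('.').collect();
    if parts.len() < 3 {
        return None;
    }
    Some(parts[..parts.len() - 1].join("."))
}

/// Reverse an NSID-order authority into DNS order and prefix `_lexicon.`.
fn lexicon_txt_name(authority: &str) -> String {
    let reversed: Vec<&str> = authority.split('.').rev().collect();
    format!("_lexicon.{}", reversed.join("."))
}

/// Distinct TXT names covering a set of NSIDs (invalid NSIDs are skipped
/// in this sketch).
pub fn txt_names_for<'a>(nsids: impl IntoIterator<Item = &'a str>) -> BTreeSet<String> {
    nsids
        .into_iter()
        .filter_map(authority)
        .map(|a| lexicon_txt_name(&a))
        .collect()
}
```

With this rule, `com.example.forum.post` and `com.example.forum.thread` collapse to one lookup, while `com.example.directory.listing` adds a second.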

authored by stavola.xyz and committed by

Tangled df1ea344 68b2c1a7

+1014 -11
+2
Cargo.lock
···
  "clap",
  "glob",
  "miette",
+ "mlf-atproto",
  "mlf-codegen",
  "mlf-codegen-go",
  "mlf-codegen-rust",
···
  "serde",
  "serde_json",
  "sha2 0.10.9",
+ "similar",
  "thiserror 2.0.17",
  "tokio",
  "toml",
+102 -10
mlf-atproto/src/identity.rs
···
 use async_trait::async_trait;
 use hickory_resolver::TokioAsyncResolver;
 use hickory_resolver::config::{ResolverConfig, ResolverOpts};
+use hickory_resolver::error::ResolveErrorKind;
 use std::collections::HashMap;
 use std::sync::{Arc, Mutex};
 
···
 }
 
 // ---------------------------------------------------------------------------
+// Authority (spec-compliant)
+// ---------------------------------------------------------------------------
+
+/// Compute the authority domain of an NSID per the ATProto lexicon spec.
+///
+/// "Take the NSID, drop the final name segment; the remaining segments,
+/// still in NSID order, identify the publishing authority." The returned
+/// string is in NSID order (e.g. `com.example.forum`), not reversed DNS
+/// order — call [`authority_dns_name`] to get the `_lexicon.<…>` label.
+///
+/// - `com.example.forum.post` → `com.example.forum`
+/// - `app.bsky.feed.post` → `app.bsky.feed`
+/// - `edu.university.dept.lab.blogging.getBlogPost` → `edu.university.dept.lab.blogging`
+pub fn authority_of_nsid(nsid: &str) -> Result<String, IdentityError> {
+    let parts: Vec<&str> = nsid.split('.').collect();
+    if parts.len() < 3 {
+        return Err(IdentityError::InvalidNsid(format!(
+            "NSID must have at least 3 segments (authority + name): {nsid}"
+        )));
+    }
+    Ok(parts[..parts.len() - 1].join("."))
+}
+
+/// Convert an NSID-order authority (e.g. `com.example.forum`) to the
+/// `_lexicon.<reversed>` DNS label used for the TXT lookup.
+///
+/// `com.example.forum` → `_lexicon.forum.example.com`.
+pub fn authority_dns_name(authority: &str) -> String {
+    let reversed: Vec<&str> = authority.split('.').rev().collect();
+    format!("_lexicon.{}", reversed.join("."))
+}
+
+// ---------------------------------------------------------------------------
 // DnsResolver trait
 // ---------------------------------------------------------------------------
 
···
         name_segments: &str,
     ) -> Result<String, IdentityError> {
         let dns_name = construct_dns_name(authority, name_segments);
-        let response = self.resolver.txt_lookup(&dns_name).await.map_err(|e| {
+        resolve_did_at(&self.resolver, &dns_name).await
+    }
+}
+
+impl RealDnsResolver {
+    /// Resolve `_lexicon.<reversed-authority>` TXT for an NSID-order
+    /// authority domain — the spec-compliant lookup for a publishing
+    /// authority that covers every NSID descended from it.
+    pub async fn resolve_authority_did(&self, authority: &str) -> Result<String, IdentityError> {
+        let dns_name = authority_dns_name(authority);
+        resolve_did_at(&self.resolver, &dns_name).await
+    }
+}
+
+async fn resolve_did_at(
+    resolver: &TokioAsyncResolver,
+    dns_name: &str,
+) -> Result<String, IdentityError> {
+    let response = resolver.txt_lookup(dns_name).await.map_err(|e| {
+        // NXDOMAIN / "no records" means the authority hasn't been set
+        // up yet — map to NoDidInTxt so callers treat it as missing
+        // (bootstrappable) rather than an unrecoverable DNS failure.
+        if matches!(e.kind(), ResolveErrorKind::NoRecordsFound { .. }) {
+            IdentityError::NoDidInTxt(dns_name.to_string())
+        } else {
             IdentityError::DnsLookupFailed {
-                domain: dns_name.clone(),
+                domain: dns_name.to_string(),
                 error: e.to_string(),
             }
-        })?;
-        for txt in response.iter() {
-            for data in txt.txt_data() {
-                let text = String::from_utf8_lossy(data);
-                if let Some(did) = text.strip_prefix("did=") {
-                    return Ok(did.trim().to_string());
-                }
+        }
+    })?;
+    for txt in response.iter() {
+        for data in txt.txt_data() {
+            let text = String::from_utf8_lossy(data);
+            if let Some(did) = text.strip_prefix("did=") {
+                return Ok(did.trim().to_string());
             }
         }
-        Err(IdentityError::NoDidInTxt(dns_name))
     }
+    Err(IdentityError::NoDidInTxt(dns_name.to_string()))
 }
 
 /// In-memory DNS resolver for tests.
···
     fn test_extract_pds_endpoint_missing() {
         let doc = serde_json::json!({"service": []});
         assert!(extract_pds_endpoint(&doc, "did:plc:x").is_err());
+    }
+
+    #[test]
+    fn test_authority_of_nsid() {
+        assert_eq!(
+            authority_of_nsid("com.example.forum.post").unwrap(),
+            "com.example.forum"
+        );
+        assert_eq!(
+            authority_of_nsid("app.bsky.feed.post").unwrap(),
+            "app.bsky.feed"
+        );
+        assert_eq!(
+            authority_of_nsid("edu.university.dept.lab.blogging.getBlogPost").unwrap(),
+            "edu.university.dept.lab.blogging"
+        );
+        assert!(authority_of_nsid("com.example").is_err());
+        assert!(authority_of_nsid("one").is_err());
+    }
+
+    #[test]
+    fn test_authority_dns_name() {
+        assert_eq!(
+            authority_dns_name("com.example.forum"),
+            "_lexicon.forum.example.com"
+        );
+        assert_eq!(
+            authority_dns_name("app.bsky.feed"),
+            "_lexicon.feed.bsky.app"
+        );
+        assert_eq!(
+            authority_dns_name("edu.university.dept.lab.blogging"),
+            "_lexicon.blogging.lab.dept.university.edu"
+        );
     }
 }
+2
mlf-cli/Cargo.toml
···
 mlf-codegen = { path = "../mlf-codegen" }
 mlf-diagnostics = { path = "../mlf-diagnostics" }
 mlf-lexicon-fetcher = { path = "../mlf-lexicon-fetcher" }
+mlf-atproto = { path = "../mlf-atproto" }
 clap = { version = "4.5.48", features = ["derive"] }
 miette = { version = "7", features = ["fancy"] }
 thiserror = "2"
···
 reqwest = { version = "0.12", features = ["json"] }
 chrono = { version = "0.4", features = ["serde"] }
 sha2 = "0.10"
+similar = "2"
 
 # Optional code generator plugins
 mlf-codegen-typescript = { path = "../codegen-plugins/mlf-codegen-typescript", optional = true }
+14
mlf-cli/src/check.rs
···
     }
 }
 
+/// Publicly-callable alias used by other commands that need to walk
+/// the workspace source tree.
+pub fn collect_mlf_files_pub(dir: &std::path::Path) -> Result<Vec<PathBuf>, CheckError> {
+    collect_mlf_files(dir)
+}
+
+/// Publicly-callable alias used by other commands.
+pub fn extract_namespace_pub(
+    file_path: &std::path::Path,
+    root_dir: &std::path::Path,
+) -> Result<String, CheckError> {
+    extract_namespace(file_path, root_dir)
+}
+
 /// Recursively collect all .mlf files from a directory
 fn collect_mlf_files(dir: &std::path::Path) -> Result<Vec<PathBuf>, CheckError> {
     let mut files = Vec::new();
+86
mlf-cli/src/diff.rs
···
+//! `mlf diff [NSID]` — print a unified diff between the local Lexicon
+//! JSON and the currently-published record for one NSID (or every
+//! changed / new / removed NSID in the package).
+
+use crate::remote_state::{Classification, RemoteState, RemoteStateError, classify};
+use similar::{ChangeTag, TextDiff};
+
+pub async fn run_diff(nsid: Option<String>) -> Result<(), RemoteStateError> {
+    let state = RemoteState::load().await?;
+    let classifications = classify(&state);
+
+    let targets: Vec<String> = match nsid {
+        Some(ref n) => vec![n.clone()],
+        None => classifications
+            .iter()
+            .filter(|(_, c)| !matches!(c, Classification::Unchanged))
+            .map(|(n, _)| n.clone())
+            .collect(),
+    };
+
+    if targets.is_empty() {
+        println!("No differences.");
+        return Ok(());
+    }
+
+    for (i, nsid) in targets.iter().enumerate() {
+        if i > 0 {
+            println!();
+        }
+        let cls = classifications.get(nsid);
+        print_diff(&state, nsid, cls);
+    }
+
+    Ok(())
+}
+
+fn print_diff(state: &RemoteState, nsid: &str, cls: Option<&Classification>) {
+    let local_json = state
+        .local
+        .get(nsid)
+        .map(|l| pretty_json(&l.record_json))
+        .unwrap_or_default();
+    let remote_json = state
+        .remote
+        .get(nsid)
+        .map(|r| pretty_json(&r.record_json))
+        .unwrap_or_default();
+
+    let header = match cls {
+        Some(Classification::Unchanged) => format!("= {nsid} (unchanged)"),
+        Some(Classification::New) => format!("+ {nsid} (new, not yet on PDS)"),
+        Some(Classification::Removed) => format!("- {nsid} (removed locally, still on PDS)"),
+        Some(Classification::Changed {
+            local_cid,
+            remote_cid,
+        }) => format!("~ {nsid} {remote_cid} → {local_cid}"),
+        None => format!("? {nsid} (not found locally or remotely)"),
+    };
+    println!("{header}");
+    println!("{}", "-".repeat(header.chars().count()));
+
+    let diff = TextDiff::from_lines(&remote_json, &local_json);
+    for change in diff.iter_all_changes() {
+        let (sign, text) = match change.tag() {
+            ChangeTag::Delete => ("-", change.value()),
+            ChangeTag::Insert => ("+", change.value()),
+            ChangeTag::Equal => (" ", change.value()),
+        };
+        // Trim trailing newline — `value()` returns `"line\n"` from the
+        // source; we supply our own newline via println.
+        let text = text.strip_suffix('\n').unwrap_or(text);
+        println!("{sign} {text}");
+    }
+}
+
+fn pretty_json(value: &serde_json::Value) -> String {
+    // `to_string_pretty` preserves insertion order; serde_json uses
+    // `preserve_order` via the `IndexMap` feature by default in this
+    // workspace? If not, key order is lexicographic. Either is fine for
+    // a human-facing diff; the CID comparison happens elsewhere.
+    let mut s = serde_json::to_string_pretty(value).unwrap_or_default();
+    if !s.ends_with('\n') {
+        s.push('\n');
+    }
+    s
+}
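
To make the rendering loop above easier to follow without pulling in a dependency, here is a standard-library stand-in. It is not how the `similar` crate works internally: it uses a plain longest-common-subsequence table (quadratic in the number of lines), and `line_diff` is a hypothetical name. The output convention matches the commit: a sign column (`-`, `+`, or a space), one space, then the line with its trailing newline removed.

```rust
/// Line-based diff from `old` to `new`, returned as already-prefixed lines.
pub fn line_diff(old: &str, new: &str) -> Vec<String> {
    let a: Vec<&str> = old.lines().collect();
    let b: Vec<&str> = new.lines().collect();

    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table from the start, emitting deletions before insertions
    // when both choices keep the same common subsequence length.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(format!("  {}", a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(format!("- {}", a[i]));
            i += 1;
        } else {
            out.push(format!("+ {}", b[j]));
            j += 1;
        }
    }

    // Whatever remains on either side is a pure deletion or insertion.
    out.extend(a[i..].iter().map(|l| format!("- {l}")));
    out.extend(b[j..].iter().map(|l| format!("+ {l}")));
    out
}
```

Passing the remote JSON as `old` and the local JSON as `new` gives the same orientation as the command: `-` lines are what is on the PDS today, `+` lines are what a publish would write.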
+3
mlf-cli/src/lib.rs
···
 pub mod check;
 pub mod config;
+pub mod diff;
 pub mod fetch;
 pub mod generate;
 pub mod init;
+pub mod remote_state;
+pub mod status;
 pub mod workspace_ext;
+12 -1
mlf-cli/src/main.rs
···
 use clap::{Parser, Subcommand};
 use miette::IntoDiagnostic;
-use mlf_cli::{check, fetch, generate, init};
+use mlf_cli::{check, diff, fetch, generate, init, status};
 use std::path::PathBuf;
 use std::process;
 
···
 
         #[arg(long, help = "Require lockfile and fail if dependencies need updating")]
         locked: bool,
+    },
+
+    #[command(about = "Show the publish status of each lexicon in the package")]
+    Status,
+
+    #[command(about = "Diff local lexicons against what's currently published")]
+    Diff {
+        #[arg(help = "NSID to diff. If omitted, diffs every changed, new, or removed NSID.")]
+        nsid: Option<String>,
     },
 }
 
···
         } => fetch::run_fetch(nsid, save, update, locked)
             .await
             .into_diagnostic(),
+        Commands::Status => status::run_status().await.into_diagnostic(),
+        Commands::Diff { nsid } => diff::run_diff(nsid).await.into_diagnostic(),
     };
 
     if let Err(e) = result {
+550
mlf-cli/src/remote_state.rs
··· 1 + //! Shared machinery for reading the current remote state of a package. 2 + //! 3 + //! Both `mlf status` and `mlf diff` (and later `mlf publish`) need to: 4 + //! 5 + //! 1. Load the workspace and generate Lexicon JSON for every local module. 6 + //! 2. Compute the on-chain CID for each local record. 7 + //! 3. Resolve the package's publishing DID via `_lexicon.<authority>` TXT. 8 + //! 4. Fetch the current `com.atproto.lexicon.schema` records from the PDS. 9 + //! 5. Diff local vs remote by (NSID, CID). 10 + //! 11 + //! This module exposes the first three as a single read-only operation 12 + //! that produces a [`RemoteState`] the command-specific code can render. 13 + 14 + use crate::check::collect_mlf_files_pub as collect_mlf_files; 15 + use crate::check::extract_namespace_pub as extract_namespace; 16 + use crate::config::{MlfConfig, PackageConfig, find_project_root}; 17 + use crate::workspace_ext::workspace_with_std_and_cache; 18 + use miette::Diagnostic; 19 + use mlf_atproto::identity::{self, IdentityError, RealDnsResolver}; 20 + use mlf_atproto::records::{self, RecordError}; 21 + use mlf_lang::Workspace; 22 + use serde_json::Value; 23 + use std::collections::{BTreeMap, HashSet}; 24 + use std::path::{Path, PathBuf}; 25 + use thiserror::Error; 26 + 27 + /// Collection name the AT Protocol uses for lexicon schemas. 
28 + pub const LEXICON_COLLECTION: &str = "com.atproto.lexicon.schema"; 29 + 30 + #[derive(Error, Debug, Diagnostic)] 31 + pub enum RemoteStateError { 32 + #[error("No mlf.toml found in current or parent directories")] 33 + #[diagnostic( 34 + code(mlf::remote_state::no_project), 35 + help("Run `mlf init` first to set up a project.") 36 + )] 37 + NoProject, 38 + 39 + #[error("Failed to load mlf.toml: {0}")] 40 + #[diagnostic(code(mlf::remote_state::config))] 41 + Config(String), 42 + 43 + #[error("Failed to read source files: {0}")] 44 + #[diagnostic(code(mlf::remote_state::io))] 45 + Io(String), 46 + 47 + #[error("Failed to parse {file}")] 48 + #[diagnostic(code(mlf::remote_state::parse))] 49 + Parse { file: String }, 50 + 51 + #[error("Workspace resolution failed")] 52 + #[diagnostic(code(mlf::remote_state::resolve))] 53 + Resolve, 54 + 55 + #[error("Lexicon `{namespace}` (in {file}) is outside package scope `{package_name}.*`")] 56 + #[diagnostic( 57 + code(mlf::remote_state::scope_violation), 58 + help("Move the file or update `[package].name` in mlf.toml.") 59 + )] 60 + ScopeViolation { 61 + file: String, 62 + namespace: String, 63 + package_name: String, 64 + }, 65 + 66 + #[error("Failed to generate Lexicon JSON for `{namespace}`: {reason}")] 67 + #[diagnostic(code(mlf::remote_state::codegen))] 68 + Codegen { namespace: String, reason: String }, 69 + 70 + #[error("CID computation failed for `{namespace}`: {reason}")] 71 + #[diagnostic(code(mlf::remote_state::cid))] 72 + Cid { namespace: String, reason: String }, 73 + 74 + #[error("Identity resolution failed")] 75 + #[diagnostic(code(mlf::remote_state::identity))] 76 + Identity(#[source] IdentityError), 77 + 78 + #[error("Record fetch failed")] 79 + #[diagnostic(code(mlf::remote_state::records))] 80 + Records(#[source] RecordError), 81 + 82 + #[error( 83 + "Package authorities resolve to multiple DIDs: {first_authority} → {first_did}, {second_authority} → {second_did}" 84 + )] 85 + #[diagnostic( 86 + 
code(mlf::remote_state::multiple_dids), 87 + help( 88 + "Every authority under `[package].name` must resolve to the same DID. This package spans two. Either fix the `_lexicon` TXT records, or split into separate workspaces." 89 + ) 90 + )] 91 + MultipleDids { 92 + first_authority: String, 93 + first_did: String, 94 + second_authority: String, 95 + second_did: String, 96 + }, 97 + } 98 + 99 + /// Fully-resolved view of the package's local state and the remote state 100 + /// it would compare against on publish. 101 + pub struct RemoteState { 102 + pub project_root: PathBuf, 103 + pub package: PackageConfig, 104 + 105 + /// The DID every authority under `[package].name` resolves to. 106 + /// `None` if we couldn't resolve any authority (e.g. no TXT records yet 107 + /// — a brand-new package). 108 + pub publishing_did: Option<String>, 109 + 110 + /// The PDS endpoint for [`publishing_did`]. `None` when the DID is. 111 + pub pds: Option<String>, 112 + 113 + /// Per-authority TXT lookup result. Keyed by authority (the reverse- 114 + /// DNS prefix of the NSID), so `[package.foo, package.bar]` under 115 + /// `com.example` both show up as `com.example` in the map — the 116 + /// authority covers every NSID that starts with it. 117 + pub authority_status: BTreeMap<String, AuthorityStatus>, 118 + 119 + /// Every lexicon defined locally, keyed by NSID. 120 + pub local: BTreeMap<String, LocalLexicon>, 121 + 122 + /// Every `com.atproto.lexicon.schema` record currently in the 123 + /// publishing DID's repo, keyed by NSID. Empty if we couldn't 124 + /// resolve a DID. 125 + pub remote: BTreeMap<String, RemoteLexicon>, 126 + } 127 + 128 + pub struct LocalLexicon { 129 + pub namespace: String, 130 + pub file: String, 131 + /// The JSON as it would be written to the PDS (with `$type`). 132 + pub record_json: Value, 133 + /// CID of [`record_json`] — comparable to whatever listRecords returns. 
134 + pub cid: String, 135 + } 136 + 137 + pub struct RemoteLexicon { 138 + pub nsid: String, 139 + pub cid: String, 140 + pub record_json: Value, 141 + } 142 + 143 + #[derive(Debug, Clone)] 144 + pub enum AuthorityStatus { 145 + /// `_lexicon.<auth>` TXT resolved to a DID. 146 + Resolved { did: String }, 147 + /// `_lexicon.<auth>` TXT was missing or empty. 148 + Missing, 149 + /// Lookup failed for some other reason (network, parse, etc.). 150 + Error(String), 151 + } 152 + 153 + impl RemoteState { 154 + /// Build a [`RemoteState`] by walking the workspace, generating JSON, 155 + /// and performing the necessary network lookups. 156 + pub async fn load() -> Result<Self, RemoteStateError> { 157 + let current_dir = 158 + std::env::current_dir().map_err(|e| RemoteStateError::Io(e.to_string()))?; 159 + let project_root = 160 + find_project_root(&current_dir).map_err(|_| RemoteStateError::NoProject)?; 161 + let config_path = project_root.join("mlf.toml"); 162 + let config = 163 + MlfConfig::load(&config_path).map_err(|e| RemoteStateError::Config(e.to_string()))?; 164 + let package = config.package.clone(); 165 + let source_dir = project_root.join(&config.source.directory); 166 + 167 + // 1. Load every .mlf file, parse it, add to a resolver workspace. 168 + let local = collect_local(&source_dir, &project_root, &package)?; 169 + 170 + // 2. Group NSIDs into their reverse-DNS authorities. 171 + let authorities = authorities_for_nsids(local.keys().map(|s| s.as_str()))?; 172 + 173 + // 3. Resolve each authority via DNS; enforce single-DID gate. 174 + let resolver = RealDnsResolver::new().map_err(RemoteStateError::Identity)?; 175 + let (publishing_did, authority_status) = 176 + resolve_authorities(&resolver, &authorities).await?; 177 + 178 + // 4. If we have a DID, resolve its PDS and fetch the lexicon records. 
179 + let (pds, remote) = if let Some(did) = publishing_did.as_deref() { 180 + let client = reqwest::Client::new(); 181 + let pds = identity::resolve_did_to_pds(&client, did) 182 + .await 183 + .map_err(RemoteStateError::Identity)?; 184 + let records = records::list_all_records(&client, &pds, did, LEXICON_COLLECTION) 185 + .await 186 + .map_err(RemoteStateError::Records)?; 187 + let mut remote_map: BTreeMap<String, RemoteLexicon> = BTreeMap::new(); 188 + for r in records { 189 + let nsid = record_nsid(&r).unwrap_or_else(|| rkey_from_uri(&r.uri)); 190 + let cid = r.cid.clone().unwrap_or_default(); 191 + remote_map.insert( 192 + nsid.clone(), 193 + RemoteLexicon { 194 + nsid, 195 + cid, 196 + record_json: r.value, 197 + }, 198 + ); 199 + } 200 + (Some(pds), remote_map) 201 + } else { 202 + (None, BTreeMap::new()) 203 + }; 204 + 205 + Ok(Self { 206 + project_root, 207 + package, 208 + publishing_did, 209 + pds, 210 + authority_status, 211 + local, 212 + remote, 213 + }) 214 + } 215 + } 216 + 217 + /// Compute what a publish run would do given a fully-loaded [`RemoteState`]. 218 + pub fn classify(state: &RemoteState) -> BTreeMap<String, Classification> { 219 + let mut out = BTreeMap::new(); 220 + let mut seen: HashSet<&String> = HashSet::new(); 221 + 222 + for (nsid, local) in &state.local { 223 + seen.insert(nsid); 224 + let cls = match state.remote.get(nsid) { 225 + None => Classification::New, 226 + Some(remote) if remote.cid == local.cid => Classification::Unchanged, 227 + Some(remote) => Classification::Changed { 228 + local_cid: local.cid.clone(), 229 + remote_cid: remote.cid.clone(), 230 + }, 231 + }; 232 + out.insert(nsid.clone(), cls); 233 + } 234 + 235 + for nsid in state.remote.keys() { 236 + if state.package.namespace_is_in_scope(nsid) && !seen.contains(nsid) { 237 + // We only flag remote-only records that fall under this package's 238 + // scope. 
Anything else in the repo belongs to a different package 239 + // hosted on the same account and isn't ours to touch. 240 + out.insert(nsid.clone(), Classification::Removed); 241 + } 242 + } 243 + 244 + out 245 + } 246 + 247 + #[derive(Debug, Clone, PartialEq, Eq)] 248 + pub enum Classification { 249 + Unchanged, 250 + New, 251 + Changed { 252 + local_cid: String, 253 + remote_cid: String, 254 + }, 255 + Removed, 256 + } 257 + 258 + // --------------------------------------------------------------------------- 259 + // helpers 260 + // --------------------------------------------------------------------------- 261 + 262 + fn collect_local( 263 + source_dir: &Path, 264 + project_root: &Path, 265 + package: &PackageConfig, 266 + ) -> Result<BTreeMap<String, LocalLexicon>, RemoteStateError> { 267 + let cache_dir = crate::config::get_mlf_cache_dir(project_root); 268 + let mut workspace = workspace_with_std_and_cache(Some(&cache_dir)) 269 + .map_err(|e| RemoteStateError::Io(format!("loading workspace: {e}")))?; 270 + 271 + let files = collect_mlf_files(source_dir) 272 + .map_err(|e| RemoteStateError::Io(format!("collecting .mlf files: {e}")))?; 273 + 274 + let mut source_order: Vec<(String, String)> = Vec::new(); 275 + 276 + for file in &files { 277 + let source = std::fs::read_to_string(file) 278 + .map_err(|e| RemoteStateError::Io(format!("reading {}: {e}", file.display())))?; 279 + let namespace = extract_namespace(file, source_dir) 280 + .map_err(|e| RemoteStateError::Io(format!("namespace for {}: {e}", file.display())))?; 281 + if !package.namespace_is_in_scope(&namespace) { 282 + return Err(RemoteStateError::ScopeViolation { 283 + file: file.display().to_string(), 284 + namespace, 285 + package_name: package.name.clone(), 286 + }); 287 + } 288 + let lexicon = mlf_lang::parse_lexicon(&source).map_err(|_| RemoteStateError::Parse { 289 + file: file.display().to_string(), 290 + })?; 291 + workspace 292 + .add_module(namespace.clone(), lexicon) 293 + .map_err(|_| 
RemoteStateError::Resolve)?; 294 + source_order.push((namespace, file.display().to_string())); 295 + } 296 + 297 + workspace.resolve().map_err(|_| RemoteStateError::Resolve)?; 298 + 299 + // Produce JSON + CID for each module in the order we loaded them. 300 + let mut out = BTreeMap::new(); 301 + for (namespace, file) in source_order { 302 + let lexicon = workspace_module(&workspace, &namespace)?; 303 + let codegen_out = mlf_codegen::generate_lexicon(&namespace, lexicon, &workspace); 304 + let record_json = wrap_as_schema_record(codegen_out.json); 305 + let cid = 306 + mlf_atproto::cid::cid_for_json(&record_json).map_err(|e| RemoteStateError::Cid { 307 + namespace: namespace.clone(), 308 + reason: e.to_string(), 309 + })?; 310 + out.insert( 311 + namespace.clone(), 312 + LocalLexicon { 313 + namespace: namespace.clone(), 314 + file, 315 + record_json, 316 + cid, 317 + }, 318 + ); 319 + } 320 + 321 + Ok(out) 322 + } 323 + 324 + fn workspace_module<'a>( 325 + workspace: &'a Workspace, 326 + namespace: &str, 327 + ) -> Result<&'a mlf_lang::Lexicon, RemoteStateError> { 328 + workspace 329 + .get_lexicon(namespace) 330 + .ok_or_else(|| RemoteStateError::Codegen { 331 + namespace: namespace.to_string(), 332 + reason: "module not present in workspace after resolve".to_string(), 333 + }) 334 + } 335 + 336 + /// Wrap a generated Lexicon JSON with the `$type` required on a PDS record. 337 + fn wrap_as_schema_record(mut lexicon_json: Value) -> Value { 338 + if let Value::Object(ref mut map) = lexicon_json { 339 + map.insert( 340 + "$type".to_string(), 341 + Value::String(LEXICON_COLLECTION.to_string()), 342 + ); 343 + } 344 + lexicon_json 345 + } 346 + 347 + /// Take every NSID and return the unique set of authorities covering 348 + /// them per the ATProto lexicon spec: drop the final name segment and 349 + /// use the remainder as the authority. 350 + /// 351 + /// NSIDs `com.example.forum.post` and `com.example.forum.thread` share 352 + /// authority `com.example.forum`. 
`com.example.directory.listing` adds 353 + /// a second authority `com.example.directory`. 354 + fn authorities_for_nsids<'a>( 355 + nsids: impl Iterator<Item = &'a str>, 356 + ) -> Result<HashSet<String>, RemoteStateError> { 357 + nsids 358 + .map(|nsid| identity::authority_of_nsid(nsid).map_err(RemoteStateError::Identity)) 359 + .collect() 360 + } 361 + 362 + async fn resolve_authorities( 363 + resolver: &RealDnsResolver, 364 + authorities: &HashSet<String>, 365 + ) -> Result<(Option<String>, BTreeMap<String, AuthorityStatus>), RemoteStateError> { 366 + let mut statuses: BTreeMap<String, AuthorityStatus> = BTreeMap::new(); 367 + let mut dids: Vec<(String, String)> = Vec::new(); 368 + 369 + for authority in authorities { 370 + let label = identity::authority_dns_name(authority); 371 + let res = resolver.resolve_authority_did(authority).await; 372 + match res { 373 + Ok(did) => { 374 + statuses.insert( 375 + label.clone(), 376 + AuthorityStatus::Resolved { did: did.clone() }, 377 + ); 378 + dids.push((label, did)); 379 + } 380 + Err(IdentityError::NoDidInTxt(_)) => { 381 + statuses.insert(label, AuthorityStatus::Missing); 382 + } 383 + Err(IdentityError::DnsLookupFailed { error, .. }) => { 384 + statuses.insert(label, AuthorityStatus::Error(error)); 385 + } 386 + Err(e) => return Err(RemoteStateError::Identity(e)), 387 + } 388 + } 389 + 390 + // Enforce single-DID gate: every resolved authority must point at the 391 + // same DID. Multiple DIDs is a v1 error per the plan. 
392 + let mut publishing_did: Option<String> = None; 393 + if let Some((first_auth, first_did)) = dids.first().cloned() { 394 + for (auth, did) in dids.iter().skip(1) { 395 + if did != &first_did { 396 + return Err(RemoteStateError::MultipleDids { 397 + first_authority: first_auth, 398 + first_did, 399 + second_authority: auth.clone(), 400 + second_did: did.clone(), 401 + }); 402 + } 403 + } 404 + publishing_did = Some(first_did); 405 + } 406 + 407 + Ok((publishing_did, statuses)) 408 + } 409 + 410 + fn record_nsid(record: &records::Record) -> Option<String> { 411 + record 412 + .value 413 + .get("id") 414 + .and_then(|v| v.as_str()) 415 + .map(|s| s.to_string()) 416 + } 417 + 418 + fn rkey_from_uri(uri: &str) -> String { 419 + uri.rsplit('/').next().unwrap_or(uri).to_string() 420 + } 421 + 422 + #[cfg(test)] 423 + mod tests { 424 + use super::*; 425 + use serde_json::json; 426 + 427 + fn mk_local(nsid: &str, cid: &str) -> LocalLexicon { 428 + LocalLexicon { 429 + namespace: nsid.to_string(), 430 + file: format!("{nsid}.mlf"), 431 + record_json: json!({"id": nsid}), 432 + cid: cid.to_string(), 433 + } 434 + } 435 + 436 + fn mk_remote(nsid: &str, cid: &str) -> RemoteLexicon { 437 + RemoteLexicon { 438 + nsid: nsid.to_string(), 439 + cid: cid.to_string(), 440 + record_json: json!({"id": nsid}), 441 + } 442 + } 443 + 444 + fn state_with( 445 + package: &str, 446 + local: Vec<LocalLexicon>, 447 + remote: Vec<RemoteLexicon>, 448 + ) -> RemoteState { 449 + let mut local_map = BTreeMap::new(); 450 + for l in local { 451 + local_map.insert(l.namespace.clone(), l); 452 + } 453 + let mut remote_map = BTreeMap::new(); 454 + for r in remote { 455 + remote_map.insert(r.nsid.clone(), r); 456 + } 457 + RemoteState { 458 + project_root: PathBuf::from("/tmp"), 459 + package: PackageConfig { 460 + name: package.to_string(), 461 + }, 462 + publishing_did: Some("did:plc:x".to_string()), 463 + pds: Some("https://pds".to_string()), 464 + authority_status: BTreeMap::new(), 465 + local: 
local_map, 466 + remote: remote_map, 467 + } 468 + } 469 + 470 + #[test] 471 + fn classify_detects_new_changed_unchanged_removed() { 472 + let state = state_with( 473 + "com.example.forum", 474 + vec![ 475 + mk_local("com.example.forum.post", "bafyA"), // unchanged 476 + mk_local("com.example.forum.thread", "bafyNEW"), // changed 477 + mk_local("com.example.forum.reply", "bafyR"), // new 478 + ], 479 + vec![ 480 + mk_remote("com.example.forum.post", "bafyA"), 481 + mk_remote("com.example.forum.thread", "bafyOLD"), 482 + mk_remote("com.example.forum.orphan", "bafyO"), // removed 483 + ], 484 + ); 485 + 486 + let classes = classify(&state); 487 + 488 + assert_eq!( 489 + classes.get("com.example.forum.post"), 490 + Some(&Classification::Unchanged) 491 + ); 492 + assert_eq!( 493 + classes.get("com.example.forum.thread"), 494 + Some(&Classification::Changed { 495 + local_cid: "bafyNEW".into(), 496 + remote_cid: "bafyOLD".into(), 497 + }), 498 + ); 499 + assert_eq!( 500 + classes.get("com.example.forum.reply"), 501 + Some(&Classification::New) 502 + ); 503 + assert_eq!( 504 + classes.get("com.example.forum.orphan"), 505 + Some(&Classification::Removed) 506 + ); 507 + } 508 + 509 + #[test] 510 + fn classify_ignores_remote_records_outside_package_scope() { 511 + // The PDS may hold records for other packages under the same 512 + // account. We must never touch records whose NSID isn't under 513 + // `[package].name`. 514 + let state = state_with( 515 + "com.example.forum", 516 + vec![mk_local("com.example.forum.post", "bafyA")], 517 + vec![ 518 + mk_remote("com.example.forum.post", "bafyA"), 519 + // Under the same account but a different package — leave it alone. 
520 + mk_remote("com.example.directory.listing", "bafyD"), 521 + ], 522 + ); 523 + 524 + let classes = classify(&state); 525 + assert_eq!(classes.len(), 1); 526 + assert!(classes.contains_key("com.example.forum.post")); 527 + assert!(!classes.contains_key("com.example.directory.listing")); 528 + } 529 + 530 + #[test] 531 + fn authorities_collapse_by_full_prefix() { 532 + let auths = authorities_for_nsids( 533 + ["com.example.forum.post", "com.example.forum.thread"].into_iter(), 534 + ) 535 + .unwrap(); 536 + assert_eq!(auths.len(), 1); 537 + assert!(auths.contains("com.example.forum")); 538 + } 539 + 540 + #[test] 541 + fn authorities_split_on_different_sub_authority() { 542 + let auths = authorities_for_nsids( 543 + ["com.example.forum.post", "com.example.directory.listing"].into_iter(), 544 + ) 545 + .unwrap(); 546 + assert_eq!(auths.len(), 2); 547 + assert!(auths.contains("com.example.forum")); 548 + assert!(auths.contains("com.example.directory")); 549 + } 550 + }
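
The single-DID gate in `resolve_authorities` can be looked at on its own. The following is a minimal sketch under stated assumptions: `DidConflict` and `single_did` are hypothetical names, the input is a list of (authority label, DID) pairs that have already resolved, and authorities whose TXT record is missing are assumed to have been filtered out beforehand.

```rust
/// Two authorities that resolved to different DIDs (hypothetical type,
/// shaped like the fields of the `MultipleDids` error).
#[derive(Debug, PartialEq, Eq)]
pub struct DidConflict {
    pub first_authority: String,
    pub first_did: String,
    pub second_authority: String,
    pub second_did: String,
}

/// Return the one DID shared by every resolved authority, `None` when
/// nothing resolved, or the first conflicting pair.
pub fn single_did(resolved: &[(String, String)]) -> Result<Option<String>, DidConflict> {
    let (first_auth, first_did) = match resolved.first() {
        Some(pair) => pair,
        None => return Ok(None),
    };

    // Compare every later entry against the first one.
    for (auth, did) in &resolved[1..] {
        if did != first_did {
            return Err(DidConflict {
                first_authority: first_auth.clone(),
                first_did: first_did.clone(),
                second_authority: auth.clone(),
                second_did: did.clone(),
            });
        }
    }

    Ok(Some(first_did.clone()))
}
```

Returning `Ok(None)` for an empty list keeps a brand-new package (no TXT records yet) distinct from a misconfigured one, which is the distinction `publishing_did: Option<String>` carries in `RemoteState`.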
+127
mlf-cli/src/status.rs
···
+ //! `mlf status` — show each NSID's state (new / changed / unchanged /
+ //! removed) relative to what's currently published on the PDS, plus the
+ //! DNS readiness of each authority in the package.
+
+ use crate::remote_state::{
+     AuthorityStatus, Classification, RemoteState, RemoteStateError, classify,
+ };
+
+ pub async fn run_status() -> Result<(), RemoteStateError> {
+     let state = RemoteState::load().await?;
+
+     println!("Package: {}", state.package.name);
+     match state.publishing_did.as_deref() {
+         Some(did) => println!("Publishing DID: {did}"),
+         None => println!("Publishing DID: (not yet set — _lexicon TXT records missing)"),
+     }
+     if let Some(pds) = state.pds.as_deref() {
+         println!("PDS: {pds}");
+     }
+     println!();
+
+     print_dns_section(&state);
+     println!();
+     print_lexicon_section(&state);
+
+     Ok(())
+ }
+
+ fn print_dns_section(state: &RemoteState) {
+     println!("DNS authority readiness:");
+     if state.authority_status.is_empty() {
+         println!(" (no authorities found — is your workspace empty?)");
+         return;
+     }
+     for (label, status) in &state.authority_status {
+         match status {
+             AuthorityStatus::Resolved { did } => {
+                 let matches_package = state
+                     .publishing_did
+                     .as_deref()
+                     .map(|p| p == did)
+                     .unwrap_or(false);
+                 let marker = if matches_package { "✓" } else { "≠" };
+                 println!(" {marker} {label} → did={did}");
+             }
+             AuthorityStatus::Missing => {
+                 println!(" ✗ {label} → (TXT record missing)");
+             }
+             AuthorityStatus::Error(e) => {
+                 println!(" ! {label} → (lookup failed: {e})");
+             }
+         }
+     }
+ }
+
+ fn print_lexicon_section(state: &RemoteState) {
+     let classifications = classify(state);
+
+     if classifications.is_empty() {
+         println!("No local lexicons found.");
+         return;
+     }
+
+     let mut new = Vec::new();
+     let mut changed = Vec::new();
+     let mut removed = Vec::new();
+     let mut unchanged = Vec::new();
+     for (nsid, cls) in &classifications {
+         match cls {
+             Classification::New => new.push(nsid),
+             Classification::Changed { .. } => changed.push(nsid),
+             Classification::Removed => removed.push(nsid),
+             Classification::Unchanged => unchanged.push(nsid),
+         }
+     }
+
+     println!(
+         "Lexicons: {} total ({} unchanged, {} changed, {} new, {} removed)",
+         classifications.len(),
+         unchanged.len(),
+         changed.len(),
+         new.len(),
+         removed.len(),
+     );
+     println!();
+
+     if !new.is_empty() {
+         println!(" New (will be published):");
+         for nsid in new {
+             if let Some(local) = state.local.get(nsid) {
+                 println!(" + {nsid} cid={}", local.cid);
+             }
+         }
+         println!();
+     }
+
+     if !changed.is_empty() {
+         println!(" Changed (will be republished):");
+         for nsid in changed {
+             if let Some(Classification::Changed {
+                 local_cid,
+                 remote_cid,
+             }) = classifications.get(nsid)
+             {
+                 println!(" ~ {nsid} {remote_cid} → {local_cid}");
+             }
+         }
+         println!();
+     }
+
+     if !removed.is_empty() {
+         println!(" Removed (will be unpublished):");
+         for nsid in removed {
+             if let Some(remote) = state.remote.get(nsid) {
+                 println!(" - {nsid} cid={}", remote.cid);
+             }
+         }
+         println!();
+     }
+
+     if !unchanged.is_empty() {
+         println!(" Unchanged:");
+         for nsid in unchanged {
+             println!(" = {nsid}");
+         }
+     }
+ }
+63
website/content/docs/cli/08-status.md
···
+ +++
+ title = "Status Command"
+ description = "Show what's changed since the last publish"
+ weight = 8
+ +++
+
+ `mlf status` compares every lexicon in your workspace against whatever's currently published on your PDS and reports, per NSID, whether it's **unchanged**, **changed**, **new**, or **removed**. It also summarises the DNS readiness of every authority under `[package].name`.
+
+ The command is read-only — no credentials needed, no writes performed. It exercises the same pipeline a `mlf publish` run would, minus the network writes.
+
+ ## Usage
+
+ ```bash
+ mlf status
+ ```
+
+ No flags yet. The command reads `mlf.toml` from the current directory or a parent, generates Lexicon JSON for every `.mlf` file under `[source].directory`, computes each record's CID, resolves each `_lexicon.<authority>` TXT to a DID, and fetches the current `com.atproto.lexicon.schema` records from the resulting PDS.
+
+ ## Output
+
+ ```
+ Package: com.example.forum
+ Publishing DID: did:plc:abcd1234
+ PDS: https://pds.example.com
+
+ DNS authority readiness:
+   ✓ _lexicon.forum.example.com → did=did:plc:abcd1234
+
+ Lexicons: 3 total (1 unchanged, 1 changed, 1 new, 0 removed)
+
+   New (will be published):
+     + com.example.forum.reply cid=bafyLOCAL
+
+   Changed (will be republished):
+     ~ com.example.forum.thread bafyREMOTE → bafyLOCAL
+
+   Unchanged:
+     = com.example.forum.post
+ ```
+
+ Symbols:
+ - `✓` — authority's TXT resolves to the package's DID.
+ - `≠` — TXT resolves to a *different* DID than the package's. Publish will refuse unless you fix the TXT or pass `--force`.
+ - `✗` — TXT is missing. Publish will create it (if a DNS plugin is configured for this authority).
+ - `!` — lookup failed for another reason (network, DNSSEC, etc.).
+
+ Lexicon classifications:
+ - `=` **Unchanged** — local CID matches the record currently on the PDS.
+ - `~` **Changed** — local CID differs; publish will replace the record.
+ - `+` **New** — no record under this NSID on the PDS yet; publish will create one.
+ - `-` **Removed** — a record exists on the PDS under an NSID your workspace no longer defines; publish will delete it.
+
+ Records whose NSID isn't under `[package].name` are left alone — even if they sit in the same PDS repo, they belong to a different package and aren't this workspace's concern.
+
+ ## Exit codes
+
+ - `0` — status fetched and printed (no error even if there are differences; status is informational).
+ - Non-zero — could not load the workspace, resolve DNS, or reach the PDS.
+
+ ## See also
+
+ - [`mlf diff`](../09-diff/) — show the actual JSON diff for a changed lexicon.
+ - [Configuration → Package Identity](../02-configuration/#package-identity) — what controls which NSIDs are in-scope.
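The `_lexicon.<authority>` labels in the DNS readiness section follow the drop-final-segment rule described in the commit message. A minimal std-only sketch of that mapping is below. `authority_of` and `lexicon_txt_name` are hypothetical names for illustration, not the `mlf-atproto` API, and the three-segment minimum mirrors the check in `authority_of_nsid`.

```rust
/// Drop the final name segment of an NSID to get its authority,
/// still in NSID order. Returns `None` for NSIDs with fewer than
/// three segments (assumed minimum: authority + name).
fn authority_of(nsid: &str) -> Option<String> {
    let parts: Vec<&str> = nsid.split('.').collect();
    if parts.len() < 3 {
        return None;
    }
    Some(parts[..parts.len() - 1].join("."))
}

/// Reverse an NSID-order authority into DNS order and prefix it with
/// the `_lexicon` label used for the TXT lookup.
fn lexicon_txt_name(authority: &str) -> String {
    let reversed: Vec<&str> = authority.split('.').rev().collect();
    format!("_lexicon.{}", reversed.join("."))
}
```

With these helpers, `com.example.forum.post` and `com.example.forum.thread` both map to `_lexicon.forum.example.com`, which is why a package with several lexicons can show a single line in the readiness list.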
+53
website/content/docs/cli/09-diff.md
···
+ +++
+ title = "Diff Command"
+ description = "Show what would change on publish"
+ weight = 9
+ +++
+
+ `mlf diff` prints a line-based unified diff between the Lexicon JSON your workspace currently produces and the record already published on the PDS, for one NSID or every changed lexicon in the package.
+
+ Read-only. No credentials needed.
+
+ ## Usage
+
+ ```bash
+ # Diff every changed / new / removed NSID
+ mlf diff
+
+ # Diff just one
+ mlf diff com.example.forum.thread
+ ```
+
+ ## Output
+
+ ```
+ ~ com.example.forum.thread bafyREMOTE → bafyLOCAL
+ ----------------------------------------------------
+   {
+     "$type": "com.atproto.lexicon.schema",
+     "lexicon": 1,
+     "id": "com.example.forum.thread",
+     "defs": {
+       "main": {
+         "type": "record",
+ -       "description": "A thread"
+ +       "description": "A thread (v2)"
+       }
+     }
+   }
+ ```
+
+ Header symbols match [`mlf status`](../08-status/):
+ - `~` — changed (CID differs; both sides present)
+ - `+` — new (no remote side yet)
+ - `-` — removed (no local side, record still on PDS)
+ - `=` — unchanged (only rendered when you explicitly target the NSID)
+
+ ## Notes
+
+ - The diff shows the **record** as it would be stored on the PDS — that is, the Lexicon JSON wrapped with `"$type": "com.atproto.lexicon.schema"`. The CID header is the content-hash of that exact shape.
+ - Key order inside JSON objects is serialiser-defined (alphabetical by default). CID comparison is independent of display order; the DAG-CBOR canonical form is what the hash is taken over.
+
+ ## See also
+
+ - [`mlf status`](../08-status/) — overview of every lexicon's state at a glance.
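To make "line-based diff" concrete, here is a small std-only sketch that compares two pretty-printed JSON strings line by line. Per the commit message, `mlf diff` uses the `similar` crate; this is not that crate's implementation or output format. `DiffLine`, `line_diff`, and `render` are hypothetical names, and the longest-common-subsequence table is a simple technique chosen for illustration.

```rust
/// One line of diff output: kept, only on the old side, or only on the new side.
#[derive(Debug, PartialEq, Eq)]
enum DiffLine<'a> {
    Same(&'a str),
    Removed(&'a str),
    Added(&'a str),
}

/// Line diff driven by a longest-common-subsequence (LCS) table.
fn line_diff<'a>(old: &'a str, new: &'a str) -> Vec<DiffLine<'a>> {
    let a: Vec<&str> = old.lines().collect();
    let b: Vec<&str> = new.lines().collect();

    // lcs[i][j] = length of the LCS of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk both sides, emitting removals before additions on ties.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(DiffLine::Same(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(DiffLine::Removed(a[i]));
            i += 1;
        } else {
            out.push(DiffLine::Added(b[j]));
            j += 1;
        }
    }
    out.extend(a[i..].iter().map(|&l| DiffLine::Removed(l)));
    out.extend(b[j..].iter().map(|&l| DiffLine::Added(l)));
    out
}

/// Render with a two-character gutter: "  " kept, "- " removed, "+ " added.
fn render(diff: &[DiffLine<'_>]) -> String {
    diff.iter()
        .map(|d| match d {
            DiffLine::Same(l) => format!("  {l}\n"),
            DiffLine::Removed(l) => format!("- {l}\n"),
            DiffLine::Added(l) => format!("+ {l}\n"),
        })
        .collect()
}
```

The table costs time and memory proportional to the product of the two line counts, which is acceptable for lexicon-sized JSON. A dedicated diff library is the better choice for large inputs.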