A loose federation of distributed, typed datasets

docs: add .reference directory and document git tracking policy

Added reference materials directory with atproto lexicon specification.
Updated CLAUDE.md to document that .reference directory should be
tracked in git to preserve external specifications and reference materials.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

+236
.chainlink/issues.db

This is a binary file and will not be displayed.

+230
.reference/atproto_lexicon_spec.md
···
+# Lexicon Specification - AT Protocol
+
+## Overview
+
+"Lexicon is a schema definition language used to describe atproto records, HTTP endpoints (XRPC), and event stream messages."
+
+The language builds on the atproto Data Model and incorporates concepts similar to JSON Schema and OpenAPI, while adding protocol-specific features. This specification covers version 1 of the Lexicon language.
+
+## Type Categories
+
+Lexicon types fall into several categories:
+
+**Concrete Types:** boolean, integer, string, bytes, cid-link, blob
+
+**Container Types:** array, object
+
+**Sub-types:** params, permission
+
+**Meta Types:** token, ref, union, unknown
+
+**Primary Types:** record, query, procedure, subscription, permission-set
+
+## Lexicon Files
+
+Lexicon schemas are JSON files associated with a single NSID containing one or more definitions. Required file fields:
+
+- `lexicon` (integer): Language version, currently fixed at 1
+- `id` (string): The NSID identifier
+- `defs` (object): Named definitions with distinct keys
+- `description` (string, optional): Overview text
+
+"References to specific definitions within a Lexicon use fragment syntax, like `com.example.defs#someView`."
+
+## Primary Type Definitions
+
+### Record Type
+
+Specifies data objects stored in repositories. Type-specific fields:
+
+- `key` (string): Record key type specification
+- `record` (object): Schema with type object describing the record structure
+
+### Query and Procedure (HTTP API)
+
+Describes XRPC endpoints. Fields:
+
+- `parameters`: Optional params schema for query parameters
+- `output`: Response body with encoding (MIME type) and optional schema
+- `input`: Request body (procedures only)
+- `errors`: Array of possible error codes with descriptions
+
+### Subscription (Event Stream)
+
+Defines WebSocket endpoint messages. Fields:
+
+- `parameters`: Optional HTTP parameters
+- `message`: Required specification with union schema
+- `errors`: Optional error definitions
+
+"Subscription schemas must be a `union` of refs, not an `object` type."
+
+### Permission Set
+
+Bundles permissions for OAuth scopes. Fields:
+
+- `title` / `title:langs`: Display name with localization
+- `detail` / `detail:langs`: Human-readable scope description
+- `permissions`: Array of permission definitions
+
+## Field Type Definitions
+
+### Primitive Types
+
+**boolean:** Optional `default` and `const` fields
+
+**integer:** Supports `minimum`, `maximum`, `enum`, `default`, `const`
+
+**string:** Supports `format`, `maxLength`, `minLength`, `maxGraphemes`, `minGraphemes`, `knownValues`, `enum`, `default`, `const`
+
+"Strings are Unicode. For non-Unicode encodings, use `bytes` instead."
+
+**bytes:** Raw binary data with optional `minLength` and `maxLength`
+
+**cid-link:** Content identifier links with no type-specific fields
+
+### Container Types
+
+**array:** Contains `items` (required schema) and optional `minLength`/`maxLength`
+
+**object:**
+- `properties`: Named field schemas
+- `required`: Array of required field names
+- `nullable`: Array of fields accepting null values
+
+"There is a semantic difference in data between omitting a field; including the field with value `null`; and including the field with a falsy value."
+
+**blob:** Binary large objects with:
+- `accept`: MIME type restrictions (glob patterns supported)
+- `maxSize`: Maximum bytes
+
+### Specialized Types
+
+**params:** Limited to HTTP query parameters, supporting only boolean, integer, string, or arrays of these types. Cannot be top-level named definitions.
+
+**permission:** Defines access permissions with `resource` field. Current resources:
+
+- `repo`: Repository write permissions with collection and optional action fields
+- `rpc`: Remote API calls with lxm (endpoints), aud (audience), and inheritAud fields
+
+"Permission declarations with unsupported resource types must be ignored by services implementing access control."
+
+**token:** Empty values referenced by name, used for symbolic enumerations. Cannot be used in refs, unions, or as object fields.
+
+### Reference and Union Types
+
+**ref:** References another schema definition globally (by NSID) or locally (by fragment). Reduces schema duplication for reusable definitions.
+
+**union:** Declares multiple possible types at a location. Fields:
+
+- `refs`: Array of schema references
+- `closed`: Boolean indicating if type list is fixed (default: false)
+
+"Unions represent that multiple possible types could be present at this location in the schema."
+
+**unknown:** Accepts any data object with no specific validation, but must be a CBOR map. Data may contain optional `$type` field.
+
+## String Formats
+
+Lexicon supports format-constrained strings:
+
+- `at-identifier`: Handle or DID
+- `at-uri`: AT-URI reference
+- `at-uri-regex`: "Lenient" version accepting unresolved at-identifier
+- `cid`: Content identifier
+- `datetime`: RFC 3339 timestamp
+- `did`: Decentralized identifier
+- `handle`: Handle identifier
+- `nsid`: Namespaced identifier
+- `tid`: Timestamp identifier
+- `record-key`: Record key syntax
+- `uri`: Generic URI (RFC 3986)
+- `language`: IETF language tag (BCP 47)
+
+### Datetime Format
+
+Required elements:
+- Intersection of RFC 3339, ISO 8601, and WHATWG HTML standards
+- Uppercase T separator between date and time
+- Timezone specification (preferably Z for UTC)
+- Whole seconds precision (millisecond precision recommended)
+
+Valid example: `1985-04-12T23:20:50.123Z`
+
+Invalid: Missing timezone, lowercase t, insufficient precision, or invalid day/month values
+
+### AT-URI Format
+
+"at-uri": Represents an AT-URI following the AT-URI scheme specification. Examples:
+- `at://did:plc:abc123/com.example.record/rkey123`
+- `at://alice.bsky.social/app.bsky.feed.post/3k4i5j6k`
+
+"at-uri-regex": "Lenient" version that accepts AT-URIs with unresolved at-identifiers.
+
+### URI Format
+
+"uri": "Flexible to any URI schema, following the generic RFC-3986 on URIs." Supports did, https, wss, ipfs, dns, and at schemes. Maximum length is 8 KBytes.
+
+### Language Format
+
+"language": "An IETF Language Tag string, compliant with BCP 47, defined in RFC 5646." Examples include `ja` (Japanese) and `pt-BR` (Brazilian Portuguese).
+
+## Validation Approach
+
+"For the various identifier formats, when doing Lexicon schema validation the most expansive identifier syntax format should be permitted." Application-level validation of specific identifier methods occurs separately from schema validation.
+
+## When to Use `$type`
+
+Data objects sometimes require a `$type` field for disambiguation:
+
+- `record` objects: Always include `$type`
+- `union` variants: Always include `$type` (except top-level subscription messages)
+- `blob` objects: Always include `$type`
+
+"Main types must be referenced in `$type` fields as just the NSID, not including a `#main` suffix."
+
+## Validation Options
+
+Three PDS validation approaches:
+
+1. **Explicit validation:** Record must validate against known Lexicon; fails if unavailable
+2. **No validation:** Record bypasses Lexicon validation (still validates data model rules)
+3. **Optimistic validation (default):** Validates if Lexicon known; allows creation if unavailable
+
+## Lexicon Evolution
+
+Compatibility rules for schema updates:
+
+- New fields must be optional
+- Non-optional fields cannot be removed
+- Field types cannot change
+- Fields cannot be renamed
+
+"If larger breaking changes are necessary, a new Lexicon name must be used."
+
+Lexicon publication occurs through atproto repositories using `com.atproto.lexicon.schema` record types, linked via DNS TXT records for authority resolution.
+
+## Authority and Control
+
+NSID authority derives from DNS domain control. Domain authorities maintain Lexicon definitions with ultimate responsibility for maintenance and distribution. Protocol implementations should treat data failing Lexicon validation as entirely invalid.
+
+"Unexpected fields in data which otherwise conforms to the Lexicon should be ignored."
+
+## Usage Guidelines
+
+Implementations should support translation to JSON Schema and OpenAPI formats for cross-ecosystem compatibility. Care must be taken when deserializing/reserializing to avoid losing unexpected fields that may represent newer schema versions.
+
+## Record Key Types
+
+The `key` field in record definitions specifies the format of record keys (rkeys). Options:
+
+- `"any"`: Any string matching general record-key syntax
+- `"tid"`: Must be a valid timestamp identifier
+- `"literal:{value}"`: Fixed literal string (e.g., `"literal:self"` for profile records)
+
+## Notes on Implementation
+
+- String grapheme counting should follow Unicode extended grapheme cluster boundaries
+- Unknown fields should be preserved during serialization/deserialization when possible
+- Services should be permissive with format validation but strict with structural requirements
+- Breaking schema changes require new NSIDs rather than version updates
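
As a rough illustration of the "Lexicon Files" and "When to Use `$type`" rules in the reference file above, the following Python sketch checks the required top-level fields of a Lexicon document and resolves fragment references. The function names (`check_lexicon_file`, `resolve_ref`, `type_string`) and the `com.example.note` schema are hypothetical; they are not part of this commit or of any atproto library, and the checks cover only the rules quoted in the file.

```python
from __future__ import annotations

from typing import Any

# Hypothetical example schema, shaped after the fields listed in the spec summary.
EXAMPLE_LEXICON: dict[str, Any] = {
    "lexicon": 1,
    "id": "com.example.note",
    "defs": {
        "main": {
            "type": "record",
            "key": "tid",
            "record": {
                "type": "object",
                "required": ["text", "createdAt"],
                "properties": {
                    "text": {"type": "string", "maxGraphemes": 300},
                    "createdAt": {"type": "string", "format": "datetime"},
                },
            },
        }
    },
}


def check_lexicon_file(doc: dict[str, Any]) -> list[str]:
    """Return a list of problems with the top-level fields of a Lexicon file."""
    problems: list[str] = []
    version = doc.get("lexicon")
    # `type(...) is int` rejects booleans, which Python treats as integers.
    if type(version) is not int or version != 1:
        problems.append("`lexicon` must be the integer 1")
    if not isinstance(doc.get("id"), str):
        problems.append("`id` must be a string NSID")
    defs = doc.get("defs")
    if not isinstance(defs, dict) or not defs:
        problems.append("`defs` must be an object with at least one definition")
    if "description" in doc and not isinstance(doc["description"], str):
        problems.append("`description`, when present, must be a string")
    return problems


def resolve_ref(ref: str, current_nsid: str) -> tuple[str, str]:
    """Split a ref into (nsid, definition name).

    A ref may be global (`com.example.defs#someView`), local (`#someView`),
    or a bare NSID, which this sketch treats as pointing at `main`.
    """
    nsid, _, name = ref.partition("#")
    return (nsid or current_nsid, name or "main")


def type_string(nsid: str, name: str) -> str:
    """Build a `$type` value; main definitions use the bare NSID, no `#main`."""
    return nsid if name == "main" else f"{nsid}#{name}"


if __name__ == "__main__":
    print(check_lexicon_file(EXAMPLE_LEXICON))
    print(resolve_ref("#someView", "com.example.note"))
    print(type_string("com.example.note", "main"))
```

A fuller validator would also check the NSID syntax of `id` and walk each definition in `defs`; those steps are omitted here to keep the sketch short.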
+6
CLAUDE.md
···
 - **Track `.planning/` directory in git** - Do not ignore planning documents
 - Planning documents in `.planning/` should be committed to preserve design history
 - This includes architecture notes, implementation plans, and design decisions
+
+### Reference Materials
+
+- **Track `.reference/` directory in git** - Include reference documentation in commits
+- The `.reference/` directory contains external specifications and reference materials
+- This includes API specs, lexicon definitions, and other reference documentation used for development