An easy-to-host PDS on the ATProtocol, iPhone and MacOS. Maintain control of your keys and data, always.

docs: add MM-146 DID ceremony design plan

Completed brainstorming session. Design includes:
- New public GET /v1/relay/keys relay endpoint
- build_did_plc_genesis_op_with_external_signer() with FnOnce callback for SE signing
- perform_did_ceremony Tauri command orchestrating the full 7-step ceremony
- 4 implementation phases with acceptance criteria

authored by Malpercio and committed by Tangled
25a0531a 57877ef8

+190
docs/design-plans/2026-03-20-MM-146.md
# Mobile DID Ceremony Flow: Sign Creation Op + Call Relay

## Summary

This document describes the "DID ceremony": the sequence of steps that formally establishes a user's decentralized identity after they create an account in the ezpds identity wallet app. During account creation the user receives only a provisional session token; the ceremony upgrades that into a permanent `did:plc` identity by combining two keys: the device's Secure Enclave key (which never leaves the hardware) and the relay's signing key (fetched over the network). The mobile app orchestrates these steps automatically on first launch after account creation, then persists both the resulting DID and the upgraded session token to Keychain.

The approach is split into four layers that build on each other in dependency order. First, the relay gains a new public endpoint that exposes its active signing key. Second, the `crypto` crate gains a callback-based genesis op builder that allows the Secure Enclave to perform the signing step without ever exposing private key bytes to Rust. Third, a new Tauri command stitches those two pieces together with the existing Keychain and IPC plumbing. Fourth, the frontend gains a dedicated loading screen, a success screen showing the new DID, and inline retry handling, all wired through the existing TypeScript IPC layer. Error codes flow from Rust as structured enum variants and are translated to user-facing copy only in the Svelte layer, so UI text can be updated without a Rust recompile.

## Definition of Done

- The app performs the DID ceremony automatically after account creation: fetches the relay's signing key (`GET /v1/relay/keys`), constructs and signs a `did:plc` genesis operation using the SE device key, posts it to `POST /v1/dids`, and stores the resulting DID and new session token in Keychain.
- The relay exposes a new public `GET /v1/relay/keys` endpoint returning the active signing key's `keyId`, `publicKey`, and `algorithm`.
- The `crypto` crate gains a new `build_did_plc_genesis_op_with_external_signer()` function that uses a signing callback (enabling SE-backed signing) instead of accepting raw private key bytes.
- The app shows a loading screen during the ceremony, a success screen with the truncated DID + "Continue" button, and inline retry on failure.
- Error states (network failure, no relay signing key, invalid signature) are handled gracefully with a retry button.

## Acceptance Criteria

### MM-146.AC1: GET /v1/relay/keys returns active signing key
- **MM-146.AC1.1 Success:** Returns 200 with `{ keyId, publicKey, algorithm }` when a signing key is provisioned
- **MM-146.AC1.2 Success:** Returns the most recently created key when multiple keys exist
- **MM-146.AC1.3 Failure:** Returns 503 when no signing key is provisioned
- **MM-146.AC1.4 Success:** Endpoint requires no authentication (public, no Bearer token needed)

### MM-146.AC2: build_did_plc_genesis_op_with_external_signer produces valid genesis op
- **MM-146.AC2.1 Success:** Callback receives CBOR-encoded unsigned op bytes; returned `PlcGenesisOp` passes `verify_genesis_op`
- **MM-146.AC2.2 Failure:** Callback returning `Err` propagates as `CryptoError::PlcOperation`
- **MM-146.AC2.3 Success:** Existing `build_did_plc_genesis_op` (now a wrapper) produces identical output to before (existing tests unchanged)

### MM-146.AC3: perform_did_ceremony completes the full ceremony
- **MM-146.AC3.1 Success:** Given a valid pending session token and provisioned relay key, returns `DIDCeremonyResult { did }` with a valid `did:plc` identifier
- **MM-146.AC3.2 Success:** Keychain `"session-token"` is overwritten with the full session token from `POST /v1/dids` response
- **MM-146.AC3.3 Success:** Keychain `"did"` is populated with the resulting DID
- **MM-146.AC3.4 Failure:** Returns `DIDCeremonyError::NoRelaySigningKey` (serializes as `{ code: "NO_RELAY_SIGNING_KEY" }`) when relay has no key
- **MM-146.AC3.5 Failure:** Returns `DIDCeremonyError::RelayKeyFetchFailed` when `GET /v1/relay/keys` is unreachable
- **MM-146.AC3.6 Failure:** Returns `DIDCeremonyError::SigningFailed` when SE signing fails
- **MM-146.AC3.7 Failure:** Returns `DIDCeremonyError::DidCreationFailed` when `POST /v1/dids` returns non-2xx

### MM-146.AC4: DID ceremony UI
- **MM-146.AC4.1 Success:** App shows loading screen with status text while ceremony is in flight
- **MM-146.AC4.2 Success:** On success, transitions to success screen showing truncated DID and a "Continue" button
- **MM-146.AC4.3 Failure:** On failure, shows inline error message and a Retry button (does not rewind to previous screen)
- **MM-146.AC4.4 Success:** Retry button re-invokes the ceremony from the beginning
- **MM-146.AC4.5 Success:** "Continue" button transitions to `shamir_backup` placeholder step

## Glossary

- **did:plc**: A W3C Decentralized Identifier (DID) method developed by Bluesky/ATProto. A `did:plc` identity is anchored by a signed genesis operation stored in the PLC directory, which is what this ceremony creates.
- **DID ceremony**: The multi-step sequence that converts a provisional account (pending session token) into a permanent `did:plc` identity. Specific to this codebase's onboarding flow.
- **Genesis operation**: The first signed operation in a `did:plc` log. It establishes the DID's initial rotation key, signing key, and service endpoint. Once submitted to the PLC directory, the DID identifier is derived from its content hash.
- **Rotation key**: A key listed in a `did:plc` genesis op that is authorized to rotate (replace) the DID's keys in the future. Here, the device's SE key serves as the rotation key.
- **Signing key**: The key listed in a `did:plc` genesis op that is authorized to sign ATProto records on behalf of this DID. Here, the relay's signing key plays this role.
- **Secure Enclave (SE)**: A hardware security module present in Apple devices. Private keys stored in the SE are non-extractable: they can only be used to sign data inside the enclave, never read out.
- **DAG-CBOR**: A deterministic binary encoding (Directed Acyclic Graph CBOR) used by the ATProto/IPLD ecosystem. The genesis op must be encoded in DAG-CBOR before signing so that the content hash, and therefore the DID, is stable and reproducible.
- **Pending session token**: A short-lived credential issued by the relay after account creation, before a DID has been established. The ceremony exchanges it for a full session token tied to the new DID.
- **Tauri command**: A Rust function registered with the Tauri framework that the frontend can call via IPC (`invoke()`). Equivalent to a native API endpoint for the mobile app's Svelte frontend.
- **IPC (Inter-Process Communication)**: In Tauri, the mechanism by which the JavaScript/Svelte frontend calls Rust functions in the native backend. The `ipc.ts` module provides typed wrappers around Tauri's `invoke()` call.
- **Keychain**: Apple's system credential store. Used here to persist the session token and DID between app launches without storing them in plain files.
- **LazyLock**: A Rust standard-library type (`std::sync::LazyLock`) for a value that is initialized exactly once on first access, used here to hold the singleton `RelayClient`.
- **thiserror**: A Rust derive-macro crate that generates `std::error::Error` implementations from annotated enums. Used throughout this codebase for structured error types.
- **SCREAMING_SNAKE_CASE**: The all-caps-with-underscores naming convention (e.g., `NO_RELAY_SIGNING_KEY`) used for serialized error codes so the TypeScript side can match on them with a discriminated union.
- **Functional Core / Imperative Shell**: An architecture pattern where pure logic (no I/O, no side effects) lives in library crates and all I/O is confined to the outermost layer. The `crypto` crate is the functional core; the Tauri command is the imperative shell.
- **503 (Service Unavailable)**: The HTTP status code returned by `GET /v1/relay/keys` when no signing key has been provisioned in the relay database yet.
- **Bruno**: A desktop HTTP client used in this project for API documentation and manual testing. Each endpoint has a corresponding `.bru` file in the `bruno/` directory.

## Architecture

The ceremony is split across four layers: relay (new public key endpoint), crypto crate (callback-based genesis op builder), Tauri backend (orchestration command), and frontend (loading/success/retry UI).

**Data flow:**

1. `DIDCeremonyScreen` mounts → calls `performDIDCeremony(handle)` (ipc.ts)
2. `perform_did_ceremony` Tauri command executes:
   - `device_key::get_or_create()` → `DevicePublicKey { multibase, key_id }`
   - `GET /v1/relay/keys` → `{ keyId, publicKey, algorithm }`
   - `build_did_plc_genesis_op_with_external_signer(rotation_key=device.key_id, signing_key=relay.keyId, handle, service_endpoint=RelayClient::base_url(), sign=|data| device_key::sign(data)...)`
   - `keychain::get_item("session-token")` → pending session token
   - `POST /v1/dids { rotationKeyPublic, signedCreationOp }` (Bearer pending token) → `{ did, session_token }`
   - `keychain::store_item("session-token", new_session_token)`
   - `keychain::store_item("did", did)`
   - Return `DIDCeremonyResult { did }`
3. On success: transition to `'did_success'` step → `DIDSuccessScreen`
4. On failure: show inline error + Retry button (no step rewind)

**Component boundaries:**

| Component | Responsibility |
|---|---|
| `GET /v1/relay/keys` (relay) | Returns most-recently-created signing key; no auth |
| `build_did_plc_genesis_op_with_external_signer` (crypto) | DAG-CBOR encoding, DID derivation, signature assembly; all in Rust |
| `perform_did_ceremony` (Tauri) | Orchestrates steps 1-7; owns error mapping |
| `DIDCeremonyScreen` / `DIDSuccessScreen` (frontend) | UI state machine; no crypto logic |

The relay's `service_endpoint` validation (`services.atproto_pds.endpoint == config.public_url`) is satisfied by passing `RelayClient::base_url()`, a compile-time constant that matches the relay's own `public_url` config.

## Existing Patterns

Investigation confirmed the following existing patterns that this design follows:

- **Tauri error types**: `thiserror` derives + SCREAMING_SNAKE_CASE via `serde(tag = "code", rename_all = "SCREAMING_SNAKE_CASE")`; matches `CreateAccountError` and `DeviceKeyError` in `src-tauri/src/lib.rs`
- **Keychain API**: `keychain::store_item(key, value)` / `get_item(key)` under service `"ezpds-identity-wallet"`; matches existing `create_account` and `get_or_create_device_key` usage
- **RelayClient**: `LazyLock<RelayClient>` with compile-time base URL (`DEBUG_RELAY_BASE_URL` / `RELAY_BASE_URL`), `post()` method via reqwest; `get()` added following the same pattern
- **ipc.ts**: `invoke('command_name', { args })` wrappers returning typed `Result`; matches `createAccount`, `signWithDeviceKey`, etc.
- **Screen components**: `LoadingScreen.svelte` already exists and accepts `statusText` prop; reused directly in `DIDCeremonyScreen`
- **Bruno collection**: one `.bru` file per endpoint with sequential `seq` number; new `get_relay_keys.bru` added
- **Crypto crate**: `build_did_plc_genesis_op` in `crates/crypto/src/plc.rs`; new function added in same file; existing becomes a thin wrapper (preserves all existing tests)

One divergence: `build_did_plc_genesis_op_with_external_signer` introduces a generic `<F: FnOnce(&[u8]) -> Result<Vec<u8>, CryptoError>>` parameter. This is new in the crypto crate (existing functions are not generic), but required because the SE private key is non-extractable.

## Implementation Phases

<!-- START_PHASE_1 -->
### Phase 1: Relay - GET /v1/relay/keys

**Goal:** Expose the relay's active signing key as a public endpoint.

**Components:**
- `crates/relay/src/routes/get_relay_signing_key.rs`: handler that queries `relay_signing_keys ORDER BY created_at DESC LIMIT 1`; returns `{ keyId, publicKey, algorithm }`; returns 503 if no key provisioned
- `crates/relay/src/app.rs`: register `get(get_relay_signing_key)` alongside existing `post(create_signing_key)` on `/v1/relay/keys`
- `crates/relay/src/routes/mod.rs`: expose new module
- `bruno/get_relay_keys.bru`: Bruno file for the new endpoint

**Dependencies:** None (first phase)

**Done when:** Handler integration tests pass: 200 with active key, 503 when no key provisioned, 200 with most recently created key when multiple exist
<!-- END_PHASE_1 -->

<!-- START_PHASE_2 -->
### Phase 2: Crypto - build_did_plc_genesis_op_with_external_signer

**Goal:** Add a callback-based genesis op builder so callers with non-extractable keys (SE) can sign without exposing private key bytes.

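The callback-based builder and its thin wrapper can be sketched with the standard library alone. This is a minimal illustration, not the crate's code: `SignedOp`, `encode_unsigned`, `toy_sign`, `build_op`, and `build_op_with_external_signer` are hypothetical stand-ins, the newline-joined encoding replaces real DAG-CBOR, and the checksum replaces a real signature. Only the shape of the `FnOnce(&[u8]) -> Result<Vec<u8>, CryptoError>` parameter is taken from the design.

```rust
/// Hypothetical stand-in for the crate's `CryptoError`.
#[derive(Debug, PartialEq)]
pub enum CryptoError {
    PlcOperation(String),
}

/// Simplified stand-in for `PlcGenesisOp`: the bytes that were signed plus the signature.
#[derive(Debug, PartialEq)]
pub struct SignedOp {
    pub unsigned_bytes: Vec<u8>,
    pub signature: Vec<u8>,
}

/// Placeholder encoder. The design encodes the op as DAG-CBOR; joining the
/// fields with newlines is only a way to get deterministic bytes here.
fn encode_unsigned(rotation_key: &str, signing_key: &str, handle: &str, endpoint: &str) -> Vec<u8> {
    [rotation_key, signing_key, handle, endpoint].join("\n").into_bytes()
}

/// Builds the op and delegates signing to the caller. The builder never sees
/// private key material; it only hands the unsigned bytes to `sign`.
pub fn build_op_with_external_signer<F>(
    rotation_key: &str,
    signing_key: &str,
    handle: &str,
    endpoint: &str,
    sign: F,
) -> Result<SignedOp, CryptoError>
where
    F: FnOnce(&[u8]) -> Result<Vec<u8>, CryptoError>,
{
    let unsigned_bytes = encode_unsigned(rotation_key, signing_key, handle, endpoint);
    // A callback error is returned to the caller unchanged.
    let signature = sign(&unsigned_bytes)?;
    Ok(SignedOp { unsigned_bytes, signature })
}

/// Toy keyed checksum used in place of a real signature. NOT cryptography.
fn toy_sign(key: &[u8], data: &[u8]) -> Vec<u8> {
    let mut acc: u64 = 0;
    for b in key.iter().chain(data.iter()) {
        acc = acc.wrapping_mul(31).wrapping_add(u64::from(*b));
    }
    acc.to_be_bytes().to_vec()
}

/// The raw-key entry point becomes a thin wrapper over the callback version,
/// so both paths produce the same output for the same inputs.
pub fn build_op(
    rotation_key: &str,
    signing_key: &str,
    handle: &str,
    endpoint: &str,
    private_key: &[u8],
) -> Result<SignedOp, CryptoError> {
    build_op_with_external_signer(rotation_key, signing_key, handle, endpoint, |data| {
        Ok(toy_sign(private_key, data))
    })
}
```

Because the wrapper only supplies a closure, the existing call sites and tests keep their signature while a Secure Enclave caller passes a closure that signs inside the hardware.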

**Components:**
- `crates/crypto/src/plc.rs`: new `pub fn build_did_plc_genesis_op_with_external_signer<F>(rotation_key, signing_key, handle, service_endpoint, sign: F) -> Result<PlcGenesisOp, CryptoError> where F: FnOnce(&[u8]) -> Result<Vec<u8>, CryptoError>`; existing `build_did_plc_genesis_op` becomes a thin wrapper calling the new function
- `crates/crypto/src/lib.rs`: re-export new function

**Dependencies:** None (pure functional core, no I/O)

**Done when:** New function tests pass (callback receives CBOR bytes, result verified via `verify_genesis_op`; callback error propagated as `CryptoError::PlcOperation`); existing `build_did_plc_genesis_op` tests still pass unchanged
<!-- END_PHASE_2 -->

<!-- START_PHASE_3 -->
### Phase 3: Tauri Backend - perform_did_ceremony

**Goal:** Implement the Tauri command that orchestrates the full ceremony: fetch relay key, build signed op, post to relay, persist DID + session token.

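The orchestration and its error mapping can be sketched as a synchronous, dependency-injected function so it runs without a network or a real Keychain. Everything below is an assumption made for illustration: the `Relay` trait, `FakeRelay`, the `HashMap` standing in for Keychain, the `|`-joined placeholder for the encoded op, and the hand-written `code()` method that mimics what the design obtains from serde. The design's command is `async` and lists more error variants than are shown here.

```rust
use std::collections::HashMap;

/// Subset of the error variants named in the design.
#[derive(Debug, PartialEq)]
pub enum DIDCeremonyError {
    RelayKeyFetchFailed,
    NoRelaySigningKey,
    SigningFailed,
    DidCreationFailed,
    KeychainError,
}

impl DIDCeremonyError {
    /// Hand-written equivalent of the SCREAMING_SNAKE_CASE code the frontend matches on.
    pub fn code(&self) -> &'static str {
        match self {
            Self::RelayKeyFetchFailed => "RELAY_KEY_FETCH_FAILED",
            Self::NoRelaySigningKey => "NO_RELAY_SIGNING_KEY",
            Self::SigningFailed => "SIGNING_FAILED",
            Self::DidCreationFailed => "DID_CREATION_FAILED",
            Self::KeychainError => "KEYCHAIN_ERROR",
        }
    }
}

/// Hypothetical relay abstraction so the sketch needs no HTTP client.
pub trait Relay {
    /// `Ok(None)` models the 503 "no key provisioned" response.
    fn get_signing_key(&self) -> Result<Option<String>, String>;
    /// Returns `(did, full_session_token)` on success.
    fn post_did(&self, pending_token: &str, signed_op: &[u8]) -> Result<(String, String), String>;
}

/// Runs the ceremony steps in order, mapping each failure to one error variant.
pub fn perform_ceremony<R, S>(
    relay: &R,
    keychain: &mut HashMap<String, String>,
    device_key_id: &str,
    handle: &str,
    sign: S,
) -> Result<String, DIDCeremonyError>
where
    R: Relay,
    S: FnOnce(&[u8]) -> Result<Vec<u8>, String>,
{
    let relay_key = relay
        .get_signing_key()
        .map_err(|_| DIDCeremonyError::RelayKeyFetchFailed)?
        .ok_or(DIDCeremonyError::NoRelaySigningKey)?;

    // Placeholder for the encoded unsigned genesis op.
    let unsigned = format!("{}|{}|{}", device_key_id, relay_key, handle).into_bytes();
    let signature = sign(&unsigned).map_err(|_| DIDCeremonyError::SigningFailed)?;
    let mut signed_op = unsigned;
    signed_op.extend_from_slice(&signature);

    let pending = keychain
        .get("session-token")
        .cloned()
        .ok_or(DIDCeremonyError::KeychainError)?;
    let (did, session) = relay
        .post_did(&pending, &signed_op)
        .map_err(|_| DIDCeremonyError::DidCreationFailed)?;

    // Persist only after the relay has accepted the op.
    keychain.insert("session-token".to_string(), session);
    keychain.insert("did".to_string(), did.clone());
    Ok(did)
}

/// In-memory fake relay for exercising the sketch.
pub struct FakeRelay {
    pub key: Option<String>,
}

impl Relay for FakeRelay {
    fn get_signing_key(&self) -> Result<Option<String>, String> {
        Ok(self.key.clone())
    }
    fn post_did(&self, pending_token: &str, _signed_op: &[u8]) -> Result<(String, String), String> {
        if pending_token == "pending" {
            Ok(("did:plc:example".to_string(), "full".to_string()))
        } else {
            Err("unauthorized".to_string())
        }
    }
}
```

Keeping the I/O behind a trait and a closure is one way to make the "imperative shell" testable; whether the real command is structured this way is not stated in the design.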

**Components:**
- `apps/identity-wallet/src-tauri/src/lib.rs`:
  - `DIDCeremonyResult { did: String }` (camelCase serde)
  - `DIDCeremonyError` enum: `KeyNotFound`, `RelayKeyFetchFailed`, `NoRelaySigningKey`, `SigningFailed`, `DidCreationFailed`, `KeychainError`, `NetworkError { message }`; SCREAMING_SNAKE_CASE codes via `serde(tag = "code", rename_all = "SCREAMING_SNAKE_CASE")`
  - `async fn perform_did_ceremony(handle: String) -> Result<DIDCeremonyResult, DIDCeremonyError>`: 7-step orchestration
  - `RelayClient` gains `pub const fn base_url() -> &'static str` and `pub async fn get(path) -> Result<Response, reqwest::Error>`
  - `perform_did_ceremony` registered in `tauri::generate_handler![]`

**Dependencies:** Phase 2 (`build_did_plc_genesis_op_with_external_signer` must exist)

**Done when:** Tauri unit tests pass: `DIDCeremonyResult` serializes `did` in camelCase; `DIDCeremonyError` variants serialize as SCREAMING_SNAKE_CASE codes; app builds (`cargo build`)
<!-- END_PHASE_3 -->

<!-- START_PHASE_4 -->
### Phase 4: Frontend - ipc.ts + UI Screens

**Goal:** Wire the ceremony command into the TypeScript IPC layer and implement the loading, success, and retry UI.

**Components:**
- `apps/identity-wallet/src/lib/ipc.ts`:
  - `DIDCeremonyResult` type: `{ did: string }`
  - `DIDCeremonyError` type: `{ code: 'KEY_NOT_FOUND' | 'RELAY_KEY_FETCH_FAILED' | 'NO_RELAY_SIGNING_KEY' | 'SIGNING_FAILED' | 'DID_CREATION_FAILED' | 'KEYCHAIN_ERROR' | 'NETWORK_ERROR'; message?: string }`
  - `performDIDCeremony(handle: string): Promise<DIDCeremonyResult>`
- `apps/identity-wallet/src/lib/components/onboarding/DIDCeremonyScreen.svelte`: props `handle: string`, `onSuccess: (did: string) => void`; calls `performDIDCeremony` on mount and retry; shows `LoadingScreen` while in flight; shows inline error + Retry on failure
- `apps/identity-wallet/src/lib/components/onboarding/DIDSuccessScreen.svelte`: props `did: string`, `oncontinue: () => void`; shows truncated DID; "Continue" button
- `apps/identity-wallet/src/routes/+page.svelte`:
  - `OnboardingStep` type gains `'did_success'` and `'shamir_backup'`
  - `form` state gains `did: string` field
  - `'did_ceremony'` placeholder replaced with `DIDCeremonyScreen`
  - `'did_success'` shows `DIDSuccessScreen`
  - `'shamir_backup'` shows placeholder `<div>`

**Dependencies:** Phase 3 (Tauri command must exist for IPC to invoke)

**Done when:** App builds and runs on iOS simulator; `performDIDCeremony` invokes the Tauri command; screens render correctly; error path shows retry button
<!-- END_PHASE_4 -->

## Additional Considerations

**Error code → user-facing message mapping** lives in `DIDCeremonyScreen.svelte`, not in the Tauri command. The Rust side emits structured error codes; the UI decides what text to show. This keeps user-facing copy out of Rust and centralizes it where it can be updated without a Rust recompile.

**Retry semantics:** The Retry button re-invokes `performDIDCeremony` from scratch (step 1: device key, step 2: relay key fetch, ...). There is no partial-state resumption. This is safe because the relay's `POST /v1/dids` is idempotent for a given pending session token (it returns the existing DID if the token has already been promoted). For v1 we accept that a mid-ceremony failure before the POST may waste the pending session slot.

**Implementation scoping:** This design has 4 phases, well within the writing-plans skill's 8-phase limit.
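
The retry-from-scratch behaviour relies on the relay returning the same DID when the same pending token is promoted twice. The sketch below models that property in memory. `PromotionStore`, `promote`, and `run_with_retries` are hypothetical names, the counter-based DID is a placeholder (a real `did:plc` is derived from the genesis op's content hash), and the loop stands in for a user pressing Retry rather than for any automatic retry in the app.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory model of the relay's promotion state.
#[derive(Default)]
pub struct PromotionStore {
    promoted: HashMap<String, String>, // pending token -> DID
    next_id: u32,
}

impl PromotionStore {
    /// Models an idempotent `POST /v1/dids`: the first call for a pending token
    /// mints a DID, later calls with the same token return that same DID.
    pub fn promote(&mut self, pending_token: &str) -> String {
        if let Some(did) = self.promoted.get(pending_token) {
            return did.clone();
        }
        self.next_id += 1;
        // Placeholder identifier, not a real content-hash-derived DID.
        let did = format!("did:plc:example{}", self.next_id);
        self.promoted.insert(pending_token.to_string(), did.clone());
        did
    }
}

/// Re-runs the whole ceremony closure on every attempt, with no partial-state
/// resumption. The closure receives the 1-based attempt number.
pub fn run_with_retries<T, E>(
    max_attempts: u32,
    mut ceremony: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match ceremony(attempt) {
            Ok(value) => return Ok(value),
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => attempt += 1,
        }
    }
}
```

A case worth exercising with this model is a first attempt in which the relay promotes the token but the response is lost: the second attempt calls `promote` again and receives the DID minted the first time, so the client and relay end up agreeing.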