CLI app for developers prototyping atproto functionality
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

docs(design): postmortem on operator-mode defect + brief for redesign

Append a postmortem section to the original design plan documenting the
WaitForExternalClient passive-listener defect: the founding "walk the
user through the actions" requirement was lost between the design
session and the ACs, the in-tree RP scaffolding silently became the
load-bearing conformance driver, and the operator-facing CLI path was
derived as a passive listener that exits on Ctrl-C.

Add a design brief for the corrective work as a context-handoff
document for the new design session that will write the operator-mode
plan. The brief carries only project-specific constraints; general
design discipline lives in the new rules added to ~/.claude/CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

+241
+120
docs/design-plans/2026-04-16-test-oauth-client.md
··· 1183 1183 requires ES256K JWKS verification on the client side, we'll either 1184 1184 switch to a JWT crate that supports it or add manual k256+sha2 verify 1185 1185 helpers alongside `jsonwebtoken`. 1186 + 1187 + ## Postmortem: operator-mode passive-listener defect (2026-04-25) 1188 + 1189 + This section is appended after the fact. The design above shipped with a 1190 + defect in the interactive subcommand's operator-facing path; the corrective 1191 + work is tracked in [TODO: link new operator-mode design plan]. The original 1192 + design content above is preserved verbatim — this section explains where it 1193 + went wrong, so future readers (human or agent) can learn from it without 1194 + having to reconstruct the failure from session logs. 1195 + 1196 + ### What was supposed to be built 1197 + 1198 + In the founding design session (Claude Code session file 1199 + `2c45c1ab-cc98-4af8-9776-2343f08cba30.jsonl`, recorded 2026-04-16), the 1200 + user established the interactive subcommand's purpose with this prompt: 1201 + 1202 + > "This second mode would spin up a fake auth server, make it reachable 1203 + > somehow (ask user to run this in a reachable environment? Optionally use 1204 + > Tailscale or similar to expose the server?), and then **walk the user 1205 + > through the actions they need to take against this server (e.g. giving 1206 + > consent, denying, etc.)**" 1207 + 1208 + The operator-prompted multi-scenario flow — the tool walking the user step 1209 + by step through a sequence of scripted scenarios (approve, deny, partial 1210 + grant, DPoP nonce rotation, refresh, broken-AS iss/sub) — was the product. 1211 + 1212 + Later in the same session, the user added a scaffolding requirement: 1213 + 1214 + > "To properly test the AS listener, I think we need a tiny in-tree client. 1215 + > Note that this client can be extended for a future 'test oauth server' 1216 + > command, so it won't be unnecessary work." 1217 + 1218 + This was a request for a deterministic in-tree RP so the AS infrastructure 1219 + itself could be unit-tested under `cargo test`. It was *not* a request to 1220 + replace the operator-prompted flow with a programmatic driver. 1221 + 1222 + ### What was actually built 1223 + 1224 + The design plan above (and the implementation that followed) silently 1225 + inverted these two requirements. The in-tree `RelyingParty` became the 1226 + load-bearing conformance driver (`InteractiveDriveMode::DriveRpInProcess`), 1227 + and the operator-facing CLI path was derived by stripping the driver, 1228 + leaving only a passive listener that blocks on `tokio::signal::ctrl_c()` 1229 + and inspects whatever happened to land in the request log 1230 + (`InteractiveDriveMode::WaitForExternalClient`). 1231 + 1232 + In that mode, every sub-stage check (scope_variations, dpop_edges) is 1233 + emitted as `Skipped`, the broken-AS sub-stage (iss/sub verification) is 1234 + also `Skipped`, and only a handful of the happy-path observations 1235 + (`ClientReachedPar`, `ClientUsedPkceS256`, etc.) actually run. The CLI 1236 + subcommand the user originally specified — the entire reason this 1237 + subcommand is named `interactive` — therefore exercises a small fraction 1238 + of its check inventory. 1239 + 1240 + ### Where the inversion was inscribed in this plan 1241 + 1242 + The four points where a re-reader could have caught the divergence: 1243 + 1244 + 1. **AC4.1** ("waits for inbound requests until interrupted") describes a 1245 + passive listener. The founding prompt said "walk the user through the 1246 + actions"; AC4.1 lost that. 1247 + 2. **AC4.10** carves an explicit `Skipped` carve-out into the spec: 1248 + *"Skipped only under `WaitForExternalClient` drive mode (which cannot 1249 + re-drive a flow with a broken script) ..."* — written when the AC 1250 + should instead have asked "could the operator drive that flow?" The 1251 + parenthetical is the moment the gap was rationalized into the spec 1252 + instead of solved in the design. 1253 + 3. **The variant name itself: `InteractiveDriveMode::WaitForExternalClient`.** 1254 + The name describes a *runtime posture*, not an *operator-orchestration 1255 + action*. This is the smoking gun — naming the operator-facing mode after 1256 + what the runtime does (waits for SIGINT) rather than what the mode is 1257 + for (orchestrating an operator through scenarios) embeds the wrong 1258 + contract into the type system. 1259 + 4. **Phase 6's "Done when":** *"Manual: ... starts the server, prints the 1260 + synthetic identity, and exits cleanly on Ctrl-C."* — the entire 1261 + user-visible CLI experience for the operator-facing mode is captured as 1262 + "exits cleanly on Ctrl-C." If the design had required a "what the user 1263 + sees" walkthrough for every operating mode (mocked terminal output, 1264 + prompts, scenarios), the gap would have been impossible to overlook at 1265 + design time. 1266 + 1267 + ### Cost 1268 + 1269 + The defect was not caught by code review, by the implementation phases' 1270 + review subagents, or by the user's day-to-day inspection of the worktree. 1271 + It was caught only at the test-plan execution phase on 2026-04-25, after 1272 + the user had spent several days monitoring subagents and providing 1273 + permission approvals during implementation. By that point, ~90 commits had 1274 + landed implementing the wrong-shaped `WaitForExternalClient` and the 1275 + correct `DriveRpInProcess`. The static stages (discovery, metadata, JWKS) 1276 + and the `DriveRpInProcess` half of the interactive stage are correct and 1277 + shippable; the operator-facing path requires a fresh design and additional 1278 + implementation work. 1279 + 1280 + ### Failure modes (mapped to corrective rules) 1281 + 1282 + The four design-time failures, mapped to the rules in `~/.claude/CLAUDE.md` 1283 + under `Plugin-specific rules` → `ed3d-plugins/ed3d-plan-and-execute` → 1284 + `Design plan discipline`: 1285 + 1286 + | Failure here | Rule that would have caught it | 1287 + |---|---| 1288 + | Founding "walk the user through" requirement was lost between session and AC | **Founding-requirement traceability** (User Journeys section quoting prompts verbatim, mapped to ACs) | 1289 + | The operator-mode CLI experience was specified as "exits on Ctrl-C" | **Per-mode user-visible walkthrough** (literal terminal output mocked up per mode) | 1290 + | The in-tree RP scaffolding silently became the conformance driver | **Primary vs. scaffolding disambiguation** (clarifying question when user gives requirement A then mentions B for testing) | 1291 + | `WaitForExternalClient` named the variant after a runtime posture | **Names must describe behavior, not posture** (audit names against surrounding prose) | 1292 + 1293 + These rules were added to `~/.claude/CLAUDE.md` on 2026-04-25 in direct 1294 + response to this incident. 1295 + 1296 + ### Status 1297 + 1298 + - The static pipeline (discovery / metadata / JWKS) and the 1299 + `DriveRpInProcess` half of the interactive stage are correct as designed 1300 + above and ship as-is. 1301 + - `InteractiveDriveMode::WaitForExternalClient` and every assertion in the 1302 + design above that depends on its current shape (notably AC4.1's "waits 1303 + for inbound requests until interrupted" framing and AC4.10's 1304 + `WaitForExternalClient` carve-out) are superseded by the operator-mode 1305 + redesign at [TODO: link new operator-mode design plan].
+121
docs/design-plans/2026-04-25-test-oauth-client-operator-mode-design-brief.md
··· 1 + # Design brief: operator-mode interactive flow for `test oauth client` 2 + 3 + This is a context-handoff document for the Claude Code session that will 4 + write the new design plan for the operator-facing path of 5 + `atproto-devtool test oauth client interactive`. It is **not** the design 6 + plan; it is the minimum forensic grounding the design session needs to 7 + avoid repeating the original failure. 8 + 9 + The design discipline rules in `~/.claude/CLAUDE.md` apply unmodified. 10 + This brief provides the project-specific context those rules need to 11 + bite into; it does not restate or substitute for them. If a rule fires 12 + against something written here, the rule wins. 13 + 14 + ## Background 15 + 16 + The original design plan (`docs/design-plans/2026-04-16-test-oauth-client.md`) 17 + correctly specified static conformance checks (discovery, metadata, JWKS) 18 + and a deterministic in-process RP driver for snapshot testing. It 19 + incorrectly specified the operator-facing CLI path as a passive listener 20 + that exits on Ctrl-C. See the postmortem section at the end of that plan 21 + for the full failure analysis. 22 + 23 + This brief covers only the corrective work for the operator-facing path. 24 + Do not redesign the static pipeline or the in-process driver — those are 25 + correct and shipping. 26 + 27 + ## Founding requirement (verbatim, 2026-04-16) 28 + 29 + The operator-facing mode was originally specified by the user as: 30 + 31 + > "This second mode would spin up a fake auth server, make it reachable 32 + > somehow (ask user to run this in a reachable environment? Optionally use 33 + > Tailscale or similar to expose the server?), and then walk the user 34 + > through the actions they need to take against this server (e.g. giving 35 + > consent, denying, etc.)" 36 + 37 + This is the **product**. The in-tree RP, requested separately as scaffolding 38 + to test the AS listener, is **not** the product — it is testing 39 + infrastructure that survives in `DriveRpInProcess` mode and powers `cargo 40 + test` snapshots. 41 + 42 + ## Project-specific constraints 43 + 44 + These are constraints specific to this corrective design. General design 45 + discipline (user journeys, walkthroughs, naming, scaffolding-vs-primary 46 + disambiguation) is governed by the rules in `~/.claude/CLAUDE.md` and is 47 + not restated here. 48 + 49 + 1. **The operator-facing mode is the canonical conformance path for live 50 + client testing.** Every check the AS sub-stages can express (scope 51 + variations, DPoP edges, broken-AS iss/sub) MUST be exercisable by an 52 + operator driving a real client through prompted scenarios. "Skipped 53 + because the operator can't script the AS" is not an acceptable 54 + resolution — the tool scripts the AS *for* the operator and prompts 55 + them to drive a flow against the scripted state. 56 + 2. **Honest skips only.** Some checks genuinely cannot be observed from 57 + the request log alone (e.g., `ClientVerifiedSub` requires watching the 58 + client's post-token resource calls, and the fake AS is not a resource 59 + server). These remain `Skipped` with a precise reason naming the 60 + specific observability limitation. The list of such checks should be 61 + small and explicitly enumerated in the design. 62 + 3. **The in-process driver mode (`DriveRpInProcess`) keeps its current 63 + responsibilities.** Snapshot-testable conformance signal under `cargo 64 + test` is a separate, equally-important function and is correctly 65 + designed today. The new design does not regress it. If a check can be 66 + observed in both modes, it should be observed in both modes via shared 67 + helper functions (the `observe_par_checks`-style refactor in the 68 + existing code is a starting point). 69 + 70 + ## Open questions for the design session to resolve 71 + 72 + These are deliberately not pre-answered — the new session should reason 73 + about them with the constraints above. 74 + 75 + - Should the operator-facing mode share `ServerHandle` across scenarios 76 + (rescript between steps, request log accumulates, single bind) or 77 + restart between scenarios (cleaner state, operator must re-point client 78 + each time)? 79 + - Is the scenario list fixed in order, or can the operator opt out of 80 + individual scenarios (`--skip <scenario>` or interactive prompts to 81 + skip)? 82 + - How does the design handle scenarios the operator cannot drive 83 + (e.g., the operator's client genuinely lacks refresh support)? Skipped 84 + with a per-scenario operator-acknowledgment, or fail-closed? 85 + - What is the right keypress / signal contract per scenario? Enter to 86 + proceed, Ctrl-C to skip current step, Ctrl-C twice to abort the run? 87 + - How does the existing `iss_sub_verification` sub-stage extend to 88 + operator mode? It currently re-uses the happy-path RP after the happy 89 + path completes; the operator analogue might be "after the happy-path 90 + scenario, prompt operator to drive the same client through a flow with 91 + the AS scripted to BogusIssOnRedirect." 92 + 93 + ## Pointers 94 + 95 + - **Original design plan:** `docs/design-plans/2026-04-16-test-oauth-client.md` 96 + (read the postmortem section at the end first). 97 + - **Original design session log:** session file 98 + `2c45c1ab-cc98-4af8-9776-2343f08cba30.jsonl` (locate it under your 99 + Claude Code projects directory; the founding requirement quote is 100 + searchable on "walk the user"). 101 + - **Current code state:** 102 + `src/commands/test/oauth/client/pipeline/interactive.rs` — the 103 + `WaitForExternalClient` arm is the code being superseded; the 104 + `DriveRpInProcess` arm is correct and stays. 105 + - **Existing sub-stages:** 106 + `src/commands/test/oauth/client/pipeline/interactive/scope_variations.rs` 107 + and `dpop_edges.rs` — these are written for `DriveRpInProcess`. The new 108 + design should consider whether they grow operator-mode siblings, share 109 + inspection helpers with operator-mode code, or both. 110 + 111 + ## What this brief is NOT 112 + 113 + - Not a draft of the new design plan. Do the design work in the new 114 + session using `/ed3d-plan-and-execute:start-design-plan`. 115 + - Not a list of architectural decisions. The constraints above describe 116 + WHAT the design must satisfy; HOW it satisfies them (which variants 117 + exist, which types are introduced, how the orchestrator sequences 118 + scenarios) is the design session's job. 119 + - Not a replacement for reading the postmortem appendix on the original 120 + plan. The postmortem has specific failure modes and the rules that 121 + catch them; this brief assumes the new session has read it.