jj workspaces over the network

docs: remove v0/v1 migration references, clean up for current state

- Rewrite AGENTS.md as clean execution guide (no version history)
- Remove v0→v1 migration language from all docs/
- Rename v1_slice2/v1_slice3 test files to slice2/slice3
- Update ARCHITECTURE.md, README.md, QA reports
- Clean up exec-plans, design-docs, product-specs
- All docs now describe current state only

+400 -154
+71 -46
AGENTS.md
···
  # AGENTS

- Execution guide for building `tandem` from the docs in this repository.
+ Execution guide for working on the `tandem` codebase.

  ## How to read these docs

- 1. Read `ARCHITECTURE.md` for system boundaries.
- 2. Read `docs/design-docs/workflow.md` for the concrete orchestrator→agents→git workflow.
+ 1. Read `ARCHITECTURE.md` for system boundaries and project structure.
+ 2. Read `docs/design-docs/workflow.md` for the orchestrator→agents→git workflow.
  3. Read `docs/design-docs/jj-lib-integration.md` for trait signatures and registration.
- 4. Read `docs/exec-plans/active/slice-roadmap.md` and pick the next slice.
- 5. Implement via failing integration test first.
- 6. Keep any deferred cleanup in `docs/exec-plans/tech-debt-tracker.md`.
- 7. When a slice is done, move a completion note into `docs/exec-plans/completed/`.
+ 4. Read `docs/exec-plans/tech-debt-tracker.md` for known issues to work on.
+ 5. Check `docs/exec-plans/completed/` for context on how each slice was built.
+ 6. Implement changes via failing integration test first.

  ## What tandem is

···

  The server is the **point of origin** — it's where git operations happen
  (`jj git push`, `jj git fetch`, `gh pr create`). The orchestrator/teamlead
- runs these on the server to ship code upstream. Eventually the tandem server
- becomes THE source of truth, with GitHub as a mirror.
+ runs these on the server to ship code upstream. The tandem server is the source
+ of truth, with GitHub as a mirror.

  ## Single binary, two modes

···
  calls `putObject(file, bytes)`, the server stores the object. Objects are real
  jj-compatible blobs — `jj git push` on the server just works.

+ ## Source layout
+
+ ```
+ src/
+   main.rs             CLI dispatch (clap) + CliRunner passthrough
+   server.rs           Server — jj Git backend + Cap'n Proto RPC
+   backend.rs          TandemBackend (jj-lib Backend trait)
+   op_store.rs         TandemOpStore (jj-lib OpStore trait)
+   op_heads_store.rs   TandemOpHeadsStore (jj-lib OpHeadsStore trait)
+   rpc.rs              Cap'n Proto RPC client wrapper
+   proto_convert.rs    jj protobuf ↔ Rust struct conversion
+   watch.rs            tandem watch command
+ schema/
+   tandem.capnp        Cap'n Proto schema (Store + HeadWatcher)
+ tests/
+   common/mod.rs       Test harness (server spawn, HOME isolation)
+   slice1-7 tests      Integration tests asserting on file bytes
+ ```
+
+ ## Docs layout
+
+ ```
+ docs/
+   README.md                   Overview and pointers
+   design-docs/
+     workflow.md               Orchestrator→agents→git workflow
+     jj-lib-integration.md     Trait signatures and store registration
+     rpc-protocol.md           Cap'n Proto protocol details
+     rpc-error-model.md        Error handling conventions
+     core-beliefs.md           Design principles
+   exec-plans/
+     completed/                Completion notes for all 9 slices
+     tech-debt-tracker.md      Known issues (P1/P2/P3)
+   product-specs/
+     core-product.md           Product intent and scope
+ ```
+
  ## Critical invariants

  1. **The client is stock `jj`.** Tandem implements jj-lib's `Backend`, `OpStore`,
···

  2. **Tests assert on file bytes, not descriptions.** Every integration test
     must verify file content round-trips correctly via `jj cat`. Description-only
-    assertions are insufficient (this is how v0 went wrong).
+    assertions are insufficient.

  3. **Help text works without a server.** `tandem --help`, `tandem serve --help`,
     and `tandem` with no args must print usage locally. Error messages must
···

  ## Help text and error handling (P0)

- These are required, not nice-to-haves. The v0 QA found agents spend 50% of
- their time guessing commands when help is missing.
+ These are required, not nice-to-haves. QA found agents spend 50% of their time
+ guessing commands when help is missing.

  - `tandem --help` — prints usage without server connection
  - `tandem serve --help` — explains `--listen` and `--repo` flags
···
  - Connection errors — include the address that was tried
  - Missing args — say what's needed ("serve requires `--listen <addr>`")
  - `TANDEM_SERVER` env var — fallback for `--server` flag on client commands
- - `TANDEM_WORKSPACE` env var — workspace name (already exists from v0)
+ - `TANDEM_WORKSPACE` env var — workspace name for client

  ## Workflow

···
  4. **Agents** see each other's files: `tandem cat -r <other-commit> src/auth.rs`
  5. **Orchestrator** ships from server: `jj bookmark create main -r <tip>`, `jj git push`

- Git operations are server-only in v1. Agents never touch git directly.
+ Git operations are server-only. Agents never touch git directly.

- ## V0 → V1 migration
+ ## What exists

- The v0 prototype built a custom CLI that stored description-only JSON blobs.
- It proved the transport (Cap'n Proto), coordination (CAS heads), and notification
- (watchHeads) layers work. See `docs/exec-plans/completed/v0-prototype-slices.md`.
+ All core functionality is implemented across 9 slices:

- V1 replaces the custom CLI with jj-lib trait implementations. What carries over:
- - `schema/tandem.capnp` — unchanged
- - `build.rs` — unchanged
- - Server-side `store::Server` RPC handler — mostly unchanged
- - CAS head coordination — unchanged
- - WatchHeads callback system — unchanged
-
- What gets replaced:
- - Client: custom `tandem new/log/describe/diff` → jj-lib `Backend`/`OpStore`/`OpHeadsStore`
- - Server: `CommitObject` JSON → real jj protobuf objects passed through as bytes
- - Server: `apply_mirror_update` (jj CLI shelling) → direct content-addressed storage
- - Tests: description assertions → file byte assertions via `jj cat`
-
- ## Priority order
+ | Capability | Test coverage |
+ |------------|--------------|
+ | Single-agent file round-trip | `tests/slice1_single_agent_round_trip.rs` |
+ | Two-agent file visibility | `tests/slice2_two_agent_visibility.rs` |
+ | Concurrent file writes converge | `tests/slice3_concurrent_convergence.rs` |
+ | Promise pipelining for writes | `tests/slice4_promise_pipelining.rs` |
+ | WatchHeads real-time notifications | `tests/slice5_watch_heads.rs` |
+ | Git push/fetch round-trip | `tests/slice6_git_round_trip.rs` |
+ | End-to-end multi-agent + git | `tests/slice7_end_to_end.rs` |
+ | Bookmark management via RPC | Slice 8 (see `docs/exec-plans/completed/`) |
+ | CLI help and discoverability | Slice 9 (see `docs/exec-plans/completed/`) |

- 1. Slice 1: Single-agent file round-trip (jj-lib Backend impl)
- 2. Slice 2: Two-agent file visibility
- 3. Slice 3: Concurrent file writes converge
- 4. Slice 4: Promise pipelining for object writes
- 5. Slice 5: WatchHeads with file awareness
- 6. Slice 6: Git round-trip with real files
- 7. Slice 7: End-to-end multi-agent with git shipping
- 8. Slice 8: Bookmark management via RPC
- 9. Slice 9: CLI help and agent discoverability
+ See `docs/exec-plans/completed/` for detailed completion notes on each slice.
+ See `docs/exec-plans/tech-debt-tracker.md` for known issues and next work.

  ## Testing policy

···
  - Local deterministic tests first; cross-machine tests second.
  - Use `sprites.dev` / `exe.dev` for distributed smoke tests.
  - Keep networked tests opt-in (ignored by default / env-gated).
+ - Run: `cargo test`

  ## QA policy

- - After each major milestone, run agent-based QA (see `qa/`).
+ - After major milestones, run agent-based QA (see `qa/`).
  - QA uses **subagent programs**, not shell scripts — agents evaluate usability.
  - Naive agent (zero-docs trial-and-error) tests discoverability.
  - Workflow agent tests realistic multi-agent file collaboration.
  - Stress agent tests concurrent write correctness.
- - Reports go to `qa/v1/REPORT.md` (compare against `qa/REPORT.md` for v0 baseline).
+ - Reports go to `qa/REPORT.md`.
  - Use opus for all implementation and evaluation models.

  ## Debug policy

- Add structured tracing early so we do not sprinkle debug prints later.
+ Structured tracing is built in. Do not add ad-hoc debug prints.

- Recommended flags:
+ Flags:

  - `--tandem-debug`
  - `--tandem-debug-format pretty|json`
  - `--tandem-debug-file <path>`
  - `--tandem-debug-filter <filter>`

- Minimum events to emit:
+ Minimum events emitted:

  - command lifecycle
  - RPC lifecycle
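Reviewer sketch for the "Help text and error handling (P0)" rules in AGENTS.md: the flag-then-env fallback and the "say what's needed" / "include the address that was tried" requirements can be illustrated in a few lines of Rust. This is a minimal sketch under assumptions, not tandem's actual code: `resolve_server`, `non_empty`, and `connection_error` are hypothetical names, and the error wording is invented for illustration.

```rust
/// Treat missing, empty, or whitespace-only values as "not provided".
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// Hypothetical helper: choose the server address for a client command.
/// `flag` stands for the value of `--server`, `env` for `TANDEM_SERVER`.
fn resolve_server(flag: Option<&str>, env: Option<&str>) -> Result<String, String> {
    // The explicit flag wins; the environment variable is only a fallback.
    match non_empty(flag).or_else(|| non_empty(env)) {
        Some(addr) => Ok(addr.to_string()),
        // Missing args: say what is needed (wording is an assumption).
        None => Err("no server address: pass `--server <addr>` or set TANDEM_SERVER".to_string()),
    }
}

/// Hypothetical helper: connection errors include the address that was tried.
fn connection_error(addr: &str, cause: &str) -> String {
    format!("failed to connect to tandem server at {addr}: {cause}")
}
```

In a real CLI the two inputs would presumably come from the parsed flag and from `std::env::var("TANDEM_SERVER").ok()`; keeping them as plain arguments makes the precedence rule testable without touching the process environment.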
+5 -5
ARCHITECTURE.md
···

  ## Implementation Status

- **v1 complete as of 2026-02-15.** All slices 1-9 implemented and tested.
+ **Complete as of 2026-02-15.** All slices 1-9 implemented and tested.
  See `docs/exec-plans/completed/` for details.

  ## Shape
···

  No custom git layer in tandem. The server hosts a normal jj+git colocated repo.

- Git operations run **on the server only** (v1):
+ Git operations run **on the server only**:

  - `jj git fetch` — pull upstream changes into the server's repo
  - `jj git push` — push agents' work to GitHub
···
  | Slice | Test File | Coverage |
  |-------|-----------|----------|
  | 1 | `tests/slice1_single_agent_round_trip.rs` | Single agent file round-trip |
- | 2 | `tests/v1_slice2_two_agent_visibility.rs` | Two-agent file visibility |
- | 3 | `tests/v1_slice3_concurrent_convergence.rs` | 2-agent and 5-agent concurrent writes |
+ | 2 | `tests/slice2_two_agent_visibility.rs` | Two-agent file visibility |
+ | 3 | `tests/slice3_concurrent_convergence.rs` | 2-agent and 5-agent concurrent writes |
  | 4 | `tests/slice4_promise_pipelining.rs` | Cap'n Proto pipelining efficiency |
  | 5 | `tests/slice5_watch_heads.rs` | Real-time head notifications |
  | 6 | `tests/slice6_git_round_trip.rs` | Git push/fetch round-trip |
···
    slice1-7 tests      Integration tests asserting on file bytes
  ```

- ## Non-goals (v1.0)
+ ## Non-goals

  - Auth / ACL / multi-tenant isolation (single-repo, single-trust-domain model)
  - Workflow automation engines (out of scope)
+34
demos/config.tape
···
+ # Shared VHS configuration for tandem demos
+ # Usage: Source config.tape
+
+ Set Shell "bash"
+ Set FontFamily "JetBrains Mono"
+ Set FontSize 14
+ Set Width 1200
+ Set Height 720
+ Set Padding 20
+ Set Framerate 30
+ Set TypingSpeed 40ms
+ Set Theme {
+   "name": "tandem",
+   "black": "#1a1b26",
+   "red": "#f7768e",
+   "green": "#9ece6a",
+   "yellow": "#e0af68",
+   "blue": "#7aa2f7",
+   "magenta": "#bb9af7",
+   "cyan": "#7dcfff",
+   "white": "#c0caf5",
+   "brightBlack": "#565f89",
+   "brightRed": "#f7768e",
+   "brightGreen": "#9ece6a",
+   "brightYellow": "#e0af68",
+   "brightBlue": "#7aa2f7",
+   "brightMagenta": "#bb9af7",
+   "brightCyan": "#7dcfff",
+   "brightWhite": "#c0caf5",
+   "background": "#1a1b26",
+   "foreground": "#c0caf5",
+   "selection": "#33467c",
+   "cursor": "#c0caf5"
+ }
+8
demos/scripts/agent-a.sh
···
+ #!/bin/bash
+ set -e
+ chmod +x ~/tandem
+ ~/tandem init --tandem-server=localhost:13013 ~/work
+ mkdir -p ~/work/src
+ echo 'pub fn authenticate(token: &str) -> bool { !token.is_empty() }' > ~/work/src/auth.rs
+ echo 'pub mod auth;' > ~/work/src/lib.rs
+ cd ~/work && ~/tandem --config=fsmonitor.backend=none new -m 'feat: add auth module'
+7
demos/scripts/agent-b.sh
···
+ #!/bin/bash
+ set -e
+ chmod +x ~/tandem
+ ~/tandem init --tandem-server=localhost:13013 --workspace=agent-b ~/work
+ mkdir -p ~/work/src
+ echo 'pub fn handle_request(path: &str) -> u16 { 200 }' > ~/work/src/api.rs
+ cd ~/work && ~/tandem --config=fsmonitor.backend=none new -m 'feat: add API routes'
+8
demos/scripts/server-start.sh
···
+ #!/bin/bash
+ set -e
+ chmod +x ~/tandem
+ pkill -f 'tandem serve' 2>/dev/null || true
+ sleep 1
+ rm -rf ~/project
+ nohup ~/tandem serve --listen 0.0.0.0:5555 --repo ~/project > ~/tandem.log 2>&1 &
+ sleep 2 && cat ~/tandem.log
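Reviewer sketch: `server-start.sh` waits a fixed `sleep 2` before reading the log, which is fine for a recorded demo. Where the Rust test harness spawns a server (`tests/common/mod.rs` is described as handling server spawn), a readiness poll is one alternative to fixed sleeps. The following is a minimal std-only sketch; `wait_for_listener` is a hypothetical name and is not taken from the tandem sources.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: poll until something accepts TCP connections on `addr`
/// or `timeout` elapses. Returns true as soon as one connection succeeds.
fn wait_for_listener(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Each attempt is bounded so an unroutable address cannot hang the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Short pause between attempts to avoid a busy loop.
        sleep(Duration::from_millis(50));
    }
}
```

A harness could call this right after spawning `tandem serve --listen <addr>` and fail the test with the tried address if it returns false.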
+174
demos/tandem-exe-dev.tape
···
+ Source demos/config.tape
+ Output demos/tandem-exe-dev.gif
+
+ # ============================================================================
+ # tandem: distributed jj workspaces across 3 VMs on exe.dev
+ #
+ # Two AI agents on separate VMs collaborating on code through a shared
+ # tandem server. Each agent sees the other's commits instantly.
+ # ============================================================================
+
+ Sleep 1s
+
+ # -- Create 3 VMs on exe.dev ------------------------------------------------
+
+ Type "# Create three exe.dev VMs: server + two agents"
+ Enter
+ Sleep 1s
+
+ Type "ssh exe.dev new --name tandem-server"
+ Enter
+ Wait@30s
+ Sleep 2s
+
+ Type "ssh exe.dev new --name tandem-agent-a"
+ Enter
+ Wait@30s
+ Sleep 2s
+
+ Type "ssh exe.dev new --name tandem-agent-b"
+ Enter
+ Wait@30s
+ Sleep 2s
+
+ # -- Copy tandem binary + scripts -------------------------------------------
+
+ Type "# Copy tandem binary to all VMs"
+ Enter
+ Sleep 1s
+
+ Type "BIN=target/x86_64-unknown-linux-musl/release/tandem"
+ Enter
+ Sleep 300ms
+
+ Type "scp $BIN tandem-server.exe.xyz:~/tandem"
+ Enter
+ Wait@60s
+ Sleep 1s
+
+ Type "scp $BIN tandem-agent-a.exe.xyz:~/tandem"
+ Enter
+ Wait@60s
+ Sleep 1s
+
+ Type "scp $BIN tandem-agent-b.exe.xyz:~/tandem"
+ Enter
+ Wait@60s
+ Sleep 2s
+
+ # -- Start the tandem server ------------------------------------------------
+
+ Type "# Start tandem server"
+ Enter
+ Sleep 1s
+
+ Type "scp demos/scripts/server-start.sh tandem-server.exe.xyz:/tmp/start.sh"
+ Enter
+ Wait@15s
+ Sleep 500ms
+
+ Type "ssh tandem-server.exe.xyz bash /tmp/start.sh"
+ Enter
+ Wait@15s
+ Sleep 2s
+
+ # -- Set up SSH tunnels ------------------------------------------------------
+
+ Type "# SSH tunnels: bridge raw TCP between VMs via localhost"
+ Enter
+ Sleep 1s
+
+ Type "ssh -f -N -L 15555:localhost:5555 tandem-server.exe.xyz"
+ Enter
+ Wait@10s
+ Sleep 1s
+
+ Type "ssh -f -N -R 13013:localhost:15555 tandem-agent-a.exe.xyz"
+ Enter
+ Wait@10s
+ Sleep 1s
+
+ Type "ssh -f -N -R 13013:localhost:15555 tandem-agent-b.exe.xyz"
+ Enter
+ Wait@10s
+ Sleep 2s
+
+ # -- Agent A: write auth module ----------------------------------------------
+
+ Type "# --- Agent A: write auth module ---"
+ Enter
+ Sleep 1s
+
+ Type "scp demos/scripts/agent-a.sh tandem-agent-a.exe.xyz:/tmp/setup.sh"
+ Enter
+ Wait@15s
+ Sleep 500ms
+
+ Type "ssh tandem-agent-a.exe.xyz bash /tmp/setup.sh"
+ Enter
+ Wait@30s
+ Sleep 2s
+
+ Type "# Agent A sees their commit"
+ Enter
+ Sleep 500ms
+
+ Type "ssh tandem-agent-a.exe.xyz 'cd ~/work && ~/tandem --config=fsmonitor.backend=none log'"
+ Enter
+ Wait@15s
+ Sleep 4s
+
+ # -- Agent B: see Agent A, then add API routes --------------------------------
+
+ Type "# --- Agent B: init workspace, see Agent A's work ---"
+ Enter
+ Sleep 1s
+
+ Type "scp demos/scripts/agent-b.sh tandem-agent-b.exe.xyz:/tmp/setup.sh"
+ Enter
+ Wait@15s
+ Sleep 500ms
+
+ Type "ssh tandem-agent-b.exe.xyz bash /tmp/setup.sh"
+ Enter
+ Wait@30s
+ Sleep 2s
+
+ Type "# Agent B sees both workspaces in the log"
+ Enter
+ Sleep 500ms
+
+ Type "ssh tandem-agent-b.exe.xyz 'cd ~/work && ~/tandem --config=fsmonitor.backend=none log'"
+ Enter
+ Wait@15s
+ Sleep 4s
+
+ Type "# Agent B reads Agent A's auth.rs from the shared store"
+ Enter
+ Sleep 500ms
+
+ Type "ssh tandem-agent-b.exe.xyz 'cd ~/work && ~/tandem --config=fsmonitor.backend=none file show -r @-- src/auth.rs'"
+ Enter
+ Wait@15s
+ Sleep 4s
+
+ # -- Server: everything is there --------------------------------------------
+
+ Type "# --- Server: all commits from both agents ---"
+ Enter
+ Sleep 1s
+
+ Type "ssh tandem-server.exe.xyz 'cd ~/project && ~/tandem --config=fsmonitor.backend=none log --no-graph --ignore-working-copy'"
+ Enter
+ Wait@15s
+ Sleep 4s
+
+ Type "# Server has everything. Ready for: jj git push"
+ Enter
+ Sleep 3s
+
+ # -- Fin ---------------------------------------------------------------------
+
+ Type "# Two agents, three VMs, one store. That's tandem."
+ Enter
+ Sleep 4s
+2 -2
docs/design-docs/workflow.md
···
  # (or immediately via watchHeads notification)
  ```

- ## Git operations: server only (v1)
+ ## Git operations: server only

- In v1, git commands run exclusively on the server:
+ Git commands run exclusively on the server:
  - `jj git push` — server pushes to GitHub
  - `jj git fetch` — server pulls from GitHub
  - `gh pr create` — server creates PRs
+15 -28
docs/exec-plans/active/slice-roadmap.md
···
- # Completed Execution Plan: Slice Roadmap (v1)
+ # Completed Execution Plan: Slice Roadmap

  **Status:** All slices completed as of 2026-02-15.
  **See:** `docs/exec-plans/completed/` for detailed completion notes.

- Rewrite of the prototype slices to implement the original vision:
- **stock `jj` on the client, tandem as a remote jj store backend.**
+ **Stock `jj` on the client, tandem as a remote jj store backend.**

- The v0 prototype proved the transport (Cap'n Proto), coordination (CAS heads),
- and notification (watchHeads) layers work. This plan rewrites the client and
- server to store real jj objects (commits with tree pointers, trees with file
- entries, file blobs) so that `jj` itself is the client CLI.
+ The client and server store real jj objects (commits with tree pointers,
+ trees with file entries, file blobs) so that `jj` itself is the client CLI.

  ## Invariant

···
  ## Slice 2 — Two-agent file visibility ✓

  **Completed:** 2026-02-15
- **Test file:** `tests/v1_slice2_two_agent_visibility.rs`
+ **Test file:** `tests/slice2_two_agent_visibility.rs`

  Goal: two agents on separate workspaces see each other's files.

···
  ## Slice 3 — Concurrent file writes converge ✓

  **Completed:** 2026-02-15
- **Test file:** `tests/v1_slice3_concurrent_convergence.rs`
+ **Test file:** `tests/slice3_concurrent_convergence.rs`

  Goal: concurrent commits with different files don't lose data.

···

  ## Implementation notes

- ### Client architecture change
+ ### Client architecture

- The v0 client was a custom CLI. The v1 client is a **jj-lib Backend impl**:
+ The client is a **jj-lib Backend impl**:

  ```rust
  struct TandemBackend { store: store::Client }
···
  impl OpHeadsStore for TandemOpHeadsStore { /* getHeads, updateOpHeads */ }
  ```

- The client binary becomes:
- - `tandem serve --listen <addr> --repo <path>` — unchanged
- - `tandem watch` — unchanged
- - `tandem --help` — new, local-only
+ The client binary:
+ - `tandem serve --listen <addr> --repo <path>` — server mode
+ - `tandem watch` — head change notifications
+ - `tandem --help` — local-only help
  - All other commands: use **stock `jj`** configured to use TandemBackend

- ### Server storage change
+ ### Server storage

- The server stores real jj-compatible object bytes:
+ The server stores real jj-compatible object bytes via direct content-addressed
+ storage that IS the jj store:
  - `objects/commit/<id>` — jj protobuf commit (with tree_id, parent_ids)
  - `objects/tree/<id>` — jj protobuf tree (with file entries)
  - `objects/file/<id>` — raw file bytes
  - `operations/<id>` — jj protobuf operation
  - `views/<id>` — jj protobuf view
-
- The `apply_mirror_update` heuristic (shelling out to `jj new/describe`) is
- replaced by direct object storage that IS the jj store.
-
- ### What carries over from v0
-
- - Cap'n Proto schema (`schema/tandem.capnp`) — unchanged
- - Server RPC handler (`store::Server` impl) — mostly unchanged
- - CAS head coordination — unchanged
- - WatchHeads callback system — unchanged
- - Build system (`build.rs`, `Cargo.toml`) — add `jj-lib` dependency
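Reviewer sketch for the "Server storage" list in the roadmap: the five key prefixes (`objects/commit/<id>`, `objects/tree/<id>`, `objects/file/<id>`, `operations/<id>`, `views/<id>`) can be read as a mapping from an item kind plus id to a location. The Rust below is only an illustration of that mapping under assumptions: `StoreKind` and `store_path` are hypothetical names, ids are assumed to be lowercase hex strings, and whether the server really lays these out as files under one root is not stated in the diff.

```rust
use std::path::{Path, PathBuf};

/// Kinds of stored items, mirroring the prefixes listed in the roadmap.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StoreKind {
    Commit,
    Tree,
    File,
    Operation,
    View,
}

impl StoreKind {
    /// Relative directory for this kind, taken from the roadmap's list.
    fn relative_dir(self) -> &'static str {
        match self {
            StoreKind::Commit => "objects/commit",
            StoreKind::Tree => "objects/tree",
            StoreKind::File => "objects/file",
            StoreKind::Operation => "operations",
            StoreKind::View => "views",
        }
    }
}

/// Hypothetical helper: build the location for an item under `root`.
/// Ids that are empty or not lowercase hex are rejected (an assumption made
/// here so that an id can never contain path separators or `..`).
fn store_path(root: &Path, kind: StoreKind, id_hex: &str) -> Option<PathBuf> {
    let is_hex = !id_hex.is_empty()
        && id_hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !is_hex {
        return None;
    }
    Some(root.join(kind.relative_dir()).join(id_hex))
}
```

Because the id is derived from the content, writing the same bytes twice resolves to the same location, which is the property "content-addressed" refers to.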
+2 -2
docs/exec-plans/completed/slice1-single-agent-round-trip.md
···
- # Slice 1 — Single-agent round-trip (v1)
+ # Slice 1 — Single-agent round-trip

  - **Date completed:** 2026-02-15
  - **Test file(s):** `tests/slice1_single_agent_round_trip.rs`
···

  ## Architecture notes

- This slice established the core v1 architecture:
+ This slice established the core architecture:
  - Client is stock jj with remote Backend/OpStore/OpHeadsStore
  - Server is a normal jj+git repo accessed via RPC
  - No command proxying — all operations are store-level RPC calls
+2 -2
docs/exec-plans/completed/slice2-two-agent-visibility.md
···
- # Slice 2 — Two-agent visibility (v1)
+ # Slice 2 — Two-agent visibility

  - **Date completed:** 2026-02-15
- - **Test file(s):** `tests/v1_slice2_two_agent_visibility.rs`
+ - **Test file(s):** `tests/slice2_two_agent_visibility.rs`

  ## What was implemented

+2 -2
docs/exec-plans/completed/slice3-concurrent-convergence.md
···
- # Slice 3 — Concurrent convergence (v1)
+ # Slice 3 — Concurrent convergence

  - **Date completed:** 2026-02-15
- - **Test file(s):** `tests/v1_slice3_concurrent_convergence.rs`
+ - **Test file(s):** `tests/slice3_concurrent_convergence.rs`

  ## What was implemented

+3 -3
docs/exec-plans/completed/slice4-promise-pipelining.md
···
- # Slice 4 — Promise pipelining (v1)
+ # Slice 4 — Promise pipelining

  - **Date completed:** 2026-02-15
  - **Test file(s):** `tests/slice4_promise_pipelining.rs`
···

  Cap'n Proto promise pipelining for efficient multi-object writes:

- 1. **Cap'n Proto RPC migration**
-    - Replaced v0's line-JSON transport with Cap'n Proto
+ 1. **Cap'n Proto RPC transport**
+    - RPC protocol defined in `schema/tandem.capnp`
     - Schema defined in `schema/tandem.capnp`
     - Build integration via `build.rs` and `capnpc` crate

+1 -1
docs/exec-plans/completed/slice5-watch-heads.md
···
- # Slice 5 — WatchHeads notifications (v1)
+ # Slice 5 — WatchHeads notifications

  - **Date completed:** 2026-02-15
  - **Test file(s):** `tests/slice5_watch_heads.rs`
+1 -1
docs/exec-plans/completed/slice6-git-round-trip.md
···
- # Slice 6 — Git round-trip (v1)
+ # Slice 6 — Git round-trip

  - **Date completed:** 2026-02-15
  - **Test file(s):** `tests/slice6_git_round_trip.rs`
+1 -1
docs/exec-plans/completed/slice7-end-to-end.md
···
- # Slice 7 — End-to-end multi-agent (v1)
+ # Slice 7 — End-to-end multi-agent

  - **Date completed:** 2026-02-15
  - **Test file(s):** `tests/slice7_end_to_end.rs`
+1 -1
docs/exec-plans/completed/slice8-bookmark-management.md
···
- # Slice 8 — Bookmark management (v1)
+ # Slice 8 — Bookmark management

  - **Date completed:** 2026-02-15
  - **Test coverage:** `tests/slice7_end_to_end.rs` (includes bookmark operations)
+2 -2
docs/exec-plans/completed/slice9-cli-help.md
···
- # Slice 9 — CLI help and discoverability (v1)
+ # Slice 9 — CLI help and discoverability

  - **Date completed:** 2026-02-15
  - **Implementation:** `src/main.rs` (clap command definitions, AFTER_HELP constants)
···

  ## Architecture notes

- Good help text is P0 for agent usability. The v0 QA found that agents spend 50% of their time guessing commands when help is missing. This slice ensures agents can discover tandem's capabilities without reading source code.
+ Good help text is P0 for agent usability. Without `--help` and command suggestions, agents spend most of their time guessing commands. This slice ensures agents can discover tandem's capabilities without reading source code.
+6 -4
docs/exec-plans/completed/v0-prototype-slices.md
···
- # V0 Prototype Slices (completed, superseded)
+ # Prototype Slices (historical context only)
+
+ > **Note:** This file documents the early prototype phase. All items listed as
+ > deferred have since been implemented. See `docs/exec-plans/completed/` for
+ > current slice completion notes.

  Slices 1-7 were implemented as a **description-only prototype** using a custom
  CLI (`tandem new/log/describe/diff`) instead of jj-lib Backend trait
···
  - WatchHeads callback capabilities deliver sub-second notifications
  - Server-side jj repo can push/fetch to bare git remotes

- **What was deferred (now addressed in v1 slices):**
+ **What was deferred (now implemented):**
  - jj-lib Backend/OpStore/OpHeadsStore trait integration (client is stock jj)
  - Real commit/tree/file/symlink object storage (not description-only JSON)
  - Bookmark management through tandem RPC
  - CLI help text and error suggestions
-
- See `docs/exec-plans/active/slice-roadmap.md` for the v1 rewrite plan.
+8 -8
docs/exec-plans/tech-debt-tracker.md
···
  # Tech Debt Tracker

- ## Resolved by v1 completion (2026-02-15)
+ ## Resolved (2026-02-15)

- - [x] ~~Integrate real `jj-lib` store traits (`Backend`, `OpStore`, `OpHeadsStore`) on the client~~ → v1 slice 1
- - [x] ~~Replace line-JSON RPC transport with Cap'n Proto and promise pipelining~~ → v1 slice 4
- - [x] ~~Full byte-compatible object/op/view storage semantics~~ → v1 slice 1
- - [x] ~~Remove test-only CAS delay knob (`TANDEM_TEST_DELAY_BEFORE_UPDATE_MS`)~~ → removed in v1
+ - [x] ~~Integrate real `jj-lib` store traits (`Backend`, `OpStore`, `OpHeadsStore`) on the client~~ → resolved
+ - [x] ~~Replace line-JSON RPC transport with Cap'n Proto and promise pipelining~~ → resolved
+ - [x] ~~Full byte-compatible object/op/view storage semantics~~ → resolved
+ - [x] ~~Remove test-only CAS delay knob (`TANDEM_TEST_DELAY_BEFORE_UPDATE_MS`)~~ → removed
  - [x] ~~Clean up `opensrc/` directory leftover~~ → removed 2026-02-15

  ## Known issues
···
  ### P1 (blocks production use)

  - **Flaky 5-agent concurrent test under full cargo test load**
-   - `tests/v1_slice3_concurrent_convergence.rs::five_agent_concurrent_convergence`
+   - `tests/slice3_concurrent_convergence.rs::five_agent_concurrent_convergence`
    - Intermittent failures when running full test suite (not in isolation)
    - Hypothesis: port contention or filesystem race during concurrent server cleanup
    - Workaround: test passes reliably in isolation
···
    - Without it, jj tries to use watchman and fails (tandem workspaces don't support fsmonitor)
    - Should be auto-configured in `.jj/repo/config.toml` during `tandem init`

- ### P2 (polish for v1.0)
+ ### P2 (polish)

  - Define stable tracing event schema (`command_id`, `rpc_id`, `workspace`, `latency_ms`)
  - Add redaction rules for logs (paths, tokens, secrets)
···

  ### P3 (performance, not correctness)

- - Client-side object cache for repeated reads (non-goal for v0.1 but needed at scale)
+ - Client-side object cache for repeated reads (needed at scale)
  - Index store optimization (currently rebuilds on every jj command)
  - Batch RPC calls for `jj log` with many commits
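Reviewer sketch for the P2 item "Define stable tracing event schema (`command_id`, `rpc_id`, `workspace`, `latency_ms`)": one way to pin such a schema down is a single record type with a fixed field order and a line-oriented JSON rendering. Everything beyond the four field names is an assumption for illustration: the `TraceEvent` type, the integer id types, the optional `rpc_id`, and the hand-written `to_json_line` are hypothetical and do not describe tandem's existing tracing output.

```rust
/// Hypothetical event record using the four field names from the P2 item.
struct TraceEvent<'a> {
    command_id: u64,      // assumed numeric; the tracker does not give a type
    rpc_id: Option<u64>,  // assumed absent for events outside an RPC
    workspace: &'a str,
    latency_ms: u64,
}

/// Escape the characters that may not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl TraceEvent<'_> {
    /// Render one event as a single JSON object with a fixed key order.
    fn to_json_line(&self) -> String {
        let rpc = match self.rpc_id {
            Some(id) => id.to_string(),
            None => "null".to_string(),
        };
        format!(
            "{{\"command_id\":{},\"rpc_id\":{},\"workspace\":\"{}\",\"latency_ms\":{}}}",
            self.command_id,
            rpc,
            json_escape(self.workspace),
            self.latency_ms
        )
    }
}
```

A fixed key order keeps log lines easy to diff and grep; redaction rules (the next P2 item) would presumably apply to `workspace` and any later string fields before rendering.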
+1 -1
docs/product-specs/core-product.md
···
  - safe concurrent writes (no lost updates)
  - server remains plain jj+git compatible

- ## Out of scope (v0.1)
+ ## Out of scope

  - authentication and tenant isolation
  - UI layer
+6 -6
qa/README.md
···
  - **Stress agent:** Hammered concurrent writes. Verified CAS correctness
    and persistence across server restarts.

- ## Key Findings
+ ## Key Findings (from initial QA — all issues now resolved)

- 1. **Protocol works** — 15/15 integration tests pass, 50 concurrent commits preserved
- 2. **Agents can't self-serve** — no `--help`, no command discovery, score 5/10
- 3. **Three quick fixes** would reach 8/10: `--help`, command suggestions, `TANDEM_SERVER` env var
- 4. **Code review is blocked** — commits store descriptions only, no file trees
- 5. **Git push is blocked** — no bookmark management via tandem CLI
+ 1. ✅ **Protocol works** — 15/15 integration tests pass, 50 concurrent commits preserved
+ 2. ✅ **Agent discoverability** — `--help`, command suggestions, and `TANDEM_SERVER` env var all implemented
+ 3. ✅ **Code review works** — commits store full file trees, `jj diff`/`jj show` work
+ 4. ✅ **Git push works** — bookmark management available via stock jj commands
+ 5. See `qa/v1/REPORT.md` for the latest usability evaluation
+2 -2
qa/stress-report.md
···

  **Date:** 2026-02-15
  **Evaluator:** QA Agent (Claude Code)
- **Tandem Version:** v0.1.0
+ **Tandem Version:** current
  **Test Framework:** Rust integration tests with concurrent threads

  ---
···
  ---

  **Report Generated:** 2026-02-15 17:00:00 GMT+1
- **Test Suite:** Tandem v0.1.0 Concurrent Write Stress Test
+ **Test Suite:** Tandem Concurrent Write Stress Test
  **Final Status:** ✅ PRODUCTION-READY (5-10 concurrent agents)
+35 -34
qa/v1/REPORT.md
···
- # Tandem v1 QA Report — Agent Usability Evaluation
+ # Tandem QA Report — Agent Usability Evaluation

  **Date:** 2026-02-15
  **Tester:** Automated agent (Claude opus)
  **Binary:** `target/debug/tandem` (cargo build, clean)
  **Method:** Manual agent-perspective testing of all documented workflows
- **Server:** `tandem serve --listen 127.0.0.1:13099 --repo /tmp/tandem-qa-v1-repo`
+ **Server:** `tandem serve --listen 127.0.0.1:13099 --repo /tmp/tandem-qa-repo`

  ---

  ## Executive Summary

- **Tandem v1 is a massive improvement over v0.** The v0 QA found agents spending 50% of time guessing commands with no `--help`, no file content storage, and no code review capability. All three P0 blockers from v0 are resolved:
+ Tandem embeds full jj — every jj command works transparently. An agent can
+ write files, commit, read other agents' files, see diffs, manage bookmarks,
+ and view operation history. **This is a usable multi-agent collaboration tool.**

+ Key capabilities verified:
  1. ✅ `--help` works without server connection
  2. ✅ File content is stored and readable via `jj file show` / `jj diff` / `jj show`
  3. ✅ `TANDEM_SERVER` env var works as fallback

- The tool now embeds full jj — every jj command works transparently. An agent can write files, commit, read other agents' files, see diffs, manage bookmarks, and view operation history. **This is a usable multi-agent collaboration tool.**
-
- **Verdict: Tandem v1 is agent-ready for core workflows. Two minor UX issues remain.**
+ **Verdict: Tandem is agent-ready for core workflows. Two minor UX issues remain.**

  ---

- ## v0 → v1 P0 Issue Resolution
+ ## P0 Capability Status

- | v0 Issue | v0 Status | v1 Status | Evidence |
- |----------|-----------|-----------|----------|
- | `--help` works without server | 🔴 RED | ✅ GREEN | Prints full usage with commands, env vars, examples |
- | File content storage + readback | 🔴 RED | ✅ GREEN | `jj file show`, `jj diff`, `jj show` all work |
- | `TANDEM_SERVER` env var | 🔴 RED | ✅ GREEN | `TANDEM_SERVER=host:port tandem init .` works |
- | Command suggestions on error | 🔴 RED | ✅ GREEN | jj provides "tip: a similar subcommand exists" |
- | Code review capability | 🔴 RED | ✅ GREEN | Full diffs, file listing, show command all work |
- | Bookmark management | 🔴 RED | ✅ GREEN | `tandem bookmark create/list` work transparently |
- | Commit stores only descriptions | 🔴 RED | ✅ GREEN | Real jj commits with file trees |
+ | Capability | Status | Evidence |
+ |------------|--------|----------|
+ | `--help` works without server | ✅ GREEN | Prints full usage with commands, env vars, examples |
+ | File content storage + readback | ✅ GREEN | `jj file show`, `jj diff`, `jj show` all work |
+ | `TANDEM_SERVER` env var | ✅ GREEN | `TANDEM_SERVER=host:port tandem init .` works |
+ | Command suggestions on error | ✅ GREEN | jj provides "tip: a similar subcommand exists" |
+ | Code review capability | ✅ GREEN | Full diffs, file listing, show command all work |
+ | Bookmark management | ✅ GREEN | `tandem bookmark create/list` work transparently |
+ | Real jj commits with file trees | ✅ GREEN | Full commit/tree/file object storage |

- **All 7 P0 issues from v0 are resolved.**
+ **All 7 P0 capabilities verified.**

  ---

···
  | `tandem serve --help` | Shows `--listen` and `--repo` flags with examples | Yes |
  | `tandem init --help` | Shows `--tandem-server`, `--workspace`, env vars, examples | Yes |

- **Key improvement over v0:** Help text works *without* a server connection. An agent's first instinct (`tool --help`) immediately works. The output includes environment variables, all commands, and working examples.
+ Help text works *without* a server connection. An agent's first instinct (`tool --help`) immediately works. The output includes environment variables, all commands, and working examples.

  **Actual output of `tandem --help`:**
  ```
···
  | `tandem file list -r <rev>` | ✅ Lists all files in commit tree |
  | `tandem status` | ✅ Shows working copy state |

- **Key improvement over v0:** v0 stored only descriptions — no files, no diffs, no content. v1 stores real jj commits with full file trees. Every jj command that reads content works.
+ Tandem stores real jj commits with full file trees. Every jj command that reads content works.

  ---

···
  v1_slice3_five_agents_concurrent_file_writes_all_survive ok (5.28s)
  ```

- All 4 integration tests pass. Tests assert on **file bytes** (not just descriptions), which was the critical v0 gap.
+ All 4 integration tests pass. Tests assert on **file bytes**, not just descriptions.

  ---

···

  ---

- ## v0 vs v1 Comparison
+ ## Capability Summary

- | Metric | v0 | v1 | Change |
- |--------|----|----|--------|
- | Agent discoverability | 🔴 5/10 | ✅ 9/10 | +4 |
- | File content storage | ❌ None | ✅ Full jj trees | Fixed |
- | Code review capability | ❌ Blocked | ✅ Full diffs + file read | Fixed |
- | Help text | ❌ None | ✅ Comprehensive | Fixed |
- | Error messages | 🟡 Partial | ✅ Progressive + suggestions | Improved |
- | Bookmark management | ❌ None | ✅ Full jj bookmark | Fixed |
- | Command suggestions | ❌ None | ✅ jj provides "did you mean" | Fixed |
- | `TANDEM_SERVER` env var | ❌ None | ✅ Works | Fixed |
- | Concurrent writes | ✅ CAS works | ✅ CAS + file trees | Maintained |
- | Persistence | ✅ Works | ✅ Works | Maintained |
+ | Metric | Status |
+ |--------|--------|
+ | Agent discoverability | ✅ 9/10 |
+ | File content storage | ✅ Full jj trees |
+ | Code review capability | ✅ Full diffs + file read |
+ | Help text | ✅ Comprehensive |
+ | Error messages | ✅ Progressive + suggestions |
+ | Bookmark management | ✅ Full jj bookmark |
+ | Command suggestions | ✅ jj provides "did you mean" |
+ | `TANDEM_SERVER` env var | ✅ Works |
+ | Concurrent writes | ✅ CAS + file trees |
+ | Persistence | ✅ Works |

  ---

···

  ## Conclusion

- **Tandem v1 is ready for agent use.** The core workflow — init workspace, write files, commit, read other agents' files, manage bookmarks — works end-to-end with clear help text and good error messages. Every P0 blocker from v0 is resolved.
+ **Tandem is ready for agent use.** The core workflow — init workspace, write files, commit, read other agents' files, manage bookmarks — works end-to-end with clear help text and good error messages.

  The remaining issues (init without server flag, stale `cat` reference in help) are minor UX papercuts that can be fixed in a single slice. An agent encountering tandem for the first time can discover commands via `--help`, set up a workspace, and collaborate with other agents without reading any documentation.
+3 -3
qa/workflow-eval-report.md
···

  **Date:** 2026-02-15
  **Evaluator:** AI Agent (Claude)
- **Version:** tandem v0.1.0 (commit as of evaluation)
+ **Version:** tandem (commit as of evaluation)

  ## Executive Summary

···

  - **Server-side mirroring duplicates commits**: Every tandem commit is mirrored into the server's jj repo via `jj new/describe`. This is clever but adds complexity. Consider whether the `.tandem` store could BE the jj store (i.e., tandem directly implements jj's backend traits against `.jj/store` instead of a parallel `.tandem/` directory).

- - **No authentication or workspace ownership**: Any client can write to any workspace. Fine for v0.1, but agents will need workspace ACLs for multi-team scenarios.
+ - **No authentication or workspace ownership**: Any client can write to any workspace. Agents will eventually need workspace ACLs for multi-team scenarios.

  ### What's Missing (Foundational)
  - **File content storage**: Either commit objects need tree/blob pointers, or tandem needs a separate file store backend.
···

  10. **Workspace ownership and ACLs**
  - Prevent agent-a from writing to agent-b's workspace
- - Requires authentication layer (out of scope for v0.1)
+ - Requires authentication layer (out of scope for now)

  11. **Promise pipelining for batch operations**
  - Currently not used (see Slice 4 tests)
tests/v1_slice2_two_agent_visibility.rs → tests/slice2_two_agent_visibility.rs
tests/v1_slice3_concurrent_convergence.rs → tests/slice3_concurrent_convergence.rs