AGENTS#
Execution guide for working on the tandem codebase.
How to read these docs#
- Read
ARCHITECTURE.mdfor system boundaries and project structure. - Read
docs/design-docs/workflow.mdfor the orchestrator→agents→git workflow. - Read
docs/design-docs/jj-lib-integration.mdfor trait signatures and registration. - Read
docs/exec-plans/tech-debt-tracker.mdfor known issues to work on. - Check
docs/exec-plans/completed/for context on how each slice was built. - Implement changes via failing integration test first.
What tandem is#
Tandem applies a server-client model to jj's store layer. The server hosts
a normal jj+git colocated repo. Agents on remote machines use the tandem
binary (which embeds jj-cli with a custom tandem backend) to read and write
objects over Cap'n Proto RPC.
The server is the point of origin — it typically runs on a VM/VPS as a
long-running service. It's where git operations happen (jj git push,
jj git fetch, gh pr create). The orchestrator/teamlead runs these on the
server to ship code upstream. The tandem server is the source of truth, with
GitHub as a mirror.
Installation#
Published on crates.io as jj-tandem:
cargo install jj-tandem
Requires a Rust toolchain and Cap'n Proto compiler (capnp).
Or build from source: cargo build --release.
Single binary, three modes#
tandem up --repo <path> --listen <addr> # start background daemon
tandem serve --listen <addr> --repo <path> # foreground server (systemd/docker)
tandem [jj args...] # client mode (stock jj via CliRunner)
Plus lifecycle commands that talk to a running server:
tandem down # stop daemon
tandem status [--json] # health check
tandem logs [--level <level>] [--json] # stream logs from daemon
The client mode is CliRunner::init().add_store_factories(tandem_factories()).run().
All stock jj commands work transparently: tandem new, tandem log, tandem diff,
tandem file show, tandem bookmark create are all jj commands running through our binary.
Server mode embeds jj-lib and uses the Git backend internally. When a client
calls putObject(file, bytes), the server stores the object. Objects are real
jj-compatible blobs — jj git push on the server just works.
tandem up is the easy way to start the server — it forks tandem serve --daemon
in the background, waits for the control socket to become healthy, prints the PID,
and exits. tandem serve is the foreground mode for systemd, Docker, or debugging.
Both create a control socket so tandem down/status/logs work against either.
Source layout#
src/
main.rs CLI dispatch (clap) + CliRunner passthrough
server.rs Server — jj Git backend + Cap'n Proto RPC
control.rs Control socket — daemon management (Unix socket, JSON lines)
backend.rs TandemBackend (jj-lib Backend trait)
op_store.rs TandemOpStore (jj-lib OpStore trait)
op_heads_store.rs TandemOpHeadsStore (jj-lib OpHeadsStore trait)
rpc.rs Cap'n Proto RPC client wrapper
proto_convert.rs jj protobuf ↔ Rust struct conversion
watch.rs tandem watch command
schema/
tandem.capnp Cap'n Proto schema (Store + HeadWatcher)
tests/
common/mod.rs Test harness (server spawn, HOME isolation)
slice1-7 tests Core integration tests (file round-trip, visibility, CAS, git)
slice10-13 tests Server lifecycle tests (shutdown, control socket, up/down, logs)
Docs layout#
docs/
README.md Overview and pointers
design-docs/
workflow.md Orchestrator→agents→git workflow
jj-lib-integration.md Trait signatures and store registration
rpc-protocol.md Cap'n Proto protocol details
rpc-error-model.md Error handling conventions
server-lifecycle.md tandem up/down/status/logs design
core-beliefs.md Design principles
exec-plans/
completed/ Completion notes for all 13 slices
tech-debt-tracker.md Known issues (P1/P2/P3)
product-specs/
core-product.md Product intent and scope
Critical invariants#
-
The client is stock
jj. Tandem implements jj-lib'sBackend,OpStore, andOpHeadsStoretraits as Cap'n Proto RPC stubs. There is no customtandem new/log/describe/diffCLI — those are all jj commands. -
Tests assert on file bytes, not descriptions. Every integration test must verify file content round-trips correctly via
jj cat. Description-only assertions are insufficient. -
Help text works without a server.
tandem --help,tandem serve --help, andtandemwith no args must print usage locally. Error messages must suggest alternatives for unknown commands and include addresses for connection failures.
Help text and error handling (P0)#
These are required, not nice-to-haves. QA found agents spend 50% of their time guessing commands when help is missing.
tandem --help— prints usage without server connectiontandem serve --help— explains--listenand--repoflagstandemwith no args — prints usage, not a cryptic error- Unknown commands — suggest alternatives ("did you mean
new?") - Connection errors — include the address that was tried
- Missing args — say what's needed ("serve requires
--listen <addr>") TANDEM_SERVERenv var — fallback for--serverflag on client commandsTANDEM_WORKSPACEenv var — workspace name for client
Workflow#
See docs/design-docs/workflow.md for the full picture. Summary:
- Orchestrator sets up server on a VM/VPS:
tandem up --repo /srv/project --listen 0.0.0.0:13013 - Agents init workspaces:
tandem init --tandem-server=host:13013 ~/work/project - Agents use stock jj commands: write files,
tandem new -m "feat: add auth", etc. - Agents see each other's files:
tandem file show -r <other-commit> src/auth.rs - Orchestrator ships from server:
jj bookmark create main -r <tip>,jj git push
Git operations are server-only. Agents never touch git directly.
What exists#
All core functionality is implemented across 13 slices:
| Capability | Test coverage |
|---|---|
| Single-agent file round-trip | tests/slice1_single_agent_round_trip.rs |
| Two-agent file visibility | tests/slice2_two_agent_visibility.rs |
| Concurrent file writes converge | tests/slice3_concurrent_convergence.rs |
| Promise pipelining for writes | tests/slice4_promise_pipelining.rs |
| WatchHeads real-time notifications | tests/slice5_watch_heads.rs |
| Git push/fetch round-trip | tests/slice6_git_round_trip.rs |
| End-to-end multi-agent + git | tests/slice7_end_to_end.rs |
| Bookmark management via RPC | Slice 8 (see docs/exec-plans/completed/) |
| CLI help and discoverability | Slice 9 (see docs/exec-plans/completed/) |
| Signal handling + graceful shutdown | tests/slice10_graceful_shutdown.rs |
| Control socket + tandem status | tests/slice11_control_socket.rs |
| tandem up + tandem down | tests/slice12_up_down.rs |
| tandem logs (streaming) | tests/slice13_log_streaming.rs |
See docs/exec-plans/completed/ for detailed completion notes on each slice.
See docs/exec-plans/tech-debt-tracker.md for known issues and next work.
Testing policy#
- Integration tests are the primary source of truth.
- Tests use the
tandembinary which runs jj commands — never a separate jj binary. - Acceptance criteria assert on file bytes via
jj cat, not just log descriptions. - Local deterministic tests first; cross-machine tests second.
- Use
sprites.dev/exe.devfor distributed smoke tests. - Keep networked tests opt-in (ignored by default / env-gated).
- Run:
cargo test
QA policy#
- After major milestones, run agent-based QA (see
qa/). - QA uses subagent programs, not shell scripts — agents evaluate usability.
- Naive agent (zero-docs trial-and-error) tests discoverability.
- Workflow agent tests realistic multi-agent file collaboration.
- Stress agent tests concurrent write correctness.
- Reports go to
qa/REPORT.md. - Use opus for all implementation and evaluation models.
Debug policy#
Structured tracing is built in. Do not add ad-hoc debug prints.
Flags:
--tandem-debug--tandem-debug-format pretty|json--tandem-debug-file <path>--tandem-debug-filter <filter>
Minimum events emitted:
- command lifecycle
- RPC lifecycle
- object read/write (kind, id, size)
- CAS heads success/failure + retries
- watcher subscribe/notify/reconnect