11-# merge-bench
11+# Merge Strategy Benchmark Report (v1)
2233-Benchmark harness comparing merge strategies for collaborative text editing via AT Protocol PDS.
33+**Date:** 2026-03-13
44+**PDS:** bluesky-pds.t1cc.commoninternet.net (v0.4.208)
55+**Modes:** Local (no network) and Remote (real PDS round-trips)
4655-## Why
77+## Strategies
6877-Lichen uses Yrs CRDTs to enable conflict-free collaborative editing on sites stored in AT Protocol PDS repositories. There are several possible architectures for how collaborators merge their changes:
99+1. **plain-git** -- Standard `git merge` 3-way merge
1010+2. **yrs-diff** -- git merge with git-yrs-merge CRDT driver (diff-only mode)
1111+3. **yrs-sidecar** -- git merge with git-yrs-merge CRDT driver + .yrs/ sidecar files
1212+4. **yrs-on-pds** -- Pure Yrs CRDT merge (local: in-memory from .pds-sim/; PDS: pds-yrs save/merge/load)
81399-1. **Plain git** -- standard 3-way merge, conflicts require manual resolution
1010-2. **git + yrs-merge (diff)** -- git merge with a CRDT merge driver that resolves conflicts automatically using Yrs document state
1111-3. **git + yrs-merge (sidecar)** -- same as above, but with `.yrs/` sidecar files tracked in git for richer CRDT state
1212-4. **Yrs-on-PDS** -- pure CRDT merge via `pds-yrs`, no git involved; each collaborator saves Yrs state to PDS and merges happen server-side
1414+## Default Matrix (10/50/200 files, 2 collaborators)
13151414-This tool generates synthetic repositories with configurable parameters (file count, file size, collaborator count, conflict rate) and measures each strategy's merge time, conflict count, and success rate.
1616+### Local
15171616-Both **local** (no network, pure computation) and **remote** (real PDS round-trips via `git-remote-pds` and `pds-yrs`) modes are supported.
1818+| Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
1919+|-------|----------|-----------|-------------|-------------|----------------|------------|
2020+| 10 | low | 25 | 0 | 45 | 69 | 3 |
2121+| 10 | medium | 33 | 0 | 127 | 121 | 5 |
2222+| 10 | high | 67 | 1 | 118 | 260 | 6 |
2323+| 50 | low | 118 | 0 | 187 | 245 | 55 |
2424+| 50 | medium | 61 | 0 | 298 | 576 | 34 |
2525+| 50 | high | 198 | 0 | 365 | 721 | 52 |
2626+| 200 | low | 386 | 0 | 441 | 292 | 119 |
2727+| 200 | medium | 160 | 0 | 164 | 325 | 102 |
2828+| 200 | high | 139 | 0 | 212 | 363 | 111 |
17291818-## Usage
3030+### Remote (PDS)
19312020-```bash
2121-cargo build
2222-```
3232+| Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
3333+|-------|----------|-----------|-------------|-------------|----------------|------------|
3434+| 10 | low | 6676 | 0 | 7649 | 8303 | 3793 |
3535+| 10 | medium | 6728 | 1 | 7005 | 6643 | 4066 |
3636+| 10 | high | 6666 | 1 | 7363 | 6482 | 3717 |
3737+| 50 | low | 7228 | 0 | 7006 | 6901 | 9042 |
3838+| 50 | medium | 6618 | 0 | 6845 | 6965 | 8737 |
3939+| 50 | high | 6984 | 1 | 7957 | 6972 | 9386 |
4040+| 200 | low | 6966 | 0 | 7064 | 7567 | 29411 |
4141+| 200 | medium | 11105 | 0 | 7335 | 7056 | 27647 |
4242+| 200 | high | 7294 | 1 | 7792 | 7370 | 29289 |
23432424-### Run benchmarks
4444+## Stress Test (1000 files, 2 collaborators)
25452626-```bash
2727-# Local mode (default) -- measures pure merge cost
2828-merge-bench run # default matrix (10/50/200 files x low/med/high conflict)
2929-merge-bench run --quick # quick smoke test (2 scenarios)
3030-merge-bench run --stress # 1000 files
3131-merge-bench run --filesize # file size sweep (1KB/10KB/50KB)
3232-merge-bench run --multi-collab # 5 collaborators
3333-merge-bench run --conflict # guaranteed conflicts (same lines edited)
4646+### Local
34473535-# Remote PDS mode -- includes real network round-trips
3636-merge-bench run --pds # default matrix against real PDS
3737-merge-bench run --conflict --pds # conflict test against real PDS
4848+| Files | Conflict | 1-git (ms) | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
4949+|-------|----------|-----------|-------------|----------------|------------|
5050+| 1000 | low | 161 | 458 | 1071 | 645 |
5151+| 1000 | medium | 165 | 410 | 910 | 571 |
5252+| 1000 | high | 91 | 341 | 316 | 464 |
38533939-# Group related runs into a named set
4040-merge-bench run --set v1 # bench-results/v1/default-local/
4141-merge-bench run --pds --set v1 # bench-results/v1/default-pds/
4242-merge-bench run --stress --set v1 # bench-results/v1/stress-local/
4343-```
5454+### Remote (PDS)
44554545-### Generate a repo manually
5656+All 3 scenarios failed with `413 Payload Too Large` -- 1000-file sites exceed AT Protocol record size limits.
46574747-```bash
4848-merge-bench gen --files 50 --avg-size 1000 --collaborators 3 --edits 10 --conflict-rate medium --output /tmp/test-repo
4949-```
5858+## File Size Sweep (50 files, 2 collaborators, medium conflict)
50595151-### Format existing results
6060+### Local
52615353-```bash
5454-merge-bench report --input bench-results/v1/default-local/results.json --format markdown
5555-merge-bench report --input bench-results/v1/default-pds/results.json --format csv
5656-```
6262+| Avg Size | 1-git (ms) | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
6363+|----------|-----------|-------------|----------------|------------|
6464+| 1 KB | 61 | 78 | 291 | 50 |
6565+| 10 KB | 106 | 360 | 437 | 167 |
6666+| 50 KB | 206 | 751 | 790 | 431 |
57675858-## Output structure
6868+### Remote (PDS)
59696060-Each run produces a folder with:
6161-- `results.json` -- raw benchmark data
6262-- `report.md` -- formatted markdown table
6363-- `results.csv` -- CSV for spreadsheet import
7070+| Avg Size | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
7171+|----------|-----------|-------------|-------------|----------------|------------|
7272+| 1 KB | 6514 | 1 | 6870 | 6937 | 9475 |
7373+| 10 KB | 7582 | 0 | 7528 | 7309 | 9252 |
7474+| 50 KB | error (413) | - | - | - | - |
64756565-Runs are grouped by set name under `bench-results/`:
7676+The 50KB scenario exceeded payload limits on pds-yrs save.
7777+7878+## Multi-Collaborator (5 collaborators)
7979+8080+### Local
8181+8282+| Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
8383+|-------|----------|-----------|-------------|-------------|----------------|------------|
8484+| 50 | low | 277 | 0 | 360 | 921 | 60 |
8585+| 200 | low | 719 | 0 | 957 | 1337 | 348 |
8686+| 50 | medium | 242 | 0 | 892 | 894 | 138 |
8787+| 200 | medium | 349 | 0 | 854 | 601 | 236 |
8888+| 50 | high | 72 | 1 | 628 | 843 | 39 |
8989+| 200 | high | 154 | 1 | 1069 | 1081 | 153 |
9090+9191+### Remote (PDS)
9292+9393+| Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
9494+|-------|----------|-----------|-------------|-------------|----------------|------------|
9595+| 50 | low | 8946 | 0 | 9576 | 10937 | 9323 |
9696+| 200 | low | 11684 | 0 | 15148 | 10021 | 32425 |
9797+| 50 | medium | 10874 | 0 | 10001 | 10617 | 8955 |
9898+| 200 | medium | 16152 | 1 | 10390 | 14696 | 30460 |
9999+| 50 | high | 12293 | 1 | 10480 | 9973 | 8799 |
100100+| 200 | high | 10207 | 1 | 10787 | 11771 | 31184 |
661016767-```
6868-bench-results/
6969- v1/
7070- report.md # combined report for this set
7171- default-local/
7272- default-pds/
7373- stress-local/
7474- conflict-local/
7575- conflict-pds/
7676- ...
7777-```
102102+## Guaranteed Conflict (2 collaborators, same lines edited)
103103+104104+### Local
105105+106106+| Files | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
107107+|-------|-----------|-------------|-------------|----------------|------------|
108108+| 10 | 85 | 10 | 658 | 1009 | 7 |
109109+| 50 | 307 | 20 | 3438 | 1996 | 106 |
110110+| 200 | 802 | 20 | 1157 | 1485 | 204 |
111111+112112+### Remote (PDS)
113113+114114+| Files | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
115115+|-------|-----------|-------------|-------------|----------------|------------|
116116+| 10 | 6363 | 10 | 6902 | 7004 | 4335 |
117117+| 50 | 6723 | 20 | 7853 | 10435 | 9840 |
118118+| 200 | 6782 | 20 | 7308 | 7997 | 30904 |
119119+120120+## Key Findings
781217979-## PDS mode setup
122122+### Local performance
801238181-Remote mode requires credentials in `testuser.toml`:
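124124+- **Plain git is the fastest git-based strategy** (25-386ms in the default matrix) and the fastest overall at 1000 files (91-165ms) and at 10-50 KB average file size
125125+- **Yrs CRDT merge (strategy 4)** is the fastest strategy in every other local scenario (3-348ms across the default, multi-collaborator, and guaranteed-conflict runs) -- pure in-memory merge with no git overhead
126126+- **git-yrs-merge strategies** (2, 3) usually run 1-10x slower than plain git due to CRDT driver invocations (up to ~12x in the guaranteed-conflict and 5-collaborator runs), but **eliminate all merge conflicts**
127127+- **Plain git reports conflicts** in some medium/high conflict scenarios and in every guaranteed-conflict run; CRDT strategies always succeed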
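The overhead figures above can be re-derived from the Default Matrix local table. The following is a minimal sketch, not part of merge-bench: the `LOCAL_DEFAULT` list and the `ratios_vs_git` helper are names made up for illustration, and the numbers are copied by hand from the table.

```python
# Rows copied from the "Default Matrix > Local" table:
# (files, conflict level, git ms, diff ms, sidecar ms, yrs ms)
LOCAL_DEFAULT = [
    (10, "low", 25, 45, 69, 3),
    (10, "medium", 33, 127, 121, 5),
    (10, "high", 67, 118, 260, 6),
    (50, "low", 118, 187, 245, 55),
    (50, "medium", 61, 298, 576, 34),
    (50, "high", 198, 365, 721, 52),
    (200, "low", 386, 441, 292, 119),
    (200, "medium", 160, 164, 325, 102),
    (200, "high", 139, 212, 363, 111),
]


def ratios_vs_git(rows):
    """Return, per scenario, each strategy's merge time divided by plain git's."""
    out = []
    for files, conflict, git, diff, sidecar, yrs in rows:
        out.append({
            "scenario": f"{files}/{conflict}",
            "diff": diff / git,
            "sidecar": sidecar / git,
            "yrs": yrs / git,
        })
    return out


# Print one line per scenario; a ratio above 1 means slower than plain git.
for r in ratios_vs_git(LOCAL_DEFAULT):
    print(f"{r['scenario']:>10}  diff {r['diff']:.1f}x  "
          f"sidecar {r['sidecar']:.1f}x  yrs {r['yrs']:.1f}x")
```

The same helper works on the other local tables once their rows are pasted in the same column order.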
821288383-```toml
8484-pds = "https://your-pds.example.com"
8585-handle = "user.your-pds.example.com"
8686-password = "your-password"
8787-did = "did:plc:..."
8888-```
129129+### Remote (PDS) performance
891309090-It also requires `git-remote-pds` and `pds-yrs` binaries to be built.
131131+- **Network dominates** -- with 2 collaborators every strategy takes roughly 4-10s even at 10-50 files, regardless of local merge cost
132132+- **Git strategies (1-3) bottlenecked by push+clone** at ~7s baseline for 2 collaborators, ~9-16s for 5 collaborators
133133+- **pds-yrs (strategy 4) scales worse with file count** -- 3.7s at 10 files, 9s at 50, 29s at 200. Each file requires separate blob operations during save/merge
134134+- **Git strategies scale better at high file counts** because git packs everything into a single push/fetch
135135+- At **10 files, pds-yrs is faster** than git strategies (~4s vs ~6.5-8s); at **50 files it is roughly on par** (~2s slower with 2 collaborators, slightly faster with 5)
136136+- At **200 files, git strategies are 2-4x faster** than pds-yrs due to O(n) blob round-trips
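To make the scaling difference concrete, a straight line can be fitted to the low-conflict rows of the Default Matrix remote table. This is a rough sketch with only three data points per strategy: `fit_line`, `GIT_MS`, and `PDS_YRS_MS` are illustrative names, and the linear model is an assumption suggested by the "separate blob operations per file" explanation, not something the benchmark measures directly.

```python
# Low-conflict rows from the "Default Matrix > Remote (PDS)" table:
# file count -> total time in ms
GIT_MS = {10: 6676, 50: 7228, 200: 6966}
PDS_YRS_MS = {10: 3793, 50: 9042, 200: 29411}


def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x over a {x: y} mapping."""
    xs, ys = list(points), list(points.values())
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


for name, data in (("plain git", GIT_MS), ("pds-yrs", PDS_YRS_MS)):
    intercept, slope = fit_line(data)
    print(f"{name}: ~{intercept:.0f} ms fixed + ~{slope:.1f} ms per file")
```

On these rows the fit gives pds-yrs a per-file cost on the order of 135 ms, while plain git stays essentially flat (under 1 ms per file). That is consistent with one pack push/fetch versus per-file blob round-trips.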
911379292-## Reports
138138+### PDS limits
931399494-See [bench-results/v1/report.md](bench-results/v1/report.md) for the first full benchmark comparing local vs PDS performance across all strategies.
140140+- AT Protocol record size limits (~1MB) prevent pds-yrs from handling 1000-file sites or 50KB average file sizes
141141+- git-remote-pds also hits limits at 1000 files
142142+- Pack blob format helps but record metadata still grows with file count
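A back-of-envelope check of the file size sweep against the ~1MB figure. This sketch rests on assumptions the report does not state: that the limit applies to the total payload of one save, that "KB" means 1000 bytes, and that CRDT and encoding overhead can be ignored. `raw_payload_bytes` and `ASSUMED_LIMIT_BYTES` are hypothetical names.

```python
KB = 1000  # assumption: decimal kilobytes; the report does not say KB vs KiB
ASSUMED_LIMIT_BYTES = 1_000_000  # the "~1MB" figure quoted in the findings


def raw_payload_bytes(files, avg_size_kb):
    """Raw text volume of a site, ignoring CRDT state and encoding overhead."""
    return files * avg_size_kb * KB


# The file size sweep uses 50 files at 1 KB, 10 KB, and 50 KB average size.
for avg_kb in (1, 10, 50):
    size = raw_payload_bytes(50, avg_kb)
    verdict = "over" if size > ASSUMED_LIMIT_BYTES else "under"
    print(f"50 files x {avg_kb} KB = {size / 1_000_000:.2f} MB ({verdict} the assumed limit)")
```

Under these assumptions only the 50 KB scenario (2.5 MB of raw text) exceeds the limit, which matches the single `413` row in the remote file size table.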
951439696-## Key findings
144144+### Conflict resolution
971459898-- CRDT strategies (2, 3, 4) **eliminate all merge conflicts** -- 0 conflicts across every scenario tested
9999-- Plain git fails with conflicts in ~33% of scenarios
100100-- Locally, plain git is fastest (25-386ms) but Yrs CRDT merge is competitive (3-348ms)
101101-- Over PDS, network dominates: ~7s baseline for git strategies, ~3.7s for pds-yrs at small scale
102102-- pds-yrs scales worse with file count (O(n) blob round-trips) -- at 200 files it takes ~29s vs ~7s for git strategies
103103-- AT Protocol record size limits prevent 1000-file sites and 50KB avg file sizes
146146+- Plain git: conflicts in ~33% of scenarios -- requires manual resolution
147147+- All CRDT strategies (2, 3, 4): **zero conflicts across all scenarios** -- automatic resolution via Yrs CRDT merge
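One way to arrive at a plain-git conflict rate is to count the scenarios with a non-zero `1-conflicts` value in the tables above. The sketch below does that. The table labels and the `conflict_rate` helper are illustrative, and which scenarios the ~33% figure was actually computed over is not stated in the report.

```python
# Plain-git conflict counts per scenario, copied from the tables that have a
# "1-conflicts" column (the errored 50 KB PDS run is left out).
CONFLICTS = {
    "default-local": [0, 0, 1, 0, 0, 0, 0, 0, 0],
    "default-pds": [0, 1, 1, 0, 0, 1, 0, 0, 1],
    "filesize-pds": [1, 0],
    "multi-collab-local": [0, 0, 0, 0, 1, 1],
    "multi-collab-pds": [0, 0, 0, 1, 1, 1],
    "conflict-local": [10, 20, 20],
    "conflict-pds": [10, 20, 20],
}


def conflict_rate(tables, exclude=()):
    """Fraction of scenarios in which plain git reported at least one conflict."""
    counts = [c for name, rows in tables.items() if name not in exclude for c in rows]
    return sum(1 for c in counts if c > 0) / len(counts)


organic = conflict_rate(CONFLICTS, exclude=("conflict-local", "conflict-pds"))
overall = conflict_rate(CONFLICTS)
print(f"excluding guaranteed-conflict runs: {organic:.0%}")
print(f"including guaranteed-conflict runs: {overall:.0%}")
```

Excluding the guaranteed-conflict runs, 11 of 32 scenarios have conflicts (about 34%); including them, 17 of 38 (about 45%).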
+103
archive/old.md
11+# merge-bench
22+33+Benchmark harness comparing merge strategies for collaborative text editing via AT Protocol PDS.
44+55+## Why
66+77+Lichen uses Yrs CRDTs to enable conflict-free collaborative editing on sites stored in AT Protocol PDS repositories. There are several possible architectures for how collaborators merge their changes:
88+99+1. **Plain git** -- standard 3-way merge, conflicts require manual resolution
1010+2. **git + yrs-merge (diff)** -- git merge with a CRDT merge driver that resolves conflicts automatically using Yrs document state
1111+3. **git + yrs-merge (sidecar)** -- same as above, but with `.yrs/` sidecar files tracked in git for richer CRDT state
1212+4. **Yrs-on-PDS** -- pure CRDT merge via `pds-yrs`, no git involved; each collaborator saves Yrs state to PDS and merges happen server-side
1313+1414+This tool generates synthetic repositories with configurable parameters (file count, file size, collaborator count, conflict rate) and measures each strategy's merge time, conflict count, and success rate.
1515+1616+Both **local** (no network, pure computation) and **remote** (real PDS round-trips via `git-remote-pds` and `pds-yrs`) modes are supported.
1717+1818+## Usage
1919+2020+```bash
2121+cargo build
2222+```
2323+2424+### Run benchmarks
2525+2626+```bash
2727+# Local mode (default) -- measures pure merge cost
2828+merge-bench run # default matrix (10/50/200 files x low/med/high conflict)
2929+merge-bench run --quick # quick smoke test (2 scenarios)
3030+merge-bench run --stress # 1000 files
3131+merge-bench run --filesize # file size sweep (1KB/10KB/50KB)
3232+merge-bench run --multi-collab # 5 collaborators
3333+merge-bench run --conflict # guaranteed conflicts (same lines edited)
3434+3535+# Remote PDS mode -- includes real network round-trips
3636+merge-bench run --pds # default matrix against real PDS
3737+merge-bench run --conflict --pds # conflict test against real PDS
3838+3939+# Group related runs into a named set
4040+merge-bench run --set v1 # bench-results/v1/default-local/
4141+merge-bench run --pds --set v1 # bench-results/v1/default-pds/
4242+merge-bench run --stress --set v1 # bench-results/v1/stress-local/
4343+```
4444+4545+### Generate a repo manually
4646+4747+```bash
4848+merge-bench gen --files 50 --avg-size 1000 --collaborators 3 --edits 10 --conflict-rate medium --output /tmp/test-repo
4949+```
5050+5151+### Format existing results
5252+5353+```bash
5454+merge-bench report --input bench-results/v1/default-local/results.json --format markdown
5555+merge-bench report --input bench-results/v1/default-pds/results.json --format csv
5656+```
5757+5858+## Output structure
5959+6060+Each run produces a folder with:
6161+- `results.json` -- raw benchmark data
6262+- `report.md` -- formatted markdown table
6363+- `results.csv` -- CSV for spreadsheet import
6464+6565+Runs are grouped by set name under `bench-results/`:
6666+6767+```
6868+bench-results/
6969+ v1/
7070+ report.md # combined report for this set
7171+ default-local/
7272+ default-pds/
7373+ stress-local/
7474+ conflict-local/
7575+ conflict-pds/
7676+ ...
7777+```
7878+7979+## PDS mode setup
8080+8181+Remote mode requires credentials in `testuser.toml`:
8282+8383+```toml
8484+pds = "https://your-pds.example.com"
8585+handle = "user.your-pds.example.com"
8686+password = "your-password"
8787+did = "did:plc:..."
8888+```
8989+9090+It also requires `git-remote-pds` and `pds-yrs` binaries to be built.
9191+9292+## Reports
9393+9494+See [bench-results/v1/report.md](bench-results/v1/report.md) for the first full benchmark comparing local vs PDS performance across all strategies.
9595+9696+## Key findings
9797+9898+- CRDT strategies (2, 3, 4) **eliminate all merge conflicts** -- 0 conflicts across every scenario tested
9999+- Plain git fails with conflicts in ~33% of scenarios
100100+- Locally, plain git is fastest (25-386ms) but Yrs CRDT merge is competitive (3-348ms)
101101+- Over PDS, network dominates: ~7s baseline for git strategies, ~3.7s for pds-yrs at small scale
102102+- pds-yrs scales worse with file count (O(n) blob round-trips) -- at 200 files it takes ~29s vs ~7s for git strategies
103103+- AT Protocol record size limits prevent 1000-file sites and 50KB avg file sizes