wip: benchmarks for testing different p2p sync strategies using a pds as a relay
readme

notplants 64d30d38 5535d703

+222 -75
+119 -75
README.md
- # merge-bench
+ # Merge Strategy Benchmark Report (v1)

- Benchmark harness comparing merge strategies for collaborative text editing via AT Protocol PDS.
+ **Date:** 2026-03-13
+ **PDS:** bluesky-pds.t1cc.commoninternet.net (v0.4.208)
+ **Modes:** Local (no network) and Remote (real PDS round-trips)

- ## Why
+ ## Strategies

- Lichen uses Yrs CRDTs to enable conflict-free collaborative editing on sites stored in AT Protocol PDS repositories. There are several possible architectures for how collaborators merge their changes:
+ 1. **plain-git** -- Standard `git merge` 3-way merge
+ 2. **yrs-diff** -- git merge with git-yrs-merge CRDT driver (diff-only mode)
+ 3. **yrs-sidecar** -- git merge with git-yrs-merge CRDT driver + .yrs/ sidecar files
+ 4. **yrs-on-pds** -- Pure Yrs CRDT merge (local: in-memory from .pds-sim/; PDS: pds-yrs save/merge/load)

- 1. **Plain git** -- standard 3-way merge, conflicts require manual resolution
- 2. **git + yrs-merge (diff)** -- git merge with a CRDT merge driver that resolves conflicts automatically using Yrs document state
- 3. **git + yrs-merge (sidecar)** -- same as above, but with `.yrs/` sidecar files tracked in git for richer CRDT state
- 4. **Yrs-on-PDS** -- pure CRDT merge via `pds-yrs`, no git involved; each collaborator saves Yrs state to PDS and merges happen server-side
+ ## Default Matrix (10/50/200 files, 2 collaborators)

- This tool generates synthetic repositories with configurable parameters (file count, file size, collaborator count, conflict rate) and measures each strategy's merge time, conflict count, and success rate.
+ ### Local

- Both **local** (no network, pure computation) and **remote** (real PDS round-trips via `git-remote-pds` and `pds-yrs`) modes are supported.
+ | Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|----------|------------|-------------|-------------|----------------|------------|
+ | 10 | low | 25 | 0 | 45 | 69 | 3 |
+ | 10 | medium | 33 | 0 | 127 | 121 | 5 |
+ | 10 | high | 67 | 1 | 118 | 260 | 6 |
+ | 50 | low | 118 | 0 | 187 | 245 | 55 |
+ | 50 | medium | 61 | 0 | 298 | 576 | 34 |
+ | 50 | high | 198 | 0 | 365 | 721 | 52 |
+ | 200 | low | 386 | 0 | 441 | 292 | 119 |
+ | 200 | medium | 160 | 0 | 164 | 325 | 102 |
+ | 200 | high | 139 | 0 | 212 | 363 | 111 |

- ## Usage
+ ### Remote (PDS)

- ```bash
- cargo build
- ```
+ | Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|----------|------------|-------------|-------------|----------------|------------|
+ | 10 | low | 6676 | 0 | 7649 | 8303 | 3793 |
+ | 10 | medium | 6728 | 1 | 7005 | 6643 | 4066 |
+ | 10 | high | 6666 | 1 | 7363 | 6482 | 3717 |
+ | 50 | low | 7228 | 0 | 7006 | 6901 | 9042 |
+ | 50 | medium | 6618 | 0 | 6845 | 6965 | 8737 |
+ | 50 | high | 6984 | 1 | 7957 | 6972 | 9386 |
+ | 200 | low | 6966 | 0 | 7064 | 7567 | 29411 |
+ | 200 | medium | 11105 | 0 | 7335 | 7056 | 27647 |
+ | 200 | high | 7294 | 1 | 7792 | 7370 | 29289 |

- ### Run benchmarks
+ ## Stress Test (1000 files, 2 collaborators)

- ```bash
- # Local mode (default) -- measures pure merge cost
- merge-bench run  # default matrix (10/50/200 files x low/med/high conflict)
- merge-bench run --quick  # quick smoke test (2 scenarios)
- merge-bench run --stress  # 1000 files
- merge-bench run --filesize  # file size sweep (1KB/10KB/50KB)
- merge-bench run --multi-collab  # 5 collaborators
- merge-bench run --conflict  # guaranteed conflicts (same lines edited)
+ ### Local

- # Remote PDS mode -- includes real network round-trips
- merge-bench run --pds  # default matrix against real PDS
- merge-bench run --conflict --pds  # conflict test against real PDS
+ | Files | Conflict | 1-git (ms) | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|----------|------------|-------------|----------------|------------|
+ | 1000 | low | 161 | 458 | 1071 | 645 |
+ | 1000 | medium | 165 | 410 | 910 | 571 |
+ | 1000 | high | 91 | 341 | 316 | 464 |

- # Group related runs into a named set
- merge-bench run --set v1  # bench-results/v1/default-local/
- merge-bench run --pds --set v1  # bench-results/v1/default-pds/
- merge-bench run --stress --set v1  # bench-results/v1/stress-local/
- ```
+ ### Remote (PDS)

- ### Generate a repo manually
+ All 3 scenarios failed with `413 Payload Too Large` -- 1000-file sites exceed AT Protocol record size limits.

- ```bash
- merge-bench gen --files 50 --avg-size 1000 --collaborators 3 --edits 10 --conflict-rate medium --output /tmp/test-repo
- ```
+ ## File Size Sweep (50 files, 2 collaborators, medium conflict)

- ### Format existing results
+ ### Local

- ```bash
- merge-bench report --input bench-results/v1/default-local/results.json --format markdown
- merge-bench report --input bench-results/v1/default-pds/results.json --format csv
- ```
+ | Avg Size | 1-git (ms) | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |----------|------------|-------------|----------------|------------|
+ | 1 KB | 61 | 78 | 291 | 50 |
+ | 10 KB | 106 | 360 | 437 | 167 |
+ | 50 KB | 206 | 751 | 790 | 431 |

- ## Output structure
+ ### Remote (PDS)

- Each run produces a folder with:
- - `results.json` -- raw benchmark data
- - `report.md` -- formatted markdown table
- - `results.csv` -- CSV for spreadsheet import
+ | Avg Size | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |----------|------------|-------------|-------------|----------------|------------|
+ | 1 KB | 6514 | 1 | 6870 | 6937 | 9475 |
+ | 10 KB | 7582 | 0 | 7528 | 7309 | 9252 |
+ | 50 KB | error (413) | - | - | - | - |

- Runs are grouped by set name under `bench-results/`:
+ The 50KB scenario exceeded payload limits on pds-yrs save.
+
+ ## Multi-Collaborator (5 collaborators)
+
+ ### Local
+
+ | Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|----------|------------|-------------|-------------|----------------|------------|
+ | 50 | low | 277 | 0 | 360 | 921 | 60 |
+ | 200 | low | 719 | 0 | 957 | 1337 | 348 |
+ | 50 | medium | 242 | 0 | 892 | 894 | 138 |
+ | 200 | medium | 349 | 0 | 854 | 601 | 236 |
+ | 50 | high | 72 | 1 | 628 | 843 | 39 |
+ | 200 | high | 154 | 1 | 1069 | 1081 | 153 |
+
+ ### Remote (PDS)
+
+ | Files | Conflict | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|----------|------------|-------------|-------------|----------------|------------|
+ | 50 | low | 8946 | 0 | 9576 | 10937 | 9323 |
+ | 200 | low | 11684 | 0 | 15148 | 10021 | 32425 |
+ | 50 | medium | 10874 | 0 | 10001 | 10617 | 8955 |
+ | 200 | medium | 16152 | 1 | 10390 | 14696 | 30460 |
+ | 50 | high | 12293 | 1 | 10480 | 9973 | 8799 |
+ | 200 | high | 10207 | 1 | 10787 | 11771 | 31184 |

- ```
- bench-results/
-   v1/
-     report.md  # combined report for this set
-     default-local/
-     default-pds/
-     stress-local/
-     conflict-local/
-     conflict-pds/
-     ...
- ```
+ ## Guaranteed Conflict (2 collaborators, same lines edited)
+
+ ### Local
+
+ | Files | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|------------|-------------|-------------|----------------|------------|
+ | 10 | 85 | 10 | 658 | 1009 | 7 |
+ | 50 | 307 | 20 | 3438 | 1996 | 106 |
+ | 200 | 802 | 20 | 1157 | 1485 | 204 |
+
+ ### Remote (PDS)
+
+ | Files | 1-git (ms) | 1-conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
+ |-------|------------|-------------|-------------|----------------|------------|
+ | 10 | 6363 | 10 | 6902 | 7004 | 4335 |
+ | 50 | 6723 | 20 | 7853 | 10435 | 9840 |
+ | 200 | 6782 | 20 | 7308 | 7997 | 30904 |
+
+ ## Key Findings

- ## PDS mode setup
+ ### Local performance

- Remote mode requires credentials in `testuser.toml`:
+ - **Plain git is fastest** for pure merge speed (25-386ms)
+ - **Yrs CRDT merge (strategy 4)** is competitive locally (3-348ms) -- pure in-memory merge with no git overhead
+ - **git-yrs-merge strategies** (2, 3) add 2-10x overhead vs plain git due to CRDT driver invocations, but **eliminate all merge conflicts**
+ - **Plain git fails with conflicts** in medium/high conflict scenarios; CRDT strategies always succeed

- ```toml
- pds = "https://your-pds.example.com"
- handle = "user.your-pds.example.com"
- password = "your-password"
- did = "did:plc:..."
- ```
+ ### Remote (PDS) performance

- It also requires `git-remote-pds` and `pds-yrs` binaries to be built.
+ - **Network dominates** -- all strategies take 6-10s for small repos, regardless of local merge cost
+ - **Git strategies (1-3) bottlenecked by push+clone** at ~7s baseline for 2 collaborators, ~10-16s for 5 collaborators
+ - **pds-yrs (strategy 4) scales worse with file count** -- 3.7s at 10 files, 9s at 50, 29s at 200. Each file requires separate blob operations during save/merge
+ - **Git strategies scale better at high file counts** because git packs everything into a single push/fetch
+ - At **50 files or fewer, pds-yrs is competitive or faster** than git strategies
+ - At **200 files, git strategies are 3-4x faster** than pds-yrs due to O(n) blob round-trips

- ## Reports
+ ### PDS limits

- See [bench-results/v1/report.md](bench-results/v1/report.md) for the first full benchmark comparing local vs PDS performance across all strategies.
+ - AT Protocol record size limits (~1MB) prevent pds-yrs from handling 1000-file sites or 50KB average file sizes
+ - git-remote-pds also hits limits at 1000 files
+ - Pack blob format helps but record metadata still grows with file count

- ## Key findings
+ ### Conflict resolution

- - CRDT strategies (2, 3, 4) **eliminate all merge conflicts** -- 0 conflicts across every scenario tested
- - Plain git fails with conflicts in ~33% of scenarios
- - Locally, plain git is fastest (25-386ms) but Yrs CRDT merge is competitive (3-348ms)
- - Over PDS, network dominates: ~7s baseline for git strategies, ~3.7s for pds-yrs at small scale
- - pds-yrs scales worse with file count (O(n) blob round-trips) -- at 200 files it takes ~29s vs ~7s for git strategies
- - AT Protocol record size limits prevent 1000-file sites and 50KB avg file sizes
+ - Plain git: conflicts in ~33% of scenarios -- requires manual resolution
+ - All CRDT strategies (2, 3, 4): **zero conflicts across all scenarios** -- automatic resolution via Yrs CRDT merge
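
The report attributes the pds-yrs slowdown to O(n) blob round-trips. One way to sanity-check that reading is to fit a straight line to the strategy 4 timings from the Remote (PDS) default matrix. The sketch below is not part of merge-bench: the `fit_line` helper and the variable names are made up for illustration, and the nine data points are copied from the table by hand.

```python
# Remote (PDS) timings for strategy 4 (pds-yrs) from the default matrix:
# (file count, milliseconds) for low / medium / high conflict.
PDS_YRS_REMOTE_MS = [
    (10, 3793), (10, 4066), (10, 3717),
    (50, 9042), (50, 8737), (50, 9386),
    (200, 29411), (200, 27647), (200, 29289),
]


def fit_line(points):
    """Ordinary least-squares fit of ms = intercept + slope * files.

    Hypothetical helper for this sketch; returns (intercept, slope).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept_ms, ms_per_file = fit_line(PDS_YRS_REMOTE_MS)
print(f"fixed cost ~{intercept_ms:.0f} ms, ~{ms_per_file:.0f} ms per file")
```

On these nine points the fit comes out at roughly 130 ms per file on top of a fixed cost of about 2.5 s, which is consistent with a per-file round-trip cost. Three file counts are too few to rule out other growth curves, so treat the slope as a rough estimate only.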
+103
archive/old.md
+ # merge-bench
+
+ Benchmark harness comparing merge strategies for collaborative text editing via AT Protocol PDS.
+
+ ## Why
+
+ Lichen uses Yrs CRDTs to enable conflict-free collaborative editing on sites stored in AT Protocol PDS repositories. There are several possible architectures for how collaborators merge their changes:
+
+ 1. **Plain git** -- standard 3-way merge, conflicts require manual resolution
+ 2. **git + yrs-merge (diff)** -- git merge with a CRDT merge driver that resolves conflicts automatically using Yrs document state
+ 3. **git + yrs-merge (sidecar)** -- same as above, but with `.yrs/` sidecar files tracked in git for richer CRDT state
+ 4. **Yrs-on-PDS** -- pure CRDT merge via `pds-yrs`, no git involved; each collaborator saves Yrs state to PDS and merges happen server-side
+
+ This tool generates synthetic repositories with configurable parameters (file count, file size, collaborator count, conflict rate) and measures each strategy's merge time, conflict count, and success rate.
+
+ Both **local** (no network, pure computation) and **remote** (real PDS round-trips via `git-remote-pds` and `pds-yrs`) modes are supported.
+
+ ## Usage
+
+ ```bash
+ cargo build
+ ```
+
+ ### Run benchmarks
+
+ ```bash
+ # Local mode (default) -- measures pure merge cost
+ merge-bench run  # default matrix (10/50/200 files x low/med/high conflict)
+ merge-bench run --quick  # quick smoke test (2 scenarios)
+ merge-bench run --stress  # 1000 files
+ merge-bench run --filesize  # file size sweep (1KB/10KB/50KB)
+ merge-bench run --multi-collab  # 5 collaborators
+ merge-bench run --conflict  # guaranteed conflicts (same lines edited)
+
+ # Remote PDS mode -- includes real network round-trips
+ merge-bench run --pds  # default matrix against real PDS
+ merge-bench run --conflict --pds  # conflict test against real PDS
+
+ # Group related runs into a named set
+ merge-bench run --set v1  # bench-results/v1/default-local/
+ merge-bench run --pds --set v1  # bench-results/v1/default-pds/
+ merge-bench run --stress --set v1  # bench-results/v1/stress-local/
+ ```
+
+ ### Generate a repo manually
+
+ ```bash
+ merge-bench gen --files 50 --avg-size 1000 --collaborators 3 --edits 10 --conflict-rate medium --output /tmp/test-repo
+ ```
+
+ ### Format existing results
+
+ ```bash
+ merge-bench report --input bench-results/v1/default-local/results.json --format markdown
+ merge-bench report --input bench-results/v1/default-pds/results.json --format csv
+ ```
+
+ ## Output structure
+
+ Each run produces a folder with:
+ - `results.json` -- raw benchmark data
+ - `report.md` -- formatted markdown table
+ - `results.csv` -- CSV for spreadsheet import
+
+ Runs are grouped by set name under `bench-results/`:
+
+ ```
+ bench-results/
+   v1/
+     report.md  # combined report for this set
+     default-local/
+     default-pds/
+     stress-local/
+     conflict-local/
+     conflict-pds/
+     ...
+ ```
+
+ ## PDS mode setup
+
+ Remote mode requires credentials in `testuser.toml`:
+
+ ```toml
+ pds = "https://your-pds.example.com"
+ handle = "user.your-pds.example.com"
+ password = "your-password"
+ did = "did:plc:..."
+ ```
+
+ It also requires `git-remote-pds` and `pds-yrs` binaries to be built.
+
+ ## Reports
+
+ See [bench-results/v1/report.md](bench-results/v1/report.md) for the first full benchmark comparing local vs PDS performance across all strategies.
+
+ ## Key findings
+
+ - CRDT strategies (2, 3, 4) **eliminate all merge conflicts** -- 0 conflicts across every scenario tested
+ - Plain git fails with conflicts in ~33% of scenarios
+ - Locally, plain git is fastest (25-386ms) but Yrs CRDT merge is competitive (3-348ms)
+ - Over PDS, network dominates: ~7s baseline for git strategies, ~3.7s for pds-yrs at small scale
+ - pds-yrs scales worse with file count (O(n) blob round-trips) -- at 200 files it takes ~29s vs ~7s for git strategies
+ - AT Protocol record size limits prevent 1000-file sites and 50KB avg file sizes