v6-refactor Benchmark Report
Date: 2026-03-14
What changed: pds-yrs was refactored from a single YrsRepo record to a two-record model (YrsRepo + YrsBranch). YrsRepo is the project-level registry (one per project, rkey = project name). YrsBranch is the per-device record holding the file index and blob refs. Merge discovery now reads YrsRepo to find branches instead of scanning via listRecords.
Environment: All benchmarks run against a real PDS (bluesky-pds.t1cc.commoninternet.net), sequentially to avoid network contention.
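The two-record model described under "What changed" can be pictured with a minimal sketch. The field names (`device_id`, `file_index`, `branch_rkeys`) and the `register_branch` helper are assumptions made for illustration, not the actual pds-yrs schema; only the record names and the rule that the YrsRepo rkey equals the project name come from the report.

```python
from dataclasses import dataclass, field


@dataclass
class YrsBranch:
    """Per-device record: file index plus blob refs (field names are assumed)."""
    device_id: str
    # Maps file path -> blob reference, for example a CID string.
    file_index: dict[str, str] = field(default_factory=dict)


@dataclass
class YrsRepo:
    """Project-level registry: one per project, rkey = project name."""
    project: str
    # Record keys of the YrsBranch records belonging to this project (assumed).
    branch_rkeys: list[str] = field(default_factory=list)

    @property
    def rkey(self) -> str:
        return self.project


def register_branch(repo: YrsRepo, branch_rkey: str) -> None:
    """Hypothetical helper: add a branch to the registry once, keeping order."""
    if branch_rkey not in repo.branch_rkeys:
        repo.branch_rkeys.append(branch_rkey)


repo = YrsRepo(project="demo")
register_branch(repo, "demo-laptop")
register_branch(repo, "demo-desktop")
register_branch(repo, "demo-laptop")  # duplicate registration is ignored
print(repo.rkey, repo.branch_rkeys)
```

Because the registry is keyed by project name, a reader that knows the project can find every branch without enumerating unrelated records.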
Summary
All 24 scenarios across 5 benchmark suites pass for all 4 strategies. The refactor has no regressions.
Key findings:
- Strategy 4 (yrs-on-pds) is roughly 2.5-6x faster than git-based strategies for small-to-medium repos (10-200 files): about 4-6x at 10-50 files and 2.5-3x at 200 files
- Strategy 4 is 2-3x slower for 1000-file stress tests (25-26s vs 8-12s) due to per-file PDS record operations
- CRDT strategies (2, 3, 4) produce zero conflicts in all scenarios, including the guaranteed-conflict scenarios, where git produces 10-20 conflicts, and the high-conflict stress test, where it produces 37
- Strategies 1-3 are network-dominated: git push/clone via PDS takes ~7-12s regardless of file count, making local merge time negligible
- File size impact: a 50x increase in file size (1KB to 50KB) roughly triples strategy 4 time (1.5s to 5.0s), while strategies 2 and 3 slow by about 16% and 43% respectively
1. Default Matrix (9 scenarios)
2 collaborators, varying file count (10/50/200) and conflict rate (low/medium/high).
| Files | Conflict | 1-git (ms) | Git Conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
|---|---|---|---|---|---|---|
| 10 | low | 7,331 | 0 | 7,408 | 9,307 | 1,544 |
| 10 | medium | 6,384 | 1 | 7,468 | 8,372 | 1,658 |
| 10 | high | 6,744 | 3 | 8,322 | 7,036 | 1,748 |
| 50 | low | 8,144 | 0 | 7,558 | 7,748 | 1,966 |
| 50 | medium | 8,112 | 2 | 7,447 | 6,944 | 1,501 |
| 50 | high | 7,430 | 7 | 7,910 | 7,688 | 1,629 |
| 200 | low | 6,939 | 1 | 7,365 | 7,741 | 2,832 |
| 200 | medium | 8,077 | 5 | 7,519 | 8,375 | 3,004 |
| 200 | high | 6,940 | 15 | 7,871 | 7,809 | 2,666 |
Observations:
- Strategy 4 averages 2.1s across all default scenarios vs 7.6s for strategies 1-3
- Git conflicts scale with conflict rate as expected (0-1 at low, up to 15 at high with 200 files)
- Strategies 2 and 3 are nearly identical in performance (~7-9s), dominated by PDS network I/O
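The averages quoted in these observations can be reproduced directly from the default-matrix table. The snippet below is only a sanity check on the published numbers; the variable names are illustrative.

```python
from statistics import mean

# Timings in ms copied from the default-matrix table, in row order.
git = [7331, 6384, 6744, 8144, 8112, 7430, 6939, 8077, 6940]
diff = [7408, 7468, 8322, 7558, 7447, 7910, 7365, 7519, 7871]
sidecar = [9307, 8372, 7036, 7748, 6944, 7688, 7741, 8375, 7809]
yrs = [1544, 1658, 1748, 1966, 1501, 1629, 2832, 3004, 2666]

# Mean over the 9 scenarios for strategy 4, and over all 27 runs of strategies 1-3.
avg_yrs_s = mean(yrs) / 1000
avg_others_s = mean(git + diff + sidecar) / 1000
print(f"strategy 4: {avg_yrs_s:.1f}s, strategies 1-3: {avg_others_s:.1f}s")
```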
2. Stress Test (3 scenarios)
1000 files, 2 collaborators, 50 edits each.
| Conflict | 1-git (ms) | Git Conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
|---|---|---|---|---|---|
| low | 8,192 | 2 | 9,252 | 9,031 | 24,791 |
| medium | 8,559 | 12 | 9,742 | 10,792 | 26,336 |
| high | 8,326 | 37 | 11,281 | 11,917 | 26,208 |
Observations:
- Strategy 4 is ~3x slower at 1000 files (~25s vs ~9s) — each file requires individual PDS blob operations
- Git strategies benefit from bundling all files in a single git push/clone
- Conflict rate has minimal impact on strategy 4 time (CRDT merge is O(1) per file regardless of edits)
- Higher conflict rates do slow strategies 2-3 slightly (9s to 11-12s) as the merge driver processes more conflict regions
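To put the stress-test observations in numbers, the sketch below derives the slowdown of strategy 4 against the other strategies, plus a naive per-file cost. The per-file figure simply divides total time by file count, so it includes any fixed overhead and should be read as an upper bound rather than a measured per-operation latency.

```python
# Timings in ms copied from the stress-test table, keyed by conflict rate.
stress = {
    "low": {"git": 8192, "diff": 9252, "sidecar": 9031, "yrs": 24791},
    "medium": {"git": 8559, "diff": 9742, "sidecar": 10792, "yrs": 26336},
    "high": {"git": 8326, "diff": 11281, "sidecar": 11917, "yrs": 26208},
}
FILES = 1000

slowdown = {}
per_file_ms = {}
for rate, row in stress.items():
    others = [row["git"], row["diff"], row["sidecar"]]
    # Slowdown of strategy 4 against the slowest and the fastest alternative.
    slowdown[rate] = (row["yrs"] / max(others), row["yrs"] / min(others))
    # Naive per-file cost: total time divided by file count.
    per_file_ms[rate] = row["yrs"] / FILES
    lo, hi = slowdown[rate]
    print(f"{rate}: {lo:.1f}x-{hi:.1f}x slower, ~{per_file_ms[rate]:.0f} ms/file")
```

On these figures the slowdown stays between about 2.2x and 3.1x, and the naive cost is about 25-26 ms per file at every conflict rate.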
3. Guaranteed Conflict (3 scenarios)
Every edit touches the same lines. Maximum conflict scenario for git.
| Files | 1-git (ms) | Git Conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
|---|---|---|---|---|---|
| 10 | 6,630 | 10 | 7,727 | 8,582 | 2,036 |
| 50 | 7,863 | 20 | 8,339 | 8,736 | 1,866 |
| 200 | 7,518 | 20 | 9,099 | 9,114 | 3,204 |
Observations:
- Git fails every scenario with 10-20 conflict files
- All three CRDT strategies handle guaranteed conflicts with zero manual intervention
- Strategy 4 remains fastest (2-3s vs 7-9s)
4. File Size Sweep (3 scenarios)
50 files, medium conflict, varying file size (1KB/10KB/50KB).
| Avg Size | 1-git (ms) | Git Conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
|---|---|---|---|---|---|
| 1 KB | 7,750 | 2 | 8,829 | 7,174 | 1,532 |
| 10 KB | 7,107 | 2 | 7,381 | 7,624 | 2,372 |
| 50 KB | 7,395 | 2 | 10,216 | 10,252 | 4,986 |
Observations:
- Strategy 4 time grows steadily with file size but far less than proportionally: 1.5s (1KB) → 2.4s (10KB) → 5.0s (50KB), about 3.3x for a 50x size increase
- Git strategies are less affected by file size (network overhead dominates)
- At 50KB, strategies 2-3 slow by roughly 16-43% relative to 1KB (2-diff: 8.8s to 10.2s; 3-sidecar: 7.2s to 10.3s) due to larger CRDT state vectors
- Even at 50KB, strategy 4 is still 2x faster than strategies 2-3
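How strategy 4 responds to file size can be checked with a short calculation over the three measured points. This is a rough sketch: with only three data points the slopes are indicative, not a fitted model.

```python
# (average file size in KB, strategy 4 time in ms) from the file-size table.
points = [(1, 1532), (10, 2372), (50, 4986)]

# Marginal cost between consecutive measurements, in ms per extra KB.
slopes = [
    (t2 - t1) / (s2 - s1)
    for (s1, t1), (s2, t2) in zip(points, points[1:])
]

# Overall growth from the smallest to the largest file size.
size_ratio = points[-1][0] / points[0][0]
time_ratio = points[-1][1] / points[0][1]

print(f"marginal cost: {slopes[0]:.1f} then {slopes[1]:.1f} ms/KB")
print(f"{size_ratio:.0f}x size -> {time_ratio:.2f}x time")
```

The marginal cost per KB falls from about 93 ms to about 65 ms as files get larger, and a 50x size increase costs about 3.25x in time. That fits a sizeable fixed per-run cost plus a smaller per-KB cost.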
5. Multi-Collaborator (6 scenarios)
5 collaborators, 50 or 200 files.
| Files | Conflict | 1-git (ms) | Git Conflicts | 2-diff (ms) | 3-sidecar (ms) | 4-yrs (ms) |
|---|---|---|---|---|---|---|
| 50 | low | 10,118 | 0 | 9,741 | 10,629 | 1,718 |
| 200 | low | 11,324 | 1 | 11,433 | 10,537 | 3,366 |
| 50 | medium | 10,142 | 2 | 10,741 | 10,291 | 1,964 |
| 200 | medium | 11,384 | 5 | 12,888 | 12,443 | 3,297 |
| 50 | high | 10,967 | 7 | 12,657 | 11,930 | 2,080 |
| 200 | high | 12,031 | 15 | 12,329 | 14,769 | 3,331 |
Observations:
- 5 collaborators add ~3s to git strategies vs 2-collaborator scenarios (more branches to push/merge)
- Strategy 4 time is nearly independent of collaborator count (~2s for 50 files with either 2 or 5 collaborators; ~3s for 200 files)
- The multi-collab advantage is strategy 4's strongest differentiator: 5-6x faster at 50 files, 3-4x faster at 200 files
- Git conflict count stays the same as 2-collaborator scenarios (conflicts are pairwise, and the merge is sequential)
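The collaborator overhead can be estimated by comparing matching rows of the default matrix (2 collaborators) and this table (5 collaborators). The sketch below averages over the three conflict rates for strategy 1 and strategy 4; the dictionary layout is illustrative.

```python
from statistics import mean

# Timings in ms in low/medium/high conflict order, keyed by file count.
two_collab = {
    50: {"git": [8144, 8112, 7430], "yrs": [1966, 1501, 1629]},
    200: {"git": [6939, 8077, 6940], "yrs": [2832, 3004, 2666]},
}
five_collab = {
    50: {"git": [10118, 10142, 10967], "yrs": [1718, 1964, 2080]},
    200: {"git": [11324, 11384, 12031], "yrs": [3366, 3297, 3331]},
}

# Extra time attributable to going from 2 to 5 collaborators.
overhead_ms = {
    (files, strategy): mean(five_collab[files][strategy])
    - mean(two_collab[files][strategy])
    for files in (50, 200)
    for strategy in ("git", "yrs")
}
for (files, strategy), extra in overhead_ms.items():
    print(f"{files} files, {strategy}: +{extra / 1000:.1f}s")
```

On these numbers strategy 1 gains about 2.5s at 50 files and 4.3s at 200 files, while strategy 4 gains under 0.5s in both cases.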
Comparison with v5 (pre-refactor)
The refactor (YrsRepo/YrsBranch split) adds one extra PDS API call per save (to create/update the YrsRepo registry record). This adds negligible overhead — strategy 4 times are within normal variance of v5 results.
The main functional improvement is in merge discovery: merge_project() now does a single getRecord on the project's YrsRepo instead of scanning with listRecords. This is both faster and more reliable.
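The difference between the two discovery paths can be illustrated with an in-memory stand-in for a PDS that counts API calls. Everything here is hypothetical: `FakePds`, the collection names `"branch"` and `"repo"`, the record fields, and the page size are assumptions for the sketch, not the pds-yrs implementation or the real client API. The only things taken from the protocol are that listRecords is paginated and getRecord fetches one record by key.

```python
class FakePds:
    """In-memory stand-in for a PDS that counts API calls (hypothetical)."""

    def __init__(self):
        self.records = {}  # (collection, rkey) -> record dict
        self.calls = 0

    def put_record(self, collection, rkey, value):
        self.records[(collection, rkey)] = value

    def get_record(self, collection, rkey):
        self.calls += 1
        return self.records[(collection, rkey)]

    def list_records(self, collection, limit=2, cursor=0):
        """Return one page of (rkey, record) pairs and the next cursor, or None."""
        self.calls += 1
        rows = sorted(
            ((k[1], v) for k, v in self.records.items() if k[0] == collection),
            key=lambda row: row[0],
        )
        page = rows[cursor:cursor + limit]
        next_cursor = cursor + limit if cursor + limit < len(rows) else None
        return page, next_cursor


def discover_by_scan(pds, project):
    """Scan-style discovery: page through every record and filter by project."""
    found, cursor = [], 0
    while cursor is not None:
        page, cursor = pds.list_records("branch", cursor=cursor)
        found += [rkey for rkey, rec in page if rec["project"] == project]
    return found


def discover_by_registry(pds, project):
    """Registry-style discovery: one lookup keyed by project name."""
    return list(pds.get_record("repo", project)["branches"])


pds = FakePds()
for rkey, project in [("a", "demo"), ("b", "other"), ("c", "demo"),
                      ("d", "demo"), ("e", "other")]:
    pds.put_record("branch", rkey, {"project": project})
pds.put_record("repo", "demo", {"branches": ["a", "c", "d"]})

scanned = discover_by_scan(pds, "demo")
scan_calls = pds.calls
pds.calls = 0
looked_up = discover_by_registry(pds, "demo")
registry_calls = pds.calls
print(scanned, scan_calls, looked_up, registry_calls)
```

In this toy setup the scan needs one call per page and grows with the total number of records, while the registry lookup is always a single call.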
Conclusions
- For repos of up to 200 files, strategy 4 (yrs-on-pds) is the clear winner: roughly 2.5-6x faster than any git-based approach
- For 1000+ file repos, git-based strategies are faster due to bundled transport — strategy 4 would need batch upload/download to compete
- CRDT merge eliminates all conflicts across all scenarios, including guaranteed-conflict cases
- Strategies 2 and 3 perform nearly identically — the sidecar approach adds no measurable overhead over diff-only
- The v6 refactor has no performance regressions — the cleaner architecture (deterministic project lookup, no listRecords scanning) maintains the same performance profile