Keep using Photos.app like you always do. Attic quietly backs up your originals and edits to an S3 bucket you control. One-way, append-only.

docs: add lanes and adaptive concurrency explainer

+162 -1
+5 -1
README.md
···
  in separate lanes. The iCloud lane uses an AIMD controller (attic's
  `AIMDController` implementing LadderKit's `AdaptiveConcurrencyControlling`)
  to back off when Photos.app or iCloud pushes back, and to ramp up on a
- clean lane.
+ clean lane. See [Lanes and adaptive concurrency](docs/lanes-and-adaptive-concurrency.md)
+ for details.
  - **Retry queue** — transient failures are remembered on S3
  (`retry-queue.json`) and retried first on the next run, carrying
  attempts/first-seen/last-message for each UUID.
···

  - [Architecture](docs/architecture.md) — How attic works: the backup pipeline,
  photo library access, manifest lifecycle, and design boundaries
+ - [Lanes and adaptive concurrency](docs/lanes-and-adaptive-concurrency.md) —
+ Why attic splits exports into local and iCloud lanes, and how the AIMD
+ controller adapts to iCloud throttling
  - [Asset Metadata](docs/metadata.md) — Schema reference for the per-asset JSON
  uploaded to S3
+157
docs/lanes-and-adaptive-concurrency.md
# Lanes and adaptive concurrency

Attic backs up two very different kinds of photo assets in parallel: ones that
are already on disk, and ones that still live on Apple's servers. Mixing them
in a single pool is a compromise that loses to both extremes, so the exporter
partitions each batch into two **lanes** and runs each at a concurrency limit
suited to its behavior.

## The two kinds of assets

With **"Optimize Mac Storage"** enabled, Photos.app keeps only recent or
frequently-accessed originals on disk. Everything else is a thumbnail-only
placeholder whose original lives in iCloud.

### Local assets

The original file is on disk in the Photos library bundle. Exporting one is
essentially a file copy + inline SHA-256 hash.

- **Fast** — hundreds of MB/s, limited by disk and CPU.
- **Predictable** — no network in the loop.
- **Few failure modes** — "disk full" and "permission denied" are about it.
- **Parallelism helps** — more concurrency, more throughput, up to disk
  saturation.

### iCloud-only assets

Only the thumbnail is on disk. Exporting means asking Photos.app to pull the
original from iCloud, which involves auth, a round-trip to cold storage, and a
write back to local disk.

- **Slow** — network-bound, often seconds per asset.
- **Heavily throttled** — iCloud rate-limits hard if you ask for too many at
  once.
- **Many transient failure modes** — timeouts, `-1712` AppleEvent timeouts,
  503s from iCloud, rate-limit waits.
- **Some permanent failures** — shared-album derivatives that have gone
  missing server-side raise `-1728 "Can't get media item"`. Retrying doesn't
  help.
- **Parallelism past a point hurts** — iCloud starts queuing you, total
  throughput drops.
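
The iCloud failure modes listed above fall into "worth retrying" and "retrying doesn't help". A minimal sketch of that split follows. `ExportFailure`, `FailureKind`, and `classify` are hypothetical names invented for illustration; they are not LadderKit's or attic's types, and only the codes named in the list are covered.

```swift
// Hypothetical types for illustration only; not LadderKit's classifier.
enum ExportFailure {
    case appleEvent(code: Int)   // error code reported by an AppleEvent call
    case httpStatus(Int)         // HTTP status surfaced from iCloud
    case timedOut                // generic timeout
}

enum FailureKind {
    case transient   // worth retrying on a later run
    case permanent   // retrying will not help
    case unknown     // not covered by the list above
}

/// Maps a failure to a kind, using only the codes named in the list above.
func classify(_ failure: ExportFailure) -> FailureKind {
    switch failure {
    case .timedOut:
        return .transient
    case .httpStatus(let status):
        return status == 503 ? .transient : .unknown
    case .appleEvent(let code):
        switch code {
        case -1712: return .transient   // AppleEvent timeout
        case -1728: return .permanent   // "Can't get media item"
        default:    return .unknown
        }
    }
}
```

Anything the sketch does not recognize is left as `.unknown` rather than guessed, since the text above does not say how other errors are treated.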

## Why a single pool is wrong

If you run one pool at a single concurrency limit, you're stuck picking a
number that's wrong for one side:

| Pool size | Local lane | iCloud lane |
|-----------|------------|-------------|
| 16 | great | throttled into the ground |
| 2 | drip-feed | safe but slow |

Neither number is right. Splitting the lanes lets each run at its own pace.

## How the split is decided

LadderKit exposes a `LocalAvailabilityProviding` protocol that answers "is
this asset's original on disk?" The real implementation,
`PhotosDatabaseLocalAvailability`, reads one column from Photos.sqlite:

```
ZINTERNALRESOURCE.ZLOCALAVAILABILITY = 1
```

This is the same flag Photos.app itself uses to decide whether to show the
little download-cloud icon in the UI. It's cheap to read and doesn't touch
PhotoKit.

At batch time, `PhotoExporter` partitions every UUID:

- `ZLOCALAVAILABILITY = 1` → **local lane**
- everything else → **iCloud lane**

```
          ┌── local lane  ──→ concurrency = maxConcurrency (e.g. 16)
batch ────┤
          └── iCloud lane ──→ concurrency = AIMDController.currentLimit()
                                            (adapts: 1-12, starts at 6)
```

Each lane has its own `TaskGroup`, its own concurrency cap, and — critically
— its own feedback signals. Congestion on the iCloud side doesn't slow the
local side down.

## The iCloud lane is adaptive

Because iCloud's tolerance shifts minute to minute, picking a fixed iCloud
concurrency is also wrong. The iCloud lane is gated by an **AIMD controller**
(attic's `AIMDController`, implementing LadderKit's
`AdaptiveConcurrencyControlling` protocol).

**AIMD** (Additive Increase, Multiplicative Decrease) is the congestion-control
policy TCP uses. The asymmetry is the point: recover cautiously, back off
hard.
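
To make the asymmetry concrete, here is a minimal sketch of a single AIMD update and how long recovery takes after one back-off. The function names are hypothetical, the 1 to 12 bounds are taken from the diagram above, and rounding the halved limit down with integer division is an assumption, not something this document specifies.

```swift
/// One AIMD update: halve on congestion, add 1 on a clean signal,
/// clamped to [minLimit, maxLimit]. Hypothetical helper, not attic's API.
func aimdStep(limit: Int, congested: Bool,
              minLimit: Int = 1, maxLimit: Int = 12) -> Int {
    if congested {
        return max(minLimit, limit / 2)   // multiplicative decrease (rounds down)
    }
    return min(maxLimit, limit + 1)       // additive increase
}

/// Counts how many clean updates it takes to climb back to `start`
/// after a single back-off. Assumes `start` is within the default bounds.
func cleanStepsToRecover(from start: Int) -> Int {
    var limit = aimdStep(limit: start, congested: true)
    var steps = 0
    while limit < start {
        limit = aimdStep(limit: limit, congested: false)
        steps += 1
    }
    return steps
}

print(cleanStepsToRecover(from: 6))    // 3  (6 -> 3, then 4, 5, 6)
print(cleanStepsToRecover(from: 12))   // 6  (12 -> 6, then 7 ... 12)
```

One congestion signal costs half the limit immediately, while winning it back takes one clean update per unit lost.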

- The controller keeps a **sliding window of the last 20 outcomes**.
- **Transient failure rate > 30%** → halve the limit (floor at `minLimit`).
- **Transient failure rate ≤ 5%** → grow the limit by 1 (cap at `maxLimit`).
- **Window clears on every limit change** — prevents stale pre-change
  outcomes from immediately re-tripping the new limit.

The exporter polls `currentLimit()` between dispatches and reports each
`ExportOutcome` (`.success`, `.transientFailure`, `.permanentFailure`) via
`record(_:)`. The controller is observation-only — it doesn't hold permits or
gate dispatch directly, it just publishes a number the exporter reads.

### Why permanent failures don't affect the limit

A batch full of `-1728` shared-album tombstones isn't a lane-health signal —
the lane is fine, those assets just don't exist anymore. Reporting them as
transient failures would permanently pin the lane at `minLimit`.

Ladder classifies each export error as `.other`, `.transientCloud`, or
`.permanentlyUnavailable` and reports `.permanentFailure` to the controller
for the last category. The controller **ignores `.permanentFailure`** entirely
— it doesn't enter the window, doesn't count toward the rate. Attic also
records permanent-unavailable UUIDs in `unavailable-assets.json` so they're
skipped forever on future runs.

## What this looks like in practice

On a mixed Optimize-Storage library:

```
Backup started — 2,431 assets pending
  Local lane:  1,804 assets → running at 16 concurrent
  iCloud lane:   627 assets → running at 6 concurrent (adaptive)

[...]

iCloud lane throttling — limit 6 → 3
iCloud lane recovering — limit 3 → 4
iCloud lane recovering — limit 4 → 5
```

- The local lane blasts through cached originals in parallel.
- The iCloud lane ticks along at whatever rate iCloud currently tolerates.
- Failures in one lane don't slow the other down.
- Permanent failures (tombstones) are skipped, not retried, and don't affect
  concurrency tuning.

## Where this lives in the code

| Layer | Type | Where |
|---|---|---|
| Local/iCloud split | `LocalAvailabilityProviding`, `PhotosDatabaseLocalAvailability` | LadderKit |
| Per-lane dispatch | `PhotoExporter` | LadderKit |
| Controller protocol | `AdaptiveConcurrencyControlling`, `ExportOutcome` | LadderKit |
| Error classification | `ExportClassification` | LadderKit |
| AIMD policy | `AIMDController` | `Sources/AtticCore/AIMDController.swift` |
| Permanent-unavailable store | `UnavailableStore` | `Sources/AtticCore/UnavailableAssets.swift` |

LadderKit supplies the **mechanism** (partitioning, protocol, outcome
reporting). AtticCore supplies the **policy** (the actual AIMD controller
implementation, the unavailable store, the backup pipeline). The two
responsibilities are cleanly separated so a different caller could plug in a
different controller (EWMA, PID, token bucket) without touching ladder.
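
As a closing illustration, the windowed rules from "The iCloud lane is adaptive" can be sketched as a small value type. This is not the code in `Sources/AtticCore/AIMDController.swift`: `WindowedAIMD` and `Outcome` are hypothetical stand-ins, the defaults (start at 6, range 1 to 12, window of 20) come from this document, and evaluating the failure rate only once the window is full is an assumption the document does not confirm.

```swift
// Hypothetical stand-in for the outcome type; mirrors the three cases named above.
enum Outcome { case success, transientFailure, permanentFailure }

/// Sketch of a sliding-window AIMD policy. Not attic's AIMDController.
struct WindowedAIMD {
    let minLimit: Int
    let maxLimit: Int
    let windowSize: Int
    private(set) var limit: Int
    private var window: [Bool] = []   // true = transient failure

    init(initial: Int = 6, minLimit: Int = 1, maxLimit: Int = 12, windowSize: Int = 20) {
        self.minLimit = minLimit
        self.maxLimit = maxLimit
        self.windowSize = windowSize
        self.limit = initial
    }

    mutating func record(_ outcome: Outcome) {
        switch outcome {
        case .permanentFailure: return               // never enters the window
        case .success:          window.append(false)
        case .transientFailure: window.append(true)
        }
        if window.count > windowSize { window.removeFirst() }
        // Assumption: only judge the lane once a full window has been observed.
        guard window.count == windowSize else { return }

        let failures = window.filter { $0 }.count
        let rate = Double(failures) / Double(windowSize)
        var next = limit
        if rate > 0.30 {
            next = max(minLimit, limit / 2)          // back off hard
        } else if rate <= 0.05 {
            next = min(maxLimit, limit + 1)          // recover cautiously
        }
        if next != limit {
            limit = next
            window.removeAll()                       // drop pre-change outcomes
        }
    }
}

var controller = WindowedAIMD()
for _ in 0..<20 { controller.record(.success) }
print(controller.limit)   // 7
```

Because the policy only needs to consume outcomes and publish a number, another policy with the same two operations could replace it without changing the caller, which is the separation described above.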