An easy-to-host PDS on the ATProtocol, iPhone and MacOS. Maintain control of your keys and data, always.
# Blob Handling Spec

Relay Blob Upload, Storage, Proxy & CDN

v0.1 Draft — March 2026

Companion to: Provisioning API Spec, Mobile Architecture Spec, Data Migration Spec

---

## 1. Overview

Blobs (images, video, media files) are a core part of ATProto but are handled separately from the repo. They are not stored in CAR files and have their own upload, serving, and sync endpoints. This document specifies how the relay handles blobs across all lifecycle phases.

### 1.1 Why This Matters

Every image a user posts through Bluesky is a blob. Without blob handling, the relay can't serve a functional PDS — users can't upload profile pictures, attach images to posts, or share media. Blob support is on the critical path alongside OAuth.

### 1.2 ATProto Blob Model

Key protocol facts that drive the design:

- Blobs are uploaded via `com.atproto.repo.uploadBlob` before any record references them.
- After upload, blobs are temporary until a record references them (then permanent).
- Unreferenced blobs are garbage-collected after a grace period (spec recommends ≥1 hour).
- Blobs are served via `com.atproto.sync.getBlob` (server-to-server) and typically mirrored to CDNs for end-user serving.
- Blobs are NOT in CAR files. They sync separately via `getBlob` and `listBlobs`.
- Each blob is identified by its CID (Content Identifier, raw multicodec, base32 `b` prefix).
- The ATProto spec does not mandate global size limits — those are per-Lexicon and per-server.

---

## 2. Lifecycle Phase Behavior

### 2.1 Mobile-Only Phase

The relay is a full PDS. Blob handling is straightforward:

1. Third-party app uploads blob → relay stores it.
2. App creates a record referencing the blob → blob becomes permanent.
3. AppView/CDN fetches blob via `getBlob` for serving to users.
4. If record is deleted and no other records reference the blob → blob is garbage-collected.

The relay is the authoritative blob store.
Standard PDS behavior.

### 2.2 Desktop-Enrolled Phase

Blobs need to exist in two places: the relay (for serving to the network) and the desktop (authoritative copy). The flow changes:

**Upload path (third-party app uploads via XRPC):**

1. The app (e.g., Bluesky) calls `uploadBlob` on the relay (the public XRPC endpoint).
2. Relay stores the blob locally, computes its CID, and marks the blob `temporary`.
3. When the app creates a record referencing the blob, the relay proxies the record-creation to the desktop (per mobile spec §4.2).
4. The relay forwards the blob data to the desktop via Iroh alongside the record data.
5. Desktop stores the blob locally as the authoritative copy.
6. Relay retains its copy as a cache for serving.

**Upload path (desktop creates content locally — future):**

If/when the desktop supports local content creation (e.g., a local client):

1. Desktop stores the blob locally.
2. Desktop pushes the blob to the relay via Iroh (alongside the unsigned commit).
3. Relay stores and serves the blob.

**Read path:**

1. `getBlob` requests hit the relay.
2. Relay serves from its local cache.
3. On a cache miss (blob was evicted from the relay's cache but exists on desktop), relay fetches from desktop via Iroh and re-caches.

### 2.3 Desktop Offline (During Desktop-Enrolled)

- Reads: relay serves blobs from cache. Previously-uploaded blobs remain available.
- Writes: not applicable — write XRPC returns 503 when desktop is offline, so no new blobs can be uploaded.
- Cache miss: if a `getBlob` request arrives for a blob not in the relay's cache while the desktop is offline, relay returns 404. This should be rare if the relay's cache TTL is reasonable.

---

## 3. Rust Implementation Stack

### 3.1 Existing Reference: rsky-pds

The `blacksky-algorithms/rsky` project includes a full Rust PDS implementation (`rsky-pds`) that already handles blob upload, storage, and serving with S3-compatible backends.
This is our primary reference for blob implementation patterns.

Repo: https://github.com/blacksky-algorithms/rsky

### 3.2 Recommended Crates

| Crate | Version | Purpose | Downloads |
|-------|---------|---------|-----------|
| **rust-s3** | 0.37.0+ | S3-compatible object storage (R2, MinIO, S3) | ~357K/month |
| **cid** | 0.11.1+ | Content Identifier generation/parsing (ATProto blob refs) | ~13.7M all-time |
| **opendal** | 0.55.0+ | Alternative: unified storage abstraction (Apache project) | — |

**rust-s3 vs opendal vs aws-sdk-s3:**

- **rust-s3** is the pragmatic choice — lightweight, supports async and sync, well-tested with R2 and MinIO. Lower dependency footprint than the official AWS SDK.
- **opendal** (Apache OpenDAL) provides a unified API across storage backends. Heavier abstraction but lets you swap from local filesystem → S3 → R2 → MinIO without code changes. Worth considering if we want backend flexibility from the start.
- **aws-sdk-s3** is the official AWS SDK. Excellent maintenance but heavyweight (~100+ transitive deps) and async-only (Tokio). Overkill if R2 or MinIO is the primary target.

**Recommendation:** Start with **rust-s3** for v0.1 (lowest friction). Evaluate migrating to **opendal** for v1.0 if multi-backend support becomes important. Use the **cid** crate for all CID operations — it's the standard multiformats implementation used across the IPFS/content-addressing ecosystem.

### 3.3 MIME Type Sniffing

For validating blob content types, use the `infer` crate (https://crates.io/crates/infer) — it detects file type from magic bytes without external dependencies. Lightweight and widely used (~5M downloads).

---

## 4. Storage Architecture

### 4.1 Relay Storage

Blob data lives in S3-compatible object storage. Blob metadata lives in the relay's database (SQLite for single-node, PostgreSQL for production).


**Blob metadata table:**

| Column | Type | Description |
|--------|------|-------------|
| cid | TEXT PK | Content identifier (base32, `b` prefix) |
| account_id | TEXT FK | Owning account |
| mime_type | TEXT | MIME type (validated via sniffing) |
| size_bytes | INTEGER | Blob size in bytes |
| status | TEXT | `temporary` / `permanent` / `pending_gc` |
| uploaded_at | TEXT | Upload time (ISO 8601) |
| referenced_at | TEXT | When first referenced by a record (null if temporary) |
| last_accessed_at | TEXT | For cache eviction decisions |
| storage_backend | TEXT | `local` / `s3` — where the blob data lives |

**Object storage key format:**

`{bucket}/{account_id}/{cid[0:2]}/{cid[2:4]}/{cid}`

The two-level prefix hash is intended to prevent S3 listing performance issues with large flat namespaces. Note that base32 CIDv1 strings for raw blobs typically share their leading characters (for example `bafkrei`), so these two prefix levels may not spread keys evenly; taking the shard characters from later in the CID is one option. The CID is the filename — content-addressed storage is naturally deduplicated within an account.

**Backend configuration (relay.toml):**

```toml
[blobs]
backend = "s3"  # "local" for dev, "s3" for production

[blobs.s3]
endpoint = "https://account-id.r2.cloudflarestorage.com"  # R2, MinIO, S3
bucket = "pds-blobs"
region = "auto"  # R2 uses "auto"
access_key = "..."
secret_key = "..."
```

For local development, blobs fall back to filesystem storage at `{data_dir}/blobs/` using the same key structure. The `storage_backend` column in the metadata table lets the relay serve blobs from either backend during migration.

### 4.2 S3-Compatible Providers

Tested/supported providers:

| Provider | Notes |
|----------|-------|
| **Cloudflare R2** | No egress fees. Native CDN integration via Workers. Recommended for production. |
| **MinIO** | Self-hosted S3. Ideal for BYO relay operators. Ships as a single binary. |
| **AWS S3** | Standard. Higher egress costs than R2. |
| **Backblaze B2** | Cheap storage, S3-compatible API. |

BYO relay operators who don't want to run object storage can use `backend = "local"` — blobs stay on the local filesystem. This is the default for the open-source relay binary.

### 4.3 Desktop Storage

The desktop PDS stores blobs in its local filesystem, indexed in its local SQLite. The desktop is the authoritative copy when enrolled. No S3 dependency on the desktop — blob data stays local.

### 4.4 Storage Migration Path

- v0.1 (dev/beta): `backend = "local"` — filesystem only, no S3 dependency.
- v1.0 (production): `backend = "s3"` — R2 or MinIO. A migration tool copies existing local blobs to the S3 bucket and updates the `storage_backend` column.

---

## 4. XRPC Endpoints

The relay must implement these standard ATProto endpoints:

### 4.1 com.atproto.repo.uploadBlob

**Method:** POST
**Auth:** Required (OAuth bearer token)
**Request:** Raw binary body with `Content-Type` header
**Response:**
```json
{
  "$type": "blob",
  "ref": {"$link": "bafkrei..."},
  "mimeType": "image/jpeg",
  "size": 54499
}
```

**Relay behavior:**
1. Validate MIME type (sniff bytes if needed, reject disallowed types).
2. Check account storage quota.
3. Store blob with `status: temporary`.
4. Return blob reference.
5. In desktop-enrolled mode: also forward blob to desktop via Iroh (can be async, before record creation).

### 4.2 com.atproto.sync.getBlob

**Method:** GET
**Params:** `did` (string), `cid` (string)
**Response:** Raw blob data with appropriate `Content-Type`

**Relay behavior:**
1. Look up blob in local cache.
2. If found, serve directly.
3. If not found and desktop is online, fetch from desktop via Iroh, re-cache, serve.
4. If not found and desktop is offline, return 404.

**Security:** Must set Content Security Policy headers.
Blobs are untrusted user content — serving them without CSP risks a browser parsing and executing whatever the blob contains.

### 4.3 com.atproto.sync.listBlobs

**Method:** GET
**Params:** `did` (string), `since` (string, optional — repo revision)
**Response:** Array of blob CIDs

Lists all committed (permanent) blobs for an account, optionally since a given revision. Used by AppViews and relays for synchronization.

---

## 5. Size Limits & Quotas

### 5.1 Per-Blob Limits

ATProto doesn't mandate global limits, but the relay should enforce sensible defaults:

| Tier | Max blob size | Rationale |
|------|--------------|-----------|
| Free | 5 MB | Covers images, short audio. Matches common PDS limits. |
| Pro | 50 MB | Covers video, large media. |
| Business | 100 MB | Enterprise media needs. |

These limits apply at upload time. Lexicon-specific limits (e.g., Bluesky's 1 MB for images) are enforced at record creation time.

### 5.2 Per-Account Storage Quotas

Blob storage counts toward the account's total storage quota (defined in provisioning API §8):

| Tier | Total storage (repo + blobs) |
|------|------------------------------|
| Free | 500 MB |
| Pro | 50 GB |
| Business | 500 GB |

When an account exceeds its quota, `uploadBlob` returns 413 (Payload Too Large) with a `STORAGE_EXCEEDED` error code.

### 5.3 MIME Type Restrictions

The relay should accept a generous allowlist and reject known-dangerous types:

**Allowed:** `image/*`, `video/*`, `audio/*`, `application/pdf`, `text/plain`, `application/octet-stream`

**Blocked:** Executable types (`application/x-executable`, `application/x-mach-binary`, `application/javascript`, etc.), archive types that could contain executables (`.zip`, `.tar.gz` unless explicitly needed by a Lexicon).

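
The allowlist and blocklist above can be sketched as a small policy check on the declared MIME type. This is a minimal illustration, not the relay's actual code: `MimeDecision` and `check_mime` are hypothetical names, only the types named explicitly above are encoded (the "etc." and archive entries are left out rather than guessed), and byte sniffing is a separate step that this sketch does not perform.

```rust
/// Outcome of checking a declared MIME type against the policy lists.
/// (Hypothetical type, introduced for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum MimeDecision {
    /// The type matches the allowlist.
    Allowed,
    /// The type is explicitly blocked.
    Blocked,
    /// The type is on neither list; the caller decides (rejecting is the safe default).
    NotListed,
}

// Wildcard entries from the allowlist (`image/*`, `video/*`, `audio/*`).
const ALLOWED_PREFIXES: [&str; 3] = ["image/", "video/", "audio/"];
// Exact entries from the allowlist.
const ALLOWED_EXACT: [&str; 3] = ["application/pdf", "text/plain", "application/octet-stream"];
// Only the blocked types the spec names explicitly.
const BLOCKED_EXACT: [&str; 3] = [
    "application/x-executable",
    "application/x-mach-binary",
    "application/javascript",
];

/// Classifies a declared `Content-Type` value.
pub fn check_mime(declared: &str) -> MimeDecision {
    // Drop parameters such as "; charset=utf-8" and normalise case.
    let essence = declared
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();

    // The blocklist is consulted first so it always wins over a wildcard.
    if BLOCKED_EXACT.iter().any(|t| *t == essence) {
        return MimeDecision::Blocked;
    }
    if ALLOWED_EXACT.iter().any(|t| *t == essence)
        || ALLOWED_PREFIXES.iter().any(|p| essence.starts_with(p))
    {
        return MimeDecision::Allowed;
    }
    MimeDecision::NotListed
}
```

Under these assumptions, `check_mime("image/jpeg")` is `Allowed`, `check_mime("application/javascript")` is `Blocked`, and a type on neither list comes back as `NotListed` so the caller can reject it by default.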

The relay should sniff blob bytes to validate the declared MIME type and reject mismatches (e.g., a blob declared as `image/jpeg` that's actually a PE executable).

---

## 6. Garbage Collection

### 6.1 Temporary Blob Cleanup

Blobs uploaded but never referenced by a record are garbage-collected:

- **Grace period:** 6 hours (ATProto spec recommends ≥1 hour; 6 hours gives apps plenty of time).
- **Check frequency:** Every 30 minutes, a background job scans for temporary blobs past the grace period.
- **Action:** Delete blob data and metadata row.

### 6.2 Dereferenced Blob Cleanup

When a record is deleted, check if any other records in the same repo reference the blob's CID:

- If no references remain → mark blob as `pending_gc`.
- Run a second check after 24 hours (in case a new record references it).
- If still unreferenced → delete.

### 6.3 Account Deletion Cleanup

On account teardown (provisioning API §7), all blobs are deleted:

- During grace period: blobs are retained (account is read-only).
- After grace period: bulk-delete all blobs for the account.

### 6.4 Relay Cache Eviction (Desktop-Enrolled)

When the desktop is the authoritative blob store, the relay's copy is a cache. Eviction strategy:

- **LRU eviction** when relay storage exceeds a per-account cache limit.
- Cache limit per tier: Free = 100 MB, Pro = 5 GB, Business = 50 GB.
- Evicted blobs can be re-fetched from the desktop on demand (via `getBlob` → Iroh → desktop).
- Never evict blobs that are less than 7 days old (matches commit buffer retention).

---

## 7. CDN Integration

### 7.1 Why CDN

The ATProto spec recommends that AppViews mirror blobs to their own CDN rather than hitting `getBlob` directly. But for a desktop PDS that goes offline, having a relay-side CDN cache prevents blob unavailability.

### 7.2 Architecture

For Pro and Business tiers, the relay can optionally front blob serving with a CDN (Cloudflare R2 + Workers, or similar):

```
[AppView] → CDN → [Relay getBlob] → (cache or Iroh → desktop)
```

The CDN caches public blob responses with appropriate cache headers. This reduces load on the relay and ensures blobs remain available even during brief relay restarts.

### 7.3 Cache Headers

`getBlob` responses should include:
- `Cache-Control: public, max-age=31536000, immutable` — blobs are content-addressed, so they never change.
- `Content-Type`: the validated MIME type.
- `Content-Security-Policy: default-src 'none'; sandbox` — prevent blob content from executing.

The `immutable` directive is safe because CIDs are content hashes — if the content changed, the CID would change.

---

## 8. Data Migration Implications

### 8.1 Planned Device Swap

During a planned swap (migration spec §3), the blob archive is included in the transfer bundle:

1. Old device exports blobs alongside the CAR file.
2. Bundle includes a blob manifest mapping each CID to its MIME type and size.
3. New device imports blobs and verifies CIDs match.

### 8.2 Unplanned Device Loss

On the free tier, blobs not crawled by an AppView may be permanently lost (migration spec §4.3). The relay's cache retention helps:

- **Paid tiers:** Relay holds a full blob mirror. All blobs recoverable from relay.
- **Free tier:** Relay holds only recently-accessed blobs (cache eviction). Recovery of older blobs is attempted via `getBlob` against known AppView CDNs. Blobs never crawled are lost.

### 8.3 Proactive Crawl

After a record that references newly uploaded blobs is created, the relay should call `requestCrawl` to the configured AppView. This maximizes the chance that blobs are indexed before any loss event. Already noted in the migration spec (§4.3) but important to implement at the relay level.

---

## 9. Implementation Milestones

### v0.1 — Basic Blob Support (blocks mobile-only phase)

- `uploadBlob` endpoint with local filesystem storage
- `getBlob` endpoint for serving
- `listBlobs` endpoint
- CID generation/validation via `cid` crate
- Temporary blob garbage collection (6-hour grace)
- MIME type validation via `infer` crate
- Per-blob size limits
- Account storage quota enforcement
- `requestCrawl` after record creation with blob references
- S3 backend support via `rust-s3` (optional, configurable — local is default)

### v1.0 — Production Blobs

- S3 backend as default for managed relay (R2 recommended)
- Local → S3 migration tool
- Dereferenced blob cleanup
- CDN integration for Pro/Business tiers (R2 + Workers or equivalent)
- Cache eviction for desktop-enrolled accounts
- Blob forwarding to desktop via Iroh on upload
- Desktop → relay blob fetch on cache miss
- Blob manifest in device transfer bundle
- MinIO deployment docs for BYO relay operators

### Later

- Video transcoding (serve multiple resolutions)
- Blob deduplication across accounts (content-addressed storage makes this natural)
- Blob access analytics (which blobs are hot/cold for cache optimization)

---

## 10. Design Decisions

| Decision | Rationale | Alternatives Considered |
|----------|-----------|------------------------|
| rust-s3 crate for S3 operations | Lightweight, async/sync flexible, well-tested with R2 and MinIO. 357K downloads/month. Lower deps than aws-sdk-s3. | aws-sdk-s3 (heavyweight, 100+ deps), opendal (heavier abstraction, may adopt later). |
| S3-compatible object storage for blob data | Blobs are large, write-once, and content-addressed — a perfect fit for object storage. R2 has no egress fees. MinIO works for self-hosted. | Local filesystem only (doesn't scale, no redundancy), database BLOBs (terrible performance at scale). |
| Local filesystem as default, S3 as production option | BYO relay operators shouldn't need to run MinIO for a small instance. Local works fine for single-user. S3 for managed relay at scale. | S3 required from day one (barrier to self-hosting), local only (no production path). |
| Cloudflare R2 as recommended provider | Zero egress fees (biggest cost for blob serving). Native CDN via Workers. S3-compatible API. | AWS S3 (egress costs add up), Backblaze B2 (less ecosystem integration). |
| 6-hour temp blob grace period | 6x the ATProto minimum. Generous for apps with slow record creation. Low storage cost. | 1 hour (spec minimum — too aggressive), 24 hours (unnecessary). |
| MIME type sniffing via infer crate | Prevents content-type spoofing. No external deps. Critical for security — a mislabeled executable served as an image is dangerous. | Trust client Content-Type (unsafe), reject without sniffing (too strict). |
| CDN with immutable cache headers | Blobs are content-addressed — the CID changes if content changes. Immutable caching is safe and eliminates invalidation complexity. | Short TTL caching (wastes CDN bandwidth), no CDN (higher relay load). |
| Relay caches blobs in desktop-enrolled mode | Ensures blobs are served when desktop is offline. `getBlob` from AppViews needs to work 24/7. | No relay cache (blobs unavailable when desktop sleeps — breaks federation), desktop-only (same problem). |
| Reference rsky-pds for implementation patterns | Production Rust PDS with S3 blob storage already implemented. Don't reinvent. | Build from scratch (slower, more bugs), fork rsky-pds (too coupled). |
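
To make the 6-hour grace period decision in the table above concrete, here is a minimal sketch of the eligibility check a cleanup job might run per blob. The names `BlobStatus`, `TEMP_BLOB_GRACE`, and `temp_blob_expired` are hypothetical, and treating an upload time in the future (clock skew) as "not expired" is an assumption of this sketch rather than something the spec states.

```rust
use std::time::{Duration, SystemTime};

/// Grace period for uploaded-but-unreferenced blobs: 6 hours.
const TEMP_BLOB_GRACE: Duration = Duration::from_secs(6 * 60 * 60);

/// Mirrors the `status` values of the blob metadata table.
/// (Hypothetical enum, introduced for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlobStatus {
    Temporary,
    Permanent,
    PendingGc,
}

/// Returns true when a blob is still `temporary` and was uploaded at
/// least `TEMP_BLOB_GRACE` before `now`.
pub fn temp_blob_expired(status: BlobStatus, uploaded_at: SystemTime, now: SystemTime) -> bool {
    if status != BlobStatus::Temporary {
        // Permanent and pending_gc blobs are handled by other cleanup paths.
        return false;
    }
    match now.duration_since(uploaded_at) {
        Ok(age) => age >= TEMP_BLOB_GRACE,
        // `uploaded_at` is later than `now`: assume clock skew and keep the blob.
        Err(_) => false,
    }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test against fixed timestamps.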