# Embedded PDS Architecture for Hold Services

This document describes ATCR's hold service architecture using embedded ATProto PDS (Personal Data Server) for access control and federation.

## Motivation

### The Fragmentation Problem

Several ATProto projects face similar challenges with large data storage:

| Project | Large Data | Metadata | Solution |
|---------|-----------|----------|----------|
| **tangled.org** | Git objects | Issues, PRs, comments | External knot storage |
| **stream.place** | Video segments | Stream info, chat | Embedded "static PDS" |
| **ATCR** | Container blobs | Manifests, comments, builds | Embedded PDS in hold service |

**Common problem:** Large binary data can't realistically live in user PDSs, but application metadata needs a federated home.

**ATCR's approach:** Each hold service is a full ATProto actor with its own embedded PDS for **shared data** (captain + crew records, not user-specific data). This PDS stores access control and metadata about the hold itself.

## Current Architecture

### Hold Service Components

```
Hold Service (did:web:hold01.atcr.io)
├── Embedded PDS (SQLite carstore) - Shared data only
│   ├── Captain record (ownership metadata)
│   ├── Crew records (access control)
│   └── ATProto sync/repo endpoints
├── OCI multipart upload (XRPC)
│   ├── io.atcr.hold.initiateUpload
│   ├── io.atcr.hold.getPartUploadUrl
│   ├── io.atcr.hold.uploadPart
│   ├── io.atcr.hold.completeUpload
│   └── io.atcr.hold.abortUpload
└── Storage driver (S3, filesystem, etc.)
```

**Important distinction:**
- **Hold's embedded PDS** = Shared data (crew members, hold configuration)
- **User's PDS** = User-specific data (manifests, sailor profile, personal records)
- Hold's PDS does NOT store user-specific container data (that stays in user's own PDS)

### Records Structure

**Captain record** (hold ownership, single record at `io.atcr.hold.captain/self`):
```json
{
  "$type": "io.atcr.hold.captain",
  "owner": "did:plc:alice123",
  "public": false,
  "deployedAt": "2025-10-14T...",
  "region": "iad",
  "provider": "fly.io"
}
```

**Crew records** (access control, one per member at `io.atcr.hold.crew/{rkey}`):
```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:bob456",
  "role": "admin",
  "permissions": ["blob:read", "blob:write"],
  "addedAt": "2025-10-14T..."
}
```

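To make the role of these two record types concrete, here is a minimal sketch of an access check built on them. The policy itself is an assumption for illustration (the owner may do anything, a public hold grants `blob:read` to everyone, everyone else needs a crew record listing the permission), and the `allowed` helper is hypothetical rather than part of the hold service.

```go
package main

import "fmt"

// CaptainRecord and CrewRecord carry only the fields the check needs;
// the field names follow the JSON examples above.
type CaptainRecord struct {
	Owner  string `json:"owner"`
	Public bool   `json:"public"`
}

type CrewRecord struct {
	Member      string   `json:"member"`
	Role        string   `json:"role"`
	Permissions []string `json:"permissions"`
}

// allowed reports whether did may perform perm on the hold.
// Assumed policy (not confirmed by the source): the owner may do
// anything, a public hold grants "blob:read" to everyone, and otherwise
// a crew record must list the permission explicitly.
func allowed(captain CaptainRecord, crew []CrewRecord, did, perm string) bool {
	if did == captain.Owner {
		return true
	}
	if captain.Public && perm == "blob:read" {
		return true
	}
	for _, c := range crew {
		if c.Member != did {
			continue
		}
		for _, p := range c.Permissions {
			if p == perm {
				return true
			}
		}
	}
	return false
}

func main() {
	captain := CaptainRecord{Owner: "did:plc:alice123", Public: false}
	crew := []CrewRecord{{
		Member:      "did:plc:bob456",
		Role:        "admin",
		Permissions: []string{"blob:read", "blob:write"},
	}}
	fmt.Println(allowed(captain, crew, "did:plc:bob456", "blob:write"))
	fmt.Println(allowed(captain, crew, "did:plc:carol789", "blob:read"))
}
```

Keeping the check a pure function of the two record types means it can be evaluated against whatever the embedded PDS currently stores, without any extra state.
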
### ATProto PDS Endpoints

Standard ATProto sync endpoints:
- `GET /xrpc/com.atproto.sync.getRepo` - Download repository as CAR file
- `GET /xrpc/com.atproto.sync.getBlob` - Get blob or presigned download URL
- `GET /xrpc/com.atproto.sync.subscribeRepos` - Real-time crew changes
- `GET /xrpc/com.atproto.sync.listRepos` - List repositories

Repository management:
- `GET /xrpc/com.atproto.repo.describeRepo` - Repository metadata
- `GET /xrpc/com.atproto.repo.getRecord` - Get specific record (captain/crew)
- `GET /xrpc/com.atproto.repo.listRecords` - List crew members
- `POST /xrpc/io.atcr.hold.requestCrew` - Request crew membership

DID resolution:
- `GET /.well-known/did.json` - DID document (did:web resolution)
- `GET /.well-known/atproto-did` - DID for handle resolution

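As an illustration of how a client might reach these endpoints from nothing but a hold DID, the sketch below maps a `did:web` identifier to its DID document URL and builds a `listRecords` request for the crew collection. The helper names are hypothetical, and serving XRPC from the same host as the DID document is an assumption; a real client would read the service endpoint from the DID document instead.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// didWebToURL maps a did:web identifier to the URL of its DID document,
// following the did:web rules: ":" separates path segments, a port is
// percent-encoded as "%3A", and a bare host resolves under /.well-known/.
func didWebToURL(did string) (string, error) {
	const prefix = "did:web:"
	id := strings.TrimPrefix(did, prefix)
	if id == did || id == "" {
		return "", fmt.Errorf("not a did:web identifier: %q", did)
	}
	segments := strings.Split(id, ":")
	host, err := url.PathUnescape(segments[0])
	if err != nil {
		return "", fmt.Errorf("bad host in %q: %w", did, err)
	}
	if len(segments) == 1 {
		return "https://" + host + "/.well-known/did.json", nil
	}
	return "https://" + host + "/" + strings.Join(segments[1:], "/") + "/did.json", nil
}

// listCrewURL builds a com.atproto.repo.listRecords request for the
// hold's crew collection. Assumption: XRPC is served from the same host
// as the DID document.
func listCrewURL(holdDID string) (string, error) {
	docURL, err := didWebToURL(holdDID)
	if err != nil {
		return "", err
	}
	u, err := url.Parse(docURL)
	if err != nil {
		return "", err
	}
	u.Path = "/xrpc/com.atproto.repo.listRecords"
	q := url.Values{}
	q.Set("repo", holdDID)
	q.Set("collection", "io.atcr.hold.crew")
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	for _, did := range []string{"did:web:hold01.atcr.io", "did:web:localhost%3A8080"} {
		doc, err := didWebToURL(did)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		crew, _ := listCrewURL(did)
		fmt.Println(doc)
		fmt.Println(crew)
	}
}
```
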
### OCI Multipart Upload Flow

```
1. AppView gets service token from user's PDS:
   GET /xrpc/com.atproto.server.getServiceAuth?aud={holdDID}
   Response: { "token": "eyJ..." }

2. AppView initiates multipart upload:
   POST /xrpc/io.atcr.hold.initiateUpload
   Authorization: Bearer {serviceToken}
   Body: { "digest": "sha256:abc..." }
   Response: { "uploadId": "xyz" }

3. For each part:
   POST /xrpc/io.atcr.hold.getPartUploadUrl
   Body: { "uploadId": "xyz", "partNumber": 1 }
   Response: { "url": "https://s3.../presigned" }

4. Upload part to S3 presigned URL:
   PUT {presignedURL}
   Body: [part data]

5. Complete upload:
   POST /xrpc/io.atcr.hold.completeUpload
   Body: { "uploadId": "xyz", "digest": "sha256:abc...", "parts": [...] }
```

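Steps 3 and 4 repeat once per part, so the caller first has to decide how a blob is cut into parts. The sketch below shows one way to plan that split. The `part` type and `planParts` function are hypothetical, and the 5 MiB part size is an assumption based on S3's minimum for every part except the last; the source does not say what part size ATCR uses.

```go
package main

import (
	"errors"
	"fmt"
)

// part describes one slice of the blob; Number is 1-based, matching
// the "partNumber" field in the flow above.
type part struct {
	Number int
	Offset int64
	Size   int64
}

// planParts splits a blob of total bytes into consecutive parts of at
// most partSize bytes. Only the final part may be smaller.
func planParts(total, partSize int64) ([]part, error) {
	if total <= 0 || partSize <= 0 {
		return nil, errors.New("total and partSize must be positive")
	}
	var parts []part
	for off, n := int64(0), 1; off < total; off, n = off+partSize, n+1 {
		size := partSize
		if rem := total - off; rem < size {
			size = rem
		}
		parts = append(parts, part{Number: n, Offset: off, Size: size})
	}
	return parts, nil
}

func main() {
	const mib = 1 << 20
	parts, err := planParts(12*mib, 5*mib) // assumed 5 MiB part size
	if err != nil {
		panic(err)
	}
	for _, p := range parts {
		// Each part would be sent with getPartUploadUrl + PUT (steps 3-4).
		fmt.Printf("part %d: offset=%d size=%d\n", p.Number, p.Offset, p.Size)
	}
}
```
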
## Implementation Details

### Storage: Indigo Carstore with SQLite

```go
type HoldPDS struct {
	did      string
	carstore carstore.CarStore
	session  *carstore.DeltaSession // Provides blockstore interface
	repo     *repo.Repo
	dbPath   string
	uid      models.Uid // User ID for carstore (fixed: 1)
}
```

**Storage location:** Single SQLite file (`/var/lib/atcr-hold/hold.db`)
- Contains MST nodes, records, commits in carstore tables
- Handles compaction/cleanup automatically
- Migration path to Postgres if needed (same carstore API)

### Key Implementation Lessons

#### 1. Custom Record Types Need Manual CBOR Decoding

```go
// ❌ WRONG - Fails with "unrecognized lexicon type"
record, err := repo.GetRecord(ctx, path, &CrewRecord{})

// ✅ CORRECT - Fetch the raw bytes and decode the CBOR manually
_, recBytes, err := repo.GetRecordBytes(ctx, path)
if err != nil {
	return err
}
var crewRecord CrewRecord
if err := crewRecord.UnmarshalCBOR(bytes.NewReader(*recBytes)); err != nil {
	return err
}
```

Indigo's lexicon system doesn't know about custom types like `io.atcr.hold.crew`, so the typed lookup cannot decode them.

#### 2. JSON and CBOR Struct Tags Must Match

```go
// ✅ CORRECT - JSON tags match CBOR tags
type CrewRecord struct {
	Type        string   `json:"$type" cborgen:"$type"`
	Member      string   `json:"member" cborgen:"member"`
	Role        string   `json:"role" cborgen:"role"`
	Permissions []string `json:"permissions" cborgen:"permissions"`
	AddedAt     string   `json:"addedAt" cborgen:"addedAt"`
}
```

A record's CID is computed over its CBOR bytes, so a record that arrives as JSON must re-encode to exactly the same CBOR. If a field name differs between the two tags, the re-encoded bytes differ and CID verification fails.

#### 3. MST ForEach Returns Full Paths

The key passed to the callback is the full `collection/rkey` path, not the bare record key:

```go
// ✅ CORRECT - Extract just the rkey
err := repo.ForEach(ctx, "io.atcr.hold.crew", func(k string, v cid.Cid) error {
	// k = "io.atcr.hold.crew/3m37dr2ddit22"
	parts := strings.Split(k, "/")
	rkey := parts[len(parts)-1] // "3m37dr2ddit22"
	_ = rkey                    // use the rkey here, e.g. collect it into a slice
	return nil
})
```

#### 4. CAR Files Must Include Full MST Path

For `com.atproto.sync.getRecord`, return a CAR file containing:
1. **Commit block** - Repo head with signature
2. **MST tree nodes** - Path from root to record
3. **Record block** - The actual record data

Use `util.NewLoggingBstore()` to capture all blocks accessed while reading the record.

## IAM Challenges

### Current Implementation: Service Tokens

AppView uses `com.atproto.server.getServiceAuth` to get tokens for calling holds:

```
// AppView requests service token from user's PDS
GET /xrpc/com.atproto.server.getServiceAuth?aud={holdDID}&lxm=com.atproto.repo.getRecord

// PDS returns short-lived token (60 seconds)
{ "token": "eyJ..." }

// AppView uses token to authenticate to hold
Authorization: Bearer eyJ...
```

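A minimal sketch of the client side of this exchange: building the request URL with properly escaped `aud` and `lxm` parameters and extracting the token from the JSON response. The function names and the `pds.example.com` base URL are placeholders, and the authenticated HTTP call itself is left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/url"
)

// serviceAuthURL builds the getServiceAuth request shown above for a
// given PDS base URL, hold DID, and XRPC method.
func serviceAuthURL(pdsBase, holdDID, method string) (string, error) {
	u, err := url.Parse(pdsBase)
	if err != nil {
		return "", err
	}
	u.Path = "/xrpc/com.atproto.server.getServiceAuth"
	q := url.Values{}
	q.Set("aud", holdDID)
	q.Set("lxm", method)
	u.RawQuery = q.Encode() // escapes the colons in the DID
	return u.String(), nil
}

// parseServiceAuth extracts the token from a { "token": "..." } body.
func parseServiceAuth(body []byte) (string, error) {
	var resp struct {
		Token string `json:"token"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if resp.Token == "" {
		return "", errors.New("response contains no token")
	}
	return resp.Token, nil
}

func main() {
	// pds.example.com is a placeholder for the user's PDS.
	u, err := serviceAuthURL("https://pds.example.com", "did:web:hold01.atcr.io", "com.atproto.repo.getRecord")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)

	token, err := parseServiceAuth([]byte(`{"token": "demo-token"}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("Authorization: Bearer " + token)
}
```
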
### Known Issues

#### 1. RPC Permission Format with IP Addresses

**Problem:** Service token RPC permissions don't work with IP addresses in the audience (`aud`) field:

```
Error: RPC permission format invalid
Permission: rpc:com.atproto.repo.getRecord?aud=172.28.0.3:8080#atcr_hold
Issue: IP address with port not supported in aud field
```

**Impact:** Local development with IP-based hold DIDs (e.g., `did:web:172.28.0.3:8080`) fails.

**Workaround:** Fall back to unauthenticated requests (works for public holds only), or use hostname-based DIDs.

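The workaround can be sketched as a small decision function: detect an IP-based hold DID and only then drop to unauthenticated requests, which works for public holds alone. `holdUsesIP` and `authStrategy` are hypothetical names, and the strategy strings are illustrative, not values used by AppView.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
	"strings"
)

// holdUsesIP reports whether a did:web hold DID is based on an IP
// address (with or without a port) rather than a hostname.
func holdUsesIP(did string) bool {
	id := strings.TrimPrefix(did, "did:web:")
	// Ports may appear percent-encoded ("%3A") or, as in the example
	// above, as a literal colon; accept both.
	if dec, err := url.PathUnescape(id); err == nil {
		id = dec
	}
	host := id
	if h, _, err := net.SplitHostPort(id); err == nil {
		host = h
	}
	return net.ParseIP(host) != nil
}

// authStrategy picks how to talk to a hold under the workaround:
// service tokens for hostname-based DIDs, unauthenticated requests for
// public IP-based holds, and an error for private IP-based holds.
func authStrategy(holdDID string, holdIsPublic bool) (string, error) {
	if !holdUsesIP(holdDID) {
		return "service-token", nil
	}
	if holdIsPublic {
		return "unauthenticated", nil
	}
	return "", errors.New("private hold with IP-based DID: use a hostname-based DID instead")
}

func main() {
	for _, did := range []string{"did:web:hold01.atcr.io", "did:web:172.28.0.3:8080"} {
		strategy, err := authStrategy(did, true)
		fmt.Println(did, strategy, err)
	}
}
```
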
#### 2. Dynamic Hold Discovery Limitation

**Problem:** AppView can only authenticate to a user's default hold (the one configured in AppView), not to holds discovered dynamically from sailor profiles.

**Current limitation:**
- User sets `defaultHold = "did:web:alice-storage.fly.dev"` in their sailor profile
- AppView discovers the hold DID when the user pushes
- AppView tries to get a service token for alice's hold from the user's PDS
- BUT: the user never OAuth'd through alice's hold, only through AppView's default hold
- Result: no service token is available, so AppView can't authenticate to alice's hold

**Why this matters:**
- Users can't seamlessly use BYOS (Bring Your Own Storage)
- Hold references in sailor profiles are non-functional
- Limits portability and decentralization goals

#### 3. Trust Model: "Trust but Verify"

**Current approach:**
1. User OAuth's to AppView (credential helper flow)
2. Hold has a crew member record for the user (authorization)
3. AppView requests a service token from the user's PDS (proof)
4. Hold validates the service token from the user's PDS (verification)

**Philosophy:** "Trust but verify"
- IF the user OAuth'd to AppView AND the hold has a crew member record for the user → generally trust
- BUT AppView could lie about who it is acting for → the hold needs proof from the user's PDS that the request really comes from that user
- Service tokens provide this proof (user's PDS says "yes, I authorized this")

**Challenge:** Service tokens work for this model, but scope/permission format issues (see #1, #2) make it fragile in practice.

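The verification step (4) can be sketched as a claims check on the hold side. This is a deliberately incomplete illustration: it decodes the JWT payload and checks `aud`, `exp`, `lxm`, and crew membership, but it does NOT verify the signature, which a real hold must do against the signing key in the user's DID document. All function names are hypothetical, and `unsignedDemoToken` exists only so the sketch can run.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

type claims struct {
	Iss string `json:"iss"` // DID of the user the token speaks for
	Aud string `json:"aud"` // DID of the service the token is meant for
	Exp int64  `json:"exp"` // expiry, Unix seconds
	Lxm string `json:"lxm"` // XRPC method the token is bound to
}

// decodeClaims reads the JWT payload WITHOUT verifying the signature.
func decodeClaims(token string) (claims, error) {
	var c claims
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return c, errors.New("token is not a three-part JWT")
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return c, fmt.Errorf("decode payload: %w", err)
	}
	if err := json.Unmarshal(payload, &c); err != nil {
		return c, fmt.Errorf("parse payload: %w", err)
	}
	return c, nil
}

// checkClaims applies the checks a hold would run after signature
// verification: right audience, not expired, right method, known crew.
func checkClaims(c claims, holdDID, method string, crew map[string]bool, now int64) error {
	switch {
	case c.Aud != holdDID:
		return fmt.Errorf("token audience %q is not this hold", c.Aud)
	case c.Exp <= now:
		return errors.New("token has expired")
	case c.Lxm != "" && c.Lxm != method:
		return fmt.Errorf("token is bound to %q, not %q", c.Lxm, method)
	case !crew[c.Iss]:
		return fmt.Errorf("%q is not a crew member", c.Iss)
	}
	return nil
}

// unsignedDemoToken builds a token with a placeholder signature so the
// sketch can run; a real PDS would never issue this.
func unsignedDemoToken(c claims) string {
	enc := base64.RawURLEncoding.EncodeToString
	payload, _ := json.Marshal(c)
	return enc([]byte(`{"alg":"none"}`)) + "." + enc(payload) + "." + enc([]byte("sig"))
}

func main() {
	hold := "did:web:hold01.atcr.io"
	crew := map[string]bool{"did:plc:bob456": true}
	token := unsignedDemoToken(claims{Iss: "did:plc:bob456", Aud: hold, Exp: 2000, Lxm: "com.atproto.repo.getRecord"})

	c, err := decodeClaims(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(checkClaims(c, hold, "com.atproto.repo.getRecord", crew, 1990))
}
```
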
### Potential Solutions

#### Option A: Direct User-to-Hold Authentication

Users authenticate directly to holds (bypassing AppView service tokens).

**Pros:**
- ✅ Clear trust model (user ↔ hold)
- ✅ Works with any hold (BYOS friendly)
- ✅ No OAuth scope issues

**Cons:**
- ❌ Multiple OAuth flows (user's PDS + each hold)
- ❌ Complex credential management
- ❌ Poor UX (authenticate to each hold separately)

#### Option B: AppView as OAuth Client

AppView pre-registers with holds and uses its own credentials (not user's).

**Pros:**
- ✅ No OAuth scope issues
- ✅ Single OAuth flow for user
- ✅ Simpler credential management

**Cons:**
- ❌ Holds must trust AppView (centralization)
- ❌ Doesn't work for unknown holds
- ❌ Requires registration process

#### Option C: Public Hold API

Simplify by making holds public for reads, auth only for writes.

**Pros:**
- ✅ No OAuth complexity for reads
- ✅ Works offline (no PDS dependency)

**Cons:**
- ❌ Private holds still need auth
- ❌ Not standard ATProto pattern

#### Option D: Hybrid Service Token + API Key

Use service tokens when available, fall back to API keys for BYOS holds.

**Pros:**
- ✅ Optimal for default holds
- ✅ BYOS works with API keys
- ✅ Backward compatible

**Cons:**
- ❌ Two auth mechanisms
- ❌ Not pure ATProto

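Option D amounts to a small credential-selection step in AppView. The sketch below shows one possible shape for it. Everything here is hypothetical: the `credential` type, the `pickCredential` function, and in particular the `X-Hold-Api-Key` header name, since the source does not define how API keys would be transmitted.

```go
package main

import (
	"errors"
	"fmt"
)

// credential is the header AppView would attach to a hold request.
type credential struct {
	Header string
	Value  string
}

// apiKeyHeader is a made-up header name used only for this sketch.
const apiKeyHeader = "X-Hold-Api-Key"

// pickCredential prefers a service token and falls back to a per-hold
// API key (e.g. one the user registered for a BYOS hold).
func pickCredential(serviceToken string, apiKeys map[string]string, holdDID string) (credential, error) {
	if serviceToken != "" {
		return credential{Header: "Authorization", Value: "Bearer " + serviceToken}, nil
	}
	if key, ok := apiKeys[holdDID]; ok {
		return credential{Header: apiKeyHeader, Value: key}, nil
	}
	return credential{}, errors.New("no service token or API key for " + holdDID)
}

func main() {
	keys := map[string]string{"did:web:alice-storage.fly.dev": "demo-key"}

	// Default hold: a service token is available.
	c, _ := pickCredential("eyJ...", keys, "did:web:hold01.atcr.io")
	fmt.Println(c.Header, c.Value)

	// BYOS hold: no service token, so the API key is used.
	c, _ = pickCredential("", keys, "did:web:alice-storage.fly.dev")
	fmt.Println(c.Header, c.Value)
}
```
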
### Recommended Approach

**Short-term (MVP):**
1. Public holds (no auth needed for reads)
2. Default hold with service tokens (AppView-managed)
3. Document BYOS limitation

**Medium-term:**
1. Hybrid approach (service tokens + API key fallback)
2. Clear security model for hold operators

**Long-term:**
1. Explore direct user-to-hold OAuth
2. Credential helper manages multiple hold sessions
3. Auto-discover and authenticate to new holds

### Understanding getServiceAuth

**Purpose:** `com.atproto.server.getServiceAuth` returns a short-lived JWT, signed on the user's behalf by their PDS, that the caller can present to the service named in `aud`, optionally bound to a single method via `lxm`. It's a **temporary grant to a service outside of what you OAuth'd to**.

**How ATCR uses it:**
- User OAuth's to AppView (AppView gets broad access to the user's account)
- AppView needs to prove to the hold that the user authorized it
- AppView calls the user's PDS: "give me a token scoped for this hold"
- User's PDS issues a service token with narrow scope (e.g., `rpc:com.atproto.repo.getRecord?aud={holdDID}`)
- AppView presents this token to the hold as proof

**Industry usage:**
- `getServiceAuth` appears to be the intended pattern for inter-service auth
- Not widely used yet (ATProto ecosystem is young)
- Most apps use `transition:generic` scope for everything (too broad, not ideal)
- RPC permission scopes are finicky and not well documented

### Open Questions

1. **RPC permission format:** Can the `aud` field in RPC permissions support IP addresses? Is this a spec limitation or an implementation bug?
2. **Scope granularity:** What's the right balance between `transition:generic` (too broad) and fine-grained RPC scopes (finicky)?
3. **Dynamic discovery + auth:** How should AppView authenticate to arbitrary holds discovered from sailor profiles without pre-registration?
4. **Service token caching:** Should service tokens be cached across multiple requests? The current cache lifetime is 50 seconds (tokens live for 60); is this optimal?

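For question 4, the trade-off is easier to reason about with the cache written down. The sketch below is one possible shape for it, not AppView's actual cache: entries are keyed by user and hold, and they expire after a TTL chosen below the token lifetime (50 seconds against 60) so a cached token is never handed out after it has expired. The clock is injected as Unix seconds to keep the sketch easy to test; all names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type cachedToken struct {
	token     string
	expiresAt int64 // Unix seconds
}

// tokenCache keeps one service token per (user, hold) pair.
type tokenCache struct {
	mu      sync.Mutex
	ttl     int64        // seconds an entry stays valid
	now     func() int64 // injected clock, Unix seconds
	entries map[string]cachedToken
}

func newTokenCache(ttlSeconds int64, now func() int64) *tokenCache {
	return &tokenCache{ttl: ttlSeconds, now: now, entries: map[string]cachedToken{}}
}

// get returns a cached token if it is still fresh and otherwise calls
// fetch (e.g. a getServiceAuth request) and stores the result. The lock
// is held during fetch for simplicity, which serializes all lookups.
func (c *tokenCache) get(userDID, holdDID string, fetch func() (string, error)) (string, error) {
	key := userDID + "|" + holdDID
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[key]; ok && c.now() < e.expiresAt {
		return e.token, nil
	}
	tok, err := fetch()
	if err != nil {
		return "", err
	}
	c.entries[key] = cachedToken{token: tok, expiresAt: c.now() + c.ttl}
	return tok, nil
}

func main() {
	cache := newTokenCache(50, func() int64 { return time.Now().Unix() })
	fetches := 0
	fetch := func() (string, error) {
		fetches++ // stands in for a call to the user's PDS
		return fmt.Sprintf("token-%d", fetches), nil
	}
	for i := 0; i < 3; i++ {
		tok, _ := cache.get("did:plc:bob456", "did:web:hold01.atcr.io", fetch)
		fmt.Println(tok)
	}
	fmt.Println("PDS calls:", fetches)
}
```
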
## References

- **Stream.place embedded PDS:** https://streamplace.leaflet.pub/3lut7mgni5s2k/l-quote/6_318-6_554#6
- **ATProto OAuth spec:** https://atproto.com/specs/oauth
- **ATProto XRPC spec:** https://atproto.com/specs/xrpc
- **ATProto Service Auth:** https://docs.bsky.app/docs/api/com-atproto-server-get-service-auth
- **CID spec:** https://github.com/multiformats/cid
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec