# Embedded PDS Architecture for Hold Services

This document describes ATCR's hold-service architecture, which uses an embedded ATProto PDS (Personal Data Server) for access control and federation.
## Motivation

### The Fragmentation Problem
Several ATProto projects face similar challenges with large data storage:
| Project | Large Data | Metadata | Solution |
|---|---|---|---|
| tangled.org | Git objects | Issues, PRs, comments | External knot storage |
| stream.place | Video segments | Stream info, chat | Embedded "static PDS" |
| ATCR | Container blobs | Manifests, comments, builds | Embedded PDS in hold service |
Common problem: Large binary data can't realistically live in user PDSs, but application metadata needs a federated home.
ATCR's approach: Each hold service is a full ATProto actor with its own embedded PDS for shared data (captain + crew records, not user-specific data). This PDS stores access control and metadata about the hold itself.
## Current Architecture

### Hold Service Components
```
Hold Service (did:web:hold01.atcr.io)
├── Embedded PDS (SQLite carstore) - Shared data only
│   ├── Captain record (ownership metadata)
│   ├── Crew records (access control)
│   └── ATProto sync/repo endpoints
├── OCI multipart upload (XRPC)
│   ├── io.atcr.hold.initiateUpload
│   ├── io.atcr.hold.getPartUploadUrl
│   ├── io.atcr.hold.uploadPart
│   ├── io.atcr.hold.completeUpload
│   └── io.atcr.hold.abortUpload
└── Storage driver (S3, filesystem, etc.)
```
Important distinction:
- Hold's embedded PDS = Shared data (crew members, hold configuration)
- User's PDS = User-specific data (manifests, sailor profile, personal records)
- Hold's PDS does NOT store user-specific container data (that stays in user's own PDS)
### Records Structure
Captain record (hold ownership; a single record at `io.atcr.hold.captain/self`):

```json
{
  "$type": "io.atcr.hold.captain",
  "owner": "did:plc:alice123",
  "public": false,
  "deployedAt": "2025-10-14T...",
  "region": "iad",
  "provider": "fly.io"
}
```
Crew records (access control; one record per member at `io.atcr.hold.crew/{rkey}`):

```json
{
  "$type": "io.atcr.hold.crew",
  "member": "did:plc:bob456",
  "role": "admin",
  "permissions": ["blob:read", "blob:write"],
  "addedAt": "2025-10-14T..."
}
```
### ATProto PDS Endpoints

Standard ATProto sync endpoints:

- `GET /xrpc/com.atproto.sync.getRepo` - Download the repository as a CAR file
- `GET /xrpc/com.atproto.sync.getBlob` - Get a blob or a presigned download URL
- `GET /xrpc/com.atproto.sync.subscribeRepos` - Real-time crew changes
- `GET /xrpc/com.atproto.sync.listRepos` - List repositories

Repository management:

- `GET /xrpc/com.atproto.repo.describeRepo` - Repository metadata
- `GET /xrpc/com.atproto.repo.getRecord` - Get a specific record (captain/crew)
- `GET /xrpc/com.atproto.repo.listRecords` - List crew members
- `POST /xrpc/io.atcr.hold.requestCrew` - Request crew membership

DID resolution:

- `GET /.well-known/did.json` - DID document (did:web resolution)
- `GET /.well-known/atproto-did` - DID for handle resolution
### OCI Multipart Upload Flow

1. AppView gets a service token from the user's PDS:

   ```
   GET /xrpc/com.atproto.server.getServiceAuth?aud={holdDID}
   Response: { "token": "eyJ..." }
   ```

2. AppView initiates the multipart upload:

   ```
   POST /xrpc/io.atcr.hold.initiateUpload
   Authorization: Bearer {serviceToken}
   Body: { "digest": "sha256:abc..." }
   Response: { "uploadId": "xyz" }
   ```

3. For each part, request a presigned URL:

   ```
   POST /xrpc/io.atcr.hold.getPartUploadUrl
   Body: { "uploadId": "xyz", "partNumber": 1 }
   Response: { "url": "https://s3.../presigned" }
   ```

4. Upload the part to the S3 presigned URL:

   ```
   PUT {presignedURL}
   Body: [part data]
   ```

5. Complete the upload:

   ```
   POST /xrpc/io.atcr.hold.completeUpload
   Body: { "uploadId": "xyz", "digest": "sha256:abc...", "parts": [...] }
   ```
## Implementation Details

### Storage: Indigo Carstore with SQLite
```go
type HoldPDS struct {
	did      string
	carstore carstore.CarStore
	session  *carstore.DeltaSession // Provides blockstore interface
	repo     *repo.Repo
	dbPath   string
	uid      models.Uid // User ID for carstore (fixed: 1)
}
```
Storage location: a single SQLite file (`/var/lib/atcr-hold/hold.db`)
- Contains MST nodes, records, commits in carstore tables
- Handles compaction/cleanup automatically
- Migration path to Postgres if needed (same carstore API)
### Key Implementation Lessons

#### 1. Custom Record Types Need Manual CBOR Decoding
```go
// ❌ WRONG - Fails with "unrecognized lexicon type"
record, err := repo.GetRecord(ctx, path, &CrewRecord{})

// ✅ CORRECT - Manual CBOR decoding
recordCID, recBytes, err := repo.GetRecordBytes(ctx, path)
var crewRecord CrewRecord
err = crewRecord.UnmarshalCBOR(bytes.NewReader(*recBytes))
```
Indigo's lexicon system doesn't know about custom types like `io.atcr.hold.crew`.
#### 2. JSON and CBOR Struct Tags Must Match
```go
// ✅ CORRECT - JSON tags match CBOR tags
type CrewRecord struct {
	Type        string   `json:"$type" cborgen:"$type"`
	Member      string   `json:"member" cborgen:"member"`
	Role        string   `json:"role" cborgen:"role"`
	Permissions []string `json:"permissions" cborgen:"permissions"`
	AddedAt     string   `json:"addedAt" cborgen:"addedAt"`
}
```
CID verification requires identical bytes from JSON and CBOR encodings.
#### 3. MST ForEach Returns Full Paths
```go
// ✅ CORRECT - Extract just the rkey
err := repo.ForEach(ctx, "io.atcr.hold.crew", func(k string, v cid.Cid) error {
	// k = "io.atcr.hold.crew/3m37dr2ddit22"
	parts := strings.Split(k, "/")
	rkey := parts[len(parts)-1] // "3m37dr2ddit22"
	return nil
})
```
#### 4. CAR Files Must Include Full MST Path

For `com.atproto.sync.getRecord`, return a CAR file containing:

- Commit block - the repo head with its signature
- MST tree nodes - the path from the root to the record
- Record block - the actual record data

Use `util.NewLoggingBstore()` to capture all accessed blocks.
## IAM Challenges

### Current Implementation: Service Tokens
AppView uses `com.atproto.server.getServiceAuth` to get tokens for calling holds:

```
// AppView requests a service token from the user's PDS
GET /xrpc/com.atproto.server.getServiceAuth?aud={holdDID}&lxm=com.atproto.repo.getRecord

// PDS returns a short-lived token (60 seconds)
{ "token": "eyJ..." }

// AppView uses the token to authenticate to the hold
Authorization: Bearer eyJ...
```
### Known Issues

#### 1. RPC Permission Format with IP Addresses
Problem: Service-token RPC permissions don't work with IP addresses in the audience (`aud`) field:

```
Error: RPC permission format invalid
Permission: rpc:com.atproto.repo.getRecord?aud=172.28.0.3:8080#atcr_hold
Issue: IP address with port not supported in aud field
```
Impact: Local development with IP-based hold DIDs (e.g., `did:web:172.28.0.3:8080`) fails.

Workaround: Fall back to unauthenticated requests (works for public holds only), or use hostname-based DIDs.
#### 2. Dynamic Hold Discovery Limitation
Problem: AppView can only OAuth a user's default hold (configured in AppView), not holds dynamically discovered from sailor profiles.

Current limitation:

- User sets `defaultHold = "did:web:alice-storage.fly.dev"` in their sailor profile
- AppView discovers the hold DID when the user pushes
- AppView tries to get a service token for alice's hold from the user's PDS
- BUT: the user never OAuth'd through alice's hold, only through AppView's default hold
- Result: no service token is available, so AppView can't authenticate to alice's hold

Why this matters:

- Users can't seamlessly use BYOS (Bring Your Own Storage)
- Hold references in sailor profiles are non-functional
- This limits portability and decentralization goals
#### 3. Trust Model: "Trust but Verify"
Current approach:
- User OAuth's to AppView (credential helper flow)
- Hold has crew member record for user (authorization)
- AppView requests service token from user's PDS (proof)
- Hold validates service token from user's PDS (verification)
Philosophy: "Trust but verify"

- IF the user OAuth'd to AppView AND the hold has a crew record for the user → generally trust
- BUT we don't want AppView to be able to lie → we need proof from the user's PDS that it's actually the user
- Service tokens provide this proof (the user's PDS says "yes, I authorized this")
Challenge: Service tokens work for this model, but scope/permission format issues (see #1, #2) make it fragile in practice.
## Potential Solutions

### Option A: Direct User-to-Hold Authentication
Users authenticate directly to holds (bypassing AppView service tokens).
Pros:
- ✅ Clear trust model (user ↔ hold)
- ✅ Works with any hold (BYOS friendly)
- ✅ No OAuth scope issues
Cons:
- ❌ Multiple OAuth flows (user's PDS + each hold)
- ❌ Complex credential management
- ❌ Poor UX (authenticate to each hold separately)
### Option B: AppView as OAuth Client
AppView pre-registers with holds and uses its own credentials (not user's).
Pros:
- ✅ No OAuth scope issues
- ✅ Single OAuth flow for user
- ✅ Simpler credential management
Cons:
- ❌ Holds must trust AppView (centralization)
- ❌ Doesn't work for unknown holds
- ❌ Requires registration process
### Option C: Public Hold API
Simplify by making holds public for reads, auth only for writes.
Pros:
- ✅ No OAuth complexity for reads
- ✅ Works offline (no PDS dependency)
Cons:
- ❌ Private holds still need auth
- ❌ Not standard ATProto pattern
### Option D: Hybrid Service Token + API Key
Use service tokens when available, fall back to API keys for BYOS holds.
Pros:
- ✅ Optimal for default holds
- ✅ BYOS works with API keys
- ✅ Backward compatible
Cons:
- ❌ Two auth mechanisms
- ❌ Not pure ATProto
### Recommended Approach
Short-term (MVP):
- Public holds (no auth needed for reads)
- Default hold with service tokens (AppView-managed)
- Document BYOS limitation
Medium-term:
- Hybrid approach (service tokens + API key fallback)
- Clear security model for hold operators
Long-term:
- Explore direct user-to-hold OAuth
- Credential helper manages multiple hold sessions
- Auto-discover and authenticate to new holds
## Understanding getServiceAuth

Purpose: `com.atproto.server.getServiceAuth` issues a JWT that gives a service access to specific functions in the user's PDS. It's a temporary, scoped grant to a service outside of what the client originally OAuth'd to.
How ATCR uses it:

- User OAuth's to AppView (gets broad access to their account)
- AppView needs to prove to the hold that the user authorized it
- AppView calls the user's PDS: "give me a token scoped for this hold"
- The user's PDS issues a service token with a narrow scope (e.g., `rpc:com.atproto.repo.getRecord?aud={holdDID}`)
- AppView presents this token to the hold as proof
Industry usage:

- `getServiceAuth` appears to be the intended pattern for inter-service auth
- Not widely used yet (the ATProto ecosystem is young)
- Most apps use the `transition:generic` scope for everything (too broad, not ideal)
- RPC permission scopes are finicky and not well documented
## Open Questions

- RPC permission format: Can the `aud` field in RPC permissions support IP addresses? Is this a spec limitation or an implementation bug?
- Scope granularity: What's the right balance between `transition:generic` (too broad) and fine-grained RPC scopes (finicky)?
- Dynamic discovery + auth: How should AppView authenticate to arbitrary holds discovered from sailor profiles without pre-registration?
- Service token caching: Should service tokens be cached across multiple requests? Current: a 50-second cache; is this optimal?
## References
- Stream.place embedded PDS: https://streamplace.leaflet.pub/3lut7mgni5s2k/l-quote/6_318-6_554#6
- ATProto OAuth spec: https://atproto.com/specs/oauth
- ATProto XRPC spec: https://atproto.com/specs/xrpc
- ATProto Service Auth: https://docs.bsky.app/docs/api/com-atproto-server-get-service-auth
- CID spec: https://github.com/multiformats/cid
- OCI Distribution Spec: https://github.com/opencontainers/distribution-spec