A container registry that uses the AT Protocol for manifest storage and S3 for blob storage.
# ATCR Quota System

This document describes ATCR's storage quota implementation, which uses ATProto records for per-user layer tracking.

## Table of Contents

- [Overview](#overview)
- [Quota Model](#quota-model)
- [Layer Record Schema](#layer-record-schema)
- [Quota Calculation](#quota-calculation)
- [Push Flow](#push-flow)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Configuration](#configuration)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:

1. **Limit storage consumption** on shared hold services
2. **Provide transparency** (show users their storage usage)
3. **Enable fair billing** (users pay for what they use)

**Key principle:** Users pay for the layers they reference, deduplicated per user. If you push the same layer in multiple images, you only pay for it once.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Creates 3 layer records in hold's PDS
→ Alice's quota: 300MB (3 unique layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Creates 3 more layer records (A, B again, plus D)
→ Alice's quota: 400MB (4 unique layers: A, B, C, D)
→ Layers A, B appear twice in records but are deduplicated in the quota calculation

Bob pushes his-app:latest (layers A, E)
→ Creates 2 layer records for Bob
→ Bob's quota: 200MB (2 unique layers: A, E)
→ Layer A is shared with Alice in S3, but Bob pays for his own usage

Physical S3 storage: 500MB (A, B, C, D, E - deduplicated globally)
Alice's quota: 400MB
Bob's quota: 200MB
```

## Quota Model

### Everyone Pays for What They Upload

Each user is charged for all unique layers they reference, regardless of whether those layers already exist in S3 from other users' uploads.
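The dedup arithmetic in the scenario above can be checked with a short Go sketch. This is a self-contained illustration, not ATCR code: `layerRecord` and `uniqueUsage` are hypothetical stand-ins for the real `LayerRecord` type and quota implementation described later in this document.

```go
package main

import "fmt"

// layerRecord is a trimmed, hypothetical stand-in for the
// io.atcr.hold.layer record: only the fields quota math needs.
type layerRecord struct {
	digest string
	size   int64
}

// uniqueUsage sums layer sizes after deduplicating by digest,
// mirroring the quota model: duplicate references to the same
// layer are only counted once per user.
func uniqueUsage(records []layerRecord) int64 {
	unique := make(map[string]int64) // digest -> size
	for _, r := range records {
		unique[r.digest] = r.size
	}
	var total int64
	for _, size := range unique {
		total += size
	}
	return total
}

func main() {
	const mb = 1024 * 1024
	// Alice's layer records after pushing myapp:v1 (A, B, C)
	// and myapp:v2 (A, B, D), each layer 100MB:
	alice := []layerRecord{
		{"sha256:a", 100 * mb}, {"sha256:b", 100 * mb}, {"sha256:c", 100 * mb}, // v1
		{"sha256:a", 100 * mb}, {"sha256:b", 100 * mb}, {"sha256:d", 100 * mb}, // v2
	}
	fmt.Printf("Alice's quota: %dMB\n", uniqueUsage(alice)/mb) // Alice's quota: 400MB
}
```

Bob's usage would be computed the same way over his own records: the shared layer A counts toward both users' quotas even though S3 stores it once.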

**Why this model?**
- **Simple mental model**: "I pushed 500MB of layers, I use 500MB of quota"
- **Predictable**: Your quota doesn't change based on other users' actions
- **Clean deletion**: Delete manifest → layer records removed → quota freed
- **No cross-user dependencies**: Users are isolated

**Trade-off:**
- Total claimed storage can exceed physical S3 storage
- This is acceptable - deduplication is an operational benefit for ATCR, not a billing feature

### ATProto-Native Storage

Layer tracking uses ATProto records stored in the hold's embedded PDS:
- **Collection**: `io.atcr.hold.layer`
- **Repository**: The hold's DID (e.g., `did:web:hold01.atcr.io`)
- **Records**: One per manifest-layer relationship (TID-based keys)

This approach:
- Keeps quota data in ATProto (no separate database)
- Enables standard ATProto sync/query mechanisms
- Provides a full audit trail of layer usage

## Layer Record Schema

### LayerRecord

```go
// pkg/atproto/lexicon.go

type LayerRecord struct {
	Type      string `json:"$type"`     // "io.atcr.hold.layer"
	Digest    string `json:"digest"`    // Layer digest (sha256:abc123...)
	Size      int64  `json:"size"`      // Size in bytes
	MediaType string `json:"mediaType"` // e.g., "application/vnd.oci.image.layer.v1.tar+gzip"
	Manifest  string `json:"manifest"`  // at://did:plc:alice/io.atcr.manifest/abc123
	UserDID   string `json:"userDid"`   // User's DID for quota grouping
	CreatedAt string `json:"createdAt"` // ISO 8601 timestamp
}
```

### Record Key

Records use a TID (timestamp-based ID) as the rkey.
This means:
- Multiple records can exist for the same layer (from different manifests)
- Deduplication happens at query time, not at storage time
- Simple append-only writes on manifest push

### Example Records

```
Manifest A (layers X, Y, Z) → creates 3 records
Manifest B (layers X, W)    → creates 2 records

io.atcr.hold.layer collection:
┌──────────────┬────────┬──────┬──────────────────────────────────┬───────────────┐
│ rkey (TID)   │ digest │ size │ manifest                         │ userDid       │
├──────────────┼────────┼──────┼──────────────────────────────────┼───────────────┤
│ 3jui7...001  │ X      │ 100  │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...002  │ Y      │ 200  │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...003  │ Z      │ 150  │ at://did:plc:alice/.../manifestA │ did:plc:alice │
│ 3jui7...004  │ X      │ 100  │ at://did:plc:alice/.../manifestB │ did:plc:alice │ ← duplicate digest
│ 3jui7...005  │ W      │ 300  │ at://did:plc:alice/.../manifestB │ did:plc:alice │
└──────────────┴────────┴──────┴──────────────────────────────────┴───────────────┘
```

## Quota Calculation

### Query: User's Unique Storage

```sql
-- Calculate quota by deduplicating layers
SELECT SUM(size) FROM (
    SELECT DISTINCT digest, size
    FROM io.atcr.hold.layer
    WHERE userDid = ?
129) 130``` 131 132Using the example above: 133- Layer X appears twice but counted once: 100 134- Layers Y, Z, W counted once each: 200 + 150 + 300 135- **Total: 750 bytes** 136 137### Implementation 138 139```go 140// pkg/hold/quota/quota.go 141 142type QuotaManager struct { 143 pds *pds.Server // Hold's embedded PDS 144} 145 146// GetUsage calculates a user's current quota usage 147func (q *QuotaManager) GetUsage(ctx context.Context, userDID string) (int64, error) { 148 // List all layer records for this user 149 records, err := q.pds.ListRecords(ctx, LayerCollection, userDID) 150 if err != nil { 151 return 0, err 152 } 153 154 // Deduplicate by digest 155 uniqueLayers := make(map[string]int64) // digest -> size 156 for _, record := range records { 157 var layer LayerRecord 158 if err := json.Unmarshal(record.Value, &layer); err != nil { 159 continue 160 } 161 if layer.UserDID == userDID { 162 uniqueLayers[layer.Digest] = layer.Size 163 } 164 } 165 166 // Sum unique layer sizes 167 var total int64 168 for _, size := range uniqueLayers { 169 total += size 170 } 171 172 return total, nil 173} 174 175// CheckQuota returns true if user has space for additional bytes 176func (q *QuotaManager) CheckQuota(ctx context.Context, userDID string, additional int64, limit int64) (bool, int64, error) { 177 current, err := q.GetUsage(ctx, userDID) 178 if err != nil { 179 return false, 0, err 180 } 181 182 return current+additional <= limit, current, nil 183} 184``` 185 186### Quota Response 187 188```go 189type QuotaInfo struct { 190 Used int64 `json:"used"` // Current usage (deduplicated) 191 Limit int64 `json:"limit"` // User's quota limit 192 Available int64 `json:"available"` // Remaining space 193} 194``` 195 196## Push Flow 197 198### Step-by-Step: User Pushes Image 199 200``` 201┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ 202│ Client │ │ AppView │ │ Hold │ │ User PDS │ 203│ (Docker) │ │ │ │ Service │ │ │ 204└──────────┘ └──────────┘ └──────────┘ └──────────┘ 205 │ 
     │  1. Upload blobs     │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │  2. Route to hold    │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │  3. Store in S3      │
     │                      │                      │                      │
     │  4. PUT manifest     │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │  5. Calculate quota  │                      │
     │                      │     impact for new   │                      │
     │                      │     layers           │                      │
     │                      │                      │                      │
     │                      │  6. Check quota limit│                      │
     │                      ├─────────────────────>│                      │
     │                      │<─────────────────────┤                      │
     │                      │                      │                      │
     │                      │  7. Store manifest   │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │  8. Create layer     │                      │
     │                      │     records          │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │  9. Write to         │
     │                      │                      │     hold's PDS       │
     │                      │                      │                      │
     │  10. 201 Created     │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/storage/routing_repository.go

func (r *RoutingRepository) PutManifest(ctx context.Context, manifest distribution.Manifest) error {
	// Parse manifest to get its layers
	layers := extractLayers(manifest)

	// Get the user's current unique layers from the hold
	existingLayers, err := r.holdClient.GetUserLayers(ctx, r.userDID)
	if err != nil {
		return err
	}
	existingSet := makeDigestSet(existingLayers)

	// Calculate quota impact (only new unique layers count)
	var quotaImpact int64
	for _, layer := range layers {
		if !existingSet[layer.Digest] {
			quotaImpact += layer.Size
		}
	}

	// Check quota
	ok, current, err := r.quotaManager.CheckQuota(ctx, r.userDID, quotaImpact, r.quotaLimit)
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("quota exceeded: used=%d, impact=%d, limit=%d",
			current, quotaImpact, r.quotaLimit)
	}

	// Store the manifest in the user's PDS
	manifestURI, err := r.atprotoClient.PutManifest(ctx, manifest)
	if err != nil {
		return err
	}

	// Create layer records in the hold's PDS
	for _, layer := range layers {
		record := LayerRecord{
			Type:      "io.atcr.hold.layer",
			Digest:    layer.Digest,
			Size:      layer.Size,
			MediaType: layer.MediaType,
			Manifest:  manifestURI,
			UserDID:   r.userDID,
			CreatedAt: time.Now().Format(time.RFC3339),
		}
		if err := r.holdClient.CreateLayerRecord(ctx, record); err != nil {
			log.Printf("Warning: failed to create layer record: %v", err)
			// Continue - reconciliation will fix this later
		}
	}

	return nil
}
```

### Quota Check Timing

Quota is checked when the **manifest is pushed** (after the blobs are uploaded):
- Blobs are uploaded first via presigned URLs
- Pushing the manifest last triggers the quota check
- If the quota is exceeded, the manifest is rejected (orphaned blobs are cleaned up by GC)

This matches Harbor's approach and is the industry standard.

## Delete Flow

### Manifest Deletion

When a user deletes a manifest:

```
┌──────────┐           ┌──────────┐           ┌──────────┐           ┌──────────┐
│   User   │           │ AppView  │           │   Hold   │           │ User PDS │
│    UI    │           │          │           │ Service  │           │          │
└──────────┘           └──────────┘           └──────────┘           └──────────┘
     │                      │                      │                      │
     │  DELETE manifest     │                      │                      │
     ├─────────────────────>│                      │                      │
     │                      │                      │                      │
     │                      │  1. Delete manifest  │                      │
     │                      │     from user's PDS  │                      │
     │                      ├──────────────────────┼─────────────────────>│
     │                      │                      │                      │
     │                      │  2. Delete layer     │                      │
     │                      │     records for this │                      │
     │                      │     manifest         │                      │
     │                      ├─────────────────────>│                      │
     │                      │                      │  3. Remove records   │
     │                      │                      │     where manifest   │
     │                      │                      │     == deleted URI   │
     │                      │                      │                      │
     │  4. 204 No Content   │                      │                      │
     │<─────────────────────┤                      │                      │
```

### Implementation

```go
// pkg/appview/handlers/manifest.go

func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	userDID := auth.GetDID(ctx)
	repository := chi.URLParam(r, "repository")
	digest := chi.URLParam(r, "digest")
	log.Printf("delete manifest %s@%s for %s", repository, digest, userDID)

	// Build the manifest URI before deletion
	manifestURI := fmt.Sprintf("at://%s/%s/%s", userDID, ManifestCollection, digest)

	// Delete the manifest from the user's PDS
	if err := h.atprotoClient.DeleteRecord(ctx, ManifestCollection, digest); err != nil {
		http.Error(w, "failed to delete manifest", http.StatusInternalServerError)
		return
	}

	// Delete associated layer records from the hold's PDS
	if err := h.holdClient.DeleteLayerRecords(ctx, manifestURI); err != nil {
		log.Printf("Warning: failed to delete layer records: %v", err)
		// Continue - reconciliation will clean up
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service: Delete Layer Records

```go
// pkg/hold/pds/xrpc.go

func (s *Server) DeleteLayerRecords(ctx context.Context, manifestURI string) error {
	// List all layer records
	records, err := s.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}

	// Delete records matching this manifest
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		if layer.Manifest == manifestURI {
			if err := s.DeleteRecord(ctx, LayerCollection, record.RKey); err != nil {
				log.Printf("Failed to delete layer record %s: %v", record.RKey, err)
			}
		}
	}

	return nil
}
```

### Quota After Deletion

After deleting a manifest:
- Layer records for that manifest are removed
- Quota is recalculated with the `SELECT DISTINCT` query
- If a layer was only in the deleted manifest → quota decreases
- If the layer exists in
  other manifests → quota unchanged (still deduplicated)

## Garbage Collection

### Orphaned Blobs

Orphaned blobs accumulate when:
1. A manifest push fails after blobs are uploaded
2. The quota is exceeded and the manifest is rejected
3. A user deletes a manifest - its blobs may no longer be referenced

### GC Process

```go
// pkg/hold/gc/gc.go

func (gc *GarbageCollector) Run(ctx context.Context) error {
	// Step 1: Get all referenced digests from layer records
	records, err := gc.pds.ListRecords(ctx, LayerCollection, "")
	if err != nil {
		return err
	}

	referenced := make(map[string]bool)
	for _, record := range records {
		var layer LayerRecord
		if err := json.Unmarshal(record.Value, &layer); err != nil {
			continue
		}
		referenced[layer.Digest] = true
	}

	log.Printf("Found %d referenced blobs", len(referenced))

	// Step 2: Walk S3 blobs and delete unreferenced ones
	var deleted, reclaimed int64
	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fi storagedriver.FileInfo) error {
		if fi.IsDir() {
			return nil
		}

		digest := extractDigestFromPath(fi.Path())
		if !referenced[digest] {
			size := fi.Size()
			if err := gc.driver.Delete(ctx, fi.Path()); err != nil {
				log.Printf("Failed to delete %s: %v", digest, err)
				return nil
			}
			deleted++
			reclaimed += size
			log.Printf("GC: deleted %s (%d bytes)", digest, size)
		}
		return nil
	})

	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes", deleted, reclaimed)
	return err
}
```

### GC Schedule

```bash
# Environment variables
GC_ENABLED=true
GC_INTERVAL=24h   # Daily by default
```

## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# Quota Configuration
QUOTA_ENABLED=true
QUOTA_DEFAULT_LIMIT=10737418240  # 10GB in bytes

# Garbage Collection
GC_ENABLED=true
GC_INTERVAL=24h
483``` 484 485### Quota Limits by Bytes 486 487| Size | Bytes | 488|------|-------| 489| 1 GB | 1073741824 | 490| 5 GB | 5368709120 | 491| 10 GB | 10737418240 | 492| 50 GB | 53687091200 | 493| 100 GB | 107374182400 | 494 495## Future Enhancements 496 497### 1. Quota API Endpoints 498 499``` 500GET /xrpc/io.atcr.hold.getQuota?did={userDID} - Get user's quota usage 501GET /xrpc/io.atcr.hold.getQuotaBreakdown - Storage by repository 502``` 503 504### 2. Quota Alerts 505 506- Warning thresholds at 80%, 90%, 95% 507- Email/webhook notifications 508- Grace period before hard enforcement 509 510### 3. Tier-Based Quotas (Implemented) 511 512ATCR uses quota tiers to limit storage per crew member, configured via `quotas.yaml`: 513 514```yaml 515# quotas.yaml 516tiers: 517 deckhand: # Entry-level crew 518 quota: 5GB 519 bosun: # Mid-level crew 520 quota: 50GB 521 quartermaster: # High-level crew 522 quota: 100GB 523 524defaults: 525 new_crew_tier: deckhand # Default tier for new crew members 526``` 527 528| Tier | Limit | Description | 529|------|-------|-------------| 530| deckhand | 5 GB | Entry-level crew member | 531| bosun | 50 GB | Mid-level crew member | 532| quartermaster | 100 GB | Senior crew member | 533| owner (captain) | Unlimited | Hold owner always has unlimited | 534 535**Tier Resolution:** 5361. If user is captain (owner) → unlimited 5372. If crew member has explicit tier → use that tier's limit 5383. If crew member has no tier → use `defaults.new_crew_tier` 5394. If default tier not found → unlimited 540 541**Crew Record Example:** 542```json 543{ 544 "$type": "io.atcr.hold.crew", 545 "member": "did:plc:alice123", 546 "role": "writer", 547 "permissions": ["blob:write"], 548 "tier": "bosun", 549 "addedAt": "2026-01-04T12:00:00Z" 550} 551``` 552 553### 4. Rate Limiting 554 555Pull rate limits (Docker Hub style): 556- Anonymous: 100 pulls per 6 hours per IP 557- Authenticated: 200 pulls per 6 hours 558- Paid: Unlimited 559 560### 5. 

- Stripe integration for additional storage
- $0.10/GB/month pricing (industry standard)

## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec

---

**Document Version:** 2.0
**Last Updated:** 2026-01-04
**Model:** Per-user layer tracking with ATProto records