# ATCR Quota System

This document describes ATCR's storage quota implementation, inspired by Harbor's proven approach to per-project blob tracking with deduplication.

## Table of Contents

- [Overview](#overview)
- [Harbor's Approach (Reference Implementation)](#harbors-approach-reference-implementation)
- [Storage Options](#storage-options)
- [Quota Data Model](#quota-data-model)
- [Push Flow (Detailed)](#push-flow-detailed)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Quota Reconciliation](#quota-reconciliation)
- [Configuration](#configuration)
- [Trade-offs & Design Decisions](#trade-offs--design-decisions)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:
1. **Limit storage consumption** on shared hold services
2. **Track actual S3 costs** (what new data was added)
3. **Benefit from deduplication** (users only pay once per layer)
4. **Provide transparency** (show users their storage usage)

**Key principle:** Users pay for layers they've uploaded, but only ONCE per layer regardless of how many images reference it.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Alice's quota: +300MB (all new layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Layers A, B already claimed by Alice
→ Layer D is new (100MB)
→ Alice's quota: +100MB (only D is new)
→ Total: 400MB

Bob pushes his-app:latest (layers A, E)
→ Layer A already exists in S3 (uploaded by Alice)
→ Bob claims it for first time → +100MB to Bob's quota
→ Layer E is new → +100MB to Bob's quota
→ Bob's quota: 200MB

Physical S3 storage: 500MB (A, B, C, D, E)
Claimed storage: 600MB (Alice: 400MB, Bob: 200MB)
Deduplication savings: 100MB (layer A shared)
```
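The accounting rule behind this scenario is small enough to sketch directly. The following is a minimal, self-contained Go illustration of per-user layer claims; names like `claims` and `claimLayer` are illustrative only, not part of the ATCR codebase:

```go
package main

import "fmt"

// claims maps a user's DID to the set of layer digests they have claimed,
// with each layer's size in bytes.
type claims map[string]map[string]int64

// claimLayer charges a layer to a user exactly once: re-claiming a digest
// the user already holds costs nothing, no matter how many images use it.
func (c claims) claimLayer(did, digest string, size int64) (impact int64) {
	if c[did] == nil {
		c[did] = make(map[string]int64)
	}
	if _, ok := c[did][digest]; ok {
		return 0 // already claimed by this user: free
	}
	c[did][digest] = size
	return size
}

// used sums the sizes of all layers the user has claimed.
func (c claims) used(did string) (total int64) {
	for _, size := range c[did] {
		total += size
	}
	return total
}

func main() {
	c := claims{}
	const mb = int64(1024 * 1024)

	// Alice pushes myapp:v1 (A, B, C) and myapp:v2 (A, B, D).
	for _, d := range []string{"A", "B", "C", "A", "B", "D"} {
		c.claimLayer("did:plc:alice", d, 100*mb)
	}
	// Bob pushes his-app:latest (A, E); layer A exists globally but is new to Bob.
	c.claimLayer("did:plc:bob", "A", 100*mb)
	c.claimLayer("did:plc:bob", "E", 100*mb)

	fmt.Println(c.used("did:plc:alice")/mb, "MB") // 400 MB
	fmt.Println(c.used("did:plc:bob")/mb, "MB")   // 200 MB
}
```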
## Harbor's Approach (Reference Implementation)

Harbor is built on distribution/distribution (same as ATCR) and implements quotas as middleware. Their approach:

### Key Insights from Harbor

1. **"Shared blobs are only computed once per project"**
   - Each project tracks which blobs it has uploaded
   - Same blob used in multiple images counts only once per project
   - Different projects claiming the same blob each pay for it

2. **Quota checked when manifest is pushed**
   - Blobs upload first (presigned URLs, can't intercept)
   - Manifest pushed last → quota check happens here
   - Can reject manifest if quota exceeded (orphaned blobs cleaned by GC)

3. **Middleware-based implementation**
   - distribution/distribution has NO built-in quota support
   - Harbor added it as request preprocessing middleware
   - Uses database (PostgreSQL) or Redis for quota storage

4. **Per-project ownership model**
   - Blobs are physically deduplicated globally
   - Quota accounting is logical (per-project claims)
   - Total claimed storage can exceed physical storage

### References

- Harbor Quota Documentation: https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- Harbor Source: https://github.com/goharbor/harbor (see `src/controller/quota`)

## Storage Options

The hold service needs to store quota data somewhere. Two options:

### Option 1: S3-Based Storage (Recommended for BYOS)

Store quota metadata alongside blobs in the same S3 bucket:

```
Bucket structure:
/docker/registry/v2/blobs/sha256/ab/abc123.../data   ← actual blobs
/atcr/quota/did:plc:alice.json                       ← quota tracking
/atcr/quota/did:plc:bob.json
```

**Pros:**
- ✅ No separate database needed
- ✅ Single S3 bucket (better UX - no second bucket to configure)
- ✅ Quota data lives with the blobs
- ✅ Hold service stays relatively stateless
- ✅ Works with any S3-compatible service (Storj, Minio, Upcloud, Fly.io)

**Cons:**
- ❌ Slower than local database (network round-trip)
- ❌ Eventual consistency issues
- ❌ Race conditions on concurrent updates
- ❌ Extra S3 API costs (GET/PUT per upload)

**Performance:**
- Each blob upload: 1 HEAD (blob exists?) + 1 GET (quota) + 1 PUT (update quota)
- Typical latency: 100-200ms total overhead
- For high-throughput registries, consider SQLite

### Option 2: SQLite Database (Recommended for Shared Holds)

Local database in hold service:

```bash
/var/lib/atcr/hold-quota.db
```

**Pros:**
- ✅ Fast local queries (no network latency)
- ✅ ACID transactions (no race conditions)
- ✅ Efficient for high-throughput registries
- ✅ Can use foreign keys and joins

**Cons:**
- ❌ Makes hold service stateful (persistent volume needed)
- ❌ Not ideal for ephemeral BYOS deployments
- ❌ Backup/restore complexity
- ❌ Multi-instance scaling requires shared database

**Schema:**
```sql
CREATE TABLE user_quotas (
    did TEXT PRIMARY KEY,
    quota_limit INTEGER NOT NULL DEFAULT 10737418240, -- 10GB
    quota_used INTEGER NOT NULL DEFAULT 0,
    updated_at TIMESTAMP
);

CREATE TABLE claimed_layers (
    did TEXT NOT NULL,
    digest TEXT NOT NULL,
    size INTEGER NOT NULL,
    claimed_at TIMESTAMP,
    PRIMARY KEY(did, digest)
);
```
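With the SQLite backend, the existence check and the quota charge can run in a single transaction. A minimal sketch against the schema above (assuming the `github.com/mattn/go-sqlite3` driver; `ClaimLayer` is illustrative, and a `user_quotas` row is assumed to already exist for the DID):

```go
package quota

import (
	"database/sql"
	"fmt"

	_ "github.com/mattn/go-sqlite3" // assumed driver, registered as "sqlite3"
)

// ClaimLayer atomically charges a layer to a user, enforcing the quota limit.
// It returns the quota impact: 0 if the user already claimed this digest.
func ClaimLayer(db *sql.DB, did, digest string, size int64) (int64, error) {
	tx, err := db.Begin()
	if err != nil {
		return 0, err
	}
	defer tx.Rollback() // no-op once Commit succeeds

	// INSERT OR IGNORE: a duplicate (did, digest) claim is free.
	res, err := tx.Exec(
		`INSERT OR IGNORE INTO claimed_layers (did, digest, size, claimed_at)
		 VALUES (?, ?, ?, CURRENT_TIMESTAMP)`, did, digest, size)
	if err != nil {
		return 0, err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		return 0, tx.Commit() // already claimed: no quota impact
	}

	// Charge the layer, but only if it still fits under the limit.
	res, err = tx.Exec(
		`UPDATE user_quotas
		 SET quota_used = quota_used + ?, updated_at = CURRENT_TIMESTAMP
		 WHERE did = ? AND quota_used + ? <= quota_limit`, size, did, size)
	if err != nil {
		return 0, err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		// Returning without Commit rolls back the claimed_layers insert too.
		return 0, fmt.Errorf("quota exceeded for %s", did)
	}
	return size, tx.Commit()
}
```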
### Recommendation

- **BYOS (user-owned holds):** S3-based (keeps hold service ephemeral)
- **Shared holds (multi-user):** SQLite (better performance and consistency)
- **High-traffic production:** SQLite or PostgreSQL (Harbor uses this)

## Quota Data Model

### Quota File Format (S3-based)

```json
{
  "did": "did:plc:alice123",
  "limit": 10737418240,
  "used": 5368709120,
  "claimed_layers": {
    "sha256:abc123...": 104857600,
    "sha256:def456...": 52428800,
    "sha256:789ghi...": 209715200
  },
  "last_updated": "2025-10-09T12:34:56Z",
  "version": 1
}
```

**Fields:**
- `did`: User's ATProto DID
- `limit`: Maximum storage in bytes (default: 10GB)
- `used`: Current storage usage in bytes (sum of claimed_layers)
- `claimed_layers`: Map of digest → size for all layers user has uploaded
- `last_updated`: Timestamp of last quota update
- `version`: Schema version for future migrations
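The handlers later in this document assume a Go struct mirroring this JSON file. A plausible definition (the exact field set in ATCR may differ):

```go
package quota

import "time"

// Quota mirrors the per-user quota file stored at /atcr/quota/{did}.json.
type Quota struct {
	DID           string           `json:"did"`
	Limit         int64            `json:"limit"`          // bytes
	Used          int64            `json:"used"`           // bytes; sum of ClaimedLayers
	ClaimedLayers map[string]int64 `json:"claimed_layers"` // digest -> size
	LastUpdated   time.Time        `json:"last_updated"`
	Version       int              `json:"version"`
}
```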
### Why Track Individual Layers?

**Q: Can't we just track a counter?**

**A: We need layer tracking for:**

1. **Deduplication detection**
   - Check if user already claimed a layer → free upload
   - Example: Updating an image reuses most layers

2. **Accurate deletes**
   - When a manifest is deleted, only decrement layers that none of the user's other manifests still reference
   - User may have 5 images sharing layer A - deleting 1 image doesn't free layer A

3. **Quota reconciliation**
   - Verify quota matches reality by listing user's manifests
   - Recalculate from layers in manifests vs claimed_layers map

4. **Auditing**
   - "Show me what I'm storing"
   - Users can see which layers consume their quota

## Push Flow (Detailed)

### Step-by-Step: User Pushes Image

```
┌──────────┐                 ┌──────────┐                 ┌──────────┐
│  Client  │                 │   Hold   │                 │    S3    │
│ (Docker) │                 │ Service  │                 │  Bucket  │
└──────────┘                 └──────────┘                 └──────────┘
     │                            │                            │
     │ 1. PUT /v2/.../blobs/      │                            │
     │    upload?digest=sha256:abc│                            │
     ├───────────────────────────>│                            │
     │                            │                            │
     │                            │ 2. Check if blob exists    │
     │                            │    (Stat/HEAD request)     │
     │                            ├───────────────────────────>│
     │                            │<───────────────────────────┤
     │                            │    200 OK (exists) or      │
     │                            │    404 Not Found           │
     │                            │                            │
     │                            │ 3. Read user quota         │
     │                            │    GET /atcr/quota/{did}   │
     │                            ├───────────────────────────>│
     │                            │<───────────────────────────┤
     │                            │    quota.json              │
     │                            │                            │
     │                            │ 4. Calculate quota impact  │
     │                            │    - If digest in          │
     │                            │      claimed_layers: 0     │
     │                            │    - Else: size            │
     │                            │                            │
     │                            │ 5. Check quota limit       │
     │                            │    used + impact <= limit? │
     │                            │                            │
     │                            │ 6. Update quota            │
     │                            │    PUT /atcr/quota/{did}   │
     │                            ├───────────────────────────>│
     │                            │<───────────────────────────┤
     │                            │    200 OK                  │
     │                            │                            │
     │ 7. Presigned URL           │                            │
     │<───────────────────────────┤                            │
     │    {url: "https://s3..."}  │                            │
     │                            │                            │
     │ 8. Upload blob to S3       │                            │
     ├────────────────────────────┼───────────────────────────>│
     │                            │                            │
     │ 9. 200 OK                  │                            │
     │<───────────────────────────┼────────────────────────────┤
     │                            │                            │
```

### Implementation (Pseudocode)

```go
// cmd/hold/main.go - HandlePutPresignedURL

func (s *HoldService) HandlePutPresignedURL(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()

    var req PutPresignedURLRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid request body", http.StatusBadRequest)
        return
    }

    // Step 1: Check if blob already exists in S3.
    // req.Digest is "<algorithm>:<hex>", e.g. "sha256:abc123..."
    algorithm, hex, _ := strings.Cut(req.Digest, ":")
    blobPath := fmt.Sprintf("/docker/registry/v2/blobs/%s/%s/%s/data",
        algorithm, hex[:2], hex)

    _, err := s.driver.Stat(ctx, blobPath)
    blobExists := err == nil

    // Step 2: Read quota from S3 (or SQLite)
    quota, err := s.quotaManager.GetQuota(req.DID)
    if err != nil {
        // First upload - create quota with defaults
        quota = &Quota{
            DID:           req.DID,
            Limit:         s.config.QuotaDefaultLimit,
            Used:          0,
            ClaimedLayers: make(map[string]int64),
        }
    }

    // Step 3: Calculate quota impact
    quotaImpact := req.Size // Default: assume new layer

    if _, alreadyClaimed := quota.ClaimedLayers[req.Digest]; alreadyClaimed {
        // User already uploaded this layer before
        quotaImpact = 0
        log.Printf("Layer %s already claimed by %s, no quota impact",
            req.Digest, req.DID)
    } else if blobExists {
        // Blob exists in S3 (uploaded by another user), but this user is
        // claiming it for the first time - it still counts against their quota
        log.Printf("Layer %s exists globally but new to %s, quota impact: %d",
            req.Digest, req.DID, quotaImpact)
    } else {
        // Brand new blob - will be uploaded to S3
        log.Printf("New layer %s for %s, quota impact: %d",
            req.Digest, req.DID, quotaImpact)
    }

    // Step 4: Check quota limit
    if quota.Used+quotaImpact > quota.Limit {
        http.Error(w, fmt.Sprintf(
            "quota exceeded: used=%d, impact=%d, limit=%d",
            quota.Used, quotaImpact, quota.Limit,
        ), http.StatusPaymentRequired) // 402
        return
    }

    // Step 5: Update quota (optimistic - before upload completes)
    quota.Used += quotaImpact
    if quotaImpact > 0 {
        quota.ClaimedLayers[req.Digest] = req.Size
    }
    quota.LastUpdated = time.Now()

    if err := s.quotaManager.SaveQuota(quota); err != nil {
        http.Error(w, "failed to update quota", http.StatusInternalServerError)
        return
    }

    // Step 6: Generate presigned URL
    presignedURL, err := s.getUploadURL(ctx, req.Digest, req.Size, req.DID)
    if err != nil {
        // Best-effort rollback of the quota update on error; only remove the
        // claim if this request actually added it
        quota.Used -= quotaImpact
        if quotaImpact > 0 {
            delete(quota.ClaimedLayers, req.Digest)
        }
        s.quotaManager.SaveQuota(quota)

        http.Error(w, "failed to generate presigned URL", http.StatusInternalServerError)
        return
    }

    // Step 7: Return presigned URL + quota info
    resp := PutPresignedURLResponse{
        URL:       presignedURL,
        ExpiresAt: time.Now().Add(15 * time.Minute),
        QuotaInfo: QuotaInfo{
            Used:           quota.Used,
            Limit:          quota.Limit,
            Available:      quota.Limit - quota.Used,
            Impact:         quotaImpact,
            AlreadyClaimed: quotaImpact == 0,
        },
    }

    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(resp)
}
```

### Race Condition Handling

**Problem:** Two concurrent uploads of the same blob

```
Time     User A                      User B
0ms      Upload layer X (100MB)
10ms                                 Upload layer X (100MB)
20ms     Check exists: NO            Check exists: NO
30ms     Quota impact: 100MB         Quota impact: 100MB
40ms     Update quota A: +100MB      Update quota B: +100MB
50ms     Generate presigned URL      Generate presigned URL
100ms    Upload to S3 completes      Upload to S3 (overwrites A's)
```

**Result:** Both users charged 100MB, but only 100MB stored in S3. That outcome is correct under the claim model (each user pays for their own claim), but the same read-modify-write pattern is dangerous when two concurrent uploads touch the *same* user's quota file: one writer can silently overwrite the other's update.

**Mitigation strategies:**

1. **Accept eventual consistency** (recommended for S3-based)
   - Run periodic reconciliation to fix discrepancies
   - Small inconsistency window (minutes) is acceptable
   - Reconciliation uses PDS as source of truth

2. **Optimistic locking** (S3 ETags; a fuller retry sketch follows this list)
   ```go
   // Use S3 ETags for conditional writes
   oldETag := getQuotaFileETag(did)
   err := putQuotaFileWithCondition(quota, oldETag)
   if err == PreconditionFailed {
       // Retry with fresh read
   }
   ```

3. **Database transactions** (SQLite-based)
   ```sql
   -- SQLite has no SELECT ... FOR UPDATE; BEGIN IMMEDIATE takes the write
   -- lock for the whole transaction instead
   BEGIN IMMEDIATE;
   SELECT quota_used, quota_limit FROM user_quotas WHERE did = ?;
   UPDATE user_quotas SET quota_used = quota_used + ? WHERE did = ?;
   COMMIT;
   ```
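A fuller sketch of the optimistic-locking option. `getQuotaWithETag`, `putQuotaIfMatch`, and the two error values are hypothetical helpers wrapping GET/PUT of the quota file with an ETag precondition; they are not part of the ATCR codebase:

```go
package quota

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors for the sketch below.
var (
	ErrQuotaExceeded      = errors.New("quota exceeded")
	ErrPreconditionFailed = errors.New("precondition failed")
)

// ClaimWithRetry performs the read-modify-write loop for the S3-backed quota
// file, retrying until a conditional PUT (If-Match on the ETag) succeeds.
func (m *Manager) ClaimWithRetry(ctx context.Context, did, digest string, size int64) error {
	const maxAttempts = 5
	for attempt := 0; attempt < maxAttempts; attempt++ {
		quota, etag, err := m.getQuotaWithETag(ctx, did) // hypothetical helper
		if err != nil {
			return err
		}

		if _, ok := quota.ClaimedLayers[digest]; ok {
			return nil // already claimed: nothing to write
		}
		if quota.Used+size > quota.Limit {
			return ErrQuotaExceeded
		}
		quota.ClaimedLayers[digest] = size
		quota.Used += size
		quota.LastUpdated = time.Now()

		err = m.putQuotaIfMatch(ctx, quota, etag) // hypothetical helper
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrPreconditionFailed) {
			return err
		}
		// Another writer updated the file between our GET and PUT:
		// loop and retry with a fresh read.
	}
	return fmt.Errorf("quota update for %s: too many conflicting writers", did)
}
```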
## Delete Flow

### Manifest Deletion via AppView UI

When a user deletes a manifest through the AppView web interface:

```
┌──────────┐      ┌──────────┐      ┌──────────┐      ┌──────────┐
│   User   │      │ AppView  │      │   Hold   │      │   PDS    │
│    UI    │      │ Database │      │ Service  │      │          │
└──────────┘      └──────────┘      └──────────┘      └──────────┘
     │                  │                  │                  │
     │ DELETE manifest  │                  │                  │
     ├─────────────────>│                  │                  │
     │                  │                  │                  │
     │                  │ 1. Get manifest  │                  │
     │                  │    and layers    │                  │
     │                  │                  │                  │
     │                  │ 2. Check which   │                  │
     │                  │    layers still  │                  │
     │                  │    referenced by │                  │
     │                  │    user's other  │                  │
     │                  │    manifests     │                  │
     │                  │                  │                  │
     │                  │ 3. DELETE manifest from PDS         │
     │                  ├──────────────────┼─────────────────>│
     │                  │                  │                  │
     │                  │ 4. POST /quota/decrement            │
     │                  │    {layers: [...]}                  │
     │                  ├─────────────────>│                  │
     │                  │                  │                  │
     │                  │                  │ 5. Update quota  │
     │                  │                  │    Remove the    │
     │                  │                  │    unreferenced  │
     │                  │                  │    layers        │
     │                  │                  │                  │
     │                  │ 6. 200 OK        │                  │
     │                  │<─────────────────┤                  │
     │                  │                  │                  │
     │                  │ 7. Delete from DB│                  │
     │                  │                  │                  │
     │ 8. Success       │                  │                  │
     │<─────────────────┤                  │                  │
     │                  │                  │                  │
```

### AppView Implementation

```go
// pkg/appview/handlers/manifest.go

func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    did := ctx.Value("auth.did").(string)
    repository := chi.URLParam(r, "repository")
    digest := chi.URLParam(r, "digest")

    // Step 1: Get manifest and its layers from database
    manifest, err := db.GetManifest(h.db, digest)
    if err != nil {
        http.Error(w, "manifest not found", 404)
        return
    }

    layers, err := db.GetLayersForManifest(h.db, manifest.ID)
    if err != nil {
        http.Error(w, "failed to get layers", 500)
        return
    }

    // Step 2: For each layer, check if user still references it
    // in other manifests
    layersToDecrement := []LayerInfo{}

    for _, layer := range layers {
        // Query: does this user have other manifests using this layer?
        stillReferenced, err := db.CheckLayerReferencedByUser(
            h.db, did, repository, layer.Digest, manifest.ID,
        )
        if err != nil {
            http.Error(w, "failed to check layer references", 500)
            return
        }

        if !stillReferenced {
            // This layer is no longer used by the user
            layersToDecrement = append(layersToDecrement, LayerInfo{
                Digest: layer.Digest,
                Size:   layer.Size,
            })
        }
    }

    // Step 3: Delete manifest from user's PDS
    // (accessToken comes from the user's session; manifestRKey from the
    // manifest record)
    atprotoClient := atproto.NewClient(manifest.PDSEndpoint, did, accessToken)
    err = atprotoClient.DeleteRecord(ctx, atproto.ManifestCollection, manifestRKey)
    if err != nil {
        http.Error(w, "failed to delete from PDS", 500)
        return
    }

    // Step 4: Notify hold service to decrement quota
    if len(layersToDecrement) > 0 {
        holdClient := &http.Client{}

        decrementReq := QuotaDecrementRequest{
            DID:    did,
            Layers: layersToDecrement,
        }

        body, _ := json.Marshal(decrementReq)
        resp, err := holdClient.Post(
            manifest.HoldEndpoint+"/quota/decrement",
            "application/json",
            bytes.NewReader(body),
        )
        if err != nil {
            log.Printf("Warning: failed to update quota on hold service: %v", err)
            // Continue anyway - reconciliation will fix it
        } else {
            resp.Body.Close()
            if resp.StatusCode != http.StatusOK {
                log.Printf("Warning: hold service returned %d for quota decrement",
                    resp.StatusCode)
                // Continue anyway - reconciliation will fix it
            }
        }
    }

    // Step 5: Delete from AppView database
    err = db.DeleteManifest(h.db, did, repository, digest)
    if err != nil {
        http.Error(w, "failed to delete from database", 500)
        return
    }

    w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service Decrement Endpoint

```go
// cmd/hold/main.go

type QuotaDecrementRequest struct {
    DID    string      `json:"did"`
    Layers []LayerInfo `json:"layers"`
}

type LayerInfo struct {
    Digest string `json:"digest"`
    Size   int64  `json:"size"`
}

func (s *HoldService) HandleQuotaDecrement(w http.ResponseWriter, r *http.Request) {
    var req QuotaDecrementRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid request", 400)
        return
    }

    // Read current quota
    quota, err := s.quotaManager.GetQuota(req.DID)
    if err != nil {
        http.Error(w, "quota not found", 404)
        return
    }

    // Decrement quota for each layer
    for _, layer := range req.Layers {
        if size, claimed := quota.ClaimedLayers[layer.Digest]; claimed {
            // Remove from claimed layers
            delete(quota.ClaimedLayers, layer.Digest)
            quota.Used -= size

            log.Printf("Decremented quota for %s: layer %s (%d bytes)",
                req.DID, layer.Digest, size)
        } else {
            log.Printf("Warning: layer %s not in claimed_layers for %s",
                layer.Digest, req.DID)
        }
    }

    // Ensure quota.Used doesn't go negative (defensive)
    if quota.Used < 0 {
        log.Printf("Warning: quota.Used went negative for %s, resetting to 0", req.DID)
        quota.Used = 0
    }

    // Save updated quota
    quota.LastUpdated = time.Now()
    if err := s.quotaManager.SaveQuota(quota); err != nil {
        http.Error(w, "failed to save quota", 500)
        return
    }

    // Return updated quota info
    json.NewEncoder(w).Encode(map[string]any{
        "used":  quota.Used,
        "limit": quota.Limit,
    })
}
```

### SQL Query: Check Layer References

```sql
-- pkg/appview/db/queries.go

-- Check if user still references this layer in other manifests
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?      -- User's DID
  AND l.digest = ?   -- Layer digest
  AND m.id != ?      -- Exclude the manifest being deleted
```
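`db.CheckLayerReferencedByUser`, called in the handler above, is essentially a wrapper over this query. A plausible sketch (the signature is assumed to match the call site, not taken from the ATCR source; `repository` is accepted but unused because the check spans all of the user's repositories, per the SQL above):

```go
package db

import "database/sql"

// CheckLayerReferencedByUser reports whether any of the user's manifests,
// other than the one being deleted, still reference the given layer digest.
func CheckLayerReferencedByUser(db *sql.DB, did, repository, digest string, excludeManifestID int64) (bool, error) {
	var count int
	err := db.QueryRow(`
		SELECT COUNT(*)
		FROM layers l
		JOIN manifests m ON l.manifest_id = m.id
		WHERE m.did = ?
		  AND l.digest = ?
		  AND m.id != ?`,
		did, digest, excludeManifestID,
	).Scan(&count)
	return count > 0, err
}
```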
## Garbage Collection

### Background: Orphaned Blobs

Orphaned blobs accumulate when:
1. Manifest push fails after blobs uploaded (presigned URLs bypass hold)
2. Quota exceeded - manifest rejected, blobs already in S3
3. User deletes manifest - blobs no longer referenced

**GC periodically cleans these up.**

### GC Cron Implementation

Similar to AppView's backfill worker, the hold service can run periodic GC:

```go
// cmd/hold/gc/gc.go

type GarbageCollector struct {
    driver       storagedriver.StorageDriver
    appviewURL   string
    holdURL      string
    quotaManager *quota.Manager
}

// Run garbage collection
func (gc *GarbageCollector) Run(ctx context.Context) error {
    log.Println("Starting garbage collection...")

    // Step 1: Get list of referenced blobs from AppView
    referenced, err := gc.getReferencedBlobs()
    if err != nil {
        return fmt.Errorf("failed to get referenced blobs: %w", err)
    }

    referencedSet := make(map[string]bool)
    for _, digest := range referenced {
        referencedSet[digest] = true
    }

    log.Printf("AppView reports %d referenced blobs", len(referenced))

    // Step 2: Walk S3 blobs
    deletedCount := 0
    reclaimedBytes := int64(0)

    err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fileInfo storagedriver.FileInfo) error {
        if fileInfo.IsDir() {
            return nil // Skip directories
        }

        // Extract digest from path
        // Path: /docker/registry/v2/blobs/sha256/ab/abc123.../data
        digest := extractDigestFromPath(fileInfo.Path())
        if digest == "" {
            return nil // Not a blob data file - skip rather than delete
        }

        if !referencedSet[digest] {
            // Unreferenced blob - delete it
            size := fileInfo.Size()

            if err := gc.driver.Delete(ctx, fileInfo.Path()); err != nil {
                log.Printf("Failed to delete blob %s: %v", digest, err)
                return nil // Continue anyway
            }

            deletedCount++
            reclaimedBytes += size

            log.Printf("GC: Deleted unreferenced blob %s (%d bytes)", digest, size)
        }

        return nil
    })

    if err != nil {
        return fmt.Errorf("failed to walk blobs: %w", err)
    }

    log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes",
        deletedCount, reclaimedBytes)

    return nil
}

// Get referenced blobs from AppView
func (gc *GarbageCollector) getReferencedBlobs() ([]string, error) {
    // Query AppView for all blobs referenced by manifests
    // stored in THIS hold service
    endpoint := fmt.Sprintf("%s/internal/blobs/referenced?hold=%s",
        gc.appviewURL, url.QueryEscape(gc.holdURL))

    resp, err := http.Get(endpoint)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var result struct {
        Blobs []string `json:"blobs"`
    }

    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return nil, err
    }

    return result.Blobs, nil
}
```
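The walk callback relies on `extractDigestFromPath`, which is not shown above. A minimal sketch, assuming blob paths always follow the registry layout `/docker/registry/v2/blobs/<alg>/<prefix>/<hex>/data`:

```go
package gc

import "strings"

// extractDigestFromPath recovers "<alg>:<hex>" from a registry blob path of
// the form /docker/registry/v2/blobs/<alg>/<prefix>/<hex>/data.
// It returns "" for any path that does not match that layout.
func extractDigestFromPath(path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	// Expected: ["docker","registry","v2","blobs","sha256","ab","abc123...","data"]
	if len(parts) != 8 || parts[3] != "blobs" || parts[7] != "data" {
		return ""
	}
	return parts[4] + ":" + parts[6]
}
```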
### AppView Internal API

```go
// pkg/appview/handlers/internal.go

// Get all referenced blobs for a specific hold
func (h *InternalHandler) GetReferencedBlobs(w http.ResponseWriter, r *http.Request) {
    holdEndpoint := r.URL.Query().Get("hold")
    if holdEndpoint == "" {
        http.Error(w, "missing hold parameter", 400)
        return
    }

    // Query database for all layers in manifests stored in this hold
    query := `
        SELECT DISTINCT l.digest
        FROM layers l
        JOIN manifests m ON l.manifest_id = m.id
        WHERE m.hold_endpoint = ?
    `

    rows, err := h.db.Query(query, holdEndpoint)
    if err != nil {
        http.Error(w, "database error", 500)
        return
    }
    defer rows.Close()

    blobs := []string{}
    for rows.Next() {
        var digest string
        if err := rows.Scan(&digest); err != nil {
            continue
        }
        blobs = append(blobs, digest)
    }

    json.NewEncoder(w).Encode(map[string]any{
        "blobs": blobs,
        "count": len(blobs),
        "hold":  holdEndpoint,
    })
}
```

### GC Cron Schedule

```go
// cmd/hold/main.go

func main() {
    // ... service setup ...

    // Start GC cron if enabled
    if os.Getenv("GC_ENABLED") == "true" {
        gcInterval := 24 * time.Hour // Daily by default

        go func() {
            ticker := time.NewTicker(gcInterval)
            defer ticker.Stop()

            for range ticker.C {
                if err := garbageCollector.Run(context.Background()); err != nil {
                    log.Printf("GC error: %v", err)
                }
            }
        }()

        log.Printf("GC cron started: runs every %v", gcInterval)
    }

    // Start server...
}
```

## Quota Reconciliation

### PDS as Source of Truth

**Key insight:** Manifest records in PDS are publicly readable (no OAuth needed for reads).

Each manifest contains:
- Repository name
- Digest
- Layers array with digest + size
- Hold endpoint

The hold service can query the PDS to calculate the user's true quota:

```
1. List all io.atcr.manifest records for user
2. Filter manifests where holdEndpoint == this hold service
3. Extract unique layers (deduplicate by digest)
4. Sum layer sizes = true quota usage
5. Compare to quota file
6. Fix discrepancies
```
### Implementation

```go
// cmd/hold/quota/reconcile.go

type Reconciler struct {
    quotaManager    *Manager
    atprotoResolver *atproto.Resolver
    holdURL         string
}

// ReconcileUser recalculates quota from PDS manifests
func (r *Reconciler) ReconcileUser(ctx context.Context, did string) error {
    log.Printf("Reconciling quota for %s", did)

    // Step 1: Resolve user's PDS endpoint
    identity, err := r.atprotoResolver.ResolveIdentity(ctx, did)
    if err != nil {
        return fmt.Errorf("failed to resolve DID: %w", err)
    }

    // Step 2: Create unauthenticated ATProto client
    // (manifest records are public - no OAuth needed)
    client := atproto.NewClient(identity.PDSEndpoint, did, "")

    // Step 3: List all manifest records for this user
    manifests, err := client.ListRecords(ctx, atproto.ManifestCollection, 1000)
    if err != nil {
        return fmt.Errorf("failed to list manifests: %w", err)
    }

    // Step 4: Filter manifests stored in THIS hold service
    // and extract unique layers
    uniqueLayers := make(map[string]int64) // digest -> size

    for _, record := range manifests {
        var manifest atproto.ManifestRecord
        if err := json.Unmarshal(record.Value, &manifest); err != nil {
            log.Printf("Warning: failed to parse manifest: %v", err)
            continue
        }

        // Only count manifests stored in this hold
        if manifest.HoldEndpoint != r.holdURL {
            continue
        }

        // Add config blob
        if manifest.Config.Digest != "" {
            uniqueLayers[manifest.Config.Digest] = manifest.Config.Size
        }

        // Add layer blobs
        for _, layer := range manifest.Layers {
            uniqueLayers[layer.Digest] = layer.Size
        }
    }

    // Step 5: Calculate true quota usage
    trueUsage := int64(0)
    for _, size := range uniqueLayers {
        trueUsage += size
    }

    log.Printf("User %s true usage from PDS: %d bytes (%d unique layers)",
        did, trueUsage, len(uniqueLayers))

    // Step 6: Compare with current quota file
    quota, err := r.quotaManager.GetQuota(did)
    if err != nil {
        log.Printf("No existing quota for %s, creating new", did)
        quota = &Quota{
            DID:           did,
            Limit:         r.quotaManager.DefaultLimit,
            ClaimedLayers: make(map[string]int64),
        }
    }

    // Step 7: Fix discrepancies
    if quota.Used != trueUsage || len(quota.ClaimedLayers) != len(uniqueLayers) {
        log.Printf("Quota mismatch for %s: recorded=%d, actual=%d (diff=%d)",
            did, quota.Used, trueUsage, trueUsage-quota.Used)

        // Update quota to match PDS truth
        quota.Used = trueUsage
        quota.ClaimedLayers = uniqueLayers
        quota.LastUpdated = time.Now()

        if err := r.quotaManager.SaveQuota(quota); err != nil {
            return fmt.Errorf("failed to save reconciled quota: %w", err)
        }

        log.Printf("Reconciled quota for %s: %d bytes", did, trueUsage)
    } else {
        log.Printf("Quota for %s is accurate", did)
    }

    return nil
}

// ReconcileAll reconciles all users (run periodically)
func (r *Reconciler) ReconcileAll(ctx context.Context) error {
    // Get list of all users with quota files
    users, err := r.quotaManager.ListUsers()
    if err != nil {
        return err
    }

    log.Printf("Starting reconciliation for %d users", len(users))

    for _, did := range users {
        if err := r.ReconcileUser(ctx, did); err != nil {
            log.Printf("Failed to reconcile %s: %v", did, err)
            // Continue with other users
        }
    }

    log.Println("Reconciliation complete")
    return nil
}
```
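`ListRecords` is called above with a 1000-record page size, which silently truncates reconciliation for users with more manifests (see the trade-offs section below). A cursor loop fixes that; this sketch assumes a hypothetical paged variant `ListRecordsPage` returning records plus a continuation cursor, with `atproto.Record` as the element type used in the loop above:

```go
// listAllManifests pages through the user's manifest records until the
// continuation cursor is exhausted. ListRecordsPage is a hypothetical paged
// variant of the ListRecords call used above.
func listAllManifests(ctx context.Context, client *atproto.Client) ([]atproto.Record, error) {
	var (
		all    []atproto.Record
		cursor string
	)
	for {
		records, next, err := client.ListRecordsPage(
			ctx, atproto.ManifestCollection, 100, cursor)
		if err != nil {
			return nil, err
		}
		all = append(all, records...)
		if next == "" {
			return all, nil // no more pages
		}
		cursor = next
	}
}
```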
### Reconciliation Cron

```go
// cmd/hold/main.go

func main() {
    // ... setup ...

    // Start reconciliation cron
    if os.Getenv("QUOTA_RECONCILE_ENABLED") == "true" {
        reconcileInterval := 24 * time.Hour // Daily

        go func() {
            ticker := time.NewTicker(reconcileInterval)
            defer ticker.Stop()

            for range ticker.C {
                if err := reconciler.ReconcileAll(context.Background()); err != nil {
                    log.Printf("Reconciliation error: %v", err)
                }
            }
        }()

        log.Printf("Quota reconciliation cron started: runs every %v", reconcileInterval)
    }

    // ... start server ...
}
```

### Why PDS as Source of Truth Works

1. **Manifests are canonical** - If manifest exists in PDS, user owns those layers
2. **Public reads** - No OAuth needed, just resolve DID → PDS endpoint
3. **ATProto durability** - PDS is user's authoritative data store
4. **AppView is cache** - AppView database might lag or have inconsistencies
5. **Reconciliation fixes drift** - Periodic sync from PDS ensures accuracy

**Example reconciliation scenarios:**

- **Orphaned quota entries:** User deleted manifest from PDS, but hold quota still has it
  → Reconciliation removes from claimed_layers

- **Missing quota entries:** User pushed manifest, but quota update failed
  → Reconciliation adds to claimed_layers

- **Race condition duplicates:** Two concurrent pushes double-counted a layer
  → Reconciliation fixes to actual usage

## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# ============================================================================
# Quota Configuration
# ============================================================================

# Enable quota enforcement
QUOTA_ENABLED=true

# Default quota limit per user (bytes)
# 10GB  = 10737418240
# 50GB  = 53687091200
# 100GB = 107374182400
QUOTA_DEFAULT_LIMIT=10737418240

# Storage backend for quota data
# Options: s3, sqlite
QUOTA_STORAGE_BACKEND=s3

# For S3-based storage:
# Quota files stored in same bucket as blobs
QUOTA_STORAGE_PREFIX=/atcr/quota/

# For SQLite-based storage:
QUOTA_DB_PATH=/var/lib/atcr/hold-quota.db

# ============================================================================
# Garbage Collection
# ============================================================================

# Enable periodic garbage collection
GC_ENABLED=true

# GC interval (default: 24h)
GC_INTERVAL=24h

# AppView URL for GC reference checking
APPVIEW_URL=https://atcr.io

# ============================================================================
# Quota Reconciliation
# ============================================================================

# Enable quota reconciliation from PDS
QUOTA_RECONCILE_ENABLED=true

# Reconciliation interval (default: 24h)
QUOTA_RECONCILE_INTERVAL=24h

# ============================================================================
# Hold Service Identity (Required)
# ============================================================================

# Public URL of this hold service
HOLD_PUBLIC_URL=https://hold1.example.com

# Owner DID (for auto-registration)
HOLD_OWNER=did:plc:xyz123
```
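How these variables might be loaded at startup. A hedged sketch; ATCR's actual config loader may differ, and `QuotaConfig`/`loadQuotaConfig` are illustrative names:

```go
package main

import (
	"os"
	"strconv"
)

// QuotaConfig holds the quota settings documented above.
type QuotaConfig struct {
	Enabled      bool
	DefaultLimit int64  // bytes
	Backend      string // "s3" or "sqlite"
}

// loadQuotaConfig reads the quota-related environment variables, applying
// the same defaults the comments above describe.
func loadQuotaConfig() QuotaConfig {
	cfg := QuotaConfig{
		Enabled:      os.Getenv("QUOTA_ENABLED") == "true",
		DefaultLimit: 10 * 1024 * 1024 * 1024, // 10GB default
		Backend:      "s3",
	}
	if v := os.Getenv("QUOTA_DEFAULT_LIMIT"); v != "" {
		if n, err := strconv.ParseInt(v, 10, 64); err == nil {
			cfg.DefaultLimit = n
		}
	}
	if v := os.Getenv("QUOTA_STORAGE_BACKEND"); v != "" {
		cfg.Backend = v
	}
	return cfg
}
```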
### AppView Configuration

```bash
# .env.appview

# Internal API endpoint for hold services
# Used for GC reference checking
ATCR_INTERNAL_API_ENABLED=true

# Optional: authentication token for internal APIs
ATCR_INTERNAL_API_TOKEN=secret123
```

## Trade-offs & Design Decisions

### 1. Claimed Storage vs Physical Storage

**Decision:** Track claimed storage (logical accounting)

**Why:**
- Predictable for users: "you pay for what you upload"
- No complex cross-user dependencies
- Delete always gives you quota back
- Matches Harbor's proven model

**Trade-off:**
- Total claimed can exceed physical storage
- Users might complain "I uploaded 10GB but S3 only has 6GB"

**Mitigation:**
- Show deduplication savings metric
- Educate users: "You claimed 10GB, but deduplication saved 4GB"

### 2. S3 vs SQLite for Quota Storage

**Decision:** Support both, recommend based on use case

**S3 Pros:**
- No database to manage
- Quota data lives with blobs
- Better for ephemeral BYOS

**SQLite Pros:**
- Faster (no network)
- ACID transactions (no race conditions)
- Better for high-traffic shared holds

**Trade-off:**
- S3: eventual consistency, race conditions
- SQLite: stateful service, scaling challenges

**Mitigation:**
- Reconciliation fixes S3 inconsistencies
- SQLite can use shared DB for multi-instance

### 3. Optimistic Quota Update

**Decision:** Update quota BEFORE upload completes

**Why:**
- Prevent race conditions (two users uploading simultaneously)
- Can reject before presigned URL generated
- Simpler flow

**Trade-off:**
- If upload fails, quota already incremented (user "paid" for nothing)

**Mitigation:**
- Reconciliation from PDS fixes orphaned quota entries
- Acceptable for MVP (upload failures are rare)

### 4. AppView as Intermediary

**Decision:** AppView notifies hold service on deletes

**Why:**
- AppView already has manifest/layer database
- Can efficiently check if layer still referenced
- Hold service doesn't need to query PDS on every delete

**Trade-off:**
- AppView → Hold dependency
- Network hop on delete

**Mitigation:**
- If notification fails, reconciliation fixes quota
- Eventually consistent is acceptable

### 5. PDS as Source of Truth

**Decision:** Use PDS manifests for reconciliation

**Why:**
- Manifests in PDS are canonical user data
- Public reads (no OAuth for reconciliation)
- AppView database might lag or be inconsistent

**Trade-off:**
- Reconciliation requires PDS queries (slower)
- Limited to 1000 manifests per query

**Mitigation:**
- Run reconciliation daily (not real-time)
- Paginate if user has >1000 manifests
## Future Enhancements

### 1. Quota API Endpoints

```
GET  /quota/usage     - Get current user's quota
GET  /quota/breakdown - Get storage by repository
POST /quota/limit     - Update user's quota limit (admin)
GET  /quota/stats     - Get hold-wide statistics
```

### 2. Quota Alerts

Notify users when approaching limit:
- Email/webhook at 80%, 90%, 95%
- Reject uploads at 100% (currently implemented)
- Grace period: allow 105% temporarily

### 3. Tiered Quotas

Different limits based on user tier:
- Free: 10GB
- Pro: 100GB
- Enterprise: unlimited

### 4. Quota Purchasing

Allow users to buy additional storage:
- Stripe integration
- $0.10/GB/month pricing
- Dynamic limit updates

### 5. Cross-Hold Deduplication

If multiple holds share the same S3 bucket:
- Track blob ownership globally
- Split costs proportionally
- More complex, but maximizes deduplication

### 6. Manifest-Based Quota (Alternative Model)

Instead of tracking layers, track manifests:
- Simpler: just count manifest sizes
- No deduplication benefits for users
- Might be acceptable for some use cases

### 7. Redis-Based Quota (High Performance)

For high-traffic registries:
- Use Redis instead of S3/SQLite
- Sub-millisecond quota checks
- Harbor-proven approach

### 8. Quota Visualizations

Web UI showing:
- Storage usage over time
- Top consumers by repository
- Deduplication savings graph
- Layer size distribution

## Appendix: SQL Queries

### Check if User Still References Layer

```sql
-- After deleting manifest, check if user has other manifests using this layer
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?      -- User's DID
  AND l.digest = ?   -- Layer digest to check
  AND m.id != ?      -- Exclude the manifest being deleted
```

### Get All Unique Layers for User

```sql
-- Calculate true quota usage for a user
SELECT DISTINCT l.digest, l.size
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
```

### Get Referenced Blobs for Hold

```sql
-- For GC: get all blobs still referenced by any user of this hold
SELECT DISTINCT l.digest
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.hold_endpoint = ?
```

### Get Storage Stats by Repository

```sql
-- User's storage broken down by repository
SELECT
    m.repository,
    COUNT(DISTINCT m.id) AS manifest_count,
    COUNT(DISTINCT l.digest) AS unique_layers,
    SUM(l.size) AS total_size
FROM manifests m
JOIN layers l ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
GROUP BY m.repository
ORDER BY total_size DESC
```

## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **Harbor Source:** https://github.com/goharbor/harbor
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec
- **S3 API Reference:** https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Distribution GC:** https://github.com/distribution/distribution/blob/main/registry/storage/garbagecollect.go

---

**Document Version:** 1.0
**Last Updated:** 2025-10-09
**Author:** Generated from implementation research and Harbor analysis