# API Key Migration Plan

## Overview

Replace the session token system (used only by the credential helper) with API keys that link to OAuth sessions. This simplifies authentication while preserving all existing use cases.

## Current State

### Three Separate Auth Systems

1. **Session Tokens** (`pkg/auth/session/`)
   - JWT-like tokens: `<base64_claims>.<base64_signature>`
   - Created after the OAuth callback and shown to the user to copy
   - User manually pastes the token into the credential helper config
   - Validated in `/auth/token` and `/auth/exchange`
   - 30-day TTL
   - **Problem:** Awkward UX; requires manual copy/paste

2. **UI Sessions** (`pkg/appview/session/`)
   - Cookie-based (`atcr_session`)
   - Random session ID, server-side store
   - 24-hour TTL
   - **Keep this - works well**

3. **App Password Auth** (via PDS)
   - Direct `com.atproto.server.createSession` call
   - No AppView involvement until the token request
   - **Keep this - essential for non-UI users**

## Target State

### Two Auth Methods

1. **API Keys** (NEW - replaces session tokens)
   - Generated in the UI after OAuth login
   - Format: `atcr_<32_bytes_base64>`
   - Linked to a server-side OAuth refresh token
   - Multiple keys per user (laptop, CI/CD, etc.)
   - Revocable without re-auth

2. **App Passwords** (KEEP)
   - Direct PDS authentication
   - Works without UI/OAuth

### UI Sessions (UNCHANGED)

- Cookie-based for web UI
- Separate system, no changes needed

---

## Implementation Plan

### Phase 1: API Key System

#### 1.1 Create API Key Store (`pkg/appview/apikey/store.go`)

```go
package apikey

import (
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"os"
	"sync"
	"time"

	"golang.org/x/crypto/bcrypt"
)

// APIKey represents a user's API key
type APIKey struct {
	ID        string    `json:"id"`       // UUID
	KeyHash   string    `json:"key_hash"` // bcrypt hash
	DID       string    `json:"did"`      // Owner's DID
	Handle    string    `json:"handle"`   // Owner's handle
	Name      string    `json:"name"`     // User-provided name
	CreatedAt time.Time `json:"created_at"`
	LastUsed  time.Time `json:"last_used"`
}

// Store manages API keys
type Store struct {
	mu       sync.RWMutex
	keys     map[string]*APIKey  // keyHash -> APIKey
	byDID    map[string][]string // DID -> []keyHash
	filePath string              // /var/lib/atcr/api-keys.json
}

// NewStore creates a new API key store
func NewStore(filePath string) (*Store, error)

// Generate creates a new API key and returns the plaintext key (shown once)
func (s *Store) Generate(did, handle, name string) (key string, keyID string, err error)

// Validate checks if an API key is valid and returns the associated data
func (s *Store) Validate(key string) (*APIKey, error)

// List returns all API keys for a DID (without plaintext keys)
func (s *Store) List(did string) []*APIKey

// Delete removes an API key
func (s *Store) Delete(did, keyID string) error

// UpdateLastUsed updates the last used timestamp
func (s *Store) UpdateLastUsed(keyHash string) error
```

**Key Generation:**
```go
func (s *Store) Generate(did, handle, name string) (string, string, error) {
	// Generate 32 random bytes
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", "", err
	}

	// Format: atcr_<base64>
	key := "atcr_" + base64.RawURLEncoding.EncodeToString(b)

	// Hash for storage
	keyHash, err := bcrypt.GenerateFromPassword([]byte(key), bcrypt.DefaultCost)
	if err != nil {
		return "", "", err
	}
	hashStr := string(keyHash)

	// Generate ID
	keyID := generateUUID()

	apiKey := &APIKey{
		ID:        keyID,
		KeyHash:   hashStr,
		DID:       did,
		Handle:    handle,
		Name:      name,
		CreatedAt: time.Now(),
		LastUsed:  time.Time{}, // Never used yet
	}

	// Hold the lock across the map updates and the save so readers
	// never observe a partially written state.
	s.mu.Lock()
	s.keys[hashStr] = apiKey
	s.byDID[did] = append(s.byDID[did], hashStr)
	s.save()
	s.mu.Unlock()

	// Return plaintext key (only time it's available)
	return key, keyID, nil
}
```

**Key Validation:**
```go
func (s *Store) Validate(key string) (*APIKey, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()

	// Try to match against all stored hashes.
	// Note: this is O(n) bcrypt comparisons per lookup, which is fine
	// for small key counts but worth revisiting if stores grow large.
	for hash, apiKey := range s.keys {
		if err := bcrypt.CompareHashAndPassword([]byte(hash), []byte(key)); err == nil {
			// Update last used asynchronously
			go s.UpdateLastUsed(hash)
			return apiKey, nil
		}
	}

	return nil, fmt.Errorf("invalid API key")
}
```

#### 1.2 Add API Key Handlers (`pkg/appview/handlers/apikeys.go`)

```go
package handlers

import (
	"encoding/json"
	"net/http"

	"github.com/gorilla/mux"

	"atcr.io/pkg/appview/apikey"
	"atcr.io/pkg/appview/middleware"
)

// GenerateAPIKeyHandler handles POST /api/keys
type GenerateAPIKeyHandler struct {
	Store *apikey.Store
}

func (h *GenerateAPIKeyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	user := middleware.GetUser(r)
	if user == nil {
		http.Error(w, "Unauthorized", http.StatusUnauthorized)
		return
	}

	name := r.FormValue("name")
	if name == "" {
		name = "Unnamed Key"
	}

	key, keyID, err := h.Store.Generate(user.DID, user.Handle, name)
	if err != nil {
		http.Error(w, "Failed to generate key", http.StatusInternalServerError)
		return
	}

	// Return key (shown once!)
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{
		"id":  keyID,
		"key": key,
	})
}

// ListAPIKeysHandler handles GET /api/keys
type ListAPIKeysHandler struct {
	Store *apikey.Store
}

func (h *ListAPIKeysHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	user := middleware.GetUser(r)
	if user == nil {
		http.Error(w, "Unauthorized", http.StatusUnauthorized)
		return
	}

	keys := h.Store.List(user.DID)

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(keys)
}

// DeleteAPIKeyHandler handles DELETE /api/keys/{id}
type DeleteAPIKeyHandler struct {
	Store *apikey.Store
}

func (h *DeleteAPIKeyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	user := middleware.GetUser(r)
	if user == nil {
		http.Error(w, "Unauthorized", http.StatusUnauthorized)
		return
	}

	vars := mux.Vars(r)
	keyID := vars["id"]

	if err := h.Store.Delete(user.DID, keyID); err != nil {
		http.Error(w, "Failed to delete key", http.StatusInternalServerError)
		return
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Phase 2: Update Token Handler

#### 2.1 Modify `/auth/token` Handler (`pkg/auth/token/handler.go`)

```go
type Handler struct {
	issuer              *Issuer
	validator           *atproto.SessionValidator
	apiKeyStore         *apikey.Store // NEW
	defaultHoldEndpoint string
}

func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	username, password, ok := r.BasicAuth()
	if !ok {
		http.Error(w, "Unauthorized", http.StatusUnauthorized)
		return
	}

	var did, handle, accessToken string

	// 1. Check if it's an API key (NEW)
	if strings.HasPrefix(password, "atcr_") {
		apiKey, err := h.apiKeyStore.Validate(password)
		if err != nil {
			fmt.Printf("DEBUG [token/handler]: API key validation failed: %v\n", err)
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}

		did = apiKey.DID
		handle = apiKey.Handle
		fmt.Printf("DEBUG [token/handler]: API key validated for DID=%s, handle=%s\n", did, handle)

		// The API key is linked to an OAuth session; the OAuth refresher
		// provides an access token when needed via middleware.
	} else {
		// 2. Fall back to app password (direct PDS)
		var err error
		did, handle, accessToken, err = h.validator.CreateSessionAndGetToken(r.Context(), username, password)
		if err != nil {
			fmt.Printf("DEBUG [token/handler]: App password validation failed: %v\n", err)
			http.Error(w, "Unauthorized", http.StatusUnauthorized)
			return
		}

		fmt.Printf("DEBUG [token/handler]: App password validated, DID=%s\n", did)

		// Cache access token for manifest operations
		auth.GetGlobalTokenCache().Set(did, accessToken, 2*time.Hour)

		// Ensure profile exists
		// ... existing code ...
	}

	// Rest of handler: validate access, issue JWT, etc.
	// ... existing code ...
}
```

**Key Changes:**

- Remove session token validation (`sessionManager.Validate()`)
- Add the API key check as the first priority
- Keep app passwords as the fallback
- API keys use the OAuth refresher (server-side); app passwords use the token cache (client-side)

#### 2.2 Remove `/auth/exchange` Endpoint

The `/auth/exchange` endpoint was only used to exchange session tokens for registry JWTs. With API keys, it is no longer needed.

**Files to delete:**
- `pkg/auth/exchange/handler.go`

**Files to update:**
- `cmd/appview/serve.go` - Remove exchange handler registration

### Phase 3: Update UI

#### 3.1 Add API Keys Section to Settings Page

**Template** (`pkg/appview/templates/settings.html`):

```html
<!-- Add after existing profile settings -->
<section class="api-keys">
  <h2>API Keys</h2>
  <p>Generate API keys for Docker CLI and CI/CD. Each key is linked to your OAuth session.</p>

  <!-- Generate New Key -->
  <div class="generate-key">
    <h3>Generate New API Key</h3>
    <form id="generate-key-form">
      <input type="text" id="key-name" placeholder="Key name (e.g., My Laptop)" required>
      <button type="submit">Generate Key</button>
    </form>
  </div>

  <!-- Key Generated Modal (shown once) -->
  <div id="key-modal" class="modal hidden">
    <div class="modal-content">
      <h3>✓ API Key Generated!</h3>
      <p><strong>Copy this key now - it won't be shown again:</strong></p>
      <div class="key-display">
        <code id="generated-key"></code>
        <button onclick="copyKey()">Copy to Clipboard</button>
      </div>
      <div class="usage-instructions">
        <h4>Using with Docker:</h4>
        <pre>docker login atcr.io -u <span class="handle">{{.Profile.Handle}}</span> -p <span class="key-placeholder">[paste key here]</span></pre>
      </div>
      <button onclick="closeModal()">Done</button>
    </div>
  </div>

  <!-- Existing Keys List -->
  <div class="keys-list">
    <h3>Your API Keys</h3>
    <table>
      <thead>
        <tr>
          <th>Name</th>
          <th>Created</th>
          <th>Last Used</th>
          <th>Actions</th>
        </tr>
      </thead>
      <tbody id="keys-table">
        <!-- Populated via JavaScript -->
      </tbody>
    </table>
  </div>
</section>

<script>
// Generate key
document.getElementById('generate-key-form').addEventListener('submit', async (e) => {
  e.preventDefault();
  const name = document.getElementById('key-name').value;

  const resp = await fetch('/api/keys', {
    method: 'POST',
    headers: {'Content-Type': 'application/x-www-form-urlencoded'},
    body: `name=${encodeURIComponent(name)}`
  });

  const data = await resp.json();

  // Show key in modal (only time it's available)
  document.getElementById('generated-key').textContent = data.key;
  document.getElementById('key-modal').classList.remove('hidden');

  // Refresh keys list
  loadKeys();
});

// Copy key to clipboard
function copyKey() {
  const key = document.getElementById('generated-key').textContent;
  navigator.clipboard.writeText(key);
  alert('Copied to clipboard!');
}

// Close the key modal
function closeModal() {
  document.getElementById('key-modal').classList.add('hidden');
}

// Load existing keys
async function loadKeys() {
  const resp = await fetch('/api/keys');
  const keys = await resp.json();

  const tbody = document.getElementById('keys-table');
  tbody.innerHTML = keys.map(key => `
    <tr>
      <td>${key.name}</td>
      <td>${new Date(key.created_at).toLocaleDateString()}</td>
      <td>${key.last_used ? new Date(key.last_used).toLocaleDateString() : 'Never'}</td>
      <td><button onclick="deleteKey('${key.id}')">Revoke</button></td>
    </tr>
  `).join('');
}

// Delete key
async function deleteKey(id) {
  if (!confirm('Are you sure you want to revoke this key?')) return;

  await fetch(`/api/keys/${id}`, { method: 'DELETE' });
  loadKeys();
}

// Load keys on page load
loadKeys();
</script>

<style>
.modal.hidden { display: none; }
.modal {
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  background: rgba(0,0,0,0.5);
  display: flex;
  align-items: center;
  justify-content: center;
}
.modal-content {
  background: white;
  padding: 2rem;
  border-radius: 8px;
  max-width: 600px;
}
.key-display {
  background: #f5f5f5;
  padding: 1rem;
  margin: 1rem 0;
  border-radius: 4px;
}
.key-display code {
  word-break: break-all;
  font-size: 14px;
}
.usage-instructions {
  margin-top: 1rem;
  padding: 1rem;
  background: #e3f2fd;
  border-radius: 4px;
}
.usage-instructions pre {
  background: #263238;
  color: #aed581;
  padding: 1rem;
  border-radius: 4px;
  overflow-x: auto;
}
.handle { color: #ffab40; }
.key-placeholder { color: #64b5f6; }
</style>
```

#### 3.2 Register API Key Routes (`cmd/appview/serve.go`)

```go
// In initializeUI() function, add:

// API key management routes (authenticated)
authRouter.Handle("/api/keys", &uihandlers.GenerateAPIKeyHandler{
	Store: apiKeyStore,
}).Methods("POST")

authRouter.Handle("/api/keys", &uihandlers.ListAPIKeysHandler{
	Store: apiKeyStore,
}).Methods("GET")

authRouter.Handle("/api/keys/{id}", &uihandlers.DeleteAPIKeyHandler{
	Store: apiKeyStore,
}).Methods("DELETE")
```

### Phase 4: Update Credential Helper

#### 4.1 Simplify Configuration (`cmd/credential-helper/main.go`)

```go
// SessionStore becomes CredentialStore
type CredentialStore struct {
	Handle     string `json:"handle"`
	APIKey     string `json:"api_key"`
	AppViewURL string `json:"appview_url"`
}

func handleConfigure(handle string) {
	fmt.Println("ATCR Credential Helper Configuration")
	fmt.Println("=====================================")
	fmt.Println()
	fmt.Println("You need an API key from the ATCR web UI.")
	fmt.Println()

	appViewURL := os.Getenv("ATCR_APPVIEW_URL")
	if appViewURL == "" {
		appViewURL = defaultAppViewURL
	}

	// Auto-open settings page
	settingsURL := appViewURL + "/settings"
	fmt.Printf("Opening settings page: %s\n", settingsURL)
	fmt.Println("Log in and generate an API key if you haven't already.")
	fmt.Println()

	if err := oauth.OpenBrowser(settingsURL); err != nil {
		fmt.Printf("Could not open browser. Please visit: %s\n\n", settingsURL)
	}

	// Prompt for credentials
	if handle == "" {
		fmt.Print("Enter your ATProto handle (e.g., alice.bsky.social): ")
		fmt.Scanln(&handle)
	} else {
		fmt.Printf("Using handle: %s\n", handle)
	}

	fmt.Print("Enter your API key (from settings page): ")
	var apiKey string
	fmt.Scanln(&apiKey)

	// Validate key format
	if !strings.HasPrefix(apiKey, "atcr_") {
		fmt.Fprintf(os.Stderr, "Invalid API key format. Key should start with 'atcr_'\n")
		os.Exit(1)
	}

	// Save credentials
	creds := &CredentialStore{
		Handle:     handle,
		APIKey:     apiKey,
		AppViewURL: appViewURL,
	}

	if err := saveCredentials(getCredentialsPath(), creds); err != nil {
		fmt.Fprintf(os.Stderr, "Error saving credentials: %v\n", err)
		os.Exit(1)
	}

	fmt.Println()
	fmt.Println("✓ Configuration complete!")
	fmt.Println("You can now use docker push/pull with atcr.io")
}

func handleGet() {
	var serverURL string
	fmt.Fscanln(os.Stdin, &serverURL)

	// Load credentials
	creds, err := loadCredentials(getCredentialsPath())
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error loading credentials: %v\n", err)
		fmt.Fprintf(os.Stderr, "Please run: docker-credential-atcr configure\n")
		os.Exit(1)
	}

	// Return credentials for Docker
	// Docker will send these as Basic Auth to /auth/token
	response := Credentials{
		ServerURL: serverURL,
		Username:  creds.Handle,
		Secret:    creds.APIKey, // API key as password
	}

	json.NewEncoder(os.Stdout).Encode(response)
}
```

**File Rename:**
- `~/.atcr/session.json` → `~/.atcr/credentials.json`

### Phase 5: Remove Session Token System

#### 5.1 Delete Session Token Files

**Files to delete:**
- `pkg/auth/session/handler.go`
- `pkg/auth/exchange/handler.go`

#### 5.2 Update OAuth Server (`pkg/auth/oauth/server.go`)

**Remove session token creation:**
```go
// OLD (delete this):
sessionToken, err := s.sessionManager.Create(did, handle)
if err != nil {
	s.renderError(w, fmt.Sprintf("Failed to create session token: %v", err))
	return
}

// Check if this is a UI login...
if cookie, err := r.Cookie("oauth_return_to"); err == nil && s.uiSessionStore != nil {
	// UI flow...
} else {
	// Render success page with session token (for credential helper)
	s.renderSuccess(w, sessionToken, handle)
}
```

**NEW (replace with):**
```go
// Check if this is a UI login
if cookie, err := r.Cookie("oauth_return_to"); err == nil && s.uiSessionStore != nil {
	// Create UI session
	uiSessionID, err := s.uiSessionStore.Create(did, handle, sessionData.HostURL, 24*time.Hour)
	// ... set cookie, redirect ...
} else {
	// Non-UI flow: redirect to settings to get an API key
	s.renderRedirectToSettings(w, handle)
}
```

**Add redirect to settings template:**
```go
func (s *Server) renderRedirectToSettings(w http.ResponseWriter, handle string) {
	tmpl := template.Must(template.New("redirect").Parse(`
<!DOCTYPE html>
<html>
<head>
  <title>Authorization Successful - ATCR</title>
  <meta http-equiv="refresh" content="3;url=/settings">
</head>
<body>
  <h1>✓ Authorization Successful!</h1>
  <p>Redirecting to settings page to generate your API key...</p>
  <p>If not redirected, <a href="/settings">click here</a>.</p>
</body>
</html>
	`))
	w.Header().Set("Content-Type", "text/html")
	tmpl.Execute(w, nil)
}
```

#### 5.3 Update Server Constructor

```go
// Remove sessionManager parameter
func NewServer(app *App) *Server {
	return &Server{
		app: app,
	}
}
```

#### 5.4 Update Registry Initialization (`cmd/appview/serve.go`)

```go
// REMOVE session manager creation:
// sessionManager, err := session.NewManagerWithPersistentSecret(secretPath, 30*24*time.Hour)

// Create API key store
apiKeyStorePath := filepath.Join(filepath.Dir(storagePath), "api-keys.json")
apiKeyStore, err := apikey.NewStore(apiKeyStorePath)
if err != nil {
	return fmt.Errorf("failed to create API key store: %w", err)
}

// OAuth server doesn't need the session manager anymore
oauthServer := oauth.NewServer(oauthApp)
oauthServer.SetRefresher(refresher)
if uiSessionStore != nil {
	oauthServer.SetUISessionStore(uiSessionStore)
}

// Token handler gets the API key store instead of the session manager
if issuer != nil {
	tokenHandler := token.NewHandler(issuer, apiKeyStore, defaultHoldEndpoint)
	tokenHandler.RegisterRoutes(mux)

	// Remove exchange handler registration (no longer needed)
}
```

---

## Migration Path

### For Existing Users

**Option 1: Smooth Migration (Recommended)**
1. Keep session token validation temporarily with a deprecation warning
2. When a session token is used, log a warning and return a special response header
3. Docker client shows warning: "Session tokens deprecated, please regenerate API key"
4. Remove session token support in the next major version

**Option 2: Hard Cutover**
1. Deploy the new version with API keys
2. Session tokens stop working immediately
3. Users must reconfigure: `docker-credential-atcr configure`
4. Cleaner but disruptive

### Rollout Plan

**Week 1: Deploy API Keys**
- Add API key system
- Keep session token validation
- Add deprecation notice to OAuth callback

**Week 2-4: Migration Period**
- Monitor API key adoption
- Email users about migration
- Provide migration guide

**Week 5: Remove Session Tokens**
- Delete session token code
- Force users to API keys

---

## Testing Plan

### Unit Tests

1. **API Key Store**
   - Test key generation (format, uniqueness)
   - Test key validation (correct/incorrect keys)
   - Test bcrypt hashing
   - Test key listing/deletion

2. **Token Handler**
   - Test API key authentication
   - Test app password authentication
   - Test invalid credentials
   - Test key format validation

### Integration Tests

1. **Full Auth Flow**
   - UI login → OAuth → API key generation
   - Credential helper → API key → registry JWT
   - App password → registry JWT

2. **Docker Client Tests**
   - `docker login -u handle -p api_key`
   - `docker login -u handle -p app_password`
   - `docker push` with API key
   - `docker pull` with API key

### Security Tests

1. **Key Security**
   - Verify bcrypt hashing (not plaintext storage)
   - Test key shown only once
   - Test key revocation
   - Test unauthorized key access

2. **OAuth Security**
   - Verify API key links to correct OAuth session
   - Test expired refresh token handling
   - Test multiple keys for same user

---

## Files Changed

### New Files
- `pkg/appview/apikey/store.go` - API key storage and validation
- `pkg/appview/handlers/apikeys.go` - API key HTTP handlers
- `docs/API_KEY_MIGRATION.md` - This document

### Modified Files
- `pkg/auth/token/handler.go` - Add API key validation, remove session token
- `pkg/auth/oauth/server.go` - Remove session token creation, redirect to settings
- `pkg/appview/handlers/settings.go` - Add API key management UI
- `pkg/appview/templates/settings.html` - Add API key section
- `cmd/credential-helper/main.go` - Simplify to use API keys
- `cmd/appview/serve.go` - Initialize API key store, remove session manager

### Deleted Files
- `pkg/auth/session/handler.go` - Session token system
- `pkg/auth/exchange/handler.go` - Exchange endpoint (no longer needed)

---

## Advantages

✅ **Simpler Auth:** Two methods instead of three (API keys + app passwords)
✅ **Better UX:** No manual copy/paste of session tokens
✅ **Multiple Keys:** Users can have a laptop key, CI key, etc.
✅ **Revocable:** Revoke individual keys without re-auth
✅ **Server-Side OAuth:** Refresh tokens stay on the server, not in client files
✅ **Familiar Pattern:** Matches AWS ECR, GitHub tokens, etc.

## Backward Compatibility

⚠️ **Breaking Change:** Session tokens will stop working
✅ **App passwords:** Still work (no changes)
✅ **UI sessions:** Still work (separate system)

**Migration Required:** Users with session tokens must run `docker-credential-atcr configure` again to get API keys.

---

docs/OAUTH.md

# ATCR OAuth Implementation

## Overview

ATCR now supports ATProto OAuth authentication via Docker credential helpers. This allows users to authenticate with their ATProto identity (Bluesky account) and use Docker push/pull commands seamlessly.

## Architecture

### Components

1. **OAuth Client** (`pkg/auth/oauth/`)
   - Full ATProto OAuth implementation with DPoP support
   - Uses `authelia.com/client/oauth2` for OAuth + PAR
   - Uses `github.com/AxisCommunications/go-dpop` for DPoP proof generation
   - Automatic authorization server discovery
   - PKCE support for security

2. **Credential Helper** (`cmd/credential-helper/`)
   - Standalone binary: `docker-credential-atcr`
   - Implements the Docker credential helper protocol
   - Manages the OAuth flow with the browser
   - Stores tokens securely in `~/.atcr/oauth-token.json`

3. **Registry Integration**
   - `/auth/exchange` endpoint exchanges OAuth tokens for registry JWTs
   - Existing `/auth/token` endpoint for standard Docker auth

## Dependencies

- `authelia.com/client/oauth2` - OAuth client with PAR support (2⭐, Authelia-backed)
- `github.com/AxisCommunications/go-dpop` - DPoP implementation (10⭐, RFC 9449 compliant)
- `github.com/golang-jwt/jwt/v5` - JWT library (transitive, 11k+⭐)

## Usage

### Setup

1. Build the credential helper:
```bash
go build -o docker-credential-atcr ./cmd/credential-helper
```

2. Install it in your PATH:
```bash
sudo mv docker-credential-atcr /usr/local/bin/
```

3. Configure Docker to use it by editing `~/.docker/config.json`:
```json
{
  "credsStore": "atcr"
}
```

### Configuration

Run the OAuth flow:
```bash
docker-credential-atcr configure
```

This will:
1. Prompt for your ATProto handle (e.g., `alice.bsky.social`)
2. Open your browser for OAuth authorization
3. Store the OAuth token and DPoP key in `~/.atcr/oauth-token.json`

### Using with Docker

Once configured, use Docker normally:

```bash
# Push an image
docker push atcr.io/alice/myapp:latest

# Pull an image
docker pull atcr.io/alice/myapp:latest
```

The credential helper automatically:
1. Loads your stored OAuth token
2. Refreshes it if expired
3. Exchanges it for a registry JWT
4. Provides the JWT to Docker

## How It Works

### OAuth Flow

1. **User runs** `docker-credential-atcr configure`
2. **Resolve identity**: alice.bsky.social → DID → PDS endpoint
3. **Discover auth server**: GET `{pds}/.well-known/oauth-authorization-server`
4. **Generate DPoP key**: ECDSA P-256 key pair
5. **PAR request**: POST to the PAR endpoint with DPoP header + PKCE challenge
6. **Open browser**: User authorizes on their PDS
7. **Receive code**: Callback to `localhost:8888/callback`
8. **Exchange code**: POST to the token endpoint with DPoP header + PKCE verifier
9. **Save tokens**: Store OAuth token + DPoP key + DID/handle

### Docker Push/Pull Flow

1. **Docker needs credentials** for `atcr.io`
2. **Calls credential helper**: `docker-credential-atcr get`
3. **Helper loads token** from `~/.atcr/oauth-token.json`
4. **Refresh if needed**: Uses refresh token + DPoP if expired
5. **Exchange for registry JWT**: POST to `/auth/exchange` with OAuth token + handle
6. **Registry validates token**: Calls `getSession` on the PDS to validate the token
7. **Registry issues JWT**: Creates a registry JWT with the validated DID/handle
8. **Return to Docker**: `{"Username": "oauth2", "Secret": "<jwt>"}`
9. **Docker uses JWT**: For authentication to the registry API

## Security

### DPoP (Demonstrating Proof-of-Possession)

Every OAuth request includes a DPoP proof:
- A unique JWT signed with an ECDSA private key
- Contains the HTTP method, URL, timestamp, and nonce
- Public key (JWK) included in the JWT header
- Binds the token to the specific client

### PKCE (Proof Key for Code Exchange)

- Code verifier generated locally
- Code challenge sent in the authorization request
- Verifier sent in the token exchange
- Prevents authorization code interception

### Token Storage

- Tokens stored in `~/.atcr/oauth-token.json`
- File permissions: 0600 (owner read/write only)
- DPoP key stored in PEM format
- Refresh tokens for long-term access

## Implementation Details

### Code Structure

```
pkg/auth/oauth/
├── client.go      # OAuth client with DPoP
├── discovery.go   # Authorization server discovery
├── metadata.go    # Client metadata document
├── storage.go     # Token persistence
└── transport.go   # DPoP HTTP transport

pkg/auth/atproto/
├── session.go     # ATProto session validation (Basic auth)
└── validator.go   # OAuth token validation via getSession

cmd/credential-helper/
├── main.go        # Docker credential helper protocol
├── oauth.go       # OAuth flow orchestration
└── token.go       # Token management

pkg/auth/exchange/
└── handler.go     # OAuth → Registry JWT exchange
```

### Key Components

**OAuth Client** (`pkg/auth/oauth/client.go`)
- `NewClient()` - Create a client with a DPoP key
- `InitializeForHandle()` - Discover the authorization server
- `AuthorizeURL()` - Generate the authorization URL with PAR + PKCE
- `Exchange()` - Exchange the code for a token with DPoP
- `RefreshToken()` - Refresh an expired token with DPoP

**DPoP Transport** (`pkg/auth/oauth/transport.go`)
- Implements `http.RoundTripper`
- Automatically adds a DPoP header to every request
- Handles nonce management and retries
- Used by the OAuth client for all HTTP requests

**Token Store** (`pkg/auth/oauth/storage.go`)
- Persists OAuth tokens and the DPoP key
- PEM encoding for the private key
- Expiration checking
- Secure file permissions

**Token Validator** (`pkg/auth/atproto/validator.go`)
- `ValidateToken()` - Validate a token via PDS `getSession`
- `ValidateTokenWithResolver()` - Auto-resolve the PDS from a handle
- Returns the validated DID and handle
- Used by the registry to verify OAuth tokens

## Testing

### Manual Testing

1. Configure the helper:

```bash
./docker-credential-atcr configure
# Enter handle: alice.bsky.social
# Browser opens for authorization
# Token saved to ~/.atcr/oauth-token.json
```

2. Test credential retrieval:

```bash
echo '{"ServerURL": "atcr.io"}' | ./docker-credential-atcr get
# Should return: {"Username":"oauth2","Secret":"<jwt>"}
```

3. Test with Docker:

```bash
docker push atcr.io/alice/test:latest
```

### Integration Testing

TODO: Add automated tests for:
- OAuth flow with a mock PDS
- DPoP proof generation
- Token exchange
- Credential helper protocol

## Security Features

### OAuth Token Validation

The registry validates ATProto OAuth tokens by calling `com.atproto.server.getSession` on the user's PDS. This ensures:
- The token is valid and not expired
- The token belongs to the claimed user
- The user's DID and handle are extracted from the PDS response
- No trust is placed in client-provided identity information

**Flow:**
1. Client sends OAuth token + handle to `/auth/exchange`
2. Registry resolves handle → PDS endpoint
3. Registry calls `{pds}/xrpc/com.atproto.server.getSession` with the token
4. PDS validates the token and returns session info (DID, handle)
5. Registry uses the validated DID/handle to issue a registry JWT

## Future Improvements

1. **Token refresh in background**
   - Proactively refresh before expiry
   - Reduce latency on Docker commands

2. **Multiple account support**
   - Store tokens for multiple handles
   - Allow selecting which account to use

3. **Revocation support**
   - Implement token revocation
   - Clean up on logout

4. **Better error messages**
   - User-friendly OAuth error handling
   - Guide users through common issues

## Troubleshooting

### "Failed to resolve identity"
- Check the internet connection
- Verify the handle is correct (e.g., `alice.bsky.social`)
- Ensure the PDS is accessible

### "Authorization timed out"
- Complete authorization within 5 minutes
- Check whether the browser opened correctly
- Try running `configure` again

### "Token expired"
- The credential helper should auto-refresh
- If the error persists, run `configure` again
- Check `~/.atcr/oauth-token.json` permissions

### "Failed to exchange token"
- Ensure the registry is running
- Check that the `/auth/exchange` endpoint is accessible
- Verify the token hasn't been revoked

## References

- [ATProto OAuth Specification](https://atproto.com/specs/oauth)
- [RFC 9449: DPoP](https://datatracker.ietf.org/doc/html/rfc9449)
- [RFC 9126: PAR](https://datatracker.ietf.org/doc/html/rfc9126)
- [RFC 7636: PKCE](https://datatracker.ietf.org/doc/html/rfc7636)
- [Docker Credential Helpers](https://github.com/docker/docker-credential-helpers)
---

`docs/QUOTAS.md`:
# ATCR Quota System

This document describes ATCR's storage quota implementation, inspired by Harbor's proven approach to per-project blob tracking with deduplication.

## Table of Contents

- [Overview](#overview)
- [Harbor's Approach (Reference Implementation)](#harbors-approach-reference-implementation)
- [Storage Options](#storage-options)
- [Quota Data Model](#quota-data-model)
- [Push Flow (Detailed)](#push-flow-detailed)
- [Delete Flow](#delete-flow)
- [Garbage Collection](#garbage-collection)
- [Quota Reconciliation](#quota-reconciliation)
- [Configuration](#configuration)
- [Trade-offs & Design Decisions](#trade-offs--design-decisions)
- [Future Enhancements](#future-enhancements)

## Overview

ATCR implements per-user storage quotas to:
1. **Limit storage consumption** on shared hold services
2. **Track actual S3 costs** (what new data was added)
3. **Benefit from deduplication** (users pay only once per layer)
4. **Provide transparency** (show users their storage usage)

**Key principle:** Users pay for layers they've uploaded, but only ONCE per layer regardless of how many images reference it.

### Example Scenario

```
Alice pushes myapp:v1 (layers A, B, C - each 100MB)
→ Alice's quota: +300MB (all new layers)

Alice pushes myapp:v2 (layers A, B, D)
→ Layers A, B already claimed by Alice
→ Layer D is new (100MB)
→ Alice's quota: +100MB (only D is new)
→ Total: 400MB

Bob pushes his-app:latest (layers A, E)
→ Layer A already exists in S3 (uploaded by Alice)
→ Bob claims it for the first time → +100MB to Bob's quota
→ Layer E is new → +100MB to Bob's quota
→ Bob's quota: 200MB

Physical S3 storage: 500MB (A, B, C, D, E)
Claimed storage: 600MB (Alice: 400MB, Bob: 200MB)
Deduplication savings: 100MB (layer A shared)
```
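The arithmetic in this scenario can be checked with a short simulation; `claim` is a hypothetical helper mirroring the per-user claimed-layer map, not the actual ATCR API:

```go
package main

import "fmt"

const mb = int64(1024 * 1024)

// claim records a layer against a user's quota, charging its size
// only the first time that user claims the digest.
func claim(claimed map[string]int64, used *int64, digest string, size int64) {
	if _, ok := claimed[digest]; ok {
		return // already paid for by this user
	}
	claimed[digest] = size
	*used += size
}

func main() {
	alice := map[string]int64{}
	bob := map[string]int64{}
	var aliceUsed, bobUsed int64

	// myapp:v1 (A, B, C) then myapp:v2 (A, B, D)
	for _, d := range []string{"A", "B", "C", "A", "B", "D"} {
		claim(alice, &aliceUsed, d, 100*mb)
	}
	// his-app:latest (A, E) — A exists globally but is new to Bob
	for _, d := range []string{"A", "E"} {
		claim(bob, &bobUsed, d, 100*mb)
	}

	fmt.Println("Alice:", aliceUsed/mb, "MB") // 400 MB
	fmt.Println("Bob:  ", bobUsed/mb, "MB")   // 200 MB
}
```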

## Harbor's Approach (Reference Implementation)

Harbor is built on distribution/distribution (the same base as ATCR) and implements quotas as middleware. Their approach:

### Key Insights from Harbor

1. **"Shared blobs are only computed once per project"**
   - Each project tracks which blobs it has uploaded
   - The same blob used in multiple images counts only once per project
   - Different projects claiming the same blob each pay for it

2. **Quota checked when the manifest is pushed**
   - Blobs upload first (presigned URLs, can't intercept)
   - The manifest is pushed last → the quota check happens here
   - Can reject the manifest if quota is exceeded (orphaned blobs are cleaned by GC)

3. **Middleware-based implementation**
   - distribution/distribution has NO built-in quota support
   - Harbor added it as request preprocessing middleware
   - Uses a database (PostgreSQL) or Redis for quota storage

4. **Per-project ownership model**
   - Blobs are physically deduplicated globally
   - Quota accounting is logical (per-project claims)
   - Total claimed storage can exceed physical storage

### References

- Harbor Quota Documentation: https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- Harbor Source: https://github.com/goharbor/harbor (see `src/controller/quota`)

## Storage Options

The hold service needs to store quota data somewhere. Two options:

### Option 1: S3-Based Storage (Recommended for BYOS)

Store quota metadata alongside blobs in the same S3 bucket:

```
Bucket structure:
/docker/registry/v2/blobs/sha256/ab/abc123.../data  ← actual blobs
/atcr/quota/did:plc:alice.json                      ← quota tracking
/atcr/quota/did:plc:bob.json
```

**Pros:**
- ✅ No separate database needed
- ✅ Single S3 bucket (better UX - no second bucket to configure)
- ✅ Quota data lives with the blobs
- ✅ Hold service stays relatively stateless
- ✅ Works with any S3-compatible service (Storj, Minio, Upcloud, Fly.io)

**Cons:**
- ❌ Slower than a local database (network round-trip)
- ❌ Eventual consistency issues
- ❌ Race conditions on concurrent updates
- ❌ Extra S3 API costs (GET/PUT per upload)

**Performance:**
- Each blob upload: 1 HEAD (blob exists?) + 1 GET (quota) + 1 PUT (update quota)
- Typical latency: 100-200ms total overhead
- For high-throughput registries, consider SQLite

### Option 2: SQLite Database (Recommended for Shared Holds)

A local database in the hold service:

```
/var/lib/atcr/hold-quota.db
```

**Pros:**
- ✅ Fast local queries (no network latency)
- ✅ ACID transactions (no race conditions)
- ✅ Efficient for high-throughput registries
- ✅ Can use foreign keys and joins

**Cons:**
- ❌ Makes the hold service stateful (persistent volume needed)
- ❌ Not ideal for ephemeral BYOS deployments
- ❌ Backup/restore complexity
- ❌ Multi-instance scaling requires a shared database

**Schema:**
```sql
CREATE TABLE user_quotas (
    did         TEXT PRIMARY KEY,
    quota_limit INTEGER NOT NULL DEFAULT 10737418240, -- 10GB
    quota_used  INTEGER NOT NULL DEFAULT 0,
    updated_at  TIMESTAMP
);

CREATE TABLE claimed_layers (
    did        TEXT NOT NULL,
    digest     TEXT NOT NULL,
    size       INTEGER NOT NULL,
    claimed_at TIMESTAMP,
    PRIMARY KEY(did, digest)
);
```

### Recommendation

- **BYOS (user-owned holds):** S3-based (keeps the hold service ephemeral)
- **Shared holds (multi-user):** SQLite (better performance and consistency)
- **High-traffic production:** SQLite or PostgreSQL (Harbor uses this)

## Quota Data Model

### Quota File Format (S3-based)

```json
{
  "did": "did:plc:alice123",
  "limit": 10737418240,
  "used": 5368709120,
  "claimed_layers": {
    "sha256:abc123...": 104857600,
    "sha256:def456...": 52428800,
    "sha256:789ghi...": 209715200
  },
  "last_updated": "2025-10-09T12:34:56Z",
  "version": 1
}
```

**Fields:**
- `did`: User's ATProto DID
- `limit`: Maximum storage in bytes (default: 10GB)
- `used`: Current storage usage in bytes (sum of `claimed_layers`)
- `claimed_layers`: Map of digest → size for all layers the user has uploaded
- `last_updated`: Timestamp of the last quota update
- `version`: Schema version for future migrations

### Why Track Individual Layers?

**Q: Can't we just track a counter?**

**A: We need layer tracking for:**

1. **Deduplication detection**
   - Check whether the user already claimed a layer → free upload
   - Example: updating an image reuses most layers

2. **Accurate deletes**
   - When a manifest is deleted, only decrement layers no longer referenced
   - A user may have 5 images sharing layer A - deleting 1 image doesn't free layer A

3. **Quota reconciliation**
   - Verify quota matches reality by listing the user's manifests
   - Recalculate from layers in manifests vs the `claimed_layers` map

4. **Auditing**
   - "Show me what I'm storing"
   - Users can see which layers consume their quota

## Push Flow (Detailed)

### Step-by-Step: User Pushes an Image

```
┌──────────┐                ┌──────────┐                 ┌──────────┐
│  Client  │                │   Hold   │                 │    S3    │
│ (Docker) │                │  Service │                 │  Bucket  │
└──────────┘                └──────────┘                 └──────────┘
     │                           │                            │
     │ 1. PUT /v2/.../blobs/     │                            │
     │   upload?digest=sha256:abc│                            │
     ├──────────────────────────>│                            │
     │                           │                            │
     │                           │ 2. Check if blob exists    │
     │                           │    (Stat/HEAD request)     │
     │                           ├───────────────────────────>│
     │                           │<───────────────────────────┤
     │                           │    200 OK (exists) or      │
     │                           │    404 Not Found           │
     │                           │                            │
     │                           │ 3. Read user quota         │
     │                           │    GET /atcr/quota/{did}   │
     │                           ├───────────────────────────>│
     │                           │<───────────────────────────┤
     │                           │    quota.json              │
     │                           │                            │
     │                           │ 4. Calculate quota impact  │
     │                           │    - If digest in          │
     │                           │      claimed_layers: 0     │
     │                           │    - Else: size            │
     │                           │                            │
     │                           │ 5. Check quota limit       │
     │                           │    used + impact <= limit? │
     │                           │                            │
     │                           │ 6. Update quota            │
     │                           │    PUT /atcr/quota/{did}   │
     │                           ├───────────────────────────>│
     │                           │<───────────────────────────┤
     │                           │    200 OK                  │
     │                           │                            │
     │ 7. Presigned URL          │                            │
     │<──────────────────────────┤                            │
     │  {url: "https://s3..."}   │                            │
     │                           │                            │
     │ 8. Upload blob to S3      │                            │
     ├───────────────────────────┼───────────────────────────>│
     │                           │                            │
     │ 9. 200 OK                 │                            │
     │<──────────────────────────┼────────────────────────────┤
     │                           │                            │
```

### Implementation (Pseudocode)

```go
// cmd/hold/main.go - HandlePutPresignedURL

func (s *HoldService) HandlePutPresignedURL(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()

	var req PutPresignedURLRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request", http.StatusBadRequest)
		return
	}

	// Step 1: Check if the blob already exists in S3.
	// The digest is "sha256:<hex>"; the blob path uses the algorithm
	// and hex parts separately.
	algorithm, hex, _ := strings.Cut(req.Digest, ":")
	blobPath := fmt.Sprintf("/docker/registry/v2/blobs/%s/%s/%s/data",
		algorithm, hex[:2], hex)

	_, err := s.driver.Stat(ctx, blobPath)
	blobExists := (err == nil)

	// Step 2: Read the quota from S3 (or SQLite)
	quota, err := s.quotaManager.GetQuota(req.DID)
	if err != nil {
		// First upload - create a quota with defaults
		quota = &Quota{
			DID:           req.DID,
			Limit:         s.config.QuotaDefaultLimit,
			Used:          0,
			ClaimedLayers: make(map[string]int64),
		}
	}

	// Step 3: Calculate the quota impact
	quotaImpact := req.Size // Default: assume a new layer

	if _, alreadyClaimed := quota.ClaimedLayers[req.Digest]; alreadyClaimed {
		// User already uploaded this layer before
		quotaImpact = 0
		log.Printf("Layer %s already claimed by %s, no quota impact",
			req.Digest, req.DID)
	} else if blobExists {
		// Blob exists in S3 (uploaded by another user), but this user
		// is claiming it for the first time - it still counts against
		// their quota.
		log.Printf("Layer %s exists globally but new to %s, quota impact: %d",
			req.Digest, req.DID, quotaImpact)
	} else {
		// Brand new blob - will be uploaded to S3
		log.Printf("New layer %s for %s, quota impact: %d",
			req.Digest, req.DID, quotaImpact)
	}

	// Step 4: Check the quota limit
	if quota.Used+quotaImpact > quota.Limit {
		http.Error(w, fmt.Sprintf(
			"quota exceeded: used=%d, impact=%d, limit=%d",
			quota.Used, quotaImpact, quota.Limit,
		), http.StatusPaymentRequired) // 402
		return
	}

	// Step 5: Update the quota (optimistic - before the upload completes)
	quota.Used += quotaImpact
	if quotaImpact > 0 {
		quota.ClaimedLayers[req.Digest] = req.Size
	}
	quota.LastUpdated = time.Now()

	if err := s.quotaManager.SaveQuota(quota); err != nil {
		http.Error(w, "failed to update quota", http.StatusInternalServerError)
		return
	}

	// Step 6: Generate the presigned URL
	presignedURL, err := s.getUploadURL(ctx, req.Digest, req.Size, req.DID)
	if err != nil {
		// Roll back the quota update on error. Only undo what we added:
		// if the layer was already claimed (impact 0), it must stay claimed.
		if quotaImpact > 0 {
			quota.Used -= quotaImpact
			delete(quota.ClaimedLayers, req.Digest)
			s.quotaManager.SaveQuota(quota)
		}

		http.Error(w, "failed to generate presigned URL", http.StatusInternalServerError)
		return
	}

	// Step 7: Return the presigned URL + quota info
	resp := PutPresignedURLResponse{
		URL:       presignedURL,
		ExpiresAt: time.Now().Add(15 * time.Minute),
		QuotaInfo: QuotaInfo{
			Used:           quota.Used,
			Limit:          quota.Limit,
			Available:      quota.Limit - quota.Used,
			Impact:         quotaImpact,
			AlreadyClaimed: quotaImpact == 0,
		},
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(resp)
}
```

### Race Condition Handling

**Problem:** Two concurrent uploads of the same blob

```
Time    User A                    User B
0ms     Upload layer X (100MB)
10ms                              Upload layer X (100MB)
20ms    Check exists: NO          Check exists: NO
30ms    Quota impact: 100MB       Quota impact: 100MB
40ms    Update quota A: +100MB    Update quota B: +100MB
50ms    Generate presigned URL    Generate presigned URL
100ms   Upload to S3 completes    Upload to S3 (overwrites A's)
```

**Result:** Both users are charged 100MB, but only 100MB is stored in S3.

**Mitigation strategies:**

1. **Accept eventual consistency** (recommended for S3-based)
   - Run periodic reconciliation to fix discrepancies
   - A small inconsistency window (minutes) is acceptable
   - Reconciliation uses the PDS as the source of truth

2. **Optimistic locking** (S3 ETags)
   ```go
   // Use S3 ETags for conditional writes
   oldETag := getQuotaFileETag(did)
   err := putQuotaFileWithCondition(quota, oldETag)
   if err == ErrPreconditionFailed {
       // Retry with a fresh read
   }
   ```

3. **Database transactions** (SQLite)
   ```sql
   -- SQLite has no SELECT ... FOR UPDATE; BEGIN IMMEDIATE takes the
   -- write lock up front so the read-modify-write is atomic.
   BEGIN IMMEDIATE;
   SELECT quota_used, quota_limit FROM user_quotas WHERE did = ?;
   UPDATE user_quotas SET quota_used = quota_used + ? WHERE did = ?;
   COMMIT;
   ```

## Delete Flow

### Manifest Deletion via AppView UI

When a user deletes a manifest through the AppView web interface:

```
┌──────────┐         ┌──────────┐           ┌──────────┐          ┌──────────┐
│   User   │         │ AppView  │           │   Hold   │          │   PDS    │
│    UI    │         │ Database │           │  Service │          │          │
└──────────┘         └──────────┘           └──────────┘          └──────────┘
     │                    │                      │                     │
     │ DELETE manifest    │                      │                     │
     ├───────────────────>│                      │                     │
     │                    │                      │                     │
     │                    │ 1. Get manifest      │                     │
     │                    │    and layers        │                     │
     │                    │                      │                     │
     │                    │ 2. Check which       │                     │
     │                    │    layers are still  │                     │
     │                    │    referenced by     │                     │
     │                    │    the user's other  │                     │
     │                    │    manifests         │                     │
     │                    │                      │                     │
     │                    │ 3. DELETE manifest   │                     │
     │                    │    from PDS          │                     │
     │                    ├──────────────────────┼────────────────────>│
     │                    │                      │                     │
     │                    │ 4. POST /quota/decrement                   │
     │                    ├─────────────────────>│                     │
     │                    │    {layers: [...]}   │                     │
     │                    │                      │                     │
     │                    │                      │ 5. Update quota     │
     │                    │                      │    Remove unclaimed │
     │                    │                      │    layers           │
     │                    │                      │                     │
     │                    │ 6. 200 OK            │                     │
     │                    │<─────────────────────┤                     │
     │                    │                      │                     │
     │                    │ 7. Delete from DB    │                     │
     │                    │                      │                     │
     │ 8. Success         │                      │                     │
     │<───────────────────┤                      │                     │
     │                    │                      │                     │
```

### AppView Implementation

```go
// pkg/appview/handlers/manifest.go

func (h *ManifestHandler) DeleteManifest(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	did := ctx.Value("auth.did").(string)
	repository := chi.URLParam(r, "repository")
	digest := chi.URLParam(r, "digest")

	// Step 1: Get the manifest and its layers from the database
	manifest, err := db.GetManifest(h.db, digest)
	if err != nil {
		http.Error(w, "manifest not found", 404)
		return
	}

	layers, err := db.GetLayersForManifest(h.db, manifest.ID)
	if err != nil {
		http.Error(w, "failed to get layers", 500)
		return
	}

	// Step 2: For each layer, check whether the user still references
	// it in other manifests
	layersToDecrement := []LayerInfo{}

	for _, layer := range layers {
		// Query: does this user have other manifests using this layer?
		stillReferenced, err := db.CheckLayerReferencedByUser(
			h.db, did, repository, layer.Digest, manifest.ID,
		)
		if err != nil {
			http.Error(w, "failed to check layer references", 500)
			return
		}

		if !stillReferenced {
			// This layer is no longer used by the user
			layersToDecrement = append(layersToDecrement, LayerInfo{
				Digest: layer.Digest,
				Size:   layer.Size,
			})
		}
	}

	// Step 3: Delete the manifest from the user's PDS
	atprotoClient := atproto.NewClient(manifest.PDSEndpoint, did, accessToken)
	err = atprotoClient.DeleteRecord(ctx, atproto.ManifestCollection, manifestRKey)
	if err != nil {
		http.Error(w, "failed to delete from PDS", 500)
		return
	}

	// Step 4: Notify the hold service to decrement the quota
	if len(layersToDecrement) > 0 {
		holdClient := &http.Client{}

		decrementReq := QuotaDecrementRequest{
			DID:    did,
			Layers: layersToDecrement,
		}

		body, _ := json.Marshal(decrementReq)
		resp, err := holdClient.Post(
			manifest.HoldEndpoint+"/quota/decrement",
			"application/json",
			bytes.NewReader(body),
		)

		if err != nil || resp.StatusCode != 200 {
			log.Printf("Warning: failed to update quota on hold service: %v", err)
			// Continue anyway - GC reconciliation will fix it
		}
	}

	// Step 5: Delete from the AppView database
	err = db.DeleteManifest(h.db, did, repository, digest)
	if err != nil {
		http.Error(w, "failed to delete from database", 500)
		return
	}

	w.WriteHeader(http.StatusNoContent)
}
```

### Hold Service Decrement Endpoint

```go
// cmd/hold/main.go

type QuotaDecrementRequest struct {
	DID    string      `json:"did"`
	Layers []LayerInfo `json:"layers"`
}

type LayerInfo struct {
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

func (s *HoldService) HandleQuotaDecrement(w http.ResponseWriter, r *http.Request) {
	var req QuotaDecrementRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid request", 400)
		return
	}

	// Read the current quota
	quota, err := s.quotaManager.GetQuota(req.DID)
	if err != nil {
		http.Error(w, "quota not found", 404)
		return
	}

	// Decrement the quota for each layer
	for _, layer := range req.Layers {
		if size, claimed := quota.ClaimedLayers[layer.Digest]; claimed {
			// Remove from claimed layers
			delete(quota.ClaimedLayers, layer.Digest)
			quota.Used -= size

			log.Printf("Decremented quota for %s: layer %s (%d bytes)",
				req.DID, layer.Digest, size)
		} else {
			log.Printf("Warning: layer %s not in claimed_layers for %s",
				layer.Digest, req.DID)
		}
	}

	// Ensure quota.Used doesn't go negative (defensive)
	if quota.Used < 0 {
		log.Printf("Warning: quota.Used went negative for %s, resetting to 0", req.DID)
		quota.Used = 0
	}

	// Save the updated quota
	quota.LastUpdated = time.Now()
	if err := s.quotaManager.SaveQuota(quota); err != nil {
		http.Error(w, "failed to save quota", 500)
		return
	}

	// Return the updated quota info
	json.NewEncoder(w).Encode(map[string]any{
		"used":  quota.Used,
		"limit": quota.Limit,
	})
}
```

### SQL Query: Check Layer References

```sql
-- pkg/appview/db/queries.go

-- Check whether the user still references this layer in other manifests
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?      -- User's DID
  AND l.digest = ?   -- Layer digest
  AND m.id != ?      -- Exclude the manifest being deleted
```

## Garbage Collection

### Background: Orphaned Blobs

Orphaned blobs accumulate when:
1. A manifest push fails after blobs are uploaded (presigned URLs bypass the hold)
2. Quota is exceeded - the manifest is rejected, but blobs are already in S3
3. A user deletes a manifest - the blobs are no longer referenced

**GC periodically cleans these up.**

### GC Cron Implementation

Similar to AppView's backfill worker, the hold service can run periodic GC:

```go
// cmd/hold/gc/gc.go

type GarbageCollector struct {
	driver       storagedriver.StorageDriver
	appviewURL   string
	holdURL      string
	quotaManager *quota.Manager
}

// Run garbage collection
func (gc *GarbageCollector) Run(ctx context.Context) error {
	log.Println("Starting garbage collection...")

	// Step 1: Get the list of referenced blobs from AppView
	referenced, err := gc.getReferencedBlobs()
	if err != nil {
		return fmt.Errorf("failed to get referenced blobs: %w", err)
	}

	referencedSet := make(map[string]bool)
	for _, digest := range referenced {
		referencedSet[digest] = true
	}

	log.Printf("AppView reports %d referenced blobs", len(referenced))

	// Step 2: Walk the S3 blobs
	deletedCount := 0
	reclaimedBytes := int64(0)

	err = gc.driver.Walk(ctx, "/docker/registry/v2/blobs", func(fileInfo storagedriver.FileInfo) error {
		if fileInfo.IsDir() {
			return nil // Skip directories
		}

		// Extract the digest from the path
		// Path: /docker/registry/v2/blobs/sha256/ab/abc123.../data
		digest := extractDigestFromPath(fileInfo.Path())

		if !referencedSet[digest] {
			// Unreferenced blob - delete it
			size := fileInfo.Size()

			if err := gc.driver.Delete(ctx, fileInfo.Path()); err != nil {
				log.Printf("Failed to delete blob %s: %v", digest, err)
				return nil // Continue anyway
			}

			deletedCount++
			reclaimedBytes += size

			log.Printf("GC: Deleted unreferenced blob %s (%d bytes)", digest, size)
		}

		return nil
	})

	if err != nil {
		return fmt.Errorf("failed to walk blobs: %w", err)
	}

	log.Printf("GC complete: deleted %d blobs, reclaimed %d bytes",
		deletedCount, reclaimedBytes)

	return nil
}

// Get referenced blobs from AppView
func (gc *GarbageCollector) getReferencedBlobs() ([]string, error) {
	// Query AppView for all blobs referenced by manifests
	// stored in THIS hold service. (Named "endpoint" so the variable
	// doesn't shadow the net/url package.)
	endpoint := fmt.Sprintf("%s/internal/blobs/referenced?hold=%s",
		gc.appviewURL, url.QueryEscape(gc.holdURL))

	resp, err := http.Get(endpoint)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var result struct {
		Blobs []string `json:"blobs"`
	}

	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return nil, err
	}

	return result.Blobs, nil
}
```
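`extractDigestFromPath` is referenced above but not shown; one possible implementation for the standard distribution blob layout (a sketch, not the actual ATCR helper):

```go
package main

import (
	"fmt"
	"strings"
)

// extractDigestFromPath recovers "<algorithm>:<hex>" from a
// distribution-style blob path such as
// /docker/registry/v2/blobs/sha256/ab/abc123/data.
func extractDigestFromPath(path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	// Expect: docker registry v2 blobs <alg> <prefix> <hex> data
	if len(parts) < 8 || parts[len(parts)-1] != "data" {
		return ""
	}
	alg := parts[len(parts)-4]
	hex := parts[len(parts)-2]
	return alg + ":" + hex
}

func main() {
	p := "/docker/registry/v2/blobs/sha256/ab/abc123/data"
	fmt.Println(extractDigestFromPath(p)) // sha256:abc123
}
```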

### AppView Internal API

```go
// pkg/appview/handlers/internal.go

// Get all referenced blobs for a specific hold
func (h *InternalHandler) GetReferencedBlobs(w http.ResponseWriter, r *http.Request) {
	holdEndpoint := r.URL.Query().Get("hold")
	if holdEndpoint == "" {
		http.Error(w, "missing hold parameter", 400)
		return
	}

	// Query the database for all layers in manifests stored in this hold
	query := `
		SELECT DISTINCT l.digest
		FROM layers l
		JOIN manifests m ON l.manifest_id = m.id
		WHERE m.hold_endpoint = ?
	`

	rows, err := h.db.Query(query, holdEndpoint)
	if err != nil {
		http.Error(w, "database error", 500)
		return
	}
	defer rows.Close()

	blobs := []string{}
	for rows.Next() {
		var digest string
		if err := rows.Scan(&digest); err != nil {
			continue
		}
		blobs = append(blobs, digest)
	}

	json.NewEncoder(w).Encode(map[string]any{
		"blobs": blobs,
		"count": len(blobs),
		"hold":  holdEndpoint,
	})
}
```

### GC Cron Schedule

```go
// cmd/hold/main.go

func main() {
	// ... service setup ...

	// Start the GC cron if enabled
	if os.Getenv("GC_ENABLED") == "true" {
		gcInterval := 24 * time.Hour // Daily by default

		go func() {
			ticker := time.NewTicker(gcInterval)
			defer ticker.Stop()

			for range ticker.C {
				if err := garbageCollector.Run(context.Background()); err != nil {
					log.Printf("GC error: %v", err)
				}
			}
		}()

		log.Printf("GC cron started: runs every %v", gcInterval)
	}

	// Start the server...
}
```

## Quota Reconciliation

### PDS as Source of Truth

**Key insight:** Manifest records in the PDS are publicly readable (no OAuth needed for reads).

Each manifest contains:
- Repository name
- Digest
- A layers array with digest + size
- The hold endpoint

The hold service can query the PDS to calculate the user's true quota:

```
1. List all io.atcr.manifest records for the user
2. Filter manifests where holdEndpoint == this hold service
3. Extract unique layers (deduplicate by digest)
4. Sum layer sizes = true quota usage
5. Compare to the quota file
6. Fix discrepancies
```
### Implementation

```go
// cmd/hold/quota/reconcile.go

type Reconciler struct {
	quotaManager    *Manager
	atprotoResolver *atproto.Resolver
	holdURL         string
}

// ReconcileUser recalculates quota from PDS manifests
func (r *Reconciler) ReconcileUser(ctx context.Context, did string) error {
	log.Printf("Reconciling quota for %s", did)

	// Step 1: Resolve user's PDS endpoint
	identity, err := r.atprotoResolver.ResolveIdentity(ctx, did)
	if err != nil {
		return fmt.Errorf("failed to resolve DID: %w", err)
	}

	// Step 2: Create unauthenticated ATProto client
	// (manifest records are public - no OAuth needed)
	client := atproto.NewClient(identity.PDSEndpoint, did, "")

	// Step 3: List all manifest records for this user
	manifests, err := client.ListRecords(ctx, atproto.ManifestCollection, 1000)
	if err != nil {
		return fmt.Errorf("failed to list manifests: %w", err)
	}

	// Step 4: Filter manifests stored in THIS hold service
	// and extract unique layers
	uniqueLayers := make(map[string]int64) // digest -> size

	for _, record := range manifests {
		var manifest atproto.ManifestRecord
		if err := json.Unmarshal(record.Value, &manifest); err != nil {
			log.Printf("Warning: failed to parse manifest: %v", err)
			continue
		}

		// Only count manifests stored in this hold
		if manifest.HoldEndpoint != r.holdURL {
			continue
		}

		// Add config blob
		if manifest.Config.Digest != "" {
			uniqueLayers[manifest.Config.Digest] = manifest.Config.Size
		}

		// Add layer blobs
		for _, layer := range manifest.Layers {
			uniqueLayers[layer.Digest] = layer.Size
		}
	}

	// Step 5: Calculate true quota usage
	trueUsage := int64(0)
	for _, size := range uniqueLayers {
		trueUsage += size
	}

	log.Printf("User %s true usage from PDS: %d bytes (%d unique layers)",
		did, trueUsage, len(uniqueLayers))

	// Step 6: Compare with current quota file
	quota, err := r.quotaManager.GetQuota(did)
	if err != nil {
		log.Printf("No existing quota for %s, creating new", did)
		quota = &Quota{
			DID:           did,
			Limit:         r.quotaManager.DefaultLimit,
			ClaimedLayers: make(map[string]int64),
		}
	}

	// Step 7: Fix discrepancies
	if quota.Used != trueUsage || len(quota.ClaimedLayers) != len(uniqueLayers) {
		log.Printf("Quota mismatch for %s: recorded=%d, actual=%d (diff=%d)",
			did, quota.Used, trueUsage, trueUsage-quota.Used)

		// Update quota to match PDS truth
		quota.Used = trueUsage
		quota.ClaimedLayers = uniqueLayers
		quota.LastUpdated = time.Now()

		if err := r.quotaManager.SaveQuota(quota); err != nil {
			return fmt.Errorf("failed to save reconciled quota: %w", err)
		}

		log.Printf("Reconciled quota for %s: %d bytes", did, trueUsage)
	} else {
		log.Printf("Quota for %s is accurate", did)
	}

	return nil
}

// ReconcileAll reconciles all users (run periodically)
func (r *Reconciler) ReconcileAll(ctx context.Context) error {
	// Get list of all users with quota files
	users, err := r.quotaManager.ListUsers()
	if err != nil {
		return err
	}

	log.Printf("Starting reconciliation for %d users", len(users))

	for _, did := range users {
		if err := r.ReconcileUser(ctx, did); err != nil {
			log.Printf("Failed to reconcile %s: %v", did, err)
			// Continue with other users
		}
	}

	log.Println("Reconciliation complete")
	return nil
}
```
### Reconciliation Cron

```go
// cmd/hold/main.go

func main() {
	// ... setup ...

	// Start reconciliation cron
	if os.Getenv("QUOTA_RECONCILE_ENABLED") == "true" {
		reconcileInterval := 24 * time.Hour // Daily

		go func() {
			ticker := time.NewTicker(reconcileInterval)
			defer ticker.Stop()

			for range ticker.C {
				if err := reconciler.ReconcileAll(context.Background()); err != nil {
					log.Printf("Reconciliation error: %v", err)
				}
			}
		}()

		log.Printf("Quota reconciliation cron started: runs every %v", reconcileInterval)
	}

	// ... start server ...
}
```
### Why PDS as Source of Truth Works

1. **Manifests are canonical** - If a manifest exists in the PDS, the user owns those layers
2. **Public reads** - No OAuth needed, just resolve DID → PDS endpoint
3. **ATProto durability** - The PDS is the user's authoritative data store
4. **AppView is a cache** - The AppView database might lag or have inconsistencies
5. **Reconciliation fixes drift** - Periodic sync from the PDS ensures accuracy

**Example reconciliation scenarios:**

- **Orphaned quota entries:** User deleted a manifest from the PDS, but the hold quota still has it
  → Reconciliation removes it from claimed_layers

- **Missing quota entries:** User pushed a manifest, but the quota update failed
  → Reconciliation adds it to claimed_layers

- **Race condition duplicates:** Two concurrent pushes double-counted a layer
  → Reconciliation fixes the quota to actual usage
## Configuration

### Hold Service Environment Variables

```bash
# .env.hold

# ============================================================================
# Quota Configuration
# ============================================================================

# Enable quota enforcement
QUOTA_ENABLED=true

# Default quota limit per user (bytes)
# 10GB  = 10737418240
# 50GB  = 53687091200
# 100GB = 107374182400
QUOTA_DEFAULT_LIMIT=10737418240

# Storage backend for quota data
# Options: s3, sqlite
QUOTA_STORAGE_BACKEND=s3

# For S3-based storage:
# Quota files stored in same bucket as blobs
QUOTA_STORAGE_PREFIX=/atcr/quota/

# For SQLite-based storage:
QUOTA_DB_PATH=/var/lib/atcr/hold-quota.db

# ============================================================================
# Garbage Collection
# ============================================================================

# Enable periodic garbage collection
GC_ENABLED=true

# GC interval (default: 24h)
GC_INTERVAL=24h

# AppView URL for GC reference checking
APPVIEW_URL=https://atcr.io

# ============================================================================
# Quota Reconciliation
# ============================================================================

# Enable quota reconciliation from PDS
QUOTA_RECONCILE_ENABLED=true

# Reconciliation interval (default: 24h)
QUOTA_RECONCILE_INTERVAL=24h

# ============================================================================
# Hold Service Identity (Required)
# ============================================================================

# Public URL of this hold service
HOLD_PUBLIC_URL=https://hold1.example.com

# Owner DID (for auto-registration)
HOLD_OWNER=did:plc:xyz123
```

### AppView Configuration

```bash
# .env.appview

# Internal API endpoint for hold services
# Used for GC reference checking
ATCR_INTERNAL_API_ENABLED=true

# Optional: authentication token for internal APIs
ATCR_INTERNAL_API_TOKEN=secret123
```
## Trade-offs & Design Decisions

### 1. Claimed Storage vs Physical Storage

**Decision:** Track claimed storage (logical accounting)

**Why:**
- Predictable for users: "you pay for what you upload"
- No complex cross-user dependencies
- Delete always gives you quota back
- Matches Harbor's proven model

**Trade-off:**
- Total claimed can exceed physical storage
- Users might complain: "I uploaded 10GB but S3 only has 6GB"

**Mitigation:**
- Show a deduplication savings metric
- Educate users: "You claimed 10GB, but deduplication saved 4GB"
### 2. S3 vs SQLite for Quota Storage

**Decision:** Support both, recommend based on use case

**S3 Pros:**
- No database to manage
- Quota data lives with the blobs
- Better for ephemeral BYOS

**SQLite Pros:**
- Faster (no network)
- ACID transactions (no race conditions)
- Better for high-traffic shared holds

**Trade-off:**
- S3: eventual consistency, race conditions
- SQLite: stateful service, scaling challenges

**Mitigation:**
- Reconciliation fixes S3 inconsistencies
- SQLite can use a shared DB for multi-instance deployments

### 3. Optimistic Quota Update

**Decision:** Update quota BEFORE the upload completes

**Why:**
- Prevents race conditions (two users uploading simultaneously)
- Can reject before a presigned URL is generated
- Simpler flow

**Trade-off:**
- If the upload fails, the quota is already incremented (the user "paid" for nothing)

**Mitigation:**
- Reconciliation from the PDS fixes orphaned quota entries
- Acceptable for MVP (upload failures are rare)
### 4. AppView as Intermediary

**Decision:** AppView notifies the hold service on deletes

**Why:**
- AppView already has the manifest/layer database
- Can efficiently check if a layer is still referenced
- Hold service doesn't need to query the PDS on every delete

**Trade-off:**
- AppView → Hold dependency
- Network hop on delete

**Mitigation:**
- If notification fails, reconciliation fixes the quota
- Eventually consistent is acceptable

### 5. PDS as Source of Truth

**Decision:** Use PDS manifests for reconciliation

**Why:**
- Manifests in the PDS are canonical user data
- Public reads (no OAuth for reconciliation)
- AppView database might lag or be inconsistent

**Trade-off:**
- Reconciliation requires PDS queries (slower)
- Limited to 1000 manifests per query

**Mitigation:**
- Run reconciliation daily (not real-time)
- Paginate if a user has >1000 manifests
## Future Enhancements

### 1. Quota API Endpoints

```
GET  /quota/usage     - Get current user's quota
GET  /quota/breakdown - Get storage by repository
POST /quota/limit     - Update user's quota limit (admin)
GET  /quota/stats     - Get hold-wide statistics
```
### 2. Quota Alerts

Notify users when approaching the limit:
- Email/webhook at 80%, 90%, 95%
- Reject uploads at 100% (currently implemented)
- Grace period: allow 105% temporarily

### 3. Tiered Quotas

Different limits based on user tier:
- Free: 10GB
- Pro: 100GB
- Enterprise: unlimited

### 4. Quota Purchasing

Allow users to buy additional storage:
- Stripe integration
- $0.10/GB/month pricing
- Dynamic limit updates

### 5. Cross-Hold Deduplication

If multiple holds share the same S3 bucket:
- Track blob ownership globally
- Split costs proportionally
- More complex, but maximizes deduplication

### 6. Manifest-Based Quota (Alternative Model)

Instead of tracking layers, track manifests:
- Simpler: just count manifest sizes
- No deduplication benefits for users
- Might be acceptable for some use cases

### 7. Redis-Based Quota (High Performance)

For high-traffic registries:
- Use Redis instead of S3/SQLite
- Sub-millisecond quota checks
- Harbor-proven approach

### 8. Quota Visualizations

Web UI showing:
- Storage usage over time
- Top consumers by repository
- Deduplication savings graph
- Layer size distribution
## Appendix: SQL Queries

### Check if User Still References Layer

```sql
-- After deleting a manifest, check if the user has other manifests using this layer
SELECT COUNT(*)
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?     -- User's DID
  AND l.digest = ?  -- Layer digest to check
  AND m.id != ?     -- Exclude the manifest being deleted
```

### Get All Unique Layers for User

```sql
-- Calculate true quota usage for a user
SELECT DISTINCT l.digest, l.size
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
```

### Get Referenced Blobs for Hold

```sql
-- For GC: get all blobs still referenced by any user of this hold
SELECT DISTINCT l.digest
FROM layers l
JOIN manifests m ON l.manifest_id = m.id
WHERE m.hold_endpoint = ?
```

### Get Storage Stats by Repository

```sql
-- User's storage broken down by repository
SELECT
  m.repository,
  COUNT(DISTINCT m.id) AS manifest_count,
  COUNT(DISTINCT l.digest) AS unique_layers,
  SUM(l.size) AS total_size
FROM manifests m
JOIN layers l ON l.manifest_id = m.id
WHERE m.did = ?
  AND m.hold_endpoint = ?
GROUP BY m.repository
ORDER BY total_size DESC
```
## References

- **Harbor Quotas:** https://goharbor.io/docs/1.10/administration/configure-project-quotas/
- **Harbor Source:** https://github.com/goharbor/harbor
- **ATProto Spec:** https://atproto.com/specs/record
- **OCI Distribution Spec:** https://github.com/opencontainers/distribution-spec
- **S3 API Reference:** https://docs.aws.amazon.com/AmazonS3/latest/API/
- **Distribution GC:** https://github.com/distribution/distribution/blob/main/registry/storage/garbagecollect.go

---

**Document Version:** 1.0
**Last Updated:** 2025-10-09
**Author:** Generated from implementation research and Harbor analysis
---

# docs/SPEC.md
ATProto Container Registry (atcr.io) Implementation Plan

  Project Structure

  /home/data/atcr.io/
  ├── cmd/
  │   └── registry/
  │       └── main.go               # Entrypoint that imports distribution
  ├── pkg/
  │   ├── atproto/
  │   │   ├── client.go             # ATProto client wrapper (using indigo)
  │   │   ├── manifest_store.go     # Implements distribution.ManifestService
  │   │   ├── resolver.go           # DID/handle resolution (alice → did:plc:...)
  │   │   └── lexicon.go            # ATProto record schemas for manifests
  │   ├── storage/
  │   │   ├── s3_blob_store.go      # Wraps distribution's S3 driver for blobs
  │   │   └── routing_repository.go # Routes manifests→ATProto, blobs→S3
  │   ├── middleware/
  │   │   ├── repository.go         # Repository middleware registration
  │   │   └── registry.go           # Registry middleware for name resolution
  │   └── server/
  │       └── handler.go            # HTTP wrapper for custom name resolution
  ├── config/
  │   └── config.yml                # Registry configuration
  ├── go.mod
  ├── go.sum
  ├── Dockerfile
  ├── README.md
  └── CLAUDE.md                     # Updated with architecture docs

  Implementation Steps

  Phase 1: Project Setup

  1. Initialize Go module with github.com/distribution/distribution/v3 and github.com/bluesky-social/indigo
  2. Create basic project structure
  3. Set up cmd/appview/main.go that imports distribution and registers middleware

  Phase 2: Core ATProto Integration

  4. Implement DID/handle resolver (pkg/atproto/resolver.go)
     - Resolve handles to DIDs (alice.bsky.social → did:plc:xyz)
     - Discover PDS endpoints from DID documents
  5. Create ATProto client wrapper (pkg/atproto/client.go)
     - Wrap indigo SDK for manifest storage
     - Handle authentication with PDS
  6. Design ATProto lexicon for manifest records (pkg/atproto/lexicon.go)
     - Define schema for storing OCI manifests as ATProto records

  Phase 3: Storage Layer

  7. Implement ATProto manifest store (pkg/atproto/manifest_store.go)
     - Implements distribution.ManifestService
     - Stores/retrieves manifests from PDS
  8. Implement S3 blob store wrapper (pkg/storage/s3_blob_store.go)
     - Wraps distribution's built-in S3 driver
  9. Create routing repository (pkg/storage/routing_repository.go)
     - Returns ATProto store for Manifests()
     - Returns S3 store for Blobs()

  Phase 4: Middleware Layer

  10. Implement repository middleware (pkg/middleware/repository.go)
      - Registers routing repository
      - Configurable via YAML
  11. Implement registry/namespace middleware (pkg/middleware/registry.go)
      - Intercepts Repository(name) calls
      - Performs name resolution before repository creation

  Phase 5: HTTP Layer (if needed)

  12. Create custom HTTP handler (pkg/server/handler.go)
      - Wraps distribution's HTTP handlers
      - Performs early name resolution: atcr.io/alice/myimage → resolve alice
      - Delegates to distribution handlers
  Phase 6: Configuration & Deployment

  13. Create registry configuration (config/config.yml)
  14. Create Dockerfile for building atcr-appview binary
  15. Write README.md with usage instructions

  Phase 7: Documentation

  16. Update CLAUDE.md with:
      - Architecture overview (ATProto for manifests, S3 for blobs)
      - Build/run/test commands
      - How name resolution works
      - Middleware registration flow
      - Key design decisions
  Key Technical Decisions

  ATProto Storage Design:
  - Manifests stored as ATProto records in the user's PDS
  - Each image manifest is a record in an io.atcr.manifest collection
  - Record key = image digest (sha256:...)

  Name Resolution:
  - atcr.io/alice/myimage → resolve alice to DID → discover PDS
  - Support both handles (alice.bsky.social) and DIDs (did:plc:xyz)

  Blob Storage:
  - All layers/blobs in S3 (content-addressable by digest)
  - Manifests reference S3-stored blobs by digest
  - S3 provides cheap, durable blob storage

  Middleware Strategy:
  - Repository middleware for storage routing
  - Registry middleware (or HTTP wrapper) for name resolution
  - No fork of distribution core needed

  To match Docker Hub/ghcr.io/gcr.io, here's what we need:

  Implementation Plan (Drop-in replacement for Docker Hub/ghcr.io)
  Flow 1: Standard Token Auth (Like Docker Hub) - PRIMARY

  # User experience
  docker login atcr.io -u alice.bsky.social -p <atproto-app-password>
  docker push atcr.io/alice/myapp:latest

  # Behind the scenes
  1. docker login stores credentials locally
  2. docker push → Registry returns 401 with WWW-Authenticate: Bearer realm="https://atcr.io/auth/token"...
  3. Docker auto-calls /auth/token with Basic auth (alice.bsky.social:app-password)
  4. Auth service validates against ATProto createSession
  5. Returns JWT token with scope for alice/myapp
  6. Docker uses JWT for manifest/blob uploads
  7. Registry validates JWT signature and scope

  Components:
  - /auth/token endpoint (standalone service or embedded)
  - ATProto session validator (username/password → validate via PDS)
  - JWT issuer/signer
  - JWT validator middleware for registry
  Flow 2: Credential Helper (Like gcr.io) - ADVANCED

  # User experience
  docker-credential-atcr configure
  # Opens browser for ATProto OAuth
  docker push atcr.io/alice/myapp:latest
  # No manual login needed

  # Behind the scenes
  1. Helper does OAuth flow → gets ATProto access token
  2. Caches token securely
  3. When Docker needs credentials, calls helper via stdin/stdout
  4. Helper exchanges ATProto token for registry JWT at /auth/exchange
  5. Returns JWT to Docker
  6. Docker uses JWT for requests

  Components:
  - cmd/credential-helper/main.go - Standalone binary
  - ATProto OAuth client
  - Token exchange endpoint (/auth/exchange)
  - Secure token cache

  Architecture:

  pkg/auth/
  ├── token/
  │   ├── service.go    # HTTP handler for /auth/token
  │   ├── claims.go     # JWT claims structure
  │   ├── issuer.go     # Signs JWTs
  │   └── validator.go  # Validates JWTs (middleware for registry)
  ├── atproto/
  │   ├── session.go    # Validates username/password via ATProto
  │   └── oauth.go      # OAuth flow implementation
  ├── exchange/
  │   └── handler.go    # /auth/exchange endpoint (OAuth → JWT)
  └── scope.go          # Parses/validates Docker scopes

  cmd/
  ├── registry/main.go  # Registry server (existing)
  ├── auth/main.go      # Standalone auth service (optional)
  └── credential-helper/
      └── main.go       # docker-credential-atcr binary

  Config:

  auth:
    token:
      realm: https://atcr.io/auth/token  # Where Docker gets tokens
      service: atcr.io
      issuer: atcr.io
      rootcertbundle: /etc/atcr/token-signing.crt
      privatekey: /etc/atcr/token-signing.pem
      expiration: 300

  atproto:
    # Used by auth service to validate credentials
    pds_endpoint: https://bsky.social
    client_id: atcr-appview
    oauth_redirect: http://localhost:8888/callback
  ATProto OAuth Implementation Plan

  Architecture

  Dependencies:
  - authelia.com/client/oauth2 - OAuth + PAR support
  - github.com/AxisCommunications/go-dpop - DPoP proof generation (handles JWK automatically)
  - github.com/golang-jwt/jwt/v5 - JWT library (transitive via go-dpop)
  - Our existing pkg/atproto/resolver.go - ATProto identity resolution

  Implementation Components

  1. OAuth Client (pkg/auth/oauth/client.go) - ~100 lines

  type Client struct {
      config      *oauth2.Config
      dpopKey     *ecdsa.PrivateKey
      resolver    *atproto.Resolver
      clientID    string // URL to our metadata document
      redirectURI string
      dpopNonce   string // Server-provided nonce
  }

  func NewClient(clientID, redirectURI string) (*Client, error)
  func (c *Client) AuthorizeURL(handle string, scopes []string) (string, error)
  func (c *Client) Exchange(code string) (*Token, error)
  func (c *Client) addDPoPHeader(req *http.Request, method, url string) error

  Flow:
  1. Generate ECDSA P-256 key for DPoP
  2. Discover authorization server from handle/DID
  3. Use authelia's PushedAuth() for PAR with DPoP header
  4. Exchange code for token with DPoP proof

  2. Authorization Server Discovery (pkg/auth/oauth/discovery.go) - ~30 lines

  type AuthServerMetadata struct {
      Issuer                             string   `json:"issuer"`
      AuthorizationEndpoint              string   `json:"authorization_endpoint"`
      TokenEndpoint                      string   `json:"token_endpoint"`
      PushedAuthorizationRequestEndpoint string   `json:"pushed_authorization_request_endpoint"`
      DPoPSigningAlgValuesSupported      []string `json:"dpop_signing_alg_values_supported"`
  }

  func DiscoverAuthServer(pdsEndpoint string) (*AuthServerMetadata, error)

  Implementation:
  - GET {pds}/.well-known/oauth-authorization-server
  - Parse JSON metadata
  - Validate required endpoints exist
  3. Client Metadata Server (pkg/auth/oauth/metadata.go) - ~40 lines

  type ClientMetadata struct {
      ClientID              string   `json:"client_id"`
      RedirectURIs          []string `json:"redirect_uris"`
      GrantTypes            []string `json:"grant_types"`
      ResponseTypes         []string `json:"response_types"`
      Scope                 string   `json:"scope"`
      DPoPBoundAccessTokens bool     `json:"dpop_bound_access_tokens"`
  }

  func ServeMetadata(clientID string, redirectURIs []string) http.Handler

  Serves: https://atcr.io/oauth/client-metadata.json

  4. Token Storage (pkg/auth/oauth/storage.go) - ~50 lines

  type TokenStore struct {
      AccessToken  string
      RefreshToken string
      DPoPKey      *ecdsa.PrivateKey // Persist for refresh
      ExpiresAt    time.Time
  }

  func (s *TokenStore) Save(path string) error
  func LoadTokenStore(path string) (*TokenStore, error)

  Storage location: ~/.atcr/oauth-tokens.json

  5. Credential Helper (cmd/credential-helper/main.go) - ~80 lines

  // Docker credential helper protocol
  // Reads JSON from stdin, writes to stdout

  func main() {
      if len(os.Args) < 2 {
          os.Exit(1)
      }

      switch os.Args[1] {
      case "get":
          handleGet() // Return credentials for registry
      case "store":
          handleStore() // Store credentials
      case "erase":
          handleErase() // Remove credentials
      }
  }

  func handleGet() {
      var request struct {
          ServerURL string `json:"ServerURL"`
      }
      json.NewDecoder(os.Stdin).Decode(&request)

      // Load token from storage
      // Exchange for registry JWT if needed
      // Output: {"Username": "oauth2", "Secret": "<jwt>"}
  }

  6. OAuth Flow (cmd/credential-helper/oauth.go) - ~60 lines

  func RunOAuthFlow(handle string) (*TokenStore, error) {
      // 1. Start local HTTP server on :8888
      // 2. Open browser to authorization URL
      // 3. Wait for callback with code
      // 4. Exchange code for token
      // 5. Save token store
      // 6. Return token
  }

  func startCallbackServer() (chan string, *http.Server)
  Complete Flow Example

  User runs:
  docker-credential-atcr configure

  What happens:

  1. Generate DPoP key (client.go)
     dpopKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)

  2. Resolve handle → DID → PDS (using our resolver)
     did, pds, _ := resolver.ResolveIdentity(ctx, "alice.bsky.social")

  3. Discover auth server (discovery.go)
     metadata, _ := DiscoverAuthServer(pds)
     // Returns: PAR endpoint, token endpoint, etc.

  4. Create PAR request with DPoP (client.go + go-dpop)
     // Generate DPoP proof for PAR endpoint
     claims := &dpop.ProofTokenClaims{
         Method: dpop.POST,
         URL:    metadata.PushedAuthorizationRequestEndpoint,
         RegisteredClaims: &jwt.RegisteredClaims{
             IssuedAt: jwt.NewNumericDate(time.Now()),
         },
     }
     dpopProof, _ := dpop.Create(jwt.SigningMethodES256, claims, dpopKey)

     // Use authelia for PAR
     config := &oauth2.Config{
         ClientID: "https://atcr.io/oauth/client-metadata.json",
         Endpoint: oauth2.Endpoint{
             AuthURL:  metadata.AuthorizationEndpoint,
             TokenURL: metadata.TokenEndpoint,
         },
     }

     // Create custom HTTP client that adds DPoP header
     client := &http.Client{
         Transport: &dpopTransport{
             base:    http.DefaultTransport,
             dpopKey: dpopKey,
         },
     }
     ctx := context.WithValue(context.Background(), oauth2.HTTPClient, client)

     // PAR request (authelia handles this)
     authURL, parResp, _ := config.PushedAuth(ctx, state,
         oauth2.SetAuthURLParam("code_challenge", pkceChallenge),
         oauth2.SetAuthURLParam("code_challenge_method", "S256"),
     )

  5. Open browser, get code (oauth.go)
     exec.Command("open", authURL).Run()
     // User authorizes
     // Callback: http://localhost:8888?code=xyz&state=abc

  6. Exchange code for token with DPoP (client.go + go-dpop)
     // Generate DPoP proof for token endpoint
     claims = &dpop.ProofTokenClaims{
         Method: dpop.POST,
         URL:    metadata.TokenEndpoint,
         RegisteredClaims: &jwt.RegisteredClaims{
             IssuedAt: jwt.NewNumericDate(time.Now()),
         },
     }
     dpopProof, _ = dpop.Create(jwt.SigningMethodES256, claims, dpopKey)

     // Exchange (with DPoP header added by our transport)
     token, _ := config.Exchange(ctx, code,
         oauth2.SetAuthURLParam("code_verifier", pkceVerifier),
     )

  7. Save token + DPoP key (storage.go)
     store := &TokenStore{
         AccessToken:  token.AccessToken,
         RefreshToken: token.RefreshToken,
         DPoPKey:      dpopKey,
         ExpiresAt:    token.Expiry,
     }
     store.Save("~/.atcr/oauth-tokens.json")

  Later, when docker push happens:
  docker push atcr.io/alice/myapp:latest

  1. Docker calls credential helper: docker-credential-atcr get
  2. Helper loads stored token
  3. Helper calls /auth/exchange with OAuth token → gets registry JWT
  4. Returns JWT to Docker
  5. Docker uses JWT for push
  Directory Structure

  pkg/auth/oauth/
  ├── client.go     # OAuth client with DPoP integration
  ├── discovery.go  # Authorization server discovery
  ├── metadata.go   # Client metadata server
  ├── storage.go    # Token persistence
  └── transport.go  # HTTP transport that adds DPoP headers

  cmd/credential-helper/
  ├── main.go       # Docker credential helper protocol
  ├── oauth.go      # OAuth flow (browser, callback)
  └── config.go     # Configuration

  go.mod additions:
  authelia.com/client/oauth2 v0.25.0
  github.com/AxisCommunications/go-dpop v1.1.2

  Unified Model

  Every hold service requires HOLD_OWNER:
  - Owner's PDS has the io.atcr.hold record
  - Owner's PDS has all io.atcr.hold.crew records
  - Authorization is always governed by PDS records

  For a "public" hold (like Tangled's public knot):
  - Owner creates hold with public: true
  - Anyone can push/pull without being crew
  - Owner can add crew records for special privileges/tracking if desired

  Config has an emergency override:
  auth:
    # Emergency freeze: ignore public setting, restrict to crew only
    # Use this to stop abuse without changing PDS records
    freeze: false

  Authorization logic:
  1. Check freeze in config → if true, skip to crew check
  2. Query owner's PDS for io.atcr.hold record
  3. If public: true → allow all operations (unless frozen)
  4. If public: false OR frozen → query io.atcr.hold.crew records, check membership

  Remove from config:
  - allow_all (replaced by public: true in PDS)
  - allowed_dids (replaced by crew records in PDS)

  This way the hold owner at atcr.io can run a public hold at hold1.atcr.io that anyone can use, but can freeze it instantly if needed without touching PDS records.
docs/TESTING.md
# Local Testing Guide

## Quick Start

```bash
./test-local.sh
```

This automated script will:

1. Create storage directories
2. Build all binaries
3. Start both services
4. Show test commands
## Manual Testing Steps

### 1. Setup Directories

```bash
sudo mkdir -p /var/lib/atcr/{blobs,hold,auth}
sudo chown -R $USER:$USER /var/lib/atcr
```

### 2. Build Binaries

```bash
go build -o atcr-appview ./cmd/appview
go build -o atcr-hold ./cmd/hold
go build -o docker-credential-atcr ./cmd/credential-helper
```

### 3. Configure Environment

Create a `.env` file in the project root:

```bash
cp .env.example .env
```

Edit `.env` with your credentials:

```env
# Your ATProto handle
ATPROTO_HANDLE=your-handle.bsky.social

# Hold service public URL (hostname becomes the hold name)
HOLD_PUBLIC_URL=http://127.0.0.1:8080

# Enable OAuth registration on startup
HOLD_AUTO_REGISTER=true
```

**Notes:**
- Use your Bluesky handle (e.g., `alice.bsky.social`)
- For localhost, use `127.0.0.1` instead of `localhost` for OAuth
- The hostname from the URL becomes the hold name (e.g., `127.0.0.1` or `hold1.atcr.io`)

**Load environment:**
```bash
export $(cat .env | xargs)
```
### 4. Start Services

**Terminal 1 - AppView:**
```bash
./atcr-appview serve config/config.yml
```

**Terminal 2 - Hold:**
```bash
./atcr-hold config/hold.yml
```
### 5. Complete OAuth Registration

On first start (with `HOLD_AUTO_REGISTER=true`), the hold service begins an OAuth flow. You'll see output like:
```
================================================================================
OAUTH AUTHORIZATION REQUIRED
================================================================================

Please visit this URL to authorize the hold service:

  https://bsky.social/oauth/authorize?...

Waiting for authorization...
================================================================================
```
**Steps:**
1. Copy the OAuth URL from the logs
2. Open it in your browser
3. Sign in to Bluesky and authorize
4. The callback will complete automatically
5. Hold service registers in your PDS

After successful OAuth, you'll see:
```
✓ Created hold record: at://did:plc:.../io.atcr.hold/127.0.0.1
✓ Created crew record: at://did:plc:.../io.atcr.hold.crew/127.0.0.1-did:plc:...
================================================================================
REGISTRATION COMPLETE
================================================================================
Hold service is now registered and ready to use!
```
This creates two records in your PDS:
- `io.atcr.hold` - Defines the storage endpoint URL
- `io.atcr.hold.crew` - Grants you admin access
### 6. Test Docker Push/Pull

**Test 1: Basic Push**
```bash
# Tag an image
docker tag alpine:latest localhost:5000/alice/alpine:test

# Push to local registry
docker push localhost:5000/alice/alpine:test
```

**Test 2: Pull**
```bash
# Remove local image
docker rmi localhost:5000/alice/alpine:test

# Pull from registry
docker pull localhost:5000/alice/alpine:test
```

**Test 3: Verify Storage**
```bash
# Check manifests were stored in ATProto
# (Check your PDS for io.atcr.manifest records)

# Check blobs were stored locally
ls -lh /var/lib/atcr/blobs/docker/registry/v2/
```
## OAuth Testing (Optional)

### Setup Credential Helper

```bash
# Configure OAuth
./docker-credential-atcr configure

# Follow the browser flow to authorize

# Verify token was saved
ls -la ~/.atcr/oauth-token.json
```

### Configure Docker to Use Helper

Edit `~/.docker/config.json`:
```json
{
  "credHelpers": {
    "localhost:5000": "atcr"
  }
}
```

### Test with OAuth

```bash
# Push should now use OAuth automatically
docker push localhost:5000/alice/myapp:latest
```
## Troubleshooting

### Registry won't start

**Error:** `failed to create storage driver`
```bash
# Check directory permissions
ls -ld /var/lib/atcr/blobs
# Should be owned by your user

# Fix permissions
sudo chown -R $USER:$USER /var/lib/atcr
```

**Error:** `address already in use`
```bash
# Check what's using port 5000
lsof -i :5000

# Kill existing process
kill $(lsof -t -i :5000)
```

### Hold service won't start

**Error:** `failed to create storage driver`
```bash
# Check hold directory
ls -ld /var/lib/atcr/hold
sudo chown -R $USER:$USER /var/lib/atcr/hold
```

**Error:** `address already in use`
```bash
# Check port 8080
lsof -i :8080
kill $(lsof -t -i :8080)
```

### Docker push fails

**Error:** `unauthorized: authentication required`
- Check `ATPROTO_DID` and `ATPROTO_ACCESS_TOKEN` are set
- Verify token is valid (not expired)
- Check registry logs for auth errors

**Error:** `denied: requested access to the resource is denied`
- Check the identity in the image name matches your DID
- Example: if your handle is `alice.bsky.social`, use:
  ```bash
  docker push localhost:5000/alice/myapp:test
  # NOT localhost:5000/bob/myapp:test
  ```

**Error:** `failed to resolve identity`
- Check internet connection (needed to resolve DIDs)
- Verify the handle is correct
- Try using the DID directly instead of the handle

### OAuth issues

**Error:** `Failed to exchange token`
- Ensure the registry is running and accessible
- Check the `/auth/exchange` endpoint is responding
- Verify the OAuth token hasn't expired

**Error:** `Token validation failed`
- Token might be expired
- Run `./docker-credential-atcr configure` again
- Check the PDS is accessible
## Verifying the Flow

### Check Registry is Running
```bash
curl http://localhost:5000/v2/
# Should return: {}
```

### Check Hold is Running
```bash
curl http://localhost:8080/health
# Should return: {"status":"ok"}
```

### Check Auth Endpoint
```bash
curl -v http://localhost:5000/v2/
# Should return 401 with WWW-Authenticate header
```

### Inspect Stored Data

**Manifests (in ATProto):**
- Check your PDS web interface
- Look for `io.atcr.manifest` collection records

**Blobs (local filesystem):**
```bash
# List blobs
find /var/lib/atcr/blobs -type f

# Check blob content (should be binary)
ls -lh /var/lib/atcr/blobs/docker/registry/v2/blobs/sha256/
```
## Clean Up

### Stop Services
```bash
# If using the test script
kill $(cat .atcr-pids)

# Or manually
pkill atcr-appview
pkill atcr-hold
```

### Remove Test Data
```bash
# Remove all stored data
sudo rm -rf /var/lib/atcr/*

# Remove OAuth tokens
rm -rf ~/.atcr/
```

### Reset Docker Config

Edit `~/.docker/config.json` and remove the `"credHelpers"` section.
## Next Steps

Once local testing works:

1. **Deploy to production:**
   - Use S3/Storj for blob storage
   - Deploy registry and hold to separate hosts
   - Configure DNS for `atcr.io`

2. **Enable BYOS:**
   - Users create `io.atcr.hold` records
   - Users deploy their own hold service
   - AppView automatically routes to their storage

3. **Add monitoring:**
   - Registry metrics
   - Hold service metrics
   - Storage usage tracking