feat: add status maintenance workflow (claude code) (#400)

+93

.github/workflows/status-maintenance.yml

+# weekly status maintenance via claude code
+#
+# archives old STATUS.md sections and optionally generates audio overviews.
+# runs every monday or on manual trigger.
+#
+# required secrets:
+#   ANTHROPIC_API_KEY - claude code
+#   GOOGLE_API_KEY - gemini TTS (for audio generation)
+#   PLYR_BOT_TOKEN - plyr.fm developer token (for audio upload)
+
+name: status maintenance
+
+on:
+  # TODO: restore schedule after testing
+  # schedule:
+  #   - cron: "0 9 * * 1" # every monday 9am UTC
+  workflow_dispatch:
+    inputs:
+      skip_audio:
+        description: "skip audio generation"
+        type: boolean
+        default: false
+
+jobs:
+  maintain:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+      pull-requests: write
+      id-token: write
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - uses: astral-sh/setup-uv@v4
+
+      - uses: anthropics/claude-code-action@v1
+        with:
+          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_args: |
+            --allowedTools "Read,Write,Edit,Bash"
+          prompt: |
+            you are maintaining the plyr.fm project status file.
+
+            ## context
+
+            this may be the first time this workflow has run. handle gracefully:
+            - .status_history/ directory may not exist yet
+            - STATUS.md structure may vary
+
+            ## task 1: archive old sections (if needed)
+
+            read STATUS.md and count the lines.
+
+            if over 500 lines:
+              1. identify natural section boundaries (marked by "---" or headers like "### " with dates)
+              2. keep the most recent ~500 lines in STATUS.md
+              3. move older sections to .status_history/YYYY-MM.md files, grouped by month
+              4. create .status_history/ directory if it doesn't exist
+              5. preserve raw content - don't summarize or modify the archived text
+
+            if 500 lines or fewer, skip archiving.
+
+            ## task 2: generate audio overview
+
+            skip_audio input: ${{ inputs.skip_audio }}
+
+            if skip_audio is false:
+              1. write a 2-3 minute podcast script to podcast_script.txt
+                 - two hosts having a casual technical conversation
+                 - focus on shipped features from the top of STATUS.md
+                 - format: "Host: ..." and "Cohost: ..." lines
+
+              2. run: uv run scripts/generate_tts.py podcast_script.txt update.wav
+
+              3. run: uv run --with plyrfm -- plyrfm upload update.wav "plyr.fm update - <today's date>"
+
+            ## task 3: open PR with changes
+
+            if any files changed (.status_history/*, STATUS.md):
+              1. create a new branch: git checkout -b status-maintenance-<date>
+              2. git add .status_history/ STATUS.md (NOT audio files or temp scripts)
+              3. commit with message "chore: weekly status maintenance"
+              4. push the branch: git push -u origin status-maintenance-<date>
+              5. create a PR using: gh pr create --title "chore: weekly status maintenance" --body "automated status archival and audio overview"
+
+            if nothing changed, just report that no maintenance was needed.
+
+        env:
+          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
+          PLYR_TOKEN: ${{ secrets.PLYR_BOT_TOKEN }}
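
note: the workflow leaves the archival step (task 1) to Claude Code through the prompt; no script in this PR implements it. The sketch below is one hypothetical way to express the same rule deterministically, for reasoning about edge cases. `split_for_archive`, the "cut at the first `### ` header at or after the threshold" rule, and the `undated` bucket are all assumptions for illustration, not part of the PR.

```python
import re
from collections import defaultdict

# Section headers in STATUS.md look like "### title (PR #1, Nov 11, 2025)".
HEADER = re.compile(r"^### ")
# First "Mon D" in a header, followed (lazily) by the first 4-digit year.
DATE = re.compile(
    r"\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}.*?(\d{4})"
)
MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def split_for_archive(lines, keep=500):
    """Return (kept_lines, {"YYYY-MM": archived_lines}).

    Hypothetical helper: newest sections are assumed to sit at the top of the
    file, so everything from the first header at or after `keep` is archived.
    """
    if len(lines) <= keep:
        return lines, {}
    cut = next(
        (i for i, line in enumerate(lines) if i >= keep and HEADER.match(line)),
        None,
    )
    if cut is None:
        # No section boundary past the threshold: leave the file untouched.
        return lines, {}
    archived = defaultdict(list)
    month = "undated"
    for line in lines[cut:]:
        if HEADER.match(line):
            found = DATE.search(line)
            if found:
                month = f"{found.group(2)}-{MONTHS[found.group(1)]:02d}"
            else:
                month = "undated"
        # Raw lines are copied as-is; nothing is summarized or rewritten.
        archived[month].append(line)
    return lines[:cut], dict(archived)
```

Each key of the returned mapping would correspond to one `.status_history/YYYY-MM.md` file. Sections whose header carries no date (for example `### detailed history`) fall into the `undated` bucket here, which a real implementation would need to handle explicitly.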

+1 -3

.gitignore

 
 # gemini
 .gemini/
-.gemini-clipboard/
-
-STATUS.md
+.gemini-clipboard/

+1210

STATUS.md

+### ATProto labeler and admin UI improvements (PRs #385-395, Nov 29-Dec 1, 2025)
+
+**motivation**: integrate with ATProto labeling protocol for proper copyright violation signaling, and improve admin tooling for reviewing flagged content.
+
+**what shipped**:
+- **ATProto labeler implementation** (PRs #385, #391):
+  - standalone labeler service integrated into moderation Rust service
+  - implements `com.atproto.label.queryLabels` and `subscribeLabels` XRPC endpoints
+  - k256 ECDSA signing for cryptographic label verification
+  - SQLite storage for labels with sequence numbers
+  - labels emitted when copyright violations detected
+  - negation labels for false positive resolution
+- **admin UI** (PRs #390, #392, #395):
+  - web interface at `/admin` for reviewing copyright flags
+  - htmx for server-rendered interactivity (no inline JS bloat)
+  - static files extracted to `moderation/static/` for proper syntax highlighting
+  - plyr.fm design tokens for brand consistency
+  - shows track title, artist handle, match scores, and potential matches
+  - "mark false positive" button emits negation label
+- **label context enrichment** (PR #392):
+  - labels now include track_title, artist_handle, artist_did, highest_score, matches
+  - backfill script (`scripts/backfill_label_context.py`) populated 25 existing flags
+  - admin UI displays rich context instead of just ATProto URIs
+- **copyright flag visibility** (PRs #387, #389):
+  - artist portal shows copyright flag indicator on flagged tracks
+  - tooltip shows primary match (artist - title) for quick context
+- **documentation** (PR #386):
+  - comprehensive docs at `docs/moderation/atproto-labeler.md`
+  - covers architecture, label schema, XRPC protocol, signing keys
+
+**admin UI architecture**:
+- `moderation/static/admin.html` - page structure
+- `moderation/static/admin.css` - plyr.fm design tokens
+- `moderation/static/admin.js` - auth handling (~40 lines)
+- htmx endpoints: `/admin/flags-html`, `/admin/resolve-htmx`
+- server-rendered HTML partials for flag cards
+
+---
+
+### copyright moderation system (PRs #382, #384, Nov 29-30, 2025)
+
+**motivation**: detect potential copyright violations in uploaded tracks to avoid DMCA issues and protect the platform.
+
+**what shipped**:
+- **moderation service** (Rust/Axum on Fly.io):
+  - standalone service at `plyr-moderation.fly.dev`
+  - integrates with AuDD enterprise API for audio fingerprinting
+  - scans audio URLs and returns matches with metadata (artist, title, album, ISRC, timecode)
+  - auth via `X-Moderation-Key` header
+- **backend integration** (PR #382):
+  - `ModerationSettings` in config (service URL, auth token, timeout)
+  - moderation client module (`backend/_internal/moderation.py`)
+  - fire-and-forget background task on track upload
+  - stores results in `copyright_scans` table
+  - scan errors stored as "clear" so tracks aren't stuck unscanned
+- **flagging fix** (PR #384):
+  - AuDD enterprise API returns no confidence scores (all 0)
+  - changed from score threshold to presence-based flagging: `is_flagged = !matches.is_empty()`
+  - removed unused `score_threshold` config
+- **backfill script** (`scripts/scan_tracks_copyright.py`):
+  - scans existing tracks that haven't been checked
+  - `--max-duration` flag to skip long DJ sets (estimated from file size)
+  - `--dry-run` mode to preview what would be scanned
+  - supports dev/staging/prod environments
+- **review workflow**:
+  - `copyright_scans` table has `resolution`, `reviewed_at`, `reviewed_by`, `review_notes` columns
+  - resolution values: `violation`, `false_positive`, `original_artist`
+  - SQL queries for dashboard: flagged tracks, unreviewed flags, violations list
+
+**initial review results** (25 flagged tracks):
+- 8 violations (actual copyright issues)
+- 11 false positives (fingerprint noise)
+- 6 original artists (people uploading their own distributed music)
+
+**key SQL queries**:
+```sql
+-- unreviewed flags
+SELECT cs.track_id, t.title, a.handle, jsonb_array_length(cs.matches) as match_count,
+       cs.matches->0->>'artist' as top_match_artist, cs.matches->0->>'title' as top_match_title
+FROM copyright_scans cs
+JOIN tracks t ON cs.track_id = t.id
+JOIN artists a ON t.artist_did = a.did
+WHERE cs.is_flagged = true AND cs.resolution IS NULL
+ORDER BY match_count DESC;
+
+-- all violations
+SELECT cs.track_id, t.title, a.handle, cs.matches->0->>'artist' as matched_artist,
+       cs.matches->0->>'title' as matched_title, cs.review_notes
+FROM copyright_scans cs
+JOIN tracks t ON cs.track_id = t.id
+JOIN artists a ON t.artist_did = a.did
+WHERE cs.resolution = 'violation'
+ORDER BY cs.reviewed_at DESC;
+```
+
+**impact**:
+- automated copyright detection on upload
+- manual review workflow for flagged content
+- protection against DMCA takedown requests
+- clear audit trail with resolution status
+
+---
+
+### platform stats and media session integration (PRs #359-379, Nov 27-29, 2025)
+
+**motivation**: show platform activity at a glance, improve playback experience across devices, and give users control over their data.
+
+**what shipped**:
+- **platform stats endpoint and UI** (PRs #376, #378, #379):
+  - `GET /stats` returns total plays, tracks, and artists
+  - stats bar displays in homepage header (e.g., "1,691 plays • 55 tracks • 8 artists")
+  - skeleton loading animation while fetching
+  - responsive layout: visible in header on wide screens, collapses to menu on narrow
+  - end-of-list animation on homepage
+- **Media Session API** (PR #371):
+  - provides track metadata to CarPlay, lock screens, Bluetooth devices, macOS control center
+  - artwork display with fallback to artist avatar
+  - play/pause, prev/next, seek controls all work from system UI
+  - position state syncs scrubbers on external interfaces
+- **browser tab title** (PR #374):
+  - shows "track - artist • plyr.fm" while playing
+  - persists across page navigation
+  - reverts to page title when playback stops
+- **timed comments** (PR #359):
+  - comments capture timestamp when added during playback
+  - clickable timestamp buttons seek to that moment
+  - compact scrollable comments section on track pages
+- **constellation integration** (PR #360):
+  - queries constellation.microcosm.blue backlink index
+  - enables network-wide like counts (not just plyr.fm internal)
+  - environment-aware namespace handling
+- **account deletion** (PR #363):
+  - explicit confirmation flow (type handle to confirm)
+  - deletes all plyr.fm data (tracks, albums, likes, comments, preferences)
+  - optional ATProto record cleanup with clear warnings about orphaned references
+
+**impact**:
+- platform stats give visitors immediate sense of activity
+- media session makes plyr.fm tracks controllable from car/lock screen/control center
+- timed comments enable discussion at specific moments in tracks
+- account deletion gives users full control over their data
+
+---
+
+### developer tokens with independent OAuth grants (PR #367, Nov 28, 2025)
+
+**motivation**: programmatic API access (scripts, CLIs, automation) needed tokens that survive browser logout and don't become stale when browser sessions refresh.
+
+**what shipped**:
+- **OAuth-based dev tokens**: each developer token gets its own OAuth authorization flow
+  - user clicks "create token" → redirected to PDS for authorization → token created with independent credentials
+  - tokens have their own DPoP keypair, access/refresh tokens - completely separate from browser session
+- **cookie isolation**: dev token exchange doesn't set browser cookie
+  - added `is_dev_token` flag to ExchangeToken model
+  - /auth/exchange skips Set-Cookie for dev token flows
+  - prevents logout from deleting dev tokens (critical bug fixed during implementation)
+- **token management UI**: portal → "your data" → "developer tokens"
+  - create with optional name and expiration (30/90/180/365 days or never)
+  - list active tokens with creation/expiration dates
+  - revoke individual tokens
+- **API endpoints**:
+  - `POST /auth/developer-token/start` - initiates OAuth flow, returns auth_url
+  - `GET /auth/developer-tokens` - list user's tokens
+  - `DELETE /auth/developer-tokens/{prefix}` - revoke by 8-char prefix
+
+**CLI usage** (`scripts/plyr.py`):
+```bash
+# set token in environment
+export PLYR_TOKEN="your_token_here"
+
+# list tracks
+PLYR_API_URL=https://api.plyr.fm uv run scripts/plyr.py list
+
+# upload a track
+PLYR_API_URL=https://api.plyr.fm uv run scripts/plyr.py upload track.mp3 "My Track"
+
+# delete a track
+PLYR_API_URL=https://api.plyr.fm uv run scripts/plyr.py delete 42 -y
+```
+
+**security properties**:
+- tokens are full sessions with encrypted OAuth credentials (Fernet)
+- each token refreshes independently (no staleness from browser session refresh)
+- revokable individually without affecting browser or other tokens
+- explicit OAuth consent required at PDS for each token created
+
+**testing verified**:
+- created token → uploaded track → logged out → deleted track with token ✓
+- browser logout doesn't affect dev tokens ✓
+- token works across browser sessions ✓
+- staging deployment tested end-to-end ✓
+
+**documentation**: see `docs/authentication.md` "developer tokens" section
+
+---
+
+### oEmbed endpoint for Leaflet.pub embeds (PRs #355-358, Nov 25, 2025)
+
+**motivation**: plyr.fm tracks embedded in Leaflet.pub (via iframely) showed a black HTML5 audio box instead of our custom embed player.
+
+**what shipped**:
+- **oEmbed endpoint** (PR #355): `/oembed` returns proper embed HTML with iframe
+  - follows oEmbed spec with `type: "rich"` and iframe in `html` field
+  - discovery link in track page `<head>` for automatic detection
+- **iframely domain registration**: registered plyr.fm on iframely.com (free tier)
+  - this was the key fix - iframely now returns our embed iframe as `links.player[0]`
+  - API key: stored in 1password (iframely account)
+
+**debugging journey** (PRs #356-358):
+- initially tried `og:video` meta tags to hint iframe embed - didn't work
+- tried removing `og:audio` to force oEmbed fallback - resulted in no player link
+- discovered iframely requires domain registration to trust oEmbed providers
+- after registration, iframely correctly returns embed iframe URL
+
+**current state**:
+- oEmbed endpoint working: `curl https://api.plyr.fm/oembed?url=https://plyr.fm/track/92`
+- iframely returns `links.player[0].href = "https://plyr.fm/embed/track/92"` (our embed)
+- Leaflet.pub should show proper embeds (pending their cache expiry)
+
+**impact**:
+- plyr.fm tracks can be embedded in Leaflet.pub and other iframely-powered services
+- proper embed player with cover art instead of raw HTML5 audio
+
+---
+
+### export & upload reliability (PRs #337-344, Nov 24, 2025)
+
+**motivation**: exports were failing silently on large files (OOM), uploads showed incorrect progress, and SSE connections triggered false error toasts.
+
+**what shipped**:
+- **database-backed jobs** (PR #337): moved upload/export tracking from in-memory to postgres
+  - jobs table persists state across server restarts
+  - enables reliable progress tracking via SSE polling
+- **streaming exports** (PR #343): fixed OOM on large file exports
+  - previously loaded entire files into memory via `response["Body"].read()`
+  - now streams to temp files, adds to zip from disk (constant memory)
+  - 90-minute WAV files now export successfully on 1GB VM
+- **progress tracking fix** (PR #340): upload progress was receiving bytes but treating as percentage
+  - `UploadProgressTracker` now properly converts bytes to percentage
+  - upload progress bar works correctly again
+- **UX improvements** (PRs #338-339, #341-342, #344):
+  - export filename now includes date (`plyr-tracks-2025-11-24.zip`)
+  - toast notification on track deletion
+  - fixed false "lost connection" error when SSE completes normally
+  - progress now shows "downloading track X of Y" instead of confusing count
+
+**impact**:
+- exports work for arbitrarily large files (limited by disk, not RAM)
+- upload progress displays correctly
+- job state survives server restarts
+- clearer progress messaging during exports
+
+---
+
+**what shipped**:
+- **removed hardcoded namespaces** (PR #263):
+  - replaced hardcoded `"fm.plyr.like"` strings in `src/backend/_internal/atproto/records.py`
+  - added `like_collection` computed field to config (mirrors existing `track_collection`)
+  - fixed OAuth scope generation to use computed fields instead of hardcoded strings
+  - updated `scripts/backfill_atproto_records.py` to use settings (was using hardcoded namespace)
+- **environment-specific namespaces**:
+  - development: `fm.plyr.dev` (local .env)
+  - staging: `fm.plyr.stg` (flyctl secrets)
+  - production: `fm.plyr` (flyctl secrets)
+- **data migration**:
+  - migrated 7 tracks + 5 likes from `fm.plyr.*` to `fm.plyr.dev.*` in development
+  - migrated 7 tracks + 5 likes from `fm.plyr.*` to `fm.plyr.stg.*` in staging
+  - used combination of automated script + manual cleanup with neon MCP and pdsx
+  - cleaned up old staging records from production namespace
+- **documentation** (PR #264):
+  - updated `docs/deployment/environments.md` with namespace configuration
+  - updated `docs/backend/configuration.md` with environment-specific examples
+  - removed typer from project dependencies (moved to PEP 723 inline script deps)
+  - created `sandbox/stg-namespace-migration/README.md` documenting migration process
+
+**impact**:
+- ✅ test tracks/likes no longer pollute production collections
+- ✅ OAuth scopes environment-specific and automatically generated from config
+- ✅ database and ATProto records stay aligned within each environment
+- ✅ proper data separation for dev/staging/production environments
+- ✅ eliminated hardcoded namespace strings throughout codebase
+
+**lessons learned**:
+- PEP 723 inline script dependencies work well for ad-hoc migration scripts
+- database as source of truth more reliable than PDS for stale record lookups
+- manual cleanup sometimes faster than debugging complex migration logic
+
+**follow-up cleanup** (Nov 18, 2025):
+- discovered 82 orphaned test/dev records remaining in production `fm.plyr.track` namespace
+- created analysis script (`scripts/identify_orphaned_records.py`) to cross-reference PDS records against production database
+- verified all 13 production tracks safe (including critical tracks: webhook with features, dinah, lil blues improv)
+- automated deletion via generated script with proper PDS authentication
+- result: 95 → 13 records in production namespace, all production data intact
+- filed upstream issue ([pdsx#43](https://github.com/zzstoatzz/pdsx/issues/43)) for batch/concurrent CRUD operations
+
+### mobile UI polish (PRs #259-261, #265, #268, Nov 17, 2025)
+
+**serialization improvements** (PRs #259-260):
+- created `TrackResponse` Pydantic model for consistent track serialization
+- fixed album endpoint to properly serialize tracks (was mixing dict/model types)
+- eliminated manual dict construction in favor of model-based serialization
+- better type safety and consistency across endpoints
+
+**notifications fix** (PR #261):
+- notification bot was using hardcoded `https://plyr.fm` URL
+- now uses environment-aware `settings.frontend.url` (staging uses `https://stg.plyr.fm`)
+- ensures notifications link to correct environment
+
+**sticky player padding** (PRs #265, #268):
+- fixed album tracks overlapping with sticky bottom player on mobile (#265)
+- attempted centralized padding approach (#266) but created excessive whitespace on mobile
+- reverted to per-page padding handling (#268) while keeping album track clearance fix
+- mobile padding now matches pre-centralization behavior
+
+**impact**:
+- ✅ consistent track serialization across all endpoints
+- ✅ notifications link to correct environment
+- ✅ album tracks properly clear sticky player on mobile
+- ✅ mobile padding back to appropriate levels (no excessive whitespace)
+
+### secure browser authentication (issue #237, PRs #239-244, Nov 14-15, 2025)
+
+**motivation**: session tokens stored in localStorage were vulnerable to XSS attacks. any malicious script could read `session_id` from localStorage and hijack accounts for the full 14-day session lifetime.
+
+**what shipped**:
+- **HttpOnly cookies** (PR #244): backend sets `Set-Cookie: session_id=...; HttpOnly; Secure; SameSite=Lax`
+  - HttpOnly prevents JavaScript access (XSS protection)
+  - Secure requires HTTPS (except localhost for dev)
+  - SameSite=Lax prevents CSRF while allowing same-site requests
+  - cookies automatically sent with requests (no manual auth header management)
+- **cookie-aware auth dependencies** (PR #243):
+  - `require_auth` checks cookies first, falls back to Authorization header
+  - `require_artist_profile` updated with same pattern
+  - optional auth endpoints (tracks list, track detail, album detail) now support cookies
+  - proper parameter aliasing (`Cookie(alias="session_id")`) for FastAPI
+- **environment-aware cookie configuration**:
+  - localhost: `secure=False` for HTTP development
+  - staging/production: `secure=True` for HTTPS
+  - no explicit domain set (prevents cross-environment session leakage)
+- **same-site detection**:
+  - compares origin host vs request host
+  - uses `SameSite=lax` when same-site (localhost→localhost, stg.plyr.fm→api-stg.plyr.fm)
+  - prevents cookies from being sent cross-site
+- **frontend cleanup** (PR #239):
+  - removed all localStorage session_id read/write operations
+  - removed `getSessionId()`, `setSessionId()`, `getAuthHeaders()` helpers
+  - all fetch calls use `credentials: 'include'` to send cookies
+  - `XMLHttpRequest` uses `withCredentials: true`
+  - auth state now managed entirely by backend via HttpOnly cookies
+
+**environment architecture**:
+- all environments use custom domains on same eTLD+1 for cookie sharing:
+  - **staging**: `stg.plyr.fm` → `api-stg.plyr.fm` (both `.plyr.fm`)
+  - **production**: `plyr.fm` → `api.plyr.fm` (both `.plyr.fm`)
+  - **local**: `localhost:5173` → `localhost:8001` (both `localhost`)
+- separate cloudflare pages projects prevent staging/production cookie conflicts:
+  - `plyr-fm-stg` for staging (tracks `main` branch)
+  - `plyr-fm` for production (tracks `production-fe` branch)
+
+**security improvements**:
+- ✅ eliminated XSS session hijacking vector
+- ✅ tokens no longer accessible to JavaScript
+- ✅ CSRF protection via SameSite=Lax
+- ✅ secure transport enforcement (HTTPS in production)
+- ✅ environment isolation (no cookie sharing between staging/prod)
+
+**compatibility maintained**:
+- browser clients: use HttpOnly cookies automatically
+- SDK/CLI clients: use `Authorization: Bearer <token>` header with developer tokens
+- backend accepts both cookie and header auth (cookie preferred for browsers)
+
+**documentation created**:
+- `docs/backend/atproto-identity.md`: ATProto OAuth client metadata discovery patterns
+- `docs/deployment/environments.md`: updated with staging/production cookie architecture
+- PR #243 description: comprehensive explanation of cookie domain behavior
+
+**impact**:
+- closed high-priority security issue #237
+- production-grade auth implementation
+- foundation for future session management features (device tracking, forced logout)
+- eliminated most common web application security vulnerability
+
+### albums feature (PRs #214-222, Nov 13-14, 2025)
+
+**motivation**: users wanted to group tracks into albums with dedicated pages, cover art, and metadata.
+
+**what shipped**:
+- **database schema** (PR #222): new `albums` table with title, slug, description, image_id, artist_did
+  - album-track relationship via `album_rel` on tracks table
+  - migration to backfill albums from existing track `extra->>'album'` metadata
+  - 8 albums created from existing 32 tracks in production
+- **backend CRUD** (PR #222): full album management endpoints
+  - `GET /albums/{handle}` - list artist's albums
+  - `GET /albums/{handle}/{slug}` - album detail with tracks
+  - `POST /albums` - create album (authenticated)
+  - `PATCH /albums/{id}` - update album metadata
+  - album cover art upload and storage in R2
+- **frontend pages** (PRs #214, #216-220):
+  - album detail pages (`/u/{handle}/album/{slug}`) with track lists
+  - artist discography sections on artist pages
+  - album cover art display throughout UI
+  - server-side rendering for SEO and link previews
+- **UI polish** (PR #228): long album title handling
+  - 100-character slug limit with word-boundary truncation
+  - CSS text truncation for inline album links
+  - proper wrapping for album detail page titles
+  - tested with 91-character production album title
+- **link previews** (PRs #230-231):
+  - rich Open Graph metadata for albums (music.album type)
+  - artist musician property, image dimensions, canonical URLs
+  - fixed layout metadata conflicts (prevented generic tags from overriding page-specific ones)
+
+**what's NOT done** (issue #221 still open):
+- ATProto records for albums (consciously deferred)
+  - reason: want to thoughtfully design the lexicon before committing to a schema
+  - tracks work fine without album ATProto records for now
+
+**impact**:
+- albums now first-class citizens in UI and database
+- better content organization for artists with multiple releases
+- improved SEO with album-specific link previews
+- foundation for future features (album likes, album playlists)
+
+### frontend architecture improvements (PRs #210, #227, Nov 13-14, 2025)
+
+**motivation**: eliminate "flash of loading", improve SEO, reduce code duplication, fix performance bottlenecks.
+
+**PR #210 - centralized auth and client-side load functions**:
+- created `lib/auth.svelte.ts` - centralized auth manager with SSR-safe guards
+- added `+layout.ts` - loads auth state once for entire app
+- added `+page.ts` to liked tracks page - loads data before component mounts
+- refactored all pages to use centralized auth (eliminated scattered localStorage calls)
+- code reduction: +256 lines, -308 lines (net -52 lines)
+
+**PR #227 - artist pages moved to server-side rendering**:
+- replaced client-side `onMount` fetches with `+page.server.ts`
+- parallel server loading of artist info, tracks, and albums
+- data ready before page renders (eliminates loading states)
+- performance: ~1.66s sequential waterfall → instant render
+
+**pattern shift**:
+```
+old: page loads → onMount → fetch artist → fetch tracks → fetch albums → render
+new: server fetches all in parallel → page renders immediately with data
+```
+
+**impact**:
+- eliminated "flash of loading" across artist and album pages
+- improved lighthouse scores and SEO (real data in initial HTML)
+- consistent auth patterns throughout app
+- better UX - pages feel instant instead of progressive
+
+**documentation**: see `docs/frontend/data-loading.md` for patterns and anti-patterns
+
+### link preview system (PRs #230-231, Nov 14, 2025)
+
+**problem**: album pages and homepage had no Open Graph metadata, leading to poor link previews on social media.
+
+**PR #230 - add rich metadata**:
+- homepage: complete OG tags (type, title, description, url, site_name)
+- album pages: rich music.album metadata matching track page quality
+  - added canonical URL, site name, musician property
+  - added image dimensions (1200x1200), alt text, secure_url
+  - improved meta description
+
+**PR #231 - fix metadata conflicts**:
+- root layout was rendering duplicate OG tags on all pages
+- social scrapers use first tags encountered (generic layout ones)
+- page-specific metadata was being ignored
+- solution: exclude pages with their own metadata from layout defaults
+  - homepage (`/`)
+  - track pages (`/track/*`)
+  - album pages (`/u/*/album/*`)
+
+**result**: album links now show rich previews with cover art, artist info, track counts when shared on social platforms.
+
+### Banana mix incident fixes (PR #191, Nov 13, 2025)
+
+**Why:** stellz uploaded "banana mix" twice due to slow UI feedback, creating duplicate tracks (56 and 57)
+pointing to the same R2 file. When track 57 was deleted, it removed the shared R2 file, breaking track 56
+with 404 errors. ATProto record for track 57 was orphaned on her PDS. Investigation also revealed storage
+layer was guessing file extensions by trying all formats until finding a match.
+
+**What shipped:**
+- **duplicate detection** (tracks.py:181-203): after saving file, checks if track with same `file_id`
+  and `artist_did` exists. rejects upload with error instead of creating duplicate.
+- **refcount-based deletion** (r2.py:175-197): before deleting R2 file, queries database for refcount.
+  only deletes if `refcount == 1`. logs when deletion skipped due to `refcount > 1`.
+- **exact key deletion** (r2.py:163-233, filesystem.py:85-123): updated `delete()` signature to accept
+  optional `file_type` parameter. when provided, deletes exact key `audio/{file_id}.{file_type}` instead
+  of looping through all formats. fallback to loop only when `file_type` is None (legacy rows, images).
+  - upload cleanup passes `audio_format.value`
+  - track deletion passes `track.file_type`
+  - image deletion still uses fallback (no `image_format` field yet - tech debt)
+- **ATProto cleanup** (tracks.py:683-712): deletes PDS record when track deleted. handles 404 gracefully
+  (record already gone), bubbles other errors.
+
+**Impact:** prevents "delete duplicate and nuke original" scenario. logs show exact keys being deleted
+instead of trying wrong extensions first. manual e2e test confirmed: uploaded .wav file, verified exact
+key deletion via R2 API, confirmed clean deletion with no orphans in DB/PDS/R2.
+
+**Tech debt identified:**
+- storage layer has accumulated naive patterns that work but aren't elegant:
+  - image deletion still loops through formats (no `image_format` column on tracks)
+    - could store image format alongside `image_id` to enable exact deletion
+    - or maintain separate image metadata table
+  - functional for now, but should clean up later
+
+### detailed history
+
+### Queue hydration + ATProto token hardening (Nov 12, 2025)
+
+**Why:** queue endpoints were occasionally taking 2s+ and restore operations could 401
+when multiple requests refreshed an expired ATProto token simultaneously.
+
+**What shipped:**
+- Added persistent `image_url` on `Track` rows so queue hydration no longer probes R2
+  for every track. Queue payloads now pull art directly from Postgres, with a one-time
+  fallback for legacy rows.
+- Updated `_internal/queue.py` to backfill any missing URLs once (with caching) instead
+  of per-request GETs.
522 + - Introduced per-session locks in `_refresh_session_tokens` so only one coroutine hits 523 + `oauth_client.refresh_session` at a time; others reuse the refreshed tokens. This 524 + removes the race that caused the batch restore flow to intermittently 500/401. 525 + 526 + **Impact:** queue tail latency dropped back under 500 ms in staging tests, ATProto restore flows are now reliable under concurrent use, and Logfire no longer shows 500s 527 + from the PDS. 528 + 529 + ### Liked tracks feature (PR #157, Nov 11, 2025) 530 + 531 + - ✅ server-side persistent collections 532 + - ✅ ATProto record publication for cross-platform visibility 533 + - ✅ UI for adding/removing tracks from liked collection 534 + - ✅ like counts displayed in track responses and analytics (#170) 535 + - ✅ analytics cards now clickable links to track detail pages (#171) 536 + - ✅ liked state shown on artist page tracks (#163) 537 + 538 + ### Upload streaming + progress UX (PR #182, Nov 11, 2025) 539 + 540 + - Frontend switched from `fetch` to `XMLHttpRequest` so we can display upload progress 541 + toasts (critical for >50 MB mixes on mobile). 542 + - Upload form now clears only after the request succeeds; failed attempts leave the 543 + form intact so users don't lose metadata. 544 + - Backend writes uploads/images to temp files in 8 MB chunks before handing them to the 545 + storage layer, eliminating whole-file buffering and iOS crashes for hour-long mixes. 546 + - Deployment verified locally and by rerunning the exact repro Stella hit (85 minute 547 + mix from mobile). 548 + 549 + ### transcoder API deployment (PR #156, Nov 11, 2025) 550 + 551 + **standalone Rust transcoding service** 🎉 552 + - **deployed**: https://plyr-transcoder.fly.dev/ 553 + - **purpose**: convert AIFF/FLAC/etc. 
to MP3 for browser compatibility 554 + - **technology**: Axum + ffmpeg + Docker 555 + - **security**: `X-Transcoder-Key` header authentication (shared secret) 556 + - **capacity**: handles 1GB uploads, tested with 85-minute AIFF files (~858MB → 195MB MP3 in 32 seconds) 557 + - **architecture**: 558 + - 2 Fly machines for high availability 559 + - auto-stop/start for cost efficiency 560 + - stateless design (no R2 integration yet) 561 + - 320kbps MP3 output with proper ID3 tags 562 + - **status**: deployed and tested, ready for integration into plyr.fm upload pipeline 563 + - **next steps**: wire into backend with R2 integration and job queue (see issue #153) 564 + 565 + ### AIFF/AIF browser compatibility fix (PR #152, Nov 11, 2025) 566 + 567 + **format validation improvements** 568 + - **problem discovered**: AIFF/AIF files only work in Safari, not Chrome/Firefox 569 + - browsers throw `MediaError code 4: MEDIA_ERR_SRC_NOT_SUPPORTED` 570 + - users could upload files but they wouldn't play in most browsers 571 + - **immediate solution**: reject AIFF/AIF uploads at both backend and frontend 572 + - removed AIFF/AIF from AudioFormat enum 573 + - added format hints to upload UI: "supported: mp3, wav, m4a" 574 + - client-side validation with helpful error messages 575 + - **long-term solution**: deployed standalone transcoder service (see above) 576 + - separate Rust/Axum service with ffmpeg 577 + - accepts all formats, converts to browser-compatible MP3 578 + - integration into upload pipeline pending (issue #153) 579 + 580 + **observability improvements**: 581 + - added logfire instrumentation to upload background tasks 582 + - added logfire spans to R2 storage operations 583 + - documented logfire querying patterns in `docs/logfire-querying.md` 584 + 585 + ### async I/O performance fixes (PRs #149-151, Nov 10-11, 2025) 586 + 587 + Eliminated event loop blocking across backend with three critical PRs: 588 + 589 + 1. 
**PR #149: async R2 reads** - converted R2 `head_object` operations from sync boto3 to async aioboto3 590 + - portal page load time: 2+ seconds → ~200ms 591 + - root cause: `track.image_url` was blocking on serial R2 HEAD requests 592 + 593 + 2. **PR #150: concurrent PDS resolution** - parallelized ATProto PDS URL lookups 594 + - homepage load time: 2-6 seconds → 200-400ms 595 + - root cause: serial `resolve_atproto_data()` calls (8 artists × 200-300ms each) 596 + - fix: `asyncio.gather()` for batch resolution, database caching for subsequent loads 597 + 598 + 3. **PR #151: async storage writes/deletes** - made save/delete operations non-blocking 599 + - R2: switched to `aioboto3` for uploads/deletes (async S3 operations) 600 + - filesystem: used `anyio.Path` and `anyio.open_file()` for chunked async I/O (64KB chunks) 601 + - impact: multi-MB uploads no longer monopolize worker thread, constant memory usage 602 + 603 + ### cover art support (PRs #123-126, #132-139) 604 + - ✅ track cover image upload and storage (separate R2 bucket) 605 + - ✅ image display on track pages and player 606 + - ✅ Open Graph meta tags for track sharing 607 + - ✅ mobile-optimized layouts with cover art 608 + - ✅ sticky bottom player on mobile with cover 609 + 610 + ### track detail pages (PR #164, Nov 12, 2025) 611 + 612 + - ✅ dedicated track detail pages with large cover art 613 + - ✅ play button updates queue state correctly (#169) 614 + - ✅ liked state loaded efficiently via server-side fetch 615 + - ✅ mobile-optimized layouts with proper scrolling constraints 616 + - ✅ origin validation for image URLs (#168) 617 + 618 + ### mobile UI improvements (PRs #159-185, Nov 11-12, 2025) 619 + 620 + - ✅ compact action menus and better navigation (#161) 621 + - ✅ improved mobile responsiveness (#159) 622 + - ✅ consistent button layouts across mobile/desktop (#176-181, #185) 623 + - ✅ always show play count and like count on mobile (#177) 624 + - ✅ login page UX improvements (#174-175) 625 + - ✅ 
liked page UX improvements (#173) 626 + - ✅ accent color for liked tracks (#160) 627 + 628 + ### queue management improvements (PRs #110-113, #115) 629 + - ✅ visual feedback on queue add/remove 630 + - ✅ toast notifications for queue actions 631 + - ✅ better error handling for queue operations 632 + - ✅ improved shuffle and auto-advance UX 633 + 634 + ### infrastructure and tooling 635 + - ✅ R2 bucket separation: audio-prod and images-prod (PR #124) 636 + - ✅ admin script for content moderation (`scripts/delete_track.py`) 637 + - ✅ bluesky attribution link in header 638 + - ✅ changelog target added (#183) 639 + - ✅ documentation updates (#158) 640 + - ✅ track metadata edits now persist correctly (#162) 641 + 642 + ## immediate priorities 643 + 644 + ### high priority features 645 + 1. **audio transcoding pipeline integration** (issue #153) 646 + - ✅ standalone transcoder service deployed at https://plyr-transcoder.fly.dev/ 647 + - ✅ Rust/Axum service with ffmpeg, tested with 85-minute files 648 + - ✅ secure auth via X-Transcoder-Key header 649 + - ⏳ next: integrate into plyr.fm upload pipeline 650 + - backend calls transcoder API for unsupported formats 651 + - queue-based job system for async processing 652 + - R2 integration (fetch original, store MP3) 653 + - maintain original file hash for deduplication 654 + - handle transcoding failures gracefully 655 + 656 + ### critical bugs 657 + 1. **upload reliability** (issue #147): upload returns 200 but file missing from R2, no error logged 658 + - priority: high (data loss risk) 659 + - need better error handling and retry logic in background upload task 660 + 661 + 2. **database connection pool SSL errors**: intermittent failures on first request 662 + - symptom: `/tracks/` returns 500 on first request, succeeds after 663 + - fix: set `pool_pre_ping=True`, adjust `pool_recycle` for Neon timeouts 664 + - documented in `docs/logfire-querying.md` 665 + 666 + ### performance optimizations 667 + 3. 
**persist concrete file extensions in database**: currently brute-force probing all supported formats on read 668 + - already know `Track.file_type` and image format during upload 669 + - eliminating repeated `exists()` checks reduces filesystem/R2 HEAD spam 670 + - improves audio streaming latency (`/audio/{file_id}` endpoint walks extensions sequentially) 671 + 672 + 4. **stream large uploads directly to storage**: current implementation reads entire file into memory before background task 673 + - multi-GB uploads risk OOM 674 + - stream from `UploadFile.file` → storage backend for constant memory usage 675 + 676 + ### new features 677 + 5. **content-addressable storage** (issue #146) 678 + - hash-based file storage for automatic deduplication 679 + - reduces storage costs when multiple artists upload same file 680 + - enables content verification 681 + 682 + 6. **liked tracks feature** (issue #144): design schema and ATProto record format 683 + - server-side persistent collections 684 + - ATProto record publication for cross-platform visibility 685 + - UI for adding/removing tracks from liked collection 686 + 687 + ## open issues by timeline 688 + 689 + ### immediate 690 + - issue #153: audio transcoding pipeline (ffmpeg worker for AIFF/FLAC→MP3) 691 + - issue #147: upload reliability bug (data loss risk) 692 + - issue #144: likes feature for personal collections 693 + 694 + ### short-term 695 + - issue #146: content-addressable storage (hash-based deduplication) 696 + - issue #24: implement play count abuse prevention 697 + - database connection pool tuning (SSL errors) 698 + - file extension persistence in database 699 + 700 + ### medium-term 701 + - issue #39: postmortem - cross-domain auth deployment and remaining security TODOs 702 + - issue #46: consider removing init_db() from lifespan in favor of migration-only approach 703 + - issue #56: design public developer API and versioning 704 + - issue #57: support multiple audio item types (voice 
memos/snippets) 705 + - issue #122: fullscreen player for immersive playback 706 + 707 + ### long-term 708 + - migrate to plyr-owned lexicon (custom ATProto namespace with richer metadata) 709 + - publish to multiple ATProto AppViews for cross-platform visibility 710 + - explore ATProto-native notifications (replace Bluesky DM bot) 711 + - realtime queue syncing across devices via SSE/WebSocket 712 + - artist analytics dashboard improvements 713 + - issue #44: modern music streaming feature parity 714 + 715 + ## technical state 716 + 717 + ### architecture 718 + 719 + **backend** 720 + - language: Python 3.11+ 721 + - framework: FastAPI with uvicorn 722 + - database: Neon PostgreSQL (serverless, fully managed) 723 + - storage: Cloudflare R2 (S3-compatible object storage) 724 + - hosting: Fly.io (2x shared-cpu VMs, auto-scaling) 725 + - observability: Pydantic Logfire (traces, metrics, logs) 726 + - auth: ATProto OAuth 2.1 (forked SDK: github.com/zzstoatzz/atproto) 727 + 728 + **frontend** 729 + - framework: SvelteKit (latest v2.43.2) 730 + - runtime: Bun (fast JS runtime) 731 + - hosting: Cloudflare Pages (edge network) 732 + - styling: vanilla CSS with lowercase aesthetic 733 + - state management: Svelte 5 runes ($state, $derived, $effect) 734 + 735 + **deployment** 736 + - ci/cd: GitHub Actions 737 + - backend: automatic on main branch merge (fly.io deploy) 738 + - frontend: automatic on every push to main (cloudflare pages) 739 + - migrations: automated via fly.io release_command 740 + - environments: dev → staging → production (full separation) 741 + - versioning: nebula timestamp format (YYYY.MMDD.HHMMSS) 742 + 743 + **key dependencies** 744 + - atproto: forked SDK for OAuth and record management 745 + - sqlalchemy: async ORM for postgres 746 + - alembic: database migrations 747 + - boto3/aioboto3: R2 storage client 748 + - logfire: observability (FastAPI + SQLAlchemy instrumentation) 749 + - httpx: async HTTP client 750 + 751 + **what's working** 752 + 753 + 
**core functionality** 754 + - ✅ ATProto OAuth 2.1 authentication with encrypted state 755 + - ✅ secure session management via HttpOnly cookies (XSS protection) 756 + - ✅ developer tokens with independent OAuth grants (programmatic API access) 757 + - ✅ platform stats endpoint and homepage display (plays, tracks, artists) 758 + - ✅ Media Session API for CarPlay, lock screens, control center 759 + - ✅ timed comments on tracks with clickable timestamps 760 + - ✅ account deletion with explicit confirmation 761 + - ✅ artist profiles synced with Bluesky (avatar, display name, handle) 762 + - ✅ track upload with streaming to prevent OOM 763 + - ✅ track edit (title, artist, album, features metadata) 764 + - ✅ track deletion with cascade cleanup 765 + - ✅ audio streaming via HTML5 player with 307 redirects to R2 CDN 766 + - ✅ track metadata published as ATProto records (fm.plyr.track namespace) 767 + - ✅ play count tracking with threshold (30% or 30s, whichever comes first) 768 + - ✅ like functionality with counts 769 + - ✅ artist analytics dashboard 770 + - ✅ queue management (shuffle, auto-advance, reorder) 771 + - ✅ mobile-optimized responsive UI 772 + - ✅ cross-tab queue synchronization via BroadcastChannel 773 + - ✅ share tracks via URL with Open Graph previews (including cover art) 774 + - ✅ image URL caching in database (eliminates N+1 R2 calls) 775 + - ✅ format validation (rejects AIFF/AIF, accepts MP3/WAV/M4A with helpful error mes 776 + sages) 777 + - ✅ standalone audio transcoding service deployed and verified (see issue #153) 778 + - ✅ Bluesky embed player UI changes implemented (pending upstream social-app PR) 779 + - ✅ admin content moderation script for removing inappropriate uploads 780 + - ✅ copyright moderation system (AuDD fingerprinting, review workflow, violation tracking) 781 + - ✅ ATProto labeler for copyright violations (queryLabels, subscribeLabels XRPC endpoints) 782 + - ✅ admin UI for reviewing flagged tracks with htmx 
(plyr-moderation.fly.dev/admin) 783 + 784 + **albums** 785 + - ✅ album database schema with track relationships 786 + - ✅ album browsing pages (`/u/{handle}` shows discography) 787 + - ✅ album detail pages (`/u/{handle}/album/{slug}`) with full track lists 788 + - ✅ album cover art upload and display 789 + - ✅ server-side rendering for SEO 790 + - ✅ rich Open Graph metadata for link previews (music.album type) 791 + - ✅ long album title handling (100-char slugs, CSS truncation) 792 + - ⏸ ATProto records for albums (deferred, see issue #221) 793 + 794 + **frontend architecture** 795 + - ✅ server-side data loading (`+page.server.ts`) for artist and album pages 796 + - ✅ client-side data loading (`+page.ts`) for auth-dependent pages 797 + - ✅ centralized auth manager (`lib/auth.svelte.ts`) 798 + - ✅ layout-level auth state (`+layout.ts`) shared across all pages 799 + - ✅ eliminated "flash of loading" via proper load functions 800 + - ✅ consistent auth patterns (no scattered localStorage calls) 801 + 802 + **deployment (fully automated)** 803 + - **production**: 804 + - frontend: https://plyr.fm (cloudflare pages) 805 + - backend: https://relay-api.fly.dev (fly.io: 2 machines, 1GB RAM, 1 shared CPU, min 1 running) 806 + - database: neon postgresql 807 + - storage: cloudflare R2 (audio-prod and images-prod buckets) 808 + - deploy: github release → automatic 809 + 810 + - **staging**: 811 + - backend: https://api-stg.plyr.fm (fly.io: relay-api-staging) 812 + - frontend: https://stg.plyr.fm (cloudflare pages: plyr-fm-stg) 813 + - database: neon postgresql (relay-staging) 814 + - storage: cloudflare R2 (audio-stg bucket) 815 + - deploy: push to main → automatic 816 + 817 + - **development**: 818 + - backend: localhost:8000 819 + - frontend: localhost:5173 820 + - database: neon postgresql (relay-dev) 821 + - storage: cloudflare R2 (audio-dev and images-dev buckets) 822 + 823 + - **developer tooling**: 824 + - `just serve` - run backend locally 825 + - `just dev` - run 
frontend locally 826 + - `just test` - run test suite 827 + - `just release` - create production release (backend + frontend) 828 + - `just release-frontend-only` - deploy only frontend changes (added Nov 13) 829 + 830 + ### what's in progress 831 + 832 + **immediate work** 833 + - investigating playback auto-start behavior (#225) 834 + - page refresh sometimes starts playing immediately 835 + - may be related to queue state restoration or localStorage caching 836 + - `autoplay_next` preference not being respected in all cases 837 + - liquid glass effects as user-configurable setting (#186) 838 + 839 + **active research** 840 + - transcoding pipeline architecture (see sandbox/transcoding-pipeline-plan.md) 841 + - content moderation systems (#166, #167, #393 - takedown state representation) 842 + - PWA capabilities and offline support (#165) 843 + 844 + ### known issues 845 + 846 + **player behavior** 847 + - playback auto-start on refresh (#225) 848 + - sometimes plays immediately after page load 849 + - investigating localStorage/queue state persistence 850 + - may not respect `autoplay_next` preference in all scenarios 851 + 852 + **missing features** 853 + - no ATProto records for albums yet (#221 - consciously deferred) 854 + - no track genres/tags/descriptions yet (#155) 855 + - no AIFF/AIF transcoding support (#153) 856 + - no PWA installation prompts (#165) 857 + - no fullscreen player view (#122) 858 + - no public API for third-party integrations (#56) 859 + 860 + **technical debt** 861 + - multi-tab playback synchronization could be more robust 862 + - queue state conflicts can occur with rapid operations 863 + 864 + ### technical decisions 865 + 866 + **why Python/FastAPI instead of Rust?** 867 + - rapid prototyping velocity during MVP phase 868 + - rich ecosystem for web APIs (fastapi, sqlalchemy, pydantic) 869 + - excellent async support with asyncio 870 + - lower barrier to contribution 871 + - trade-off: accepting higher latency for faster development 
872 + - future: can migrate hot paths to Rust if needed (transcoding service already planned) 873 + 874 + **why Fly.io instead of AWS/GCP?** 875 + - simple deployment model (dockerfile → production) 876 + - automatic SSL/TLS certificates 877 + - built-in global load balancing 878 + - reasonable pricing for MVP ($5/month) 879 + - easy migration path to larger providers later 880 + - trade-off: vendor-specific features, less control 881 + 882 + **why Cloudflare R2 instead of S3?** 883 + - zero egress fees (critical for audio streaming) 884 + - S3-compatible API (easy migration if needed) 885 + - integrated CDN for fast delivery 886 + - significantly cheaper than S3 for bandwidth-heavy workloads 887 + 888 + **why forked atproto SDK?** 889 + - upstream SDK lacked OAuth 2.1 support 890 + - needed custom record management patterns 891 + - maintains compatibility with ATProto spec 892 + - contributes improvements back when possible 893 + 894 + **why SvelteKit instead of React/Next.js?** 895 + - Svelte 5 runes provide excellent reactivity model 896 + - smaller bundle sizes (critical for mobile) 897 + - less boilerplate than React 898 + - SSR + static generation flexibility 899 + - modern DX with TypeScript 900 + 901 + **why Neon instead of self-hosted Postgres?** 902 + - serverless autoscaling (no capacity planning) 903 + - branch-per-PR workflow (preview databases) 904 + - automatic backups and point-in-time recovery 905 + - generous free tier for MVP 906 + - trade-off: higher latency than co-located DB, but acceptable 907 + 908 + **why reject AIFF instead of transcoding immediately?** 909 + - MVP speed: transcoding requires queue infrastructure, ffmpeg setup, error handling 910 + - user communication: better to be upfront about limitations than silent failures 911 + - resource management: transcoding is CPU-intensive, needs proper worker architecture 912 + - future flexibility: can add transcoding as optional feature (high-quality uploads → MP3 delivery) 913 + - 
trade-off: some users can't upload AIFF now, but those who can upload MP3 have working experience 914 + 915 + **why async everywhere?** 916 + - event loop performance: single-threaded async handles high concurrency 917 + - I/O-bound workload: most time spent waiting on network/disk 918 + - recent work (PRs #149-151) eliminated all blocking operations 919 + - alternative: thread pools for blocking I/O, but increases complexity 920 + - trade-off: debugging async code harder than sync, but worth throughput gains 921 + 922 + **why anyio.Path over thread pools?** 923 + - true async I/O: `anyio` uses OS-level async file operations where available 924 + - constant memory: chunked reads/writes (64KB) prevent OOM on large files 925 + - thread pools: would work but less efficient, more context switching 926 + - trade-off: anyio API slightly different from stdlib `pathlib`, but cleaner async semantics 927 + 928 + ## cost structure 929 + 930 + current monthly costs: ~$5-6 931 + 932 + - cloudflare pages: $0 (free tier) 933 + - cloudflare R2: ~$0.16 (storage + operations, no egress fees) 934 + - fly.io production: $5.00 (2x shared-cpu-1x VMs with auto-stop) 935 + - fly.io staging: $0 (auto-stop, only runs during testing) 936 + - neon: $0 (free tier, 0.5 CPU, 512MB RAM, 3GB storage) 937 + - logfire: $0 (free tier) 938 + - domain: $12/year (~$1/month) 939 + 940 + ## deployment URLs 941 + 942 + - **production frontend**: https://plyr.fm 943 + - **production backend**: https://relay-api.fly.dev (redirects to https://api.plyr.fm) 944 + - **staging backend**: https://api-stg.plyr.fm 945 + - **staging frontend**: https://stg.plyr.fm 946 + - **repository**: https://github.com/zzstoatzz/plyr.fm (private) 947 + - **monitoring**: https://logfire-us.pydantic.dev/zzstoatzz/relay 948 + - **bluesky**: https://bsky.app/profile/plyr.fm 949 + - **latest release**: 2025.1129.214811 950 + 951 + ## health indicators 952 + 953 + **production status**: ✅ healthy 954 + - uptime: consistently available 
955 + - response times: <500ms p95 for API endpoints 956 + - error rate: <1% (mostly invalid OAuth states) 957 + - storage: ~12 tracks uploaded, functioning correctly 958 + 959 + **key metrics** 960 + - total tracks: ~12 961 + - total artists: ~3 962 + - play counts: tracked per-track 963 + - storage used: <1GB R2 964 + - database size: <10MB postgres 965 + 966 + ## next session prep 967 + 968 + **context for new agent:** 969 + 1. Fixed R2 image upload path mismatch, ensuring images save with the correct prefix. 970 + 2. Implemented UI changes for the embed player: removed the Queue button and matched fonts to the main app. 971 + 3. Opened a draft PR to the upstream social-app repository for native Plyr.fm embed support. 972 + 4. Updated issue #153 (transcoding pipeline) with a clear roadmap for integration into the backend. 973 + 5. Developed a local verification script for the transcoder service for faster local iteration. 974 + 975 + **useful commands:** 976 + - `just backend run` - run backend locally 977 + - `just frontend dev` - run frontend locally 978 + - `just test` - run test suite (from `backend/` directory) 979 + - `gh issue list` - check open issues 980 + ## admin tooling 981 + 982 + ### content moderation 983 + script: `scripts/delete_track.py` 984 + - requires `ADMIN_*` prefixed environment variables 985 + - deletes audio file from R2 986 + - deletes cover image from R2 (if exists) 987 + - deletes database record (cascades to likes and queue entries) 988 + - notes ATProto records for manual cleanup (can't delete from other users' PDS) 989 + 990 + usage: 991 + ```bash 992 + # dry run 993 + uv run scripts/delete_track.py <track_id> --dry-run 994 + 995 + # delete with confirmation 996 + uv run scripts/delete_track.py <track_id> 997 + 998 + # delete without confirmation 999 + uv run scripts/delete_track.py <track_id> --yes 1000 + 1001 + # by URL 1002 + uv run scripts/delete_track.py --url https://plyr.fm/track/34 1003 + ``` 1004 + 1005 + required 
environment variables: 1006 + - `ADMIN_DATABASE_URL` - production database connection 1007 + - `ADMIN_AWS_ACCESS_KEY_ID` - R2 access key 1008 + - `ADMIN_AWS_SECRET_ACCESS_KEY` - R2 secret 1009 + - `ADMIN_R2_ENDPOINT_URL` - R2 endpoint 1010 + - `ADMIN_R2_BUCKET` - R2 bucket name 1011 + 1012 + ## known issues 1013 + 1014 + ### non-blocking 1015 + - cloudflare pages preview URLs return 404 (production works fine) 1016 + - some "relay" references remain in docs and comments 1017 + - ATProto like records can't be deleted when removing tracks (orphaned on users' PDS) 1018 + 1019 + ## for new contributors 1020 + 1021 + ### getting started 1022 + 1. clone: `gh repo clone zzstoatzz/plyr.fm` 1023 + 2. install dependencies: `uv sync && cd frontend && bun install` 1024 + 3. run backend: `uv run uvicorn backend.main:app --reload` 1025 + 4. run frontend: `cd frontend && bun run dev` 1026 + 5. visit http://localhost:5173 1027 + 1028 + ### development workflow 1029 + 1. create issue on github 1030 + 2. create PR from feature branch 1031 + 3. ensure pre-commit hooks pass 1032 + 4. test locally 1033 + 5. merge to main → deploys to staging automatically 1034 + 6. verify on staging 1035 + 7. 
create github release → deploys to production automatically 1036 + 1037 + ### key principles 1038 + - type hints everywhere 1039 + - lowercase aesthetic 1040 + - generic terminology (use "items" not "tracks" where appropriate) 1041 + - ATProto first 1042 + - mobile matters 1043 + - cost conscious 1044 + - async everywhere (no blocking I/O) 1045 + 1046 + ### project structure 1047 + ``` 1048 + plyr.fm/ 1049 + ├── backend/ # FastAPI app & Python tooling 1050 + │ ├── src/backend/ # application code 1051 + │ │ ├── api/ # public endpoints 1052 + │ │ ├── _internal/ # internal services 1053 + │ │ ├── models/ # database schemas 1054 + │ │ └── storage/ # storage adapters 1055 + │ ├── tests/ # pytest suite 1056 + │ └── alembic/ # database migrations 1057 + ├── frontend/ # SvelteKit app 1058 + │ ├── src/lib/ # components & state 1059 + │ └── src/routes/ # pages 1060 + ├── moderation/ # Rust moderation service (ATProto labeler) 1061 + │ ├── src/ # Axum handlers, AuDD client, label signing 1062 + │ └── static/ # admin UI (html/css/js) 1063 + ├── transcoder/ # Rust audio transcoding service 1064 + ├── docs/ # documentation 1065 + └── justfile # task runner (mods: backend, frontend, moderation, transcoder) 1066 + ``` 1067 + 1068 + ## documentation 1069 + 1070 + - [deployment overview](docs/deployment/overview.md) 1071 + - [configuration guide](docs/configuration.md) 1072 + - [queue design](docs/queue-design.md) 1073 + - [logfire querying](docs/logfire-querying.md) 1074 + - [pdsx guide](docs/pdsx-guide.md) 1075 + - [neon mcp guide](docs/neon-mcp-guide.md) 1076 + 1077 + ## performance optimization session (Nov 12, 2025) 1078 + 1079 + ### issue: slow /tracks/liked endpoint 1080 + 1081 + **symptoms**: 1082 + - `/tracks/liked` taking 600-900ms consistently 1083 + - only ~25ms spent in database queries 1084 + - mysterious 575ms gap with no spans in Logfire traces 1085 + - endpoint felt sluggish compared to other pages 1086 + 1087 + **investigation**: 1088 + - examined Logfire traces 
for `/tracks/liked` requests 1089 + - found 5-6 liked tracks being returned per request 1090 + - DB queries completing fast (track data, artist info, like counts all under 10ms each) 1091 + - noticed R2 storage calls weren't appearing in traces despite taking majority of request time 1092 + 1093 + **root cause**: 1094 + - PR #184 added `image_url` column to tracks table to eliminate N+1 R2 API calls 1095 + - new tracks (uploaded after PR) have `image_url` populated at upload time ✅ 1096 + - legacy tracks (15 tracks uploaded before PR) had `image_url = NULL` ❌ 1097 + - fallback code called `track.get_image_url()` for NULL values 1098 + - `get_image_url()` makes uninstrumented R2 `head_object` API calls to find image extensions 1099 + - each track with NULL `image_url` = ~100-120ms of R2 API calls per request 1100 + - 5 tracks × 120ms = ~600ms of uninstrumented latency 1101 + 1102 + **why R2 calls weren't visible**: 1103 + - `storage.get_url()` method had no Logfire instrumentation 1104 + - R2 API calls happening but not creating spans 1105 + - appeared as mysterious gap in trace timeline 1106 + 1107 + **solution implemented**: 1108 + 1. created `scripts/backfill_image_urls.py` to populate missing `image_url` values 1109 + 2. ran script against production database with production R2 credentials 1110 + 3. backfilled 11 tracks successfully (4 already done in previous partial run) 1111 + 4. 3 tracks "failed" but actually have non-existent images (optional, expected) 1112 + 5. 
script uses concurrent `asyncio.gather()` for performance 1113 + 1114 + **key learning: environment configuration matters**: 1115 + - initial script runs failed silently because: 1116 + - script used local `.env` credentials (dev R2 bucket) 1117 + - production images stored in different R2 bucket (`images-prod`) 1118 + - `get_url()` returned `None` when images not found in dev bucket 1119 + - fix: passed production R2 credentials via environment variables: 1120 + - `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` 1121 + - `R2_IMAGE_BUCKET=images-prod` 1122 + - `R2_PUBLIC_IMAGE_BUCKET_URL=https://pub-7ea7ea9a6f224f4f8c0321a2bb008c5a.r2.dev` 1123 + 1124 + **results**: 1125 + - before: 15 tracks needed backfill, causing ~600-900ms latency on `/tracks/liked` 1126 + - after: 13 tracks populated with `image_url`, 3 legitimately have no images 1127 + - `/tracks/liked` now loads with 0 R2 API calls instead of 5-11 1128 + - endpoint feels "really, really snappy" (user feedback) 1129 + - performance improvement visible immediately after backfill 1130 + 1131 + **database cleanup: queue_state table bloat**: 1132 + - discovered `queue_state` had 265% bloat (53 dead rows, 20 live rows) 1133 + - ran `VACUUM (FULL, ANALYZE) queue_state` against production 1134 + - result: 0 dead rows, table clean 1135 + - configured autovacuum for queue_state to prevent future bloat: 1136 + - frequent updates to this table make it prone to bloat 1137 + - should tune `autovacuum_vacuum_scale_factor` to 0.05 (5% vs default 20%) 1138 + 1139 + **endpoint performance snapshot** (post-fix, last 10 minutes): 1140 + - `GET /tracks/`: 410ms (down from 2+ seconds) 1141 + - `GET /queue/`: 399ms (down from 2+ seconds) 1142 + - `GET /tracks/liked`: now sub-200ms (down from 600-900ms) 1143 + - `GET /preferences/`: 200ms median 1144 + - `GET /auth/me`: 114ms median 1145 + - `POST /tracks/{track_id}/play`: 34ms 1146 + 1147 + **PR #184 context**: 1148 + - PR claimed "opportunistic backfill: legacy records update on 
first access"
+ - but actual implementation never saved computed `image_url` back to database
+ - fallback code only computed URLs on-demand, didn't persist them
+ - this is why repeated visits kept hitting R2 API for same tracks
+ - one-time backfill script was correct solution vs adding write logic to read endpoints
+
+ **graceful ATProto recovery (PR #180)**:
+ - reviewed recent work on handling tracks with missing `atproto_record_uri`
+ - 4 tracks in production have NULL ATProto records (expected from upload failures)
+ - system already handles this gracefully:
+   - like buttons disabled with helpful tooltips
+   - track owners can self-service restore via portal
+   - `restore-record` endpoint recreates with correct TID timestamps
+ - no action needed - existing recovery system working as designed
+
+ **performance metrics pre/post all recent PRs**:
+ - PR #184 (image_url storage): eliminated hundreds of R2 API calls per request
+ - today's backfill: eliminated remaining R2 calls for legacy tracks
+ - combined impact: queue/tracks endpoints now 5-10x faster than before PR #184
+ - all endpoints now consistently sub-second response times
+
+ **documentation created**:
+ - `docs/neon-mcp-guide.md`: comprehensive guide for using Neon MCP
+   - project/branch management
+   - database schema inspection
+   - SQL query patterns for plyr.fm
+   - connection string generation
+   - environment mapping (dev/staging/prod)
+   - debugging workflows
+ - `scripts/backfill_image_urls.py`: reusable for any future image_url gaps
+   - dry-run mode for safety
+   - concurrent R2 API calls
+   - detailed error logging
+   - production-tested
+
+ **tools and patterns established**:
+ - Neon MCP for database inspection and queries
+ - Logfire arbitrary queries for performance analysis
+ - production secret management via Fly.io
+ - `flyctl ssh console` for environment inspection
+ - backfill scripts with dry-run mode
+ - environment variable overrides for production operations
+
+ **system health indicators**:
+ - ✅ no 5xx errors in recent spans
+ - ✅ database queries all under 70ms p95
+ - ✅ SSL connection pool issues resolved (no errors in recent traces)
+ - ✅ queue_state table bloat eliminated
+ - ✅ all track images either in DB or legitimately NULL
+ - ✅ application feels fast and responsive
+
+ **next steps**:
+ 1. configure autovacuum for `queue_state` table (prevent future bloat)
+ 2. add Logfire instrumentation to `storage.get_url()` for visibility
+ 3. monitor `/tracks/liked` performance over next few days
+ 4. consider adding similar backfill pattern for any future column additions
+
+ ---
+
+ this is a living document. last updated 2025-12-01 after ATProto labeler and admin UI improvements.
+
+ **open PRs**:
+ - PR #396: add rust CI and pre-commit checks (justfile case fix, cargo check hooks, check-rust.yml workflow)

+79

scripts/generate_tts.py

+ #!/usr/bin/env python3
+ """generate audio from a podcast script using gemini TTS.
+
+ usage:
+     uv run scripts/generate_tts.py podcast_script.txt output.wav
+
+ requires GOOGLE_API_KEY environment variable.
+ """
+ # /// script
+ # requires-python = ">=3.11"
+ # dependencies = ["google-genai"]
+ # ///
+
+ import os
+ import sys
+ from pathlib import Path
+
+ from google import genai
+ from google.genai import types
+
+
+ def main() -> None:
+     if len(sys.argv) != 3:
+         print("usage: generate_tts.py <script_file> <output_file>")
+         sys.exit(1)
+
+     script_path = Path(sys.argv[1])
+     output_path = Path(sys.argv[2])
+
+     if not script_path.exists():
+         print(f"error: {script_path} not found")
+         sys.exit(1)
+
+     api_key = os.environ.get("GOOGLE_API_KEY")
+     if not api_key:
+         print("error: GOOGLE_API_KEY not set")
+         sys.exit(1)
+
+     script = script_path.read_text()
+     print(f"generating audio from {script_path} ({len(script)} chars)")
+
+     client = genai.Client(api_key=api_key)
+     response = client.models.generate_content(
+         model="gemini-2.5-flash-preview-tts",
+         contents=script,
+         config=types.GenerateContentConfig(
+             response_modalities=["AUDIO"],
+             speech_config=types.SpeechConfig(
+                 multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
+                     speaker_voice_configs=[
+                         types.SpeakerVoiceConfig(
+                             speaker="Host",
+                             voice_config=types.VoiceConfig(
+                                 prebuilt_voice_config=types.PrebuiltVoiceConfig(
+                                     voice_name="Kore"
+                                 )
+                             ),
+                         ),
+                         types.SpeakerVoiceConfig(
+                             speaker="Cohost",
+                             voice_config=types.VoiceConfig(
+                                 prebuilt_voice_config=types.PrebuiltVoiceConfig(
+                                     voice_name="Puck"
+                                 )
+                             ),
+                         ),
+                     ]
+                 )
+             ),
+         ),
+     )
+
+     audio_data = response.candidates[0].content.parts[0].inline_data.data
+     output_path.write_bytes(audio_data)
+     print(f"saved audio to {output_path} ({len(audio_data)} bytes)")
+
+
+ if __name__ == "__main__":
+     main()
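
One thing worth checking in the script above: `output_path.write_bytes(audio_data)` writes the response bytes verbatim. If the inline data is raw PCM rather than a complete WAV file (Gemini's speech-generation documentation describes 24 kHz, 16-bit, mono PCM output, but this should be verified against the actual response), the resulting `output.wav` would lack a RIFF header and may not play. A minimal sketch of wrapping the bytes with the standard-library `wave` module, where the helper name `write_wav` and the default parameters are assumptions:

```python
import wave
from pathlib import Path


def write_wav(
    path: Path,
    pcm: bytes,
    *,
    channels: int = 1,      # assumed mono
    rate: int = 24000,      # assumed 24 kHz sample rate
    sample_width: int = 2,  # assumed 16-bit samples (2 bytes each)
) -> None:
    """Wrap raw PCM bytes in a WAV container so common players can read them."""
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(pcm)
```

In `main()`, this would stand in for the `write_bytes` call, for example `write_wav(output_path, audio_data)`.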
