commits
flo.by uploaded his catalog; AuDD identified each track's dominant
match as "Floby IV" (his stage name). every scan returned
is_flagged=true, which:
- showed a red "potential copyright violation" badge to the
artist on his own /portal page
- fired an admin DM ("copyright flag on plyr.fm / primary: X
by Floby IV") — admin received ~30 DMs in one session
`sync_copyright_resolutions` flipped is_flagged=false within 5min,
but only after the artist had already seen the flag and the DM
spam had landed.
fix: in `_store_scan_result`, look up the uploader's artist record
when is_flagged=true and compare slugified forms of the dominant
match artist to the uploader's handle and display name. on a
self-match, demote is_flagged to false at write time so the UI
flag and the DM never fire. a `copyright self-match suppressed` line is
logged for observability.
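roughly the shape of that check — a sketch only; the artist lookup and
most field names are assumptions, `_store_scan_result` and `is_flagged`
are the real names from this commit:

```python
import re

def slugify(s: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", s.lower()).strip("-")

async def _store_scan_result(db, track, scan) -> None:
    if scan.is_flagged:
        uploader = await get_artist_by_did(db, track.artist_did)  # assumed lookup
        match = slugify(scan.dominant_match_artist or "")
        if match and match in {slugify(uploader.handle), slugify(uploader.display_name or "")}:
            logger.info("copyright self-match suppressed", extra={"track_id": track.id})
            scan.is_flagged = False  # demoted at write time: no badge, no DM
    db.add(scan)  # persist as before
```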
separate semantic bug (sync flipping flags whose URI was never
labelled, not just negated) is unaddressed here — this is the
short-term fix to stop creator-visible flags + DM spam.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Empirical finding from iOS lock-screen testing: embed surfaces
(CollectionEmbed.svelte for albums/playlists, embed/track/[id]/+page.svelte
for single tracks) set NOTHING on navigator.mediaSession. Result on iOS
Safari and Android Chrome lock-screen controls: generic placeholder title,
no cover art, next/previous buttons either greyed out or routing to nothing.
The main app's Player.svelte has the right behavior; the embeds were
just missing it.
Adds `lib/media-session.ts` — small helper module that wraps the four
MediaSession APIs we use (metadata, playbackState, positionState,
action handlers) with no-op fallbacks on platforms without the API and
a try/catch around setPositionState (which throws on stale
duration/position during track transitions).
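The wrapper shape, sketched — exact exports and signatures in
`lib/media-session.ts` may differ:

```ts
// lib/media-session.ts (sketch)
const ms = () => ('mediaSession' in navigator ? navigator.mediaSession : undefined);

export function setMetadata(init: MediaMetadataInit | null): void {
  const session = ms();
  if (session) session.metadata = init ? new MediaMetadata(init) : null;
}

export function setPlaybackState(state: MediaSessionPlaybackState): void {
  const session = ms();
  if (session) session.playbackState = state;
}

export function setPositionState(state?: MediaPositionState): void {
  try {
    ms()?.setPositionState(state);
  } catch {
    // throws on stale duration/position during track transitions — drop it
  }
}

export function setActionHandler(
  action: MediaSessionAction,
  handler: MediaSessionActionHandler | null
): void {
  try {
    ms()?.setActionHandler(action, handler);
  } catch {
    // some platforms reject unsupported actions — treat as a no-op
  }
}
```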
Wires the helpers into both embed surfaces:
- Metadata effect: re-runs on track change. Pulls title/artist from
the track and falls back through track image → collection image
for artwork (single-track embed uses trackCoverUrl directly).
- PlaybackState effect: re-runs on paused change.
- PositionState effect: re-runs on time/duration change.
- Action handlers: registered ONCE on mount with cleanup on unmount.
Single-track embed explicitly nulls previoustrack/nexttrack so the
OS greys them out instead of inheriting stale handlers.
- Cleanup on unmount: clears metadata, sets playbackState to 'none',
nulls all handlers. Prevents stale lock-screen entries when the
user navigates away from an embed mid-playback.
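In an embed, the wiring above reduces to something like this (helper
names from the sketch above; track/cover prop shapes assumed):

```svelte
<script lang="ts">
  import { onMount } from 'svelte';
  import { setMetadata, setActionHandler } from '$lib/media-session';

  let { track, coverUrl } = $props(); // shapes assumed

  // metadata effect — re-runs on track (or artwork) change
  $effect(() => {
    setMetadata({
      title: track.title,
      artist: track.artist,
      artwork: coverUrl ? [{ src: coverUrl }] : [],
    });
  });

  onMount(() => {
    // single-track embed: null handlers so the OS greys the buttons out
    setActionHandler('previoustrack', null);
    setActionHandler('nexttrack', null);
    return () => setMetadata(null); // no stale lock-screen entry after unmount
  });
</script>
```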
Does NOT touch Player.svelte — it has its own (older, inline)
MediaSession setup that works. Refactoring it to use these helpers
is a separate dedup concern.
Validated via svelte:svelte-file-editor agent: zero autofixer issues,
reactivity correct (each effect reads only its deps), unmount cleanup
fires correctly, and `$state` closures inside the action handlers
read the current value at handler-call time (not a mount-time
snapshot).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(player): synchronous fast path for auto-advance to survive locked-screen autoplay
Reported in zzstoatzz.io/plyr.fm#1: on Android with the screen locked,
album / playlist playback stops at the end of each track instead of
advancing to the next. The reporter notes it worked in early February.
## Root cause
The chain from `<audio onended>` to `audio.play()` on the next track
goes through ~5 microtask boundaries plus an `await getAudioSource(...)`:
ended → handleTrackEnded → queue.next()
→ $effect: queue → player.currentTrack
→ $effect: load new src (await getCachedAudioUrl, fetch HEAD if gated)
→ audio.src = src; audio.load(); wait for loadeddata
→ $effect: shouldAutoPlay && !isLoadingTrack → player.paused = false
→ $effect: paused-sync → audio.play()
On a foregrounded tab this is milliseconds and works fine. On Android
with the screen locked, Chrome aggressively throttles non-foreground JS
and treats the page as "no longer audible" the moment the previous
track ends. By the time `audio.play()` finally runs, the implicit-
playback grace is gone and the call rejects with NotAllowedError. The
only way to resume is via a Media Session action handler (an explicit
lock-screen button press), which is exactly the workaround the
reporter was using.
This is not a regression from any one commit — the chain has had this
shape since before February. Most likely Chrome on Android tightened
locked-screen autoplay/freeze behavior between then and now, exposing
a long-standing fragility.
## Fix
Three coordinated changes:
1. **`queue.autoAdvanceTrack` getter** — single seam for "what should
natural end-of-track continuation play next". Today returns
`tracks[currentIndex + 1]`. Future continuation strategies (album
tail, feed continuation, recommendations) plug in here.
2. **Next-track prefetcher** — `resolveAudioSource` (extracted to
`lib/audio-source.ts`) returns a structured `ResolvedSource`
discriminator (ready / gated-denied / failed). A `$effect`
opportunistically resolves `queue.autoAdvanceTrack` while the
current track plays and stores the result in `preloadedNext`.
IndexedDB cache lookup and gated HEAD check move out of the
critical path.
3. **Synchronous fast path in `handleTrackEnded`** — when the
prefetcher has a ready source for the next track and we're not in
jam mode, swap `audio.src` and call `audio.play()` in the same tick
as the `ended` event. Reactivity (queue.next, player.currentTrack)
updates AFTER, so the autoplay grace is preserved. Pre-bumping
previousTrackId/previousFileId/previousQueueIndex before
`player.currentTrack = next; queue.next()` keeps downstream
effects no-ops; without it the queue→player sync effect's
`indexChanged` branch would seek the just-started audio back to 0.
When the preload isn't ready (race, jam active, gated denial), we
fall back to the existing reactive chain — same behavior as today.
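The fast path, sketched — state and helper names beyond the ones named
above are assumptions:

```ts
function handleTrackEnded(): void {
  const next = queue.autoAdvanceTrack;
  if (next && preloadedNext?.track === next &&
      preloadedNext.source.kind === 'ready' && !jamActive) {
    // same tick as `ended`: swap src and play before any microtask boundary,
    // so the implicit-playback grace from the finished track still applies
    audio.src = preloadedNext.source.url;
    void audio.play();
    // pre-bump what the queue→player sync effect compares against, so the
    // reactive updates below become no-ops instead of seeking back to 0
    previousTrackId = next.id;
    previousFileId = next.file_id;
    previousQueueIndex = queue.currentIndex + 1;
    player.currentTrack = next;
    queue.next();
    return;
  }
  queue.next(); // slow path: existing reactive chain, unchanged
}
```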
Plus structured telemetry (`recordPlaybackRejection`) logging
errorName, visibilityState, audio.readyState, fast-path flag, and
preload state so we can confirm in production whether the fast path
actually dodges the autoplay block per browser bucket.
## What this PR does NOT do
- Does not change collection needle-drop semantics. Album/playlist
row clicks still call `queue.playNow(track)` and discard collection
context — separate problem. The new `autoAdvanceTrack` getter is
the seam where a future "soft context" continuation strategy plugs in.
- Does not refactor `TrackItem.svelte`'s `$effect.pre` reset block
or other pre-existing patterns. Scoped to the auto-advance chain.
## Validation
- `just frontend check`: 0 errors / 0 warnings.
- Reviewed via `svelte:svelte-file-editor` agent — confirmed prefetch
effect's reactivity (correct), fast-path state-write ordering
(correct, with comment-strengthening applied), and blob-URL
accounting (correct across both paths).
- `lib/audio-source.ts` extracted out so Player.svelte's growth is
justified by the actual fast-path/prefetch substance, not pure
helpers that could live elsewhere.
## Test plan
- [x] svelte-check clean.
- [ ] After deploy: reproduce on Android (screen locked) with an album
that has 3+ tracks; confirm auto-advance works end-to-end.
- [ ] Confirm desktop foreground playback unchanged.
- [ ] Confirm gated-track skipping still works (denial via prefetch
consumes the cached entry; active gated denial still triggers
the toast).
- [ ] After 24h on prod: query logfire for `audio play() rejected`
events; analyze fast-path vs slow-path rejection rates per
`error.name` and `document.visibility_state` bucket.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(player): preserve auto-advance through gated tracks; fix telemetry pollution
review feedback on #1339:
1. **auto-advance into a gated track stopped playback instead of
   skipping.** my parameterized
`handleGatedDenial(err, fromAutoAdvance)` was ALWAYS called with
`false`, including from the loader effect when consuming a cached
`gated-denied` preload. so after `handleTrackEnded` set
`shouldAutoPlay = true` and `queue.next()` advanced into the
gated-denied track, `handleGatedDenial` clobbered shouldAutoPlay
back to false before `queue.goTo(nextPlayable)` — playback
stopped instead of skipping the gated track and continuing.
pre-fast-path code unconditionally set `shouldAutoPlay = true` in
this branch.
fix: drop the `fromAutoAdvance` parameter; always intend to
auto-play after a gated skip. matches pre-PR behavior. whether
the user clicked a gated track or auto-advance landed on one,
the user wants the next playable track to start.
2. **fallback telemetry was polluting the rejection metric.**
`recordAutoAdvanceFallback` emitted via `recordPlaybackRejection`,
whose event name is `audio play() rejected`, even though no
`play()` had been attempted on the slow path at that point. any
dashboard query filtering on that event name would have counted
slow-path-fallback markers as play rejections.
fix: drop `recordAutoAdvanceFallback` entirely. instead, instrument
the existing slow-path `play().catch(...)` site (which previously
only `console.error`'d) with `recordPlaybackRejection({fastPath:
false, ...})`. now BOTH paths emit the same event, and the
`playback.fast_path` field is the genuine discriminator for
comparing rejection rates between fast and slow paths. that's the
actual question the telemetry was trying to answer.
svelte-check: 0 errors / 0 warnings.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* chore(player): drop dead frontend telemetry plumbing
review feedback: I was writing comments and commit copy that referenced
"dashboards" for fast-vs-slow path comparison. There are no dashboards.
Frontend logfire is config-flagged off (`config.browser_observability`)
because it was destabilizing the backend; nobody is querying frontend
spans. So `recordPlaybackRejection` was emitting `logfire.info` against
an unconfigured client — net effect: dead code with imaginary purpose.
Removed:
- `recordPlaybackRejection` + `PlaybackRejectionContext` from
`lib/observability.ts`. `initObservability` itself stays — fetch /
XHR auto-instrumentation is the part that DOES propagate trace
headers to the backend, and that's still useful when the flag is on.
- Both call sites in Player.svelte (slow-path and fast-path
`play().catch(...)`) now `console.error` the same way the rest of
the file already did. If a user reports lock-screen playback
trouble, the actual debug pathway is "ask them to repro in
devtools and capture the console."
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
`test_cross_user_like` flakes intermittently in the staging integration
suite because of a real race in the like → unlike sequence:
1. user clicks LIKE → DB INSERT row R (atproto_like_uri=NULL),
`pds_create_like(R.id)` enqueued via docket.
2. user clicks UNLIKE before pds_create_like runs. atproto_like_uri
is still NULL so we just DELETE R; no PDS-delete is scheduled
because there's no URI yet.
3. `pds_create_like(R.id)` finally runs:
a. PDS create returns URI X.
b. SELECT R.id → row gone → orphan-cleanup branch fires.
c. `delete_record_by_uri(X)` is scheduled.
4. Jetstream emits the `app.bsky.feed.like` create event for X
BEFORE the matching delete event from (3c) propagates.
5. `ingest_like_create` finds no existing row for (track, user)
→ INSERTS a fresh row with URI X. **the like just resurrected
itself after the user explicitly unliked.**
6. eventually the delete event arrives and `ingest_like_delete`
by URI X clears the resurrected row — but in the gap the user
sees their unlike undone.
Fix: in (3c), tombstone the URI in Redis with a 5-minute TTL BEFORE
issuing the orphan PDS delete. `ingest_like_create` checks the
tombstone and drops the matching create event in (5). The TTL only
needs to cover Jetstream propagation; expiry is harmless because the
matching delete event still arrives shortly after.
Why Redis tombstone over a `cancelled_at` schema column: no migration,
no read-path filtering across ~15 query sites, scoped fix to the two
files actually involved in the race. Local Redis blip falls back to
the existing Jetstream-delete cleanup; user briefly sees the ghost
like but it's cleared seconds later.
Mirrors the existing track-tombstone pattern in `ingest.py` (which
prevents ghost tracks from cursor rewind) — same Redis primitive,
different prefix (`like_cancelled:` vs `plyr:tombstone:`) reflecting
the different concern (write race vs replay race).
Tests:
- tests/test_pds_create_like_tombstone.py — pds_create_like writes
the tombstone in the orphan branch and NOT on the happy path
(which would otherwise stall the user's own like indefinitely).
- tests/test_jetstream.py::TestIngestLikeCreate::test_skips_create_for_cancelled_uri
— ingest_like_create drops the create event when the URI is
tombstoned.
447/447 backend tests pass; ruff + ty clean.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the player bar already falls back to `track.album?.image_url` when the
per-track image is unset, but the track detail page, track-list items,
track grid cards, and the embed surface all rendered a placeholder
instead. result: the same track shows artwork in the player and a
blank in every other surface, including its own detail page.
extracted the inheritance rule into `lib/track-cover.ts`
(`trackCoverUrl` + `trackThumbnailUrl`) so every cover-rendering
surface routes through the same helper. semantically this models
the relationship correctly — the album HAS the art, the track
INHERITS unless it sets its own — instead of denormalizing the
album cover into each track row, which would silently go stale if
the album cover ever changed.
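the helper is essentially one expression per function (field names
assumed from the player-bar fallback described above):

```ts
// lib/track-cover.ts (sketch)
import type { Track } from '$lib/types';

export function trackCoverUrl(track: Track): string | undefined {
  // the track's own art wins; otherwise inherit the album's at view time
  return track.image_url ?? track.album?.image_url ?? undefined;
}

export function trackThumbnailUrl(track: Track): string | undefined {
  return track.thumbnail_url ?? track.album?.thumbnail_url ?? trackCoverUrl(track);
}
```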
side benefit: the recent /tmp upload bug (#1336) orphaned 3 tracks
with `image_id IS NULL` while their album record kept its cover.
those tracks now render the album cover at view time without any
DB backfill, and without needing the artist to re-upload.
surfaces touched:
- routes/track/[id]/+page.svelte — visible cover + og:image cascade
both routed through the helper; previewIsTrackArt simplifies to
`coverUrl !== undefined`
- lib/components/TrackItem.svelte — list item (used in album page,
my tracks, search results, etc.)
- lib/components/TrackCard.svelte — grid card
- routes/embed/track/[id]/+page.svelte — third-party embed (bg blur,
desktop side art, mobile art card all share the same coverUrl)
ATProto track records are unchanged: artists who didn't upload a
per-track image still don't claim one in their portable record.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(uploads): stage audio + image to shared storage before enqueueing docket
PR #1331 moved POST /tracks/ + PUT /tracks/{id}/audio onto docket
to fix a connection-pool problem, but mechanically forwarded the same
request-handler `/tmp/...` paths over Redis. on production fly.io,
`relay-api` runs multiple machines per process group; the docket worker
frequently lands on a different machine than the request handler. that
machine has its own /tmp, so the upload silently fails:
`FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpXXXX.wav'`.
evidence (prod, 2026-04-25 darkhart.bsky.social, 7 jobs):
4 failed at varied phases (`upload`, `pds_upload`, `atproto`) — all with
the same FileNotFoundError. the 3 that succeeded all hit the same
`atproto` phase. pure luck of which worker grabbed the job. the
successful tracks also had `image_id IS NULL` in `tracks` because
`_save_image_to_storage` reads `image_path` and silently swallows the
exception (returns `(None, None, None)` on failure). that's the
"cover art shows in the player bar but not on the track page" symptom.
shape of the fix:
HTTP handler:
1. stream client upload to a request-local temp file (size enforce)
2. extract duration once, while bytes are still local
3. `storage.save(file, filename)` -> audio_file_id
4. stream image to memory, `storage.save` -> image_id, image_url, thumb_url
5. delete request-local temp file
6. enqueue docket task with file_id / image_id / URLs ONLY
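handler side, sketched — `storage.save` is real; the streaming,
duration, and scheduling helpers are illustrative:

```python
import os
import tempfile

async def upload_handler(storage, audio_upload, image_upload):
    tmp = tempfile.NamedTemporaryFile(suffix=".audio", delete=False)
    try:
        await stream_to(tmp, audio_upload, max_bytes=MAX_AUDIO_BYTES)  # 1: size-enforced
        tmp.close()
        duration = extract_duration(tmp.name)                          # 2: bytes still local
        audio_file_id = await storage.save(tmp.name, audio_upload.filename)  # 3: durable
        image_id, image_url, thumb_url = await storage.save_image(
            await image_upload.read()                                  # 4
        )
    finally:
        os.unlink(tmp.name)                                            # 5: nothing local survives
    await schedule_track_upload(                                       # 6: ids/URLs only —
        audio_file_id=audio_file_id, duration=duration,                # no /tmp path ever
        image_id=image_id, image_url=image_url, thumb_url=thumb_url,   # crosses a machine
    )
```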
worker (`run_track_upload`, `run_track_audio_replace`):
- signatures take `audio_file_id`, never a `*_path`
- `_validate_audio` reads duration from the context (no I/O)
- `_store_audio` reuses the staged id directly for web-playable
formats; for lossless, downloads from storage, transcodes via a
worker-local /tmp (single-task, never crosses machine boundary),
saves transcoded result back to storage
- `_upload_to_pds` downloads bytes from storage when not transcoded
- `_store_image` is a no-op forward (URLs already resolved in handler)
this preserves PR #1331's connection-pool win (handler returns once
storage is durable + docket task is enqueued) and removes the
multi-machine fragility entirely.
- drops aiofiles use on this path; uses `storage.get_file_data`
- removes the temp-file cleanup in `_process_upload_background` —
there's nothing local to clean
- audio_replace handler also captures `support_gate` up front so the
staged bytes land in the right bucket (private vs public) before
the worker sees them
regression coverage:
the structural change (`UploadContext` no longer has `file_path`,
docket task signatures no longer have `*_path` args) is the contract.
existing tests (`test_upload_session_reload`, `test_upload_phases`,
`track_audio_replace/test_pipeline.py`) exercise the orchestrator
end-to-end through the new context shape and pass green (46 tests).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(uploads): clean up staged storage on handler-side + pre-DB worker aborts
addresses three orphan-cleanup gaps reviewer flagged on the staging refactor:
1. **handler-side**: any abort between `stage_audio_to_storage` and a
successful schedule call left staged storage objects orphaned and
the job stuck in PROCESSING. wrap staging+enqueue in try/except;
on failure delete staged audio (private if gated, public otherwise)
and image, mark the job FAILED.
2. **replace orchestrator**: `new_file_id_for_rollback` was None until
`_store_audio` returned. the gated-FLAC path (handler stages new
bytes to private bucket → `_store_audio` raises "supporter-gated
tracks cannot use lossless formats yet") left those bytes stranded.
initialize from `ctx.audio_file_id` upfront, thread the playable-
file extension through `_rollback_new_files`. add `is_gated: bool`
to ReplaceContext (handler-time decision) so rollback selects the
bucket the bytes ACTUALLY live in even under a concurrent PATCH
that flips support_gate between request and worker.
3. **upload orchestrator**: phases 1-5 raise UploadPhaseError without
releasing staged bytes. add `_cleanup_staged_media_pre_db` and a
`db_row_owns_media` boundary flag — orchestrator cleans up only
before `_create_records`, deferring to its existing reserve-then-
publish cleanup past that. covers the transcoded-sibling case.
session-expired path on both workers also deletes the staged bytes
(no recovery without a fresh sign-in; orphans serve nothing).
regression tests:
- `tests/api/test_upload_storage_cleanup.py` (4 tests)
- `track_audio_replace/test_pipeline.py` (1 test):
early-abort rolls back staged file from the right bucket per
`ctx.is_gated`
370/370 tests pass locally; ruff + ty clean.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* chore: drop stray backend/loq.toml
* chore(uploads): consolidate cleanup helper, drop redundant deferred import
once-over after CI green:
- removed redundant `from backend._internal import get_session` deferred
re-import inside `_process_upload_background` — the symbol is already
imported at module scope. updated `test_upload_session_reload` to
patch where the symbol is used (`backend.api.tracks.uploads.get_session`)
rather than where it's defined, which is the right pattern anyway.
- audio_replace's handler + session-expired path were inlining the
same `delete_gated if gated else delete` pattern that uploads exposes
as `_delete_staged_audio`. import + reuse instead of duplicating.
no behavior change; 370/370 tests pass.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
12 concurrent uploads targeting the same album (artist_did, slug) raced
in `get_or_create_album`: the losers caught IntegrityError and called
`db.rollback()` on the caller's shared AsyncSession. under concurrent
load this left 2/12 uploads blowing up with MissingGreenlet on the very
next pool checkout, ~300ms after INSERT albums — observed on stg during
the 12-chromatic-drone smoke test (2026-04-24).
replace SELECT-then-INSERT-then-catch with a single
`INSERT ... ON CONFLICT DO NOTHING RETURNING`. the race resolves at the
DB level, no rollback on a shared session, no churn on pool state.
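the upsert, sketched with SQLAlchemy's postgres dialect (model and
constraint names assumed):

```python
from sqlalchemy import select
from sqlalchemy.dialects.postgresql import insert

async def get_or_create_album(db, artist_did: str, slug: str, title: str):
    stmt = (
        insert(Album)
        .values(artist_did=artist_did, slug=slug, title=title)
        .on_conflict_do_nothing(index_elements=["artist_did", "slug"])
        .returning(Album.id)
    )
    album_id = (await db.execute(stmt)).scalar_one_or_none()
    if album_id is not None:
        return album_id, True  # this caller won the race
    # DO NOTHING returns no row to the losers: the row exists — read it
    # back. no rollback ever touches the shared session.
    row = await db.execute(
        select(Album.id).where(Album.artist_did == artist_did, Album.slug == slug)
    )
    return row.scalar_one(), False
```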
regression test fires 12 concurrent `get_or_create_album` calls on
separate sessions with the same title and asserts exactly 1 row, 1
`created=True`, and all callers agree on the resulting album id.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the POST /tracks/ and PUT /tracks/{id}/audio handlers used
`fastapi.BackgroundTasks.add_task`, which runs the task within the
same ASGI request lifecycle after the response is sent. consequence:
any request-scoped DB session stays checked out of the pool until the
task finishes (20-100s per upload), and nothing bounds concurrency.
today flo.by uploaded 6 tracks in a single album-create fan-out. six
concurrent uploads held six of the 10 pool slots for over a minute
and starved every other request (/auth/me p95 hit 9.7s, /health 3s).
root cause: this pattern was in place from the very first streaming-
uploads commit (26a48c75, Nov 2025). docket landed a month later and
all post-upload tasks were migrated piecemeal (copyright, embedding,
genre, image moderation, atproto sync, teal, export, pds backfill)
but the upload orchestration itself never was. audio replace (#1311,
Apr 2026) copied the same pattern.
changes:
- uploads.py: add run_track_upload (docket task, primitives only,
rehydrates session, delegates to existing _process_upload_background)
+ schedule_track_upload helper
- audio_replace.py: same trio for replace
- handlers: drop `background_tasks: BackgroundTasks` param, call
await schedule_* instead
- _internal/tasks/__init__.py: register both tasks in the docket list
- test_endpoint.py: patch the scheduler helper, not the orchestrator
- tests/integration/test_album_upload.py: add
test_album_upload_10_tracks_concurrently as regression coverage —
fires 10 concurrent uploads through an album and asserts all complete
- loq.toml: relax limits on uploads.py + audio_replace.py to cover the
new wrapper functions
the existing orchestrators (_process_upload_background,
_process_replace_background) keep the same signature so every pipeline
test that drives them directly continues to pass unchanged.
buys us:
- HTTP handler returns in <1s; request-scoped DB session released on
response instead of 100s later
- per-op DB sessions via db_session() inside the task, not held across
the whole upload
- bounded concurrency via settings.docket.worker_concurrency (default
10/worker x 2 prod machines = 20 concurrent uploads, rest queue in
Redis rather than saturating the pool)
- fresh session rehydration if OAuth refreshed between queue and task
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the component used a CSS overlay with `z-index: 1000` and manual ESC /
focus-trap plumbing, while every other sheet / modal in the codebase
(AudioRevisionsSheet, LikersSheet, LogoutModal, SearchModal,
PdsMigrationModal, FeedbackModal, Toast, TermsOverlay) uses
`z-index: 9999`. opening a confirm from *inside* one of those sheets —
specifically "restore" inside the audio version-history sheet —
rendered the confirm behind the sheet, forcing the user to dismiss the
sheet before they could click confirm.
bumping the z-index to 10000 would have been whack-a-mole. using the
native <dialog> element with `.showModal()` puts the dialog in the
browser's top layer, which stacks above every other element on the
page regardless of z-index. by construction, nested modals work.
secondary benefits from switching to the platform primitive:
- focus trap, aria-modal, ESC handling all native — removed our
reimplementations
- ::backdrop pseudo-element for backdrop styling
- role="alertdialog" for semantic correctness on confirmation prompts
- oncancel handler blocks ESC-dismiss while an async confirm is in
flight (pending=true), so the user can't dismiss a pending operation
mid-run and leave parent state inconsistent with UI state
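the core of the component, sketched — prop names match the callsite
description below; everything else is assumed:

```svelte
<script lang="ts">
  let { open = false, pending = false, onCancel } = $props();
  let dialog: HTMLDialogElement | undefined;

  $effect(() => {
    if (!dialog) return;
    if (open && !dialog.open) dialog.showModal(); // top layer: above any z-index
    else if (!open && dialog.open) dialog.close();
  });
</script>

<dialog
  bind:this={dialog}
  role="alertdialog"
  oncancel={(e) => {
    if (pending) e.preventDefault(); // block ESC-dismiss mid-operation
    else onCancel?.();
  }}
>
  <!-- title / message / confirm + cancel buttons -->
</dialog>
```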
public API of the component is unchanged — both existing callsites
(replace-audio confirm + restore-revision confirm in portal/+page.svelte)
continue to pass `open={...}` one-way and manage close via `onCancel`.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
follow-up to #1326 — the × button that clears a selected file in the
audio-replace row is the only remaining <button> in that group that
lacked `font-family: inherit`. it currently renders only an SVG icon
so there's no visible font right now, but matching the sibling buttons
keeps the group consistent and protects against future "what if we add
tooltip text" changes.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the "version history" button rendered in the browser default sans-serif
instead of the user's selected global font (mono by default). <button>
elements don't inherit font-family by default, so an explicit
`font-family: inherit` is required. matches the pattern already used
in login, tag, and track routes.
also added to .audio-replace-btn (same root cause; not visible in the
current screenshot because it only appears after a file is selected)
and .audio-upload-btn (it's on a <label> so inherits implicitly, but
adding for consistency and to protect against future markup changes).
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the restore path used to strip audioBlob from the republished record
whenever the PDS had already GC'd the revision's original CID, silently
downgrading the track to audio_storage="r2". plyr.fm's core promise is
that users own their audio on their PDS — dropping the blob ref would
break that promise.
new behavior when PDS returns BlobNotFound on the first publish:
1. fetch the R2 bytes via storage.get_file_data(file_id, file_type)
2. upload them to the user's PDS to mint a fresh blob CID
3. republish the record with the fresh audioBlob ref
4. commit the track with audio_storage="both" + the new CID
fallback chain (rare): if R2 is also missing the bytes, or the PDS
rejects the re-upload (oversize, transient), we keep the old behavior
— republish without audioBlob and downgrade to r2-only. restore still
completes; playback keeps working via audio_url.
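retry path, sketched — `storage.get_file_data` is the real helper; the
PDS client and exception names are assumptions:

```python
try:
    await pds.put_record(record)                    # first publish, old blob ref attached
except BlobNotFoundError:                           # PDS already GC'd the revision's CID
    data = await storage.get_file_data(file_id, file_type)  # 1: fetch R2 bytes
    blob = await pds.upload_blob(data)                       # 2: mint a fresh CID
    await pds.put_record(with_audio_blob(record, blob))      # 3: republish with fresh ref
    track.audio_storage = "both"                             # 4: commit the new state
    track.pds_blob_cid = blob.cid
```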
verified via smoke test on stg.plyr.fm (track 2202) before the fix:
post-restore PDS record had no audioBlob, DB had audio_storage="r2",
pds_blob_cid=null. with this patch, the restored record carries a
first-class PDS blob ref again.
tests:
- rewrote test_restore_falls_back_when_pds_blob_gc →
test_restore_reuploads_blob_when_pds_gc: asserts the retry record
carries the re-uploaded ref and DB ends with audio_storage="both"
- added test_restore_falls_back_to_r2_when_reupload_also_fails: covers
the R2-miss path (retained fallback behavior)
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the CF Pages frontend build config described in environments.md was
stale. verified against the live project config on each recreate today:
- build command: `cd frontend && bun run build`
→ `cd frontend && bun install && bun run build`
(SKIP_DEPENDENCY_INSTALL=1 is set to skip CF's auto-install, so the
build command has to run `bun install` itself)
- build output: `frontend/build`
→ `frontend/.svelte-kit/cloudflare`
(matches `pages_build_output_dir` in `frontend/wrangler.toml`)
- env vars list: added `SKIP_DEPENDENCY_INSTALL=1` which was missing
- prod custom domain line: added `www.plyr.fm` alongside `plyr.fm`
same fixes applied to both prod and staging subsections.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
reverting the retry-poll I added in #1320 after investigating properly.
the test failure in the post-#1319 integration run was NOT flakiness —
it's a real race condition in the like → pds_create → jetstream ingest
pipeline. tracked in #1321.
a retry-poll would have papered over a real bug and made future
diagnosis harder ("oh the test just takes 5s sometimes"). reverting
to the original assertion so the failure remains visible until the
underlying race is fixed.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(restore): fall back when PDS has GC'd old blob
* test(integration): retry-poll the liked_tracks check in test_cross_user_like
failed in the #1319 post-deploy integration run: the liked_tracks list
was still showing the track immediately after unlike. pre-existing
eventual-consistency gap — the likes pipeline has a small lag between
the unlike write and the liked list read (cache / read-replica).
matches the pattern test_upload_searchable already uses for similar
eventually-consistent reads: retry up to 5 times with 1s sleep, fail
with a clear message if the track is still there.
caught by manual staging smoke: when restoring a revision that had
audio_storage="both" with a PDS blob ref, the restored PDS record was
being published WITHOUT an audioBlob field, silently dropping the user's
PDS-hosted copy of the audio.
root cause: the restore code explicitly passed `audio_blob=None` to
build_track_record with a misleading comment claiming "PDS blob not
re-uploaded on restore". the comment was right about the blob bytes
(they're already on PDS), but the BLOB REF must still be included in
the new record — PDS records can reference pre-uploaded blobs.
fix: if the revision carries a pds_blob_cid, construct a BlobRef
(using the stored size + the file_type's mime type) and pass it
through to build_track_record. PDS records now keep their audioBlob
field through the full replace → restore round trip.
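the ref itself is just lexicon JSON — sketch (revision field names are
from this commit; the mime lookup is an assumed helper):

```python
if revision.pds_blob_cid:
    audio_blob = {
        "$type": "blob",
        "ref": {"$link": revision.pds_blob_cid},
        "mimeType": mime_for(revision.file_type),  # assumed helper
        "size": revision.pds_blob_size,
    }
else:
    audio_blob = None
record = build_track_record(..., audio_blob=audio_blob)
```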
also adds a regression test that:
- sets up the same scenario the smoke hit (both → replace → restore)
- asserts the published record contains audioBlob pointing at the
original ref
- asserts the live track row keeps audio_storage="both" and the
correct pds_blob_cid / pds_blob_size after restore
note: if the user's PDS has already GC'd the old blob, the record is
still valid — playback falls back to audio_url (R2). we don't re-upload
the blob as part of restore; that would require hauling bytes through
the backend and is out of scope for v1.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* feat: audio revisions with confirm-before-replace and restore
closes the UX loop on the audio-replace feature shipped in #1311-1313.
two changes shipped together:
1. **confirmation gate** before audio replace fires. picking a file no
longer kicks off the irreversible upload — clicking "replace audio"
now opens a confirm dialog. addresses Alex's report that hitting
"cancel" after picking a file did not roll back the replace (because
nothing actually fired until "replace audio" was clicked, but the
coupling between picker and that button was confusing).
2. **track_revisions table** + restore endpoint + version-history sheet.
every audio replace snapshots the displaced audio into a TrackRevision
row in the same DB transaction as the swap. column names are
provider-neutral (audio_url, not r2_url) so swapping blob providers
later doesn't leave cruft behind. retention cap is 10 per track —
pruning deletes the backing blob if no other row still references
it. PDS-only audio is never deleted (user owns those blobs).
restore is an instant pointer-swap: the chosen revision becomes the
live audio, the displaced current is snapshotted into a new revision
row, and the chosen revision row is deleted (its content is now
current). PDS record is republished as part of the same flow — non-
negotiable so the user's PDS stays in sync with plyr.fm state.
restore is rejected with 409 if it would cross the public ↔ gated
boundary — moving blobs between buckets isn't built yet, and serving
gated audio from the public bucket would defeat the gate.
the version-history surface is a bottom-sheet on mobile / centered
modal on desktop, modeled on LikersSheet. trigger lives in the audio
file section of the track edit form. each row shows format,
relative time, duration, storage location, and a restore button.
new endpoints:
- GET /tracks/{id}/revisions
- POST /tracks/{id}/revisions/{revision_id}/restore
new components:
- ConfirmDialog.svelte — generic alertdialog (used for replace + restore)
- AudioRevisionsSheet.svelte — mobile-first version-history surface
related: #1314 (orphan R2 files) — revisions give R2 files an owner,
which removes the orphan path. #1315 (in-flight tasks writing stale
results) is orthogonal and not addressed here.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* test: integration coverage for audio revisions + restore
three end-to-end tests against staging (skip when PLYR_TEST_TOKEN_* unset):
- replace_audio_creates_revision — upload, replace, verify history holds
exactly one row capturing the displaced original
- restore_swaps_audio_and_rotates_revision — upload, replace, restore;
live audio is back to the original, chosen revision row is gone, the
displaced post-replace audio is now in history
- non_owner_cannot_list_or_restore — user2 gets 403 on both list and
restore against user1's track
each test cleans up via the SDK's delete(). new endpoints aren't in the
SDK yet, so raw httpx is used for replace + revisions/restore.
these will run automatically after the PR merges and staging deploys
(the integration-tests workflow fires on deploy staging completion).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
"current: m4a" alone is too terse — the user has no sense of where the
audio lives or whether it was transcoded. swap to:
current: m4a · stored on your PDS
current: mp3 (transcoded from flac) · stored on plyr.fm
we don't have the original filename to show (it's content-hashed away
at upload), so format + storage location is the most useful signal.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
UI for the new PUT /tracks/{id}/audio endpoint shipped in #1311. Lets
artists swap a track's audio without deleting + re-uploading (and losing
likes / comments / plays / the URL).
how it's wired:
- uploader.replaceAudio(trackId, file, title, onComplete) mirrors the
existing uploader.upload XHR + SSE flow but PUTs to a track-specific
endpoint with just the file. progress is surfaced via the same toast
pattern as the initial upload.
- portal/+page.svelte edit form gains an "audio file" section next to
the existing "artwork" section. picker → "replace audio" button →
picker clears immediately and the SSE flow continues in the toast.
- Player.svelte's track-load $effect now also fires on file_id change,
not just track id change. so when the currently-playing track gets a
new audio file, the <audio> element src reloads in place. on
successful replace, we fetch the fresh track row and reassign
player.currentTrack so the effect picks up the new file_id.
deliberately separate from the metadata "save changes" flow because
the upload + transcode + PDS write can take 30s+ and has its own SSE
progress; conflating it with the fast PATCH would block the form.
manual smoke depends on backend being deployed to staging.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* feat(backend): replace audio on existing track via PUT /tracks/{id}/audio
artists currently delete + re-upload to fix bad audio (logfire shows
darkhart.bsky.social did this 3× in ~65min). that loses likes, comments,
plays, and the track URI. add an endpoint that swaps the audio bytes while
keeping the track's stable identity intact.
orchestration is atomic with rollback:
1. validate + store new audio (R2; transcode if lossless)
2. upload to PDS (best-effort, falls back to r2-only on size limit)
3. PUT updated ATProto record (URI stable, new CID)
4. DB row swap in single tx — file_id, r2_url, atproto_record_cid, duration,
pds_blob_*, audio_storage; clears stale genre_predictions provenance
5. delete old R2 object only on success
6. fire post-replace hooks: invalidate old CopyrightScan rows, re-fire
copyright/embedding/genre tasks; never re-notify followers
7. resync album list record so its strongRef carries the new track CID
if step 3 fails, rollback deletes the just-written R2 file and leaves the
track row untouched.
reuses upload phase helpers (_validate_audio, _store_audio, _upload_to_pds)
so the transcode/PDS-blob/gating logic stays in one place.
intentional non-changes:
- labeler labels on the (URI-stable) track are NOT auto-dismissed — that's
a moderation call left for manual review
- likes/playlists/comments retain stale strongRef CIDs; this is the same
CID-churn behavior that PATCH /tracks/{id} produces today
also fixes two pre-existing test failures uncovered while building this:
- conftest pg_trgm extension only created in xdist template path
- moderation report tests leaked rate-limit budget across the session
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(audio-replace): tighten rollback scope + handle gated bucket
addresses two review findings on #1311, plus two smaller follow-ups:
1. **post-commit failures triggered rollback (P1)**. the previous orchestrator
wrapped both pre- and post-commit work in one try block. if any side
effect after `_commit_db_swap` raised (post-replace hooks, album resync,
cache invalidation), the except path would delete the new R2 file even
though the track row + ATProto record were already pointing at it —
leaving production with a 404 for the freshly-replaced audio.
split into two phases: pre-commit may rollback; post-commit failures are
logged and swallowed (the swap stands). each post-commit side effect
gets its own try/log so one failure doesn't skip the others.
2. **gated tracks leaked private-bucket objects (P2)**. `R2Storage.delete()`
only probes the public audio + image buckets, so cleanup and rollback
silently no-op'd on supporter-gated tracks (which live in
`private_audio_bucket_name`).
added `delete_gated()` to `R2Storage` + `StorageProtocol` (mirrors
`delete()`'s refcount guard and key probing, against the private bucket).
`_cleanup_old_files` and `_rollback_new_files` now route based on the
track's `support_gate`. also fixes the same pre-existing leak for
gated tracks deleted via the API today (separate latent bug, but the
primitive is now there).
3. **defensive metadata refresh before publish**. a concurrent PATCH that
landed between `_load_and_authorize` and `_publish_record_update` would
have its title / album / features clobbered by the stale snapshot. now
re-loads the row right before building the new ATProto record.
4. **hoist deferred imports** in audio_replace.py + storage/r2.py per the
project's "no unnecessary deferred imports" rule (CLAUDE.md). the
`backend.api.albums` import doesn't have a real circular dep — i'd
copied the pattern from mutations.py without checking.
new tests:
- post-replace hook failure does NOT roll back the new file
- album list sync failure does NOT roll back the new file
- gated track success path uses `delete_gated` for old file
- gated track rollback uses `delete_gated` for new file
- concurrent PATCH title is reflected in the published ATProto record
full xdist suite: 815 passed (was 810 + 5 new tests).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the Cmd+K modal was firing keyword and semantic (mood) searches in
parallel whenever the vibe-search flag was on, then merging both lists
by score (#858). BM25 relevance and cosine similarity are on different
scales — the sort produced jarring interleaves where mediocre semantic
matches outranked solid keyword hits.
revert to the #848 interaction model: explicit mode toggle, one mode at
a time. keyword is the default. flagged users can flip to mood when
they want it; the toggle is hidden for everyone else so there's no
change for non-flagged users.
- search.svelte.ts: add mode state, setMode(), stale-mode guards on
in-flight fetches. activeResults returns just the active mode's list.
drop dedupedSemanticResults / semanticResultIds / semanticSimilarityMap.
- SearchModal.svelte: render a small keyword/mood chip toggle below the
input, gated on search.semanticEnabled. placeholder copy follows the
active mode. mood similarity % only renders in semantic mode.
arc for the record: toggle (#848) → parallel + separator (#851) →
score-merged interleave (#858) → toggle again. #851/#858 were the wrong
direction given how uneven semantic ranking still is.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Restores the prior hover-tooltip + mobile-sheet model for liker
lists. The inline avatar strip was fighting too many surfaces
(track rows, cards, detail page, click propagation to play button,
SvelteKit nav hijacking) and broke enough of them that the churn
outweighed the UX win.
Reverts PRs #1302, #1303, #1304, #1305, #1306, #1307, #1308 in
one commit. Search modal stability (#1301) is preserved.
Removed by this revert:
- AvatarStack.svelte, LikersStrip.svelte (both introduced by the
  reverted PRs)
- LikerPreview schema, get_top_likers aggregation, top_likers on
TrackResponse, all the callsite wiring
Restored by this revert:
- LikersTooltip.svelte (desktop hover tooltip)
- LikersSheet.svelte + likers-sheet.svelte.ts (mobile bottom sheet)
- LikersSheet mount in +layout.svelte
- Original .likes span markup + CSS in TrackItem, TrackCard,
track/[id]/+page.svelte
- Original supporter-circle markup + CSS on u/[handle]/+page.svelte
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Clicking "+1" to reveal a single extra avatar expanded a strip that
scrolled nowhere — pointless. Skip the "+N" tile when the overflow
would be smaller than minOverflow (default 3) AND we have enough
users loaded to show everyone inline.
- backend `get_top_likers` default limit: 3 -> 5. tracks with 4 or 5
total likes now ship everyone in the preview, so the frontend can
render them all inline without a dead-end tile. cost per EXPLAIN
ANALYZE is still sub-millisecond on production.
- frontend `AvatarStack` gains `minOverflow` prop (default 3). only
skips "+N" when the overflow is small AND users.length >= total
(i.e. we have everyone loaded), so partial-data surfaces fall back
safely to the regular "+N" affordance.
Behavior:
- total <= 5: shows everyone inline, no "+N"
- total >= 6: shows 3 + "+N>=3" (expansion reveals meaningfully more)
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Previous fix slapped onclick={(e) => e.stopPropagation()} on the
LikersStrip root to prevent +N/× clicks from bubbling to the outer
play button. That also ate anchor clicks on individual avatars
before they could reach document, where SvelteKit's client-side
nav hijacker lives. With that listener never firing, the browser
fell back to a full page reload — which tears down the audio
element mid-playback.
Scope the stopPropagation to just the non-anchor interactive bits:
- +N handler in LikersStrip now stopPropagation's its own event
- × collapse button already stopPropagation'd
- root span no longer stops anything
Avatar links now reach document → SvelteKit intercepts → client-side
nav → player persists → audio keeps playing. The outer play button's
existing anchor guard (closest('a')) still prevents playback on
anchor clicks, so no regression on that front either.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Without a label the avatars looked like they belonged to the song
itself (artist, featured collaborators) rather than the people who
liked it. Adds a "liked by" label inside LikersStrip so the meaning
is unambiguous everywhere the strip appears — TrackItem, TrackCard,
and the track detail page.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The strip lives inside the TrackItem play button. Clicking +N or ×
was bubbling up to the outer button's onclick, starting playback
before the expand/collapse could land. Stop click+keydown
propagation at the LikersStrip root so all interaction inside the
strip (avatar navigation, +N expand, × collapse) stays contained.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Previous cut lost timestamp info — hovering an avatar showed only
display name. Adding it back without reintroducing a separate panel:
- backend LikerPreview now includes `liked_at` (ISO string) pulled
from `track_likes.created_at` via the window-function query
- AvatarStack gets an optional `avatarTitle(user)` prop so parents
can customize the hover/focus tooltip
- LikersStrip passes a formatter that renders
"display name · liked 2h ago"
UserPreview.liked_at is optional — supporter avatars on the artist
page don't carry a timestamp and keep their existing display-name
tooltip.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The prior PR moved liker avatars inline but left the hover tooltip
and mobile bottom sheet in place. On hover, a tooltip opened above
the track row and showed... the same avatars, just a few more of
them. That was redundant and, as you pointed out, the separate
panel opening on hover was the most egregious part.
New model: the inline strip *is* the interaction.
- hover → per-avatar lift (already worked via AvatarStack)
- click an avatar → navigate to /u/{handle}
- click "+N" → the stack itself expands in place to a
horizontally-scrollable strip of every liker. same widget, just
longer. lazy-fetched via the existing tooltip-cache so the
data is there the first time you expand and instant on
subsequent expansions
- click × (or click outside, or press Escape) → collapses back
No popover, no bottom sheet, no tooltip. One affordance, consistent
across mobile and desktop.
Implementation
- AvatarStack.svelte — new scrollable + maxScrollWidth props.
When scrollable, the container gets overflow-x: auto, scroll-snap,
and a thin scrollbar. Overlap is preserved so it stays visually
the same widget, not a different one.
- LikersStrip.svelte — new wrapper that owns the expansion state
and the lazy fetch. Parents pass trackId + likeCount + topLikers
and don't think about anything else.
- TrackItem, TrackCard, track/[id]/+page.svelte — all the
tooltip/sheet state, hover timers, click-to-open-sheet handlers,
cursor: help, tooltip-open z-index gymnastics — all gone.
Replaced with <LikersStrip>.
- Deleted: LikersTooltip.svelte, LikersSheet.svelte,
likers-sheet.svelte.ts. Removed mount from +layout.svelte.
tooltip-cache.svelte.ts stays — LikersStrip uses it for the
expansion fetch.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Replaces the plain "N likes" text next to tracks with an overlapping
strip of the 3 most recent liker avatars (+N if more) — matching the
existing supporter-row pattern on artist pages. Both sites now render
the same `AvatarStack` presentational component; only the data flow
differs (liker avatars are maintained in our artists DB via jetstream;
supporter avatars come from atprotofans via the /artists/batch
enrichment already in place).
Backend
- new `get_top_likers(db, track_ids, limit=3)` aggregation utility
  using `ROW_NUMBER() OVER (PARTITION BY track_id ORDER BY created_at
  DESC)`, filtered to `rn <= limit`. Postgres 15+ pushes the limit
  into the window aggregate (Run Condition), so work short-circuits
  per partition. EXPLAIN ANALYZE on production (308 likes, 20-track
  page): ~1ms execution, all in shared buffer cache. see the sketch
  after this list.
- `TrackResponse.top_likers: list[LikerPreview]` added; threaded
through every list endpoint that already batches aggregations
(for_you, tracks listing, tracks /top, tracks /me, tracks /me/broken,
albums listing, users/{handle}/likes, tracks/tags, tracks/shares,
lists/hydration, liked tracks list) plus single-track endpoints
(playback /by-uri, mutations update, mutations restore-record).
- queue and jams serializers continue to skip aggregations per their
existing comments — they pass no `top_likers`, and the field
defaults to `[]`, which the frontend renders as the plain count
(pre-existing behavior).
- `LikerPreview` lives in `utilities/aggregations.py` rather than
`schemas.py` to avoid a circular import (schemas.py imports from
aggregations.py for `CopyrightInfo`).
- tests in `test_aggregations.py`: default limit, custom limit,
ordering by most-recent-first, empty track list, and the
JOIN-on-Artist filter behavior (likers without an artist row are
omitted, matching the existing `GET /tracks/{id}/likes` semantics).
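Sketch of the window query referenced in the first bullet above (model
names assumed; the SQL shape is as described):

```python
from sqlalchemy import func, select

def top_likers_stmt(track_ids: list[int], limit: int = 3):
    rn = func.row_number().over(
        partition_by=TrackLike.track_id,
        order_by=TrackLike.created_at.desc(),
    ).label("rn")
    ranked = (
        select(TrackLike.track_id, Artist.handle, Artist.avatar_url,
               TrackLike.created_at, rn)
        .join(Artist, Artist.did == TrackLike.user_did)  # likers w/o artist row drop out
        .where(TrackLike.track_id.in_(track_ids))
        .subquery()
    )
    # Postgres 15+ turns rn <= limit into a window Run Condition: each
    # partition stops producing rows once the limit is reached
    return select(ranked).where(ranked.c.rn <= limit)
```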
Frontend
- `AvatarStack.svelte` — new purely-presentational component. Props:
`users`, `total`, `maxVisible`, `size`, `borderColor`, `moreHref`,
`onMoreClick`, `avatarHref`, `onAvatarClick`, `ariaLabel`, `class`.
Handles 0-N users, renders +overflow tile as link OR button
depending on the surface, supports fallback initials when
`avatar_url` is null.
- `UserPreview` type added to `types.ts`; matches backend
`LikerPreview` and the atprotofans-derived `Supporter` shape.
- `Track.top_likers?: UserPreview[]` added.
- wired into `TrackItem`, `TrackCard`, and `track/[id]/+page.svelte` —
the existing wrapper keeps the hover-tooltip (desktop) and
bottom-sheet (mobile) behavior on the whole strip; clicking an
individual avatar is intentionally a no-op so the detail sheet is
the canonical "see all likers" path.
- wired into `u/[handle]/+page.svelte` supporter row, replacing the
hand-rolled `.supporter-circle` markup and CSS (~65 lines deleted).
Avatars here DO link to `/u/{handle}` per existing UX; +N links
out to the atprotofans supporter page in a new tab.
- sizing is mobile-first: 20px avatars on mobile tracks, 22px on
desktop tracks, 18px in track cards, 28px/32px on the supporter
row.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The Cmd+K search modal jolted visibly when the user started typing:
hints (~100px) disappeared immediately, nothing rendered during the
150ms debounce window (loading was still false), "no results for X"
briefly flashed, then collapsed again while the fetch was in flight,
then popped open to result height.
two fixes:
1. set `loading=true` synchronously inside `setQuery()` when query>=2
(and semanticLoading when query>=3) so the "no results" branch never
matches during the debounce window before the fetch fires.
2. wrap the body states in `.search-body` with a 104px min-height
(matching the hints' rendered height) and `interpolate-size:
allow-keywords` + `transition: height`. the body no longer collapses
between states, and the growth to result height animates smoothly on
browsers that support interpolate-size (chrome/safari/edge 2024+).
older browsers fall back to instant resize — no regression.
an explicit `.search-progress` placeholder covers the in-between state
when the user has typed 1 char or is waiting on the first response.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
adds a font picker alongside the existing accent color controls. six
options: mono (default), geist, inter, system, georgia, comic sans.
stored in ui_settings JSONB (no migration needed), cached in
localStorage for flash prevention, applied via --font-family CSS var.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This reverts commit 696fffa578155fe931c25fba80027ef885decaca.
increases gradient interpolation from 7 to 16 color stops for finer
transitions. removes brightness(1.08) oscillation from the breathing
animation — it amplified visible banding at color step boundaries.
no SVG noise filters this time.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This reverts commit 19d8c7c40998533eb43dba5e2aaa202fbfff6eb3.
the 135-degree gradient with only 7 color stops and a brightness-oscillating
animation created visible diagonal strips (color banding), especially in
dark/low-contrast weather palettes.
three fixes:
- increase gradient interpolation from 7 to 16 stops
- add SVG fractalNoise dither filter (soft-light blend) on the gradient layer
- remove brightness(1.08) from breathing animation (amplified banding)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the segmented pill control for latest/for-you looked busy and
inconsistent with the rest of the UI. replace it with an inline
cycling button that matches the top tracks period toggle pattern —
tap to cycle between "latest" and "for you".
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: status maintenance — SDK namespace, CDN caching, feed switcher, telemetry incident
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: add TTS audio for status update (generated locally)
gemini-2.5-pro-tts free tier quota was exhausted in CI.
generated locally with OTHER_GOOGLE_API_KEY.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: zzstoatzz <thrast36@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
plyrfm CLI moved from flat commands to noun-first subcommands:
- `plyrfm delete` → `plyrfm tracks delete`
- `plyrfm upload` → `plyrfm tracks upload`
- `plyrfm my-tracks` → `plyrfm tracks my`
also update SDK examples in llms-full.txt for namespace API
(client.search → client.discover.search, etc.)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jetstream ignored `kind=account` events entirely — deactivation left
stale cdn.bsky.app avatar URLs (dead 404s), and reactivation never
refreshed them. identity events also skipped avatar updates.
- handle `account` events in jetstream consumer (dispatch new
`ingest_account_status_change` task)
- on deactivation: clear avatar_url so frontend doesn't show broken img
- on reactivation: fetch fresh avatar from Bluesky profile
- add avatar refresh to `ingest_identity_update` (covers PDS migrations
and handle changes too)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the frontend toggle from #1288 only prevents new page loads from
calling initObservability(). stale cached clients (Cloudflare Pages)
continue hammering POST /logfire-proxy — 3,458 requests in 24 minutes
averaging 1.9s each, saturating the threadpool and causing /tracks/top
to take 10-18s.
guard the backend endpoint directly: return 204 immediately when the
flag is off, so no stale client can reach logfire_proxy().
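the guard, sketched — route path from this commit; the settings flag
and forwarding helper are assumed names:

```python
from fastapi import Request, Response

@router.post("/logfire-proxy/{path:path}")
async def logfire_proxy(path: str, request: Request) -> Response:
    if not settings.browser_observability:
        # stale cached clients get a cheap no-op instead of a threadpool hop
        return Response(status_code=204)
    return await _forward_to_logfire(path, request)  # existing proxy body, assumed
```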
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the logfire browser SDK proxies all browser trace data through the
backend (POST /logfire-proxy/v1/traces) because Logfire requires
server-side auth. the proxy uses run_in_threadpool for a synchronous
HTTP call — under load, this saturates the threadpool and starves
async handlers including DB queries.
adds BROWSER_OBSERVABILITY env var (default: true) exposed via
GET /config. frontend gates initObservability() on this flag.
set BROWSER_OBSERVABILITY=false to disable browser telemetry proxy
and eliminate the proxy load on the backend.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the pool warmup (added in #1025) only opened a single connection.
with pool_size=10, the other 9 connections still hit TCP+SSL setup
on the first burst of requests after deploy. logfire traces show 17
simultaneous connect events taking 1.5-5.5s each during deploys,
causing simple PK lookups to take 12s+ while connections queue up.
fix: warm all pool_size connections concurrently at startup using
asyncio.gather. connections execute SELECT 1 then return to the pool
ready for immediate use. partial failures are logged but don't block
startup.
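sketch, assuming an async SQLAlchemy engine — hold every connection
open before releasing any, so each one actually dials TCP+SSL:

```python
import asyncio
from sqlalchemy import text

async def warm_pool(engine, pool_size: int) -> None:
    async def checkout():
        conn = await engine.connect()
        await conn.execute(text("SELECT 1"))
        return conn

    results = await asyncio.gather(
        *(checkout() for _ in range(pool_size)), return_exceptions=True
    )
    for r in results:
        if isinstance(r, BaseException):
            logger.warning("pool warmup connection failed: %s", r)  # never block startup
        else:
            await r.close()  # returns to the pool, ready for immediate reuse
```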
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
auth.isAuthenticated starts false and flips true after /auth/me
resolves. the probe $effect's else branch fired immediately on
mount (before auth resolved), saw feedMode === 'for-you', and
reset it to 'latest' — clobbering the localStorage-persisted
preference on every page load.
fix: gate the else branch on !auth.loading so it only fires after
auth has actually resolved and the user is genuinely not authenticated.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
three bugs from staging:
1. the probe $effect called forYouCache.fetch(), which synchronously
reads this.loading ($state) — creating a reactive dependency. any
cache state change (e.g. from setTags) re-triggered the probe, and
if tag-filtered results were empty, forYouAvailable flipped to false,
hiding the switcher. fix: use a raw fetch(limit=1) with no reactive
cache reads.
2. tag state wasn't shared between feeds. ForYouCache initialized
activeTags as empty, not from localStorage. switching feeds didn't
sync tags. tags set while in for-you mode weren't persisted. fix:
both caches initialize from localStorage.active_tags; toggleFeed
syncs tags from outgoing to incoming cache; onTagsChange persists
regardless of mode.
3. empty state said "no tracks yet" when tags filtered to zero. fix:
show "no tracks match these tags" when active tags are set.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
two issues from staging review:
1. segmented control used opaque --bg-secondary background and --radius-xl,
visually inconsistent with track items and cards which use translucent
--track-bg/--track-border and --radius-md. switched to match.
2. tag filters were hidden when viewing for-you feed. added optional
`tags` query param to GET /for-you/ — filters candidates to tracks
with at least one matching tag (same inclusive semantics as /tracks/).
ForYouCache now supports setTags(), and the homepage shows tag filters
regardless of feed mode.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the inline "for you" text next to the heading read as a sentence
("latest tracks for you") rather than a toggle. replaced with a
proper segmented control — two pill buttons with clear active/inactive
states using the same color-mix accent pattern as settings theme buttons.
heading simplified to "tracks" with the switcher sitting alongside.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adds a feed mode toggle to the homepage's main infinite-scroll section.
authenticated users with engagement history see a clickable toggle
(same style as the top tracks period toggle) to switch between "latest
tracks" and "for you". unauthenticated users or those without enough
engagement data see no toggle — identical to today.
- new ForYouCache state module ($lib/for-you.svelte.ts) mirroring
TracksCache's interface but hitting /for-you/
- feed mode persisted to localStorage
- tag filters hidden when viewing for-you (backend handles hidden tags)
- infinite scroll dispatches to the active cache's fetchMore()
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove 8 scripts for migrations that completed months ago:
- copy_r2_buckets.py (relay → audio-prod, Nov 2025)
- migrate_r2_bucket.py (same with DB updates)
- migrate_images_to_new_buckets.py (audio → images buckets, Nov 2025)
- migrate_sensitive_images.py (Jan 2026)
- backfill_image_urls.py (Nov 2025)
- backfill_atproto_records.py (Nov 2025)
- backfill_avatars.py (Dec 2025)
- backfill_duration.py (Dec 2025)
Add migrate_cdn_urls.py for the r2.dev → custom domain URL migration.
Dry-run by default, auto-detects environment from DATABASE_URL,
updates tracks/albums/playlists URL columns.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
HEAD requests to R2 custom domains always return cf-cache-status:
DYNAMIC. Use GET to verify real cache status.
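For example (object key is a placeholder):

```python
import httpx

def cf_cache_status(url: str) -> str:
    # HEAD against an R2 custom domain always reports DYNAMIC;
    # a real GET exposes the actual edge cache state
    return httpx.get(url).headers.get("cf-cache-status", "<missing>")

# cf_cache_status("https://audio.plyr.fm/<object-key>")
# -> "MISS" on the first fetch, "HIT" once the edge has cached it
```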
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add CacheControl headers to R2 uploads, consolidate S3 client config
Set Cache-Control: public, max-age=31536000, immutable on all R2 uploads
(audio, images, thumbnails). Objects are content-hashed so they never
change — this tells Cloudflare's CDN and browsers to cache aggressively.
Also consolidate the S3 client connection config into _s3_client() helper
method. The same 5-line endpoint/credentials block was repeated 9 times.
Now it's one method, making an S3/R2 swap a one-line change.
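Roughly this shape (a sketch — the real helper is a method on the
storage class and reads its config from settings):

```python
import boto3

def make_r2_client(endpoint_url: str, key_id: str, secret: str):
    # the consolidated endpoint/credentials block, written once
    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=key_id,
        aws_secret_access_key=secret,
    )

def upload_immutable(client, bucket: str, key: str, data: bytes, content_type: str) -> None:
    client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ContentType=content_type,
        # content-hashed objects never change: let the edge cache and
        # browsers keep them for a year without revalidation
        CacheControl="public, max-age=31536000, immutable",
    )
```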
Prep for switching from r2.dev URLs (no CDN caching) to custom domains
(audio.plyr.fm, images.plyr.fm) which are already provisioned.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update R2 references for custom domain CDN migration
Replace r2.dev URLs with custom domain URLs (audio.plyr.fm,
images.plyr.fm) in public docs, internal docs, and config examples.
Drop "R2" from "R2 CDN" references — the CDN is Cloudflare's edge
cache, not R2 itself.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add fetch_list_item_uris() to _internal/atproto/records/fm_plyr/list.py
— fetches an ATProto list record and returns ordered item URIs. Replaces
5 copy-pasted fetch-then-extract blocks across playlists, albums, and
recommendations.
Add hydrate_tracks_from_uris() to api/lists/hydration.py — loads tracks
by AT-URI, batch-aggregates like/comment counts, resolves liked state,
returns ordered TrackResponses. Collapses the identical ~35-line hydration
block duplicated between get_playlist and get_playlist_by_uri.
playlists.py: 952 → 843 lines. Six unused imports removed as a side
effect (the hydration helper absorbed them).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: decompose lists.py and albums.py into subpackages, fix PDS URL healing
Split two monolithic API files into subpackages following the existing
api/tracks/ pattern:
- lists.py (1149 lines) → lists/{router,schemas,reorder,resolver,playlists}.py
- albums.py (995 lines) → albums/{router,schemas,cache,listing,mutations}.py
Also moves PDS URL healing from lazy per-request side effects (copy-pasted
in 5 API endpoints) to the jetstream identity event handler, where it
belongs. Identity events fire on both handle changes and PDS migrations,
so resolving the DID there keeps the cached pds_url warm proactively
instead of discovering staleness at request time.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: correct mock targets for decomposed module paths
- AsyncDidResolver: patch at source (atproto_identity.did.resolver)
since ingest.py uses a deferred import
- get_async_redis_client: update to backend.api.albums.cache
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: hoist deferred imports to top-level in decomposed modules
Move ~15 deferred imports to module-level where they don't risk circular
dependencies. The only remaining deferred import in the new packages is
backend.api.tracks.mutations.delete_track (cross-package API call).
Also keeps the AsyncDidResolver import in ingest.py deferred — it's a
heavy external dependency in a background task module.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update mock targets for hoisted imports in album tests
With imports at top-level, mocks must target the importing module's
namespace, not the source module.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Osprey rules engine was never committed (services/osprey/ contained only
__pycache__ artifacts). Remove the stale STATUS.md reference and a
duplicate loq override for u/[handle]/+page.svelte that used a different
glob escape syntax than the rest of the file.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(player): synchronous fast path for auto-advance to survive locked-screen autoplay
Reported in zzstoatzz.io/plyr.fm#1: on Android with the screen locked,
album / playlist playback stops at the end of each track instead of
advancing to the next. The reporter notes it worked in early February.
## Root cause
The chain from `<audio onended>` to `audio.play()` on the next track
goes through ~5 microtask boundaries plus an `await getAudioSource(...)`:
ended → handleTrackEnded → queue.next()
→ $effect: queue → player.currentTrack
→ $effect: load new src (await getCachedAudioUrl, fetch HEAD if gated)
→ audio.src = src; audio.load(); wait for loadeddata
→ $effect: shouldAutoPlay && !isLoadingTrack → player.paused = false
→ $effect: paused-sync → audio.play()
On a foregrounded tab this is milliseconds and works fine. On Android
with the screen locked, Chrome aggressively throttles non-foreground JS
and treats the page as "no longer audible" the moment the previous
track ends. By the time `audio.play()` finally runs, the implicit-
playback grace is gone and the call rejects with NotAllowedError. The
only way to resume is via a Media Session action handler (an explicit
lock-screen button press), which is exactly the workaround the
reporter was using.
This is not a regression from any one commit — the chain has had this
shape since before February. Most likely Chrome on Android tightened
locked-screen autoplay/freeze behavior between then and now, exposing
a long-standing fragility.
## Fix
Three coordinated changes:
1. **`queue.autoAdvanceTrack` getter** — single seam for "what should
natural end-of-track continuation play next". Today returns
`tracks[currentIndex + 1]`. Future continuation strategies (album
tail, feed continuation, recommendations) plug in here.
2. **Next-track prefetcher** — `resolveAudioSource` (extracted to
`lib/audio-source.ts`) returns a structured `ResolvedSource`
discriminator (ready / gated-denied / failed). A `$effect`
opportunistically resolves `queue.autoAdvanceTrack` while the
current track plays and stores the result in `preloadedNext`.
IndexedDB cache lookup and gated HEAD check move out of the
critical path.
3. **Synchronous fast path in `handleTrackEnded`** — when the
prefetcher has a ready source for the next track and we're not in
jam mode, swap `audio.src` and call `audio.play()` in the same tick
as the `ended` event. Reactivity (queue.next, player.currentTrack)
updates AFTER, so the autoplay grace is preserved. Pre-bumping
previousTrackId/previousFileId/previousQueueIndex before
`player.currentTrack = next; queue.next()` keeps downstream
effects no-ops; without it the queue→player sync effect's
`indexChanged` branch would seek the just-started audio back to 0.
When the preload isn't ready (race, jam active, gated denial), we
fall back to the existing reactive chain — same behavior as today.
Plus structured telemetry (`recordPlaybackRejection`) logging
errorName, visibilityState, audio.readyState, fast-path flag, and
preload state so we can confirm in production whether the fast path
actually dodges the autoplay block per browser bucket.
## What this PR does NOT do
- Does not change collection needle-drop semantics. Album/playlist
row clicks still call `queue.playNow(track)` and discard collection
context — separate problem. The new `autoAdvanceTrack` getter is
the seam where a future "soft context" continuation strategy plugs in.
- Does not refactor `TrackItem.svelte`'s `$effect.pre` reset block
or other pre-existing patterns. Scoped to the auto-advance chain.
## Validation
- `just frontend check`: 0 errors / 0 warnings.
- Reviewed via `svelte:svelte-file-editor` agent — confirmed prefetch
effect's reactivity (correct), fast-path state-write ordering
(correct, with comment-strengthening applied), and blob-URL
accounting (correct across both paths).
- `lib/audio-source.ts` extracted so Player.svelte's growth reflects
the actual fast-path/prefetch substance, not pure helpers that could
live elsewhere.
## Test plan
- [x] svelte-check clean.
- [ ] After deploy: reproduce on Android (screen locked) with an album
that has 3+ tracks; confirm auto-advance works end-to-end.
- [ ] Confirm desktop foreground playback unchanged.
- [ ] Confirm gated-track skipping still works (denial via prefetch
consumes the cached entry; active gated denial still triggers
the toast).
- [ ] After 24h on prod: query logfire for `audio play() rejected`
events; analyze fast-path vs slow-path rejection rates per
`error.name` and `document.visibility_state` bucket.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(player): preserve auto-advance through gated tracks; fix telemetry pollution
review feedback on #1339:
1. **gated auto-advance no longer kept playing.** my parameterized
`handleGatedDenial(err, fromAutoAdvance)` was ALWAYS called with
`false`, including from the loader effect when consuming a cached
`gated-denied` preload. so after `handleTrackEnded` set
`shouldAutoPlay = true` and `queue.next()` advanced into the
gated-denied track, `handleGatedDenial` clobbered shouldAutoPlay
back to false before `queue.goTo(nextPlayable)` — playback
stopped instead of skipping the gated track and continuing.
pre-fast-path code unconditionally set `shouldAutoPlay = true` in
this branch.
fix: drop the `fromAutoAdvance` parameter; always intend to
auto-play after a gated skip. matches pre-PR behavior. whether
the user clicked a gated track or auto-advance landed on one,
the user wants the next playable track to start.
2. **fallback telemetry was polluting the rejection metric.**
`recordAutoAdvanceFallback` emitted via `recordPlaybackRejection`,
whose event name is `audio play() rejected`, even though no
`play()` had been attempted on the slow path at that point. any
dashboard query filtering on that event name would have counted
slow-path-fallback markers as play rejections.
fix: drop `recordAutoAdvanceFallback` entirely. instead, instrument
the existing slow-path `play().catch(...)` site (which previously
only `console.error`'d) with `recordPlaybackRejection({fastPath:
false, ...})`. now BOTH paths emit the same event, and the
`playback.fast_path` field is the genuine discriminator for
comparing rejection rates between fast and slow paths. that's the
actual question the telemetry was trying to answer.
svelte-check: 0 errors / 0 warnings.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* chore(player): drop dead frontend telemetry plumbing
review feedback: I was writing comments and commit copy that referenced
"dashboards" for fast-vs-slow path comparison. There are no dashboards.
Frontend logfire is config-flagged off (`config.browser_observability`)
because it was destabilizing the backend; nobody is querying frontend
spans. So `recordPlaybackRejection` was emitting `logfire.info` against
an unconfigured client — net effect: dead code with imaginary purpose.
Removed:
- `recordPlaybackRejection` + `PlaybackRejectionContext` from
`lib/observability.ts`. `initObservability` itself stays — fetch /
XHR auto-instrumentation is the part that DOES propagate trace
headers to the backend, and that's still useful when the flag is on.
- Both call sites in Player.svelte (slow-path and fast-path
`play().catch(...)`) now `console.error` the same way the rest of
the file already did. If a user reports lock-screen playback
trouble, the actual debug pathway is "ask them to repro in
devtools and capture the console."
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
`test_cross_user_like` flakes intermittently in the staging integration
suite because of a real race in the like → unlike sequence:
1. user clicks LIKE → DB INSERT row R (atproto_like_uri=NULL),
`pds_create_like(R.id)` enqueued via docket.
2. user clicks UNLIKE before pds_create_like runs. atproto_like_uri
is still NULL so we just DELETE R; no PDS-delete is scheduled
because there's no URI yet.
3. `pds_create_like(R.id)` finally runs:
a. PDS create returns URI X.
b. SELECT R.id → row gone → orphan-cleanup branch fires.
c. `delete_record_by_uri(X)` is scheduled.
4. Jetstream emits the `app.bsky.feed.like` create event for X
BEFORE the matching delete event from (3c) propagates.
5. `ingest_like_create` finds no existing row for (track, user)
→ INSERTS a fresh row with URI X. **the like just resurrected
itself after the user explicitly unliked.**
6. eventually the delete event arrives and `ingest_like_delete`
by URI X clears the resurrected row — but in the gap the user
sees their unlike undone.
Fix: in (3c), tombstone the URI in Redis with a 5-minute TTL BEFORE
issuing the orphan PDS delete. `ingest_like_create` checks the
tombstone and drops the matching create event in (5). The TTL only
needs to cover Jetstream propagation; expiry is harmless because the
matching delete event still arrives shortly after.
Why Redis tombstone over a `cancelled_at` schema column: no migration,
no read-path filtering across ~15 query sites, scoped fix to the two
files actually involved in the race. Local Redis blip falls back to
the existing Jetstream-delete cleanup; user briefly sees the ghost
like but it's cleared seconds later.
Mirrors the existing track-tombstone pattern in `ingest.py` (which
prevents ghost tracks from cursor rewind) — same Redis primitive,
different prefix (`like_cancelled:` vs `plyr:tombstone:`) reflecting
the different concern (write race vs replay race).
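The primitive itself is tiny — a sketch (prefix from above; helper
names illustrative):

```python
import redis.asyncio as redis

TTL_SECONDS = 300  # only needs to outlive Jetstream propagation
PREFIX = "like_cancelled:"

async def tombstone_cancelled_like(r: redis.Redis, uri: str) -> None:
    # called in pds_create_like's orphan branch (3c), BEFORE scheduling
    # delete_record_by_uri(X)
    await r.set(f"{PREFIX}{uri}", "1", ex=TTL_SECONDS)

async def like_was_cancelled(r: redis.Redis, uri: str) -> bool:
    # checked by ingest_like_create (step 5) before inserting a row
    return await r.exists(f"{PREFIX}{uri}") == 1
```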
Tests:
- tests/test_pds_create_like_tombstone.py — pds_create_like writes
the tombstone in the orphan branch and NOT on the happy path
(which would otherwise stall the user's own like indefinitely).
- tests/test_jetstream.py::TestIngestLikeCreate::test_skips_create_for_cancelled_uri
— ingest_like_create drops the create event when the URI is
tombstoned.
447/447 backend tests pass; ruff + ty clean.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the player bar already falls back to `track.album?.image_url` when the
per-track image is unset, but the track detail page, track-list items,
track grid cards, and the embed surface all rendered a placeholder
instead. result: the same track shows artwork in the player and a
blank in every other surface, including its own detail page.
extracted the inheritance rule into `lib/track-cover.ts`
(`trackCoverUrl` + `trackThumbnailUrl`) so every cover-rendering
surface routes through the same helper. semantically this models
the relationship correctly — the album HAS the art, the track
INHERITS unless it sets its own — instead of denormalizing the
album cover into each track row, which would silently go stale if
the album cover ever changed.
side benefit: the recent /tmp upload bug (#1336) orphaned 3 tracks
with `image_id IS NULL` while their album record kept its cover.
those tracks now render the album cover at view time without any
DB backfill, and without needing the artist to re-upload.
surfaces touched:
- routes/track/[id]/+page.svelte — visible cover + og:image cascade
both routed through the helper; previewIsTrackArt simplifies to
`coverUrl !== undefined`
- lib/components/TrackItem.svelte — list item (used in album page,
my tracks, search results, etc.)
- lib/components/TrackCard.svelte — grid card
- routes/embed/track/[id]/+page.svelte — third-party embed (bg blur,
desktop side art, mobile art card all share the same coverUrl)
ATProto track records are unchanged: artists who didn't upload a
per-track image still don't claim one in their portable record.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(uploads): stage audio + image to shared storage before enqueueing docket
PR #1331 moved POST /tracks/ + PUT /tracks/{id}/audio onto docket
to fix a connection-pool problem, but mechanically forwarded the same
request-handler `/tmp/...` paths over Redis. on production fly.io,
`relay-api` runs multiple machines per process group; the docket worker
frequently lands on a different machine than the request handler. that
machine has its own /tmp, so the upload silently fails:
`FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpXXXX.wav'`.
evidence (prod, 2026-04-25 darkhart.bsky.social, 7 jobs):
4 failed at varied phases (`upload`, `pds_upload`, `atproto`) — all with
the same FileNotFoundError. the 3 that succeeded all hit the same
`atproto` phase. pure luck of which worker grabbed the job. the
successful tracks also had `image_id IS NULL` in `tracks` because
`_save_image_to_storage` reads `image_path` and silently swallows the
exception (returns `(None, None, None)` on failure). that's the
"cover art shows in the player bar but not on the track page" symptom.
shape of the fix:
HTTP handler:
1. stream client upload to a request-local temp file (size enforce)
2. extract duration once, while bytes are still local
3. `storage.save(file, filename)` -> audio_file_id
4. stream image to memory, `storage.save` -> image_id, image_url, thumb_url
5. delete request-local temp file
6. enqueue docket task with file_id / image_id / URLs ONLY
worker (`run_track_upload`, `run_track_audio_replace`):
- signatures take `audio_file_id`, never a `*_path`
- `_validate_audio` reads duration from the context (no I/O)
- `_store_audio` reuses the staged id directly for web-playable
formats; for lossless, downloads from storage, transcodes via a
worker-local /tmp (single-task, never crosses machine boundary),
saves transcoded result back to storage
- `_upload_to_pds` downloads bytes from storage when not transcoded
- `_store_image` is a no-op forward (URLs already resolved in handler)
this preserves PR #1331's connection-pool win (handler returns once
storage is durable + docket task is enqueued) and removes the
multi-machine fragility entirely.
- drops aiofiles use on this path; uses `storage.get_file_data`
- removes the temp-file cleanup in `_process_upload_background` —
there's nothing local to clean
- audio_replace handler also captures `support_gate` up front so the
staged bytes land in the right bucket (private vs public) before
the worker sees them
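handler-side, the staging order looks roughly like this (helper names
are stand-ins; image staging and size enforcement elided):

```python
import os
import tempfile

from fastapi import UploadFile

async def stage_and_enqueue(upload: UploadFile, storage, schedule) -> None:
    # 1. stream the client upload to a request-local temp file
    fd, tmp_path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            while chunk := await upload.read(1 << 20):
                f.write(chunk)
        # 2. extract duration once, while the bytes are still local
        duration = probe_duration(tmp_path)  # hypothetical ffprobe wrapper
        # 3. stage audio to shared storage -> durable audio_file_id
        with open(tmp_path, "rb") as f:
            audio_file_id = await storage.save(f, upload.filename)
    finally:
        os.unlink(tmp_path)  # 5. nothing in /tmp survives the request
    # 6. enqueue with ids/URLs ONLY — no /tmp path ever crosses the
    # Redis boundary to a worker on another machine
    await schedule(audio_file_id=audio_file_id, duration=duration)
```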
regression coverage:
the structural change (`UploadContext` no longer has `file_path`,
docket task signatures no longer have `*_path` args) is the contract.
existing tests (`test_upload_session_reload`, `test_upload_phases`,
`track_audio_replace/test_pipeline.py`) exercise the orchestrator
end-to-end through the new context shape and pass green (46 tests).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(uploads): clean up staged storage on handler-side + pre-DB worker aborts
addresses three orphan-cleanup gaps reviewer flagged on the staging refactor:
1. **handler-side**: any abort between `stage_audio_to_storage` and a
successful schedule call left staged storage objects orphaned and
the job stuck in PROCESSING. wrap staging+enqueue in try/except;
on failure delete staged audio (private if gated, public otherwise)
and image, mark the job FAILED.
2. **replace orchestrator**: `new_file_id_for_rollback` was None until
`_store_audio` returned. the gated-FLAC path (handler stages new
bytes to private bucket → `_store_audio` raises "supporter-gated
tracks cannot use lossless formats yet") left those bytes stranded.
initialize from `ctx.audio_file_id` upfront, thread the playable-
file extension through `_rollback_new_files`. add `is_gated: bool`
to ReplaceContext (handler-time decision) so rollback selects the
bucket the bytes ACTUALLY live in even under a concurrent PATCH
that flips support_gate between request and worker.
3. **upload orchestrator**: phases 1-5 raise UploadPhaseError without
releasing staged bytes. add `_cleanup_staged_media_pre_db` and a
`db_row_owns_media` boundary flag — orchestrator cleans up only
before `_create_records`, deferring to its existing reserve-then-
publish cleanup past that. covers the transcoded-sibling case.
session-expired path on both workers also deletes the staged bytes
(no recovery without a fresh sign-in; orphans serve nothing).
regression tests:
- `tests/api/test_upload_storage_cleanup.py` (4 tests)
- `track_audio_replace/test_pipeline.py` (1 test):
early-abort rolls back staged file from the right bucket per
`ctx.is_gated`
370/370 tests pass locally; ruff + ty clean.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* chore: drop stray backend/loq.toml
* chore(uploads): consolidate cleanup helper, drop redundant deferred import
once-over after CI green:
- removed redundant `from backend._internal import get_session` deferred
re-import inside `_process_upload_background` — the symbol is already
imported at module scope. updated `test_upload_session_reload` to
patch where the symbol is used (`backend.api.tracks.uploads.get_session`)
rather than where it's defined, which is the right pattern anyway.
- audio_replace's handler + session-expired path were inlining the
same `delete_gated if gated else delete` pattern that uploads exposes
as `_delete_staged_audio`. import + reuse instead of duplicating.
no behavior change; 370/370 tests pass.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
12 concurrent uploads targeting the same album (artist_did, slug) raced
in `get_or_create_album`: the losers caught IntegrityError and called
`db.rollback()` on the caller's shared AsyncSession. under concurrent
load this left 2/12 uploads blowing up with MissingGreenlet on the very
next pool checkout, ~300ms after INSERT albums — observed on stg during
the 12-chromatic-drone smoke test (2026-04-24).
replace SELECT-then-INSERT-then-catch with a single
`INSERT ... ON CONFLICT DO NOTHING RETURNING`. the race resolves at the
DB level, no rollback on a shared session, no churn on pool state.
regression test fires 12 concurrent `get_or_create_album` calls on
separate sessions with the same title and asserts exactly 1 row, 1
`created=True`, and all callers agree on the resulting album id.
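shape of the new statement (assumed `Album` model with a UNIQUE
(artist_did, slug) constraint):

```python
from sqlalchemy import select
from sqlalchemy.dialects.postgresql import insert

async def get_or_create_album(db, artist_did: str, slug: str, title: str):
    stmt = (
        insert(Album)
        .values(artist_did=artist_did, slug=slug, title=title)
        .on_conflict_do_nothing(index_elements=["artist_did", "slug"])
        .returning(Album.id)
    )
    album_id = (await db.execute(stmt)).scalar_one_or_none()
    if album_id is not None:
        return album_id, True  # this caller won the INSERT
    # DO NOTHING yields no RETURNING row for the losers — no exception,
    # no rollback on the shared session; just SELECT the winner's row
    result = await db.execute(
        select(Album.id).where(Album.artist_did == artist_did, Album.slug == slug)
    )
    return result.scalar_one(), False
```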
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the POST /tracks/ and PUT /tracks/{id}/audio handlers used
`fastapi.BackgroundTasks.add_task`, which runs the task within the
same ASGI request lifecycle after the response is sent. consequence:
any request-scoped DB session stays checked out of the pool until the
task finishes (20-100s per upload), and nothing bounds concurrency.
today flo.by uploaded 6 tracks in a single album-create fan-out. six
concurrent uploads held six of the 10 pool slots for over a minute
and starved every other request (/auth/me p95 hit 9.7s, /health 3s).
root cause: this pattern was in place from the very first streaming-
uploads commit (26a48c75, Nov 2025). docket landed a month later and
all post-upload tasks were migrated piecemeal (copyright, embedding,
genre, image moderation, atproto sync, teal, export, pds backfill)
but the upload orchestration itself never was. audio replace (#1311,
Apr 2026) copied the same pattern.
changes:
- uploads.py: add run_track_upload (docket task, primitives only,
rehydrates session, delegates to existing _process_upload_background)
+ schedule_track_upload helper
- audio_replace.py: same trio for replace
- handlers: drop `background_tasks: BackgroundTasks` param, call
await schedule_* instead
- _internal/tasks/__init__.py: register both tasks in the docket list
- test_endpoint.py: patch the scheduler helper, not the orchestrator
- tests/integration/test_album_upload.py: add
test_album_upload_10_tracks_concurrently as regression coverage —
fires 10 concurrent uploads through an album and asserts all complete
- loq.toml: relax limits on uploads.py + audio_replace.py to cover the
new wrapper functions
the existing orchestrators (_process_upload_background,
_process_replace_background) keep the same signature so every pipeline
test that drives them directly continues to pass unchanged.
buys us:
- HTTP handler returns in <1s; request-scoped DB session released on
response instead of 100s later
- per-op DB sessions via db_session() inside the task, not held across
the whole upload
- bounded concurrency via settings.docket.worker_concurrency (default
10/worker x 2 prod machines = 20 concurrent uploads, rest queue in
Redis rather than saturating the pool)
- fresh session rehydration if OAuth refreshed between queue and task
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the component used a CSS overlay with `z-index: 1000` and manual ESC /
focus-trap plumbing, while every other sheet / modal in the codebase
(AudioRevisionsSheet, LikersSheet, LogoutModal, SearchModal,
PdsMigrationModal, FeedbackModal, Toast, TermsOverlay) uses
`z-index: 9999`. opening a confirm from *inside* one of those sheets —
specifically "restore" inside the audio version-history sheet —
rendered the confirm behind the sheet, forcing the user to dismiss the
sheet before they could click confirm.
bumping the z-index to 10000 would have been whack-a-mole. using the
native <dialog> element with `.showModal()` puts the dialog in the
browser's top layer, which stacks above every other element on the
page regardless of z-index. by construction, nested modals work.
secondary benefits from switching to the platform primitive:
- focus trap, aria-modal, ESC handling all native — removed our
reimplementations
- ::backdrop pseudo-element for backdrop styling
- role="alertdialog" for semantic correctness on confirmation prompts
- oncancel handler blocks ESC-dismiss while an async confirm is in
flight (pending=true), so the user can't dismiss a pending operation
mid-run and leave parent state inconsistent with UI state
public API of the component is unchanged — both existing callsites
(replace-audio confirm + restore-revision confirm in portal/+page.svelte)
continue to pass `open={...}` one-way and manage close via `onCancel`.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
follow-up to #1326 — the × button that clears a selected file in the
audio-replace row is the only remaining <button> in that group that
lacked `font-family: inherit`. it currently renders only an SVG icon
so there's no visible font right now, but matching the sibling buttons
keeps the group consistent and protects against future "what if we add
a tooltip text" changes.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the "version history" button rendered in the browser default sans-serif
instead of the user's selected global font (mono by default). <button>
elements don't inherit font-family by default, so an explicit
`font-family: inherit` is required. matches the pattern already used
in login, tag, and track routes.
also added to .audio-replace-btn (same root cause; not visible in the
current screenshot because it only appears after a file is selected)
and .audio-upload-btn (it's on a <label> so inherits implicitly, but
adding for consistency and to protect against future markup changes).
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the restore path used to strip audioBlob from the republished record
whenever the PDS had already GC'd the revision's original CID, silently
downgrading the track to audio_storage="r2". plyr.fm's core promise is
that users own their audio on their PDS — dropping the blob ref would
break that promise.
new behavior when PDS returns BlobNotFound on the first publish:
1. fetch the R2 bytes via storage.get_file_data(file_id, file_type)
2. upload them to the user's PDS to mint a fresh blob CID
3. republish the record with the fresh audioBlob ref
4. commit the track with audio_storage="both" + the new CID
fallback chain (rare): if R2 is also missing the bytes, or the PDS
rejects the re-upload (oversize, transient), we keep the old behavior
— republish without audioBlob and downgrade to r2-only. restore still
completes; playback keeps working via audio_url.
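the retry, roughly (pds/storage helper names are stand-ins, not the
real call sites):

```python
class BlobNotFoundError(Exception):
    """stand-in for the PDS's BlobNotFound error surface."""

async def publish_restored_record(pds, storage, track, record: dict) -> None:
    try:
        await pds.put_record(track.atproto_uri, record)
    except BlobNotFoundError:
        # 1. fetch the R2 bytes
        data = await storage.get_file_data(track.file_id, track.file_type)
        # 2. re-upload to the user's PDS to mint a fresh blob CID
        blob_ref = await pds.upload_blob(data, mime_type=track.mime_type)
        # 3. republish with the fresh audioBlob ref
        record["audioBlob"] = blob_ref
        await pds.put_record(track.atproto_uri, record)
        # 4. committing audio_storage="both" + the new CID happens
        #    downstream (elided), as does the R2-miss fallback chain
```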
verified via smoke test on stg.plyr.fm (track 2202) before the fix:
post-restore PDS record had no audioBlob, DB had audio_storage="r2",
pds_blob_cid=null. with this patch, the restored record carries a
first-class PDS blob ref again.
tests:
- rewrote test_restore_falls_back_when_pds_blob_gc →
test_restore_reuploads_blob_when_pds_gc: asserts the retry record
carries the re-uploaded ref and DB ends with audio_storage="both"
- added test_restore_falls_back_to_r2_when_reupload_also_fails: covers
the R2-miss path (retained fallback behavior)
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the CF Pages frontend build config described in environments.md was
stale. verified against the live project config on each recreate today:
- build command: `cd frontend && bun run build`
→ `cd frontend && bun install && bun run build`
(SKIP_DEPENDENCY_INSTALL=1 is set to skip CF's auto-install, so the
build command has to run `bun install` itself)
- build output: `frontend/build`
→ `frontend/.svelte-kit/cloudflare`
(matches `pages_build_output_dir` in `frontend/wrangler.toml`)
- env vars list: added `SKIP_DEPENDENCY_INSTALL=1` which was missing
- prod custom domain line: added `www.plyr.fm` alongside `plyr.fm`
same fixes applied to both prod and staging subsections.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
reverting the retry-poll I added in #1320 after investigating properly.
the test failure in the post-#1319 integration run was NOT flakiness —
it's a real race condition in the like → pds_create → jetstream ingest
pipeline. tracked in #1321.
a retry-poll would have papered over a real bug and made future
diagnosis harder (\"oh the test just takes 5s sometimes\"). reverting
to the original assertion so the failure remains visible until the
underlying race is fixed.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* test(integration): retry-poll the liked_tracks check in test_cross_user_like
failed in the #1319 post-deploy integration run: the liked_tracks list
was still showing the track immediately after unlike. pre-existing
eventual-consistency gap — the likes pipeline has a small lag between
the unlike write and the liked list read (cache / read-replica).
matches the pattern test_upload_searchable already uses for similar
eventually-consistent reads: retry up to 5 times with 1s sleep, fail
with a clear message if the track is still there.
* fix(restore): fall back when PDS has GC'd old blob
caught by manual staging smoke: when restoring a revision that had
audio_storage="both" with a PDS blob ref, the restored PDS record was
being published WITHOUT an audioBlob field, silently dropping the user's
PDS-hosted copy of the audio.
root cause: the restore code explicitly passed `audio_blob=None` to
build_track_record with a misleading comment claiming "PDS blob not
re-uploaded on restore". the comment was right about the blob bytes
(they're already on PDS), but the BLOB REF must still be included in
the new record — PDS records can reference pre-uploaded blobs.
fix: if the revision carries a pds_blob_cid, construct a BlobRef
(using the stored size + the file_type's mime type) and pass it
through to build_track_record. PDS records now keep their audioBlob
field through the full replace → restore round trip.
also adds a regression test that:
- sets up the same scenario the smoke hit (both → replace → restore)
- asserts the published record contains audioBlob pointing at the
original ref
- asserts the live track row keeps audio_storage="both" and the
correct pds_blob_cid / pds_blob_size after restore
note: if the user's PDS has already GC'd the old blob, the record is
still valid — playback falls back to audio_url (R2). we don't re-upload
the blob as part of restore; that would require hauling bytes through
the backend and is out of scope for v1.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* feat: audio revisions with confirm-before-replace and restore
closes the UX loop on the audio-replace feature shipped in #1311-1313.
two changes shipped together:
1. **confirmation gate** before audio replace fires. picking a file no
longer kicks off the irreversible upload — clicking "replace audio"
now opens a confirm dialog. addresses Alex's report that hitting
"cancel" after picking a file did not roll back the replace (because
nothing actually fired until "replace audio" was clicked, but the
coupling between picker and that button was confusing).
2. **track_revisions table** + restore endpoint + version-history sheet.
every audio replace snapshots the displaced audio into a TrackRevision
row in the same DB transaction as the swap. column names are
provider-neutral (audio_url, not r2_url) so swapping blob providers
later doesn't leave cruft behind. retention cap is 10 per track —
pruning deletes the backing blob if no other row still references
it. PDS-only audio is never deleted (user owns those blobs).
restore is an instant pointer-swap: the chosen revision becomes the
live audio, the displaced current is snapshotted into a new revision
row, and the chosen revision row is deleted (its content is now
current). PDS record is republished as part of the same flow — non-
negotiable so the user's PDS stays in sync with plyr.fm state.
restore is rejected with 409 if it would cross the public ↔ gated
boundary — moving blobs between buckets isn't built yet, and serving
gated audio from the public bucket would defeat the gate.
the version-history surface is a bottom-sheet on mobile / centered
modal on desktop, modeled on LikersSheet. trigger lives in the audio
file section of the track edit form. each row shows format,
relative time, duration, storage location, and a restore button.
new endpoints:
- GET /tracks/{id}/revisions
- POST /tracks/{id}/revisions/{revision_id}/restore
new components:
- ConfirmDialog.svelte — generic alertdialog (used for replace + restore)
- AudioRevisionsSheet.svelte — mobile-first version-history surface
related: #1314 (orphan R2 files) — revisions give R2 files an owner,
which removes the orphan path. #1315 (in-flight tasks writing stale
results) is orthogonal and not addressed here.
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* test: integration coverage for audio revisions + restore
three end-to-end tests against staging (skip when PLYR_TEST_TOKEN_* unset):
- replace_audio_creates_revision — upload, replace, verify history holds
exactly one row capturing the displaced original
- restore_swaps_audio_and_rotates_revision — upload, replace, restore;
live audio is back to the original, chosen revision row is gone, the
displaced post-replace audio is now in history
- non_owner_cannot_list_or_restore — user2 gets 403 on both list and
restore against user1's track
each test cleans up via the SDK's delete(). new endpoints aren't in the
SDK yet, so raw httpx is used for replace + revisions/restore.
these will run automatically after the PR merges and staging deploys
(the integration-tests workflow fires on deploy staging completion).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
"current: m4a" alone is too terse — the user has no sense of where the
audio lives or whether it was transcoded. swap to:
current: m4a · stored on your PDS
current: mp3 (transcoded from flac) · stored on plyr.fm
we don't have the original filename to show (it's content-hashed away
at upload), so format + storage location is the most useful signal.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
UI for the new PUT /tracks/{id}/audio endpoint shipped in #1311. Lets
artists swap a track's audio without deleting + re-uploading (and losing
likes / comments / plays / the URL).
how it's wired:
- uploader.replaceAudio(trackId, file, title, onComplete) mirrors the
existing uploader.upload XHR + SSE flow but PUTs to a track-specific
endpoint with just the file. progress is surfaced via the same toast
pattern as the initial upload.
- portal/+page.svelte edit form gains an "audio file" section next to
the existing "artwork" section. picker → "replace audio" button →
picker clears immediately and the SSE flow continues in the toast.
- Player.svelte's track-load $effect now also fires on file_id change,
not just track id change. so when the currently-playing track gets a
new audio file, the <audio> element src reloads in place. on
successful replace, we fetch the fresh track row and reassign
player.currentTrack so the effect picks up the new file_id.
deliberately separate from the metadata "save changes" flow because
the upload + transcode + PDS write can take 30s+ and has its own SSE
progress; conflating it with the fast PATCH would block the form.
manual smoke depends on backend being deployed to staging.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
* feat(backend): replace audio on existing track via PUT /tracks/{id}/audio
artists currently delete + re-upload to fix bad audio (logfire shows
darkhart.bsky.social did this 3× in ~65min). that loses likes, comments,
plays, and the track URI. add an endpoint that swaps the audio bytes while
keeping the track's stable identity intact.
orchestration is atomic with rollback:
1. validate + store new audio (R2; transcode if lossless)
2. upload to PDS (best-effort, falls back to r2-only on size limit)
3. PUT updated ATProto record (URI stable, new CID)
4. DB row swap in single tx — file_id, r2_url, atproto_record_cid, duration,
pds_blob_*, audio_storage; clears stale genre_predictions provenance
5. delete old R2 object only on success
6. fire post-replace hooks: invalidate old CopyrightScan rows, re-fire
copyright/embedding/genre tasks; never re-notify followers
7. resync album list record so its strongRef carries the new track CID
if step 3 fails, rollback deletes the just-written R2 file and leaves the
track row untouched.
reuses upload phase helpers (_validate_audio, _store_audio, _upload_to_pds)
so the transcode/PDS-blob/gating logic stays in one place.
intentional non-changes:
- labeler labels on the (URI-stable) track are NOT auto-dismissed — that's
a moderation call left for manual review
- likes/playlists/comments retain stale strongRef CIDs; this is the same
CID-churn behavior that PATCH /tracks/{id} produces today
also fixes two pre-existing test failures uncovered while building this:
- conftest pg_trgm extension only created in xdist template path
- moderation report tests leaked rate-limit budget across the session
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
* fix(audio-replace): tighten rollback scope + handle gated bucket
addresses two review findings on #1311:
1. **post-commit failures triggered rollback (P1)**. the previous orchestrator
wrapped both pre- and post-commit work in one try block. if any side
effect after `_commit_db_swap` raised (post-replace hooks, album resync,
cache invalidation), the except path would delete the new R2 file even
though the track row + ATProto record were already pointing at it —
leaving production with a 404 for the freshly-replaced audio.
split into two phases: pre-commit may rollback; post-commit failures are
logged and swallowed (the swap stands). each post-commit side effect
gets its own try/log so one failure doesn't skip the others (shape
sketched after this list).
2. **gated tracks leaked private-bucket objects (P2)**. `R2Storage.delete()`
only probes the public audio + image buckets, so cleanup and rollback
silently no-op'd on supporter-gated tracks (which live in
`private_audio_bucket_name`).
added `delete_gated()` to `R2Storage` + `StorageProtocol` (mirrors
`delete()`'s refcount guard and key probing, against the private bucket).
`_cleanup_old_files` and `_rollback_new_files` now route based on the
track's `support_gate`. also fixes the same pre-existing leak for
gated tracks deleted via the API today (separate latent bug, but the
primitive is now there).
3. **defensive metadata refresh before publish**. a concurrent PATCH that
landed between `_load_and_authorize` and `_publish_record_update` would
have its title / album / features clobbered by the stale snapshot. now
re-loads the row right before building the new ATProto record.
4. **hoist deferred imports** in audio_replace.py + storage/r2.py per the
project's "no unnecessary deferred imports" rule (CLAUDE.md). the
`backend.api.albums` import doesn't have a real circular dep — i'd
copied the pattern from mutations.py without checking.
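The two-phase shape from (1), sketched with stand-in helper names:

```python
import logging

logger = logging.getLogger(__name__)

async def _orchestrate_replace(track, ctx) -> None:
    # --- pre-commit: any failure rolls back the new files ---
    try:
        await _store_audio(ctx)
        await _upload_to_pds(ctx)
        await _publish_record_update(track, ctx)
        await _commit_db_swap(track, ctx)
    except Exception:
        # bucket-aware per the track's support_gate (delete vs delete_gated)
        await _rollback_new_files(track, ctx)
        raise
    # --- post-commit: the swap stands; log failures and keep going ---
    for side_effect in (_fire_post_replace_hooks, _resync_album_list):
        try:
            await side_effect(track, ctx)
        except Exception:
            logger.exception("post-commit side effect failed; swap stands")
```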
new tests:
- post-replace hook failure does NOT roll back the new file
- album list sync failure does NOT roll back the new file
- gated track success path uses `delete_gated` for old file
- gated track rollback uses `delete_gated` for new file
- concurrent PATCH title is reflected in the published ATProto record
full xdist suite: 815 passed (was 810 + 5 new tests).
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
the Cmd+K modal was firing keyword and semantic (mood) searches in
parallel whenever the vibe-search flag was on, then merging both lists
by score (#858). BM25 relevance and cosine similarity are on different
scales — the sort produced jarring interleaves where mediocre semantic
matches outranked solid keyword hits.
revert to the #848 interaction model: explicit mode toggle, one mode at
a time. keyword is the default. flagged users can flip to mood when
they want it; the toggle is hidden for everyone else so there's no
change for non-flagged users.
- search.svelte.ts: add mode state, setMode(), stale-mode guards on
in-flight fetches. activeResults returns just the active mode's list.
drop dedupedSemanticResults / semanticResultIds / semanticSimilarityMap.
- SearchModal.svelte: render a small keyword/mood chip toggle below the
input, gated on search.semanticEnabled. placeholder copy follows the
active mode. mood similarity % only renders in semantic mode.
arc for the record: toggle (#848) → parallel + separator (#851) →
score-merged interleave (#858) → toggle again. #851/#858 were the wrong
direction given how uneven semantic ranking still is.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Restores the prior hover-tooltip + mobile-sheet model for liker
lists. The inline avatar strip was fighting too many surfaces
(track rows, cards, detail page, click propagation to play button,
SvelteKit nav hijacking) and broke enough of them that the churn
outweighed the UX win.
Reverts PRs #1302, #1303, #1304, #1305, #1306, #1307, #1308 in
one commit. Search modal stability (#1301) is preserved.
Removed by this revert:
- AvatarStack.svelte, LikersStrip.svelte (never existed before)
- LikerPreview schema, get_top_likers aggregation, top_likers on
TrackResponse, all the callsite wiring
Restored by this revert:
- LikersTooltip.svelte (desktop hover tooltip)
- LikersSheet.svelte + likers-sheet.svelte.ts (mobile bottom sheet)
- LikersSheet mount in +layout.svelte
- Original .likes span markup + CSS in TrackItem, TrackCard,
track/[id]/+page.svelte
- Original supporter-circle markup + CSS on u/[handle]/+page.svelte
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Clicking "+1" to reveal a single extra avatar expanded a strip that
scrolled nowhere — pointless. Skip the "+N" tile when the overflow
would be smaller than minOverflow (default 3) AND we have enough
users loaded to show everyone inline.
- backend `get_top_likers` default limit: 3 -> 5. tracks with 4 or 5
total likes now ship everyone in the preview, so the frontend can
render them all inline without a dead-end tile. cost per EXPLAIN
ANALYZE is still sub-millisecond on production.
- frontend `AvatarStack` gains `minOverflow` prop (default 3). only
skips "+N" when the overflow is small AND users.length >= total
(i.e. we have everyone loaded), so partial-data surfaces fall back
safely to the regular "+N" affordance.
Behavior:
- total <= 5: shows everyone inline, no "+N"
- total >= 6: shows 3 + "+N>=3" (expansion reveals meaningfully more)
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Previous fix slapped onclick={(e) => e.stopPropagation()} on the
LikersStrip root to prevent +N/× clicks from bubbling to the outer
play button. That also ate anchor clicks on individual avatars
before they could reach document, where SvelteKit's client-side
nav hijacker lives. With that listener never firing, the browser
fell back to a full page reload — which tears down the audio
element mid-playback.
Scope the stopPropagation to just the non-anchor interactive bits:
- +N handler in LikersStrip now stopPropagation's its own event
- × collapse button already stopPropagation'd
- root span no longer stops anything
Avatar links now reach document → SvelteKit intercepts → client-side
nav → player persists → audio keeps playing. The outer play button's
existing anchor guard (closest('a')) still prevents playback on
anchor clicks, so no regression on that front either.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Without a label the avatars looked like they belonged to the song
itself (artist, featured collaborators) rather than the people who
liked it. Adds a "liked by" label inside LikersStrip so the meaning
is unambiguous everywhere the strip appears — TrackItem, TrackCard,
and the track detail page.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The strip lives inside the TrackItem play button. Clicking +N or ×
was bubbling up to the outer button's onclick, starting playback
before the expand/collapse could land. Stop click+keydown
propagation at the LikersStrip root so all interaction inside the
strip (avatar navigation, +N expand, × collapse) stays contained.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Previous cut lost timestamp info — hovering an avatar showed only
display name. Adding it back without reintroducing a separate panel:
- backend LikerPreview now includes `liked_at` (ISO string) pulled
from `track_likes.created_at` via the window-function query
- AvatarStack gets an optional `avatarTitle(user)` prop so parents
can customize the hover/focus tooltip
- LikersStrip passes a formatter that renders
"display name · liked 2h ago"
UserPreview.liked_at is optional — supporter avatars on the artist
page don't carry a timestamp and keep their existing display-name
tooltip.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The prior PR moved liker avatars inline but left the hover tooltip
and mobile bottom sheet in place. On hover, a tooltip opened above
the track row and showed... the same avatars, just a few more of
them. That was redundant and, as you pointed out, the separate
panel opening on hover was the most egregious part.
New model: the inline strip *is* the interaction.
- hover → per-avatar lift (already worked via AvatarStack)
- click an avatar → navigate to /u/{handle}
- click "+N" → the stack itself expands in place to a
horizontally-scrollable strip of every liker. same widget, just
longer. lazy-fetched via the existing tooltip-cache so the
data is there the first time you expand and instant on
subsequent expansions
- click × (or click outside, or press Escape) → collapses back
No popover, no bottom sheet, no tooltip. One affordance, consistent
across mobile and desktop.
Implementation
- AvatarStack.svelte — new scrollable + maxScrollWidth props.
When scrollable, the container gets overflow-x: auto, scroll-snap,
and a thin scrollbar. Overlap is preserved so it stays visually
the same widget, not a different one.
- LikersStrip.svelte — new wrapper that owns the expansion state
and the lazy fetch. Parents pass trackId + likeCount + topLikers
and don't think about anything else.
- TrackItem, TrackCard, track/[id]/+page.svelte — all the
tooltip/sheet state, hover timers, click-to-open-sheet handlers,
cursor: help, tooltip-open z-index gymnastics — all gone.
Replaced with <LikersStrip>.
- Deleted: LikersTooltip.svelte, LikersSheet.svelte,
likers-sheet.svelte.ts. Removed mount from +layout.svelte.
tooltip-cache.svelte.ts stays — LikersStrip uses it for the
expansion fetch.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
Replaces the plain "N likes" text next to tracks with an overlapping
strip of the 3 most recent liker avatars (+N if more) — matching the
existing supporter-row pattern on artist pages. Both sites now render
the same `AvatarStack` presentational component; only the data flow
differs (liker avatars are maintained in our artists DB via jetstream;
supporter avatars come from atprotofans via the /artists/batch
enrichment already in place).
Backend
- new `get_top_likers(db, track_ids, limit=3)` aggregation utility
using `ROW_NUMBER() OVER (PARTITION BY track_id ORDER BY created_at
DESC)`, filtered to `rn <= limit`. Postgres 15+ pushes the limit
into the window aggregate (Run Condition), so work short-circuits
per partition. EXPLAIN ANALYZE on production (308 likes, 20-track
page): ~1ms execution, all in shared buffer cache (query shape
sketched after this list).
- `TrackResponse.top_likers: list[LikerPreview]` added; threaded
through every list endpoint that already batches aggregations
(for_you, tracks listing, tracks /top, tracks /me, tracks /me/broken,
albums listing, users/{handle}/likes, tracks/tags, tracks/shares,
lists/hydration, liked tracks list) plus single-track endpoints
(playback /by-uri, mutations update, mutations restore-record).
- queue and jams serializers continue to skip aggregations per their
existing comments — they pass no `top_likers`, and the field
defaults to `[]`, which the frontend renders as the plain count
(pre-existing behavior).
- `LikerPreview` lives in `utilities/aggregations.py` rather than
`schemas.py` to avoid a circular import (schemas.py imports from
aggregations.py for `CopyrightInfo`).
- tests in `test_aggregations.py`: default limit, custom limit,
ordering by most-recent-first, empty track list, and the
JOIN-on-Artist filter behavior (likers without an artist row are
omitted, matching the existing `GET /tracks/{id}/likes` semantics).
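for reference, the shape of the ranked query (a sketch only: the
`likes`/`artists` table and column names here are assumptions, not the
real schema):

```python
# illustrative shape of the get_top_likers query — table and column
# names are assumptions, not the real schema
TOP_LIKERS_SQL = """
SELECT track_id, handle, avatar_url
FROM (
    SELECT l.track_id,
           a.handle,
           a.avatar_url,
           ROW_NUMBER() OVER (
               PARTITION BY l.track_id
               ORDER BY l.created_at DESC
           ) AS rn
    FROM likes l
    JOIN artists a ON a.did = l.liker_did  -- likers without an artist row are omitted
    WHERE l.track_id = ANY(:track_ids)
) ranked
WHERE rn <= :limit  -- Postgres 15+ folds this into a window run condition
"""
```

the run condition is what keeps this cheap: each partition stops
producing rows once `rn` passes the limit, instead of ranking every
like on every track.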
Frontend
- `AvatarStack.svelte` — new purely-presentational component. Props:
`users`, `total`, `maxVisible`, `size`, `borderColor`, `moreHref`,
`onMoreClick`, `avatarHref`, `onAvatarClick`, `ariaLabel`, `class`.
Handles 0-N users, renders +overflow tile as link OR button
depending on the surface, supports fallback initials when
`avatar_url` is null.
- `UserPreview` type added to `types.ts`; matches backend
`LikerPreview` and the atprotofans-derived `Supporter` shape.
- `Track.top_likers?: UserPreview[]` added.
- wired into `TrackItem`, `TrackCard`, and `track/[id]/+page.svelte` —
the existing wrapper keeps the hover-tooltip (desktop) and
bottom-sheet (mobile) behavior on the whole strip; clicking an
individual avatar is intentionally a no-op so the detail sheet is
the canonical "see all likers" path.
- wired into `u/[handle]/+page.svelte` supporter row, replacing the
hand-rolled `.supporter-circle` markup and CSS (~65 lines deleted).
Avatars here DO link to `/u/{handle}` per existing UX; +N links
out to the atprotofans supporter page in a new tab.
- sizing is mobile-first: 20px avatars on mobile tracks, 22px on
desktop tracks, 18px in track cards, 28px/32px on the supporter
row.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
The Cmd+K search modal jolted visibly when the user started typing:
the hints (~100px tall) disappeared immediately, nothing rendered
during the 150ms debounce window (loading was still false), "no
results for X" flashed briefly, the body collapsed again while the
fetch was in flight, then popped open to result height.
two fixes:
1. set `loading=true` synchronously inside `setQuery()` when query>=2
(and semanticLoading when query>=3) so the "no results" branch never
matches during the debounce window before the fetch fires.
2. wrap the body states in `.search-body` with a 104px min-height
(matching the hints' rendered height) and `interpolate-size:
allow-keywords` + `transition: height`. the body no longer collapses
between states, and the growth to result height animates smoothly on
browsers that support interpolate-size (chromium-based: chrome/edge
129+, late 2024; not safari or firefox yet). other browsers fall
back to instant resize — no regression.
an explicit `.search-progress` placeholder covers the in-between state
when the user has typed 1 char or is waiting on the first response.
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
adds a font picker alongside the existing accent color controls. six
options: mono (default), geist, inter, system, georgia, comic sans.
stored in ui_settings JSONB (no migration needed), cached in
localStorage for flash prevention, applied via --font-family CSS var.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
increases gradient interpolation from 7 to 16 color stops for finer
transitions. removes brightness(1.08) oscillation from the breathing
animation — it amplified visible banding at color step boundaries.
no SVG noise filters this time.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the 135-degree gradient with only 7 color stops and a brightness-oscillating
animation created visible diagonal stripes (color banding), especially in
dark/low-contrast weather palettes.
three fixes:
- increase gradient interpolation from 7 to 16 stops
- add SVG fractalNoise dither filter (soft-light blend) on the gradient layer
- remove brightness(1.08) from breathing animation (amplified banding)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the segmented pill control for latest/for-you looked busy and
inconsistent with the rest of the UI. replace it with an inline
cycling button that matches the top tracks period toggle pattern —
tap to cycle between "latest" and "for you".
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: status maintenance — SDK namespace, CDN caching, feed switcher, telemetry incident
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: add TTS audio for status update (generated locally)
gemini-2.5-pro-tts free tier quota was exhausted in CI.
generated locally with OTHER_GOOGLE_API_KEY.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: zzstoatzz <thrast36@gmail.com>
plyrfm CLI moved from flat commands to noun-first subcommands:
- `plyrfm delete` → `plyrfm tracks delete`
- `plyrfm upload` → `plyrfm tracks upload`
- `plyrfm my-tracks` → `plyrfm tracks my`
also update SDK examples in llms-full.txt for namespace API
(client.search → client.discover.search, etc.)
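a minimal sketch of the noun-first layout, assuming a typer-style CLI
(the actual framework, arguments, and help text may differ):

```python
# hypothetical sketch of the nested-subcommand layout with typer
import typer

app = typer.Typer()
tracks = typer.Typer(help="manage your tracks")
app.add_typer(tracks, name="tracks")

@tracks.command("upload")
def upload(path: str) -> None:
    """plyrfm tracks upload <path> (was: plyrfm upload)"""
    ...

@tracks.command("delete")
def delete(track_id: str) -> None:
    """plyrfm tracks delete <id> (was: plyrfm delete)"""
    ...

@tracks.command("my")
def my() -> None:
    """plyrfm tracks my (was: plyrfm my-tracks)"""
    ...

if __name__ == "__main__":
    app()
```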
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jetstream ignored `kind=account` events entirely — deactivation left
stale cdn.bsky.app avatar URLs (dead 404s), and reactivation never
refreshed them. identity events also skipped avatar updates.
- handle `account` events in jetstream consumer (dispatch new
`ingest_account_status_change` task)
- on deactivation: clear avatar_url so frontend doesn't show broken img
- on reactivation: fetch fresh avatar from Bluesky profile
- add avatar refresh to `ingest_identity_update` (covers PDS migrations
and handle changes too)
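roughly the dispatch and task shape (a sketch: jetstream account events
carry `{"did": ..., "kind": "account", "account": {"active": bool}}`;
the dispatch call and avatar helpers below are illustrative names, not
the real module):

```python
# sketch of the new handling — only the event shape and the task name
# come from this commit; everything else is illustrative
async def handle_jetstream_event(event: dict) -> None:
    if event.get("kind") == "account":
        await dispatch(  # however background tasks are enqueued here
            ingest_account_status_change,
            did=event["did"],
            active=event["account"].get("active", False),
        )

async def ingest_account_status_change(did: str, active: bool) -> None:
    if not active:
        # deactivation: clear the avatar so the frontend never renders
        # a dead cdn.bsky.app URL
        await set_avatar_url(did, None)
    else:
        # reactivation: pull a fresh avatar from the Bluesky profile
        profile = await fetch_bsky_profile(did)
        await set_avatar_url(did, profile.get("avatar"))
```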
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the frontend toggle from #1288 only prevents new page loads from
calling initObservability(). stale cached clients (Cloudflare Pages)
continue hammering POST /logfire-proxy — 3,458 requests in 24 minutes
averaging 1.9s each, saturating the threadpool and causing /tracks/top
to take 10-18s.
guard the backend endpoint directly: return 204 immediately when the
flag is off, so no stale client can reach logfire_proxy().
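the guard, sketched (assuming a pydantic-style settings object carrying
the BROWSER_OBSERVABILITY flag described in the next commit below; the
route body is illustrative):

```python
# sketch: return 204 before any proxy work so stale clients cost ~nothing
from fastapi import APIRouter, Request, Response

router = APIRouter()

@router.post("/logfire-proxy/{path:path}")
async def logfire_proxy(path: str, request: Request) -> Response:
    if not settings.BROWSER_OBSERVABILITY:  # settings object assumed
        # stale cached bundles still POST here; drop the payload without
        # touching the threadpool or upstream Logfire
        return Response(status_code=204)
    return await forward_to_logfire(path, request)  # illustrative helper
```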
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the logfire browser SDK proxies all browser trace data through the
backend (POST /logfire-proxy/v1/traces) because Logfire requires
server-side auth. the proxy uses run_in_threadpool for a synchronous
HTTP call — under load, this saturates the threadpool and starves
async handlers including DB queries.
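for context, the saturating pattern looks roughly like this (a sketch,
not the real proxy code). starlette's run_in_threadpool draws from one
shared anyio limiter (about 40 worker threads by default), so ~2s
synchronous calls under load exhaust it quickly:

```python
import httpx
from fastapi.concurrency import run_in_threadpool

def _post_traces_sync(body: bytes, headers: dict[str, str]) -> int:
    # synchronous HTTP call: blocks one worker thread for the entire
    # upstream round-trip (~1.9s average during the incident)
    resp = httpx.post(
        "https://logfire-api.pydantic.dev/v1/traces",  # illustrative URL
        content=body,
        headers=headers,
    )
    return resp.status_code

async def proxy_traces(body: bytes, headers: dict[str, str]) -> int:
    # every browser trace POST pins a thread; enough concurrent calls
    # exhaust the shared pool and anything else needing a thread waits
    return await run_in_threadpool(_post_traces_sync, body, headers)
```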
adds BROWSER_OBSERVABILITY env var (default: true) exposed via
GET /config. frontend gates initObservability() on this flag.
set BROWSER_OBSERVABILITY=false to disable browser telemetry proxy
and eliminate the proxy load on the backend.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the pool warmup (added in #1025) only opened a single connection.
with pool_size=10, the other 9 connections still hit TCP+SSL setup
on the first burst of requests after deploy. logfire traces show 17
simultaneous connect events taking 1.5-5.5s each during deploys,
causing simple PK lookups to take 12s+ while connections queue up.
fix: warm all pool_size connections concurrently at startup using
asyncio.gather. connections execute SELECT 1 then return to the pool
ready for immediate use. partial failures are logged but don't block
startup.
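a sketch of the concurrent warmup, assuming a SQLAlchemy AsyncEngine
(function and logger names illustrative):

```python
import asyncio
import logging

from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncEngine

logger = logging.getLogger(__name__)

async def warm_pool(engine: AsyncEngine, pool_size: int = 10) -> None:
    # check out all connections at once so the pool dials pool_size
    # distinct TCP+SSL sessions instead of reusing the first one
    conns = await asyncio.gather(
        *(engine.connect() for _ in range(pool_size)),
        return_exceptions=True,  # partial failures must not block startup
    )
    live = []
    for conn in conns:
        if isinstance(conn, BaseException):
            logger.warning("pool warmup connection failed: %s", conn)
        else:
            live.append(conn)
    await asyncio.gather(*(c.execute(text("SELECT 1")) for c in live))
    # closing returns each warmed connection to the pool for reuse
    await asyncio.gather(*(c.close() for c in live))
```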
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
auth.isAuthenticated starts false and flips true after /auth/me
resolves. the probe $effect's else branch fired immediately on
mount (before auth resolved), saw feedMode === 'for-you', and
reset it to 'latest' — clobbering the localStorage-persisted
preference on every page load.
fix: gate the else branch on !auth.loading so it only fires after
auth has actually resolved and the user is genuinely not authenticated.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
three bugs from staging:
1. the probe $effect called forYouCache.fetch(), which synchronously
reads this.loading ($state) — creating a reactive dependency. any
cache state change (e.g. from setTags) re-triggered the probe, and
if tag-filtered results were empty, forYouAvailable flipped to false,
hiding the switcher. fix: use a raw fetch(limit=1) with no reactive
cache reads.
2. tag state wasn't shared between feeds. ForYouCache initialized
activeTags as empty, not from localStorage. switching feeds didn't
sync tags. tags set while in for-you mode weren't persisted. fix:
both caches initialize from localStorage.active_tags; toggleFeed
syncs tags from outgoing to incoming cache; onTagsChange persists
regardless of mode.
3. empty state said "no tracks yet" when tags filtered to zero. fix:
show "no tracks match these tags" when active tags are set.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
two issues from staging review:
1. segmented control used opaque --bg-secondary background and --radius-xl,
visually inconsistent with track items and cards which use translucent
--track-bg/--track-border and --radius-md. switched to match.
2. tag filters were hidden when viewing for-you feed. added optional
`tags` query param to GET /for-you/ — filters candidates to tracks
with at least one matching tag (same inclusive semantics as /tracks/).
ForYouCache now supports setTags(), and the homepage shows tag filters
regardless of feed mode.
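the parameter shape, roughly (a sketch; the real candidate pipeline and
its filtering are more involved, and `rank_candidates` is hypothetical):

```python
from fastapi import APIRouter, Query

router = APIRouter(prefix="/for-you")

@router.get("/")
async def for_you(tags: list[str] | None = Query(None), limit: int = 20):
    candidates = await rank_candidates(limit=limit)  # hypothetical ranker
    if tags:
        wanted = set(tags)
        # inclusive semantics, same as /tracks/: keep a track if it has
        # at least one of the requested tags
        candidates = [t for t in candidates if wanted & set(t.tags)]
    return candidates
```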
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
the inline "for you" text next to the heading read as a sentence
("latest tracks for you") rather than a toggle. replaced with a
proper segmented control — two pill buttons with clear active/inactive
states using the same color-mix accent pattern as settings theme buttons.
heading simplified to "tracks" with the switcher sitting alongside.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adds a feed mode toggle to the homepage's main infinite-scroll section.
authenticated users with engagement history see a clickable toggle
(same style as the top tracks period toggle) to switch between "latest
tracks" and "for you". unauthenticated users or those without enough
engagement data see no toggle — identical to today.
- new ForYouCache state module ($lib/for-you.svelte.ts) mirroring
TracksCache's interface but hitting /for-you/
- feed mode persisted to localStorage
- tag filters hidden when viewing for-you (backend handles hidden tags)
- infinite scroll dispatches to the active cache's fetchMore()
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove 8 scripts for migrations that completed months ago:
- copy_r2_buckets.py (relay → audio-prod, Nov 2025)
- migrate_r2_bucket.py (same with DB updates)
- migrate_images_to_new_buckets.py (audio → images buckets, Nov 2025)
- migrate_sensitive_images.py (Jan 2026)
- backfill_image_urls.py (Nov 2025)
- backfill_atproto_records.py (Nov 2025)
- backfill_avatars.py (Dec 2025)
- backfill_duration.py (Dec 2025)
Add migrate_cdn_urls.py for the r2.dev → custom domain URL migration.
Dry-run by default, auto-detects environment from DATABASE_URL,
updates tracks/albums/playlists URL columns.
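the new script's shape, sketched (the r2.dev hosts and the environment
heuristic below are illustrative, not the real values):

```python
import os

# assumed r2.dev hosts per bucket — illustrative, not the real hashes
HOST_MAP = {
    "pub-audio.r2.dev": "audio.plyr.fm",
    "pub-images.r2.dev": "images.plyr.fm",
}

def detect_env(database_url: str) -> str:
    # crude heuristic for the sketch; the real detection may differ
    return "production" if "prod" in database_url else "staging"

def rewrite(url: str) -> str:
    for old, new in HOST_MAP.items():
        url = url.replace(f"https://{old}/", f"https://{new}/")
    return url

def main(apply: bool = False) -> None:
    env = detect_env(os.environ["DATABASE_URL"])
    print(f"{env}: {'APPLYING' if apply else 'dry run'}")
    # for each of tracks / albums / playlists, select rows whose URL
    # columns still point at *.r2.dev, print old -> new, and UPDATE
    # only when --apply is passed (DB plumbing elided in this sketch)
```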
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add CacheControl headers to R2 uploads, consolidate S3 client config
Set Cache-Control: public, max-age=31536000, immutable on all R2 uploads
(audio, images, thumbnails). Objects are content-hashed so they never
change — this tells Cloudflare's CDN and browsers to cache aggressively.
Also consolidate the S3 client connection config into a _s3_client() helper
method. The same 5-line endpoint/credentials block was repeated 9 times.
Now it's one method, making an S3/R2 swap a one-line change.
Prep for switching from r2.dev URLs (no CDN caching) to custom domains
(audio.plyr.fm, images.plyr.fm) which are already provisioned.
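the consolidation, sketched with boto3 (class and attribute names are
illustrative, not the real module):

```python
import boto3

CACHE_FOREVER = "public, max-age=31536000, immutable"

class R2Storage:
    def __init__(self, endpoint_url: str, access_key_id: str, secret_access_key: str):
        self.endpoint_url = endpoint_url
        self.access_key_id = access_key_id
        self.secret_access_key = secret_access_key

    def _s3_client(self):
        # the endpoint/credentials block that was repeated 9 times,
        # now in one place; an S3/R2 swap is a one-line endpoint change
        return boto3.client(
            "s3",
            endpoint_url=self.endpoint_url,
            aws_access_key_id=self.access_key_id,
            aws_secret_access_key=self.secret_access_key,
        )

    def upload(self, bucket: str, key: str, body: bytes, content_type: str) -> None:
        # keys are content-hashed, so the object at a key never changes:
        # tell the CDN and browsers to cache it for a year, immutably
        self._s3_client().put_object(
            Bucket=bucket, Key=key, Body=body,
            ContentType=content_type, CacheControl=CACHE_FOREVER,
        )
```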
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update R2 references for custom domain CDN migration
Replace r2.dev URLs with custom domain URLs (audio.plyr.fm,
images.plyr.fm) in public docs, internal docs, and config examples.
Drop "R2" from "R2 CDN" references — the CDN is Cloudflare's edge
cache, not R2 itself.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add fetch_list_item_uris() to _internal/atproto/records/fm_plyr/list.py
— fetches an ATProto list record and returns ordered item URIs. Replaces
5 copy-pasted fetch-then-extract blocks across playlists, albums, and
recommendations.
Add hydrate_tracks_from_uris() to api/lists/hydration.py — loads tracks
by AT-URI, batch-aggregates like/comment counts, resolves liked state,
returns ordered TrackResponses. Collapses the identical ~35-line hydration
block duplicated between get_playlist and get_playlist_by_uri.
playlists.py: 952 → 843 lines. Six unused imports removed as a side
effect (the hydration helper absorbed them).
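the two contracts, sketched (signatures per this commit; the bodies and
helper names are illustrative):

```python
async def fetch_list_item_uris(pds_url: str, list_uri: str) -> list[str]:
    """fetch an fm.plyr list record and return its item AT-URIs in order."""
    record = await get_record(pds_url, list_uri)  # hypothetical fetch helper
    return [item["uri"] for item in record["value"].get("items", [])]

async def hydrate_tracks_from_uris(db, uris: list[str], viewer_did: str | None) -> list:
    """load tracks by AT-URI, batch-aggregate like/comment counts, resolve
    the viewer's liked state, and return TrackResponses in input order."""
    tracks = await load_tracks_by_uris(db, uris)              # hypothetical
    aggs = await batch_aggregate_counts(db, [t.id for t in tracks])
    by_uri = {t.atproto_uri: t for t in tracks}
    return [
        to_track_response(by_uri[u], aggs, viewer_did)
        for u in uris
        if u in by_uri  # skip URIs that no longer resolve to a track
    ]
```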
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: decompose lists.py and albums.py into subpackages, fix PDS URL healing
Split two monolithic API files into subpackages following the existing
api/tracks/ pattern:
- lists.py (1149 lines) → lists/{router,schemas,reorder,resolver,playlists}.py
- albums.py (995 lines) → albums/{router,schemas,cache,listing,mutations}.py
Also moves PDS URL healing from lazy per-request side effects (copy-pasted
in 5 API endpoints) to the jetstream identity event handler, where it
belongs. Identity events fire on both handle changes and PDS migrations,
so resolving the DID there keeps the cached pds_url warm proactively
instead of discovering staleness at request time.
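the healing move, sketched (resolver and persistence helpers are
illustrative; the `#atproto_pds` service id comes from the atproto DID
document spec):

```python
async def ingest_identity_update(did: str, handle: str | None) -> None:
    doc = await resolve_did_document(did)  # hypothetical resolver wrapper
    pds_url = next(
        (
            svc["serviceEndpoint"]
            for svc in doc.get("service", [])
            if svc.get("id") == "#atproto_pds"
        ),
        None,
    )
    if pds_url:
        # identity events fire on handle changes and PDS migrations alike,
        # so the cached pds_url is refreshed before any request needs it
        await update_artist_identity(did, handle=handle, pds_url=pds_url)
```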
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: correct mock targets for decomposed module paths
- AsyncDidResolver: patch at source (atproto_identity.did.resolver)
since ingest.py uses a deferred import
- get_async_redis_client: update to backend.api.albums.cache
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: hoist deferred imports to top-level in decomposed modules
Move ~15 deferred imports to module-level where they don't risk circular
dependencies. The only remaining deferred import in the new packages is
backend.api.tracks.mutations.delete_track (cross-package API call).
Also keeps the AsyncDidResolver import in ingest.py deferred — it's a
heavy external dependency in a background task module.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update mock targets for hoisted imports in album tests
With imports at top-level, mocks must target the importing module's
namespace, not the source module.
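the rule of thumb behind both mock fixes, as a sketch (module paths are
illustrative except where the commits above name them):

```python
from unittest.mock import patch

# a top-level `from backend.redis import get_async_redis_client` in
# albums/cache.py binds the name into cache.py at import time, so the
# patch must target the importing module's namespace:
with patch("backend.api.albums.cache.get_async_redis_client"):
    ...  # album cache code sees the mock

# deferred imports are the inverse: ingest.py imports AsyncDidResolver
# inside the function body, so the name is looked up at call time and
# the patch must target the source module instead:
with patch("atproto_identity.did.resolver.AsyncDidResolver"):
    ...  # ingest code sees the mock on its next call
```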
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Osprey rules engine was never committed (services/osprey/ contained only
__pycache__ artifacts). Remove the stale STATUS.md reference and a
duplicate loq override for u/[handle]/+page.svelte that used a different
glob escape syntax than the rest of the file.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>