audio streaming app plyr.fm

fix: revert AcoustID/chromaprint back to AuDD for copyright scanning

AcoustID's fingerprint database doesn't match well for DJ sets and
sample-heavy tracks. AuDD handles these via fuzzy matching, which is
what plyr.fm's moderation actually needs.

Restores moderation service to pre-#1163 AuDD integration, costs
export script with AuDD billing logic, and costs dashboard with
request tracking. Keeps CI-only guard, check_legal_dates enhancement,
and export-costs.yml release tag logic from follow-up commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz 2dcbaca5 16316769

+514 -503
+3 -3
STATUS.md
···

  ### current focus

- AuDD audio fingerprinting ($5/1000 requests) has been replaced with free AcoustID lookups via `fpcalc` (#1163). the moderation service now shells out to a vendored `fpcalc` binary and queries AcoustID directly. a pure zig chromaprint library ([`@zzstoatzz.io/chromaprint.zig`](https://tangled.sh/@zzstoatzz.io/chromaprint.zig)) was built as a learning exercise and produces exact-match fingerprints. next: add a staging environment for the moderation service (#1165).
+ reverted from AcoustID/fpcalc back to AuDD for copyright scanning (#1173). AcoustID's fingerprint database doesn't match well for DJ sets and sample-heavy tracks — AuDD handles these via fuzzy matching. the chromaprint.zig learning exercise remains at [`@zzstoatzz.io/chromaprint.zig`](https://tangled.sh/@zzstoatzz.io/chromaprint.zig). next: add a staging environment for the moderation service (#1165).

  ### known issues
  - iOS PWA audio may hang on first play after backgrounding
···
  - fly.io (backend + redis + transcoder + moderation): ~$24/month
  - neon postgres: $5/month
  - cloudflare (R2 + pages + domain): ~$1/month
- - copyright scanning (AcoustID + fpcalc): $0 (replaced AuDD)
+ - copyright scanning (AuDD): ~$5-10/month
  - replicate (genre classification): <$1/month (scales to zero, ~$0.00019/run)
  - logfire: $0 (free tier)

···

  ---

- this is a living document. last updated 2026-03-20 (early March archived, AuDD → AcoustID complete, costs export tied to release tags).
+ this is a living document. last updated 2026-03-22 (reverted AcoustID back to AuDD — better fuzzy matching for DJ sets/samples).
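The ~$5-10/month figure in STATUS.md can be sanity-checked against the billing model documented in scripts/costs/export_costs.py in this commit ($5 base, 6000 free requests, $5 per 1000 requests after that, 1 request = 12 seconds of audio, billing period starting on the 24th). The following is a minimal Python sketch of that arithmetic, not the script itself. The helper names `requests_for_track`, `estimate_cost`, and `billing_period_start` are hypothetical, and `billing_period_start` takes `now` as a parameter (unlike the script's `get_billing_period_start`) only so it can be exercised with fixed dates.

```python
import math
from datetime import datetime, timedelta

# Constants copied from scripts/costs/export_costs.py in this commit.
AUDD_BILLING_DAY = 24
AUDD_SECONDS_PER_REQUEST = 12
AUDD_FREE_REQUESTS = 6000
AUDD_COST_PER_1000 = 5.00
AUDD_BASE_COST = 5.00


def requests_for_track(duration_seconds: float) -> int:
    """One request per started 12-second window of audio (hypothetical helper)."""
    return math.ceil(duration_seconds / AUDD_SECONDS_PER_REQUEST)


def estimate_cost(total_requests: int) -> dict:
    """Base fee plus overage for requests beyond the free tier (hypothetical helper)."""
    billable = max(0, total_requests - AUDD_FREE_REQUESTS)
    overage = round(billable * AUDD_COST_PER_1000 / 1000, 2)
    return {
        "billable_requests": billable,
        "overage_cost": overage,
        "estimated_cost": AUDD_BASE_COST + overage,
    }


def billing_period_start(now: datetime) -> datetime:
    """Most recent 24th on or before `now` (hypothetical, testable variant)."""
    if now.day >= AUDD_BILLING_DAY:
        return datetime(now.year, now.month, AUDD_BILLING_DAY)
    # Step back to the last day of the previous month, then pin to the 24th.
    last_of_prev = datetime(now.year, now.month, 1) - timedelta(days=1)
    return datetime(last_of_prev.year, last_of_prev.month, AUDD_BILLING_DAY)
```

Under these assumptions a 5-minute track costs 25 requests, so the free tier covers roughly 240 such tracks per billing period before overage begins.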
+1 -1
backend/src/backend/config.py
···
          description="USPTO DMCA agent registration number",
      )
      terms_last_updated: datetime = Field(
-         default=datetime(2026, 3, 20),
+         default=datetime(2026, 3, 22),
          description="Date the terms/privacy were last materially updated. "
          "Users who accepted before this date will be prompted to re-accept.",
      )
+1 -1
docs/legal/privacy.md
···
  - [Fly.io](https://fly.io) - backend hosting
  - [Neon](https://neon.tech) - database
  - [Logfire](https://logfire.pydantic.dev) - error monitoring
- - [AcoustID](https://acoustid.org) - audio fingerprinting for copyright detection
+ - [AuDD](https://audd.io) - audio fingerprinting for copyright detection
  - [Anthropic](https://anthropic.com) - image analysis for content moderation
  - [ATProtoFans](https://atprotofans.com) - supporter validation for gated content
  - [Modal](https://modal.com) - audio processing for search embeddings
+290 -24
frontend/src/routes/costs/+page.svelte
··· 10 10 [key: string]: number; 11 11 } 12 12 13 - interface ScanDailyData { 13 + interface DailyData { 14 14 date: string; 15 15 scans: number; 16 16 flagged: number; 17 + requests: number; 17 18 } 18 19 19 20 interface CostData { ··· 34 35 breakdown: CostBreakdown; 35 36 note: string; 36 37 }; 37 - copyright_scanning: { 38 + audd: { 38 39 amount: number; 39 - scans_30d: number; 40 - flagged_30d: number; 40 + base_cost: number; 41 + overage_cost: number; 42 + scans_this_period: number; 43 + requests_this_period: number; 44 + audio_seconds: number; 45 + free_requests: number; 46 + remaining_free: number; 47 + billable_requests: number; 41 48 flag_rate: number; 42 - daily: ScanDailyData[]; 49 + daily: DailyData[]; 43 50 note: string; 44 51 }; 45 52 }; ··· 52 59 let loading = $state(true); 53 60 let error = $state<string | null>(null); 54 61 let data = $state<CostData | null>(null); 62 + let timeRange = $state<'day' | 'week' | 'month'>('month'); 63 + 64 + // filter daily data based on selected time range 65 + // returns the last N days of data based on selection 66 + let filteredDaily = $derived.by(() => { 67 + if (!data?.costs.audd.daily.length) return []; 68 + const daily = data.costs.audd.daily; 69 + if (timeRange === 'day') { 70 + // show last 2 days (today + yesterday) for 24h view 71 + return daily.slice(-2); 72 + } else if (timeRange === 'week') { 73 + // show last 7 days 74 + return daily.slice(-7); 75 + } else { 76 + // show all (up to 30 days) 77 + return daily; 78 + } 79 + }); 80 + 81 + // calculate totals for selected time range 82 + let filteredTotals = $derived.by(() => { 83 + return { 84 + requests: filteredDaily.reduce((sum, d) => sum + d.requests, 0), 85 + scans: filteredDaily.reduce((sum, d) => sum + d.scans, 0) 86 + }; 87 + }); 88 + 55 89 // derived values for bar chart scaling 56 90 let maxCost = $derived( 57 91 data 58 92 ? 
Math.max( 59 93 data.costs.fly_io.amount, 60 94 data.costs.neon.amount, 61 - data.costs.cloudflare.amount 95 + data.costs.cloudflare.amount, 96 + data.costs.audd.amount 62 97 ) 63 98 : 1 64 99 ); 100 + 101 + let maxRequests = $derived.by(() => { 102 + return filteredDaily.length ? Math.max(...filteredDaily.map((d) => d.requests)) : 1; 103 + }); 65 104 66 105 onMount(async () => { 67 106 try { ··· 183 222 <span class="cost-note">{data.costs.cloudflare.note}</span> 184 223 </div> 185 224 225 + <div class="cost-item"> 226 + <div class="cost-header"> 227 + <span class="cost-name">audd</span> 228 + <span class="cost-amount">{formatCurrency(data.costs.audd.amount)}</span> 229 + </div> 230 + <div class="cost-bar-bg"> 231 + <div 232 + class="cost-bar audd" 233 + style="width: {barWidth(data.costs.audd.amount, maxCost)}%" 234 + ></div> 235 + </div> 236 + <span class="cost-note">{data.costs.audd.note}</span> 237 + </div> 186 238 </div> 187 239 </section> 188 240 189 - <!-- copyright scanning --> 190 - <section class="scanning-section"> 191 - <div class="cost-item"> 192 - <div class="cost-header"> 193 - <span class="cost-name">copyright scanning</span> 194 - <span class="cost-amount free">free</span> 241 + <!-- audd details --> 242 + <section class="audd-section"> 243 + <div class="audd-header"> 244 + <h2>api requests (audd)</h2> 245 + <div class="time-range-toggle"> 246 + <button 247 + class:active={timeRange === 'day'} 248 + onclick={() => (timeRange = 'day')} 249 + > 250 + 24h 251 + </button> 252 + <button 253 + class:active={timeRange === 'week'} 254 + onclick={() => (timeRange = 'week')} 255 + > 256 + 7d 257 + </button> 258 + <button 259 + class:active={timeRange === 'month'} 260 + onclick={() => (timeRange = 'month')} 261 + > 262 + 30d 263 + </button> 264 + </div> 265 + </div> 266 + 267 + <div class="audd-stats"> 268 + <div class="stat"> 269 + <span class="stat-value">{filteredTotals.requests.toLocaleString()}</span> 270 + <span class="stat-label">requests ({timeRange 
=== 'day' ? '24h' : timeRange === 'week' ? '7d' : '30d'})</span> 271 + </div> 272 + <div class="stat"> 273 + <span class="stat-value">{data.costs.audd.remaining_free.toLocaleString()}</span> 274 + <span class="stat-label">free remaining</span> 275 + </div> 276 + <div class="stat"> 277 + <span class="stat-value">{filteredTotals.scans.toLocaleString()}</span> 278 + <span class="stat-label">tracks scanned</span> 195 279 </div> 196 - <span class="cost-note">{data.costs.copyright_scanning.note}</span> 197 - {#if data.costs.copyright_scanning.scans_30d > 0} 198 - <span class="cost-note"> 199 - {data.costs.copyright_scanning.scans_30d} scans last 30 days 200 - ({data.costs.copyright_scanning.flag_rate}% flagged) 201 - </span> 202 - {/if} 203 280 </div> 281 + 282 + <p class="audd-explainer"> 283 + 1 request = 12s of audio. {data.costs.audd.free_requests.toLocaleString()} free/month, 284 + then ${(5).toFixed(2)}/1k requests. 285 + {#if data.costs.audd.billable_requests > 0} 286 + <strong>{data.costs.audd.billable_requests.toLocaleString()} billable</strong> this billing period. 
287 + {/if} 288 + </p> 289 + 290 + {#if filteredDaily.length > 0} 291 + <div class="daily-chart"> 292 + <h3>daily requests</h3> 293 + <div class="chart-bars"> 294 + {#each filteredDaily as day} 295 + <div class="chart-bar-container"> 296 + <div 297 + class="chart-bar" 298 + style="height: {Math.max(4, (day.requests / maxRequests) * 100)}%" 299 + title="{day.date}: {day.requests} requests ({day.scans} tracks)" 300 + ></div> 301 + <span class="chart-label">{day.date.slice(5)}</span> 302 + </div> 303 + {/each} 304 + </div> 305 + </div> 306 + {:else} 307 + <p class="no-data">no requests in this time range</p> 308 + {/if} 204 309 </section> 205 310 206 311 <!-- support cta --> ··· 310 415 margin-bottom: 2rem; 311 416 } 312 417 313 - .breakdown-section h2 { 418 + .breakdown-section h2, 419 + .audd-section h2 { 314 420 font-size: var(--text-sm); 315 421 text-transform: uppercase; 316 422 letter-spacing: 0.08em; ··· 364 470 transition: width 0.3s ease; 365 471 } 366 472 473 + .cost-bar.audd { 474 + background: var(--warning); 475 + } 476 + 367 477 .cost-note { 368 478 font-size: var(--text-xs); 369 479 color: var(--text-tertiary); 370 480 } 371 481 372 - /* scanning section */ 373 - .scanning-section { 482 + /* audd section */ 483 + .audd-section { 374 484 margin-bottom: 2rem; 375 485 } 376 486 377 - .cost-amount.free { 378 - color: var(--success, #4caf50); 487 + .audd-header { 488 + display: flex; 489 + justify-content: space-between; 490 + align-items: center; 491 + margin-bottom: 1rem; 492 + gap: 1rem; 493 + } 494 + 495 + .audd-header h2 { 496 + margin-bottom: 0; 497 + } 498 + 499 + .time-range-toggle { 500 + display: flex; 501 + gap: 0.25rem; 502 + background: var(--bg-tertiary); 503 + border: 1px solid var(--border-subtle); 504 + border-radius: var(--radius-base); 505 + padding: 0.25rem; 506 + } 507 + 508 + .time-range-toggle button { 509 + padding: 0.35rem 0.75rem; 510 + font-family: inherit; 511 + font-size: var(--text-xs); 512 + font-weight: 500; 513 + background: 
transparent; 514 + border: none; 515 + border-radius: var(--radius-sm); 516 + color: var(--text-secondary); 517 + cursor: pointer; 518 + transition: all 0.15s; 519 + } 520 + 521 + .time-range-toggle button:hover { 522 + color: var(--text-primary); 523 + } 524 + 525 + .time-range-toggle button.active { 526 + background: var(--accent); 527 + color: white; 528 + } 529 + 530 + .no-data { 531 + text-align: center; 532 + color: var(--text-tertiary); 533 + font-size: var(--text-sm); 534 + padding: 2rem; 535 + background: var(--bg-tertiary); 536 + border: 1px solid var(--border-subtle); 537 + border-radius: var(--radius-md); 538 + } 539 + 540 + .audd-stats { 541 + display: grid; 542 + grid-template-columns: repeat(3, 1fr); 543 + gap: 1rem; 544 + margin-bottom: 1rem; 545 + } 546 + 547 + .audd-explainer { 548 + font-size: var(--text-sm); 549 + color: var(--text-secondary); 550 + margin-bottom: 1.5rem; 551 + line-height: 1.5; 552 + } 553 + 554 + .audd-explainer strong { 555 + color: var(--warning); 556 + } 557 + 558 + .stat { 559 + display: flex; 560 + flex-direction: column; 561 + align-items: center; 562 + padding: 1rem; 563 + background: var(--bg-tertiary); 564 + border: 1px solid var(--border-subtle); 565 + border-radius: var(--radius-md); 566 + } 567 + 568 + .stat-value { 569 + font-size: var(--text-2xl); 570 + font-weight: 700; 571 + color: var(--text-primary); 572 + font-variant-numeric: tabular-nums; 573 + } 574 + 575 + .stat-label { 576 + font-size: var(--text-xs); 577 + color: var(--text-tertiary); 578 + text-align: center; 579 + margin-top: 0.25rem; 580 + } 581 + 582 + /* daily chart */ 583 + .daily-chart { 584 + background: var(--bg-tertiary); 585 + border: 1px solid var(--border-subtle); 586 + border-radius: var(--radius-md); 587 + padding: 1rem; 588 + overflow: hidden; 589 + } 590 + 591 + .daily-chart h3 { 592 + font-size: var(--text-xs); 593 + text-transform: uppercase; 594 + letter-spacing: 0.05em; 595 + color: var(--text-tertiary); 596 + margin: 0 0 1rem; 597 
+ } 598 + 599 + .chart-bars { 600 + display: flex; 601 + align-items: flex-end; 602 + gap: 2px; 603 + height: 100px; 604 + width: 100%; 605 + } 606 + 607 + .chart-bar-container { 608 + flex: 1 1 0; 609 + min-width: 0; 610 + display: flex; 611 + flex-direction: column; 612 + align-items: center; 613 + height: 100%; 614 + } 615 + 616 + .chart-bar { 617 + width: 100%; 618 + background: var(--accent); 619 + border-radius: 2px 2px 0 0; 620 + min-height: 4px; 621 + margin-top: auto; 622 + transition: height 0.3s ease; 623 + } 624 + 625 + .chart-bar:hover { 626 + opacity: 0.8; 627 + } 628 + 629 + .chart-label { 630 + font-size: 0.55rem; 631 + color: var(--text-tertiary); 632 + margin-top: 0.25rem; 633 + white-space: nowrap; 634 + overflow: hidden; 635 + text-overflow: ellipsis; 636 + max-width: 100%; 379 637 } 380 638 381 639 /* support section */ ··· 453 711 @media (max-width: 480px) { 454 712 .total-amount { 455 713 font-size: 2.5rem; 714 + } 715 + 716 + .audd-stats { 717 + grid-template-columns: 1fr; 718 + } 719 + 720 + .chart-label { 721 + display: none; 456 722 } 457 723 } 458 724 </style>
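The time-range toggle and chart scaling in the Svelte page above can be sketched outside the component. This is a Python mirror of the `filteredDaily` and bar-height logic, written only to make the behavior easy to test. The function names `filter_daily` and `bar_heights` are hypothetical, and the zero-peak guard is an added assumption, not something the diff contains.

```python
def filter_daily(daily: list[dict], time_range: str) -> list[dict]:
    """Keep the last N entries for the selected range, as filteredDaily does."""
    if time_range == "day":
        return daily[-2:]   # today + yesterday for the 24h view
    if time_range == "week":
        return daily[-7:]
    return daily            # "month": everything the export provides


def bar_heights(daily: list[dict]) -> list[float]:
    """Percent heights with the chart's 4% floor.

    Assumption: fall back to a peak of 1 when every day has 0 requests,
    so the division never hits 0/0.
    """
    peak = max((d["requests"] for d in daily), default=0) or 1
    return [max(4, d["requests"] / peak * 100) for d in daily]
```

Note that the slices count entries, not calendar days. If the export only emits rows for days that had scans, as the `GROUP BY DATE(cs.scanned_at)` query in the export script suggests, the "7d" view may span more than seven calendar days.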
+2 -2
frontend/src/routes/privacy/+page.svelte
···
  <div class="legal-container">
    <article class="legal-content">
      <h1>Privacy Policy</h1>
-     <p class="last-updated">Last updated: March 20, 2026</p>
+     <p class="last-updated">Last updated: March 22, 2026</p>

      <p class="intro">
        {APP_NAME} ("we", "us", or "our") is an audio streaming application built on the
···
        <li><strong><a href="https://fly.io" target="_blank" rel="noopener">Fly.io</a></strong> - backend hosting</li>
        <li><strong><a href="https://neon.tech" target="_blank" rel="noopener">Neon</a></strong> - database</li>
        <li><strong><a href="https://logfire.pydantic.dev" target="_blank" rel="noopener">Logfire</a></strong> - error monitoring</li>
-       <li><strong><a href="https://acoustid.org" target="_blank" rel="noopener">AcoustID</a></strong> - audio fingerprinting for copyright detection</li>
+       <li><strong><a href="https://audd.io" target="_blank" rel="noopener">AuDD</a></strong> - audio fingerprinting for copyright detection</li>
        <li><strong><a href="https://anthropic.com" target="_blank" rel="noopener">Anthropic</a></strong> - image analysis for content moderation</li>
        <li><strong><a href="https://atprotofans.com" target="_blank" rel="noopener">ATProtoFans</a></strong> - supporter validation for gated content</li>
        <li><strong><a href="https://modal.com" target="_blank" rel="noopener">Modal</a></strong> - audio processing for search embeddings</li>
-4
loq.toml
···
  [[rules]]
  path = "frontend/src/lib/components/embed/CollectionEmbed.svelte"
  max_lines = 580
-
- [[rules]]
- path = "services/moderation/src/audd.rs"
- max_lines = 509
+85 -17
scripts/costs/export_costs.py
··· 9 9 uv run scripts/costs/export_costs.py # export to R2 (prod) 10 10 uv run scripts/costs/export_costs.py --dry-run # print JSON, don't upload 11 11 uv run scripts/costs/export_costs.py --env stg # use staging db 12 + 13 + AudD billing model: 14 + - $5/month base (indie plan) 15 + - 6000 free requests/month (1000 base + 5000 bonus) 16 + - $5 per 1000 requests after free tier 17 + - 1 request = 12 seconds of audio 18 + - so a 5-minute track = ceil(300/12) = 25 requests 12 19 """ 13 20 14 21 import asyncio ··· 21 28 import typer 22 29 from pydantic import Field 23 30 from pydantic_settings import BaseSettings, SettingsConfigDict 31 + 32 + # billing constants 33 + AUDD_BILLING_DAY = 24 34 + AUDD_SECONDS_PER_REQUEST = 12 35 + AUDD_FREE_REQUESTS = 6000 # 1000 base + 5000 bonus on indie plan 36 + AUDD_COST_PER_1000 = 5.00 # $5 per 1000 requests 37 + AUDD_BASE_COST = 5.00 # $5/month base 24 38 25 39 # fixed monthly costs (updated 2025-12-26) 26 40 # fly.io: manually updated from cost explorer (TODO: use fly billing API) ··· 83 97 app = typer.Typer(add_completion=False) 84 98 85 99 86 - async def get_scan_stats(db_url: str) -> dict[str, Any]: 87 - """fetch copyright scan stats from postgres.""" 100 + def get_billing_period_start() -> datetime: 101 + """get the start of current billing period (24th of month)""" 102 + now = datetime.now() 103 + if now.day >= AUDD_BILLING_DAY: 104 + return datetime(now.year, now.month, AUDD_BILLING_DAY) 105 + else: 106 + first_of_month = datetime(now.year, now.month, 1) 107 + prev_month = first_of_month - timedelta(days=1) 108 + return datetime(prev_month.year, prev_month.month, AUDD_BILLING_DAY) 109 + 110 + 111 + async def get_audd_stats(db_url: str) -> dict[str, Any]: 112 + """fetch audd scan stats from postgres. 
113 + 114 + calculates AudD API requests from track duration: 115 + - each 12 seconds of audio = 1 API request 116 + - derived by joining copyright_scans with tracks table 117 + """ 88 118 import asyncpg 89 119 90 - # 30 days of history for the daily chart 120 + billing_start = get_billing_period_start() 121 + # 30 days of history for the daily chart (independent of billing cycle) 91 122 history_start = datetime.now() - timedelta(days=30) 92 123 93 124 conn = await asyncpg.connect(db_url) 94 125 try: 126 + # get totals: scans, flagged, and derived API requests from duration 127 + # uses billing period for accurate cost calculation 95 128 row = await conn.fetchrow( 96 129 """ 97 130 SELECT 98 131 COUNT(*) as total_scans, 99 - COUNT(CASE WHEN cs.is_flagged THEN 1 END) as flagged 132 + COUNT(CASE WHEN cs.is_flagged THEN 1 END) as flagged, 133 + COALESCE(SUM(CEIL((t.extra->>'duration')::float / $2)), 0)::bigint as total_requests, 134 + COALESCE(SUM((t.extra->>'duration')::int), 0)::bigint as total_seconds 100 135 FROM copyright_scans cs 136 + JOIN tracks t ON t.id = cs.track_id 101 137 WHERE cs.scanned_at >= $1 102 138 """, 103 - history_start, 139 + billing_start, 140 + AUDD_SECONDS_PER_REQUEST, 104 141 ) 105 142 total_scans = row["total_scans"] 106 143 flagged = row["flagged"] 144 + total_requests = row["total_requests"] 145 + total_seconds = row["total_seconds"] 107 146 147 + # daily breakdown for chart - 30 days of history for flexible views 108 148 daily = await conn.fetch( 109 149 """ 110 150 SELECT 111 151 DATE(cs.scanned_at) as date, 112 152 COUNT(*) as scans, 113 - COUNT(CASE WHEN cs.is_flagged THEN 1 END) as flagged 153 + COUNT(CASE WHEN cs.is_flagged THEN 1 END) as flagged, 154 + COALESCE(SUM(CEIL((t.extra->>'duration')::float / $2)), 0)::bigint as requests 114 155 FROM copyright_scans cs 156 + JOIN tracks t ON t.id = cs.track_id 115 157 WHERE cs.scanned_at >= $1 116 158 GROUP BY DATE(cs.scanned_at) 117 159 ORDER BY date 118 160 """, 119 161 history_start, 
162 + AUDD_SECONDS_PER_REQUEST, 120 163 ) 121 164 165 + # calculate costs 166 + billable_requests = max(0, total_requests - AUDD_FREE_REQUESTS) 167 + overage_cost = round(billable_requests * AUDD_COST_PER_1000 / 1000, 2) 168 + total_cost = AUDD_BASE_COST + overage_cost 169 + 122 170 return { 171 + "billing_period_start": billing_start.isoformat(), 123 172 "total_scans": total_scans, 173 + "total_requests": total_requests, 174 + "total_audio_seconds": total_seconds, 124 175 "flagged": flagged, 125 176 "flag_rate": round(flagged / total_scans * 100, 1) if total_scans else 0, 177 + "free_requests": AUDD_FREE_REQUESTS, 178 + "remaining_free": max(0, AUDD_FREE_REQUESTS - total_requests), 179 + "billable_requests": billable_requests, 180 + "base_cost": AUDD_BASE_COST, 181 + "overage_cost": overage_cost, 182 + "estimated_cost": total_cost, 126 183 "daily": [ 127 184 { 128 185 "date": r["date"].isoformat(), 129 186 "scans": r["scans"], 130 187 "flagged": r["flagged"], 188 + "requests": r["requests"], 131 189 } 132 190 for r in daily 133 191 ], ··· 136 194 await conn.close() 137 195 138 196 139 - def build_cost_data(scan_stats: dict[str, Any]) -> dict[str, Any]: 197 + def build_cost_data(audd_stats: dict[str, Any]) -> dict[str, Any]: 140 198 """assemble full cost dashboard data""" 199 + # calculate plyr-specific fly costs 141 200 plyr_fly = sum(FIXED_COSTS["fly_io"]["breakdown"].values()) 142 201 143 202 monthly_total = ( 144 - plyr_fly + FIXED_COSTS["neon"]["total"] + FIXED_COSTS["cloudflare"]["total"] 203 + plyr_fly 204 + + FIXED_COSTS["neon"]["total"] 205 + + FIXED_COSTS["cloudflare"]["total"] 206 + + audd_stats["estimated_cost"] 145 207 ) 146 208 147 209 return { ··· 166 228 }, 167 229 "note": FIXED_COSTS["cloudflare"]["note"], 168 230 }, 169 - "copyright_scanning": { 170 - "amount": 0, 171 - "scans_30d": scan_stats["total_scans"], 172 - "flagged_30d": scan_stats["flagged"], 173 - "flag_rate": scan_stats["flag_rate"], 174 - "daily": scan_stats["daily"], 175 - "note": 
"free (AcoustID + fpcalc)", 231 + "audd": { 232 + "amount": audd_stats["estimated_cost"], 233 + "base_cost": audd_stats["base_cost"], 234 + "overage_cost": audd_stats["overage_cost"], 235 + "scans_this_period": audd_stats["total_scans"], 236 + "requests_this_period": audd_stats["total_requests"], 237 + "audio_seconds": audd_stats["total_audio_seconds"], 238 + "free_requests": audd_stats["free_requests"], 239 + "remaining_free": audd_stats["remaining_free"], 240 + "billable_requests": audd_stats["billable_requests"], 241 + "flag_rate": audd_stats["flag_rate"], 242 + "daily": audd_stats["daily"], 243 + "note": f"copyright detection ($5 base + ${AUDD_COST_PER_1000}/1k requests over {AUDD_FREE_REQUESTS})", 176 244 }, 177 245 }, 178 246 "support": { ··· 221 289 222 290 async def run(): 223 291 db_url = settings.get_db_url(env) 224 - scan_stats = await get_scan_stats(db_url) 225 - data = build_cost_data(scan_stats) 292 + audd_stats = await get_audd_stats(db_url) 293 + data = build_cost_data(audd_stats) 226 294 227 295 if dry_run: 228 296 print(json.dumps(data, indent=2))
-49
services/moderation/Cargo.lock
··· 485 485 checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f" 486 486 487 487 [[package]] 488 - name = "errno" 489 - version = "0.3.14" 490 - source = "registry+https://github.com/rust-lang/crates.io-index" 491 - checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb" 492 - dependencies = [ 493 - "libc", 494 - "windows-sys 0.61.2", 495 - ] 496 - 497 - [[package]] 498 488 name = "etcetera" 499 489 version = "0.8.0" 500 490 source = "registry+https://github.com/rust-lang/crates.io-index" ··· 517 507 ] 518 508 519 509 [[package]] 520 - name = "fastrand" 521 - version = "2.3.0" 522 - source = "registry+https://github.com/rust-lang/crates.io-index" 523 - checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be" 524 - 525 - [[package]] 526 510 name = "ff" 527 511 version = "0.13.1" 528 512 source = "registry+https://github.com/rust-lang/crates.io-index" ··· 1128 1112 ] 1129 1113 1130 1114 [[package]] 1131 - name = "linux-raw-sys" 1132 - version = "0.12.1" 1133 - source = "registry+https://github.com/rust-lang/crates.io-index" 1134 - checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53" 1135 - 1136 - [[package]] 1137 1115 name = "litemap" 1138 1116 version = "0.8.1" 1139 1117 source = "registry+https://github.com/rust-lang/crates.io-index" ··· 1248 1226 "serde_ipld_dagcbor", 1249 1227 "serde_json", 1250 1228 "sqlx", 1251 - "tempfile", 1252 1229 "thiserror 2.0.17", 1253 1230 "tokio", 1254 1231 "tokio-stream", ··· 1712 1689 checksum = "357703d41365b4b27c590e3ed91eabb1b663f07c4c084095e60cbed4362dff0d" 1713 1690 1714 1691 [[package]] 1715 - name = "rustix" 1716 - version = "1.1.4" 1717 - source = "registry+https://github.com/rust-lang/crates.io-index" 1718 - checksum = "b6fe4565b9518b83ef4f91bb47ce29620ca828bd32cb7e408f0062e9930ba190" 1719 - dependencies = [ 1720 - "bitflags", 1721 - "errno", 1722 - "libc", 1723 - "linux-raw-sys", 1724 - "windows-sys 0.61.2", 1725 - ] 1726 - 1727 - 
[[package]] 1728 1692 name = "rustls" 1729 1693 version = "0.23.35" 1730 1694 source = "registry+https://github.com/rust-lang/crates.io-index" ··· 2236 2200 "proc-macro2", 2237 2201 "quote", 2238 2202 "syn 2.0.111", 2239 - ] 2240 - 2241 - [[package]] 2242 - name = "tempfile" 2243 - version = "3.27.0" 2244 - source = "registry+https://github.com/rust-lang/crates.io-index" 2245 - checksum = "32497e9a4c7b38532efcdebeef879707aa9f794296a4f0244f6f69e9bc8574bd" 2246 - dependencies = [ 2247 - "fastrand", 2248 - "getrandom 0.3.4", 2249 - "once_cell", 2250 - "rustix", 2251 - "windows-sys 0.61.2", 2252 2203 ] 2253 2204 2254 2205 [[package]]
+1 -2
services/moderation/Cargo.toml
···
  serde_json = "1.0"
  sqlx = { version = "0.8", features = ["runtime-tokio", "postgres", "chrono", "tls-rustls"] }
  thiserror = "2.0"
- tempfile = "3.14"
- tokio = { version = "1.40", features = ["rt-multi-thread", "macros", "signal", "sync", "process"] }
+ tokio = { version = "1.40", features = ["rt-multi-thread", "macros", "signal", "sync"] }
  tokio-stream = { version = "0.1", features = ["sync"] }
  tower-http = { version = "0.6", features = ["fs"] }
  tracing = "0.1"
+1 -1
services/moderation/Dockerfile
···

  FROM debian:bookworm-slim

- RUN apt-get update && apt-get install -y ca-certificates libchromaprint-tools && rm -rf /var/lib/apt/lists/*
+ RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*

  WORKDIR /app
  COPY --from=builder /app/target/release/moderation /usr/local/bin/moderation
+1 -1
services/moderation/justfile
···
  run:
      MODERATION_HOST="${MODERATION_HOST:-127.0.0.1}" \
      MODERATION_PORT="${MODERATION_PORT:-8083}" \
-     MODERATION_ACOUSTID_API_KEY="${MODERATION_ACOUSTID_API_KEY:-}" \
+     MODERATION_AUDD_API_TOKEN="${MODERATION_AUDD_API_TOKEN:-}" \
      cargo watch -x run

  build:
+1 -1
services/moderation/src/admin.rs
···
          })
          .unwrap_or_default();

-     // Show match count badge
+     // Show match count instead of score (AuDD doesn't provide scores in accurate_offsets mode)
      let match_count_badge = ctx
          .and_then(|c| c.matches.as_ref())
          .filter(|m| !m.is_empty())
+109 -384
services/moderation/src/audd.rs
··· 1 - //! Audio fingerprinting via fpcalc (Chromaprint) + AcoustID lookup. 1 + //! AuDD audio fingerprinting integration. 2 2 3 3 use std::collections::HashMap; 4 4 5 5 use axum::{extract::State, Json}; 6 6 use serde::{Deserialize, Serialize}; 7 - use tokio::io::AsyncWriteExt; 8 7 use tracing::info; 9 8 10 9 use crate::state::{AppError, AppState}; 11 10 12 - // --- request/response types (unchanged API contract) --- 11 + // --- request/response types --- 13 12 14 13 #[derive(Debug, Deserialize)] 15 14 pub struct ScanRequest { ··· 25 24 /// The dominant song if one exists (artist - title) 26 25 #[serde(skip_serializing_if = "Option::is_none")] 27 26 pub dominant_match: Option<String>, 28 - /// Highest AcoustID score (0-100) 27 + /// Legacy field - always 0 since AudD doesn't return scores 29 28 pub highest_score: i32, 30 29 pub raw_response: serde_json::Value, 31 30 } ··· 45 44 pub offset_ms: Option<i64>, 46 45 } 47 46 48 - // --- fpcalc types --- 47 + // --- audd api types --- 49 48 50 49 #[derive(Debug, Deserialize)] 51 - struct FpcalcOutput { 52 - duration: f64, 53 - fingerprint: String, 54 - } 55 - 56 - // --- acoustid types --- 57 - 58 - #[derive(Debug, Deserialize)] 59 - struct AcoustidResponse { 60 - status: String, 61 - #[serde(default)] 62 - results: Vec<AcoustidResult>, 63 - // error responses 64 - error: Option<AcoustidError>, 65 - } 66 - 67 - #[derive(Debug, Deserialize)] 68 - struct AcoustidError { 69 - message: String, 50 + pub struct AuddResponse { 51 + pub status: Option<String>, 52 + pub result: Option<AuddResult>, 70 53 } 71 54 72 55 #[derive(Debug, Deserialize)] 73 - struct AcoustidResult { 74 - score: f64, 75 - #[serde(default)] 76 - recordings: Vec<AcoustidRecording>, 56 + #[serde(untagged)] 57 + pub enum AuddResult { 58 + Groups(Vec<AuddGroup>), 59 + Single(AuddSong), 77 60 } 78 61 79 62 #[derive(Debug, Deserialize)] 80 - struct AcoustidRecording { 81 - title: Option<String>, 82 - #[serde(default)] 83 - artists: Vec<AcoustidArtist>, 63 + pub 
struct AuddGroup { 64 + pub offset: Option<serde_json::Value>, 65 + pub songs: Option<Vec<AuddSong>>, 84 66 } 85 67 86 68 #[derive(Debug, Deserialize)] 87 - struct AcoustidArtist { 88 - name: String, 69 + #[allow(dead_code)] 70 + pub struct AuddSong { 71 + pub artist: Option<String>, 72 + pub title: Option<String>, 73 + pub album: Option<String>, 74 + pub score: Option<i32>, 75 + pub isrc: Option<String>, 76 + pub timecode: Option<String>, 77 + pub release_date: Option<String>, 78 + pub label: Option<String>, 79 + pub song_link: Option<String>, 89 80 } 90 81 91 82 // --- handler --- 92 83 93 - /// Scan audio for copyright matches via fpcalc + AcoustID. 84 + /// Scan audio for copyright matches via AuDD. 94 85 pub async fn scan( 95 86 State(state): State<AppState>, 96 87 Json(request): Json<ScanRequest>, 97 88 ) -> Result<Json<ScanResponse>, AppError> { 98 89 info!(audio_url = %request.audio_url, "scanning audio"); 99 90 100 - // 1. download audio to temp file 101 91 let client = reqwest::Client::new(); 102 - let audio_bytes = client 103 - .get(&request.audio_url) 104 - .send() 105 - .await 106 - .map_err(|e| AppError::Scan(format!("failed to download audio: {e}")))? 107 - .bytes() 108 - .await 109 - .map_err(|e| AppError::Scan(format!("failed to read audio bytes: {e}")))?; 110 - 111 - let tmp = tempfile::NamedTempFile::new() 112 - .map_err(|e| AppError::Scan(format!("failed to create temp file: {e}")))?; 113 - let tmp_path = tmp.path().to_owned(); 114 - { 115 - let mut file = tokio::fs::File::create(&tmp_path) 116 - .await 117 - .map_err(|e| AppError::Scan(format!("failed to write temp file: {e}")))?; 118 - file.write_all(&audio_bytes) 119 - .await 120 - .map_err(|e| AppError::Scan(format!("failed to write audio data: {e}")))?; 121 - } 122 - 123 - // 2. 
run fpcalc
-    let fpcalc_output = tokio::process::Command::new("fpcalc")
-        .arg("-json")
-        .arg(&tmp_path)
-        .output()
-        .await
-        .map_err(|e| AppError::Scan(format!("fpcalc execution failed: {e}")))?;
-
-    if !fpcalc_output.status.success() {
-        let stderr = String::from_utf8_lossy(&fpcalc_output.stderr);
-        return Err(AppError::Scan(format!("fpcalc failed: {stderr}")));
-    }
-
-    let fpcalc: FpcalcOutput = serde_json::from_slice(&fpcalc_output.stdout)
-        .map_err(|e| AppError::Scan(format!("failed to parse fpcalc output: {e}")))?;
-
-    info!(duration = fpcalc.duration, "fpcalc fingerprint generated");
-
-    // 3. lookup on AcoustID
-    let raw_response: serde_json::Value = client
-        .post("https://api.acoustid.org/v2/lookup")
+    let response = client
+        .post(&state.audd_api_url)
         .form(&[
-            ("client", state.acoustid_api_key.as_str()),
-            ("meta", "recordings"),
-            ("duration", &(fpcalc.duration as i64).to_string()),
-            ("fingerprint", &fpcalc.fingerprint),
+            ("api_token", &state.audd_api_token),
+            ("url", &request.audio_url),
+            ("accurate_offsets", &"1".to_string()),
         ])
         .send()
         .await
-        .map_err(|e| AppError::Scan(format!("acoustid request failed: {e}")))?
+        .map_err(|e| AppError::Audd(format!("request failed: {e}")))?;
+
+    let raw_response: serde_json::Value = response
         .json()
         .await
-        .map_err(|e| AppError::Scan(format!("failed to parse acoustid response: {e}")))?;
+        .map_err(|e| AppError::Audd(format!("failed to parse response: {e}")))?;

-    let acoustid_response: AcoustidResponse = serde_json::from_value(raw_response.clone())
-        .map_err(|e| AppError::Scan(format!("failed to deserialize acoustid response: {e}")))?;
+    let audd_response: AuddResponse = serde_json::from_value(raw_response.clone())
+        .map_err(|e| AppError::Audd(format!("failed to parse audd response: {e}")))?;

-    if acoustid_response.status == "error" {
-        let msg = acoustid_response
-            .error
-            .map(|e| e.message)
-            .unwrap_or_else(|| "unknown error".to_string());
-        return Err(AppError::Scan(format!("acoustid returned error: {msg}")));
+    if audd_response.status.as_deref() == Some("error") {
+        return Err(AppError::Audd(format!(
+            "audd returned error: {}",
+            raw_response
+        )));
     }

-    // 4. map to response format
-    let matches = extract_matches(&acoustid_response);
-    let highest_score = matches.iter().map(|m| m.score).max().unwrap_or(0);
+    let matches = extract_matches(&audd_response);
     let (dominant_match, dominant_match_pct) = find_dominant_match(&matches);

-    let is_flagged = highest_score >= state.copyright_score_threshold;
+    // Flag if any single song dominates the matches (>= threshold % of segments)
+    // This filters out false positives where random segments match different songs
+    let is_flagged = dominant_match_pct >= state.copyright_score_threshold;

     info!(
         match_count = matches.len(),
-        highest_score,
         dominant_match_pct,
         dominant_match = dominant_match.as_deref().unwrap_or("none"),
         is_flagged,
···
         is_flagged,
         dominant_match_pct,
         dominant_match,
-        highest_score,
+        highest_score: 0, // AudD doesn't return scores
         raw_response,
     }))
 }

 // --- helpers ---

-fn extract_matches(response: &AcoustidResponse) -> Vec<AuddMatch> {
-    response
-        .results
-        .iter()
-        .flat_map(|result| {
-            let score = (result.score * 100.0) as i32;
-            result.recordings.iter().map(move |recording| {
-                let artist = if recording.artists.is_empty() {
-                    "Unknown".to_string()
-                } else {
-                    recording
-                        .artists
-                        .iter()
-                        .map(|a| a.name.as_str())
-                        .collect::<Vec<_>>()
-                        .join(", ")
-                };
-                AuddMatch {
-                    artist,
-                    title: recording
-                        .title
-                        .clone()
-                        .unwrap_or_else(|| "Unknown".to_string()),
-                    album: None,
-                    score,
-                    isrc: None,
-                    timecode: None,
-                    offset_ms: None,
-                }
+fn extract_matches(response: &AuddResponse) -> Vec<AuddMatch> {
+    let Some(result) = &response.result else {
+        return vec![];
+    };
+
+    match result {
+        AuddResult::Groups(groups) => groups
+            .iter()
+            .flat_map(|group| {
+                group
+                    .songs
+                    .as_ref()
+                    .map(|songs| {
+                        songs
+                            .iter()
+                            .map(|song| parse_song(song, group.offset.as_ref()))
+                            .collect::<Vec<_>>()
+                    })
+                    .unwrap_or_default()
             })
-        })
-        .collect()
+            .collect(),
+        AuddResult::Single(song) => vec![parse_song(song, None)],
+    }
+}
+
+fn parse_song(song: &AuddSong, offset: Option<&serde_json::Value>) -> AuddMatch {
+    let offset_ms = offset.and_then(|v| match v {
+        serde_json::Value::Number(n) => n.as_i64(),
+        serde_json::Value::String(s) => parse_timecode_to_ms(s),
+        _ => None,
+    });
+
+    AuddMatch {
+        artist: song.artist.clone().unwrap_or_else(|| "Unknown".to_string()),
+        title: song.title.clone().unwrap_or_else(|| "Unknown".to_string()),
+        album: song.album.clone(),
+        score: song.score.unwrap_or(0),
+        isrc: song.isrc.clone(),
+        timecode: song.timecode.clone(),
+        offset_ms,
+    }
+}
+
+fn parse_timecode_to_ms(timecode: &str) -> Option<i64> {
+    let parts: Vec<&str> = timecode.split(':').collect();
+    match parts.len() {
+        2 => {
+            let mins: i64 = parts[0].parse().ok()?;
+            let secs: i64 = parts[1].parse().ok()?;
+            Some((mins * 60 + secs) * 1000)
+        }
+        3 => {
+            let hours: i64 = parts[0].parse().ok()?;
+            let mins: i64 = parts[1].parse().ok()?;
+            let secs: i64 = parts[2].parse().ok()?;
+            Some((hours * 3600 + mins * 60 + secs) * 1000)
+        }
+        _ => None,
+    }
 }

 /// Find the dominant song in matches (the one that appears most frequently).
 /// Returns (dominant_song_name, percentage_of_total_matches).
+///
+/// AudD doesn't return confidence scores, so we use match frequency as a proxy:
+/// if the same song matches across many segments of the track, it's likely real.
+/// Random false positives tend to be scattered across different songs.
 fn find_dominant_match(matches: &[AuddMatch]) -> (Option<String>, i32) {
     if matches.is_empty() {
         return (None, 0);
···

     (Some(dominant_name), pct)
 }
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    fn make_match(artist: &str, title: &str, score: i32) -> AuddMatch {
-        AuddMatch {
-            artist: artist.to_string(),
-            title: title.to_string(),
-            album: None,
-            score,
-            isrc: None,
-            timecode: None,
-            offset_ms: None,
-        }
-    }
-
-    // --- extract_matches ---
-
-    #[test]
-    fn test_extract_matches_basic() {
-        let response = AcoustidResponse {
-            status: "ok".to_string(),
-            results: vec![AcoustidResult {
-                score: 0.97,
-                recordings: vec![AcoustidRecording {
-                    title: Some("Never Gonna Give You Up".to_string()),
-                    artists: vec![AcoustidArtist {
-                        name: "Rick Astley".to_string(),
-                    }],
-                }],
-            }],
-            error: None,
-        };
-
-        let matches = extract_matches(&response);
-        assert_eq!(matches.len(), 1);
-        assert_eq!(matches[0].artist, "Rick Astley");
-        assert_eq!(matches[0].title, "Never Gonna Give You Up");
-        assert_eq!(matches[0].score, 97);
-    }
-
-    #[test]
-    fn test_extract_matches_multiple_artists() {
-        let response = AcoustidResponse {
-            status: "ok".to_string(),
-            results: vec![AcoustidResult {
-                score: 0.85,
-                recordings: vec![AcoustidRecording {
-                    title: Some("Under Pressure".to_string()),
-                    artists: vec![
-                        AcoustidArtist {
-                            name: "Queen".to_string(),
-                        },
-                        AcoustidArtist {
-                            name: "David Bowie".to_string(),
-                        },
-                    ],
-                }],
-            }],
-            error: None,
-        };
-
-        let matches = extract_matches(&response);
-        assert_eq!(matches[0].artist, "Queen, David Bowie");
-        assert_eq!(matches[0].score, 85);
-    }
-
-    #[test]
-    fn test_extract_matches_missing_fields() {
-        let response = AcoustidResponse {
-            status: "ok".to_string(),
-            results: vec![AcoustidResult {
-                score: 0.5,
-                recordings: vec![AcoustidRecording {
-                    title: None,
-                    artists: vec![],
-                }],
-            }],
-            error: None,
-        };
-
-        let matches = extract_matches(&response);
-        assert_eq!(matches.len(), 1);
-        assert_eq!(matches[0].artist, "Unknown");
-        assert_eq!(matches[0].title, "Unknown");
-        assert_eq!(matches[0].score, 50);
-    }
-
-    #[test]
-    fn test_extract_matches_empty_results() {
-        let response = AcoustidResponse {
-            status: "ok".to_string(),
-            results: vec![],
-            error: None,
-        };
-
-        let matches = extract_matches(&response);
-        assert!(matches.is_empty());
-    }
-
-    #[test]
-    fn test_extract_matches_multiple_results_and_recordings() {
-        let response = AcoustidResponse {
-            status: "ok".to_string(),
-            results: vec![
-                AcoustidResult {
-                    score: 0.97,
-                    recordings: vec![
-                        AcoustidRecording {
-                            title: Some("Song A".to_string()),
-                            artists: vec![AcoustidArtist {
-                                name: "Artist 1".to_string(),
-                            }],
-                        },
-                        AcoustidRecording {
-                            title: Some("Song B".to_string()),
-                            artists: vec![AcoustidArtist {
-                                name: "Artist 2".to_string(),
-                            }],
-                        },
-                    ],
-                },
-                AcoustidResult {
-                    score: 0.42,
-                    recordings: vec![AcoustidRecording {
-                        title: Some("Song C".to_string()),
-                        artists: vec![AcoustidArtist {
-                            name: "Artist 3".to_string(),
-                        }],
-                    }],
-                },
-            ],
-            error: None,
-        };
-
-        let matches = extract_matches(&response);
-        assert_eq!(matches.len(), 3);
-        // First result's recordings get score 97
-        assert_eq!(matches[0].score, 97);
-        assert_eq!(matches[1].score, 97);
-        // Second result's recording gets score 42
-        assert_eq!(matches[2].score, 42);
-    }
-
-    // --- find_dominant_match ---
-
-    #[test]
-    fn test_find_dominant_empty() {
-        let (name, pct) = find_dominant_match(&[]);
-        assert!(name.is_none());
-        assert_eq!(pct, 0);
-    }
-
-    #[test]
-    fn test_find_dominant_single_match() {
-        let matches = vec![make_match("Rick Astley", "Never Gonna Give You Up", 97)];
-        let (name, pct) = find_dominant_match(&matches);
-        assert_eq!(name.unwrap(), "rick astley - never gonna give you up");
-        assert_eq!(pct, 100);
-    }
-
-    #[test]
-    fn test_find_dominant_clear_winner() {
-        let matches = vec![
-            make_match("Rick Astley", "Never Gonna Give You Up", 97),
-            make_match("Rick Astley", "Never Gonna Give You Up", 95),
-            make_match("Rick Astley", "Never Gonna Give You Up", 90),
-            make_match("Other Artist", "Other Song", 40),
-        ];
-        let (name, pct) = find_dominant_match(&matches);
-        assert_eq!(name.unwrap(), "rick astley - never gonna give you up");
-        assert_eq!(pct, 75); // 3 out of 4
-    }
-
-    #[test]
-    fn test_find_dominant_case_insensitive() {
-        let matches = vec![
-            make_match("RICK ASTLEY", "Never Gonna Give You Up", 97),
-            make_match("rick astley", "never gonna give you up", 95),
-        ];
-        let (_name, pct) = find_dominant_match(&matches);
-        assert_eq!(pct, 100); // both collapse to same key
-    }
-
-    // --- AcoustID response deserialization ---
-
-    #[test]
-    fn test_acoustid_response_parsing() {
-        let json = serde_json::json!({
-            "status": "ok",
-            "results": [{
-                "score": 0.971652,
-                "recordings": [{
-                    "title": "Never Gonna Give You Up",
-                    "artists": [{"name": "Rick Astley"}]
-                }]
-            }]
-        });
-
-        let response: AcoustidResponse = serde_json::from_value(json).unwrap();
-        assert_eq!(response.status, "ok");
-        assert_eq!(response.results.len(), 1);
-        assert!((response.results[0].score - 0.971652).abs() < 0.0001);
-    }
-
-    #[test]
-    fn test_acoustid_error_response_parsing() {
-        let json = serde_json::json!({
-            "status": "error",
-            "error": {"message": "invalid api key"}
-        });
-
-        let response: AcoustidResponse = serde_json::from_value(json).unwrap();
-        assert_eq!(response.status, "error");
-        assert_eq!(response.error.unwrap().message, "invalid api key");
-    }
-
-    #[test]
-    fn test_acoustid_empty_results() {
-        let json = serde_json::json!({
-            "status": "ok",
-            "results": []
-        });
-
-        let response: AcoustidResponse = serde_json::from_value(json).unwrap();
-        assert!(response.results.is_empty());
-    }
-
-    #[test]
-    fn test_acoustid_missing_recordings() {
-        // AcoustID can return results with no recordings
-        let json = serde_json::json!({
-            "status": "ok",
-            "results": [{
-                "score": 0.5
-            }]
-        });
-
-        let response: AcoustidResponse = serde_json::from_value(json).unwrap();
-        assert!(response.results[0].recordings.is_empty());
-    }
-
-    #[test]
-    fn test_fpcalc_output_parsing() {
-        let json = serde_json::json!({
-            "duration": 211.48,
-            "fingerprint": "AQADtNIiTUkkOcmRH5d0HNFxXMdxHPmR"
-        });
-
-        let output: FpcalcOutput = serde_json::from_value(json).unwrap();
-        assert!((output.duration - 211.48).abs() < 0.01);
-        assert!(output.fingerprint.starts_with("AQADt"));
-    }
-}
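note: the body of `find_dominant_match` is elided in this diff, so the frequency heuristic is easier to follow with a standalone sketch. the version below is an assumption, not the service's code: `Match` is a cut-down stand-in for `AuddMatch`, the lowercase `"artist - title"` key is inferred from the removed tests, and `is_flagged` plus the tie-break rule are hypothetical additions for illustration.

```rust
use std::collections::HashMap;

/// Cut-down stand-in for the service's `AuddMatch` (assumption: only these
/// two fields matter for grouping).
struct Match {
    artist: String,
    title: String,
}

/// Returns the most frequent lowercase "artist - title" key and the
/// percentage of all matches that belong to it.
fn find_dominant_match(matches: &[Match]) -> (Option<String>, i32) {
    if matches.is_empty() {
        return (None, 0);
    }

    // Count how many matched segments point at each song.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for m in matches {
        let key = format!("{} - {}", m.artist.to_lowercase(), m.title.to_lowercase());
        *counts.entry(key).or_insert(0) += 1;
    }

    // Highest count wins; ties go to the alphabetically first key so the
    // result is deterministic (hypothetical tie-break).
    let (name, count) = counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then_with(|| b.0.cmp(&a.0)))
        .expect("matches is non-empty");

    let pct = (count * 100 / matches.len()) as i32;
    (Some(name), pct)
}

/// Hypothetical wrapper mirroring `dominant_match_pct >= threshold`.
fn is_flagged(matches: &[Match], threshold_pct: i32) -> bool {
    find_dominant_match(matches).1 >= threshold_pct
}
```

with a threshold of 30, five segments that each match a different song give a dominant share of 20% and stay unflagged, while three of four segments matching one song give 75% and are flagged.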
+8 -4
services/moderation/src/config.rs
···
     pub host: String,
     pub port: u16,
     pub auth_token: Option<String>,
-    pub acoustid_api_key: String,
+    pub audd_api_token: String,
+    pub audd_api_url: String,
     pub database_url: Option<String>,
     pub labeler_did: Option<String>,
     pub labeler_signing_key: Option<String>,
···
     pub claude_api_key: Option<String>,
     /// Claude model to use (default: claude-sonnet-4-5-20250929)
     pub claude_model: String,
-    /// Minimum AcoustID score (0-100) to flag a track as a copyright match (default: 30)
+    /// Minimum percentage of matches that must belong to a single song to flag (default: 30)
+    /// AudD doesn't return confidence scores, so we use match frequency as a proxy.
     pub copyright_score_threshold: i32,
 }

···
                 .and_then(|v| v.parse().ok())
                 .unwrap_or(8083),
             auth_token: env::var("MODERATION_AUTH_TOKEN").ok(),
-            acoustid_api_key: env::var("MODERATION_ACOUSTID_API_KEY")
-                .map_err(|_| anyhow!("MODERATION_ACOUSTID_API_KEY is required"))?,
+            audd_api_token: env::var("MODERATION_AUDD_API_TOKEN")
+                .map_err(|_| anyhow!("MODERATION_AUDD_API_TOKEN is required"))?,
+            audd_api_url: env::var("MODERATION_AUDD_API_URL")
+                .unwrap_or_else(|_| "https://enterprise.audd.io/".to_string()),
             database_url: env::var("MODERATION_DATABASE_URL").ok(),
             labeler_did: env::var("MODERATION_LABELER_DID").ok(),
             labeler_signing_key: env::var("MODERATION_LABELER_SIGNING_KEY").ok(),
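note: the config change boils down to one required value (the token), one value with a default (the URL), and a threshold that falls back to 30. a std-only sketch of that resolution logic follows. `AuddSettings` and `resolve_audd_settings` are hypothetical names, and the function takes already-read values instead of calling `env::var` so it can be exercised without touching the process environment. the threshold's env var name is not shown in this diff, so none is assumed.

```rust
/// Hypothetical container; the service's real `Config` has more fields.
#[derive(Debug, PartialEq)]
struct AuddSettings {
    api_token: String,
    api_url: String,
    score_threshold: i32,
}

// Defaults taken from the diff above.
const DEFAULT_AUDD_URL: &str = "https://enterprise.audd.io/";
const DEFAULT_THRESHOLD: i32 = 30;

/// Resolve AuDD settings from optional raw values (e.g. `env::var(..).ok()`).
fn resolve_audd_settings(
    token: Option<String>,
    url: Option<String>,
    threshold: Option<String>,
) -> Result<AuddSettings, String> {
    // The token is mandatory: fail fast with the same message as the diff.
    let api_token = token.ok_or_else(|| "MODERATION_AUDD_API_TOKEN is required".to_string())?;

    // The URL is optional and falls back to the enterprise endpoint.
    let api_url = url.unwrap_or_else(|| DEFAULT_AUDD_URL.to_string());

    // An unset or unparsable threshold silently falls back to the default,
    // following the same pattern the diff shows for `port`.
    let score_threshold = threshold
        .and_then(|v| v.parse().ok())
        .unwrap_or(DEFAULT_THRESHOLD);

    Ok(AuddSettings {
        api_token,
        api_url,
        score_threshold,
    })
}
```

in the service this would presumably be fed from `env::var("MODERATION_AUDD_API_TOKEN").ok()` and `env::var("MODERATION_AUDD_API_URL").ok()`. one consequence of the fallback pattern is that a typo in the threshold value is not reported and the default applies.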
+1 -1
services/moderation/src/handlers.rs
···
 }

 /// Normalize a score from integer (0-100) to float (0.0-1.0) range.
-/// AcoustID returns scores as integers like 85 meaning 85%.
+/// AuDD returns scores as integers like 85 meaning 85%.
 fn normalize_score(score: f64) -> f64 {
     if score > 1.0 {
         score / 100.0
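note: the hunk cuts off before the `else` branch of `normalize_score`. a minimal completion, assuming the remaining branch passes values at or below 1.0 through unchanged (that branch is not visible in this diff):

```rust
/// Normalize a score that may be expressed as 0-100 or as 0.0-1.0.
/// Assumption: values <= 1.0 are already normalized and returned as-is.
fn normalize_score(score: f64) -> f64 {
    if score > 1.0 {
        // Treat anything above 1.0 as a percentage, e.g. 85 -> 0.85.
        score / 100.0
    } else {
        score
    }
}
```

under this reading an input of exactly 1.0 is interpreted as 100%, not 1%, which is the one ambiguous point of the heuristic.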
+4 -3
services/moderation/src/main.rs
···
 //! plyr.fm moderation service
 //!
 //! Provides:
-//! - Audio fingerprinting via fpcalc + AcoustID for copyright detection
+//! - AuDD audio fingerprinting for copyright detection
 //! - ATProto labeler endpoints (queryLabels, subscribeLabels)
 //! - Label emission for copyright violations
 //! - Admin UI for reviewing and resolving flags
···
     };

     let state = AppState {
-        acoustid_api_key: config.acoustid_api_key,
+        audd_api_token: config.audd_api_token,
+        audd_api_url: config.audd_api_url,
         db: db.map(Arc::new),
         signer: signer.map(Arc::new),
         label_tx,
···
         .route("/health", get(handlers::health))
         // Sensitive images (public)
         .route("/sensitive-images", get(handlers::get_sensitive_images))
-        // Copyright scanning (fpcalc + AcoustID)
+        // AuDD scanning
         .route("/scan", post(audd::scan))
         // Image moderation via Claude
         .route("/scan-image", post(handlers::scan_image))
+6 -5
services/moderation/src/state.rs
···
 /// Shared application state.
 #[derive(Clone)]
 pub struct AppState {
-    pub acoustid_api_key: String,
+    pub audd_api_token: String,
+    pub audd_api_url: String,
     pub db: Option<Arc<LabelDb>>,
     pub signer: Option<Arc<LabelSigner>>,
     pub label_tx: Option<broadcast::Sender<(i64, Label)>>,
     /// Claude client for image moderation (if configured)
     pub claude: Option<Arc<ClaudeClient>>,
-    /// Minimum AcoustID score (0-100) to flag a track as a copyright match
+    /// Minimum percentage of matches that must belong to a single song to flag
     pub copyright_score_threshold: i32,
 }

 /// Application error type.
 #[derive(Debug, thiserror::Error)]
 pub enum AppError {
-    #[error("scan error: {0}")]
-    Scan(String),
+    #[error("audd error: {0}")]
+    Audd(String),

     #[error("claude error: {0}")]
     Claude(String),
···
     fn into_response(self) -> Response {
         error!(error = %self, "request failed");
         let (status, error_type) = match &self {
-            AppError::Scan(_) => (StatusCode::BAD_GATEWAY, "ScanError"),
+            AppError::Audd(_) => (StatusCode::BAD_GATEWAY, "AuddError"),
             AppError::Claude(_) => (StatusCode::BAD_GATEWAY, "ClaudeError"),
             AppError::ImageModerationNotConfigured => {
                 (StatusCode::SERVICE_UNAVAILABLE, "ImageModerationNotConfigured")
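note: the rename from `Scan` to `Audd` changes both the log message prefix and the `error_type` string returned to clients. a dependency-free sketch of that mapping is below. it is an illustration only: the real code uses `thiserror` and axum's `StatusCode`, `status_and_type` is a hypothetical helper, only the three variants visible in this diff are modeled, and the display text for `ImageModerationNotConfigured` is a placeholder because the diff does not show it.

```rust
use std::fmt;

/// Reduced copy of `AppError` with only the variants visible in this diff.
#[derive(Debug)]
enum AppError {
    Audd(String),
    Claude(String),
    ImageModerationNotConfigured,
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Audd(msg) => write!(f, "audd error: {msg}"),
            AppError::Claude(msg) => write!(f, "claude error: {msg}"),
            // Placeholder wording; the real message is not in the diff.
            AppError::ImageModerationNotConfigured => {
                write!(f, "image moderation not configured")
            }
        }
    }
}

impl AppError {
    /// Hypothetical helper: numeric HTTP status plus the error type string.
    /// 502 corresponds to BAD_GATEWAY, 503 to SERVICE_UNAVAILABLE.
    fn status_and_type(&self) -> (u16, &'static str) {
        match self {
            AppError::Audd(_) => (502, "AuddError"),
            AppError::Claude(_) => (502, "ClaudeError"),
            AppError::ImageModerationNotConfigured => {
                (503, "ImageModerationNotConfigured")
            }
        }
    }
}
```

any client that matched on the `"ScanError"` string introduced by #1163 would need to switch back to `"AuddError"` after this revert.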