Ionosphere.tv

chore: remove superpowers plans/specs, perf traces, pycache from repo

Added docs/superpowers/, __pycache__/, Trace-*.json.gz to gitignore.
No API tokens in tracked files (only in untracked .env).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

+3 -15565
+3
.gitignore
 .sprite
 .claude/worktrees
 .superpowers/
+docs/superpowers/
+__pycache__/
+Trace-*.json.gz
Trace-20260413T010910.json.gz

This is a binary file and will not be displayed.

apps/ionosphere-appview/tools/__pycache__/evaluate.cpython-313.pyc

This is a binary file and will not be displayed.

apps/ionosphere-appview/tools/__pycache__/merge_enrichment.cpython-313.pyc

This is a binary file and will not be displayed.

apps/ionosphere-appview/tools/__pycache__/test_evaluate.cpython-313-pytest-9.0.2.pyc

This is a binary file and will not be displayed.

apps/ionosphere-appview/tools/__pycache__/test_merge.cpython-313-pytest-9.0.2.pyc

This is a binary file and will not be displayed.

-2756
docs/superpowers/plans/2026-03-30-ionosphere-implementation.md
-# Ionosphere Implementation Plan
-
-> **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
-
-**Goal:** Build ionosphere.tv — a semantically enriched AT Protocol conference video archive for ATmosphereConf 2026 VODs.
-
-**Architecture:** Pannacotta-pattern pipeline: ingest source AT Protocol records (Streamplace VODs + ATmosphereConf schedule) → correlate → transcribe → assemble RelationalText documents → LLM enrichment → SQLite appview → Next.js SSG frontend with synchronized video+transcript playback.
-
-**Tech Stack:** TypeScript, pnpm workspaces, Next.js 15, React 18, Tailwind CSS, Hono, better-sqlite3, relational-text, @atproto/api, vitest.
-
-**Spec:** `docs/superpowers/specs/2026-03-30-ionosphere-design.md`
-
----
-
-## Chunk 1: Project Scaffold & Lexicons
-
-Sets up the monorepo workspace, defines all AT Protocol lexicons, and creates the format-lexicon with facet type definitions. After this chunk, the project structure exists and the data model is formalized.
-
-### Task 1: Initialize pnpm workspace
-
-**Files:**
-- Create: `package.json`
-- Create: `pnpm-workspace.yaml`
-- Create: `tsconfig.json`
-- Create: `.gitignore`
-
-- [ ] **Step 1: Create root package.json**
-
-```json
-{
-  "name": "ionosphere-workspace",
-  "private": true,
-  "scripts": {
-    "dev": "pnpm --filter ionosphere dev",
-    "build": "pnpm --filter ionosphere build",
-    "appview": "pnpm --filter ionosphere-appview appview"
-  },
-  "devDependencies": {}
-}
-```
-
-- [ ] **Step 2: Create pnpm-workspace.yaml**
-
-```yaml
-packages:
-  - 'apps/*'
-  - 'formats/*'
-```
-
-- [ ] **Step 3: Create tsconfig.json**
-
-Base TypeScript config for the workspace. ESM, strict, Node 20+ target.
-
-```json
-{
-  "compilerOptions": {
-    "target": "ES2022",
-    "module": "ESNext",
-    "moduleResolution": "bundler",
-    "strict": true,
-    "esModuleInterop": true,
-    "skipLibCheck": true,
-    "forceConsistentCasingInFileNames": true,
-    "resolveJsonModule": true,
-    "declaration": true,
-    "declarationMap": true,
-    "sourceMap": true,
-    "outDir": "dist"
-  },
-  "exclude": ["node_modules", "dist"]
-}
-```
-
-- [ ] **Step 4: Create .gitignore**
-
-```
-node_modules/
-dist/
-.next/
-data/audio/
-data/transcripts/
-*.sqlite
-*.sqlite-journal
-*.sqlite-wal
-.env
-.env.local
-```
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add package.json pnpm-workspace.yaml tsconfig.json .gitignore
-git commit -m "chore: initialize pnpm workspace"
-```
-
-### Task 2: Define AT Protocol lexicons
-
-**Files:**
-- Create: `lexicons/tv/ionosphere/talk.json`
-- Create: `lexicons/tv/ionosphere/speaker.json`
-- Create: `lexicons/tv/ionosphere/concept.json`
-- Create: `lexicons/tv/ionosphere/event.json`
-
-- [ ] **Step 1: Create talk lexicon**
-
-```json
-{
-  "lexicon": 1,
-  "$type": "com.atproto.lexicon.schema",
-  "id": "tv.ionosphere.talk",
-  "revision": 1,
-  "description": "A conference talk with video reference and enriched transcript document.",
-  "defs": {
-    "main": {
-      "type": "record",
-      "key": "any",
-      "record": {
-        "type": "object",
-        "required": ["title", "eventUri"],
-        "properties": {
-          "title": {
-            "type": "string",
-            "description": "Talk title."
-          },
-          "document": {
-            "type": "ref",
-            "ref": "org.relationaltext.richtext.document",
-            "description": "Enriched transcript document with temporal and semantic facets."
-          },
-          "speakerUris": {
-            "type": "array",
-            "items": { "type": "string", "format": "at-uri" },
-            "description": "AT URIs of tv.ionosphere.speaker records."
-          },
-          "videoUri": {
-            "type": "string",
-            "format": "at-uri",
-            "description": "AT URI to place.stream.video record."
-          },
-          "scheduleUri": {
-            "type": "string",
-            "format": "at-uri",
-            "description": "AT URI to source community.lexicon.calendar.event record."
-          },
-          "eventUri": {
-            "type": "string",
-            "format": "at-uri",
-            "description": "AT URI to tv.ionosphere.event record."
-          },
-          "room": {
-            "type": "string",
-            "description": "Room or track name."
-          },
-          "category": {
-            "type": "string",
-            "description": "Talk category from schedule."
-          },
-          "talkType": {
-            "type": "string",
-            "description": "Type: presentation, lightning-talk, panel, workshop, etc."
-          },
-          "startsAt": {
-            "type": "string",
-            "format": "datetime",
-            "description": "Scheduled start time (ISO 8601)."
-          },
-          "endsAt": {
-            "type": "string",
-            "format": "datetime",
-            "description": "Scheduled end time (ISO 8601)."
-          },
-          "duration": {
-            "type": "integer",
-            "description": "Video duration in nanoseconds (from VOD record)."
-          },
-          "description": {
-            "type": "string",
-            "description": "Talk description/abstract from schedule."
-          }
-        }
-      }
-    }
-  }
-}
-```
-
-- [ ] **Step 2: Create speaker lexicon**
-
-```json
-{
-  "lexicon": 1,
-  "$type": "com.atproto.lexicon.schema",
-  "id": "tv.ionosphere.speaker",
-  "revision": 1,
-  "description": "A conference speaker.",
-  "defs": {
-    "main": {
-      "type": "record",
-      "key": "any",
-      "record": {
-        "type": "object",
-        "required": ["name"],
-        "properties": {
-          "name": {
-            "type": "string",
-            "description": "Speaker display name."
-          },
-          "handle": {
-            "type": "string",
-            "description": "AT Protocol handle (e.g., 'signez.fr')."
-          },
-          "did": {
-            "type": "string",
-            "format": "did",
-            "description": "Speaker's DID, if known."
-          },
-          "bio": {
-            "type": "string",
-            "description": "Speaker bio."
-          },
-          "affiliations": {
-            "type": "array",
-            "items": { "type": "string" },
-            "description": "Organizations, projects, or affiliations."
-          }
-        }
-      }
-    }
-  }
-}
-```
-
-- [ ] **Step 3: Create concept lexicon**
-
-```json
-{
-  "lexicon": 1,
-  "$type": "com.atproto.lexicon.schema",
-  "id": "tv.ionosphere.concept",
-  "revision": 1,
-  "description": "A knowledge entity referenced in talk transcripts.",
-  "defs": {
-    "main": {
-      "type": "record",
-      "key": "any",
-      "record": {
-        "type": "object",
-        "required": ["name"],
-        "properties": {
-          "name": {
-            "type": "string",
-            "description": "Canonical concept name."
-          },
-          "aliases": {
-            "type": "array",
-            "items": { "type": "string" },
-            "description": "Alternative names for matching."
-          },
-          "description": {
-            "type": "string",
-            "description": "Brief description of the concept."
-          },
-          "wikidataId": {
-            "type": "string",
-            "description": "Wikidata Q-identifier (e.g., 'Q123456')."
-          },
-          "url": {
-            "type": "string",
-            "format": "uri",
-            "description": "Canonical external URL for the concept."
-          }
-        }
-      }
-    }
-  }
-}
-```
-
-- [ ] **Step 4: Create event lexicon**
-
-```json
-{
-  "lexicon": 1,
-  "$type": "com.atproto.lexicon.schema",
-  "id": "tv.ionosphere.event",
-  "revision": 1,
-  "description": "A conference or event whose talks are archived.",
-  "defs": {
-    "main": {
-      "type": "record",
-      "key": "any",
-      "record": {
-        "type": "object",
-        "required": ["name", "startsAt", "endsAt"],
-        "properties": {
-          "name": {
-            "type": "string",
-            "description": "Event name."
-          },
-          "description": {
-            "type": "string",
-            "description": "Event description."
-          },
-          "location": {
-            "type": "string",
-            "description": "Venue and city."
-          },
-          "startsAt": {
-            "type": "string",
-            "format": "datetime",
-            "description": "Event start date (ISO 8601)."
-          },
-          "endsAt": {
-            "type": "string",
-            "format": "datetime",
-            "description": "Event end date (ISO 8601)."
-          },
-          "tracks": {
-            "type": "array",
-            "items": { "type": "string" },
-            "description": "Room/track names."
-          },
-          "scheduleRepo": {
-            "type": "string",
-            "format": "did",
-            "description": "DID of the repo containing schedule records."
-          },
-          "vodRepo": {
-            "type": "string",
-            "format": "did",
-            "description": "DID of the repo containing VOD records."
-          }
-        }
-      }
-    }
-  }
-}
-```
-
-- [ ] **Step 5: Commit**
-
-```bash
-git add lexicons/
-git commit -m "feat: define AT Protocol lexicons for talk, speaker, concept, event"
-```
-
-### Task 3: Create format-lexicon and facet type definitions
-
-**Files:**
-- Create: `formats/tv.ionosphere/package.json`
-- Create: `formats/tv.ionosphere/tsconfig.json`
-- Create: `formats/tv.ionosphere/ionosphere.lexicon.json`
-
-- [ ] **Step 1: Create format package.json**
-
-```json
-{
-  "name": "@ionosphere/format",
-  "version": "0.1.0",
-  "type": "module",
-  "main": "ts/index.ts",
-  "exports": {
-    ".": "./ts/index.ts",
-    "./assemble": "./ts/assemble.ts"
-  },
-  "dependencies": {
-    "relational-text": "^0.1.1"
-  },
-  "devDependencies": {
-    "typescript": "^5",
-    "vitest": "^3.0.0"
-  },
-  "scripts": {
-    "test": "vitest run",
-    "typecheck": "tsc --noEmit"
-  }
-}
-```
-
-- [ ] **Step 2: Create format tsconfig.json**
-
-```json
-{
-  "extends": "../../tsconfig.json",
-  "compilerOptions": {
-    "rootDir": "ts",
-    "outDir": "dist"
-  },
-  "include": ["ts"]
-}
-```
-
-- [ ] **Step 3: Create ionosphere.lexicon.json**
-
-This defines the RelationalText facet types, not AT Protocol record types.
-
-Follows pannacotta's `$type: "org.relationaltext.format-lexicon"` schema exactly.
-
-```json
-{
-  "$type": "org.relationaltext.format-lexicon",
-  "id": "tv.ionosphere.facet",
-  "name": "Ionosphere Talk Annotations",
-  "description": "Semantic annotations for conference talk transcripts: timestamps, speakers, concepts, cross-references",
-  "features": [
-    {
-      "typeId": "tv.ionosphere.facet#speaker-segment",
-      "featureClass": "block",
-      "expandStart": false,
-      "expandEnd": false
-    },
-    {
-      "typeId": "tv.ionosphere.facet#concept-ref",
-      "featureClass": "inline",
-      "expandStart": false,
-      "expandEnd": false
-    },
-    {
-      "typeId": "tv.ionosphere.facet#speaker-ref",
-      "featureClass": "inline",
-      "expandStart": false,
-      "expandEnd": false
-    },
-    {
-      "typeId": "tv.ionosphere.facet#talk-xref",
-      "featureClass": "inline",
-      "expandStart": false,
-      "expandEnd": false
-    },
-    {
-      "typeId": "tv.ionosphere.facet#link",
-      "featureClass": "inline",
-      "expandStart": false,
-      "expandEnd": false
-    },
-    {
-      "typeId": "tv.ionosphere.facet#timestamp",
-      "featureClass": "meta",
-      "expandStart": false,
-      "expandEnd": false
-    }
-  ]
-}
-```
-
-Feature property schemas (speakerUri, conceptUri, startTime/endTime, etc.) are defined in TypeScript code, not in the format-lexicon JSON — matching pannacotta's pattern.
-
-- [ ] **Step 4: Create format entry point**
-
-Create `formats/tv.ionosphere/ts/index.ts`:
-
-```typescript
-export const NAMESPACE = "tv.ionosphere";
-
-// Shared types used across packages
-export interface WordTimestamp {
-  word: string;
-  start: number; // seconds
-  end: number; // seconds
-  confidence: number;
-}
-
-export interface TranscriptResult {
-  text: string;
-  words: WordTimestamp[];
-}
-```
-
-Canonical location for shared types. The appview's `transcribe.ts` imports these rather than defining its own.
-
-- [ ] **Step 5: Run pnpm install**
-
-```bash
-pnpm install
-```
-
-- [ ] **Step 6: Commit**
-
-```bash
-git add formats/
-git commit -m "feat: create format-lexicon with facet type definitions"
-```
-
----
-
-### Task 4: Define lens specifications
-
-Lenses are the insulation layer between upstream source lexicons and the internal ionosphere data model. When Streamplace's lexicons change (they've indicated they will), only the lens rules need updating — the rest of the pipeline stays stable.
-
-**Files:**
-- Create: `formats/tv.ionosphere/lenses/schedule-to-talk.lens.json`
-- Create: `formats/tv.ionosphere/lenses/vod-to-talk.lens.json`
-- Create: `formats/tv.ionosphere/lenses/transcript-to-document.lens.json`
-
-- [ ] **Step 1: Create schedule-to-talk lens**
-
-Maps `community.lexicon.calendar.event` fields to `tv.ionosphere.talk` fields. This is where the upstream schedule format is absorbed.
-
-```json
-{
-  "$type": "org.relationaltext.lens",
-  "id": "community.lexicon.calendar.event.to.tv.ionosphere.talk.v1",
-  "description": "Transform ATmosphereConf schedule events into ionosphere talk records",
-  "source": "community.lexicon.calendar.event",
-  "target": "tv.ionosphere.talk",
-  "invertible": false,
-  "rules": [
-    {
-      "match": { "name": "name" },
-      "replace": { "name": "title" }
-    },
-    {
-      "match": { "name": "description" },
-      "replace": { "name": "description" }
-    },
-    {
-      "match": { "name": "startsAt" },
-      "replace": { "name": "startsAt" }
-    },
-    {
-      "match": { "name": "endsAt" },
-      "replace": { "name": "endsAt" }
-    },
-    {
-      "match": { "name": "additionalData.room" },
-      "replace": { "name": "room" }
-    },
-    {
-      "match": { "name": "additionalData.category" },
-      "replace": { "name": "category" }
-    },
-    {
-      "match": { "name": "additionalData.type" },
-      "replace": { "name": "talkType" }
-    },
-    {
-      "match": { "name": "additionalData.speakers" },
-      "replace": { "name": "speakers" }
-    }
-  ]
-}
-```
-
-- [ ] **Step 2: Create vod-to-talk lens**
-
-Maps `place.stream.video` fields to the video-related fields on `tv.ionosphere.talk`.
-
-```json
-{
-  "$type": "org.relationaltext.lens",
-  "id": "place.stream.video.to.tv.ionosphere.talk.v1",
-  "description": "Map Streamplace VOD record fields to ionosphere talk video metadata",
-  "source": "place.stream.video",
-  "target": "tv.ionosphere.talk",
-  "invertible": false,
-  "rules": [
-    {
-      "match": { "name": "title" },
-      "replace": { "name": "title" }
-    },
-    {
-      "match": { "name": "duration" },
-      "replace": { "name": "duration" }
-    },
-    {
-      "match": { "name": "creator" },
-      "replace": { "name": "streamCreator" }
-    }
-  ]
-}
-```
-
-- [ ] **Step 3: Create transcript-to-document lens**
-
-Maps raw transcript facets to ionosphere document facets. Initially a passthrough — the transcript word timestamps map directly to `tv.ionosphere.facet#timestamp`.
-
-```json
-{
-  "$type": "org.relationaltext.lens",
-  "id": "transcript.to.tv.ionosphere.document.v1",
-  "description": "Map raw transcript timing data to ionosphere document timestamp facets",
-  "source": "transcript.raw",
-  "target": "tv.ionosphere.facet",
-  "passthrough": "keep",
-  "rules": [
-    {
-      "match": { "name": "word-timing" },
-      "replace": { "name": "timestamp" }
-    }
-  ]
-}
-```
-
-- [ ] **Step 4: Create lens loader utility**
-
-Create `formats/tv.ionosphere/ts/lenses.ts`:
-
-```typescript
-import { readFileSync } from "node:fs";
-import path from "node:path";
-
-export interface LensSpec {
-  $type: string;
-  id: string;
-  description: string;
-  source: string;
-  target: string;
-  invertible?: boolean;
-  passthrough?: "keep" | "drop";
-  rules: LensRule[];
-}
-
-export interface LensRule {
-  match: { name: string };
-  replace: { name: string };
-}
-
-const LENS_DIR = path.resolve(import.meta.dirname, "../lenses");
-
-export function loadLens(filename: string): LensSpec {
-  const raw = readFileSync(path.join(LENS_DIR, filename), "utf-8");
-  return JSON.parse(raw);
-}
-
-/**
- * Apply a lens to transform a source record's fields to target field names.
- * Returns a new object with renamed keys per the lens rules.
- * Fields not matched by any rule are kept or dropped per `passthrough`.
- */
-export function applyLens(
-  lens: LensSpec,
-  source: Record<string, any>
-): Record<string, any> {
-  const result: Record<string, any> = {};
-  const matched = new Set<string>();
-
-  for (const rule of lens.rules) {
-    const sourceName = rule.match.name;
-    // Support dotted paths (e.g., "additionalData.room")
-    const value = getNestedValue(source, sourceName);
-    if (value !== undefined) {
-      result[rule.replace.name] = value;
-      matched.add(sourceName.split(".")[0]);
-    }
-  }
-
-  // Handle passthrough
-  if (lens.passthrough === "keep") {
-    for (const [key, value] of Object.entries(source)) {
-      if (!matched.has(key) && !(key in result)) {
-        result[key] = value;
-      }
-    }
-  }
-
-  return result;
-}
-
-function getNestedValue(obj: any, path: string): any {
-  const parts = path.split(".");
-  let current = obj;
-  for (const part of parts) {
-    if (current == null) return undefined;
-    current = current[part];
-  }
-  return current;
-}
-```
-
-- [ ] **Step 5: Add lens loader test**
-
-Create `formats/tv.ionosphere/ts/lenses.test.ts`:
-
-```typescript
-import { describe, it, expect } from "vitest";
-import { applyLens, type LensSpec } from "./lenses.js";
-
-describe("applyLens", () => {
-  const lens: LensSpec = {
-    $type: "org.relationaltext.lens",
-    id: "test",
-    description: "test lens",
-    source: "source",
-    target: "target",
-    rules: [
-      { match: { name: "name" }, replace: { name: "title" } },
-      { match: { name: "additionalData.room" }, replace: { name: "room" } },
-    ],
-  };
-
-  it("renames fields per rules", () => {
-    const result = applyLens(lens, { name: "Hello" });
-    expect(result.title).toBe("Hello");
-    expect(result.name).toBeUndefined();
-  });
-
-  it("handles dotted paths", () => {
-    const result = applyLens(lens, {
-      name: "Test",
-      additionalData: { room: "Room 1", type: "presentation" },
-    });
-    expect(result.title).toBe("Test");
-    expect(result.room).toBe("Room 1");
-  });
-
-  it("drops unmatched fields by default", () => {
-    const result = applyLens(lens, { name: "Test", extra: "value" });
-    expect(result.extra).toBeUndefined();
-  });
-
-  it("keeps unmatched fields with passthrough=keep", () => {
-    const keepLens = { ...lens, passthrough: "keep" as const };
-    const result = applyLens(keepLens, { name: "Test", extra: "value" });
-    expect(result.extra).toBe("value");
-  });
-});
-```
-
-- [ ] **Step 6: Run tests**
-
-```bash
-cd formats/tv.ionosphere && pnpm test
-```
-
-- [ ] **Step 7: Update format package exports**
-
-Add to `formats/tv.ionosphere/package.json` exports:
-```json
-"./lenses": "./ts/lenses.ts"
-```
-
-- [ ] **Step 8: Commit**
-
-```bash
-git add formats/tv.ionosphere/lenses/ formats/tv.ionosphere/ts/lenses.ts formats/tv.ionosphere/ts/lenses.test.ts formats/tv.ionosphere/package.json
-git commit -m "feat: lens specifications and loader for source lexicon transformation"
-```
-
-### Deferred from Chunk 1
-
-- **panproto**: Schema versioning will be integrated once lexicons stabilize past their initial revision.
-- **Pretext**: Transcript layout integration depends on evaluating Pretext's available API surface area, which is currently undocumented. The plan uses a custom `TranscriptView` component that can be swapped for Pretext later.
-
----
-
-## Chunk 2: Appview Scaffold & Data Ingest
-
-Creates the appview app, sets up SQLite schema, and implements ingestion of source data from Streamplace VODs and ATmosphereConf schedule records. The ingest pipeline uses the lens loader from Chunk 1 to transform source records.
-
-### Task 5: Scaffold the appview app
-
-**Files:**
-- Create: `apps/ionosphere-appview/package.json`
-- Create: `apps/ionosphere-appview/tsconfig.json`
-- Create: `apps/ionosphere-appview/src/appview.ts`
-- Create: `apps/ionosphere-appview/src/db.ts`
-- Create: `apps/ionosphere-appview/src/routes.ts`
-
-- [ ] **Step 1: Create appview package.json**
-
-```json
-{
-  "name": "ionosphere-appview",
-  "version": "0.1.0",
-  "type": "module",
-  "dependencies": {
-    "@atproto/api": "^0.15.0",
-    "@hono/node-server": "^1.13.0",
-    "@ionosphere/format": "workspace:*",
-    "better-sqlite3": "^12.8.0",
-    "hono": "^4.7.0",
-    "relational-text": "^0.1.1",
-    "tsx": "^4.19.0"
-  },
-  "devDependencies": {
-    "@types/better-sqlite3": "^7.6.0",
-    "typescript": "^5.7.0",
-    "vitest": "^3.0.0"
-  },
-  "scripts": {
-    "appview": "tsx src/appview.ts",
-    "ingest": "tsx src/ingest.ts",
-    "test": "vitest run",
-    "typecheck": "tsc --noEmit"
-  }
-}
-```
-
-- [ ] **Step 2: Create appview tsconfig.json**
-
-```json
-{
-  "extends": "../../tsconfig.json",
-  "compilerOptions": {
-    "rootDir": "src",
-    "outDir": "dist"
-  },
-  "include": ["src"]
-}
-```
-
-- [ ] **Step 3: Create db.ts with SQLite schema**
-
-```typescript
-import Database from "better-sqlite3";
-import path from "node:path";
-
-const DB_PATH = path.resolve(
-  import.meta.dirname,
-  "../../data/ionosphere.sqlite"
-);
-
-export function openDb(): Database.Database {
-  const db = new Database(DB_PATH);
-  db.pragma("journal_mode = WAL");
-  db.pragma("foreign_keys = ON");
-  return db;
-}
-
-export function migrate(db: Database.Database): void {
-  db.exec(`
-    CREATE TABLE IF NOT EXISTS events (
-      uri TEXT PRIMARY KEY,
-      did TEXT NOT NULL,
-      rkey TEXT NOT NULL,
-      name TEXT NOT NULL,
-      description TEXT,
-      location TEXT,
-      starts_at TEXT NOT NULL,
-      ends_at TEXT NOT NULL,
-      tracks TEXT, -- JSON array
-      schedule_repo TEXT,
-      vod_repo TEXT,
-      created_at TEXT DEFAULT CURRENT_TIMESTAMP
-    );
-
-    CREATE TABLE IF NOT EXISTS speakers (
-      uri TEXT PRIMARY KEY,
-      did TEXT,
-      rkey TEXT NOT NULL,
-      name TEXT NOT NULL,
-      handle TEXT,
-      speaker_did TEXT,
-      bio TEXT,
-      affiliations TEXT, -- JSON array
-      created_at TEXT DEFAULT CURRENT_TIMESTAMP
-    );
-
-    CREATE TABLE IF NOT EXISTS talks (
-      uri TEXT PRIMARY KEY,
-      did TEXT NOT NULL,
-      rkey TEXT NOT NULL,
-      title TEXT NOT NULL,
-      description TEXT,
-      document TEXT, -- JSON: RelationalText document
-      video_uri TEXT,
-      schedule_uri TEXT,
-      event_uri TEXT NOT NULL,
-      room TEXT,
-      category TEXT,
-      talk_type TEXT,
-      starts_at TEXT,
-      ends_at TEXT,
-      duration INTEGER, -- nanoseconds
-      created_at TEXT DEFAULT CURRENT_TIMESTAMP,
-      FOREIGN KEY (event_uri) REFERENCES events(uri) ON DELETE CASCADE
-    );
-
-    CREATE TABLE IF NOT EXISTS talk_speakers (
-      talk_uri TEXT NOT NULL,
-      speaker_uri TEXT NOT NULL,
-      PRIMARY KEY (talk_uri, speaker_uri),
-      FOREIGN KEY (talk_uri) REFERENCES talks(uri) ON DELETE CASCADE,
-      FOREIGN KEY (speaker_uri) REFERENCES speakers(uri) ON DELETE CASCADE
-    );
-
-    CREATE TABLE IF NOT EXISTS concepts (
-      uri TEXT PRIMARY KEY,
-      did TEXT NOT NULL,
-      rkey TEXT NOT NULL,
-      name TEXT NOT NULL,
-      aliases TEXT, -- JSON array
-      description TEXT,
-      wikidata_id TEXT,
-      url TEXT,
-      created_at TEXT DEFAULT CURRENT_TIMESTAMP
-    );
-
-    CREATE TABLE IF NOT EXISTS talk_concepts (
-      talk_uri TEXT NOT NULL,
-      concept_uri TEXT NOT NULL,
-      mention_count INTEGER DEFAULT 1,
-      PRIMARY KEY (talk_uri, concept_uri),
-      FOREIGN KEY (talk_uri) REFERENCES talks(uri) ON DELETE CASCADE,
-      FOREIGN KEY (concept_uri) REFERENCES concepts(uri) ON DELETE CASCADE
-    );
-
-    CREATE TABLE IF NOT EXISTS talk_crossrefs (
-      from_talk_uri TEXT NOT NULL,
-      to_talk_uri TEXT NOT NULL,
-      PRIMARY KEY (from_talk_uri, to_talk_uri),
-      FOREIGN KEY (from_talk_uri) REFERENCES talks(uri) ON DELETE CASCADE,
-      FOREIGN KEY (to_talk_uri) REFERENCES talks(uri) ON DELETE CASCADE
-    );
-
-    -- Track pipeline status per talk
-    CREATE TABLE IF NOT EXISTS pipeline_status (
-      talk_uri TEXT PRIMARY KEY,
-      ingested INTEGER DEFAULT 0,
-      transcribed INTEGER DEFAULT 0,
-      assembled INTEGER DEFAULT 0,
-      enriched INTEGER DEFAULT 0,
-      updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
-      FOREIGN KEY (talk_uri) REFERENCES talks(uri) ON DELETE CASCADE
-    );
-  `);
-}
-```
-
-- [ ] **Step 4: Create routes.ts with basic API**
-
-```typescript
-import { Hono } from "hono";
-import type Database from "better-sqlite3";
-
-export function createRoutes(db: Database.Database): Hono {
-  const app = new Hono();
-
-  app.get("/health", (c) => c.json({ status: "ok" }));
-
-  app.get("/talks", (c) => {
-    const talks = db
-      .prepare(
-        `SELECT t.*, GROUP_CONCAT(s.name) as speaker_names
-         FROM talks t
-         LEFT JOIN talk_speakers ts ON t.uri = ts.talk_uri
-         LEFT JOIN speakers s ON ts.speaker_uri = s.uri
-         GROUP BY t.uri
-         ORDER BY t.starts_at ASC`
-      )
-      .all();
-    return c.json({ talks });
-  });
-
-  app.get("/talks/:rkey", (c) => {
-    const { rkey } = c.req.param();
-    const talk = db
-      .prepare("SELECT * FROM talks WHERE rkey = ?")
-      .get(rkey);
-    if (!talk) return c.json({ error: "not found" }, 404);
-
-    const speakers = db
-      .prepare(
-        `SELECT s.* FROM speakers s
-         JOIN talk_speakers ts ON s.uri = ts.speaker_uri
-         WHERE ts.talk_uri = ?`
-      )
-      .all((talk as any).uri);
-
-    const concepts = db
-      .prepare(
-        `SELECT c.* FROM concepts c
-         JOIN talk_concepts tc ON c.uri = tc.concept_uri
-         WHERE tc.talk_uri = ?`
-      )
-      .all((talk as any).uri);
-
-    return c.json({ talk, speakers, concepts });
-  });
-
-  app.get("/speakers", (c) => {
-    const speakers = db.prepare("SELECT * FROM speakers ORDER BY name ASC").all();
-    return c.json({ speakers });
-  });
-
-  app.get("/speakers/:rkey", (c) => {
-    const { rkey } = c.req.param();
-    const speaker = db.prepare("SELECT * FROM speakers WHERE rkey = ?").get(rkey);
-    if (!speaker) return c.json({ error: "not found" }, 404);
-
-    const talks = db
-      .prepare(
-        `SELECT t.* FROM talks t
-         JOIN talk_speakers ts ON t.uri = ts.talk_uri
-         WHERE ts.speaker_uri = ?
-         ORDER BY t.starts_at ASC`
-      )
-      .all((speaker as any).uri);
-
-    return c.json({ speaker, talks });
-  });
-
-  app.get("/concepts", (c) => {
-    const concepts = db
-      .prepare("SELECT * FROM concepts ORDER BY name ASC")
-      .all();
-    return c.json({ concepts });
-  });
-
-  app.get("/concepts/:rkey", (c) => {
-    const { rkey } = c.req.param();
-    const concept = db.prepare("SELECT * FROM concepts WHERE rkey = ?").get(rkey);
-    if (!concept) return c.json({ error: "not found" }, 404);
-
-    const talks = db
-      .prepare(
-        `SELECT t.* FROM talks t
-         JOIN talk_concepts tc ON t.uri = tc.talk_uri
-         WHERE tc.concept_uri = ?
-         ORDER BY t.starts_at ASC`
-      )
-      .all((concept as any).uri);
-
-    return c.json({ concept, talks });
-  });
-
-  return app;
-}
-```
-
-- [ ] **Step 5: Create appview.ts entry point**
-
-```typescript
-import { serve } from "@hono/node-server";
-import { openDb, migrate } from "./db.js";
-import { createRoutes } from "./routes.js";
-
-const db = openDb();
-migrate(db);
-
-const app = createRoutes(db);
-
-const port = parseInt(process.env.PORT || "3001", 10);
-serve({ fetch: app.fetch, port }, (info) => {
-  console.log(`Ionosphere appview running on http://localhost:${info.port}`);
-});
-```
-
-- [ ] **Step 6: Run pnpm install and verify**
-
-```bash
-pnpm install
-```
-
-- [ ] **Step 7: Commit**
-
-```bash
-git add apps/ionosphere-appview/
-git commit -m "feat: scaffold appview with SQLite schema and REST API"
-```
-
-### Task 6: Implement data ingest from AT Protocol
-
-**Files:**
-- Create: `apps/ionosphere-appview/src/ingest.ts`
-- Create: `apps/ionosphere-appview/src/correlate.ts`
-
-- [ ] **Step 1: Write test for correlation logic**
-
-Create `apps/ionosphere-appview/src/correlate.test.ts`:
-
-```typescript
-import { describe, it, expect } from "vitest";
-import { correlate, type ScheduleEvent, type VodRecord } from "./correlate.js";
-
-describe("correlate", () => {
-  const schedule: ScheduleEvent[] = [
-    {
-      uri: "at://did:plc:test/community.lexicon.calendar.event/abc",
-      name: "Building Cirrus: a single-user, serverless PDS",
-      startsAt: "2026-03-28T16:15:00.000Z",
-      endsAt: "2026-03-28T16:45:00.000Z",
-      type: "presentation",
-      room: "Great Hall South",
-      category: "Development and Protocol",
-      speakers: [{ id: "test.bsky.social", name: "Test Speaker" }],
-      description: "A test talk.",
-    },
-  ];
-
-  const vods: VodRecord[] = [
-    {
-      uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/123",
-      title: "Building Cirrus: a single-user, serverless PDS",
-      creator: "did:plc:7tattzlorncahxgtdiuci7x7",
-      duration: 2238000000000,
-      createdAt: "2026-03-28T16:50:00Z",
-    },
-    {
-      uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/456",
-      title: "lunch",
-      creator: "did:plc:7tattzlorncahxgtdiuci7x7",
-      duration: 4308000000000,
-      createdAt: "2026-03-28T19:30:00Z",
-    },
-  ];
-
-  it("matches VODs to schedule events by title", () => {
-    const matches = correlate(schedule, vods);
-    expect(matches).toHaveLength(1);
-    expect(matches[0].schedule.name).toBe(
-      "Building Cirrus: a single-user, serverless PDS"
-    );
-    expect(matches[0].vod.uri).toContain("123");
-  });
-
-  it("filters out noise titles", () => {
-    const matches = correlate(schedule, vods);
-    expect(matches.every((m) => m.vod.title !== "lunch")).toBe(true);
-  });
-});
-```
-
-- [ ] **Step 2: Run test to verify it fails**
-
-Run: `cd apps/ionosphere-appview && pnpm test`
-Expected: FAIL — correlate module doesn't exist yet.
-
-- [ ] **Step 3: Implement correlate.ts**
-
-```typescript
-export interface ScheduleEvent {
-  uri: string;
-  name: string;
-  startsAt: string;
-  endsAt: string;
-  type: string;
-  room: string;
-  category: string;
-  speakers: Array<{ id: string; name: string }>;
-  description: string;
-}
-
-export interface VodRecord {
-  uri: string;
-  title: string;
-  creator: string;
-  duration: number; // nanoseconds
-  createdAt: string;
-}
-
-export interface Match {
-  schedule: ScheduleEvent;
-  vod: VodRecord;
-  confidence: number; // 0-1
-}
-
-const NOISE_TITLES = new Set([
-  "lunch",
-  "lunch break",
-  "break",
-  "doors open",
-  "starting soon",
-  "join us tomorrow",
-  "lunch day",
-  "breakfast",
-  "coffee break",
-  "irl only",
-  "no stream",
-]);
-
-function isNoise(title: string): boolean {
-  const lower = title.toLowerCase().trim();
-  if (NOISE_TITLES.has(lower)) return true;
-  if (lower.startsWith("lunch")) return true;
-  if (lower.startsWith("doors open")) return true;
-  if (lower.startsWith("atmoshereconf starting")) return true;
-  if (lower.startsWith("atmosphereconf starting")) return true;
-  if (lower.startsWith("join us")) return true;
-  if (lower.startsWith("please join")) return true;
-  if (lower.startsWith("follow @")) return true;
-  if (lower.includes("starting soon")) return true;
-  return false;
-}
-
-function normalizeTitle(title: string): string {
-  return title
-    .toLowerCase()
-    .replace(/[^a-z0-9\s]/g, "")
-    .replace(/\s+/g, " ")
-    .trim();
-}
-
-function titleSimilarity(a: string, b: string): number {
-  const na = normalizeTitle(a);
-  const nb = normalizeTitle(b);
-  if (na === nb) return 1;
-  if (na.includes(nb) || nb.includes(na)) return 0.9;
-
-  const wordsA = new Set(na.split(" "));
-  const wordsB = new Set(nb.split(" "));
-  const intersection = new Set([...wordsA].filter((w) => wordsB.has(w)));
-  const union = new Set([...wordsA, ...wordsB]);
-  return intersection.size / union.size;
-}
-
-export function correlate(
-  schedule: ScheduleEvent[],
-  vods: VodRecord[]
-): Match[] {
-  const matches: Match[] = [];
-  const usedVods = new Set<string>();
-
-  // Filter noise VODs
-  const realVods = vods.filter((v) => !isNoise(v.title));
-
-  for (const event of schedule) {
-    let bestMatch: VodRecord | null = null;
-    let bestScore = 0;
-
-    for (const vod of realVods) {
-      if (usedVods.has(vod.uri)) continue;
-      const score = titleSimilarity(event.name, vod.title);
-      if (score > bestScore) {
-        bestScore = score;
-        bestMatch = vod;
-      }
-    }
-
-    if (bestMatch && bestScore >= 0.5) {
-      matches.push({
-        schedule: event,
-        vod: bestMatch,
-        confidence: bestScore,
-      });
-      usedVods.add(bestMatch.uri);
-    }
-  }
-
-  return matches.sort(
-    (a, b) =>
-      new Date(a.schedule.startsAt).getTime() -
-      new Date(b.schedule.startsAt).getTime()
-  );
-}
-```
-
-- [ ] **Step 4: Run test to verify it passes**
-
-Run: `cd apps/ionosphere-appview && pnpm test`
-Expected: PASS
-
-- [ ] **Step 5: Implement ingest.ts**
-
-This fetches source records from AT Protocol and writes correlated talks + speakers to SQLite.
1246 - 1247 - ```typescript 1248 - import { openDb, migrate } from "./db.js"; 1249 - import { correlate, type ScheduleEvent, type VodRecord } from "./correlate.js"; 1250 - import { loadLens, applyLens } from "@ionosphere/format/lenses"; 1251 - 1252 - const scheduleLens = loadLens("schedule-to-talk.lens.json"); 1253 - const vodLens = loadLens("vod-to-talk.lens.json"); 1254 - 1255 - const SCHEDULE_DID = "did:plc:3xewinw4wtimo2lqfy5fm5sw"; 1256 - const SCHEDULE_COLLECTION = "community.lexicon.calendar.event"; 1257 - const VOD_DID = "did:plc:rbvrr34edl5ddpuwcubjiost"; 1258 - const VOD_COLLECTION = "place.stream.video"; 1259 - const VOD_PDS = "https://iameli.com"; 1260 - const BSKY_API = "https://bsky.social"; 1261 - 1262 - // ionosphere.tv's own DID — placeholder until the real DID is created. 1263 - // All ionosphere domain records use this as their repo DID. 1264 - const IONOSPHERE_DID = "did:plc:ionosphere-placeholder"; 1265 - 1266 - const EVENT_URI = `at://${IONOSPHERE_DID}/tv.ionosphere.event/atmosphereconf-2026`; 1267 - 1268 - async function fetchAllRecords( 1269 - baseUrl: string, 1270 - repo: string, 1271 - collection: string 1272 - ): Promise<any[]> { 1273 - const records: any[] = []; 1274 - let cursor: string | undefined; 1275 - 1276 - do { 1277 - const params = new URLSearchParams({ 1278 - repo, 1279 - collection, 1280 - limit: "100", 1281 - }); 1282 - if (cursor) params.set("cursor", cursor); 1283 - 1284 - const res = await fetch( 1285 - `${baseUrl}/xrpc/com.atproto.repo.listRecords?${params}` 1286 - ); 1287 - if (!res.ok) throw new Error(`Failed to fetch: ${res.status}`); 1288 - const data = await res.json(); 1289 - records.push(...data.records); 1290 - cursor = data.cursor; 1291 - } while (cursor); 1292 - 1293 - return records; 1294 - } 1295 - 1296 - function parseScheduleEvent(record: any): ScheduleEvent | null { 1297 - const v = record.value; 1298 - const ad = v.additionalData; 1299 - if (!ad?.isAtmosphereconf) return null; 1300 - if (v.status === 
"community.lexicon.calendar.event#cancelled") return null; 1301 - // Skip non-talk types 1302 - const type = ad?.type || ""; 1303 - if (["info", "food"].includes(type)) return null; 1304 - 1305 - // Apply lens to transform source fields to internal names 1306 - const mapped = applyLens(scheduleLens, v); 1307 - 1308 - return { 1309 - uri: record.uri, 1310 - name: mapped.title, 1311 - startsAt: mapped.startsAt, 1312 - endsAt: mapped.endsAt, 1313 - type: mapped.talkType || "", 1314 - room: mapped.room || "", 1315 - category: mapped.category || "", 1316 - speakers: mapped.speakers || [], 1317 - description: mapped.description || "", 1318 - }; 1319 - } 1320 - 1321 - function parseVodRecord(record: any): VodRecord { 1322 - return { 1323 - uri: record.uri, 1324 - title: record.value.title, 1325 - creator: record.value.creator, 1326 - duration: record.value.duration, 1327 - createdAt: record.value.createdAt, 1328 - }; 1329 - } 1330 - 1331 - function slugify(name: string): string { 1332 - return name 1333 - .toLowerCase() 1334 - .replace(/[^a-z0-9]+/g, "-") 1335 - .replace(/^-|-$/g, ""); 1336 - } 1337 - 1338 - async function main() { 1339 - console.log("Fetching schedule events..."); 1340 - const scheduleRaw = await fetchAllRecords( 1341 - BSKY_API, 1342 - SCHEDULE_DID, 1343 - SCHEDULE_COLLECTION 1344 - ); 1345 - const schedule = scheduleRaw 1346 - .map(parseScheduleEvent) 1347 - .filter((e): e is ScheduleEvent => e !== null); 1348 - console.log(` ${schedule.length} schedule events (filtered from ${scheduleRaw.length})`); 1349 - 1350 - console.log("Fetching VOD records..."); 1351 - const vodRaw = await fetchAllRecords(VOD_PDS, VOD_DID, VOD_COLLECTION); 1352 - const vods = vodRaw.map(parseVodRecord); 1353 - console.log(` ${vods.length} VOD records`); 1354 - 1355 - console.log("Correlating..."); 1356 - const matches = correlate(schedule, vods); 1357 - console.log(` ${matches.length} matches`); 1358 - 1359 - const db = openDb(); 1360 - migrate(db); 1361 - 1362 - // Insert 
event 1363 - db.prepare( 1364 - `INSERT OR REPLACE INTO events (uri, did, rkey, name, description, location, starts_at, ends_at, tracks, schedule_repo, vod_repo) 1365 - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)` 1366 - ).run( 1367 - EVENT_URI, 1368 - SCHEDULE_DID, 1369 - "atmosphereconf-2026", 1370 - "ATmosphereConf 2026", 1371 - "The global gathering for the AT Protocol community.", 1372 - "AMS Student Nest, UBC, Vancouver, BC, Canada", 1373 - "2026-03-26T00:00:00Z", 1374 - "2026-03-29T23:59:59Z", 1375 - JSON.stringify(["Great Hall South", "Performance Theatre", "Room 2301"]), 1376 - SCHEDULE_DID, 1377 - VOD_DID 1378 - ); 1379 - 1380 - // Collect unique speakers, insert them 1381 - const speakerMap = new Map<string, { name: string; handle: string }>(); 1382 - for (const m of matches) { 1383 - for (const s of m.schedule.speakers) { 1384 - if (!speakerMap.has(s.id)) { 1385 - speakerMap.set(s.id, { name: s.name, handle: s.id }); 1386 - } 1387 - } 1388 - } 1389 - 1390 - const insertSpeaker = db.prepare( 1391 - `INSERT OR REPLACE INTO speakers (uri, did, rkey, name, handle) 1392 - VALUES (?, ?, ?, ?, ?)` 1393 - ); 1394 - for (const [handle, speaker] of speakerMap) { 1395 - const rkey = slugify(handle); 1396 - const uri = `at://${IONOSPHERE_DID}/tv.ionosphere.speaker/${rkey}`; 1397 - insertSpeaker.run(uri, IONOSPHERE_DID, rkey, speaker.name, speaker.handle); 1398 - } 1399 - console.log(` ${speakerMap.size} speakers`); 1400 - 1401 - // Insert talks 1402 - const insertTalk = db.prepare( 1403 - `INSERT OR REPLACE INTO talks (uri, did, rkey, title, description, video_uri, schedule_uri, event_uri, room, category, talk_type, starts_at, ends_at, duration) 1404 - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)` 1405 - ); 1406 - const insertTalkSpeaker = db.prepare( 1407 - `INSERT OR REPLACE INTO talk_speakers (talk_uri, speaker_uri) 1408 - VALUES (?, ?)` 1409 - ); 1410 - const insertPipelineStatus = db.prepare( 1411 - `INSERT OR REPLACE INTO pipeline_status (talk_uri, 
ingested) 1412 - VALUES (?, 1)` 1413 - ); 1414 - 1415 - for (const m of matches) { 1416 - const rkey = m.schedule.uri.split("/").pop()!; 1417 - const talkUri = `at://${IONOSPHERE_DID}/tv.ionosphere.talk/${rkey}`; 1418 - 1419 - insertTalk.run( 1420 - talkUri, 1421 - IONOSPHERE_DID, 1422 - rkey, 1423 - m.schedule.name, 1424 - m.schedule.description, 1425 - m.vod.uri, 1426 - m.schedule.uri, 1427 - EVENT_URI, 1428 - m.schedule.room, 1429 - m.schedule.category, 1430 - m.schedule.type, 1431 - m.schedule.startsAt, 1432 - m.schedule.endsAt, 1433 - m.vod.duration 1434 - ); 1435 - 1436 - for (const s of m.schedule.speakers) { 1437 - const speakerRkey = slugify(s.id); 1438 - const speakerUri = `at://${IONOSPHERE_DID}/tv.ionosphere.speaker/${speakerRkey}`; 1439 - insertTalkSpeaker.run(talkUri, speakerUri); 1440 - } 1441 - 1442 - insertPipelineStatus.run(talkUri); 1443 - } 1444 - 1445 - console.log(`\nIngested ${matches.length} talks into database.`); 1446 - 1447 - // Report unmatched 1448 - const unmatchedSchedule = schedule.filter( 1449 - (s) => !matches.some((m) => m.schedule.uri === s.uri) 1450 - ); 1451 - if (unmatchedSchedule.length > 0) { 1452 - console.log(`\nUnmatched schedule events (${unmatchedSchedule.length}):`); 1453 - for (const s of unmatchedSchedule) { 1454 - console.log(` - ${s.name} (${s.type})`); 1455 - } 1456 - } 1457 - 1458 - db.close(); 1459 - } 1460 - 1461 - main().catch(console.error); 1462 - ``` 1463 - 1464 - - [ ] **Step 6: Run ingest and verify** 1465 - 1466 - ```bash 1467 - mkdir -p data 1468 - cd apps/ionosphere-appview && pnpm ingest 1469 - ``` 1470 - 1471 - Verify output shows correlated talks, speakers, and any unmatched events.
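
When reading the ingest output, it can help to see how a reworded VOD title scores against a schedule title. The following is a minimal sketch of only the word-overlap (Jaccard) fallback in `titleSimilarity`; it skips the exact-match and substring shortcuts. The names `normalize` and `jaccard` and the reworded VOD title are illustrative assumptions, not part of the plan.

```typescript
// Sketch only: `normalize` and `jaccard` are hypothetical names that mirror
// the normalization and word-overlap steps described in correlate.ts.
function normalize(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "") // drop punctuation ("single-user" becomes "singleuser")
    .replace(/\s+/g, " ")
    .trim();
}

function jaccard(a: string, b: string): number {
  const wordsA = new Set(normalize(a).split(" "));
  const wordsB = new Set(normalize(b).split(" "));
  let shared = 0;
  wordsA.forEach((w) => {
    if (wordsB.has(w)) shared++;
  });
  // |A ∪ B| = |A| + |B| - |A ∩ B|; split() always yields at least one element.
  const unionSize = wordsA.size + wordsB.size - shared;
  return shared / unionSize;
}

const scheduleTitle = "Building Cirrus: a single-user, serverless PDS";
const vodTitle = "Cirrus, a serverless PDS"; // assumed rewording, for illustration
const score = jaccard(scheduleTitle, vodTitle);

// 4 shared words out of 6 distinct words, so the 0.5 threshold is cleared.
console.log(score.toFixed(2), score >= 0.5 ? "match" : "no match");
```

A schedule event whose best candidate scores below 0.5 would show up in the "Unmatched schedule events" list rather than being paired with a weak candidate.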
1472 - 1473 - - [ ] **Step 7: Start appview and test API** 1474 - 1475 - ```bash 1476 - cd apps/ionosphere-appview && pnpm appview & 1477 - curl http://localhost:3001/health 1478 - curl http://localhost:3001/talks | python3 -m json.tool | head -30 1479 - curl http://localhost:3001/speakers | python3 -m json.tool | head -20 1480 - ``` 1481 - 1482 - - [ ] **Step 8: Commit** 1483 - 1484 - ```bash 1485 - git add apps/ionosphere-appview/src/ingest.ts apps/ionosphere-appview/src/correlate.ts apps/ionosphere-appview/src/correlate.test.ts data/ 1486 - git commit -m "feat: ingest VOD and schedule data, correlate talks to videos" 1487 - ``` 1488 - 1489 - --- 1490 - 1491 - ## Chunk 3: Next.js Frontend Scaffold 1492 - 1493 - Sets up the Next.js SSG frontend with basic page routes and a working video player. After this chunk, you can browse talks and watch videos. 1494 - 1495 - ### Task 7: Scaffold Next.js app 1496 - 1497 - **Files:** 1498 - - Create: `apps/ionosphere/package.json` 1499 - - Create: `apps/ionosphere/tsconfig.json` 1500 - - Create: `apps/ionosphere/next.config.ts` 1501 - - Create: `apps/ionosphere/tailwind.config.ts` 1502 - - Create: `apps/ionosphere/postcss.config.mjs` 1503 - - Create: `apps/ionosphere/src/app/layout.tsx` 1504 - - Create: `apps/ionosphere/src/app/page.tsx` 1505 - - Create: `apps/ionosphere/src/lib/api.ts` 1506 - 1507 - - [ ] **Step 1: Create package.json** 1508 - 1509 - ```json 1510 - { 1511 - "name": "ionosphere", 1512 - "version": "0.1.0", 1513 - "type": "module", 1514 - "dependencies": { 1515 - "@ionosphere/format": "workspace:*", 1516 - "next": "^15", 1517 - "react": "^18", 1518 - "react-dom": "^18", 1519 - "relational-text": "^0.1.1" 1520 - }, 1521 - "devDependencies": { 1522 - "@tailwindcss/typography": "^0.5.19", 1523 - "@types/node": "^22", 1524 - "@types/react": "^18", 1525 - "@types/react-dom": "^18", 1526 - "autoprefixer": "^10.4.20", 1527 - "postcss": "^8.4.49", 1528 - "tailwindcss": "^3.4.19", 1529 - "typescript": "^5" 1530 - }, 
1531 - "scripts": { 1532 - "dev": "next dev --port 3002", 1533 - "build": "next build", 1534 - "start": "next start" 1535 - } 1536 - } 1537 - ``` 1538 - 1539 - - [ ] **Step 2: Create next.config.ts** 1540 - 1541 - ```typescript 1542 - import type { NextConfig } from "next"; 1543 - 1544 - const nextConfig: NextConfig = { 1545 - output: "export", 1546 - }; 1547 - 1548 - export default nextConfig; 1549 - ``` 1550 - 1551 - - [ ] **Step 3: Create tailwind.config.ts** 1552 - 1553 - ```typescript 1554 - import type { Config } from "tailwindcss"; 1555 - import typography from "@tailwindcss/typography"; 1556 - 1557 - const config: Config = { 1558 - content: ["./src/**/*.{ts,tsx}"], 1559 - theme: { 1560 - extend: {}, 1561 - }, 1562 - plugins: [typography], 1563 - }; 1564 - 1565 - export default config; 1566 - ``` 1567 - 1568 - - [ ] **Step 4: Create postcss.config.mjs** 1569 - 1570 - ```javascript 1571 - export default { 1572 - plugins: { 1573 - tailwindcss: {}, 1574 - autoprefixer: {}, 1575 - }, 1576 - }; 1577 - ``` 1578 - 1579 - - [ ] **Step 5: Create tsconfig.json** 1580 - 1581 - ```json 1582 - { 1583 - "compilerOptions": { 1584 - "target": "ES2017", 1585 - "lib": ["dom", "dom.iterable", "esnext"], 1586 - "allowJs": true, 1587 - "skipLibCheck": true, 1588 - "strict": true, 1589 - "noEmit": true, 1590 - "esModuleInterop": true, 1591 - "module": "esnext", 1592 - "moduleResolution": "bundler", 1593 - "resolveJsonModule": true, 1594 - "isolatedModules": true, 1595 - "jsx": "preserve", 1596 - "incremental": true, 1597 - "plugins": [{ "name": "next" }], 1598 - "paths": { 1599 - "@/*": ["./src/*"] 1600 - } 1601 - }, 1602 - "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx"], 1603 - "exclude": ["node_modules"] 1604 - } 1605 - ``` 1606 - 1607 - - [ ] **Step 6: Create src/lib/api.ts** 1608 - 1609 - API client that reads from the appview (at build time for SSG, at runtime for dev). 
1610 - 1611 - ```typescript 1612 - const API_BASE = process.env.NEXT_PUBLIC_API_URL || "http://localhost:3001"; 1613 - 1614 - async function fetchApi<T>(path: string): Promise<T> { 1615 - const res = await fetch(`${API_BASE}${path}`, { next: { revalidate: false } }); 1616 - if (!res.ok) throw new Error(`API error: ${res.status} ${path}`); 1617 - return res.json(); 1618 - } 1619 - 1620 - export async function getTalks() { 1621 - return fetchApi<{ talks: any[] }>("/talks"); 1622 - } 1623 - 1624 - export async function getTalk(rkey: string) { 1625 - return fetchApi<{ talk: any; speakers: any[]; concepts: any[] }>( 1626 - `/talks/${rkey}` 1627 - ); 1628 - } 1629 - 1630 - export async function getSpeakers() { 1631 - return fetchApi<{ speakers: any[] }>("/speakers"); 1632 - } 1633 - 1634 - export async function getSpeaker(rkey: string) { 1635 - return fetchApi<{ speaker: any; talks: any[] }>(`/speakers/${rkey}`); 1636 - } 1637 - 1638 - export async function getConcepts() { 1639 - return fetchApi<{ concepts: any[] }>("/concepts"); 1640 - } 1641 - 1642 - export async function getConcept(rkey: string) { 1643 - return fetchApi<{ concept: any; talks: any[] }>(`/concepts/${rkey}`); 1644 - } 1645 - ``` 1646 - 1647 - - [ ] **Step 7: Create src/app/globals.css** 1648 - 1649 - ```css 1650 - @tailwind base; 1651 - @tailwind components; 1652 - @tailwind utilities; 1653 - ``` 1654 - 1655 - - [ ] **Step 8: Create layout.tsx** 1656 - 1657 - ```tsx 1658 - import type { Metadata } from "next"; 1659 - import "./globals.css"; 1660 - 1661 - export const metadata: Metadata = { 1662 - title: "Ionosphere", 1663 - description: 1664 - "Semantically enriched conference video archive for ATmosphereConf 2026", 1665 - }; 1666 - 1667 - export default function RootLayout({ 1668 - children, 1669 - }: { 1670 - children: React.ReactNode; 1671 - }) { 1672 - return ( 1673 - <html lang="en"> 1674 - <body className="bg-neutral-950 text-neutral-100 min-h-screen"> 1675 - <header className="border-b 
border-neutral-800 px-6 py-4"> 1676 - <nav className="max-w-6xl mx-auto flex items-center gap-6"> 1677 - <a href="/" className="text-xl font-bold tracking-tight"> 1678 - Ionosphere 1679 - </a> 1680 - <a href="/talks" className="text-neutral-400 hover:text-neutral-100"> 1681 - Talks 1682 - </a> 1683 - <a 1684 - href="/speakers" 1685 - className="text-neutral-400 hover:text-neutral-100" 1686 - > 1687 - Speakers 1688 - </a> 1689 - <a 1690 - href="/concepts" 1691 - className="text-neutral-400 hover:text-neutral-100" 1692 - > 1693 - Concepts 1694 - </a> 1695 - </nav> 1696 - </header> 1697 - <main className="max-w-6xl mx-auto px-6 py-8">{children}</main> 1698 - </body> 1699 - </html> 1700 - ); 1701 - } 1702 - ``` 1703 - 1704 - - [ ] **Step 9: Create page.tsx (home)** 1705 - 1706 - ```tsx 1707 - import { getTalks } from "@/lib/api"; 1708 - 1709 - export default async function Home() { 1710 - const { talks } = await getTalks(); 1711 - 1712 - return ( 1713 - <div> 1714 - <h1 className="text-4xl font-bold mb-2">ATmosphereConf 2026</h1> 1715 - <p className="text-neutral-400 mb-8"> 1716 - Semantically enriched conference archive. {talks.length} talks. 1717 - </p> 1718 - <div className="grid gap-4"> 1719 - {talks.slice(0, 20).map((talk: any) => ( 1720 - <a 1721 - key={talk.rkey} 1722 - href={`/talks/${talk.rkey}`} 1723 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 1724 - > 1725 - <h2 className="font-semibold">{talk.title}</h2> 1726 - <div className="text-sm text-neutral-400 mt-1"> 1727 - {talk.speaker_names} &middot; {talk.room} &middot; {talk.talk_type} 1728 - </div> 1729 - </a> 1730 - ))} 1731 - </div> 1732 - </div> 1733 - ); 1734 - } 1735 - ``` 1736 - 1737 - - [ ] **Step 10: Install and verify** 1738 - 1739 - ```bash 1740 - pnpm install 1741 - cd apps/ionosphere && pnpm dev 1742 - ``` 1743 - 1744 - Visit http://localhost:3002 — should show talk list (requires appview running on 3001). 
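
The pages in this task type API rows as `any`, so a malformed appview response only surfaces as a blank card. One possible guard is sketched below. The `TalkSummary` shape is an assumption inferred from the fields the home page renders (`rkey`, `title`, `speaker_names`, `room`, `talk_type`); `isTalkSummary` and `talkSubtitle` are hypothetical helpers.

```typescript
// Hypothetical row shape, inferred from the fields the home page renders.
interface TalkSummary {
  rkey: string;
  title: string;
  speaker_names?: string | null;
  room?: string | null;
  talk_type?: string | null;
}

// Runtime check that a value has the two fields every card needs.
function isTalkSummary(value: unknown): value is TalkSummary {
  if (typeof value !== "object" || value === null) return false;
  const row = value as Record<string, unknown>;
  return typeof row["rkey"] === "string" && typeof row["title"] === "string";
}

// Builds the "speakers · room · type" line, skipping missing parts.
function talkSubtitle(talk: TalkSummary): string {
  return [talk.speaker_names, talk.room, talk.talk_type]
    .filter((part): part is string => typeof part === "string" && part.length > 0)
    .join(" · ");
}

const rows: unknown[] = [
  { rkey: "abc", title: "Building Cirrus", room: "Great Hall South", talk_type: "presentation" },
  { title: "missing rkey" },
];
const talks = rows.filter(isTalkSummary);
console.log(talks.length, talkSubtitle(talks[0]));
```

Filtering with the guard before rendering keeps a single bad row from breaking the listing.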
1745 - 1746 - - [ ] **Step 11: Commit** 1747 - 1748 - ```bash 1749 - git add apps/ionosphere/ 1750 - git commit -m "feat: scaffold Next.js frontend with talk listing" 1751 - ``` 1752 - 1753 - ### Task 8: Talk page with video player 1754 - 1755 - **Files:** 1756 - - Create: `apps/ionosphere/src/app/talks/page.tsx` 1757 - - Create: `apps/ionosphere/src/app/talks/[rkey]/page.tsx` 1758 - - Create: `apps/ionosphere/src/app/components/VideoPlayer.tsx` 1759 - 1760 - - [ ] **Step 1: Create talks index page** 1761 - 1762 - `apps/ionosphere/src/app/talks/page.tsx`: 1763 - 1764 - ```tsx 1765 - import { getTalks } from "@/lib/api"; 1766 - 1767 - export default async function TalksPage() { 1768 - const { talks } = await getTalks(); 1769 - 1770 - // Group by day 1771 - const byDay = new Map<string, any[]>(); 1772 - for (const talk of talks) { 1773 - const day = talk.starts_at?.slice(0, 10) || "unknown"; 1774 - if (!byDay.has(day)) byDay.set(day, []); 1775 - byDay.get(day)!.push(talk); 1776 - } 1777 - 1778 - return ( 1779 - <div> 1780 - <h1 className="text-3xl font-bold mb-6">All Talks</h1> 1781 - {[...byDay.entries()].map(([day, dayTalks]) => ( 1782 - <section key={day} className="mb-8"> 1783 - <h2 className="text-xl font-semibold text-neutral-300 mb-4"> 1784 - {new Date(day + "T00:00:00Z").toLocaleDateString("en-US", { 1785 - weekday: "long", 1786 - month: "long", 1787 - day: "numeric", 1788 - })} 1789 - </h2> 1790 - <div className="grid gap-3"> 1791 - {dayTalks.map((talk: any) => ( 1792 - <a 1793 - key={talk.rkey} 1794 - href={`/talks/${talk.rkey}`} 1795 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 1796 - > 1797 - <h3 className="font-semibold">{talk.title}</h3> 1798 - <div className="text-sm text-neutral-400 mt-1"> 1799 - {talk.speaker_names} &middot; {talk.room} &middot;{" "} 1800 - {talk.talk_type} 1801 - </div> 1802 - </a> 1803 - ))} 1804 - </div> 1805 - </section> 1806 - ))} 1807 - </div> 1808 - ); 1809 - } 1810 - 
``` 1811 - 1812 - - [ ] **Step 2: Create VideoPlayer component** 1813 - 1814 - `apps/ionosphere/src/app/components/VideoPlayer.tsx`: 1815 - 1816 - ```tsx 1817 - "use client"; 1818 - 1819 - import { useRef, useEffect } from "react"; 1820 - 1821 - interface VideoPlayerProps { 1822 - videoUri: string; 1823 - onTimeUpdate?: (timeNs: number) => void; 1824 - } 1825 - 1826 - const VOD_ENDPOINT = "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist"; 1827 - 1828 - export default function VideoPlayer({ videoUri, onTimeUpdate }: VideoPlayerProps) { 1829 - const videoRef = useRef<HTMLVideoElement>(null); 1830 - 1831 - const playlistUrl = `${VOD_ENDPOINT}?uri=${encodeURIComponent(videoUri)}`; 1832 - 1833 - useEffect(() => { 1834 - const video = videoRef.current; 1835 - if (!video) return; 1836 - 1837 - // HLS.js for browsers that don't support HLS natively 1838 - let hls: any; 1839 - 1840 - async function setupHls() { 1841 - if (video!.canPlayType("application/vnd.apple.mpegurl")) { 1842 - // Safari supports HLS natively 1843 - video!.src = playlistUrl; 1844 - } else { 1845 - const { default: Hls } = await import("hls.js"); 1846 - if (Hls.isSupported()) { 1847 - hls = new Hls(); 1848 - hls.loadSource(playlistUrl); 1849 - hls.attachMedia(video!); 1850 - } 1851 - } 1852 - } 1853 - 1854 - setupHls(); 1855 - 1856 - return () => { 1857 - if (hls) hls.destroy(); 1858 - }; 1859 - }, [playlistUrl]); 1860 - 1861 - useEffect(() => { 1862 - const video = videoRef.current; 1863 - if (!video || !onTimeUpdate) return; 1864 - 1865 - const handler = () => { 1866 - // Convert seconds to nanoseconds for consistency with VOD duration format 1867 - onTimeUpdate(video.currentTime * 1e9); 1868 - }; 1869 - 1870 - video.addEventListener("timeupdate", handler); 1871 - return () => video.removeEventListener("timeupdate", handler); 1872 - }, [onTimeUpdate]); 1873 - 1874 - return ( 1875 - <video 1876 - ref={videoRef} 1877 - controls 1878 - className="w-full rounded-lg bg-black 
aspect-video" 1879 - /> 1880 - ); 1881 - } 1882 - ``` 1883 - 1884 - - [ ] **Step 3: Create talk detail page** 1885 - 1886 - `apps/ionosphere/src/app/talks/[rkey]/page.tsx`: 1887 - 1888 - ```tsx 1889 - import { getTalk, getTalks } from "@/lib/api"; 1890 - import VideoPlayer from "@/app/components/VideoPlayer"; 1891 - 1892 - export async function generateStaticParams() { 1893 - const { talks } = await getTalks(); 1894 - return talks.map((t: any) => ({ rkey: t.rkey })); 1895 - } 1896 - 1897 - export default async function TalkPage({ 1898 - params, 1899 - }: { 1900 - params: Promise<{ rkey: string }>; 1901 - }) { 1902 - const { rkey } = await params; 1903 - const { talk, speakers, concepts } = await getTalk(rkey); 1904 - 1905 - const durationMin = talk.duration ? (talk.duration / 1e9 / 60).toFixed(0) : null; 1906 - 1907 - return ( 1908 - <div className="grid grid-cols-1 lg:grid-cols-3 gap-8"> 1909 - <div className="lg:col-span-2"> 1910 - {talk.video_uri && <VideoPlayer videoUri={talk.video_uri} />} 1911 - <h1 className="text-2xl font-bold mt-4">{talk.title}</h1> 1912 - <div className="text-neutral-400 mt-1"> 1913 - {speakers.map((s: any) => s.name).join(", ")} 1914 - {durationMin && <> &middot; {durationMin} min</>} 1915 - {talk.room && <> &middot; {talk.room}</>} 1916 - </div> 1917 - {talk.description && ( 1918 - <p className="text-neutral-300 mt-4 leading-relaxed"> 1919 - {talk.description} 1920 - </p> 1921 - )} 1922 - {/* Transcript will go here in a later task */} 1923 - <div className="mt-8 p-6 rounded-lg border border-neutral-800 text-neutral-500 text-sm"> 1924 - Transcript not yet available. 
1925 - </div> 1926 - </div> 1927 - <aside className="space-y-6"> 1928 - <section> 1929 - <h2 className="text-sm font-semibold text-neutral-400 uppercase tracking-wide mb-2"> 1930 - Speakers 1931 - </h2> 1932 - {speakers.map((s: any) => ( 1933 - <a 1934 - key={s.rkey} 1935 - href={`/speakers/${s.rkey}`} 1936 - className="block text-neutral-200 hover:text-white" 1937 - > 1938 - {s.name} 1939 - {s.handle && ( 1940 - <span className="text-neutral-500 ml-1">@{s.handle}</span> 1941 - )} 1942 - </a> 1943 - ))} 1944 - </section> 1945 - {talk.category && ( 1946 - <section> 1947 - <h2 className="text-sm font-semibold text-neutral-400 uppercase tracking-wide mb-2"> 1948 - Category 1949 - </h2> 1950 - <span className="text-neutral-300">{talk.category}</span> 1951 - </section> 1952 - )} 1953 - {talk.talk_type && ( 1954 - <section> 1955 - <h2 className="text-sm font-semibold text-neutral-400 uppercase tracking-wide mb-2"> 1956 - Type 1957 - </h2> 1958 - <span className="text-neutral-300">{talk.talk_type}</span> 1959 - </section> 1960 - )} 1961 - </aside> 1962 - </div> 1963 - ); 1964 - } 1965 - ``` 1966 - 1967 - - [ ] **Step 4: Add hls.js dependency** 1968 - 1969 - ```bash 1970 - cd apps/ionosphere && pnpm add hls.js 1971 - ``` 1972 - 1973 - - [ ] **Step 5: Verify talk page with video playback** 1974 - 1975 - With appview running on :3001 and frontend on :3002, navigate to a talk page and verify the video loads and plays from Streamplace. 
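
If the video does not load, checking the playlist URL and the nanosecond conversions in isolation narrows the problem. This is a minimal sketch assuming the same endpoint constant as `VideoPlayer.tsx`; the helper names `playlistUrlFor`, `secondsToNs`, and `nsToWholeMinutes` are hypothetical.

```typescript
// Hypothetical helpers mirroring the conversions used by the player and the talk page.
const VOD_ENDPOINT =
  "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist";

// The AT URI goes into a query parameter, so it must be percent-encoded.
function playlistUrlFor(videoUri: string): string {
  return `${VOD_ENDPOINT}?uri=${encodeURIComponent(videoUri)}`;
}

const NS_PER_SECOND = 1e9;

// HTMLMediaElement.currentTime is in seconds; VOD durations are in nanoseconds.
function secondsToNs(seconds: number): number {
  return seconds * NS_PER_SECOND;
}

// Same rounding as the talk page: nanoseconds to minutes, no decimals.
function nsToWholeMinutes(durationNs: number): string {
  return (durationNs / NS_PER_SECOND / 60).toFixed(0);
}

const url = playlistUrlFor(
  "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/123"
);
console.log(url);
console.log(nsToWholeMinutes(2238000000000)); // 2238 s is 37.3 min, shown as "37"
console.log(secondsToNs(1.5)); // 1500000000
```

Nanosecond values stay exact as JavaScript numbers up to `Number.MAX_SAFE_INTEGER`, which is roughly 104 days of video, so plain `number` is adequate for talk-length recordings.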
1976 - 1977 - - [ ] **Step 6: Commit** 1978 - 1979 - ```bash 1980 - git add apps/ionosphere/src/app/talks/ apps/ionosphere/src/app/components/VideoPlayer.tsx 1981 - git commit -m "feat: talk pages with HLS video player" 1982 - ``` 1983 - 1984 - ### Task 9: Speaker and concept pages 1985 - 1986 - **Files:** 1987 - - Create: `apps/ionosphere/src/app/speakers/page.tsx` 1988 - - Create: `apps/ionosphere/src/app/speakers/[rkey]/page.tsx` 1989 - - Create: `apps/ionosphere/src/app/concepts/page.tsx` 1990 - - Create: `apps/ionosphere/src/app/concepts/[rkey]/page.tsx` 1991 - 1992 - - [ ] **Step 1: Create speakers index page** 1993 - 1994 - ```tsx 1995 - import { getSpeakers } from "@/lib/api"; 1996 - 1997 - export default async function SpeakersPage() { 1998 - const { speakers } = await getSpeakers(); 1999 - 2000 - return ( 2001 - <div> 2002 - <h1 className="text-3xl font-bold mb-6">Speakers</h1> 2003 - <div className="grid gap-3 sm:grid-cols-2 lg:grid-cols-3"> 2004 - {speakers.map((s: any) => ( 2005 - <a 2006 - key={s.rkey} 2007 - href={`/speakers/${s.rkey}`} 2008 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 2009 - > 2010 - <div className="font-semibold">{s.name}</div> 2011 - {s.handle && ( 2012 - <div className="text-sm text-neutral-400">@{s.handle}</div> 2013 - )} 2014 - </a> 2015 - ))} 2016 - </div> 2017 - </div> 2018 - ); 2019 - } 2020 - ``` 2021 - 2022 - - [ ] **Step 2: Create speaker detail page** 2023 - 2024 - ```tsx 2025 - import { getSpeaker, getSpeakers } from "@/lib/api"; 2026 - 2027 - export async function generateStaticParams() { 2028 - const { speakers } = await getSpeakers(); 2029 - return speakers.map((s: any) => ({ rkey: s.rkey })); 2030 - } 2031 - 2032 - export default async function SpeakerPage({ 2033 - params, 2034 - }: { 2035 - params: Promise<{ rkey: string }>; 2036 - }) { 2037 - const { rkey } = await params; 2038 - const { speaker, talks } = await getSpeaker(rkey); 2039 - 2040 - return ( 
2041 - <div> 2042 - <h1 className="text-3xl font-bold">{speaker.name}</h1> 2043 - {speaker.handle && ( 2044 - <div className="text-neutral-400 mt-1">@{speaker.handle}</div> 2045 - )} 2046 - {speaker.bio && ( 2047 - <p className="text-neutral-300 mt-4">{speaker.bio}</p> 2048 - )} 2049 - <h2 className="text-xl font-semibold mt-8 mb-4">Talks</h2> 2050 - <div className="grid gap-3"> 2051 - {talks.map((t: any) => ( 2052 - <a 2053 - key={t.rkey} 2054 - href={`/talks/${t.rkey}`} 2055 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 2056 - > 2057 - <div className="font-semibold">{t.title}</div> 2058 - <div className="text-sm text-neutral-400 mt-1"> 2059 - {t.room} &middot; {t.talk_type} 2060 - </div> 2061 - </a> 2062 - ))} 2063 - </div> 2064 - </div> 2065 - ); 2066 - } 2067 - ``` 2068 - 2069 - - [ ] **Step 3: Create concepts index page** 2070 - 2071 - ```tsx 2072 - import { getConcepts } from "@/lib/api"; 2073 - 2074 - export default async function ConceptsPage() { 2075 - const { concepts } = await getConcepts(); 2076 - 2077 - if (concepts.length === 0) { 2078 - return ( 2079 - <div> 2080 - <h1 className="text-3xl font-bold mb-6">Concepts</h1> 2081 - <p className="text-neutral-400"> 2082 - Concepts will appear here after transcript enrichment. 
2083 - </p> 2084 - </div> 2085 - ); 2086 - } 2087 - 2088 - return ( 2089 - <div> 2090 - <h1 className="text-3xl font-bold mb-6">Concepts</h1> 2091 - <div className="grid gap-3 sm:grid-cols-2 lg:grid-cols-3"> 2092 - {concepts.map((c: any) => ( 2093 - <a 2094 - key={c.rkey} 2095 - href={`/concepts/${c.rkey}`} 2096 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 2097 - > 2098 - <div className="font-semibold">{c.name}</div> 2099 - {c.description && ( 2100 - <div className="text-sm text-neutral-400 mt-1 line-clamp-2"> 2101 - {c.description} 2102 - </div> 2103 - )} 2104 - </a> 2105 - ))} 2106 - </div> 2107 - </div> 2108 - ); 2109 - } 2110 - ``` 2111 - 2112 - - [ ] **Step 4: Create concept detail page** 2113 - 2114 - ```tsx 2115 - import { getConcept, getConcepts } from "@/lib/api"; 2116 - 2117 - export async function generateStaticParams() { 2118 - const { concepts } = await getConcepts(); 2119 - return concepts.map((c: any) => ({ rkey: c.rkey })); 2120 - } 2121 - 2122 - export default async function ConceptPage({ 2123 - params, 2124 - }: { 2125 - params: Promise<{ rkey: string }>; 2126 - }) { 2127 - const { rkey } = await params; 2128 - const { concept, talks } = await getConcept(rkey); 2129 - 2130 - return ( 2131 - <div> 2132 - <h1 className="text-3xl font-bold">{concept.name}</h1> 2133 - {concept.description && ( 2134 - <p className="text-neutral-300 mt-4">{concept.description}</p> 2135 - )} 2136 - {concept.wikidata_id && ( 2137 - <a 2138 - href={`https://www.wikidata.org/wiki/${concept.wikidata_id}`} 2139 - className="text-blue-400 hover:underline text-sm mt-2 inline-block" 2140 - target="_blank" 2141 - rel="noopener" 2142 - > 2143 - Wikidata 2144 - </a> 2145 - )} 2146 - <h2 className="text-xl font-semibold mt-8 mb-4"> 2147 - Mentioned in {talks.length} talk{talks.length !== 1 ? 
"s" : ""} 2148 - </h2> 2149 - <div className="grid gap-3"> 2150 - {talks.map((t: any) => ( 2151 - <a 2152 - key={t.rkey} 2153 - href={`/talks/${t.rkey}`} 2154 - className="block p-4 rounded-lg border border-neutral-800 hover:border-neutral-600 transition-colors" 2155 - > 2156 - <div className="font-semibold">{t.title}</div> 2157 - </a> 2158 - ))} 2159 - </div> 2160 - </div> 2161 - ); 2162 - } 2163 - ``` 2164 - 2165 - - [ ] **Step 5: Verify all pages** 2166 - 2167 - Navigate to /speakers, /speakers/:rkey, /concepts, verify rendering. 2168 - 2169 - - [ ] **Step 6: Commit** 2170 - 2171 - ```bash 2172 - git add apps/ionosphere/src/app/speakers/ apps/ionosphere/src/app/concepts/ 2173 - git commit -m "feat: speaker and concept pages" 2174 - ``` 2175 - 2176 - --- 2177 - 2178 - ## Chunk 4: Transcription Pipeline 2179 - 2180 - Implements audio extraction from HLS streams and transcription with word-level timestamps. After this chunk, talks have transcripts stored in the database. 2181 - 2182 - ### Task 10: Audio extraction from HLS 2183 - 2184 - **Files:** 2185 - - Create: `apps/ionosphere-appview/src/extract-audio.ts` 2186 - 2187 - - [ ] **Step 1: Write test for audio extraction** 2188 - 2189 - Create `apps/ionosphere-appview/src/extract-audio.test.ts`: 2190 - 2191 - ```typescript 2192 - import { describe, it, expect } from "vitest"; 2193 - import { buildPlaylistUrl } from "./extract-audio.js"; 2194 - 2195 - describe("extract-audio", () => { 2196 - it("builds correct playlist URL from video URI", () => { 2197 - const uri = 2198 - "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3mi5stzyxji2e"; 2199 - const url = buildPlaylistUrl(uri); 2200 - expect(url).toBe( 2201 - "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist?uri=at%3A%2F%2Fdid%3Aplc%3Arbvrr34edl5ddpuwcubjiost%2Fplace.stream.video%2F3mi5stzyxji2e" 2202 - ); 2203 - }); 2204 - }); 2205 - ``` 2206 - 2207 - - [ ] **Step 2: Run test to verify it fails** 2208 - 2209 - Run: `cd 
apps/ionosphere-appview && pnpm test` 2210 - 2211 - - [ ] **Step 3: Implement extract-audio.ts** 2212 - 2213 - Uses ffmpeg to extract audio from the HLS stream. ffmpeg must be installed on the system. 2214 - 2215 - ```typescript 2216 - import { execSync } from "node:child_process"; 2217 - import { existsSync, mkdirSync } from "node:fs"; 2218 - import path from "node:path"; 2219 - 2220 - const VOD_ENDPOINT = 2221 - "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist"; 2222 - const AUDIO_DIR = path.resolve(import.meta.dirname, "../../data/audio"); 2223 - 2224 - export function buildPlaylistUrl(videoUri: string): string { 2225 - return `${VOD_ENDPOINT}?uri=${encodeURIComponent(videoUri)}`; 2226 - } 2227 - 2228 - export function extractAudio( 2229 - videoUri: string, 2230 - talkRkey: string 2231 - ): string { 2232 - mkdirSync(AUDIO_DIR, { recursive: true }); 2233 - 2234 - const outputPath = path.join(AUDIO_DIR, `${talkRkey}.wav`); 2235 - if (existsSync(outputPath)) { 2236 - console.log(` Audio already exists: ${talkRkey}.wav`); 2237 - return outputPath; 2238 - } 2239 - 2240 - const playlistUrl = buildPlaylistUrl(videoUri); 2241 - console.log(` Extracting audio for ${talkRkey}...`); 2242 - 2243 - execSync( 2244 - `ffmpeg -i "${playlistUrl}" -vn -acodec pcm_s16le -ar 16000 -ac 1 "${outputPath}"`, 2245 - { stdio: "inherit", timeout: 600_000 } 2246 - ); 2247 - 2248 - return outputPath; 2249 - } 2250 - ``` 2251 - 2252 - - [ ] **Step 4: Run test to verify it passes** 2253 - 2254 - Run: `cd apps/ionosphere-appview && pnpm test` 2255 - 2256 - - [ ] **Step 5: Commit** 2257 - 2258 - ```bash 2259 - git add apps/ionosphere-appview/src/extract-audio.ts apps/ionosphere-appview/src/extract-audio.test.ts 2260 - git commit -m "feat: audio extraction from HLS streams via ffmpeg" 2261 - ``` 2262 - 2263 - ### Task 11: Transcription integration 2264 - 2265 - **Files:** 2266 - - Create: `apps/ionosphere-appview/src/transcribe.ts` 2267 - 2268 - This task is a skeleton 
— the actual transcription provider is pluggable. Start with a file-based interface that can wrap any provider. 2269 - 2270 - - [ ] **Step 1: Define transcription types and interface** 2271 - 2272 - ```typescript 2273 - import { existsSync, readFileSync, writeFileSync, mkdirSync } from "node:fs"; 2274 - import path from "node:path"; 2275 - import { extractAudio } from "./extract-audio.js"; 2276 - import { openDb } from "./db.js"; 2277 - 2278 - const TRANSCRIPT_DIR = path.resolve( 2279 - import.meta.dirname, 2280 - "../../data/transcripts" 2281 - ); 2282 - 2283 - export interface WordTimestamp { 2284 - word: string; 2285 - start: number; // seconds 2286 - end: number; // seconds 2287 - confidence: number; 2288 - } 2289 - 2290 - export interface TranscriptResult { 2291 - text: string; 2292 - words: WordTimestamp[]; 2293 - } 2294 - 2295 - export type TranscriptionProvider = ( 2296 - audioPath: string 2297 - ) => Promise<TranscriptResult>; 2298 - 2299 - // Placeholder provider — replace with real implementation 2300 - async function placeholderProvider( 2301 - audioPath: string 2302 - ): Promise<TranscriptResult> { 2303 - throw new Error( 2304 - `No transcription provider configured. 
Audio file: ${audioPath}` 2305 - ); 2306 - } 2307 - 2308 - export async function transcribeTalk( 2309 - talkRkey: string, 2310 - videoUri: string, 2311 - provider: TranscriptionProvider = placeholderProvider 2312 - ): Promise<TranscriptResult> { 2313 - mkdirSync(TRANSCRIPT_DIR, { recursive: true }); 2314 - 2315 - const cachedPath = path.join(TRANSCRIPT_DIR, `${talkRkey}.json`); 2316 - 2317 - // Check cache 2318 - if (existsSync(cachedPath)) { 2319 - console.log(` Transcript cached: ${talkRkey}`); 2320 - return JSON.parse(readFileSync(cachedPath, "utf-8")); 2321 - } 2322 - 2323 - // Extract audio 2324 - const audioPath = extractAudio(videoUri, talkRkey); 2325 - 2326 - // Transcribe 2327 - console.log(` Transcribing ${talkRkey}...`); 2328 - const result = await provider(audioPath); 2329 - 2330 - // Cache result 2331 - writeFileSync(cachedPath, JSON.stringify(result, null, 2)); 2332 - console.log(` Saved transcript: ${cachedPath}`); 2333 - 2334 - return result; 2335 - } 2336 - 2337 - // CLI entry point: transcribe all talks that have a video but no transcript 2338 - async function main() { 2339 - const db = openDb(); 2340 - const talks = db 2341 - .prepare( 2342 - `SELECT t.rkey, t.video_uri FROM talks t 2343 - JOIN pipeline_status ps ON t.uri = ps.talk_uri 2344 - WHERE t.video_uri IS NOT NULL AND ps.transcribed = 0 2345 - LIMIT 5` 2346 - ) 2347 - .all() as Array<{ rkey: string; video_uri: string }>; 2348 - 2349 - console.log(`${talks.length} talks to transcribe`); 2350 - 2351 - for (const talk of talks) { 2352 - try { 2353 - await transcribeTalk(talk.rkey, talk.video_uri); 2354 - db.prepare( 2355 - `UPDATE pipeline_status SET transcribed = 1, updated_at = CURRENT_TIMESTAMP 2356 - WHERE talk_uri = (SELECT uri FROM talks WHERE rkey = ?)` 2357 - ).run(talk.rkey); 2358 - } catch (err) { 2359 - console.error(` Failed: ${talk.rkey}:`, (err as Error).message); 2360 - } 2361 - } 2362 - 2363 - db.close(); 2364 - } 2365 - 2366 - // Only run main when executed directly 2367 - 
if (import.meta.url === `file://${process.argv[1]}`) { 2368 - main().catch(console.error); 2369 - } 2370 - ``` 2371 - 2372 - - [ ] **Step 2: Add transcribe script to appview package.json** 2373 - 2374 - Add to scripts: 2375 - ```json 2376 - "transcribe": "tsx src/transcribe.ts" 2377 - ``` 2378 - 2379 - - [ ] **Step 3: Commit** 2380 - 2381 - ```bash 2382 - git add apps/ionosphere-appview/src/transcribe.ts 2383 - git commit -m "feat: transcription pipeline skeleton with provider interface and caching" 2384 - ``` 2385 - 2386 - --- 2387 - 2388 - ## Chunk 5: Document Assembly & Timestamp Provider 2389 - 2390 - Converts raw transcripts into RelationalText documents with timestamp facets, and implements the frontend timestamp sync. 2391 - 2392 - ### Task 12: Document assembly — transcript to RelationalText 2393 - 2394 - **Files:** 2395 - - Create: `formats/tv.ionosphere/ts/assemble.ts` 2396 - - Create: `formats/tv.ionosphere/ts/assemble.test.ts` 2397 - 2398 - - [ ] **Step 1: Write test for document assembly** 2399 - 2400 - ```typescript 2401 - import { describe, it, expect } from "vitest"; 2402 - import { assembleDocument, type TranscriptInput } from "./assemble.js"; 2403 - 2404 - describe("assembleDocument", () => { 2405 - const transcript: TranscriptInput = { 2406 - text: "Hello world this is a test", 2407 - words: [ 2408 - { word: "Hello", start: 0.0, end: 0.5, confidence: 0.99 }, 2409 - { word: "world", start: 0.5, end: 1.0, confidence: 0.98 }, 2410 - { word: "this", start: 1.0, end: 1.3, confidence: 0.97 }, 2411 - { word: "is", start: 1.3, end: 1.5, confidence: 0.99 }, 2412 - { word: "a", start: 1.5, end: 1.6, confidence: 0.99 }, 2413 - { word: "test", start: 1.6, end: 2.0, confidence: 0.95 }, 2414 - ], 2415 - }; 2416 - 2417 - it("creates a document with text matching the transcript", () => { 2418 - const doc = assembleDocument(transcript); 2419 - expect(doc.text).toBe("Hello world this is a test"); 2420 - }); 2421 - 2422 - it("creates timestamp facets for each 
word", () => { 2423 - const doc = assembleDocument(transcript); 2424 - const timestampFacets = doc.facets.filter((f: any) => 2425 - f.features.some((feat: any) => feat.$type === "tv.ionosphere.facet#timestamp") 2426 - ); 2427 - expect(timestampFacets).toHaveLength(6); 2428 - }); 2429 - 2430 - it("timestamp facets have correct byte ranges", () => { 2431 - const doc = assembleDocument(transcript); 2432 - const first = doc.facets.find((f: any) => 2433 - f.features.some( 2434 - (feat: any) => 2435 - feat.$type === "tv.ionosphere.facet#timestamp" && 2436 - feat.startTime === 0 2437 - ) 2438 - ); 2439 - expect(first).toBeDefined(); 2440 - expect(first!.index.byteStart).toBe(0); 2441 - expect(first!.index.byteEnd).toBe(5); // "Hello" = 5 bytes 2442 - }); 2443 - 2444 - it("timestamp times are in nanoseconds", () => { 2445 - const doc = assembleDocument(transcript); 2446 - const first = doc.facets[0]; 2447 - const ts = first.features.find( 2448 - (f: any) => f.$type === "tv.ionosphere.facet#timestamp" 2449 - ); 2450 - expect(ts.startTime).toBe(0); 2451 - expect(ts.endTime).toBe(500_000_000); // 0.5s in ns 2452 - }); 2453 - }); 2454 - ``` 2455 - 2456 - - [ ] **Step 2: Run test to verify it fails** 2457 - 2458 - Run: `cd formats/tv.ionosphere && pnpm test` 2459 - 2460 - - [ ] **Step 3: Implement assemble.ts** 2461 - 2462 - ```typescript 2463 - export interface WordTimestamp { 2464 - word: string; 2465 - start: number; // seconds 2466 - end: number; // seconds 2467 - confidence: number; 2468 - } 2469 - 2470 - export interface TranscriptInput { 2471 - text: string; 2472 - words: WordTimestamp[]; 2473 - } 2474 - 2475 - export interface Facet { 2476 - index: { byteStart: number; byteEnd: number }; 2477 - features: Array<Record<string, any>>; 2478 - } 2479 - 2480 - export interface Document { 2481 - text: string; 2482 - facets: Facet[]; 2483 - } 2484 - 2485 - function secondsToNs(s: number): number { 2486 - return Math.round(s * 1e9); 2487 - } 2488 - 2489 - export function 
assembleDocument(transcript: TranscriptInput): Document { 2490 - const encoder = new TextEncoder(); 2491 - const facets: Facet[] = []; 2492 - 2493 - // Build byte offset map by finding each word in the text 2494 - let searchFrom = 0; 2495 - for (const word of transcript.words) { 2496 - const idx = transcript.text.indexOf(word.word, searchFrom); 2497 - if (idx === -1) continue; 2498 - 2499 - const byteStart = encoder.encode(transcript.text.slice(0, idx)).length; 2500 - const byteEnd = 2501 - encoder.encode(transcript.text.slice(0, idx + word.word.length)).length; 2502 - 2503 - facets.push({ 2504 - index: { byteStart, byteEnd }, 2505 - features: [ 2506 - { 2507 - $type: "tv.ionosphere.facet#timestamp", 2508 - startTime: secondsToNs(word.start), 2509 - endTime: secondsToNs(word.end), 2510 - }, 2511 - ], 2512 - }); 2513 - 2514 - searchFrom = idx + word.word.length; 2515 - } 2516 - 2517 - return { text: transcript.text, facets }; 2518 - } 2519 - ``` 2520 - 2521 - - [ ] **Step 4: Run test to verify it passes** 2522 - 2523 - Run: `cd formats/tv.ionosphere && pnpm test` 2524 - 2525 - - [ ] **Step 5: Commit** 2526 - 2527 - ```bash 2528 - git add formats/tv.ionosphere/ts/assemble.ts formats/tv.ionosphere/ts/assemble.test.ts 2529 - git commit -m "feat: assemble RelationalText documents from transcripts with timestamp facets" 2530 - ``` 2531 - 2532 - ### Task 13: Timestamp provider and transcript sync in frontend 2533 - 2534 - **Files:** 2535 - - Create: `apps/ionosphere/src/app/components/TimestampProvider.tsx` 2536 - - Create: `apps/ionosphere/src/app/components/TranscriptView.tsx` 2537 - 2538 - - [ ] **Step 1: Create TimestampProvider** 2539 - 2540 - ```tsx 2541 - "use client"; 2542 - 2543 - import { 2544 - createContext, 2545 - useContext, 2546 - useState, 2547 - useCallback, 2548 - type ReactNode, 2549 - } from "react"; 2550 - 2551 - interface TimestampContextValue { 2552 - currentTimeNs: number; 2553 - setCurrentTimeNs: (ns: number) => void; 2554 - seekTo: (ns: number) 
=> void; 2555 - onSeek: (handler: (ns: number) => void) => () => void; 2556 - } 2557 - 2558 - const TimestampContext = createContext<TimestampContextValue | null>(null); 2559 - 2560 - export function useTimestamp() { 2561 - const ctx = useContext(TimestampContext); 2562 - if (!ctx) throw new Error("useTimestamp must be used within TimestampProvider"); 2563 - return ctx; 2564 - } 2565 - 2566 - export function TimestampProvider({ children }: { children: ReactNode }) { 2567 - const [currentTimeNs, setCurrentTimeNs] = useState(0); 2568 - const [seekHandlers] = useState<Set<(ns: number) => void>>(new Set()); 2569 - 2570 - const seekTo = useCallback( 2571 - (ns: number) => { 2572 - for (const handler of seekHandlers) { 2573 - handler(ns); 2574 - } 2575 - }, 2576 - [seekHandlers] 2577 - ); 2578 - 2579 - const onSeek = useCallback( 2580 - (handler: (ns: number) => void) => { 2581 - seekHandlers.add(handler); 2582 - return () => seekHandlers.delete(handler); 2583 - }, 2584 - [seekHandlers] 2585 - ); 2586 - 2587 - return ( 2588 - <TimestampContext.Provider 2589 - value={{ currentTimeNs, setCurrentTimeNs, seekTo, onSeek }} 2590 - > 2591 - {children} 2592 - </TimestampContext.Provider> 2593 - ); 2594 - } 2595 - ``` 2596 - 2597 - - [ ] **Step 2: Create TranscriptView** 2598 - 2599 - ```tsx 2600 - "use client"; 2601 - 2602 - import { useTimestamp } from "./TimestampProvider"; 2603 - import { useRef, useEffect } from "react"; 2604 - 2605 - interface TranscriptFacet { 2606 - index: { byteStart: number; byteEnd: number }; 2607 - features: Array<{ 2608 - $type: string; 2609 - startTime?: number; 2610 - endTime?: number; 2611 - [key: string]: any; 2612 - }>; 2613 - } 2614 - 2615 - interface TranscriptDocument { 2616 - text: string; 2617 - facets: TranscriptFacet[]; 2618 - } 2619 - 2620 - interface TranscriptViewProps { 2621 - document: TranscriptDocument; 2622 - } 2623 - 2624 - interface WordSpan { 2625 - text: string; 2626 - startTime: number; 2627 - endTime: number; 2628 - 
byteStart: number; 2629 - byteEnd: number; 2630 - } 2631 - 2632 - function extractWordSpans(doc: TranscriptDocument): WordSpan[] { 2633 - const encoder = new TextEncoder(); 2634 - const textBytes = encoder.encode(doc.text); 2635 - const decoder = new TextDecoder(); 2636 - 2637 - return doc.facets 2638 - .filter((f) => 2639 - f.features.some((feat) => feat.$type === "tv.ionosphere.facet#timestamp") 2640 - ) 2641 - .map((f) => { 2642 - const ts = f.features.find( 2643 - (feat) => feat.$type === "tv.ionosphere.facet#timestamp" 2644 - )!; 2645 - return { 2646 - text: decoder.decode(textBytes.slice(f.index.byteStart, f.index.byteEnd)), 2647 - startTime: ts.startTime!, 2648 - endTime: ts.endTime!, 2649 - byteStart: f.index.byteStart, 2650 - byteEnd: f.index.byteEnd, 2651 - }; 2652 - }) 2653 - .sort((a, b) => a.byteStart - b.byteStart); 2654 - } 2655 - 2656 - export default function TranscriptView({ document }: TranscriptViewProps) { 2657 - const { currentTimeNs, seekTo } = useTimestamp(); 2658 - const containerRef = useRef<HTMLDivElement>(null); 2659 - const activeRef = useRef<HTMLSpanElement>(null); 2660 - 2661 - const words = extractWordSpans(document); 2662 - 2663 - // Auto-scroll to active word 2664 - useEffect(() => { 2665 - if (activeRef.current) { 2666 - activeRef.current.scrollIntoView({ 2667 - behavior: "smooth", 2668 - block: "center", 2669 - }); 2670 - } 2671 - }, [currentTimeNs]); 2672 - 2673 - return ( 2674 - <div 2675 - ref={containerRef} 2676 - className="mt-8 p-6 rounded-lg border border-neutral-800 max-h-96 overflow-y-auto leading-relaxed" 2677 - > 2678 - {words.map((word, i) => { 2679 - const isActive = 2680 - currentTimeNs >= word.startTime && currentTimeNs < word.endTime; 2681 - 2682 - return ( 2683 - <span 2684 - key={i} 2685 - ref={isActive ? activeRef : undefined} 2686 - onClick={() => seekTo(word.startTime)} 2687 - className={`cursor-pointer transition-colors ${ 2688 - isActive 2689 - ? 
"bg-blue-500/30 text-white rounded px-0.5" 2690 - : "text-neutral-300 hover:text-white" 2691 - }`} 2692 - > 2693 - {word.text}{" "} 2694 - </span> 2695 - ); 2696 - })} 2697 - </div> 2698 - ); 2699 - } 2700 - ``` 2701 - 2702 - - [ ] **Step 3: Update VideoPlayer to integrate with TimestampProvider** 2703 - 2704 - Update `apps/ionosphere/src/app/components/VideoPlayer.tsx` to use the timestamp context: 2705 - 2706 - Add to the component, after existing imports: 2707 - ```tsx 2708 - import { useTimestamp } from "./TimestampProvider"; 2709 - ``` 2710 - 2711 - Replace the `onTimeUpdate` prop pattern with context-based time broadcasting and seek listening. The component should call `setCurrentTimeNs` on video timeupdate events, and listen for `onSeek` calls to seek the video element. 2712 - 2713 - - [ ] **Step 4: Update talk page to use TimestampProvider and TranscriptView** 2714 - 2715 - Wrap the talk page content in `<TimestampProvider>` and conditionally render `<TranscriptView>` when a document is available. 2716 - 2717 - - [ ] **Step 5: Verify end-to-end** 2718 - 2719 - With a talk that has a transcript in the database, verify: 2720 - 1. Video plays 2721 - 2. Transcript words highlight as video plays 2722 - 3. Clicking a word seeks the video 2723 - 2724 - - [ ] **Step 6: Commit** 2725 - 2726 - ```bash 2727 - git add apps/ionosphere/src/app/components/ 2728 - git commit -m "feat: timestamp provider and synchronized transcript view" 2729 - ``` 2730 - 2731 - --- 2732 - 2733 - ## Chunk 6: LLM Enrichment Pipeline (Future) 2734 - 2735 - This chunk covers LLM-assisted semantic enrichment of transcripts. It is deferred until transcription is working and validated on the corpus. The implementation will: 2736 - 2737 - 1. Pass transcript text + talk context to an LLM 2738 - 2. Extract concept mentions, speaker references, talk cross-references, and links 2739 - 3. Create `tv.ionosphere.concept` records and annotation layers 2740 - 4. 
Store enrichment results as `pub.layers.annotation` layers on the document 2741 - 5. Update the appview index with concept/speaker/crossref join tables 2742 - 2743 - This is documented but not planned in detail yet — the exact approach depends on transcript quality and cost evaluation. 2744 - 2745 - --- 2746 - 2747 - ## Summary 2748 - 2749 - | Chunk | Tasks | What it delivers | 2750 - |-------|-------|------------------| 2751 - | 1: Scaffold & Lexicons | 1-4 | Workspace, lexicons, format-lexicon, lens specs + loader | 2752 - | 2: Appview & Ingest | 5-6 | SQLite schema, REST API, lens-driven data ingest pipeline | 2753 - | 3: Frontend | 7-9 | Next.js SSG, talk/speaker/concept pages, video player | 2754 - | 4: Transcription | 10-11 | Audio extraction, transcription pipeline skeleton | 2755 - | 5: Document Assembly | 12-13 | RelationalText documents, timestamp sync, transcript view | 2756 - | 6: Enrichment | (future) | LLM annotation, concept extraction, knowledge graph |
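As a footnote to Task 12, here is a minimal sketch of how word-level byte ranges and nanosecond timestamps could be computed for text that is not pure ASCII. The names `buildTimestampFacets`, `TimedWord`, and `TimestampFacet` are hypothetical and are not part of the plan. The sketch assumes the same left-to-right `indexOf` search as `assembleDocument`, but keeps a running byte cursor rather than re-encoding the whole prefix for every word.

```typescript
// Hypothetical sketch; type and function names are assumptions, not project code.
interface TimedWord {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

interface TimestampFacet {
  byteStart: number;
  byteEnd: number;
  startTime: number; // nanoseconds
  endTime: number; // nanoseconds
}

const utf8 = new TextEncoder();

// Number of UTF-8 bytes needed to encode a string.
function utf8Length(s: string): number {
  return utf8.encode(s).length;
}

function buildTimestampFacets(text: string, words: TimedWord[]): TimestampFacet[] {
  const facets: TimestampFacet[] = [];
  let searchFrom = 0; // UTF-16 index into `text`
  let byteCursor = 0; // UTF-8 byte length of text.slice(0, searchFrom)

  for (const w of words) {
    const idx = text.indexOf(w.word, searchFrom);
    if (idx === -1) continue; // word not found in the remaining text; skip it

    // Advance the byte cursor over the gap (whitespace, punctuation) before the word.
    byteCursor += utf8Length(text.slice(searchFrom, idx));
    const byteStart = byteCursor;
    byteCursor += utf8Length(w.word);

    facets.push({
      byteStart,
      byteEnd: byteCursor,
      startTime: Math.round(w.start * 1e9),
      endTime: Math.round(w.end * 1e9),
    });
    searchFrom = idx + w.word.length;
  }
  return facets;
}
```

For the text `"Héllo wörld"`, the first word spans bytes 0 to 6 rather than 0 to 5, because `é` takes two bytes in UTF-8. This is the case the ASCII-only test fixture in Task 12 does not exercise.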
-913
docs/superpowers/plans/2026-03-31-lens-layer.md
··· 1 - # Lens Layer Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Replace the custom lens runtime with panproto, making all schema boundaries declarative and forwards-compatible. 6 - 7 - **Architecture:** `@panproto/core` provides the lens runtime via WASM. Source lexicons (calendar events, VOD records, Whisper output) are stored alongside ionosphere lexicons. Panproto auto-generates lenses from lexicon pairs. Serialized protolens chains are stored as AT Protocol records on the PDS, indexed by the appview, and resolved at runtime by pipeline scripts. 8 - 9 - **Tech Stack:** `@panproto/core` (WASM, TypeScript SDK), AT Protocol lexicons, `@ionosphere/format` (workspace package), Vitest. 10 - 11 - **Spec:** `docs/superpowers/specs/2026-03-31-lens-layer-design.md` 12 - 13 - --- 14 - 15 - ## File Map 16 - 17 - ### New files 18 - - `lexicons/community/lexicon/calendar/event.json` — source lexicon for ATmosphereConf schedule events 19 - - `lexicons/place/stream/video.json` — source lexicon for Streamplace VOD records 20 - - `lexicons/openai/whisper/verbose_json.json` — source lexicon for Whisper API output 21 - - `formats/tv.ionosphere/ts/panproto.ts` — thin panproto wrapper (init, loadSchema, convert, resolve) 22 - - `formats/tv.ionosphere/ts/panproto.test.ts` — lens law + conversion tests 23 - - `apps/ionosphere-appview/src/lens-resolver.ts` — pipeline-side lens resolution (appview index → PDS fetch) 24 - 25 - ### Modified files 26 - - `formats/tv.ionosphere/package.json` — add `@panproto/core` dependency 27 - - `formats/tv.ionosphere/ts/lenses.ts` — delete contents, re-export from panproto.ts 28 - - `formats/tv.ionosphere/ts/lenses.test.ts` — delete old tests 29 - - `apps/ionosphere-appview/package.json` — add `@panproto/core` dependency 30 - - 
`apps/ionosphere-appview/src/db.ts` — add `lenses` table to migration 31 - - `apps/ionosphere-appview/src/indexer.ts` — add `org.relationaltext.lens` collection handling 32 - - Note: `backfill.ts` imports `IONOSPHERE_COLLECTIONS` from `indexer.ts` — no direct modification needed 33 - - `apps/ionosphere-appview/src/publish.ts` — add step 0: publish lens records 34 - - `apps/ionosphere-appview/src/ingest.ts` — replace `loadLens`/`applyLens` with panproto convert 35 - - `apps/ionosphere-appview/src/providers/openai-whisper.ts` — replace ad-hoc mapping with lens 36 - 37 - ### Deleted (contents replaced) 38 - - `formats/tv.ionosphere/lenses/*.lens.json` — replaced by source lexicons + auto-generation 39 - 40 - --- 41 - 42 - ## Chunk 1: Panproto Foundation 43 - 44 - Install `@panproto/core`, create the thin wrapper, verify it works with ATProto lexicons. 45 - 46 - ### Task 1: Install @panproto/core 47 - 48 - **Files:** 49 - - Modify: `formats/tv.ionosphere/package.json` 50 - - Modify: `apps/ionosphere-appview/package.json` 51 - 52 - - [ ] **Step 1: Add dependency to format package** 53 - 54 - ```bash 55 - cd formats/tv.ionosphere && pnpm add @panproto/core 56 - ``` 57 - 58 - - [ ] **Step 2: Add dependency to appview package** 59 - 60 - ```bash 61 - cd apps/ionosphere-appview && pnpm add @panproto/core 62 - ``` 63 - 64 - - [ ] **Step 3: Verify install** 65 - 66 - ```bash 67 - cd /Users/blainecook/Code/skeetv && pnpm install 68 - ``` 69 - 70 - Expected: Clean install, no errors. 71 - 72 - - [ ] **Step 4: Commit** 73 - 74 - ```bash 75 - git add formats/tv.ionosphere/package.json apps/ionosphere-appview/package.json pnpm-lock.yaml 76 - git commit -m "chore: add @panproto/core dependency" 77 - ``` 78 - 79 - ### Task 2: Add source lexicons 80 - 81 - We need the lexicon JSON for the schemas we don't own — the source side of each lens. These are authored based on the actual record shapes from the source PDSes. 
82 - 83 - **Files:** 84 - - Create: `lexicons/community/lexicon/calendar/event.json` 85 - - Create: `lexicons/place/stream/video.json` 86 - - Create: `lexicons/openai/whisper/verbose_json.json` 87 - 88 - - [ ] **Step 1: Fetch a sample calendar event record to verify field names** 89 - 90 - ```bash 91 - curl -s "https://bsky.social/xrpc/com.atproto.repo.listRecords?repo=did:plc:3xewinw4wtimo2lqfy5fm5sw&collection=community.lexicon.calendar.event&limit=1" | python3 -m json.tool | head -50 92 - ``` 93 - 94 - Use the output to verify the lexicon fields match the actual record shape. 95 - 96 - - [ ] **Step 2: Write `lexicons/community/lexicon/calendar/event.json`** 97 - 98 - Create the lexicon based on the actual ATmosphereConf schedule event record shape. Must include: `name`, `description`, `startsAt`, `endsAt`, `status`, `additionalData` (object with `room`, `category`, `type`, `speakers`, `isAtmosphereconf`). 99 - 100 - - [ ] **Step 3: Fetch a sample VOD record to verify field names** 101 - 102 - ```bash 103 - curl -s "https://iameli.com/xrpc/com.atproto.repo.listRecords?repo=did:plc:rbvrr34edl5ddpuwcubjiost&collection=place.stream.video&limit=1" | python3 -m json.tool | head -50 104 - ``` 105 - 106 - - [ ] **Step 4: Write `lexicons/place/stream/video.json`** 107 - 108 - Based on actual Streamplace VOD record shape. Must include: `title`, `duration`, `creator`, `createdAt`. 109 - 110 - - [ ] **Step 5: Write `lexicons/openai/whisper/verbose_json.json`** 111 - 112 - Based on the OpenAI Whisper verbose_json response format. Must include: `text`, `words` (array of `{ word, start, end }`). 113 - 114 - Note: OpenAI's Whisper output is not an ATProto record. Create the lexicon as a schema description of its shape so panproto can parse it. If `parseLexicon` doesn't accept non-ATProto schemas for this case, use `panproto.protocol('json-schema')` instead and adjust accordingly. 
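The shape described in Step 5 can be sketched as a TypeScript type with a runtime guard, which may be useful for sanity-checking a response before it reaches any lens. This is only an illustration: the names `WhisperWord`, `WhisperVerboseJson`, and `isWhisperVerboseJson` are hypothetical, and the field list is taken solely from the "Must include" sentence above, so a real response may carry additional fields that this guard ignores.

```typescript
// Hypothetical sketch; only the fields named in Step 5 are modelled.
interface WhisperWord {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

interface WhisperVerboseJson {
  text: string;
  words: WhisperWord[];
}

function isWhisperWord(value: unknown): value is WhisperWord {
  if (typeof value !== "object" || value === null) return false;
  const w = value as Record<string, unknown>;
  return (
    typeof w.word === "string" &&
    typeof w.start === "number" &&
    typeof w.end === "number" &&
    w.start <= w.end // a word cannot end before it starts
  );
}

function isWhisperVerboseJson(value: unknown): value is WhisperVerboseJson {
  if (typeof value !== "object" || value === null) return false;
  const o = value as Record<string, unknown>;
  return (
    typeof o.text === "string" &&
    Array.isArray(o.words) &&
    o.words.every(isWhisperWord)
  );
}
```

A caller could run `isWhisperVerboseJson(JSON.parse(body))` and fail fast with a clear message instead of letting a malformed payload surface later as a conversion error.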
115 - 116 - - [ ] **Step 6: Commit** 117 - 118 - ```bash 119 - git add lexicons/community/ lexicons/place/ lexicons/openai/ 120 - git commit -m "feat: add source lexicons for calendar events, VOD records, and Whisper output" 121 - ``` 122 - 123 - ### Task 3: Create panproto wrapper 124 - 125 - **Files:** 126 - - Create: `formats/tv.ionosphere/ts/panproto.ts` 127 - - Create: `formats/tv.ionosphere/ts/panproto.test.ts` 128 - 129 - - [ ] **Step 1: Write the test file** 130 - 131 - ```typescript 132 - // formats/tv.ionosphere/ts/panproto.test.ts 133 - import { describe, it, expect } from "vitest"; 134 - import { init, loadSchema, convert } from "./panproto.js"; 135 - import { readFileSync } from "node:fs"; 136 - import path from "node:path"; 137 - 138 - const LEXICON_DIR = path.resolve(import.meta.dirname, "../../../lexicons"); 139 - 140 - function readLexicon(relativePath: string): object { 141 - return JSON.parse( 142 - readFileSync(path.join(LEXICON_DIR, relativePath), "utf-8") 143 - ); 144 - } 145 - 146 - describe("panproto wrapper", () => { 147 - it("initializes panproto", async () => { 148 - const pp = await init(); 149 - expect(pp).toBeDefined(); 150 - // ATProto should be a built-in protocol 151 - expect(pp.listProtocols()).toContain("atproto"); 152 - }); 153 - 154 - it("parses an ionosphere lexicon", async () => { 155 - const schema = await loadSchema( 156 - readLexicon("tv/ionosphere/talk.json") 157 - ); 158 - expect(schema).toBeDefined(); 159 - expect(schema.data).toBeDefined(); 160 - }); 161 - 162 - it("converts a calendar event to a talk", async () => { 163 - const calendarSchema = await loadSchema( 164 - readLexicon("community/lexicon/calendar/event.json") 165 - ); 166 - const talkSchema = await loadSchema( 167 - readLexicon("tv/ionosphere/talk.json") 168 - ); 169 - 170 - const event = { 171 - name: "Building with AT Protocol", 172 - description: "A talk about building apps", 173 - startsAt: "2026-03-27T10:00:00Z", 174 - endsAt: "2026-03-27T10:30:00Z", 
175 - additionalData: { 176 - room: "Great Hall South", 177 - category: "developer", 178 - type: "presentation", 179 - speakers: [{ id: "alice.bsky.social", name: "Alice" }], 180 - isAtmosphereconf: true, 181 - }, 182 - }; 183 - 184 - const result = await convert(event, calendarSchema, talkSchema, { 185 - eventUri: "", 186 - }); 187 - 188 - expect(result).toBeDefined(); 189 - // Verify key fields were mapped 190 - expect((result as any).title).toBe("Building with AT Protocol"); 191 - expect((result as any).room).toBe("Great Hall South"); 192 - expect((result as any).startsAt).toBe("2026-03-27T10:00:00Z"); 193 - }); 194 - 195 - it("verifies lens laws for calendar→talk lens", async () => { 196 - const calendarSchema = await loadSchema( 197 - readLexicon("community/lexicon/calendar/event.json") 198 - ); 199 - const talkSchema = await loadSchema( 200 - readLexicon("tv/ionosphere/talk.json") 201 - ); 202 - 203 - const pp = await init(); 204 - const lens = pp.lens(calendarSchema, talkSchema); 205 - expect(lens).toBeDefined(); 206 - 207 - // Verify GetPut and PutGet laws hold on a sample record 208 - const { encode } = await import("@msgpack/msgpack"); 209 - const sampleEvent = encode({ 210 - name: "Test Talk", 211 - description: "A test", 212 - startsAt: "2026-03-27T10:00:00Z", 213 - endsAt: "2026-03-27T10:30:00Z", 214 - additionalData: { 215 - room: "Room A", 216 - category: "dev", 217 - type: "presentation", 218 - speakers: [], 219 - isAtmosphereconf: true, 220 - }, 221 - }); 222 - const laws = lens.checkLaws(sampleEvent); 223 - expect(laws.passed).toBe(true); 224 - }); 225 - }); 226 - ``` 227 - 228 - - [ ] **Step 2: Run test to verify it fails** 229 - 230 - ```bash 231 - cd formats/tv.ionosphere && pnpm test -- panproto.test 232 - ``` 233 - 234 - Expected: FAIL — `./panproto.js` not found. 
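The GetPut and PutGet laws that the last test checks can be illustrated without panproto at all. The following is a minimal sketch using a hand-written lens over simplified record shapes. The `Lens` interface, the `checkLaws` helper, and the reduced `CalendarEvent` and `TalkView` types are all assumptions made for illustration and do not reflect panproto's actual API or the real lexicons.

```typescript
// Hypothetical sketch of lens laws; not panproto's API.
interface Lens<S extends object, V extends object> {
  get(source: S): V; // project the view out of the source
  put(source: S, view: V): S; // write an updated view back into the source
}

interface CalendarEvent {
  name: string;
  room: string;
  status: string; // not visible in the view; must survive a round trip
}

interface TalkView {
  title: string;
  room: string;
}

const eventToTalk: Lens<CalendarEvent, TalkView> = {
  get: (s) => ({ title: s.name, room: s.room }),
  put: (s, v) => ({ ...s, name: v.title, room: v.room }),
};

// Order-insensitive equality for flat objects with primitive values.
function sameFlat(a: object, b: object): boolean {
  const norm = (o: object) => JSON.stringify(Object.entries(o).sort());
  return norm(a) === norm(b);
}

function checkLaws<S extends object, V extends object>(
  lens: Lens<S, V>,
  source: S,
  view: V
): { getPut: boolean; putGet: boolean } {
  return {
    // GetPut: putting back what you just got leaves the source unchanged.
    getPut: sameFlat(lens.put(source, lens.get(source)), source),
    // PutGet: getting after a put returns exactly the view that was put.
    putGet: sameFlat(lens.get(lens.put(source, view)), view),
  };
}
```

A lens whose `put` discards the incoming view would still satisfy GetPut but fail PutGet, which is why both laws are checked on the sample record.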
235 - 236 - - [ ] **Step 3: Write the panproto wrapper** 237 - 238 - ```typescript 239 - // formats/tv.ionosphere/ts/panproto.ts 240 - import { Panproto, type LensHandle, type BuiltSchema } from "@panproto/core"; 241 - 242 - let _panproto: Panproto | null = null; 243 - 244 - /** 245 - * Initialize the panproto runtime (lazy singleton). 246 - * WASM is loaded once and reused across all calls. 247 - */ 248 - export async function init(): Promise<Panproto> { 249 - if (!_panproto) _panproto = await Panproto.init(); 250 - return _panproto; 251 - } 252 - 253 - /** 254 - * Parse an ATProto lexicon JSON into a panproto schema. 255 - */ 256 - export async function loadSchema( 257 - lexiconJson: object | string 258 - ): Promise<BuiltSchema> { 259 - const pp = await init(); 260 - return pp.parseLexicon(lexiconJson); 261 - } 262 - 263 - /** 264 - * Create a lens between two schemas. 265 - */ 266 - export async function createLens( 267 - from: BuiltSchema, 268 - to: BuiltSchema 269 - ): Promise<LensHandle> { 270 - const pp = await init(); 271 - return pp.lens(from, to); 272 - } 273 - 274 - /** 275 - * Convert a record from one schema to another using an auto-generated lens. 276 - * Plain JS objects in, plain JS objects out. 277 - */ 278 - export async function convert( 279 - data: object, 280 - from: BuiltSchema, 281 - to: BuiltSchema, 282 - defaults?: Record<string, unknown> 283 - ): Promise<unknown> { 284 - const pp = await init(); 285 - return pp.convert(data, { from, to, defaults }); 286 - } 287 - 288 - /** 289 - * Generate and serialize a protolens chain between two schemas. 290 - * Used by publish.ts to create lens records for the PDS. 
291 - */ 292 - export async function serializeChain( 293 - from: BuiltSchema, 294 - to: BuiltSchema 295 - ): Promise<string> { 296 - const pp = await init(); 297 - const chain = pp.protolensChain(from, to); 298 - return chain.toJson(); 299 - } 300 - 301 - // Re-export types that pipeline scripts need 302 - export type { LensHandle, BuiltSchema, Panproto }; 303 - ``` 304 - 305 - - [ ] **Step 4: Run test to verify it passes** 306 - 307 - ```bash 308 - cd formats/tv.ionosphere && pnpm test -- panproto.test 309 - ``` 310 - 311 - Expected: All 4 tests pass. If `parseLexicon` doesn't handle a source lexicon (Whisper is not ATProto), note the issue and adjust the test — we may need `protocol('json-schema')` for that one. 312 - 313 - - [ ] **Step 5: Commit** 314 - 315 - ```bash 316 - git add formats/tv.ionosphere/ts/panproto.ts formats/tv.ionosphere/ts/panproto.test.ts 317 - git commit -m "feat: panproto wrapper for lens operations" 318 - ``` 319 - 320 - ### Task 4: Add panproto export path 321 - 322 - **Files:** 323 - - Modify: `formats/tv.ionosphere/package.json` (exports) 324 - 325 - - [ ] **Step 1: Update package.json exports** 326 - 327 - Add panproto export path to `formats/tv.ionosphere/package.json`: 328 - 329 - ```json 330 - "exports": { 331 - ".": "./ts/index.ts", 332 - "./assemble": "./ts/assemble.ts", 333 - "./lenses": "./ts/lenses.ts", 334 - "./panproto": "./ts/panproto.ts", 335 - "./transcript-encoding": "./ts/transcript-encoding.ts" 336 - } 337 - ``` 338 - 339 - Note: `lenses.ts` keeps its current exports for now — `ingest.ts` still imports from it. We'll replace it in Chunk 4 after the pipeline is rewired. 340 - 341 - - [ ] **Step 2: Run all format tests** 342 - 343 - ```bash 344 - cd formats/tv.ionosphere && pnpm test 345 - ``` 346 - 347 - Expected: All tests pass (lenses, panproto, transcript-encoding, assemble). 
348 - 349 - - [ ] **Step 3: Commit** 350 - 351 - ```bash 352 - git add formats/tv.ionosphere/package.json 353 - git commit -m "chore: add panproto export path to format package" 354 - ``` 355 - 356 - --- 357 - 358 - ## Chunk 2: Appview Lens Indexing 359 - 360 - Add lens records to the appview's indexer, database, and backfill so lenses are discoverable. 361 - 362 - ### Task 5: Add lenses table to database 363 - 364 - **Files:** 365 - - Modify: `apps/ionosphere-appview/src/db.ts:19-147` (inside `migrate` function) 366 - 367 - - [ ] **Step 1: Add the lenses table to the migration** 368 - 369 - Add this SQL to the `migrate` function in `db.ts`, after the `annotations` table and before the `_cursor` table: 370 - 371 - ```sql 372 - CREATE TABLE IF NOT EXISTS lenses ( 373 - uri TEXT PRIMARY KEY, 374 - did TEXT NOT NULL, 375 - rkey TEXT NOT NULL, 376 - source_nsid TEXT, 377 - target_nsid TEXT, 378 - version INTEGER DEFAULT 1, 379 - chain_json TEXT, 380 - created_at TEXT DEFAULT CURRENT_TIMESTAMP 381 - ); 382 - ``` 383 - 384 - - [ ] **Step 2: Verify the appview starts cleanly** 385 - 386 - ```bash 387 - cd apps/ionosphere-appview && PORT=9401 npx tsx src/appview.ts & 388 - sleep 3 && curl -s http://localhost:9401/health | python3 -m json.tool 389 - kill %1 390 - ``` 391 - 392 - Expected: `{"status": "ok"}`. The new table is created alongside existing tables. 
393 - 394 - - [ ] **Step 3: Commit** 395 - 396 - ```bash 397 - git add apps/ionosphere-appview/src/db.ts 398 - git commit -m "feat: add lenses table to appview schema" 399 - ``` 400 - 401 - ### Task 6: Index lens records 402 - 403 - **Files:** 404 - - Modify: `apps/ionosphere-appview/src/indexer.ts` 405 - 406 - - [ ] **Step 1: Add `org.relationaltext.lens` to IONOSPHERE_COLLECTIONS** 407 - 408 - In `indexer.ts`, add to the `IONOSPHERE_COLLECTIONS` array: 409 - 410 - ```typescript 411 - export const IONOSPHERE_COLLECTIONS = [ 412 - "tv.ionosphere.event", 413 - "tv.ionosphere.talk", 414 - "tv.ionosphere.speaker", 415 - "tv.ionosphere.concept", 416 - "tv.ionosphere.transcript", 417 - "tv.ionosphere.annotation", 418 - "org.relationaltext.lens", 419 - ]; 420 - ``` 421 - 422 - - [ ] **Step 2: Add delete handler for lenses** 423 - 424 - In the `processEvent` function's delete switch statement, add: 425 - 426 - ```typescript 427 - case "org.relationaltext.lens": 428 - db.prepare("DELETE FROM lenses WHERE uri = ?").run(uri); 429 - break; 430 - ``` 431 - 432 - - [ ] **Step 3: Add create/update handler for lenses** 433 - 434 - Add a new case in the create/update switch and the indexer function: 435 - 436 - ```typescript 437 - case "org.relationaltext.lens": 438 - indexLens(db, event.did, rkey, uri, record); 439 - break; 440 - ``` 441 - 442 - ```typescript 443 - function indexLens( 444 - db: Database.Database, 445 - did: string, 446 - rkey: string, 447 - uri: string, 448 - record: Record<string, unknown> 449 - ): void { 450 - db.prepare( 451 - `INSERT OR REPLACE INTO lenses 452 - (uri, did, rkey, source_nsid, target_nsid, version, chain_json) 453 - VALUES (?, ?, ?, ?, ?, ?, ?)` 454 - ).run( 455 - uri, 456 - did, 457 - rkey, 458 - (record.source as string) || null, 459 - (record.target as string) || null, 460 - (record.version as number) || 1, 461 - record.chainJson ? 
JSON.stringify(record.chainJson) : null 462 - ); 463 - } 464 - ``` 465 - 466 - - [ ] **Step 4: Run existing tests** 467 - 468 - ```bash 469 - cd apps/ionosphere-appview && pnpm test 470 - ``` 471 - 472 - Expected: Existing tests still pass. 473 - 474 - - [ ] **Step 5: Commit** 475 - 476 - ```bash 477 - git add apps/ionosphere-appview/src/indexer.ts 478 - git commit -m "feat: index org.relationaltext.lens records in appview" 479 - ``` 480 - 481 - ### Task 7: Create lens resolver 482 - 483 - The lens resolver is used by pipeline scripts to find the right lens for a source→target pair. It checks the appview index first, then falls back to fetching directly from the PDS. 484 - 485 - **Files:** 486 - - Create: `apps/ionosphere-appview/src/lens-resolver.ts` 487 - 488 - - [ ] **Step 1: Write the lens resolver** 489 - 490 - ```typescript 491 - // apps/ionosphere-appview/src/lens-resolver.ts 492 - import type Database from "better-sqlite3"; 493 - 494 - const PDS_URL = process.env.PDS_URL ?? "http://localhost:2690"; 495 - const BOT_HANDLE = process.env.BOT_HANDLE ?? "ionosphere.test"; 496 - 497 - interface ResolvedLens { 498 - chainJson: string; 499 - source: string; 500 - target: string; 501 - } 502 - 503 - /** 504 - * Resolve a lens by source and target NSID. 505 - * 506 - * Resolution order: 507 - * 1. Appview SQLite index (fast, local) 508 - * 2. PDS direct fetch (always available after publish) 509 - * 3. null (not found) 510 - */ 511 - export async function resolveLensRecord( 512 - source: string, 513 - target: string, 514 - db?: Database.Database 515 - ): Promise<ResolvedLens | null> { 516 - // 1. Try appview index first (if db handle provided) 517 - if (db) { 518 - const row = db 519 - .prepare( 520 - "SELECT source_nsid, target_nsid, chain_json FROM lenses WHERE source_nsid = ? AND target_nsid = ? 
LIMIT 1" 521 - ) 522 - .get(source, target) as any; 523 - if (row?.chain_json) { 524 - return { 525 - chainJson: row.chain_json, 526 - source: row.source_nsid, 527 - target: row.target_nsid, 528 - }; 529 - } 530 - } 531 - 532 - // 2. Fall back to PDS direct fetch 533 - try { 534 - const handleRes = await fetch( 535 - `${PDS_URL}/xrpc/com.atproto.identity.resolveHandle?handle=${BOT_HANDLE}` 536 - ); 537 - if (!handleRes.ok) return null; 538 - const { did } = (await handleRes.json()) as { did: string }; 539 - 540 - let cursor: string | undefined; 541 - do { 542 - const params = new URLSearchParams({ 543 - repo: did, 544 - collection: "org.relationaltext.lens", 545 - limit: "100", 546 - }); 547 - if (cursor) params.set("cursor", cursor); 548 - 549 - const res = await fetch( 550 - `${PDS_URL}/xrpc/com.atproto.repo.listRecords?${params}` 551 - ); 552 - if (!res.ok) return null; 553 - const data = await res.json(); 554 - 555 - for (const record of data.records || []) { 556 - const v = record.value; 557 - if (v.source === source && v.target === target) { 558 - return { 559 - chainJson: v.chainJson, 560 - source: v.source, 561 - target: v.target, 562 - }; 563 - } 564 - } 565 - 566 - cursor = data.cursor; 567 - } while (cursor); 568 - } catch { 569 - // PDS not available 570 - } 571 - 572 - return null; 573 - } 574 - ``` 575 - 576 - - [ ] **Step 2: Commit** 577 - 578 - ```bash 579 - git add apps/ionosphere-appview/src/lens-resolver.ts 580 - git commit -m "feat: lens resolver with PDS fetch fallback" 581 - ``` 582 - 583 - --- 584 - 585 - ## Chunk 3: Pipeline Integration 586 - 587 - Wire panproto lenses into the actual pipeline scripts: publish, ingest, transcribe. 588 - 589 - ### Task 8: Publish lens records 590 - 591 - **Files:** 592 - - Modify: `apps/ionosphere-appview/src/publish.ts` 593 - 594 - - [ ] **Step 1: Add lens publishing as step 0** 595 - 596 - Add at the top of the `main()` function in `publish.ts`, before publishing events. 
This reads the source and target lexicons, auto-generates a protolens chain, serializes it, and writes a lens record to the PDS. 597 - 598 - ```typescript 599 - // 0. Publish lens records 600 - console.log("Publishing lens records..."); 601 - const { loadSchema, serializeChain } = await import("@ionosphere/format/panproto"); 602 - const lexiconDir = path.resolve(import.meta.dirname, "../../../lexicons"); 603 - 604 - async function publishLens( 605 - sourceLexiconPath: string, 606 - targetLexiconPath: string, 607 - rkey: string 608 - ) { 609 - const sourceLexicon = JSON.parse( 610 - readFileSync(path.join(lexiconDir, sourceLexiconPath), "utf-8") 611 - ); 612 - const targetLexicon = JSON.parse( 613 - readFileSync(path.join(lexiconDir, targetLexiconPath), "utf-8") 614 - ); 615 - 616 - const sourceSchema = await loadSchema(sourceLexicon); 617 - const targetSchema = await loadSchema(targetLexicon); 618 - const chainJson = await serializeChain(sourceSchema, targetSchema); 619 - 620 - const sourceNsid = sourceLexicon.id; 621 - const targetNsid = targetLexicon.id; 622 - 623 - await pds.putRecord("org.relationaltext.lens", rkey, { 624 - $type: "org.relationaltext.lens", 625 - source: sourceNsid, 626 - target: targetNsid, 627 - version: 1, 628 - chainJson, 629 - }); 630 - 631 - console.log(` Lens: ${sourceNsid} → ${targetNsid}`); 632 - } 633 - 634 - await publishLens( 635 - "community/lexicon/calendar/event.json", 636 - "tv/ionosphere/talk.json", 637 - "calendar-event-to-talk-v1" 638 - ); 639 - await publishLens( 640 - "place/stream/video.json", 641 - "tv/ionosphere/talk.json", 642 - "vod-to-talk-v1" 643 - ); 644 - // Whisper lens only if the lexicon works with parseLexicon 645 - // (may need json-schema protocol instead — test in Task 3) 646 - ``` 647 - 648 - - [ ] **Step 2: Test publish** 649 - 650 - ```bash 651 - cd apps/ionosphere-appview && npx tsx src/publish.ts 652 - ``` 653 - 654 - Expected: "Publishing lens records..." 
followed by lens creation messages, then the normal event/speaker/talk/transcript publishing. 655 - 656 - - [ ] **Step 3: Verify lens records on PDS** 657 - 658 - ```bash 659 - DID=$(curl -s "http://localhost:2690/xrpc/com.atproto.identity.resolveHandle?handle=ionosphere.test" | python3 -c "import sys,json; print(json.load(sys.stdin)['did'])") 660 - curl -s "http://localhost:2690/xrpc/com.atproto.repo.listRecords?repo=$DID&collection=org.relationaltext.lens&limit=10" | python3 -m json.tool 661 - ``` 662 - 663 - Expected: Lens records visible on PDS. 664 - 665 - - [ ] **Step 4: Commit** 666 - 667 - ```bash 668 - git add apps/ionosphere-appview/src/publish.ts 669 - git commit -m "feat: publish lens records to PDS" 670 - ``` 671 - 672 - ### Task 9: Wire panproto into ingest.ts 673 - 674 - **Files:** 675 - - Modify: `apps/ionosphere-appview/src/ingest.ts` 676 - 677 - - [ ] **Step 1: Replace loadLens/applyLens with panproto convert** 678 - 679 - In `ingest.ts`: 680 - 681 - 1. Remove the imports of `loadLens` and `applyLens` from `@ionosphere/format/lenses` 682 - 2. Add panproto imports 683 - 3. Replace `parseScheduleEvent` to use panproto convert 684 - 4. Replace `parseVodRecord` to use panproto convert (currently ad-hoc) 685 - 686 - The key change in `parseScheduleEvent`: 687 - 688 - ```typescript 689 - // Before: 690 - const mapped = applyLens(scheduleLens, v); 691 - 692 - // After: 693 - const mapped = await convert(v, calendarSchema, talkSchema, { 694 - eventUri: "", 695 - }); 696 - ``` 697 - 698 - Note: `ingest.ts` currently does filtering logic (skip cancelled, skip info/food types) before applying the lens. This filtering stays as TypeScript — it's not a schema transform, it's business logic. 699 - 700 - Also note: `ingest.ts` writes to a local staging SQLite, not the PDS. The lens is used to normalize source field names, not to produce final PDS records. 
701 - 702 - - [ ] **Step 2: Initialize panproto and load schemas at top of main()** 703 - 704 - ```typescript 705 - import { init as initPanproto, loadSchema, convert } from "@ionosphere/format/panproto"; 706 - // ... at top of main(): 707 - const pp = await initPanproto(); 708 - 709 - const calendarLexicon = JSON.parse( 710 - readFileSync( 711 - path.resolve(import.meta.dirname, "../../../lexicons/community/lexicon/calendar/event.json"), 712 - "utf-8" 713 - ) 714 - ); 715 - const talkLexicon = JSON.parse( 716 - readFileSync( 717 - path.resolve(import.meta.dirname, "../../../lexicons/tv/ionosphere/talk.json"), 718 - "utf-8" 719 - ) 720 - ); 721 - const calendarSchema = pp.parseLexicon(calendarLexicon); 722 - const talkSchema = pp.parseLexicon(talkLexicon); 723 - ``` 724 - 725 - - [ ] **Step 3: Test ingest still works** 726 - 727 - ```bash 728 - cd apps/ionosphere-appview && npx tsx src/ingest.ts 729 - ``` 730 - 731 - Expected: Same output as before — schedule events fetched, VODs fetched, correlated, ingested. The lens is applied transparently. 732 - 733 - - [ ] **Step 4: Commit** 734 - 735 - ```bash 736 - git add apps/ionosphere-appview/src/ingest.ts 737 - git commit -m "refactor: use panproto lenses in ingest pipeline" 738 - ``` 739 - 740 - ### Task 10: Wire panproto into Whisper provider 741 - 742 - **Files:** 743 - - Modify: `apps/ionosphere-appview/src/providers/openai-whisper.ts` 744 - 745 - - [ ] **Step 1: Evaluate Whisper lens feasibility** 746 - 747 - The Whisper provider currently does a simple mapping: `{ word, start, end }` → `{ word, start, end, confidence: 1.0 }`. This is ad-hoc in `openai-whisper.ts:25-30`. 748 - 749 - Check whether `parseLexicon` works with the Whisper lexicon we created. If the Whisper output isn't an ATProto record, we may need `pp.protocol('json-schema')` to create the schema. 
750 - 751 - If panproto can't handle this boundary (Whisper output isn't really a lexicon), **keep the ad-hoc mapping** and add a comment noting it as a future graduation candidate. The calendar→talk and VOD→talk lenses are the high-value boundaries. 752 - 753 - - [ ] **Step 2: If feasible, replace the ad-hoc mapping with lens convert** 754 - 755 - Replace lines 25-30 in `openai-whisper.ts` with a panproto convert call. If not feasible, add a comment: 756 - 757 - ```typescript 758 - // Lens candidate: Whisper output → ionosphere transcript format. 759 - // Currently ad-hoc because Whisper output is not an ATProto record. 760 - // Migrate when panproto's json-schema protocol supports this shape. 761 - ``` 762 - 763 - - [ ] **Step 3: Test transcription still works** 764 - 765 - ```bash 766 - cd apps/ionosphere-appview && npx tsx src/transcribe.ts 767 - ``` 768 - 769 - Expected: Provider still returns correctly formatted transcript data. 770 - 771 - - [ ] **Step 4: Commit** 772 - 773 - ```bash 774 - git add apps/ionosphere-appview/src/providers/openai-whisper.ts 775 - git commit -m "refactor: evaluate and document Whisper lens boundary" 776 - ``` 777 - 778 - --- 779 - 780 - ## Chunk 4: Cleanup and Verification 781 - 782 - Remove old lens code and files, run full test suite, verify end-to-end. 783 - 784 - ### Task 11: Replace old lens implementation 785 - 786 - Now that the pipeline is wired to panproto, replace the old `lenses.ts` with re-exports. 787 - 788 - **Files:** 789 - - Modify: `formats/tv.ionosphere/ts/lenses.ts` 790 - - Modify: `formats/tv.ionosphere/ts/lenses.test.ts` 791 - 792 - - [ ] **Step 1: Update lenses.ts to re-export from panproto** 793 - 794 - Replace the entire contents of `formats/tv.ionosphere/ts/lenses.ts` with: 795 - 796 - ```typescript 797 - // Legacy export path — re-exports from panproto wrapper. 798 - // Pipeline code should import from "./panproto.js" directly. 
799 - export { init, loadSchema, createLens, convert, serializeChain } from "./panproto.js"; 800 - export type { LensHandle, BuiltSchema, Panproto } from "./panproto.js"; 801 - ``` 802 - 803 - - [ ] **Step 2: Update lenses.test.ts** 804 - 805 - Replace with a single smoke test that verifies the re-export works: 806 - 807 - ```typescript 808 - import { describe, it, expect } from "vitest"; 809 - import { init } from "./lenses.js"; 810 - 811 - describe("lenses re-export", () => { 812 - it("re-exports init from panproto", async () => { 813 - const pp = await init(); 814 - expect(pp).toBeDefined(); 815 - }); 816 - }); 817 - ``` 818 - 819 - - [ ] **Step 3: Run all format tests** 820 - 821 - ```bash 822 - cd formats/tv.ionosphere && pnpm test 823 - ``` 824 - 825 - Expected: All tests pass. 826 - 827 - - [ ] **Step 4: Commit** 828 - 829 - ```bash 830 - git add formats/tv.ionosphere/ts/lenses.ts formats/tv.ionosphere/ts/lenses.test.ts 831 - git commit -m "refactor: replace custom lens runtime with panproto re-export" 832 - ``` 833 - 834 - ### Task 12: Remove old lens JSON files 835 - 836 - **Files:** 837 - - Delete contents of: `formats/tv.ionosphere/lenses/` directory 838 - 839 - - [ ] **Step 1: Remove old lens spec files** 840 - 841 - The `lenses/` directory contained our custom lens JSON specs. These are superseded by the source lexicons + panproto auto-generation. 842 - 843 - ```bash 844 - rm formats/tv.ionosphere/lenses/*.lens.json 845 - ``` 846 - 847 - - [ ] **Step 2: Add a README to the lenses directory** 848 - 849 - Create `formats/tv.ionosphere/lenses/README.md`: 850 - 851 - ```markdown 852 - # Lenses 853 - 854 - Lens generation is now handled by panproto from source and target lexicon pairs. 855 - See `lexicons/` for source schemas and `docs/superpowers/specs/2026-03-31-lens-layer-design.md` for details. 
856 - ``` 857 - 858 - - [ ] **Step 3: Commit** 859 - 860 - ```bash 861 - git add formats/tv.ionosphere/lenses/ 862 - git commit -m "chore: remove old custom lens specs, replaced by panproto" 863 - ``` 864 - 865 - ### Task 13: Full verification 866 - 867 - - [ ] **Step 1: Run format package tests** 868 - 869 - ```bash 870 - cd formats/tv.ionosphere && pnpm test 871 - ``` 872 - 873 - Expected: All tests pass. 874 - 875 - - [ ] **Step 2: Run appview tests** 876 - 877 - ```bash 878 - cd apps/ionosphere-appview && pnpm test 879 - ``` 880 - 881 - Expected: All tests pass. 882 - 883 - - [ ] **Step 3: Run typecheck across workspace** 884 - 885 - ```bash 886 - cd /Users/blainecook/Code/skeetv && pnpm -r typecheck 887 - ``` 888 - 889 - Expected: No type errors. 890 - 891 - - [ ] **Step 4: End-to-end verification** 892 - 893 - Start the PDS (already running), run publish, start appview, verify lens records are indexed: 894 - 895 - ```bash 896 - cd apps/ionosphere-appview 897 - npx tsx src/publish.ts 898 - PORT=9401 npx tsx src/appview.ts & 899 - sleep 5 900 - # Verify lenses are in the appview DB 901 - sqlite3 data/ionosphere.sqlite "SELECT source_nsid, target_nsid, version FROM lenses" 902 - kill %1 903 - ``` 904 - 905 - Expected: Lens records visible in the SQLite database. 906 - 907 - - [ ] **Step 5: Commit any final fixes** 908 - 909 - ```bash 910 - git add -A && git status 911 - # Only commit if there are changes 912 - git commit -m "fix: final verification fixes for lens layer" 913 - ```
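A note on `chainJson` in the removed lens plan: `serializeChain` is typed as returning a string, and `publishLens` writes that string into the record, while `indexLens` passes `record.chainJson` through `JSON.stringify`. If the field really is a string when it reaches the indexer, the stored `chain_json` would be a JSON-encoded string (wrapped in quotes, with escapes), which would differ from the raw value the PDS fallback in `resolveLensRecord` returns. A minimal sketch of one way to keep the two paths consistent is below. `normalizeChainJson` is a hypothetical helper that does not appear in the plan, and the sketch assumes the field may arrive either as a string or as a structured value.

```typescript
// Hypothetical helper (not part of the original plan): serialize chainJson
// at most once before it is written to the lenses table.
function normalizeChainJson(value: unknown): string | null {
  // Missing or empty values are stored as NULL, as in the plan's indexLens.
  if (value === undefined || value === null || value === "") return null;
  // Already a serialized chain (what serializeChain is typed to return):
  // store it unchanged so it is not JSON-encoded a second time.
  if (typeof value === "string") return value;
  // Structured value: serialize exactly once.
  return JSON.stringify(value);
}
```

Under that assumption, the last argument to `.run(...)` in `indexLens` could become `normalizeChainJson(record.chainJson)`, so the index path and the PDS path hand back the same string.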
-482
docs/superpowers/plans/2026-03-31-tests-ci.md
··· 1 - # Tests & CI/CD Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Add frontend unit tests for logic-heavy transcript code and a GitHub Actions CI pipeline that runs typecheck + test on every push and PR. 6 - 7 - **Architecture:** Extract pure functions from TranscriptView.tsx into a testable module, test them with Vitest, add CI workflow that runs all workspace tests. 8 - 9 - **Tech Stack:** Vitest, GitHub Actions, pnpm 10 - 11 - **Spec:** `docs/superpowers/specs/2026-03-31-tests-ci-design.md` 12 - 13 - --- 14 - 15 - ## File Map 16 - 17 - ### New files 18 - - `apps/ionosphere/src/lib/transcript.ts` — pure functions extracted from TranscriptView.tsx 19 - - `apps/ionosphere/src/lib/transcript.test.ts` — unit tests 20 - - `apps/ionosphere/vitest.config.ts` — Vitest config for the frontend package 21 - - `.github/workflows/ci.yml` — GitHub Actions workflow 22 - 23 - ### Modified files 24 - - `apps/ionosphere/package.json` — add vitest devDependency and test script 25 - - `apps/ionosphere/src/app/components/TranscriptView.tsx` — import from extracted module instead of inline 26 - 27 - --- 28 - 29 - ## Chunk 1: Frontend Unit Tests 30 - 31 - ### Task 1: Add Vitest to the frontend package 32 - 33 - **Files:** 34 - - Modify: `apps/ionosphere/package.json` 35 - - Create: `apps/ionosphere/vitest.config.ts` 36 - 37 - - [ ] **Step 1: Add vitest devDependency** 38 - 39 - ```bash 40 - cd apps/ionosphere && pnpm add -D vitest 41 - ``` 42 - 43 - - [ ] **Step 2: Create vitest config** 44 - 45 - Create `apps/ionosphere/vitest.config.ts`: 46 - 47 - ```typescript 48 - import { defineConfig } from "vitest/config"; 49 - import path from "node:path"; 50 - 51 - export default defineConfig({ 52 - test: { 53 - include: ["src/**/*.test.ts"], 54 - }, 55 - resolve: { 56 - alias: { 57 
- "@": path.resolve(__dirname, "src"), 58 - }, 59 - }, 60 - }); 61 - ``` 62 - 63 - - [ ] **Step 3: Add test script to package.json** 64 - 65 - Add to the `scripts` section: 66 - ```json 67 - "test": "vitest run" 68 - ``` 69 - 70 - - [ ] **Step 4: Verify vitest runs (no tests yet)** 71 - 72 - ```bash 73 - cd apps/ionosphere && pnpm test 74 - ``` 75 - 76 - Expected: "No test files found" or similar — no error. 77 - 78 - - [ ] **Step 5: Commit** 79 - 80 - ```bash 81 - git add apps/ionosphere/package.json apps/ionosphere/vitest.config.ts pnpm-lock.yaml 82 - git commit -m "chore: add vitest to frontend package" 83 - ``` 84 - 85 - ### Task 2: Extract pure functions from TranscriptView 86 - 87 - **Files:** 88 - - Create: `apps/ionosphere/src/lib/transcript.ts` 89 - - Modify: `apps/ionosphere/src/app/components/TranscriptView.tsx` 90 - 91 - - [ ] **Step 1: Create the extracted module** 92 - 93 - Create `apps/ionosphere/src/lib/transcript.ts` with the pure functions from TranscriptView.tsx. Extract these functions and types: 94 - 95 - - `TranscriptFacet` interface 96 - - `TranscriptDocument` interface 97 - - `WordSpan` interface 98 - - `ConceptSpan` interface 99 - - `extractData(doc: TranscriptDocument)` — parses facets into sorted word spans and concept spans with shared boundary times 100 - - `brightnessAtTime(currentTimeNs: number, timeNs: number): number` — brightness falloff calculation 101 - - `toColor(brightness: number, concept: ConceptSpan | null): string` — brightness + concept → CSS color 102 - - Constants: `BASE_BRIGHTNESS`, `PEAK_BRIGHTNESS`, `WINDOW_NS` 103 - 104 - These functions are currently defined inline in TranscriptView.tsx (lines 6-139). Copy them to the new module and export them. 
105 - 106 - - [ ] **Step 2: Update TranscriptView.tsx to import from the extracted module** 107 - 108 - Replace the inline definitions with imports: 109 - 110 - ```typescript 111 - import { 112 - extractData, 113 - brightnessAtTime, 114 - toColor, 115 - type TranscriptDocument, 116 - type WordSpan, 117 - type ConceptSpan, 118 - } from "@/lib/transcript"; 119 - ``` 120 - 121 - Remove the inline `extractData`, `brightnessAtTime`, `toColor`, `WordSpan`, `ConceptSpan`, `TranscriptFacet`, `TranscriptDocument` definitions and the constants from TranscriptView.tsx. Keep the React components (`WordSpanComponent`, `TranscriptView`). 122 - 123 - - [ ] **Step 3: Verify the app still builds** 124 - 125 - ```bash 126 - cd apps/ionosphere && npx next build 2>&1 | tail -5 127 - ``` 128 - 129 - This may fail if the appview isn't serving data for SSG, but it should at least get past TypeScript compilation. If it fails on data fetching, that's fine — we're checking that the imports resolve. 130 - 131 - Alternatively, just run typecheck: 132 - ```bash 133 - cd apps/ionosphere && npx tsc --noEmit 134 - ``` 135 - 136 - - [ ] **Step 4: Commit** 137 - 138 - ```bash 139 - git add apps/ionosphere/src/lib/transcript.ts apps/ionosphere/src/app/components/TranscriptView.tsx 140 - git commit -m "refactor: extract pure transcript functions into testable module" 141 - ``` 142 - 143 - ### Task 3: Write transcript unit tests 144 - 145 - **Files:** 146 - - Create: `apps/ionosphere/src/lib/transcript.test.ts` 147 - 148 - - [ ] **Step 1: Write the test file** 149 - 150 - ```typescript 151 - import { describe, it, expect } from "vitest"; 152 - import { 153 - extractData, 154 - brightnessAtTime, 155 - toColor, 156 - BASE_BRIGHTNESS, 157 - PEAK_BRIGHTNESS, 158 - WINDOW_NS, 159 - type TranscriptDocument, 160 - } from "./transcript"; 161 - 162 - // --- Test fixtures --- 163 - 164 - function makeDoc(words: Array<{ text: string; startNs: number; endNs: number }>, concepts?: Array<{ byteStart: number; 
byteEnd: number; name: string }>): TranscriptDocument { 165 - const encoder = new TextEncoder(); 166 - const text = words.map((w) => w.text).join(" "); 167 - const facets: TranscriptDocument["facets"] = []; 168 - 169 - let searchFrom = 0; 170 - for (const w of words) { 171 - const idx = text.indexOf(w.text, searchFrom); 172 - const byteStart = encoder.encode(text.slice(0, idx)).length; 173 - const byteEnd = encoder.encode(text.slice(0, idx + w.text.length)).length; 174 - 175 - facets.push({ 176 - index: { byteStart, byteEnd }, 177 - features: [{ 178 - $type: "tv.ionosphere.facet#timestamp", 179 - startTime: w.startNs, 180 - endTime: w.endNs, 181 - }], 182 - }); 183 - searchFrom = idx + w.text.length; 184 - } 185 - 186 - if (concepts) { 187 - for (const c of concepts) { 188 - facets.push({ 189 - index: { byteStart: c.byteStart, byteEnd: c.byteEnd }, 190 - features: [{ 191 - $type: "tv.ionosphere.facet#concept-ref", 192 - conceptUri: `at://test/tv.ionosphere.concept/${c.name}`, 193 - conceptRkey: c.name, 194 - conceptName: c.name, 195 - }], 196 - }); 197 - } 198 - } 199 - 200 - return { text, facets }; 201 - } 202 - 203 - // --- extractData --- 204 - 205 - describe("extractData", () => { 206 - it("extracts words sorted by start time", () => { 207 - const doc = makeDoc([ 208 - { text: "Hello", startNs: 0, endNs: 500_000_000 }, 209 - { text: "world", startNs: 500_000_000, endNs: 1_000_000_000 }, 210 - ]); 211 - 212 - const { words } = extractData(doc); 213 - expect(words).toHaveLength(2); 214 - expect(words[0].text).toBe("Hello"); 215 - expect(words[1].text).toBe("world"); 216 - expect(words[0].startTime).toBe(0); 217 - expect(words[1].startTime).toBe(500_000_000); 218 - }); 219 - 220 - it("computes shared boundary times between adjacent words", () => { 221 - const doc = makeDoc([ 222 - { text: "Hello", startNs: 0, endNs: 400_000_000 }, 223 - { text: "world", startNs: 600_000_000, endNs: 1_000_000_000 }, 224 - ]); 225 - 226 - const { words } = extractData(doc); 227 - 
228 - // First word: boundaryStart = own startTime, boundaryEnd = midpoint to next 229 - expect(words[0].boundaryStartTime).toBe(0); 230 - expect(words[0].boundaryEndTime).toBe(500_000_000); // (400M + 600M) / 2 231 - 232 - // Second word: boundaryStart = midpoint from prev, boundaryEnd = own endTime 233 - expect(words[1].boundaryStartTime).toBe(500_000_000); 234 - expect(words[1].boundaryEndTime).toBe(1_000_000_000); 235 - 236 - // KEY INVARIANT: word N boundaryEnd === word N+1 boundaryStart 237 - expect(words[0].boundaryEndTime).toBe(words[1].boundaryStartTime); 238 - }); 239 - 240 - it("matches concepts to words by byte range overlap", () => { 241 - const encoder = new TextEncoder(); 242 - // "AT Protocol" — concept covers the whole phrase 243 - const doc = makeDoc([ 244 - { text: "AT", startNs: 0, endNs: 200_000_000 }, 245 - { text: "Protocol", startNs: 200_000_000, endNs: 600_000_000 }, 246 - { text: "rocks", startNs: 700_000_000, endNs: 1_000_000_000 }, 247 - ]); 248 - 249 - const text = "AT Protocol rocks"; 250 - const conceptStart = 0; 251 - const conceptEnd = encoder.encode("AT Protocol").length; 252 - 253 - doc.facets.push({ 254 - index: { byteStart: conceptStart, byteEnd: conceptEnd }, 255 - features: [{ 256 - $type: "tv.ionosphere.facet#concept-ref", 257 - conceptUri: "at://test/tv.ionosphere.concept/at-protocol", 258 - conceptRkey: "at-protocol", 259 - conceptName: "AT Protocol", 260 - }], 261 - }); 262 - 263 - const { wordConcepts } = extractData(doc); 264 - 265 - // "AT" and "Protocol" overlap with the concept 266 - expect(wordConcepts[0]).toHaveLength(1); 267 - expect(wordConcepts[0][0].conceptName).toBe("AT Protocol"); 268 - expect(wordConcepts[1]).toHaveLength(1); 269 - // "rocks" does not 270 - expect(wordConcepts[2]).toHaveLength(0); 271 - }); 272 - 273 - it("handles empty document", () => { 274 - const doc: TranscriptDocument = { text: "", facets: [] }; 275 - const { words, concepts, wordConcepts } = extractData(doc); 276 - 
expect(words).toHaveLength(0); 277 - expect(concepts).toHaveLength(0); 278 - expect(wordConcepts).toHaveLength(0); 279 - }); 280 - }); 281 - 282 - // --- brightnessAtTime --- 283 - 284 - describe("brightnessAtTime", () => { 285 - it("returns peak brightness at current time", () => { 286 - expect(brightnessAtTime(1_000_000_000, 1_000_000_000)).toBe(PEAK_BRIGHTNESS); 287 - }); 288 - 289 - it("returns base brightness beyond the window", () => { 290 - const current = 5_000_000_000; 291 - const distant = current + WINDOW_NS + 1; 292 - expect(brightnessAtTime(current, distant)).toBe(BASE_BRIGHTNESS); 293 - }); 294 - 295 - it("returns intermediate brightness within the window", () => { 296 - const current = 5_000_000_000; 297 - const halfway = current + WINDOW_NS / 2; 298 - const b = brightnessAtTime(current, halfway); 299 - expect(b).toBeGreaterThan(BASE_BRIGHTNESS); 300 - expect(b).toBeLessThan(PEAK_BRIGHTNESS); 301 - }); 302 - 303 - it("is symmetric around current time", () => { 304 - const current = 5_000_000_000; 305 - const offset = 500_000_000; 306 - expect(brightnessAtTime(current, current + offset)) 307 - .toBe(brightnessAtTime(current, current - offset)); 308 - }); 309 - 310 - it("uses quadratic easing (not linear)", () => { 311 - const current = 5_000_000_000; 312 - const quarter = current + WINDOW_NS / 4; 313 - const half = current + WINDOW_NS / 2; 314 - const bQuarter = brightnessAtTime(current, quarter); 315 - const bHalf = brightnessAtTime(current, half); 316 - 317 - // With quadratic easing, the drop from peak to quarter should be 318 - // less than the drop from quarter to half (steeper falloff further out) 319 - const dropToQuarter = PEAK_BRIGHTNESS - bQuarter; 320 - const dropQuarterToHalf = bQuarter - bHalf; 321 - expect(dropQuarterToHalf).toBeGreaterThan(dropToQuarter); 322 - }); 323 - }); 324 - 325 - // --- toColor --- 326 - 327 - describe("toColor", () => { 328 - it("returns grayscale for non-concept words", () => { 329 - const color = toColor(0.5, 
null); 330 - expect(color).toMatch(/^rgb\(\d+ \d+ \d+\)$/); 331 - // Grayscale: all three channels equal 332 - const [r, g, b] = color.match(/\d+/g)!.map(Number); 333 - expect(r).toBe(g); 334 - expect(r).toBe(b); 335 - }); 336 - 337 - it("returns amber tint for concept words", () => { 338 - const concept = { 339 - byteStart: 0, 340 - byteEnd: 5, 341 - conceptUri: "at://test/concept/foo", 342 - conceptRkey: "foo", 343 - conceptName: "Foo", 344 - }; 345 - const color = toColor(0.8, concept); 346 - const [r, g, b] = color.match(/\d+/g)!.map(Number); 347 - // Amber: red > green > blue 348 - expect(r).toBeGreaterThan(g); 349 - expect(g).toBeGreaterThan(b); 350 - }); 351 - 352 - it("returns near-gray for dim concepts", () => { 353 - const concept = { 354 - byteStart: 0, 355 - byteEnd: 5, 356 - conceptUri: "at://test/concept/foo", 357 - conceptRkey: "foo", 358 - conceptName: "Foo", 359 - }; 360 - const color = toColor(BASE_BRIGHTNESS, concept); 361 - const [r, g, b] = color.match(/\d+/g)!.map(Number); 362 - // At base brightness, the amber tint should be very subtle 363 - // (channels close together) 364 - expect(Math.abs(r - b)).toBeLessThan(30); 365 - }); 366 - }); 367 - ``` 368 - 369 - - [ ] **Step 2: Run tests — expect failure** 370 - 371 - ```bash 372 - cd apps/ionosphere && pnpm test 373 - ``` 374 - 375 - Expected: FAIL — `./transcript` module doesn't export the expected functions yet (Task 2 may not be done, or the exports don't match). 376 - 377 - - [ ] **Step 3: Fix any import/export mismatches** 378 - 379 - Ensure `transcript.ts` exports match what the tests import. All functions and types listed in the test imports must be exported. 380 - 381 - - [ ] **Step 4: Run tests — expect pass** 382 - 383 - ```bash 384 - cd apps/ionosphere && pnpm test 385 - ``` 386 - 387 - Expected: All tests pass. 
388 - 389 - - [ ] **Step 5: Commit** 390 - 391 - ```bash 392 - git add apps/ionosphere/src/lib/transcript.test.ts 393 - git commit -m "test: unit tests for transcript extraction, brightness, and concept overlay" 394 - ``` 395 - 396 - --- 397 - 398 - ## Chunk 2: CI/CD Pipeline 399 - 400 - ### Task 4: Create GitHub Actions workflow 401 - 402 - **Files:** 403 - - Create: `.github/workflows/ci.yml` 404 - 405 - - [ ] **Step 1: Create the workflow file** 406 - 407 - ```yaml 408 - name: CI 409 - 410 - on: 411 - push: 412 - branches: [main] 413 - pull_request: 414 - branches: [main] 415 - 416 - jobs: 417 - check: 418 - runs-on: ubuntu-latest 419 - 420 - steps: 421 - - uses: actions/checkout@v4 422 - 423 - - uses: pnpm/action-setup@v4 424 - with: 425 - version: 10 426 - 427 - - uses: actions/setup-node@v4 428 - with: 429 - node-version: 24 430 - cache: pnpm 431 - 432 - - run: pnpm install --frozen-lockfile 433 - 434 - - name: Typecheck 435 - run: pnpm -r typecheck 436 - 437 - - name: Test 438 - run: pnpm -r test 439 - ``` 440 - 441 - Note: `pnpm -r typecheck` runs `tsc --noEmit` in format and appview packages. The frontend doesn't have a typecheck script yet — add one if it fails, but Next.js projects typically typecheck during `next build`. 442 - 443 - - [ ] **Step 2: Add typecheck script to frontend package.json if missing** 444 - 445 - If `apps/ionosphere/package.json` doesn't have a `typecheck` script, add: 446 - ```json 447 - "typecheck": "tsc --noEmit" 448 - ``` 449 - 450 - The frontend needs a `tsconfig.json` that `tsc --noEmit` can use. Check if one exists — Next.js projects always have one. 451 - 452 - - [ ] **Step 3: Verify the full CI sequence locally** 453 - 454 - ```bash 455 - cd /Users/blainecook/Code/skeetv 456 - pnpm install --frozen-lockfile 457 - pnpm -r typecheck 458 - pnpm -r test 459 - ``` 460 - 461 - Expected: All typecheck and tests pass. The panproto tests will skip (no WASM in CI) but shouldn't fail. 
462 - 463 - - [ ] **Step 4: Commit** 464 - 465 - ```bash 466 - git add .github/workflows/ci.yml apps/ionosphere/package.json 467 - git commit -m "ci: add GitHub Actions workflow for typecheck and test" 468 - ``` 469 - 470 - ### Task 5: Verify CI works 471 - 472 - - [ ] **Step 1: Push and check** 473 - 474 - Push the branch (or create a PR) and verify the GitHub Actions workflow runs successfully. 475 - 476 - ```bash 477 - git push origin main 478 - ``` 479 - 480 - Then check: `gh run list --limit 1` 481 - 482 - Expected: Workflow runs, typecheck passes, tests pass (panproto tests skip gracefully).
-566
docs/superpowers/plans/2026-03-31-word-index.md
··· 1 - # Word Index Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Build a book-style word concordance at `/index` with Pretext multi-column layout and a fixed player column that shows video + transcript for any clicked word. 6 - 7 - **Architecture:** New appview endpoint builds the concordance from transcripts in SQLite. Frontend page uses Pretext for multi-column typesetting, reuses VideoPlayer + TranscriptView in a fixed side panel. 8 - 9 - **Tech Stack:** `@chenglou/pretext`, Next.js, existing VideoPlayer/TranscriptView components, Hono API 10 - 11 - **Spec:** `docs/superpowers/specs/2026-03-31-word-index-design.md` 12 - 13 - --- 14 - 15 - ## File Map 16 - 17 - ### New files 18 - - `apps/ionosphere-appview/src/stopwords.ts` — stopword list 19 - - `apps/ionosphere-appview/src/concordance.ts` — builds word concordance from transcripts 20 - - `apps/ionosphere-appview/src/concordance.test.ts` — tests 21 - - `apps/ionosphere/src/app/index/page.tsx` — index page (server component, data fetching) 22 - - `apps/ionosphere/src/app/index/IndexContent.tsx` — client component with Pretext layout + player 23 - 24 - ### Modified files 25 - - `apps/ionosphere-appview/src/routes.ts` — add `GET /index` endpoint 26 - - `apps/ionosphere/src/lib/api.ts` — add `getIndex()` function 27 - - `apps/ionosphere/src/app/layout.tsx` — add "Index" nav link 28 - 29 - --- 30 - 31 - ## Chunk 1: Appview Concordance Endpoint 32 - 33 - ### Task 1: Stopword list 34 - 35 - **Files:** 36 - - Create: `apps/ionosphere-appview/src/stopwords.ts` 37 - 38 - - [ ] **Step 1: Create stopword module** 39 - 40 - ```typescript 41 - // apps/ionosphere-appview/src/stopwords.ts 42 - 43 - // Standard English stopwords + filler words. 44 - // Intentionally minimal — extend as needed. 
45 - const STOPWORDS = new Set([ 46 - // Articles, pronouns, prepositions 47 - "a", "an", "the", "i", "me", "my", "we", "our", "you", "your", 48 - "he", "she", "it", "they", "them", "his", "her", "its", "their", 49 - "this", "that", "these", "those", "who", "whom", "which", "what", 50 - "am", "is", "are", "was", "were", "be", "been", "being", 51 - "have", "has", "had", "do", "does", "did", "will", "would", 52 - "shall", "should", "may", "might", "must", "can", "could", 53 - "not", "no", "nor", "and", "but", "or", "so", "if", "then", 54 - "than", "too", "very", "just", "also", "only", 55 - "in", "on", "at", "to", "for", "of", "with", "by", "from", 56 - "up", "about", "into", "through", "during", "before", "after", 57 - "above", "below", "between", "out", "off", "over", "under", 58 - "again", "further", "once", "here", "there", "when", "where", 59 - "why", "how", "all", "each", "every", "both", "few", "more", 60 - "most", "other", "some", "such", "any", "own", "same", 61 - "as", "until", "while", "because", "although", "since", 62 - // Common verbs 63 - "get", "got", "go", "going", "gone", "come", "came", 64 - "make", "made", "take", "took", "know", "knew", "think", 65 - "thought", "see", "saw", "want", "look", "use", "find", 66 - "give", "tell", "say", "said", "let", "put", "try", 67 - "need", "keep", "start", "show", "hear", "play", "run", 68 - "move", "like", "live", "believe", "hold", "bring", 69 - "happen", "write", "provide", "sit", "stand", "lose", 70 - "pay", "meet", "include", "continue", "set", "learn", 71 - "change", "lead", "understand", "watch", "follow", "stop", 72 - "create", "speak", "read", "allow", "add", "spend", "grow", 73 - // Filler words 74 - "um", "uh", "like", "okay", "ok", "well", "right", "yeah", 75 - "yes", "no", "oh", "ah", "so", "actually", "basically", 76 - "really", "kind", "sort", "thing", "things", "stuff", 77 - "gonna", "gotta", "wanna", 78 - // Pronouns and determiners 79 - "something", "anything", "everything", "nothing", 80 - 
"someone", "anyone", "everyone", "one", "ones", 81 - // Numbers and common words 82 - "first", "two", "new", "way", "even", "much", "still", 83 - "back", "now", "long", "great", "little", "world", 84 - "good", "big", "old", "different", "lot", "able", 85 - "don", "doesn", "didn", "won", "wouldn", "couldn", 86 - "shouldn", "isn", "aren", "wasn", "weren", "hasn", 87 - "haven", "hadn", "don't", "doesn't", "didn't", "won't", 88 - "it's", "that's", "there's", "what's", "let's", 89 - "i'm", "i've", "i'll", "i'd", "we're", "we've", "we'll", 90 - "you're", "you've", "you'll", "you'd", "they're", "they've", 91 - "he's", "she's", "we'd", "they'll", "they'd", 92 - ]); 93 - 94 - export function isStopword(word: string): boolean { 95 - return STOPWORDS.has(word) || word.length <= 1; 96 - } 97 - ``` 98 - 99 - - [ ] **Step 2: Commit** 100 - 101 - ```bash 102 - git add apps/ionosphere-appview/src/stopwords.ts 103 - git commit -m "feat: stopword list for concordance index" 104 - ``` 105 - 106 - ### Task 2: Concordance builder with tests 107 - 108 - **Files:** 109 - - Create: `apps/ionosphere-appview/src/concordance.ts` 110 - - Create: `apps/ionosphere-appview/src/concordance.test.ts` 111 - 112 - - [ ] **Step 1: Write the test** 113 - 114 - ```typescript 115 - // apps/ionosphere-appview/src/concordance.test.ts 116 - import { describe, it, expect } from "vitest"; 117 - import { buildConcordance, type ConcordanceEntry } from "./concordance.js"; 118 - 119 - describe("buildConcordance", () => { 120 - const transcripts = [ 121 - { 122 - talkRkey: "talk-1", 123 - talkTitle: "Building with AT Protocol", 124 - text: "AT Protocol is a decentralized protocol for social networking", 125 - startMs: 0, 126 - timings: [300, 300, -50, 200, 400, 300, -50, 200, 300, 400], 127 - }, 128 - { 129 - talkRkey: "talk-2", 130 - talkTitle: "Decentralized Identity", 131 - text: "Protocol design for decentralized identity systems", 132 - startMs: 0, 133 - timings: [400, 300, -50, 200, 500, 300, 400], 134 - }, 
135 - ]; 136 - 137 - it("builds entries sorted alphabetically", () => { 138 - const entries = buildConcordance(transcripts); 139 - const words = entries.map((e) => e.word); 140 - expect(words).toEqual([...words].sort()); 141 - }); 142 - 143 - it("excludes stopwords", () => { 144 - const entries = buildConcordance(transcripts); 145 - const words = entries.map((e) => e.word); 146 - expect(words).not.toContain("is"); 147 - expect(words).not.toContain("a"); 148 - expect(words).not.toContain("for"); 149 - }); 150 - 151 - it("aggregates across talks with counts", () => { 152 - const entries = buildConcordance(transcripts); 153 - const protocol = entries.find((e) => e.word === "protocol"); 154 - expect(protocol).toBeDefined(); 155 - expect(protocol!.talks).toHaveLength(2); 156 - expect(protocol!.talks[0].count).toBeGreaterThanOrEqual(1); 157 - }); 158 - 159 - it("includes first timestamp for each talk occurrence", () => { 160 - const entries = buildConcordance(transcripts); 161 - const protocol = entries.find((e) => e.word === "protocol"); 162 - expect(protocol).toBeDefined(); 163 - for (const talk of protocol!.talks) { 164 - expect(talk.firstTimestampNs).toBeGreaterThanOrEqual(0); 165 - } 166 - }); 167 - 168 - it("lowercases all words", () => { 169 - const entries = buildConcordance(transcripts); 170 - for (const entry of entries) { 171 - expect(entry.word).toBe(entry.word.toLowerCase()); 172 - } 173 - }); 174 - 175 - it("handles empty input", () => { 176 - expect(buildConcordance([])).toEqual([]); 177 - }); 178 - }); 179 - ``` 180 - 181 - - [ ] **Step 2: Run test to verify it fails** 182 - 183 - ```bash 184 - cd apps/ionosphere-appview && pnpm test -- concordance.test 185 - ``` 186 - 187 - Expected: FAIL — module not found. 
188 - 189 - - [ ] **Step 3: Implement the concordance builder** 190 - 191 - ```typescript 192 - // apps/ionosphere-appview/src/concordance.ts 193 - import { decode } from "@ionosphere/format/transcript-encoding"; 194 - import { isStopword } from "./stopwords.js"; 195 - 196 - export interface ConcordanceTalkRef { 197 - rkey: string; 198 - title: string; 199 - count: number; 200 - firstTimestampNs: number; 201 - } 202 - 203 - export interface ConcordanceEntry { 204 - word: string; 205 - talks: ConcordanceTalkRef[]; 206 - totalCount: number; 207 - } 208 - 209 - interface TranscriptInput { 210 - talkRkey: string; 211 - talkTitle: string; 212 - text: string; 213 - startMs: number; 214 - timings: number[]; 215 - } 216 - 217 - /** 218 - * Build a concordance from a set of transcripts. 219 - * Returns alphabetized entries with talk references and timestamps. 220 - */ 221 - export function buildConcordance( 222 - transcripts: TranscriptInput[] 223 - ): ConcordanceEntry[] { 224 - // word → { talk rkey → { count, firstTimestampNs, title } } 225 - const index = new Map< 226 - string, 227 - Map<string, { title: string; count: number; firstTimestampNs: number }> 228 - >(); 229 - 230 - for (const t of transcripts) { 231 - // Decode compact timings to get word-level timestamps 232 - const decoded = decode({ text: t.text, startMs: t.startMs, timings: t.timings }); 233 - const words = t.text.split(/\s+/).filter((w) => w.length > 0); 234 - 235 - for (let i = 0; i < words.length; i++) { 236 - const raw = words[i].toLowerCase().replace(/[^a-z0-9'-]/g, ""); 237 - if (!raw || isStopword(raw)) continue; 238 - 239 - const timestampNs = 240 - i < decoded.words.length 241 - ? 
Math.round(decoded.words[i].start * 1e9) 242 - : 0; 243 - 244 - if (!index.has(raw)) index.set(raw, new Map()); 245 - const talkMap = index.get(raw)!; 246 - 247 - if (!talkMap.has(t.talkRkey)) { 248 - talkMap.set(t.talkRkey, { 249 - title: t.talkTitle, 250 - count: 1, 251 - firstTimestampNs: timestampNs, 252 - }); 253 - } else { 254 - const ref = talkMap.get(t.talkRkey)!; 255 - ref.count++; 256 - if (timestampNs < ref.firstTimestampNs) { 257 - ref.firstTimestampNs = timestampNs; 258 - } 259 - } 260 - } 261 - } 262 - 263 - // Convert to sorted array 264 - const entries: ConcordanceEntry[] = []; 265 - for (const [word, talkMap] of index) { 266 - const talks: ConcordanceTalkRef[] = []; 267 - let totalCount = 0; 268 - for (const [rkey, ref] of talkMap) { 269 - talks.push({ rkey, title: ref.title, count: ref.count, firstTimestampNs: ref.firstTimestampNs }); 270 - totalCount += ref.count; 271 - } 272 - // Sort talks by count descending 273 - talks.sort((a, b) => b.count - a.count); 274 - entries.push({ word, talks, totalCount }); 275 - } 276 - 277 - entries.sort((a, b) => a.word.localeCompare(b.word)); 278 - return entries; 279 - } 280 - ``` 281 - 282 - - [ ] **Step 4: Run tests** 283 - 284 - ```bash 285 - cd apps/ionosphere-appview && pnpm test -- concordance.test 286 - ``` 287 - 288 - Expected: All tests pass. 
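The per-word cleanup inside `buildConcordance` decides which tokens ever reach the index, so it helps to look at it in isolation. The following is a minimal sketch, assuming a standalone helper named `normalizeWord` (hypothetical; the plan inlines this expression rather than exporting it):

```typescript
// Hypothetical helper: the plan inlines this expression inside buildConcordance.
// Lowercases a raw token, then strips every character that is not an ASCII
// letter, a digit, an apostrophe, or a hyphen.
function normalizeWord(raw: string): string {
  return raw.toLowerCase().replace(/[^a-z0-9'-]/g, "");
}

// Sample tokens chosen to show punctuation, contractions, acronyms,
// non-ASCII letters, and punctuation-only input.
const samples = ["Protocol,", "don't", "AT", "caf\u00e9", "..."];
for (const sample of samples) {
  console.log(JSON.stringify(sample), "->", JSON.stringify(normalizeWord(sample)));
}
```

Two consequences follow from the regex as written: non-ASCII letters are dropped rather than transliterated, and "AT" normalizes to "at", which the stopword list then filters out. Whether either matters depends on the transcripts.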
289 - 290 - - [ ] **Step 5: Commit** 291 - 292 - ```bash 293 - git add apps/ionosphere-appview/src/concordance.ts apps/ionosphere-appview/src/concordance.test.ts 294 - git commit -m "feat: concordance builder with tests" 295 - ``` 296 - 297 - ### Task 3: Add /index endpoint to appview 298 - 299 - **Files:** 300 - - Modify: `apps/ionosphere-appview/src/routes.ts` 301 - 302 - - [ ] **Step 1: Add the endpoint** 303 - 304 - Read `routes.ts` first, then add a new route before the final `return app`: 305 - 306 - ```typescript 307 - app.get("/index", (c) => { 308 - // Get all transcripts with their talk info 309 - const rows = db 310 - .prepare( 311 - `SELECT tr.text, tr.start_ms, tr.timings, t.rkey as talk_rkey, t.title as talk_title 312 - FROM transcripts tr 313 - JOIN talks t ON tr.talk_uri = t.uri 314 - ORDER BY t.starts_at ASC` 315 - ) 316 - .all() as any[]; 317 - 318 - const transcripts = rows.map((r) => ({ 319 - talkRkey: r.talk_rkey, 320 - talkTitle: r.talk_title, 321 - text: r.text, 322 - startMs: r.start_ms, 323 - timings: JSON.parse(r.timings), 324 - })); 325 - 326 - const { buildConcordance } = require("./concordance.js"); 327 - const entries = buildConcordance(transcripts); 328 - 329 - return c.json({ entries }); 330 - }); 331 - ``` 332 - 333 - Note: Use a static import at the top of routes.ts instead of `require` — add `import { buildConcordance } from "./concordance.js";` at the top. 334 - 335 - - [ ] **Step 2: Test the endpoint** 336 - 337 - ```bash 338 - curl -s http://localhost:9401/index | python3 -c " 339 - import sys, json 340 - d = json.load(sys.stdin) 341 - entries = d['entries'] 342 - print(f'{len(entries)} words in concordance') 343 - if entries: 344 - print(f'First: {entries[0][\"word\"]} ({entries[0][\"totalCount\"]} occurrences)') 345 - print(f'Last: {entries[-1][\"word\"]} ({entries[-1][\"totalCount\"]} occurrences)') 346 - " 347 - ``` 348 - 349 - Note: The appview needs a restart to pick up the new route. 
Kill and restart it: 350 - ```bash 351 - pkill -f "appview.ts"; sleep 2 352 - cd apps/ionosphere-appview && PORT=9401 nohup npx tsx src/appview.ts > /tmp/appview.log 2>&1 & 353 - sleep 5 354 - ``` 355 - 356 - - [ ] **Step 3: Commit** 357 - 358 - ```bash 359 - git add apps/ionosphere-appview/src/routes.ts 360 - git commit -m "feat: add /index concordance endpoint to appview" 361 - ``` 362 - 363 - --- 364 - 365 - ## Chunk 2: Frontend Index Page 366 - 367 - ### Task 4: Install Pretext and add API function 368 - 369 - **Files:** 370 - - Modify: `apps/ionosphere/package.json` 371 - - Modify: `apps/ionosphere/src/lib/api.ts` 372 - 373 - - [ ] **Step 1: Install pretext** 374 - 375 - ```bash 376 - cd apps/ionosphere && pnpm add @chenglou/pretext 377 - ``` 378 - 379 - - [ ] **Step 2: Add getIndex() to api.ts** 380 - 381 - Add to `apps/ionosphere/src/lib/api.ts`: 382 - 383 - ```typescript 384 - export async function getIndex() { 385 - return fetchApi<{ entries: any[] }>("/index"); 386 - } 387 - ``` 388 - 389 - - [ ] **Step 3: Add Index nav link** 390 - 391 - In `apps/ionosphere/src/app/layout.tsx`, add after the Concepts nav link: 392 - 393 - ```tsx 394 - <a href="/index" className="text-sm text-neutral-400 hover:text-neutral-100">Index</a> 395 - ``` 396 - 397 - - [ ] **Step 4: Commit** 398 - 399 - ```bash 400 - git add apps/ionosphere/package.json apps/ionosphere/src/lib/api.ts apps/ionosphere/src/app/layout.tsx pnpm-lock.yaml 401 - git commit -m "feat: install pretext, add index API function and nav link" 402 - ``` 403 - 404 - ### Task 5: Index page server component 405 - 406 - **Files:** 407 - - Create: `apps/ionosphere/src/app/index/page.tsx` 408 - 409 - - [ ] **Step 1: Create the page** 410 - 411 - ```tsx 412 - // apps/ionosphere/src/app/index/page.tsx 413 - import { getIndex } from "@/lib/api"; 414 - import IndexContent from "./IndexContent"; 415 - 416 - export default async function IndexPage() { 417 - const { entries } = await getIndex(); 418 - return <IndexContent 
entries={entries} />; 419 - } 420 - ``` 421 - 422 - - [ ] **Step 2: Commit** 423 - 424 - ```bash 425 - git add apps/ionosphere/src/app/index/page.tsx 426 - git commit -m "feat: index page server component" 427 - ``` 428 - 429 - ### Task 6: IndexContent client component 430 - 431 - **Files:** 432 - - Create: `apps/ionosphere/src/app/index/IndexContent.tsx` 433 - 434 - This is the main component. It has two panels: 435 - - Left: multi-column word index (Pretext-rendered) 436 - - Right: fixed player column (VideoPlayer + TranscriptView) 437 - 438 - - [ ] **Step 1: Create IndexContent** 439 - 440 - Read the existing TalkContent.tsx for patterns (TimestampProvider wrapping, VideoPlayer/TranscriptView usage). 441 - 442 - The component should: 443 - 444 - 1. Take `entries` as props (from server component) 445 - 2. Maintain state: `selectedTalk` (rkey, title, videoUri, offsetNs, document) and `selectedTimestampNs` 446 - 3. When a talk ref is clicked: 447 - - Fetch the full talk data via `/talks/:rkey` 448 - - Set the selected talk + timestamp 449 - - Player column loads the video and transcript 450 - 4. Render the word list in multi-column layout using Pretext's `prepareWithSegments` + `layoutWithLines` for balanced columns 451 - 5. 
Group entries by first letter with letter headings 452 - 453 - The layout: 454 - ```tsx 455 - <div className="h-full flex"> 456 - {/* Left: scrollable index */} 457 - <div className="flex-1 overflow-y-auto p-6"> 458 - {/* Pretext-rendered multi-column concordance */} 459 - </div> 460 - 461 - {/* Right: fixed player column */} 462 - <div className="w-[400px] shrink-0 border-l border-neutral-800 flex flex-col"> 463 - <TimestampProvider> 464 - {selectedTalk && ( 465 - <> 466 - <div className="shrink-0"> 467 - <VideoPlayer videoUri={selectedTalk.videoUri} offsetNs={selectedTalk.offsetNs} /> 468 - </div> 469 - {selectedTalk.document && ( 470 - <div className="flex-1 min-h-0 overflow-hidden"> 471 - <TranscriptView document={selectedTalk.document} /> 472 - </div> 473 - )} 474 - </> 475 - )} 476 - </TimestampProvider> 477 - </div> 478 - </div> 479 - ``` 480 - 481 - For the initial implementation, use CSS `column-count` for the multi-column layout (get it working first), then replace with Pretext in a follow-up step. 
482 - 483 - The word list rendering: 484 - ```tsx 485 - {letterGroups.map(([letter, words]) => ( 486 - <div key={letter} className="break-inside-avoid mb-4"> 487 - <h2 className="text-lg font-bold text-neutral-400 mb-1">{letter.toUpperCase()}</h2> 488 - {words.map((entry) => ( 489 - <div key={entry.word} className="text-sm leading-relaxed"> 490 - <span className="font-medium">{entry.word}</span> 491 - {" — "} 492 - {entry.talks.map((talk, i) => ( 493 - <span key={talk.rkey}> 494 - {i > 0 && ", "} 495 - <button 496 - onClick={() => handleSelectTalk(talk.rkey, entry.word, talk.firstTimestampNs)} 497 - className="text-neutral-400 hover:text-neutral-100 underline underline-offset-2" 498 - > 499 - {talk.title} 500 - </button> 501 - {talk.count > 1 && <span className="text-neutral-600"> ({talk.count})</span>} 502 - </span> 503 - ))} 504 - </div> 505 - ))} 506 - </div> 507 - ))} 508 - ``` 509 - 510 - - [ ] **Step 2: Implement `handleSelectTalk`** 511 - 512 - When clicked, fetch the talk data and update state: 513 - ```typescript 514 - async function handleSelectTalk(rkey: string, word: string, timestampNs: number) { 515 - const API_BASE = process.env.NEXT_PUBLIC_API_URL || "http://localhost:3001"; 516 - const res = await fetch(`${API_BASE}/talks/${rkey}`); 517 - const { talk } = await res.json(); 518 - 519 - const document = talk.document ? JSON.parse(talk.document) : null; 520 - 521 - setSelectedTalk({ 522 - rkey, 523 - title: talk.title, 524 - videoUri: talk.video_uri, 525 - offsetNs: talk.video_offset_ns || 0, 526 - document: document?.facets?.length > 0 ? document : null, 527 - }); 528 - setSelectedTimestampNs(timestampNs); 529 - } 530 - ``` 531 - 532 - - [ ] **Step 3: Verify it renders** 533 - 534 - Navigate to http://localhost:9402/index in the browser. The word list should appear in multi-column layout. Clicking a talk reference should load the video and transcript in the right panel. 
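The rendering snippet in Step 1 maps over `letterGroups`, which the plan never defines. One possible way to derive it from the already-sorted `entries` is sketched below; `groupByFirstLetter` and the `"#"` bucket for tokens that start with a digit or punctuation are assumptions, not part of the plan.

```typescript
// Hypothetical helper, not from the plan. Groups alphabetically sorted
// concordance entries by their first letter, preserving input order.
// Entries that do not start with a-z share a single "#" bucket.
function groupByFirstLetter<T extends { word: string }>(entries: T[]): [string, T[]][] {
  const groups = new Map<string, T[]>();
  for (const entry of entries) {
    const first = entry.word.charAt(0);
    const letter = /^[a-z]$/.test(first) ? first : "#";
    const bucket = groups.get(letter);
    if (bucket) {
      bucket.push(entry);
    } else {
      // Map keeps insertion order, so groups come out in first-seen order.
      groups.set(letter, [entry]);
    }
  }
  return [...groups.entries()];
}

// Usage with a tiny, made-up entry list.
const letterGroups = groupByFirstLetter([
  { word: "3d" },
  { word: "api" },
  { word: "atproto" },
  { word: "blob" },
]);
console.log(letterGroups.map(([letter, words]) => `${letter}: ${words.length}`));
```

Inside the component this would probably be wrapped in `useMemo` keyed on `entries`, since the concordance can be large and does not change after load.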
535 - 536 - - [ ] **Step 4: Commit** 537 - 538 - ```bash 539 - git add apps/ionosphere/src/app/index/IndexContent.tsx 540 - git commit -m "feat: word index page with multi-column layout and player panel" 541 - ``` 542 - 543 - ### Task 7: Pretext column layout (upgrade from CSS columns) 544 - 545 - **Files:** 546 - - Modify: `apps/ionosphere/src/app/index/IndexContent.tsx` 547 - 548 - - [ ] **Step 1: Replace CSS column-count with Pretext** 549 - 550 - Use Pretext's `prepareWithSegments` and `layoutWithLines` to measure and flow the index entries into balanced columns. This gives proper column balancing and enables virtualization for the full concordance. 551 - 552 - Pretext works with a canvas for text measurement, so this needs to be a client-side effect. The approach: 553 - 554 - 1. Prepare all entry texts with Pretext 555 - 2. Layout into lines at the available width 556 - 3. Split lines into N balanced columns by total height 557 - 4. Render each column as positioned elements 558 - 559 - This step may require experimentation with Pretext's API. If Pretext's API proves too complex for the initial ship, keep CSS `column-count` and note the Pretext upgrade as a follow-up. 560 - 561 - - [ ] **Step 2: Commit** 562 - 563 - ```bash 564 - git add apps/ionosphere/src/app/index/IndexContent.tsx 565 - git commit -m "feat: upgrade index layout to Pretext balanced columns" 566 - ```
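Item 3 of the Pretext approach in Task 7, splitting lines into N balanced columns by total height, does not depend on Pretext itself once heights are measured. The sketch below assumes each block already carries a measured `height` (from Pretext or any other source); `MeasuredBlock` and `splitIntoColumns` are hypothetical names, and the greedy strategy is one option among several.

```typescript
// Hypothetical types and helper, not part of Pretext or the plan.
interface MeasuredBlock {
  id: string;
  height: number; // measured height in px
}

// Greedy, order-preserving split: fill a column until adding the next block
// would overshoot the per-column target by more than half that block, then
// move on. The target is recomputed from the remaining height each time.
function splitIntoColumns(blocks: MeasuredBlock[], columnCount: number): MeasuredBlock[][] {
  if (columnCount < 1) throw new Error("columnCount must be >= 1");
  const columns: MeasuredBlock[][] = Array.from({ length: columnCount }, () => []);
  let remaining = blocks.reduce((sum, b) => sum + b.height, 0);
  let col = 0;
  let colHeight = 0;
  let target = remaining / columnCount;

  for (const block of blocks) {
    const isLastColumn = col === columnCount - 1;
    if (!isLastColumn && colHeight > 0 && colHeight + block.height / 2 > target) {
      remaining -= colHeight;
      col += 1;
      colHeight = 0;
      target = remaining / (columnCount - col);
    }
    columns[col].push(block);
    colHeight += block.height;
  }
  return columns;
}

// Usage: four equal blocks into two columns.
const cols = splitIntoColumns(
  [
    { id: "a", height: 10 },
    { id: "b", height: 10 },
    { id: "c", height: 10 },
    { id: "d", height: 10 },
  ],
  2
);
console.log(cols.map((c) => c.map((b) => b.id).join(",")));
```

Because the concordance is alphabetical, the split keeps blocks in order rather than bin-packing them; that trades some balance for a reading order that still runs top to bottom, left to right.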
-965
docs/superpowers/plans/2026-04-01-comments-oauth.md
··· 1 - # Comments & OAuth Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Enable users to post inline comments and emoji reactions on talk transcripts via AT Protocol OAuth, with comments discoverable via public Jetstream. 6 - 7 - **Architecture:** Browser-side OAuth via `@atproto/oauth-client-browser` writes `tv.ionosphere.comment` records to user's PDS. A public Jetstream subscription delivers those records to the appview, which indexes them in SQLite and serves them via API. The frontend renders inline highlights on transcripts and a comment panel. 8 - 9 - **Tech Stack:** `@atproto/oauth-client-browser`, `@atproto/api`, Jetstream, Hono, Next.js, SQLite 10 - 11 - **Spec:** `docs/superpowers/specs/2026-04-01-comments-oauth-design.md` 12 - 13 - --- 14 - 15 - ## File Map 16 - 17 - ### New files 18 - - `lexicons/tv/ionosphere/comment.json` — comment lexicon definition 19 - - `apps/ionosphere/public/client-metadata.json` — OAuth client metadata 20 - - `apps/ionosphere/src/lib/auth.tsx` — OAuth client setup, React context, hooks 21 - - `apps/ionosphere/src/app/components/AuthButton.tsx` — sign in/out in nav 22 - - `apps/ionosphere/src/app/components/CommentHighlights.tsx` — inline highlights on transcript spans 23 - - `apps/ionosphere/src/app/components/CommentPanel.tsx` — comment thread sidebar/overlay 24 - - `apps/ionosphere/src/app/components/TextSelector.tsx` — text selection → comment/react 25 - - `apps/ionosphere/src/app/auth/callback/page.tsx` — OAuth callback page 26 - - `apps/ionosphere-appview/src/public-jetstream.ts` — public Jetstream subscription 27 - 28 - ### Modified files 29 - - `apps/ionosphere-appview/src/db.ts` — add comments table + profiles cache table 30 - - `apps/ionosphere-appview/src/indexer.ts` — handle tv.ionosphere.comment events 31 - 
- `apps/ionosphere-appview/src/routes.ts` — add comment API endpoints 32 - - `apps/ionosphere-appview/src/appview.ts` — start public Jetstream connection 33 - - `apps/ionosphere/src/app/components/NavHeader.tsx` — add auth button 34 - - `apps/ionosphere/src/app/components/TranscriptView.tsx` — integrate comment highlights 35 - - `apps/ionosphere/src/app/layout.tsx` — wrap with auth provider 36 - - `apps/ionosphere/package.json` — add @atproto/oauth-client-browser, @atproto/api 37 - 38 - --- 39 - 40 - ## Chunk 1: Comment Lexicon + Appview Indexing 41 - 42 - Backend infrastructure — define the lexicon, add DB table, index comments, serve API. 43 - 44 - ### Task 1: Comment lexicon definition 45 - 46 - **Files:** 47 - - Create: `lexicons/tv/ionosphere/comment.json` 48 - 49 - - [ ] **Step 1: Create the lexicon file** 50 - 51 - ```json 52 - { 53 - "lexicon": 1, 54 - "$type": "com.atproto.lexicon.schema", 55 - "id": "tv.ionosphere.comment", 56 - "revision": 1, 57 - "description": "A comment or reaction on a transcript, talk, or another comment.", 58 - "defs": { 59 - "main": { 60 - "type": "record", 61 - "key": "tid", 62 - "record": { 63 - "type": "object", 64 - "required": ["subject", "text", "createdAt"], 65 - "properties": { 66 - "subject": { 67 - "type": "string", 68 - "format": "at-uri", 69 - "description": "AT URI of the transcript, talk, or parent comment." 70 - }, 71 - "text": { 72 - "type": "string", 73 - "maxLength": 10000, 74 - "description": "Comment body or single emoji reaction." 75 - }, 76 - "facets": { 77 - "type": "array", 78 - "items": { "type": "ref", "ref": "app.bsky.richtext.facet" }, 79 - "description": "Rich text facets (mentions, links) in the comment." 80 - }, 81 - "anchor": { 82 - "type": "ref", 83 - "ref": "#byteRange", 84 - "description": "Optional byte range on the subject's text." 
85 - }, 86 - "createdAt": { 87 - "type": "string", 88 - "format": "datetime" 89 - } 90 - } 91 - } 92 - }, 93 - "byteRange": { 94 - "type": "object", 95 - "required": ["byteStart", "byteEnd"], 96 - "properties": { 97 - "byteStart": { "type": "integer" }, 98 - "byteEnd": { "type": "integer" } 99 - } 100 - } 101 - } 102 - } 103 - ``` 104 - 105 - - [ ] **Step 2: Commit** 106 - 107 - ```bash 108 - git add lexicons/tv/ionosphere/comment.json 109 - git commit -m "feat: tv.ionosphere.comment lexicon definition" 110 - ``` 111 - 112 - ### Task 2: Comments table + indexer 113 - 114 - **Files:** 115 - - Modify: `apps/ionosphere-appview/src/db.ts` 116 - - Modify: `apps/ionosphere-appview/src/indexer.ts` 117 - 118 - - [ ] **Step 1: Add comments table to db.ts migrate function** 119 - 120 - Add after the `lenses` table: 121 - 122 - ```sql 123 - CREATE TABLE IF NOT EXISTS comments ( 124 - uri TEXT PRIMARY KEY, 125 - author_did TEXT NOT NULL, 126 - rkey TEXT NOT NULL, 127 - subject_uri TEXT NOT NULL, 128 - text TEXT NOT NULL, 129 - facets TEXT, 130 - byte_start INTEGER, 131 - byte_end INTEGER, 132 - created_at TEXT NOT NULL, 133 - indexed_at TEXT DEFAULT CURRENT_TIMESTAMP 134 - ); 135 - 136 - CREATE INDEX IF NOT EXISTS idx_comments_subject ON comments(subject_uri); 137 - CREATE INDEX IF NOT EXISTS idx_comments_author ON comments(author_did); 138 - ``` 139 - 140 - - [ ] **Step 2: Add tv.ionosphere.comment to IONOSPHERE_COLLECTIONS in indexer.ts** 141 - 142 - Add `"tv.ionosphere.comment"` to the `IONOSPHERE_COLLECTIONS` array. This also makes the backfill pick it up. 
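The lexicon's `anchor` is a byte range, while a browser text selection is expressed in UTF-16 code units, so the frontend has to convert between the two before writing a record. A minimal sketch of that conversion, assuming the offsets are UTF-8 byte offsets as in `app.bsky.richtext.facet` (the plan does not state the encoding) and that the selection indices do not split a surrogate pair; `byteRangeFor` and the placeholder subject URI are hypothetical:

```typescript
interface ByteRange {
  byteStart: number;
  byteEnd: number;
}

// Hypothetical helper, not from the plan: converts a JavaScript string
// selection [charStart, charEnd) into UTF-8 byte offsets.
function byteRangeFor(text: string, charStart: number, charEnd: number): ByteRange {
  if (charStart < 0 || charEnd < charStart || charEnd > text.length) {
    throw new RangeError("selection out of bounds");
  }
  const encoder = new TextEncoder();
  const byteStart = encoder.encode(text.slice(0, charStart)).length;
  const byteEnd = byteStart + encoder.encode(text.slice(charStart, charEnd)).length;
  return { byteStart, byteEnd };
}

// Usage: anchor a comment on "wörld" in a made-up transcript string.
const transcriptText = "h\u00e9llo w\u00f6rld";
const commentRecord = {
  $type: "tv.ionosphere.comment",
  subject: "at://did:example:alice/example.collection/abc", // placeholder URI
  text: "nice",
  anchor: byteRangeFor(transcriptText, 6, 11),
  createdAt: new Date().toISOString(),
};
console.log(commentRecord.anchor);
```

The accented characters each take two bytes in UTF-8, which is why the byte offsets drift away from the character indices.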
143 - 144 - - [ ] **Step 3: Add delete handler** 145 - 146 - In the `processEvent` delete switch: 147 - ```typescript 148 - case "tv.ionosphere.comment": 149 - db.prepare("DELETE FROM comments WHERE uri = ?").run(uri); 150 - break; 151 - ``` 152 - 153 - - [ ] **Step 4: Add create/update handler** 154 - 155 - New case in the create/update switch: 156 - ```typescript 157 - case "tv.ionosphere.comment": 158 - indexComment(db, event.did, rkey, uri, record); 159 - break; 160 - ``` 161 - 162 - New function: 163 - ```typescript 164 - function indexComment( 165 - db: Database.Database, 166 - did: string, 167 - rkey: string, 168 - uri: string, 169 - record: Record<string, unknown> 170 - ): void { 171 - const anchor = record.anchor as { byteStart: number; byteEnd: number } | undefined; 172 - db.prepare( 173 - `INSERT OR REPLACE INTO comments 174 - (uri, author_did, rkey, subject_uri, text, facets, byte_start, byte_end, created_at) 175 - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)` 176 - ).run( 177 - uri, 178 - did, 179 - rkey, 180 - record.subject as string, 181 - record.text as string, 182 - record.facets ? JSON.stringify(record.facets) : null, 183 - anchor?.byteStart ?? null, 184 - anchor?.byteEnd ?? 
null, 185 - record.createdAt as string 186 - ); 187 - } 188 - ``` 189 - 190 - - [ ] **Step 5: Run tests** 191 - 192 - ```bash 193 - cd apps/ionosphere-appview && pnpm test 194 - ``` 195 - 196 - - [ ] **Step 6: Commit** 197 - 198 - ```bash 199 - git add apps/ionosphere-appview/src/db.ts apps/ionosphere-appview/src/indexer.ts 200 - git commit -m "feat: comments table and indexer for tv.ionosphere.comment" 201 - ``` 202 - 203 - ### Task 3: Comment API endpoints 204 - 205 - **Files:** 206 - - Modify: `apps/ionosphere-appview/src/routes.ts` 207 - 208 - - [ ] **Step 1: Add comment endpoints** 209 - 210 - Add before the concordance cache section: 211 - 212 - ```typescript 213 - // --- Comments --- 214 - 215 - app.get("/talks/:rkey/comments", (c) => { 216 - const { rkey } = c.req.param(); 217 - const talk = db.prepare("SELECT uri FROM talks WHERE rkey = ?").get(rkey) as any; 218 - if (!talk) return c.json({ comments: [] }); 219 - 220 - // Get transcript URI for this talk 221 - const transcript = db.prepare("SELECT uri FROM transcripts WHERE talk_uri = ?").get(talk.uri) as any; 222 - 223 - // Get comments on the talk URI or its transcript URI 224 - const subjectUris = [talk.uri]; 225 - if (transcript) subjectUris.push(transcript.uri); 226 - 227 - const placeholders = subjectUris.map(() => "?").join(","); 228 - const comments = db.prepare( 229 - `SELECT * FROM comments WHERE subject_uri IN (${placeholders}) ORDER BY created_at ASC` 230 - ).all(...subjectUris); 231 - 232 - return c.json({ comments }); 233 - }); 234 - 235 - app.get("/comments", (c) => { 236 - const subject = c.req.query("subject"); 237 - if (!subject) return c.json({ comments: [] }); 238 - 239 - const comments = db.prepare( 240 - "SELECT * FROM comments WHERE subject_uri = ? 
ORDER BY created_at ASC" 241 - ).all(subject); 242 - 243 - return c.json({ comments }); 244 - }); 245 - ``` 246 - 247 - - [ ] **Step 2: Commit** 248 - 249 - ```bash 250 - git add apps/ionosphere-appview/src/routes.ts 251 - git commit -m "feat: comment API endpoints" 252 - ``` 253 - 254 - ### Task 4: Public Jetstream subscription 255 - 256 - **Files:** 257 - - Create: `apps/ionosphere-appview/src/public-jetstream.ts` 258 - - Modify: `apps/ionosphere-appview/src/appview.ts` 259 - 260 - - [ ] **Step 1: Create public-jetstream.ts** 261 - 262 - A thin wrapper that creates a second JetstreamClient for the public network: 263 - 264 - ```typescript 265 - import type Database from "better-sqlite3"; 266 - import { JetstreamClient } from "./jetstream.js"; 267 - import { processEvent } from "./indexer.js"; 268 - 269 - const PUBLIC_JETSTREAM_URL = process.env.PUBLIC_JETSTREAM_URL ?? "wss://jetstream1.us-east.bsky.network"; 270 - 271 - export function startPublicJetstream(db: Database.Database): JetstreamClient { 272 - // Separate cursor for the public firehose 273 - db.exec(` 274 - CREATE TABLE IF NOT EXISTS _public_cursor ( 275 - id INTEGER PRIMARY KEY CHECK (id = 1), 276 - cursor_us INTEGER 277 - ); 278 - INSERT OR IGNORE INTO _public_cursor (id, cursor_us) VALUES (1, NULL); 279 - `); 280 - 281 - const getCursor = (): number | null => { 282 - const row = db.prepare("SELECT cursor_us FROM _public_cursor WHERE id = 1").get() as any; 283 - return row?.cursor_us ?? null; 284 - }; 285 - 286 - const setCursor = (cursor: number): void => { 287 - db.prepare("UPDATE _public_cursor SET cursor_us = ? 
WHERE id = 1").run(cursor); 288 - }; 289 - 290 - const client = new JetstreamClient({ 291 - url: PUBLIC_JETSTREAM_URL, 292 - wantedCollections: ["tv.ionosphere.comment"], 293 - getCursor, 294 - setCursor, 295 - onEvent: (event) => { 296 - try { 297 - processEvent(db, event); 298 - } catch (err) { 299 - console.error("Public Jetstream indexer error:", err); 300 - } 301 - }, 302 - onError: (err) => console.error("Public Jetstream error:", err), 303 - }); 304 - 305 - return client; 306 - } 307 - ``` 308 - 309 - - [ ] **Step 2: Add to appview.ts** 310 - 311 - In the `init()` function, after the local Jetstream setup: 312 - 313 - ```typescript 314 - import { startPublicJetstream } from "./public-jetstream.js"; 315 - 316 - // In init(): 317 - const publicJetstream = startPublicJetstream(db); 318 - publicJetstream.start(); 319 - console.log("Public Jetstream: listening for tv.ionosphere.comment"); 320 - ``` 321 - 322 - - [ ] **Step 3: Commit** 323 - 324 - ```bash 325 - git add apps/ionosphere-appview/src/public-jetstream.ts apps/ionosphere-appview/src/appview.ts 326 - git commit -m "feat: public Jetstream subscription for user comments" 327 - ``` 328 - 329 - --- 330 - 331 - ## Chunk 2: AT Protocol OAuth 332 - 333 - Frontend authentication — OAuth flow, session management, auth UI. 
334 - 335 - ### Task 5: Install OAuth dependencies 336 - 337 - **Files:** 338 - - Modify: `apps/ionosphere/package.json` 339 - 340 - - [ ] **Step 1: Install packages** 341 - 342 - ```bash 343 - cd apps/ionosphere && pnpm add @atproto/oauth-client-browser @atproto/api 344 - ``` 345 - 346 - - [ ] **Step 2: Commit** 347 - 348 - ```bash 349 - git add apps/ionosphere/package.json pnpm-lock.yaml 350 - git commit -m "chore: add @atproto/oauth-client-browser and @atproto/api" 351 - ``` 352 - 353 - ### Task 6: OAuth client metadata 354 - 355 - **Files:** 356 - - Create: `apps/ionosphere/public/client-metadata.json` 357 - 358 - - [ ] **Step 1: Create client metadata** 359 - 360 - ```json 361 - { 362 - "client_id": "http://localhost:9402/client-metadata.json", 363 - "client_name": "Ionosphere", 364 - "client_uri": "http://localhost:9402", 365 - "redirect_uris": ["http://localhost:9402/auth/callback"], 366 - "scope": "atproto", 367 - "grant_types": ["authorization_code", "refresh_token"], 368 - "response_types": ["code"], 369 - "token_endpoint_auth_method": "none", 370 - "application_type": "web", 371 - "dpop_bound_access_tokens": true 372 - } 373 - ``` 374 - 375 - Note: For production, `client_id` changes to `https://ionosphere.tv/client-metadata.json` and redirect URI updates accordingly. 
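The note above says only that `client_id` and the redirect URI change for production. One way to keep the two environments from drifting apart is to derive the whole document from a single origin. The sketch below is an illustration under that assumption: `buildClientMetadata` and the `ClientMetadata` interface are hypothetical names that do not appear in the plan, and the field values simply mirror the JSON above. It does not address whether a given authorization server accepts a localhost `client_id`.

```typescript
// Hypothetical helper (not part of the plan): builds the same fields as
// client-metadata.json from one origin so dev and production stay in sync.
interface ClientMetadata {
  client_id: string;
  client_name: string;
  client_uri: string;
  redirect_uris: string[];
  scope: string;
  grant_types: string[];
  response_types: string[];
  token_endpoint_auth_method: string;
  application_type: string;
  dpop_bound_access_tokens: boolean;
}

function buildClientMetadata(origin: string): ClientMetadata {
  // Drop trailing slashes so the paths below are not doubled up.
  const base = origin.replace(/\/+$/, "");
  return {
    client_id: `${base}/client-metadata.json`,
    client_name: "Ionosphere",
    client_uri: base,
    redirect_uris: [`${base}/auth/callback`],
    scope: "atproto",
    grant_types: ["authorization_code", "refresh_token"],
    response_types: ["code"],
    token_endpoint_auth_method: "none",
    application_type: "web",
    dpop_bound_access_tokens: true,
  };
}

// Development and production differ only in the origin passed in.
const devMetadata = buildClientMetadata("http://localhost:9402");
const prodMetadata = buildClientMetadata("https://ionosphere.tv/");
```

A build step could serialize the result with `JSON.stringify` into `public/client-metadata.json`; that wiring is left out here because the plan does not describe one.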
376 - 377 - - [ ] **Step 2: Commit** 378 - 379 - ```bash 380 - git add apps/ionosphere/public/client-metadata.json 381 - git commit -m "feat: OAuth client metadata for AT Protocol auth" 382 - ``` 383 - 384 - ### Task 7: Auth library and React context 385 - 386 - **Files:** 387 - - Create: `apps/ionosphere/src/lib/auth.tsx` 388 - - Create: `apps/ionosphere/src/app/auth/callback/page.tsx` 389 - 390 - - [ ] **Step 1: Create auth.tsx** 391 - 392 - This provides the OAuth client, React context, and hooks: 393 - 394 - ```typescript 395 - "use client"; 396 - 397 - import { createContext, useContext, useState, useEffect, useCallback, type ReactNode } from "react"; 398 - import { BrowserOAuthClient } from "@atproto/oauth-client-browser"; 399 - import { Agent } from "@atproto/api"; 400 - 401 - interface AuthState { 402 - agent: Agent | null; 403 - did: string | null; 404 - handle: string | null; 405 - loading: boolean; 406 - signIn: (handle: string) => Promise<void>; 407 - signOut: () => Promise<void>; 408 - } 409 - 410 - const AuthContext = createContext<AuthState | null>(null); 411 - 412 - export function useAuth() { 413 - const ctx = useContext(AuthContext); 414 - if (!ctx) throw new Error("useAuth must be used within AuthProvider"); 415 - return ctx; 416 - } 417 - 418 - let _oauthClient: BrowserOAuthClient | null = null; 419 - 420 - function getOAuthClient(): BrowserOAuthClient { 421 - if (!_oauthClient) { 422 - _oauthClient = new BrowserOAuthClient({ 423 - clientMetadata: `${window.location.origin}/client-metadata.json`, 424 - handleResolver: "https://bsky.social", 425 - }); 426 - } 427 - return _oauthClient; 428 - } 429 - 430 - export function AuthProvider({ children }: { children: ReactNode }) { 431 - const [agent, setAgent] = useState<Agent | null>(null); 432 - const [did, setDid] = useState<string | null>(null); 433 - const [handle, setHandle] = useState<string | null>(null); 434 - const [loading, setLoading] = useState(true); 435 - 436 - // Restore session on mount 
437 - useEffect(() => { 438 - async function restore() { 439 - try { 440 - const client = getOAuthClient(); 441 - const result = await client.init(); 442 - if (result?.session) { 443 - const newAgent = new Agent(result.session); 444 - setAgent(newAgent); 445 - setDid(result.session.did); 446 - // Resolve handle 447 - try { 448 - const profile = await newAgent.getProfile({ actor: result.session.did }); 449 - setHandle(profile.data.handle); 450 - } catch {} 451 - } 452 - } catch (err) { 453 - console.error("Auth restore error:", err); 454 - } finally { 455 - setLoading(false); 456 - } 457 - } 458 - restore(); 459 - }, []); 460 - 461 - const signIn = useCallback(async (userHandle: string) => { 462 - const client = getOAuthClient(); 463 - await client.signIn(userHandle, { 464 - scope: "atproto", 465 - }); 466 - // This redirects — the callback page handles the rest 467 - }, []); 468 - 469 - const signOut = useCallback(async () => { 470 - try { 471 - const client = getOAuthClient(); 472 - if (did) { 473 - const session = await client.restore(did); 474 - if (session) { 475 - // Revoke session 476 - } 477 - } 478 - } catch {} 479 - setAgent(null); 480 - setDid(null); 481 - setHandle(null); 482 - }, [did]); 483 - 484 - return ( 485 - <AuthContext.Provider value={{ agent, did, handle, loading, signIn, signOut }}> 486 - {children} 487 - </AuthContext.Provider> 488 - ); 489 - } 490 - ``` 491 - 492 - - [ ] **Step 2: Create OAuth callback page** 493 - 494 - `apps/ionosphere/src/app/auth/callback/page.tsx`: 495 - 496 - ```tsx 497 - "use client"; 498 - 499 - import { useEffect } from "react"; 500 - import { useRouter } from "next/navigation"; 501 - 502 - export default function AuthCallback() { 503 - const router = useRouter(); 504 - 505 - useEffect(() => { 506 - // The BrowserOAuthClient.init() in AuthProvider handles the callback 507 - // automatically when it detects the authorization code in the URL. 508 - // Just redirect back to where the user came from. 
509 - const returnTo = sessionStorage.getItem("auth_return_to") || "/talks"; 510 - router.replace(returnTo); 511 - }, [router]); 512 - 513 - return ( 514 - <div className="h-full flex items-center justify-center text-neutral-400"> 515 - Signing in... 516 - </div> 517 - ); 518 - } 519 - ``` 520 - 521 - - [ ] **Step 3: Commit** 522 - 523 - ```bash 524 - git add apps/ionosphere/src/lib/auth.tsx apps/ionosphere/src/app/auth/callback/page.tsx 525 - git commit -m "feat: AT Protocol OAuth client with React context" 526 - ``` 527 - 528 - ### Task 8: Auth button + layout integration 529 - 530 - **Files:** 531 - - Create: `apps/ionosphere/src/app/components/AuthButton.tsx` 532 - - Modify: `apps/ionosphere/src/app/components/NavHeader.tsx` 533 - - Modify: `apps/ionosphere/src/app/layout.tsx` 534 - 535 - - [ ] **Step 1: Create AuthButton** 536 - 537 - ```tsx 538 - "use client"; 539 - 540 - import { useState } from "react"; 541 - import { useAuth } from "@/lib/auth"; 542 - 543 - export default function AuthButton() { 544 - const { did, handle, loading, signIn, signOut } = useAuth(); 545 - const [inputHandle, setInputHandle] = useState(""); 546 - const [showInput, setShowInput] = useState(false); 547 - 548 - if (loading) return null; 549 - 550 - if (did) { 551 - return ( 552 - <div className="flex items-center gap-2"> 553 - <span className="text-xs text-neutral-400 hidden sm:inline"> 554 - {handle || did.slice(0, 20) + "..."} 555 - </span> 556 - <button 557 - onClick={signOut} 558 - className="text-xs text-neutral-500 hover:text-neutral-300" 559 - > 560 - Sign out 561 - </button> 562 - </div> 563 - ); 564 - } 565 - 566 - if (showInput) { 567 - return ( 568 - <form 569 - onSubmit={(e) => { 570 - e.preventDefault(); 571 - if (inputHandle) { 572 - sessionStorage.setItem("auth_return_to", window.location.pathname); 573 - signIn(inputHandle); 574 - } 575 - }} 576 - className="flex items-center gap-1" 577 - > 578 - <input 579 - type="text" 580 - value={inputHandle} 581 - onChange={(e) 
=> setInputHandle(e.target.value)} 582 - placeholder="handle.bsky.social" 583 - className="bg-neutral-900 border border-neutral-700 rounded px-2 py-1 text-xs text-neutral-200 w-40 focus:outline-none focus:border-neutral-500" 584 - autoFocus 585 - /> 586 - <button type="submit" className="text-xs text-neutral-400 hover:text-neutral-200"> 587 - Go 588 - </button> 589 - <button type="button" onClick={() => setShowInput(false)} className="text-xs text-neutral-600"> 590 - 591 - </button> 592 - </form> 593 - ); 594 - } 595 - 596 - return ( 597 - <button 598 - onClick={() => setShowInput(true)} 599 - className="text-xs text-neutral-500 hover:text-neutral-300" 600 - > 601 - Sign in 602 - </button> 603 - ); 604 - } 605 - ``` 606 - 607 - - [ ] **Step 2: Add AuthButton to NavHeader** 608 - 609 - In `NavHeader.tsx`, add AuthButton to the right side of the nav: 610 - 611 - ```tsx 612 - import AuthButton from "./AuthButton"; 613 - 614 - // Inside the nav, after the desktop nav links: 615 - <div className="ml-auto hidden md:block"> 616 - <AuthButton /> 617 - </div> 618 - ``` 619 - 620 - - [ ] **Step 3: Wrap layout with AuthProvider** 621 - 622 - In `layout.tsx`, wrap the body content with AuthProvider. Since AuthProvider is a client component, it needs to wrap the children: 623 - 624 - ```tsx 625 - import { AuthProvider } from "@/lib/auth"; 626 - 627 - // In the body: 628 - <AuthProvider> 629 - <NavHeader /> 630 - <main className="flex-1 min-h-0">{children}</main> 631 - </AuthProvider> 632 - ``` 633 - 634 - - [ ] **Step 4: Commit** 635 - 636 - ```bash 637 - git add apps/ionosphere/src/app/components/AuthButton.tsx apps/ionosphere/src/app/components/NavHeader.tsx apps/ionosphere/src/app/layout.tsx 638 - git commit -m "feat: sign in/out button with AT Protocol OAuth" 639 - ``` 640 - 641 - --- 642 - 643 - ## Chunk 3: Comment UI 644 - 645 - Frontend comment rendering and composition. 
646 - 647 - ### Task 9: Comment composition (text selection → comment/react) 648 - 649 - **Files:** 650 - - Create: `apps/ionosphere/src/app/components/TextSelector.tsx` 651 - 652 - - [ ] **Step 1: Create TextSelector** 653 - 654 - A component that detects text selection in the transcript and shows a floating toolbar with emoji reactions and a comment button: 655 - 656 - ```tsx 657 - "use client"; 658 - 659 - import { useState, useEffect, useCallback, useRef } from "react"; 660 - import { useAuth } from "@/lib/auth"; 661 - 662 - interface TextSelectorProps { 663 - containerRef: React.RefObject<HTMLDivElement>; 664 - onComment: (byteStart: number, byteEnd: number, text: string) => void; 665 - getByteRange: (selection: Selection) => { byteStart: number; byteEnd: number } | null; 666 - } 667 - 668 - const QUICK_EMOJI = ["🔥", "👏", "💡", "❓", "💯", "❤️"]; 669 - 670 - export default function TextSelector({ containerRef, onComment, getByteRange }: TextSelectorProps) { 671 - const { agent, did } = useAuth(); 672 - const [selection, setSelection] = useState<{ byteStart: number; byteEnd: number; rect: DOMRect } | null>(null); 673 - const [showCommentInput, setShowCommentInput] = useState(false); 674 - const [commentText, setCommentText] = useState(""); 675 - const toolbarRef = useRef<HTMLDivElement>(null); 676 - 677 - useEffect(() => { 678 - const container = containerRef.current; 679 - if (!container) return; 680 - 681 - const handleMouseUp = () => { 682 - const sel = window.getSelection(); 683 - if (!sel || sel.isCollapsed || !sel.rangeCount) { 684 - setSelection(null); 685 - return; 686 - } 687 - 688 - // Check if selection is within our container 689 - const range = sel.getRangeAt(0); 690 - if (!container.contains(range.commonAncestorContainer)) { 691 - setSelection(null); 692 - return; 693 - } 694 - 695 - const byteRange = getByteRange(sel); 696 - if (!byteRange) { setSelection(null); return; } 697 - 698 - const rect = range.getBoundingClientRect(); 699 - setSelection({ 
...byteRange, rect }); 700 - setShowCommentInput(false); 701 - setCommentText(""); 702 - }; 703 - 704 - document.addEventListener("mouseup", handleMouseUp); 705 - return () => document.removeEventListener("mouseup", handleMouseUp); 706 - }, [containerRef, getByteRange]); 707 - 708 - const handleEmoji = useCallback((emoji: string) => { 709 - if (!selection) return; 710 - onComment(selection.byteStart, selection.byteEnd, emoji); 711 - setSelection(null); 712 - }, [selection, onComment]); 713 - 714 - const handleSubmitComment = useCallback(() => { 715 - if (!selection || !commentText.trim()) return; 716 - onComment(selection.byteStart, selection.byteEnd, commentText.trim()); 717 - setSelection(null); 718 - setShowCommentInput(false); 719 - setCommentText(""); 720 - }, [selection, commentText, onComment]); 721 - 722 - if (!selection || !did) return null; 723 - 724 - const containerRect = containerRef.current?.getBoundingClientRect(); 725 - if (!containerRect) return null; 726 - 727 - const top = selection.rect.top - containerRect.top - 40; 728 - const left = selection.rect.left - containerRect.left + selection.rect.width / 2; 729 - 730 - return ( 731 - <div 732 - ref={toolbarRef} 733 - className="absolute z-50 bg-neutral-800 border border-neutral-700 rounded-lg shadow-xl p-1 flex items-center gap-1" 734 - style={{ top, left, transform: "translateX(-50%)" }} 735 - > 736 - {!showCommentInput ? 
( 737 - <> 738 - {QUICK_EMOJI.map((emoji) => ( 739 - <button 740 - key={emoji} 741 - onClick={() => handleEmoji(emoji)} 742 - className="w-8 h-8 flex items-center justify-center hover:bg-neutral-700 rounded text-base" 743 - > 744 - {emoji} 745 - </button> 746 - ))} 747 - <div className="w-px h-6 bg-neutral-700" /> 748 - <button 749 - onClick={() => setShowCommentInput(true)} 750 - className="px-2 h-8 text-xs text-neutral-400 hover:text-neutral-200 hover:bg-neutral-700 rounded" 751 - > 752 - Comment 753 - </button> 754 - </> 755 - ) : ( 756 - <form onSubmit={(e) => { e.preventDefault(); handleSubmitComment(); }} className="flex items-center gap-1"> 757 - <input 758 - type="text" 759 - value={commentText} 760 - onChange={(e) => setCommentText(e.target.value)} 761 - placeholder="Add a comment..." 762 - className="bg-neutral-900 border border-neutral-600 rounded px-2 py-1 text-xs text-neutral-200 w-48 focus:outline-none" 763 - autoFocus 764 - /> 765 - <button type="submit" className="text-xs text-neutral-400 hover:text-neutral-200 px-1"> 766 - Post 767 - </button> 768 - </form> 769 - )} 770 - </div> 771 - ); 772 - } 773 - ``` 774 - 775 - - [ ] **Step 2: Commit** 776 - 777 - ```bash 778 - git add apps/ionosphere/src/app/components/TextSelector.tsx 779 - git commit -m "feat: text selection toolbar for comments and emoji reactions" 780 - ``` 781 - 782 - ### Task 10: Comment publishing logic 783 - 784 - **Files:** 785 - - Create: `apps/ionosphere/src/lib/comments.ts` 786 - 787 - - [ ] **Step 1: Create comments.ts** 788 - 789 - Functions for publishing and fetching comments: 790 - 791 - ```typescript 792 - import type { Agent } from "@atproto/api"; 793 - 794 - const API_BASE = typeof window !== "undefined" 795 - ? 
(process.env.NEXT_PUBLIC_API_URL || "http://localhost:9401") 796 - : ""; 797 - 798 - export async function publishComment( 799 - agent: Agent, 800 - subject: string, 801 - text: string, 802 - anchor?: { byteStart: number; byteEnd: number } 803 - ): Promise<string> { 804 - const record: Record<string, unknown> = { 805 - $type: "tv.ionosphere.comment", 806 - subject, 807 - text, 808 - createdAt: new Date().toISOString(), 809 - }; 810 - if (anchor) { 811 - record.anchor = anchor; 812 - } 813 - 814 - const result = await agent.com.atproto.repo.createRecord({ 815 - repo: agent.assertDid, 816 - collection: "tv.ionosphere.comment", 817 - record, 818 - }); 819 - 820 - return result.data.uri; 821 - } 822 - 823 - export async function fetchComments(talkRkey: string): Promise<any[]> { 824 - const res = await fetch(`${API_BASE}/talks/${talkRkey}/comments`); 825 - if (!res.ok) return []; 826 - const { comments } = await res.json(); 827 - return comments; 828 - } 829 - 830 - export async function fetchReplies(commentUri: string): Promise<any[]> { 831 - const res = await fetch(`${API_BASE}/comments?subject=${encodeURIComponent(commentUri)}`); 832 - if (!res.ok) return []; 833 - const { comments } = await res.json(); 834 - return comments; 835 - } 836 - ``` 837 - 838 - - [ ] **Step 2: Commit** 839 - 840 - ```bash 841 - git add apps/ionosphere/src/lib/comments.ts 842 - git commit -m "feat: comment publishing and fetching library" 843 - ``` 844 - 845 - ### Task 11: Integrate comments into TranscriptView 846 - 847 - **Files:** 848 - - Modify: `apps/ionosphere/src/app/components/TranscriptView.tsx` 849 - 850 - - [ ] **Step 1: Add comment highlights to transcript rendering** 851 - 852 - This is the integration point. The TranscriptView already renders word spans with byte ranges. Add: 853 - 854 - 1. A prop for comments data 855 - 2. Highlight styling on spans that have comments/reactions 856 - 3. Click handler to open comment thread 857 - 4. 
The TextSelector component for creating new comments 858 - 859 - The exact integration depends on the TranscriptView structure (which renders word spans in a loop). The key changes: 860 - 861 - - Accept `comments` prop and `onPublishComment` callback 862 - - For each word span, check if any comment's byte range overlaps 863 - - If so, add a subtle highlight style (e.g., dotted underline, background tint) 864 - - Show small emoji clusters near highlighted spans 865 - - Include `<TextSelector>` component inside the transcript container 866 - 867 - The `getByteRange` callback for TextSelector needs to map a DOM Selection back to byte offsets — use the word spans' `byteStart`/`byteEnd` data to determine the range. 868 - 869 - - [ ] **Step 2: Commit** 870 - 871 - ```bash 872 - git add apps/ionosphere/src/app/components/TranscriptView.tsx 873 - git commit -m "feat: inline comment highlights on transcript" 874 - ``` 875 - 876 - ### Task 12: Comment panel 877 - 878 - **Files:** 879 - - Create: `apps/ionosphere/src/app/components/CommentPanel.tsx` 880 - 881 - - [ ] **Step 1: Create CommentPanel** 882 - 883 - A panel that shows comments for a selected transcript span. 
Can be integrated into the player sidebar or as an overlay: 884 - 885 - ```tsx 886 - "use client"; 887 - 888 - import { useState, useEffect } from "react"; 889 - import { useAuth } from "@/lib/auth"; 890 - import { fetchReplies, publishComment } from "@/lib/comments"; 891 - 892 - interface Comment { 893 - uri: string; 894 - author_did: string; 895 - text: string; 896 - byte_start: number | null; 897 - byte_end: number | null; 898 - created_at: string; 899 - } 900 - 901 - interface CommentPanelProps { 902 - comments: Comment[]; 903 - subjectUri: string; 904 - selectedRange?: { byteStart: number; byteEnd: number }; 905 - } 906 - 907 - export default function CommentPanel({ comments, subjectUri, selectedRange }: CommentPanelProps) { 908 - const { agent, did } = useAuth(); 909 - 910 - // Filter to comments on the selected range (or all if no range selected) 911 - const visible = selectedRange 912 - ? comments.filter((c) => 913 - c.byte_start !== null && 914 - c.byte_end !== null && 915 - c.byte_start < selectedRange.byteEnd && 916 - c.byte_end > selectedRange.byteStart 917 - ) 918 - : comments; 919 - 920 - // Group emoji reactions 921 - const reactions = visible.filter((c) => c.text.length <= 2); 922 - const textComments = visible.filter((c) => c.text.length > 2); 923 - 924 - // Emoji counts 925 - const emojiCounts = new Map<string, number>(); 926 - for (const r of reactions) { 927 - emojiCounts.set(r.text, (emojiCounts.get(r.text) || 0) + 1); 928 - } 929 - 930 - return ( 931 - <div className="p-3 text-sm"> 932 - {emojiCounts.size > 0 && ( 933 - <div className="flex gap-2 mb-3 flex-wrap"> 934 - {[...emojiCounts.entries()].map(([emoji, count]) => ( 935 - <span key={emoji} className="bg-neutral-800 rounded-full px-2 py-0.5 text-xs"> 936 - {emoji} {count > 1 && count} 937 - </span> 938 - ))} 939 - </div> 940 - )} 941 - {textComments.map((comment) => ( 942 - <div key={comment.uri} className="mb-3 border-l-2 border-neutral-800 pl-3"> 943 - <div className="text-xs 
text-neutral-500 mb-0.5"> 944 - {comment.author_did.slice(0, 24)}... 945 - </div> 946 - <div className="text-neutral-300">{comment.text}</div> 947 - <div className="text-xs text-neutral-600 mt-0.5"> 948 - {new Date(comment.created_at).toLocaleDateString()} 949 - </div> 950 - </div> 951 - ))} 952 - {visible.length === 0 && ( 953 - <div className="text-neutral-600 text-xs">No comments on this selection</div> 954 - )} 955 - </div> 956 - ); 957 - } 958 - ``` 959 - 960 - - [ ] **Step 2: Commit** 961 - 962 - ```bash 963 - git add apps/ionosphere/src/app/components/CommentPanel.tsx 964 - git commit -m "feat: comment panel with emoji counts and threaded display" 965 - ```
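Task 11 describes the `getByteRange` mapping only in prose, and the `CommentPanel` filter relies on an overlap test between byte ranges. The sketch below shows the arithmetic behind both, without any DOM access. It assumes that anchors are UTF-8 byte offsets into the transcript text and that ranges are half-open (`byteEnd` is exclusive), which matches the comparison used in `CommentPanel`. The function names `charRangeToByteRange`, `spanUnion`, and `rangesOverlap` are hypothetical and do not come from the plan.

```typescript
interface ByteRange {
  byteStart: number;
  byteEnd: number;
}

// Convert a JavaScript (UTF-16) character range into UTF-8 byte offsets.
// Assumes charStart and charEnd fall on code point boundaries.
function charRangeToByteRange(text: string, charStart: number, charEnd: number): ByteRange {
  const encoder = new TextEncoder();
  const byteStart = encoder.encode(text.slice(0, charStart)).length;
  const byteEnd = byteStart + encoder.encode(text.slice(charStart, charEnd)).length;
  return { byteStart, byteEnd };
}

// Combine the word spans touched by a selection into one covering range.
function spanUnion(spans: ByteRange[]): ByteRange | null {
  if (spans.length === 0) return null;
  return {
    byteStart: Math.min(...spans.map((s) => s.byteStart)),
    byteEnd: Math.max(...spans.map((s) => s.byteEnd)),
  };
}

// Half-open interval overlap: ranges that merely touch do not overlap.
function rangesOverlap(a: ByteRange, b: ByteRange): boolean {
  return a.byteStart < b.byteEnd && a.byteEnd > b.byteStart;
}

// Non-ASCII characters make byte offsets diverge from character offsets.
const sample = "héllo wörld";
const firstWord = charRangeToByteRange(sample, 0, 5);
const secondWord = charRangeToByteRange(sample, 6, 11);
const selected = spanUnion([firstWord, secondWord]);
```

In a real `getByteRange`, the spans passed to `spanUnion` would come from the word elements covered by the DOM `Selection`; how those elements expose their `byteStart`/`byteEnd` values depends on the `TranscriptView` markup, which this excerpt does not show.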
-672
docs/superpowers/plans/2026-04-02-comment-ui-polish.md
··· 1 - # Comment UI Polish Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Polish the comment system with author identity resolution, discoverability hints, a whole-talk reaction bar, and comment count badges on talk listings. 6 - 7 - **Architecture:** Four independent features layered on the existing comment pipeline (OAuth → PDS → Jetstream → appview → frontend). The backend adds a `profiles` table and enriches API responses; the frontend adds new UI components and updates existing ones to use profile data. 8 - 9 - **Tech Stack:** SQLite (better-sqlite3), Hono, Next.js/React, AT Protocol public API, localStorage 10 - 11 - **Spec:** `docs/superpowers/specs/2026-04-02-comment-ui-polish-design.md` 12 - 13 - --- 14 - 15 - ## File Map 16 - 17 - ### Backend (apps/ionosphere-appview/src/) 18 - | File | Action | Responsibility | 19 - |------|--------|---------------| 20 - | `profiles.ts` | Create | Profile resolution + caching (fetch from public API, read/write profiles table) | 21 - | `db.ts` | Modify | Add `profiles` table to migration | 22 - | `routes.ts` | Modify | Join profile data on comment endpoints; add reaction summary to `/talks` | 23 - | `indexer.ts` | Modify | Trigger profile resolution when indexing a new comment | 24 - 25 - ### Frontend (apps/ionosphere/src/) 26 - | File | Action | Responsibility | 27 - |------|--------|---------------| 28 - | `lib/comments.ts` | Modify | Update `CommentData` type with profile fields | 29 - | `app/components/ReactionBar.tsx` | Create | Whole-talk reaction bar (emoji buttons + comment input + counts) | 30 - | `app/components/TranscriptView.tsx` | Modify | Render author handle/avatar in popover; add discoverability hint | 31 - | `app/talks/[rkey]/TalkContent.tsx` | Modify | Wire ReactionBar between video and transcript 
| 32 - | `app/talks/TalksListContent.tsx` | Modify | Render comment badges; remove header reaction display | 33 - 34 - --- 35 - 36 - ## Chunk 1: Backend — Profile Resolution + API Enrichment 37 - 38 - ### Task 1: Create profiles table and resolution module 39 - 40 - **Files:** 41 - - Modify: `apps/ionosphere-appview/src/db.ts:152` (after comments table) 42 - - Create: `apps/ionosphere-appview/src/profiles.ts` 43 - 44 - - [ ] **Step 1: Add profiles table to DB migration** 45 - 46 - In `db.ts`, add after the `comments` table and its indexes (around line 165): 47 - 48 - ```sql 49 - CREATE TABLE IF NOT EXISTS profiles ( 50 - did TEXT PRIMARY KEY, 51 - handle TEXT, 52 - display_name TEXT, 53 - avatar_url TEXT, 54 - fetched_at TEXT 55 - ); 56 - ``` 57 - 58 - - [ ] **Step 2: Create profiles.ts with resolveProfile function** 59 - 60 - ```typescript 61 - import type Database from "better-sqlite3"; 62 - 63 - const PROFILE_MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours 64 - const PUBLIC_API = "https://public.api.bsky.app"; 65 - 66 - export interface Profile { 67 - did: string; 68 - handle: string | null; 69 - display_name: string | null; 70 - avatar_url: string | null; 71 - } 72 - 73 - /** 74 - * Get a cached profile or fetch from public API. 75 - * Returns null only on fetch failure for unknown DIDs. 
76 - */ 77 - export async function resolveProfile( 78 - db: Database.Database, 79 - did: string 80 - ): Promise<Profile | null> { 81 - const cached = db 82 - .prepare("SELECT * FROM profiles WHERE did = ?") 83 - .get(did) as any; 84 - 85 - if (cached) { 86 - const age = Date.now() - new Date(cached.fetched_at).getTime(); 87 - if (age < PROFILE_MAX_AGE_MS) { 88 - return { 89 - did: cached.did, 90 - handle: cached.handle, 91 - display_name: cached.display_name, 92 - avatar_url: cached.avatar_url, 93 - }; 94 - } 95 - } 96 - 97 - // Fetch from public API 98 - try { 99 - const res = await fetch( 100 - `${PUBLIC_API}/xrpc/app.bsky.actor.getProfile?actor=${encodeURIComponent(did)}` 101 - ); 102 - if (!res.ok) return cached || null; 103 - const data = await res.json(); 104 - 105 - const profile: Profile = { 106 - did, 107 - handle: data.handle || null, 108 - display_name: data.displayName || null, 109 - avatar_url: data.avatar || null, 110 - }; 111 - 112 - db.prepare( 113 - `INSERT OR REPLACE INTO profiles (did, handle, display_name, avatar_url, fetched_at) 114 - VALUES (?, ?, ?, ?, ?)` 115 - ).run(did, profile.handle, profile.display_name, profile.avatar_url, new Date().toISOString()); 116 - 117 - return profile; 118 - } catch { 119 - return cached || null; 120 - } 121 - } 122 - 123 - /** 124 - * Fire-and-forget profile resolution. Call when indexing a comment 125 - * from a DID we haven't seen. Does not block the caller. 
126 - */ 127 - export function ensureProfile(db: Database.Database, did: string): void { 128 - const exists = db 129 - .prepare("SELECT 1 FROM profiles WHERE did = ?") 130 - .get(did); 131 - if (!exists) { 132 - resolveProfile(db, did).catch(() => {}); 133 - } 134 - } 135 - ``` 136 - 137 - - [ ] **Step 3: Commit** 138 - 139 - ```bash 140 - git add apps/ionosphere-appview/src/db.ts apps/ionosphere-appview/src/profiles.ts 141 - git commit -m "feat: add profiles table and DID-to-profile resolution" 142 - ``` 143 - 144 - ### Task 2: Trigger profile resolution on comment indexing 145 - 146 - **Files:** 147 - - Modify: `apps/ionosphere-appview/src/indexer.ts:320-343` 148 - 149 - - [ ] **Step 1: Import ensureProfile in indexer.ts** 150 - 151 - Add at top of file: 152 - 153 - ```typescript 154 - import { ensureProfile } from "./profiles.js"; 155 - ``` 156 - 157 - - [ ] **Step 2: Call ensureProfile in indexUserComment** 158 - 159 - Add at the end of the `indexUserComment` function (after the INSERT): 160 - 161 - ```typescript 162 - ensureProfile(db, did); 163 - ``` 164 - 165 - - [ ] **Step 3: Commit** 166 - 167 - ```bash 168 - git add apps/ionosphere-appview/src/indexer.ts 169 - git commit -m "feat: resolve author profiles when indexing comments" 170 - ``` 171 - 172 - ### Task 3: Enrich comment API responses with profile data 173 - 174 - **Files:** 175 - - Modify: `apps/ionosphere-appview/src/routes.ts:187-214` 176 - 177 - - [ ] **Step 1: Update /talks/:rkey/comments to join profiles** 178 - 179 - Replace the comments query (around line 198-200) with a LEFT JOIN: 180 - 181 - ```typescript 182 - const comments = db.prepare( 183 - `SELECT c.*, p.handle as author_handle, p.display_name as author_display_name, p.avatar_url as author_avatar_url 184 - FROM comments c 185 - LEFT JOIN profiles p ON c.author_did = p.did 186 - WHERE c.subject_uri IN (${placeholders}) 187 - ORDER BY c.created_at ASC` 188 - ).all(...subjectUris); 189 - ``` 190 - 191 - - [ ] **Step 2: Update /comments 
endpoint similarly** 192 - 193 - Replace the comments query (around line 209-211): 194 - 195 - ```typescript 196 - const comments = db.prepare( 197 - `SELECT c.*, p.handle as author_handle, p.display_name as author_display_name, p.avatar_url as author_avatar_url 198 - FROM comments c 199 - LEFT JOIN profiles p ON c.author_did = p.did 200 - WHERE c.subject_uri = ? 201 - ORDER BY c.created_at ASC` 202 - ).all(subject); 203 - ``` 204 - 205 - - [ ] **Step 3: Commit** 206 - 207 - ```bash 208 - git add apps/ionosphere-appview/src/routes.ts 209 - git commit -m "feat: include author profile data in comment API responses" 210 - ``` 211 - 212 - ### Task 4: Add reaction summary to /talks endpoint 213 - 214 - **Files:** 215 - - Modify: `apps/ionosphere-appview/src/routes.ts:74-86` 216 - 217 - - [ ] **Step 1: Add comment stats subquery to /talks** 218 - 219 - After the existing talks query (line 74-85), add a second query to get comment stats per talk, and merge them: 220 - 221 - ```typescript 222 - app.get("/talks", (c) => { 223 - const talks = db 224 - .prepare( 225 - `SELECT t.*, GROUP_CONCAT(s.name) as speaker_names 226 - FROM talks t 227 - LEFT JOIN talk_speakers ts ON t.uri = ts.talk_uri 228 - LEFT JOIN speakers s ON ts.speaker_uri = s.uri 229 - GROUP BY t.uri 230 - ORDER BY t.starts_at ASC` 231 - ) 232 - .all() as any[]; 233 - 234 - // Build comment stats per talk (talk URI or transcript URI as subject) 235 - const commentStats = db 236 - .prepare( 237 - `SELECT 238 - COALESCE(t.uri, c.subject_uri) as talk_uri, 239 - c.text 240 - FROM comments c 241 - LEFT JOIN transcripts tr ON c.subject_uri = tr.uri 242 - LEFT JOIN talks t ON t.uri = c.subject_uri OR t.uri = tr.talk_uri` 243 - ) 244 - .all() as any[]; 245 - 246 - // Aggregate per talk 247 - const statsMap = new Map<string, { emojis: Map<string, number>; textCount: number }>(); 248 - for (const row of commentStats) { 249 - if (!row.talk_uri) continue; 250 - if (!statsMap.has(row.talk_uri)) { 251 - 
statsMap.set(row.talk_uri, { emojis: new Map(), textCount: 0 }); 252 - } 253 - const stats = statsMap.get(row.talk_uri)!; 254 - const isEmoji = row.text.length <= 2 && !/[a-zA-Z]/.test(row.text); 255 - if (isEmoji) { 256 - stats.emojis.set(row.text, (stats.emojis.get(row.text) || 0) + 1); 257 - } else { 258 - stats.textCount++; 259 - } 260 - } 261 - 262 - const enriched = talks.map((talk: any) => { 263 - const stats = statsMap.get(talk.uri); 264 - if (!stats) return talk; 265 - // Top 3 emojis by count 266 - const topEmojis = [...stats.emojis.entries()] 267 - .sort((a, b) => b[1] - a[1]) 268 - .slice(0, 3); 269 - return { 270 - ...talk, 271 - reaction_summary: topEmojis.length > 0 ? JSON.stringify(topEmojis) : null, 272 - comment_count: stats.textCount, 273 - }; 274 - }); 275 - 276 - return c.json({ talks: enriched }); 277 - }); 278 - ``` 279 - 280 - - [ ] **Step 2: Commit** 281 - 282 - ```bash 283 - git add apps/ionosphere-appview/src/routes.ts 284 - git commit -m "feat: add reaction summary and comment count to /talks endpoint" 285 - ``` 286 - 287 - --- 288 - 289 - ## Chunk 2: Frontend — Profile Display, Reaction Bar, Badges 290 - 291 - ### Task 5: Update CommentData type with profile fields 292 - 293 - **Files:** 294 - - Modify: `apps/ionosphere/src/lib/comments.ts:28-38` 295 - 296 - - [ ] **Step 1: Add profile fields to CommentData interface** 297 - 298 - ```typescript 299 - export interface CommentData { 300 - uri: string; 301 - author_did: string; 302 - author_handle?: string | null; 303 - author_display_name?: string | null; 304 - author_avatar_url?: string | null; 305 - rkey: string; 306 - subject_uri: string; 307 - text: string; 308 - facets: string | null; 309 - byte_start: number | null; 310 - byte_end: number | null; 311 - created_at: string; 312 - } 313 - ``` 314 - 315 - - [ ] **Step 2: Commit** 316 - 317 - ```bash 318 - git add apps/ionosphere/src/lib/comments.ts 319 - git commit -m "feat: add profile fields to CommentData type" 320 - ``` 321 - 322 - 
### Task 6: Render author identity in TranscriptView popover 323 - 324 - **Files:** 325 - - Modify: `apps/ionosphere/src/app/components/TranscriptView.tsx:550-556` 326 - 327 - - [ ] **Step 1: Update the comment text rendering in the expanded popover** 328 - 329 - Replace the comment rendering block (around line 550-556): 330 - 331 - ```tsx 332 - {group.texts.map((c) => ( 333 - <div key={c.uri} className="text-[12px] text-neutral-300 border-t border-neutral-700 pt-1 mt-1"> 334 - <span className="text-neutral-500 flex items-center gap-1"> 335 - {c.author_avatar_url && ( 336 - <img src={c.author_avatar_url} alt="" className="w-3.5 h-3.5 rounded-full" /> 337 - )} 338 - {c.author_display_name || c.author_handle || c.author_did.slice(8, 24) + "..."} 339 - </span> 340 - <p>{c.text}</p> 341 - </div> 342 - ))} 343 - ``` 344 - 345 - - [ ] **Step 2: Commit** 346 - 347 - ```bash 348 - git add apps/ionosphere/src/app/components/TranscriptView.tsx 349 - git commit -m "feat: show author handle and avatar in comment popovers" 350 - ``` 351 - 352 - ### Task 7: Add discoverability hint to TranscriptView 353 - 354 - **Files:** 355 - - Modify: `apps/ionosphere/src/app/components/TranscriptView.tsx` 356 - 357 - - [ ] **Step 1: Add state for hint visibility** 358 - 359 - At the top of the `TranscriptView` component (after existing useState calls), add: 360 - 361 - ```typescript 362 - const [showHint, setShowHint] = useState(() => { 363 - if (typeof window === "undefined") return false; 364 - return !localStorage.getItem("has_commented"); 365 - }); 366 - ``` 367 - 368 - - [ ] **Step 2: Dismiss hint on comment publish** 369 - 370 - In the `handlePublish` callback, after the optimistic comment is added, add: 371 - 372 - ```typescript 373 - if (showHint) { 374 - localStorage.setItem("has_commented", "1"); 375 - setShowHint(false); 376 - } 377 - ``` 378 - 379 - - [ ] **Step 3: Render hint at bottom of transcript container** 380 - 381 - Add just before the bottom spacer `<div style={{ height: 
"calc(67% + 1rem)" }} />`: 382 - 383 - ```tsx 384 - {showHint && ( 385 - <p className="text-center text-xs text-neutral-600 mt-4 select-none"> 386 - Select text to add a reaction 387 - </p> 388 - )} 389 - ``` 390 - 391 - - [ ] **Step 4: Commit** 392 - 393 - ```bash 394 - git add apps/ionosphere/src/app/components/TranscriptView.tsx 395 - git commit -m "feat: add discoverability hint for text-selection reactions" 396 - ``` 397 - 398 - ### Task 8: Create ReactionBar component 399 - 400 - **Files:** 401 - - Create: `apps/ionosphere/src/app/components/ReactionBar.tsx` 402 - 403 - - [ ] **Step 1: Create the component** 404 - 405 - ```tsx 406 - "use client"; 407 - 408 - import { useState, useCallback } from "react"; 409 - import { useAuth } from "@/lib/auth"; 410 - import { publishComment, type CommentData } from "@/lib/comments"; 411 - 412 - const QUICK_EMOJI = ["\u{1F525}", "\u{1F44F}", "\u{1F4A1}", "\u2753", "\u{1F4AF}", "\u2764\uFE0F"]; 413 - 414 - interface ReactionBarProps { 415 - subjectUri: string; 416 - comments: CommentData[]; 417 - onCommentPublished?: () => void; 418 - } 419 - 420 - export default function ReactionBar({ subjectUri, comments, onCommentPublished }: ReactionBarProps) { 421 - const { agent, did } = useAuth(); 422 - const [showInput, setShowInput] = useState(false); 423 - const [commentText, setCommentText] = useState(""); 424 - const [posting, setPosting] = useState(false); 425 - 426 - // Whole-talk reactions (unanchored emojis) 427 - const reactionCounts = new Map<string, number>(); 428 - for (const c of comments) { 429 - if (c.byte_start === null && c.text.length <= 2 && !/[a-zA-Z]/.test(c.text)) { 430 - reactionCounts.set(c.text, (reactionCounts.get(c.text) || 0) + 1); 431 - } 432 - } 433 - 434 - const handleEmoji = useCallback(async (emoji: string) => { 435 - if (!agent) return; 436 - try { 437 - await publishComment(agent, subjectUri, emoji); 438 - onCommentPublished?.(); 439 - } catch (err) { 440 - console.error("Failed to post reaction:", 
err); 441 - } 442 - // Dismiss hint 443 - localStorage.setItem("has_commented", "1"); 444 - }, [agent, subjectUri, onCommentPublished]); 445 - 446 - const handleSubmit = useCallback(async () => { 447 - if (!agent || !commentText.trim()) return; 448 - setPosting(true); 449 - try { 450 - await publishComment(agent, subjectUri, commentText.trim()); 451 - setCommentText(""); 452 - setShowInput(false); 453 - onCommentPublished?.(); 454 - } catch (err) { 455 - console.error("Failed to post comment:", err); 456 - } finally { 457 - setPosting(false); 458 - } 459 - localStorage.setItem("has_commented", "1"); 460 - }, [agent, subjectUri, commentText, onCommentPublished]); 461 - 462 - return ( 463 - <div className="flex items-center gap-1 px-4 py-1.5 border-b border-neutral-800 bg-neutral-950/50"> 464 - {/* Existing reaction counts */} 465 - {[...reactionCounts.entries()].map(([emoji, count]) => ( 466 - <span key={emoji} className="text-xs bg-neutral-800 rounded-full px-1.5 py-0.5 border border-neutral-700"> 467 - {emoji}{count > 1 && <span className="text-neutral-500 ml-0.5">{count}</span>} 468 - </span> 469 - ))} 470 - 471 - {/* Divider if there are existing reactions */} 472 - {reactionCounts.size > 0 && <div className="w-px h-4 bg-neutral-800 mx-1" />} 473 - 474 - {/* Quick emoji buttons (only shown when logged in) */} 475 - {did && ( 476 - <> 477 - {QUICK_EMOJI.map((emoji) => ( 478 - <button 479 - key={emoji} 480 - onClick={() => handleEmoji(emoji)} 481 - className="w-6 h-6 flex items-center justify-center hover:bg-neutral-800 rounded text-sm transition-colors" 482 - >{emoji}</button> 483 - ))} 484 - <div className="w-px h-4 bg-neutral-800 mx-1" /> 485 - {!showInput ? 
( 486 - <button 487 - onClick={() => setShowInput(true)} 488 - className="text-xs text-neutral-500 hover:text-neutral-300 px-1.5 py-0.5 hover:bg-neutral-800 rounded transition-colors" 489 - >Comment</button> 490 - ) : ( 491 - <form 492 - onSubmit={(e) => { e.preventDefault(); handleSubmit(); }} 493 - className="flex items-center gap-1 flex-1 min-w-0" 494 - > 495 - <input 496 - type="text" 497 - value={commentText} 498 - onChange={(e) => setCommentText(e.target.value)} 499 - onKeyDown={(e) => { if (e.key === "Escape") { setShowInput(false); setCommentText(""); } }} 500 - placeholder="Add a comment..." 501 - className="flex-1 min-w-0 bg-neutral-900 border border-neutral-700 rounded px-2 py-0.5 text-xs text-neutral-200 placeholder:text-neutral-600 focus:outline-none" 502 - autoFocus 503 - disabled={posting} 504 - /> 505 - <button 506 - type="submit" 507 - disabled={posting || !commentText.trim()} 508 - className="text-xs text-neutral-400 hover:text-neutral-200 px-1 disabled:opacity-50" 509 - >{posting ? "..." 
: "Post"}</button> 510 - </form> 511 - )} 512 - </> 513 - )} 514 - </div> 515 - ); 516 - } 517 - ``` 518 - 519 - - [ ] **Step 2: Commit** 520 - 521 - ```bash 522 - git add apps/ionosphere/src/app/components/ReactionBar.tsx 523 - git commit -m "feat: create whole-talk ReactionBar component" 524 - ``` 525 - 526 - ### Task 9: Wire ReactionBar into TalkContent 527 - 528 - **Files:** 529 - - Modify: `apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx:120-139` 530 - 531 - - [ ] **Step 1: Import ReactionBar** 532 - 533 - Add import at top: 534 - 535 - ```typescript 536 - import ReactionBar from "@/app/components/ReactionBar"; 537 - ``` 538 - 539 - - [ ] **Step 2: Add ReactionBar between video and transcript** 540 - 541 - After the video section (around line 125) and before the transcript section (line 128), add: 542 - 543 - ```tsx 544 - {/* Whole-talk reaction bar */} 545 - <ReactionBar 546 - subjectUri={talk.uri} 547 - comments={comments} 548 - onCommentPublished={handleCommentPublished} 549 - /> 550 - ``` 551 - 552 - - [ ] **Step 3: Commit** 553 - 554 - ```bash 555 - git add apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx 556 - git commit -m "feat: add whole-talk reaction bar to talk detail page" 557 - ``` 558 - 559 - ### Task 10: Wire ReactionBar into TalksListContent and add comment badges 560 - 561 - **Files:** 562 - - Modify: `apps/ionosphere/src/app/talks/TalksListContent.tsx` 563 - 564 - - [ ] **Step 1: Import ReactionBar** 565 - 566 - Add import at top: 567 - 568 - ```typescript 569 - import ReactionBar from "@/app/components/ReactionBar"; 570 - ``` 571 - 572 - - [ ] **Step 2: Add ReactionBar to sidebar player** 573 - 574 - Replace the existing whole-talk reactions in the player header (lines 275-289) and add ReactionBar after the video player (around line 296): 575 - 576 - Remove the whole-talk reaction IIFE (the `{(() => { ... })()}` block in the header bar, lines 274-289). 
Then after the `<VideoPlayer>` closing div (around line 296), add:
-
- ```tsx
- <ReactionBar
-   subjectUri={selectedTalk.talkUri}
-   comments={comments}
-   onCommentPublished={() => fetchComments(selectedTalk.rkey).then(setComments)}
- />
- ```
-
- - [ ] **Step 3: Add Talk type fields for reaction data**
-
- Update the `Talk` interface to include:
-
- ```typescript
- reaction_summary?: string | null; // JSON: [["emoji", count], ...]
- comment_count?: number;
- ```
-
- - [ ] **Step 4: Render comment badges in talk listing entries**
-
- In the talk metadata line (around line 238), after the existing time display, add:
-
- ```tsx
- {(() => {
-   const emojis: [string, number][] = talk.reaction_summary ? JSON.parse(talk.reaction_summary) : [];
-   const count = talk.comment_count || 0;
-   if (emojis.length === 0 && count === 0) return null;
-   return (
-     <>
-       {" \u00b7 "}
-       {emojis.map(([emoji, n]) => (
-         <span key={emoji}>{emoji}{n > 1 ? n : ""}</span>
-       ))}
-       {count > 0 && <span>{emojis.length > 0 ? " " : ""}{"\uD83D\uDCAC"}{count}</span>}
-     </>
-   );
- })()}
- ```
-
- - [ ] **Step 5: Commit**
-
- ```bash
- git add apps/ionosphere/src/app/talks/TalksListContent.tsx
- git commit -m "feat: add reaction bar to sidebar player and comment badges to talk listings"
- ```
-
- ---
-
- ## Chunk 3: Integration and Cleanup
-
- ### Task 11: Remove unused CommentPanel (or keep for future use)
-
- **Files:**
- - Evaluate: `apps/ionosphere/src/app/components/CommentPanel.tsx`
-
- - [ ] **Step 1: Check if CommentPanel is imported anywhere**
-
- Run: `grep -r "CommentPanel" apps/ionosphere/src/`
-
- If it's not imported anywhere (it wasn't wired in), leave it for now; it may be useful for a future sidebar comment view. No action needed.
-
- - [ ] **Step 2: Commit (if changes made)**
-
- ### Task 12: Manual integration test
-
- - [ ] **Step 1: Start dev environment**
-
- ```bash
- cd apps/ionosphere-appview
- docker compose up -d
- PORT=9401 npx tsx src/appview.ts &
- cd ../ionosphere
- NEXT_PUBLIC_API_URL=http://localhost:9401 npx next dev --port 9402
- ```
-
- Open `http://127.0.0.1:9402/talks`
-
- - [ ] **Step 2: Verify profile resolution**
-
- Pick a talk with existing comments. Verify that comment popovers show handles/avatars instead of truncated DIDs.
-
- - [ ] **Step 3: Verify discoverability hint**
-
- Clear localStorage (`localStorage.removeItem("has_commented")`). Reload. Verify "Select text to add a reaction" appears below transcript. Select text and post a reaction. Verify hint disappears and doesn't return on reload.
-
- - [ ] **Step 4: Verify reaction bar**
-
- On a talk detail page, verify the reaction bar appears between video and transcript. Click an emoji and verify it posts. Click Comment and verify input expands, post works, input collapses.
-
- - [ ] **Step 5: Verify comment badges**
-
- On the talks list page, verify talks with reactions show emoji + count in the metadata line.
-
- - [ ] **Step 6: Verify sidebar player**
-
- Click a talk in the list. Verify the ReactionBar appears in the sidebar player and the old header reaction display is gone.
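
A possible follow-up to Task 8, offered as a sketch and not as part of the plan: the whole-talk reaction tally computed inline in `ReactionBar` could be pulled out into a pure function so the heuristic can be unit tested without rendering the component. `CommentLike` and `tallyReactions` are hypothetical names introduced here; only the `text` and `byte_start` fields and the three conditions are taken from the component code in Task 8.

```typescript
// Hypothetical minimal shape: only the two fields the heuristic reads.
// The plan's real CommentData type (from "@/lib/comments") has more fields.
interface CommentLike {
  text: string;
  byte_start: number | null;
}

// Count whole-talk reactions using the same three conditions as ReactionBar:
// the comment is unanchored, at most 2 UTF-16 code units long, and has no ASCII letters.
function tallyReactions(comments: CommentLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const c of comments) {
    const unanchored = c.byte_start === null;
    const short = c.text.length <= 2; // counts UTF-16 code units, not graphemes
    const noLetters = !/[a-zA-Z]/.test(c.text);
    if (unanchored && short && noLetters) {
      counts.set(c.text, (counts.get(c.text) ?? 0) + 1);
    }
  }
  return counts;
}

// Illustrative data only; not taken from the real comment store.
const sample: CommentLike[] = [
  { text: "\u{1F525}", byte_start: null },
  { text: "\u{1F525}", byte_start: null },
  { text: "\u2764\uFE0F", byte_start: null },
  { text: "ok", byte_start: null },       // contains letters: an ordinary comment
  { text: "\u{1F4AF}", byte_start: 120 }, // anchored to transcript text: skipped
  { text: "42", byte_start: null },       // short and letter-free: counted as a reaction
];

const counts = tallyReactions(sample);
console.log(counts.get("\u{1F525}")); // 2
```

The last sample row shows a limit of the heuristic as written in the plan: any string of at most two code units with no ASCII letters, such as `42` or `?!`, is tallied as a reaction, while emoji longer than two code units (flags, ZWJ sequences) are not. Whether that matters depends on what users actually post.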
-1354
docs/superpowers/plans/2026-04-03-enhanced-boundary-detection.md
···
- # Enhanced Boundary Detection Implementation Plan
-
- > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
-
- **Goal:** Add Whisper segment confidence and pyannote speaker diarization to the boundary detection pipeline, with formalized ground truth evaluation, to improve talk boundary accuracy.
-
- **Architecture:** Python tools (`apps/ionosphere-appview/tools/`) handle audio extraction, Whisper re-transcription with segment confidence, and speaker diarization. Each produces JSON. A merge step combines them into an enriched transcript. TypeScript `detect-boundaries-v6.ts` consumes the enriched transcript and adds speaker-change and confidence-based scoring signals. An evaluation script scores results against ground truth.
-
- **Tech Stack:** Python 3.13 (via uv), pyannote.audio 3.x, torch (MPS), OpenAI Whisper API, TypeScript (existing), vitest (existing)
-
- ---
-
- ## File Structure
-
- ### New files
-
- | File | Responsibility |
- |------|---------------|
- | `apps/ionosphere-appview/tools/pyproject.toml` | Python project config with uv |
- | `apps/ionosphere-appview/tools/extract_audio.py` | Extract audio from HLS streams (MP3 chunks + WAV) |
- | `apps/ionosphere-appview/tools/transcribe_enhanced.py` | Whisper re-transcription with segment confidence + prompt hints |
- | `apps/ionosphere-appview/tools/diarize.py` | pyannote speaker diarization → JSON |
- | `apps/ionosphere-appview/tools/merge_enrichment.py` | Combine transcript + diarization into unified JSON |
- | `apps/ionosphere-appview/tools/evaluate.py` | Score boundaries against ground truth |
- | `apps/ionosphere-appview/tools/test_evaluate.py` | Tests for evaluation scoring |
- | `apps/ionosphere-appview/tools/test_merge.py` | Tests for merge/alignment logic |
- | `apps/ionosphere-appview/data/ground-truth/great-hall-day-1.json` | Ground truth timestamps |
- | `apps/ionosphere-appview/src/detect-boundaries-v6.ts` | Enhanced boundary detection |
- | `apps/ionosphere-appview/src/detect-boundaries-v6.test.ts` | Tests for v6 scoring functions |
-
- ### Modified files
-
- | File | Change |
- |------|--------|
- | `apps/ionosphere-appview/.gitignore` | Add `data/fullday/` (large audio/transcript files) |
-
- ### Stream config (shared)
-
- The FULLDAY_STREAMS array in `transcribe-fullday.ts` is the source of truth for stream URIs. The Python tools accept stream name + URI as CLI args rather than duplicating this config.
-
- ---
-
- ## Chunk 1: Python Environment + Audio Extraction + Ground Truth
-
- ### Task 1: Python project setup
-
- **Files:**
- - Create: `apps/ionosphere-appview/tools/pyproject.toml`
- - Modify: `apps/ionosphere-appview/.gitignore`
-
- - [ ] **Step 1: Create pyproject.toml**
-
- ```toml
- [project]
- name = "ionosphere-tools"
- version = "0.1.0"
- description = "Audio enrichment tools for ionosphere boundary detection"
- requires-python = ">=3.12"
- dependencies = [
-     "openai>=1.0",
-     "pyannote-audio>=3.1",
-     "torch>=2.0",
- ]
-
- [project.optional-dependencies]
- dev = ["pytest>=8.0"]
-
- [tool.pytest.ini_options]
- testpaths = ["."]
- ```
-
- - [ ] **Step 2: Create the virtual environment**
-
- Run: `cd apps/ionosphere-appview/tools && uv sync`
- Expected: Creates `.venv/` and installs all dependencies including torch with MPS support.
76 - 77 - - [ ] **Step 3: Verify torch MPS support** 78 - 79 - Run: `cd apps/ionosphere-appview/tools && uv run python -c "import torch; print('MPS:', torch.backends.mps.is_available())"` 80 - Expected: `MPS: True` 81 - 82 - - [ ] **Step 4: Add data/fullday to gitignore** 83 - 84 - Append to `apps/ionosphere-appview/.gitignore`: 85 - ``` 86 - data/fullday/ 87 - ``` 88 - 89 - - [ ] **Step 5: Commit** 90 - 91 - ```bash 92 - git add apps/ionosphere-appview/tools/pyproject.toml apps/ionosphere-appview/.gitignore 93 - git commit -m "feat: add Python tooling environment for audio enrichment" 94 - ``` 95 - 96 - --- 97 - 98 - ### Task 2: Ground truth data 99 - 100 - **Files:** 101 - - Create: `apps/ionosphere-appview/data/ground-truth/great-hall-day-1.json` 102 - 103 - Ground truth timestamps from manual verification (memory notes). Times in seconds from stream start. Note: some talks were not manually verified — these have `verified: false`. 104 - 105 - - [ ] **Step 1: Create ground truth JSON** 106 - 107 - ```json 108 - { 109 - "stream": "Great Hall - Day 1", 110 - "notes": "Manually verified timestamps from stream playback. Tolerance is per-talk based on transition clarity.", 111 - "talks": [ 112 - { 113 - "rkey": "gDELD0M", 114 - "title": "Landslide", 115 - "speaker": "Erin Kissane", 116 - "ground_truth_start": 990, 117 - "tolerance_seconds": 120, 118 - "verified": true, 119 - "notes": "Stream starts garbled, first talk begins ~16:30" 120 - }, 121 - { 122 - "rkey": "QK9Ae6Y", 123 - "title": "Groundings with my Siblings: Lessons Learned Building for Community", 124 - "speaker": "Rudy Fraser", 125 - "ground_truth_start": 4254, 126 - "tolerance_seconds": 120, 127 - "verified": true, 128 - "notes": "1:10:54 from stream start" 129 - }, 130 - { 131 - "rkey": "obaP26x", 132 - "title": "Who owns the group chat? 
Building collaborative spaces on ATProto", 133 - "speaker": "Brittany Ellich", 134 - "ground_truth_start": 6260, 135 - "tolerance_seconds": 120, 136 - "verified": true, 137 - "notes": "1:44:20 from stream start" 138 - }, 139 - { 140 - "rkey": "000Syverson", 141 - "title": "Sattestations", 142 - "speaker": "Paul Syverson", 143 - "ground_truth_start": 11760, 144 - "tolerance_seconds": 120, 145 - "verified": true, 146 - "notes": "3:16:00 — garbled break zone before this talk. v5 detected 3:02:28 (outlier)" 147 - }, 148 - { 149 - "rkey": "81Xovjr", 150 - "title": "Feeds Are The New Websites", 151 - "speaker": "Mike McCue", 152 - "ground_truth_start": 12594, 153 - "tolerance_seconds": 120, 154 - "verified": true, 155 - "notes": "3:29:54 — v5 detected 3:29:13" 156 - }, 157 - { 158 - "rkey": "LZxV6dv", 159 - "title": "Consent Before Cryptography", 160 - "speaker": "Tessa Brown", 161 - "ground_truth_start": 13531, 162 - "tolerance_seconds": 120, 163 - "verified": true, 164 - "notes": "3:45:31 — v5 detected 3:46:39" 165 - }, 166 - { 167 - "rkey": "Y561Qv6", 168 - "title": "From protocol to product: How Expo powers the next wave of social apps", 169 - "speaker": "Eliot", 170 - "ground_truth_start": 15368, 171 - "tolerance_seconds": 120, 172 - "verified": true, 173 - "notes": "4:16:08 — v5 detected 4:16:53" 174 - }, 175 - { 176 - "rkey": "aQ1J9GE", 177 - "title": "2026 Atmosphere Report", 178 - "speaker": "Paul Frazee", 179 - "ground_truth_start": 17475, 180 - "tolerance_seconds": 120, 181 - "verified": true, 182 - "notes": "4:51:15 — v5 detected 4:50:27" 183 - }, 184 - { 185 - "rkey": "2EG4YMj", 186 - "title": "What 350,000 users taught me about growing on Open Social", 187 - "speaker": "Tori", 188 - "ground_truth_start": 21451, 189 - "tolerance_seconds": 120, 190 - "verified": true, 191 - "notes": "5:57:31 — v5 detected 5:56:13" 192 - }, 193 - { 194 - "rkey": "000Jer", 195 - "title": "The Future of Open Source is Social", 196 - "speaker": "Jer Miller", 197 - 
"ground_truth_start": 22062, 198 - "tolerance_seconds": 120, 199 - "verified": true, 200 - "notes": "6:07:42 — v5 detected 6:07:21" 201 - }, 202 - { 203 - "rkey": "2EGLPML", 204 - "title": "Burning down data walls in the US Fire Service and beyond", 205 - "speaker": "Stephan Noel", 206 - "ground_truth_start": 22514, 207 - "tolerance_seconds": 120, 208 - "verified": true, 209 - "notes": "6:15:14 — v5 detected 6:15:02" 210 - }, 211 - { 212 - "rkey": "OD2G9j8", 213 - "title": "The Phoenix Architecture", 214 - "speaker": "Chad Fowler", 215 - "ground_truth_start": 23426, 216 - "tolerance_seconds": 120, 217 - "verified": true, 218 - "notes": "6:30:26 — v5 detected 6:28:27 (borderline)" 219 - }, 220 - { 221 - "rkey": "rj8Xv62", 222 - "title": "This Title Left Intentionally Blank", 223 - "speaker": "Blaine Cook", 224 - "ground_truth_start": 25228, 225 - "tolerance_seconds": 120, 226 - "verified": true, 227 - "notes": "7:00:28" 228 - } 229 - ] 230 - } 231 - ``` 232 - 233 - Note: Some talks from the v5 results (ODxNLMM "kpop", 7Rrv0E0 "Beyond Bluesky", OD6Gd0A "Semble") don't have ground truth timestamps in the notes. They're included in v5 output but not verified. Tony Schneider's bonus talk also not included (not on schedule). 234 - 235 - - [ ] **Step 2: Commit** 236 - 237 - ```bash 238 - git add apps/ionosphere-appview/data/ground-truth/great-hall-day-1.json 239 - git commit -m "data: add Great Hall Day 1 ground truth timestamps" 240 - ``` 241 - 242 - --- 243 - 244 - ### Task 3: Audio extraction script 245 - 246 - **Files:** 247 - - Create: `apps/ionosphere-appview/tools/extract_audio.py` 248 - 249 - This script extracts audio from an HLS stream, producing: 250 - - 20-minute MP3 chunks (for Whisper, 16kHz mono 32kbps) 251 - - Full WAV (for pyannote, 16kHz mono) 252 - 253 - Skips files that already exist. 254 - 255 - - [ ] **Step 1: Write extract_audio.py** 256 - 257 - ```python 258 - """ 259 - Extract audio from HLS streams for transcription and diarization. 
260 - 261 - Produces: 262 - - 20-minute MP3 chunks (for Whisper, ≤25MB each) 263 - - Full WAV (for pyannote diarization, 16kHz mono) 264 - 265 - Usage: 266 - uv run python extract_audio.py <stream_name> <stream_uri> [--output-dir DIR] 267 - 268 - Example: 269 - uv run python extract_audio.py "Great Hall - Day 1" \ 270 - "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadw52j22" 271 - """ 272 - import argparse 273 - import json 274 - import subprocess 275 - import sys 276 - from pathlib import Path 277 - from urllib.parse import quote 278 - 279 - VOD_ENDPOINT = "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist" 280 - CHUNK_SECONDS = 20 * 60 # 20-minute chunks 281 - 282 - 283 - def playlist_url(uri: str) -> str: 284 - return f"{VOD_ENDPOINT}?uri={quote(uri, safe='')}" 285 - 286 - 287 - def stream_duration(url: str) -> float: 288 - result = subprocess.run( 289 - ["ffprobe", "-v", "quiet", "-show_entries", "format=duration", "-of", "csv=p=0", url], 290 - capture_output=True, text=True, timeout=60, 291 - ) 292 - return float(result.stdout.strip() or 0) 293 - 294 - 295 - def extract_chunks(url: str, duration: float, out_dir: Path) -> list[Path]: 296 - """Extract 20-minute MP3 chunks for Whisper.""" 297 - chunks = [] 298 - num_chunks = int(duration // CHUNK_SECONDS) + (1 if duration % CHUNK_SECONDS > 0 else 0) 299 - 300 - for i in range(num_chunks): 301 - start = i * CHUNK_SECONDS 302 - chunk_dur = min(CHUNK_SECONDS, duration - start) 303 - chunk_path = out_dir / f"chunk-{i:03d}.mp3" 304 - chunks.append(chunk_path) 305 - 306 - if chunk_path.exists(): 307 - print(f" Chunk {i+1}/{num_chunks}: exists") 308 - continue 309 - 310 - print(f" Chunk {i+1}/{num_chunks}: extracting {start}s-{start+chunk_dur:.0f}s...") 311 - subprocess.run( 312 - ["ffmpeg", "-i", url, "-ss", str(start), "-t", str(chunk_dur), 313 - "-vn", "-acodec", "libmp3lame", "-ar", "16000", "-ac", "1", 314 - "-b:a", "32k", str(chunk_path), "-y"], 315 - capture_output=True, 
timeout=600, 316 - check=True, 317 - ) 318 - 319 - return chunks 320 - 321 - 322 - def extract_wav(url: str, out_dir: Path) -> Path: 323 - """Extract full WAV for pyannote diarization.""" 324 - wav_path = out_dir / "full.wav" 325 - if wav_path.exists(): 326 - print(" WAV: exists") 327 - return wav_path 328 - 329 - print(" Extracting full WAV for diarization...") 330 - subprocess.run( 331 - ["ffmpeg", "-i", url, "-vn", "-acodec", "pcm_s16le", 332 - "-ar", "16000", "-ac", "1", str(wav_path), "-y"], 333 - capture_output=True, timeout=1800, 334 - check=True, 335 - ) 336 - return wav_path 337 - 338 - 339 - def main(): 340 - parser = argparse.ArgumentParser(description="Extract audio from HLS streams") 341 - parser.add_argument("stream_name", help="Stream name (e.g. 'Great Hall - Day 1')") 342 - parser.add_argument("stream_uri", help="AT Protocol URI for the stream") 343 - parser.add_argument("--output-dir", type=Path, 344 - default=Path(__file__).parent.parent / "data" / "fullday") 345 - args = parser.parse_args() 346 - 347 - safe_name = args.stream_name.replace(" ", "_").replace("-", "_") 348 - out_dir = args.output_dir / safe_name 349 - out_dir.mkdir(parents=True, exist_ok=True) 350 - 351 - url = playlist_url(args.stream_uri) 352 - print(f"Stream: {args.stream_name}") 353 - print(f"URL: {url}") 354 - 355 - duration = stream_duration(url) 356 - if duration <= 0: 357 - print("ERROR: Could not determine stream duration", file=sys.stderr) 358 - sys.exit(1) 359 - print(f"Duration: {duration/3600:.1f}h ({duration:.0f}s)") 360 - 361 - # Extract chunks and WAV 362 - chunks = extract_chunks(url, duration, out_dir) 363 - wav = extract_wav(url, out_dir) 364 - 365 - # Save manifest 366 - manifest = { 367 - "stream_name": args.stream_name, 368 - "stream_uri": args.stream_uri, 369 - "duration_seconds": duration, 370 - "chunks": [str(c.name) for c in chunks], 371 - "wav": str(wav.name), 372 - } 373 - manifest_path = out_dir / "manifest.json" 374 - 
manifest_path.write_text(json.dumps(manifest, indent=2)) 375 - print(f"\nDone: {len(chunks)} chunks + WAV → {out_dir}") 376 - 377 - 378 - if __name__ == "__main__": 379 - main() 380 - ``` 381 - 382 - - [ ] **Step 2: Test extraction on a short segment (manual)** 383 - 384 - Run: `cd apps/ionosphere-appview/tools && uv run python extract_audio.py "Great Hall - Day 1" "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadw52j22"` 385 - 386 - Expected: Creates `data/fullday/Great_Hall___Day_1/` with chunk MP3s, full.wav, and manifest.json. This will take a while (8h stream). 387 - 388 - - [ ] **Step 3: Commit** 389 - 390 - ```bash 391 - git add apps/ionosphere-appview/tools/extract_audio.py 392 - git commit -m "feat: audio extraction script for HLS streams" 393 - ``` 394 - 395 - --- 396 - 397 - ## Chunk 2: Whisper Re-transcription with Segment Confidence 398 - 399 - ### Task 4: Enhanced Whisper transcription 400 - 401 - **Files:** 402 - - Create: `apps/ionosphere-appview/tools/transcribe_enhanced.py` 403 - 404 - Re-transcribes audio chunks using the Whisper API with: 405 - - Both `word` and `segment` timestamp granularities 406 - - Prompt hints (speaker names, talk titles, venue) 407 - - Segment-level `avg_logprob` and `no_speech_prob` 408 - 409 - - [ ] **Step 1: Write transcribe_enhanced.py** 410 - 411 - ```python 412 - """ 413 - Re-transcribe audio chunks with segment confidence and prompt hints. 414 - 415 - Reads chunk MP3s from extract_audio output, sends to Whisper API with 416 - both word and segment granularities, saves enhanced transcript JSON. 417 - 418 - Usage: 419 - OPENAI_API_KEY=... uv run python transcribe_enhanced.py <stream_dir> [--prompt HINT] 420 - 421 - Example: 422 - uv run python transcribe_enhanced.py ../data/fullday/Great_Hall___Day_1 \ 423 - --prompt "ATmosphereConf 2026, Great Hall South. Speakers: Erin Kissane, Rudy Fraser, ..." 
424 - """ 425 - import argparse 426 - import json 427 - import os 428 - import sys 429 - from pathlib import Path 430 - 431 - from openai import OpenAI 432 - 433 - 434 - def transcribe_chunk(client: OpenAI, chunk_path: Path, prompt: str) -> dict: 435 - """Transcribe a single chunk with both word and segment granularities.""" 436 - with open(chunk_path, "rb") as f: 437 - response = client.audio.transcriptions.create( 438 - model="whisper-1", 439 - file=f, 440 - response_format="verbose_json", 441 - timestamp_granularities=["word", "segment"], 442 - language="en", 443 - prompt=prompt, 444 - ) 445 - 446 - words = [ 447 - {"word": w.word, "start": w.start, "end": w.end} 448 - for w in (response.words or []) 449 - ] 450 - 451 - segments = [ 452 - { 453 - "id": s.id, 454 - "start": s.start, 455 - "end": s.end, 456 - "text": s.text, 457 - "avg_logprob": s.avg_logprob, 458 - "no_speech_prob": s.no_speech_prob, 459 - "compression_ratio": s.compression_ratio, 460 - } 461 - for s in (response.segments or []) 462 - ] 463 - 464 - return {"text": response.text, "words": words, "segments": segments} 465 - 466 - 467 - def main(): 468 - parser = argparse.ArgumentParser(description="Re-transcribe with segment confidence") 469 - parser.add_argument("stream_dir", type=Path, help="Directory with extracted audio chunks") 470 - parser.add_argument("--prompt", default="ATmosphereConf 2026 conference talk.", 471 - help="Whisper prompt hint") 472 - parser.add_argument("--force", action="store_true", help="Re-transcribe even if cached") 473 - args = parser.parse_args() 474 - 475 - if not os.environ.get("OPENAI_API_KEY"): 476 - print("ERROR: OPENAI_API_KEY not set", file=sys.stderr) 477 - sys.exit(1) 478 - 479 - client = OpenAI() 480 - manifest_path = args.stream_dir / "manifest.json" 481 - if not manifest_path.exists(): 482 - print(f"ERROR: No manifest.json in {args.stream_dir}", file=sys.stderr) 483 - sys.exit(1) 484 - 485 - manifest = json.loads(manifest_path.read_text()) 486 - chunk_names 
= manifest["chunks"] 487 - duration = manifest["duration_seconds"] 488 - 489 - print(f"Stream: {manifest['stream_name']}") 490 - print(f"Chunks: {len(chunk_names)}") 491 - print(f"Prompt: {args.prompt[:80]}...") 492 - 493 - all_words = [] 494 - all_segments = [] 495 - full_text = "" 496 - chunk_seconds = 20 * 60 497 - 498 - for i, chunk_name in enumerate(chunk_names): 499 - chunk_path = args.stream_dir / chunk_name 500 - cache_path = args.stream_dir / f"transcript-{i:03d}.json" 501 - start_offset = i * chunk_seconds 502 - 503 - if cache_path.exists() and not args.force: 504 - print(f" Chunk {i+1}/{len(chunk_names)}: cached") 505 - cached = json.loads(cache_path.read_text()) 506 - all_words.extend(cached["words"]) 507 - all_segments.extend(cached["segments"]) 508 - full_text += (" " if full_text else "") + cached["text"] 509 - continue 510 - 511 - if not chunk_path.exists(): 512 - print(f" Chunk {i+1}/{len(chunk_names)}: SKIP (no audio)") 513 - continue 514 - 515 - print(f" Chunk {i+1}/{len(chunk_names)}: transcribing...") 516 - try: 517 - result = transcribe_chunk(client, chunk_path, args.prompt) 518 - except Exception as e: 519 - print(f" Chunk {i+1}/{len(chunk_names)}: FAILED ({e})") 520 - cache_path.write_text(json.dumps({"text": "", "words": [], "segments": []})) 521 - continue 522 - 523 - # Offset timestamps to absolute stream position 524 - for w in result["words"]: 525 - w["start"] += start_offset 526 - w["end"] += start_offset 527 - for s in result["segments"]: 528 - s["start"] += start_offset 529 - s["end"] += start_offset 530 - 531 - cache_path.write_text(json.dumps(result)) 532 - all_words.extend(result["words"]) 533 - all_segments.extend(result["segments"]) 534 - full_text += (" " if full_text else "") + result["text"] 535 - print(f" {len(result['words'])} words, {len(result['segments'])} segments") 536 - 537 - # Save stitched transcript 538 - output = { 539 - "stream": manifest["stream_name"], 540 - "duration_seconds": duration, 541 - "text": 
full_text, 542 - "words": all_words, 543 - "segments": all_segments, 544 - "total_words": len(all_words), 545 - "total_segments": len(all_segments), 546 - } 547 - output_path = args.stream_dir / "transcript-enhanced.json" 548 - output_path.write_text(json.dumps(output, indent=2)) 549 - print(f"\nDone: {len(all_words)} words, {len(all_segments)} segments → {output_path}") 550 - 551 - 552 - if __name__ == "__main__": 553 - main() 554 - ``` 555 - 556 - - [ ] **Step 2: Run on Great Hall Day 1 (after audio extraction)** 557 - 558 - Run: 559 - ```bash 560 - cd apps/ionosphere-appview/tools 561 - OPENAI_API_KEY=... uv run python transcribe_enhanced.py ../data/fullday/Great_Hall___Day_1 \ 562 - --prompt "ATmosphereConf 2026, Great Hall South. Speakers: Erin Kissane, Rudy Fraser, Brittany Ellich, Paul Syverson, Mike McCue, Tessa Brown, Eliot, Paul Frazee, Tori, Jer Miller, Stephan Noel, Chad Fowler, Blaine Cook, Tony Schneider." 563 - ``` 564 - 565 - Expected: Creates `transcript-enhanced.json` with words + segments including `avg_logprob` and `no_speech_prob`. 566 - 567 - - [ ] **Step 3: Commit** 568 - 569 - ```bash 570 - git add apps/ionosphere-appview/tools/transcribe_enhanced.py 571 - git commit -m "feat: enhanced Whisper transcription with segment confidence" 572 - ``` 573 - 574 - --- 575 - 576 - ## Chunk 3: Speaker Diarization 577 - 578 - ### Task 5: pyannote diarization script 579 - 580 - **Files:** 581 - - Create: `apps/ionosphere-appview/tools/diarize.py` 582 - 583 - Runs pyannote.audio speaker diarization on the full WAV, produces speaker segments JSON. 584 - 585 - - [ ] **Step 1: Write diarize.py** 586 - 587 - ```python 588 - """ 589 - Speaker diarization using pyannote.audio. 590 - 591 - Reads full.wav from extract_audio output, runs diarization, saves 592 - speaker segments as JSON. 
593 - 594 - Usage: 595 - uv run python diarize.py <stream_dir> [--min-speakers N] [--max-speakers N] 596 - 597 - Requires a HuggingFace token with access to pyannote models: 598 - export HF_TOKEN=... 599 - 600 - First run will download the model (~1GB). 601 - """ 602 - import argparse 603 - import json 604 - import os 605 - import sys 606 - from pathlib import Path 607 - 608 - import torch 609 - from pyannote.audio import Pipeline 610 - 611 - 612 - def main(): 613 - parser = argparse.ArgumentParser(description="Speaker diarization") 614 - parser.add_argument("stream_dir", type=Path, help="Directory with full.wav") 615 - parser.add_argument("--min-speakers", type=int, default=None) 616 - parser.add_argument("--max-speakers", type=int, default=None) 617 - parser.add_argument("--force", action="store_true") 618 - args = parser.parse_args() 619 - 620 - wav_path = args.stream_dir / "full.wav" 621 - output_path = args.stream_dir / "diarization.json" 622 - 623 - if output_path.exists() and not args.force: 624 - print(f"Diarization already exists: {output_path}") 625 - return 626 - 627 - if not wav_path.exists(): 628 - print(f"ERROR: {wav_path} not found. Run extract_audio.py first.", file=sys.stderr) 629 - sys.exit(1) 630 - 631 - hf_token = os.environ.get("HF_TOKEN") 632 - if not hf_token: 633 - print("ERROR: HF_TOKEN not set. 
Get one at https://huggingface.co/settings/tokens", file=sys.stderr) 634 - print(" You also need to accept the model terms at:") 635 - print(" https://huggingface.co/pyannote/speaker-diarization-3.1") 636 - print(" https://huggingface.co/pyannote/segmentation-3.0") 637 - sys.exit(1) 638 - 639 - device = torch.device("mps" if torch.backends.mps.is_available() else "cpu") 640 - print(f"Device: {device}") 641 - print(f"Audio: {wav_path}") 642 - 643 - # Load pipeline 644 - print("Loading pyannote pipeline...") 645 - pipeline = Pipeline.from_pretrained( 646 - "pyannote/speaker-diarization-3.1", 647 - use_auth_token=hf_token, 648 - ) 649 - pipeline.to(device) 650 - 651 - # Run diarization 652 - print("Running diarization (this may take a while for long streams)...") 653 - diarize_params = {} 654 - if args.min_speakers is not None: 655 - diarize_params["min_speakers"] = args.min_speakers 656 - if args.max_speakers is not None: 657 - diarize_params["max_speakers"] = args.max_speakers 658 - 659 - diarization = pipeline(str(wav_path), **diarize_params) 660 - 661 - # Convert to JSON-serializable format 662 - segments = [] 663 - for turn, _, speaker in diarization.itertracks(yield_label=True): 664 - segments.append({ 665 - "start": round(turn.start, 3), 666 - "end": round(turn.end, 3), 667 - "speaker": speaker, 668 - }) 669 - 670 - # Summary 671 - speakers = sorted(set(s["speaker"] for s in segments)) 672 - print(f"\nFound {len(speakers)} speakers, {len(segments)} segments") 673 - for spk in speakers: 674 - spk_segs = [s for s in segments if s["speaker"] == spk] 675 - total_dur = sum(s["end"] - s["start"] for s in spk_segs) 676 - print(f" {spk}: {len(spk_segs)} segments, {total_dur/60:.1f} min") 677 - 678 - output = { 679 - "speakers": speakers, 680 - "segments": segments, 681 - "total_segments": len(segments), 682 - } 683 - output_path.write_text(json.dumps(output, indent=2)) 684 - print(f"\nSaved: {output_path}") 685 - 686 - 687 - if __name__ == "__main__": 688 - main() 689 
- ``` 690 - 691 - - [ ] **Step 2: Run on Great Hall Day 1 (after audio extraction)** 692 - 693 - Run: 694 - ```bash 695 - cd apps/ionosphere-appview/tools 696 - HF_TOKEN=... uv run python diarize.py ../data/fullday/Great_Hall___Day_1 697 - ``` 698 - 699 - Expected: Creates `diarization.json` with speaker segments. May take 10-30 minutes for an 8h stream. 700 - 701 - - [ ] **Step 3: Commit** 702 - 703 - ```bash 704 - git add apps/ionosphere-appview/tools/diarize.py 705 - git commit -m "feat: pyannote speaker diarization script" 706 - ``` 707 - 708 - --- 709 - 710 - ## Chunk 4: Merge Enrichment + Evaluation 711 - 712 - ### Task 6: Merge enrichment data 713 - 714 - **Files:** 715 - - Create: `apps/ionosphere-appview/tools/merge_enrichment.py` 716 - - Create: `apps/ionosphere-appview/tools/test_merge.py` 717 - 718 - Combines transcript (with segment confidence) and diarization into a single enriched transcript JSON. Each word gets a `speaker` field by aligning word timestamps with diarization segments. 
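As a rough illustration of the alignment described above, the sketch below labels each word with the speaker of the diarization segment that contains the word's midpoint, and falls back to the segment with the nearest edge when the midpoint lands in a gap. The helper name `speaker_at` and the sample data are hypothetical; this is a simplified linear scan, not the plan's `merge_enrichment.py`.

```python
from typing import Optional


def speaker_at(mid: float, segments: list[dict]) -> Optional[str]:
    """Return the speaker whose segment contains `mid`, else the nearest one."""
    if not segments:
        return None
    for seg in segments:
        if seg["start"] <= mid <= seg["end"]:
            return seg["speaker"]
    # Midpoint falls in a gap: pick the segment whose closest edge is nearest.
    nearest = min(
        segments,
        key=lambda s: min(abs(s["start"] - mid), abs(s["end"] - mid)),
    )
    return nearest["speaker"]


# Hypothetical sample data in the same shape as the plan's JSON files.
segments = [
    {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_00"},
    {"start": 4.0, "end": 6.0, "speaker": "SPEAKER_01"},
]
words = [
    {"word": "hello", "start": 0.0, "end": 0.5},  # midpoint 0.25, inside first segment
    {"word": "um", "start": 3.0, "end": 3.2},     # midpoint 3.1, in the gap
]

labeled = [
    {**w, "speaker": speaker_at((w["start"] + w["end"]) / 2, segments)}
    for w in words
]
```

For the gap word, the midpoint 3.1 s sits 1.1 s after the first segment ends and 0.9 s before the second begins, so the nearest-edge rule labels it `SPEAKER_01`.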
719 - 720 - - [ ] **Step 1: Write the test** 721 - 722 - ```python 723 - # test_merge.py 724 - """Tests for merge_enrichment.py alignment logic.""" 725 - from merge_enrichment import assign_speakers_to_words, find_dominant_speaker 726 - 727 - 728 - def test_assign_speakers_basic(): 729 - words = [ 730 - {"word": "hello", "start": 0.0, "end": 0.5}, 731 - {"word": "world", "start": 0.6, "end": 1.0}, 732 - {"word": "goodbye", "start": 5.0, "end": 5.5}, 733 - ] 734 - diarization = [ 735 - {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_00"}, 736 - {"start": 4.5, "end": 6.0, "speaker": "SPEAKER_01"}, 737 - ] 738 - result = assign_speakers_to_words(words, diarization) 739 - assert result[0]["speaker"] == "SPEAKER_00" 740 - assert result[1]["speaker"] == "SPEAKER_00" 741 - assert result[2]["speaker"] == "SPEAKER_01" 742 - 743 - 744 - def test_assign_speakers_gap(): 745 - """Words in a gap between diarization segments get nearest speaker.""" 746 - words = [ 747 - {"word": "um", "start": 3.0, "end": 3.2}, 748 - ] 749 - diarization = [ 750 - {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_00"}, 751 - {"start": 4.0, "end": 6.0, "speaker": "SPEAKER_01"}, 752 - ] 753 - result = assign_speakers_to_words(words, diarization) 754 - # Word midpoint is 3.1s: 1.1s after SPEAKER_00 ends, 0.9s before SPEAKER_01 starts, so SPEAKER_01 is nearest 755 - assert result[0]["speaker"] == "SPEAKER_01" 756 - 757 - 758 - def test_dominant_speaker(): 759 - words = [ 760 - {"word": "a", "start": 0, "end": 1, "speaker": "SPEAKER_00"}, 761 - {"word": "b", "start": 1, "end": 2, "speaker": "SPEAKER_00"}, 762 - {"word": "c", "start": 2, "end": 3, "speaker": "SPEAKER_01"}, 763 - ] 764 - assert find_dominant_speaker(words) == "SPEAKER_00" 765 - 766 - 767 - def test_dominant_speaker_empty(): 768 - assert find_dominant_speaker([]) is None 769 - ``` 770 - 771 - - [ ] **Step 2: Run test to verify it fails** 772 - 773 - Run: `cd apps/ionosphere-appview/tools && uv run pytest test_merge.py -v` 774 - Expected: FAIL — `merge_enrichment` module
not found. 775 - 776 - - [ ] **Step 3: Write merge_enrichment.py** 777 - 778 - ```python 779 - """ 780 - Merge transcript and diarization into unified enriched transcript. 781 - 782 - Aligns word timestamps with diarization speaker segments so each word 783 - gets a speaker label. Also carries forward segment-level confidence. 784 - 785 - Usage: 786 - uv run python merge_enrichment.py <stream_dir> 787 - """ 788 - import argparse 789 - import json 790 - import sys 791 - from collections import Counter 792 - from pathlib import Path 793 - 794 - 795 - def assign_speakers_to_words( 796 - words: list[dict], 797 - diarization: list[dict], 798 - ) -> list[dict]: 799 - """Assign a speaker label to each word based on diarization segments. 800 - 801 - For each word, find the diarization segment that overlaps its midpoint. 802 - If no segment overlaps, assign the nearest segment's speaker. 803 - """ 804 - if not diarization: 805 - return words 806 - 807 - result = [] 808 - dia_idx = 0 809 - 810 - for word in words: 811 - mid = (word["start"] + word["end"]) / 2 812 - speaker = None 813 - 814 - # Advance diarization index to find overlapping segment 815 - while dia_idx < len(diarization) and diarization[dia_idx]["end"] < mid: 816 - dia_idx += 1 817 - 818 - # Check current and nearby segments for overlap 819 - for offset in (0, -1, 1): 820 - idx = dia_idx + offset 821 - if 0 <= idx < len(diarization): 822 - seg = diarization[idx] 823 - if seg["start"] <= mid <= seg["end"]: 824 - speaker = seg["speaker"] 825 - break 826 - 827 - # No overlap — find nearest 828 - if speaker is None: 829 - best_dist = float("inf") 830 - for seg in diarization: 831 - dist = min(abs(seg["start"] - mid), abs(seg["end"] - mid)) 832 - if dist < best_dist: 833 - best_dist = dist 834 - speaker = seg["speaker"] 835 - 836 - result.append({**word, "speaker": speaker}) 837 - 838 - return result 839 - 840 - 841 - def find_dominant_speaker(words: list[dict]) -> str | None: 842 - """Find the speaker with the most 
words in a list.""" 843 - speakers = [w.get("speaker") for w in words if w.get("speaker")] 844 - if not speakers: 845 - return None 846 - return Counter(speakers).most_common(1)[0][0] 847 - 848 - 849 - def main(): 850 - parser = argparse.ArgumentParser(description="Merge transcript + diarization") 851 - parser.add_argument("stream_dir", type=Path) 852 - args = parser.parse_args() 853 - 854 - transcript_path = args.stream_dir / "transcript-enhanced.json" 855 - diarization_path = args.stream_dir / "diarization.json" 856 - output_path = args.stream_dir / "transcript-enriched.json" 857 - 858 - if not transcript_path.exists(): 859 - print(f"ERROR: {transcript_path} not found", file=sys.stderr) 860 - sys.exit(1) 861 - 862 - transcript = json.loads(transcript_path.read_text()) 863 - words = transcript["words"] 864 - segments = transcript.get("segments", []) 865 - 866 - # Load diarization if available 867 - if diarization_path.exists(): 868 - diarization = json.loads(diarization_path.read_text()) 869 - dia_segments = diarization["segments"] 870 - print(f"Diarization: {len(dia_segments)} segments, {len(diarization['speakers'])} speakers") 871 - words = assign_speakers_to_words(words, dia_segments) 872 - else: 873 - print("No diarization data — skipping speaker assignment") 874 - 875 - output = { 876 - "stream": transcript["stream"], 877 - "duration_seconds": transcript["duration_seconds"], 878 - "words": words, 879 - "segments": segments, 880 - "total_words": len(words), 881 - "total_segments": len(segments), 882 - } 883 - output_path.write_text(json.dumps(output, indent=2)) 884 - 885 - # Stats 886 - if any(w.get("speaker") for w in words): 887 - speakers = Counter(w.get("speaker") for w in words if w.get("speaker")) 888 - print(f"\nSpeaker word counts:") 889 - for spk, count in speakers.most_common(): 890 - print(f" {spk}: {count} words") 891 - 892 - print(f"\nSaved: {output_path}") 893 - 894 - 895 - if __name__ == "__main__": 896 - main() 897 - ``` 898 - 899 - - [ ] 
**Step 4: Run tests** 900 - 901 - Run: `cd apps/ionosphere-appview/tools && uv run pytest test_merge.py -v` 902 - Expected: All 4 tests pass. 903 - 904 - - [ ] **Step 5: Commit** 905 - 906 - ```bash 907 - git add apps/ionosphere-appview/tools/merge_enrichment.py apps/ionosphere-appview/tools/test_merge.py 908 - git commit -m "feat: merge transcript and diarization into enriched JSON" 909 - ``` 910 - 911 - --- 912 - 913 - ### Task 7: Evaluation script 914 - 915 - **Files:** 916 - - Create: `apps/ionosphere-appview/tools/evaluate.py` 917 - - Create: `apps/ionosphere-appview/tools/test_evaluate.py` 918 - 919 - Scores boundary detection results against ground truth. 920 - 921 - - [ ] **Step 1: Write the test** 922 - 923 - ```python 924 - # test_evaluate.py 925 - """Tests for evaluate.py scoring logic.""" 926 - from evaluate import score_boundaries 927 - 928 - 929 - def test_perfect_score(): 930 - ground_truth = [ 931 - {"rkey": "a", "title": "Talk A", "ground_truth_start": 100, "tolerance_seconds": 30, "verified": True}, 932 - {"rkey": "b", "title": "Talk B", "ground_truth_start": 500, "tolerance_seconds": 30, "verified": True}, 933 - ] 934 - boundaries = [ 935 - {"rkey": "a", "startTimestamp": 100}, 936 - {"rkey": "b", "startTimestamp": 500}, 937 - ] 938 - result = score_boundaries(ground_truth, boundaries) 939 - assert result["accuracy"] == 1.0 940 - assert result["mean_absolute_error"] == 0.0 941 - assert all(t["pass"] for t in result["talks"]) 942 - 943 - 944 - def test_one_miss(): 945 - ground_truth = [ 946 - {"rkey": "a", "title": "Talk A", "ground_truth_start": 100, "tolerance_seconds": 30, "verified": True}, 947 - {"rkey": "b", "title": "Talk B", "ground_truth_start": 500, "tolerance_seconds": 30, "verified": True}, 948 - ] 949 - boundaries = [ 950 - {"rkey": "a", "startTimestamp": 110}, 951 - {"rkey": "b", "startTimestamp": 600}, # 100s off, outside tolerance 952 - ] 953 - result = score_boundaries(ground_truth, boundaries) 954 - assert result["accuracy"] == 
0.5 955 - assert result["talks"][0]["pass"] is True 956 - assert result["talks"][1]["pass"] is False 957 - 958 - 959 - def test_unverified_skipped(): 960 - ground_truth = [ 961 - {"rkey": "a", "title": "Talk A", "ground_truth_start": 100, "tolerance_seconds": 30, "verified": True}, 962 - {"rkey": "b", "title": "Talk B", "ground_truth_start": 500, "tolerance_seconds": 30, "verified": False}, 963 - ] 964 - boundaries = [ 965 - {"rkey": "a", "startTimestamp": 100}, 966 - {"rkey": "b", "startTimestamp": 999}, 967 - ] 968 - result = score_boundaries(ground_truth, boundaries) 969 - assert result["accuracy"] == 1.0 # only verified talks count 970 - assert len([t for t in result["talks"] if t.get("skipped")]) == 1 971 - 972 - 973 - def test_missing_boundary(): 974 - ground_truth = [ 975 - {"rkey": "a", "title": "Talk A", "ground_truth_start": 100, "tolerance_seconds": 30, "verified": True}, 976 - ] 977 - boundaries = [] 978 - result = score_boundaries(ground_truth, boundaries) 979 - assert result["accuracy"] == 0.0 980 - ``` 981 - 982 - - [ ] **Step 2: Run test to verify it fails** 983 - 984 - Run: `cd apps/ionosphere-appview/tools && uv run pytest test_evaluate.py -v` 985 - Expected: FAIL — `evaluate` module not found. 986 - 987 - - [ ] **Step 3: Write evaluate.py** 988 - 989 - ```python 990 - """ 991 - Evaluate boundary detection results against ground truth. 
992 - 993 - Usage: 994 - uv run python evaluate.py <boundaries.json> <ground-truth.json> 995 - 996 - Example: 997 - uv run python evaluate.py ../data/fullday/Great_Hall___Day_1/boundaries-v6.json \ 998 - ../data/ground-truth/great-hall-day-1.json 999 - """ 1000 - import argparse 1001 - import json 1002 - import sys 1003 - from pathlib import Path 1004 - 1005 - 1006 - def fmt(seconds: float) -> str: 1007 - h = int(seconds // 3600) 1008 - m = int((seconds % 3600) // 60) 1009 - s = int(seconds % 60) 1010 - return f"{h}:{m:02d}:{s:02d}" 1011 - 1012 - 1013 - def score_boundaries( 1014 - ground_truth: list[dict], 1015 - boundaries: list[dict], 1016 - ) -> dict: 1017 - """Score detected boundaries against ground truth. 1018 - 1019 - Returns accuracy, mean absolute error, and per-talk breakdown. 1020 - Only verified ground truth entries are scored. 1021 - """ 1022 - boundary_map = {b["rkey"]: b for b in boundaries} 1023 - 1024 - talks = [] 1025 - verified_count = 0 1026 - pass_count = 0 1027 - total_error = 0.0 1028 - 1029 - for gt in ground_truth: 1030 - if not gt.get("verified", True): 1031 - talks.append({ 1032 - "rkey": gt["rkey"], 1033 - "title": gt.get("title", ""), 1034 - "skipped": True, 1035 - "reason": "not verified", 1036 - }) 1037 - continue 1038 - 1039 - verified_count += 1 1040 - detected = boundary_map.get(gt["rkey"]) 1041 - 1042 - if detected is None: 1043 - talks.append({ 1044 - "rkey": gt["rkey"], 1045 - "title": gt.get("title", ""), 1046 - "pass": False, 1047 - "reason": "not detected", 1048 - "ground_truth": gt["ground_truth_start"], 1049 - }) 1050 - continue 1051 - 1052 - error = abs(detected["startTimestamp"] - gt["ground_truth_start"]) 1053 - passed = error <= gt["tolerance_seconds"] 1054 - if passed: 1055 - pass_count += 1 1056 - total_error += error 1057 - 1058 - talks.append({ 1059 - "rkey": gt["rkey"], 1060 - "title": gt.get("title", ""), 1061 - "pass": passed, 1062 - "error_seconds": round(error, 1), 1063 - "ground_truth": 
gt["ground_truth_start"], 1064 - "detected": detected["startTimestamp"], 1065 - "tolerance": gt["tolerance_seconds"], 1066 - "ground_truth_fmt": fmt(gt["ground_truth_start"]), 1067 - "detected_fmt": fmt(detected["startTimestamp"]), 1068 - }) 1069 - 1070 - accuracy = pass_count / verified_count if verified_count > 0 else 0.0 1071 - mae = total_error / verified_count if verified_count > 0 else 0.0 1072 - 1073 - return { 1074 - "accuracy": accuracy, 1075 - "mean_absolute_error": round(mae, 1), 1076 - "verified_count": verified_count, 1077 - "pass_count": pass_count, 1078 - "talks": talks, 1079 - } 1080 - 1081 - 1082 - def main(): 1083 - parser = argparse.ArgumentParser(description="Evaluate boundaries against ground truth") 1084 - parser.add_argument("boundaries", type=Path) 1085 - parser.add_argument("ground_truth", type=Path) 1086 - args = parser.parse_args() 1087 - 1088 - boundaries_data = json.loads(args.boundaries.read_text()) 1089 - gt_data = json.loads(args.ground_truth.read_text()) 1090 - 1091 - results_list = boundaries_data.get("results", boundaries_data) 1092 - if isinstance(results_list, dict): 1093 - results_list = [results_list] 1094 - 1095 - result = score_boundaries(gt_data["talks"], results_list) 1096 - 1097 - print(f"Accuracy: {result['accuracy']:.0%} ({result['pass_count']}/{result['verified_count']})") 1098 - print(f"Mean Absolute Error: {result['mean_absolute_error']}s") 1099 - print() 1100 - print(f"{'Talk':<50} {'GT':>8} {'Det':>8} {'Err':>6} {'Pass':>5}") 1101 - print("-" * 85) 1102 - for t in result["talks"]: 1103 - if t.get("skipped"): 1104 - print(f"{t['title'][:49]:<50} {'SKIPPED':>8}") 1105 - continue 1106 - gt_str = t.get("ground_truth_fmt", "?") 1107 - det_str = t.get("detected_fmt", "?") 1108 - err = t.get("error_seconds", "?") 1109 - passed = "✓" if t["pass"] else "✗" 1110 - print(f"{t['title'][:49]:<50} {gt_str:>8} {det_str:>8} {str(err):>5}s {passed:>5}") 1111 - 1112 - 1113 - if __name__ == "__main__": 1114 - main() 1115 - ``` 1116 - 
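One property of `score_boundaries` as written above may be worth keeping in mind when reading its output: a verified talk with no detected boundary adds nothing to `total_error` but still counts in `verified_count`, so each miss is averaged into the mean absolute error as if it were a zero-second error. The sketch below is a hypothetical variant (the name `summarize` and the `mae_detected` and `missing` keys are assumptions, not part of the plan) that keeps accuracy over all verified talks but averages the error over detected talks only and reports the misses separately.

```python
from typing import Optional


def summarize(ground_truth: list[dict], boundaries: list[dict]) -> dict:
    """Accuracy over verified talks; MAE over detected talks only."""
    detected = {b["rkey"]: b["startTimestamp"] for b in boundaries}
    verified = [gt for gt in ground_truth if gt.get("verified", True)]

    errors = []
    passes = 0
    for gt in verified:
        if gt["rkey"] not in detected:
            continue  # a miss lowers accuracy but contributes no error value
        error = abs(detected[gt["rkey"]] - gt["ground_truth_start"])
        errors.append(error)
        if error <= gt["tolerance_seconds"]:
            passes += 1

    mae: Optional[float] = sum(errors) / len(errors) if errors else None
    return {
        "accuracy": passes / len(verified) if verified else 0.0,
        "mae_detected": mae,
        "missing": len(verified) - len(errors),
    }


# Hypothetical data: talk "a" is detected 10 s late, talk "b" is not detected.
ground_truth = [
    {"rkey": "a", "ground_truth_start": 100, "tolerance_seconds": 30, "verified": True},
    {"rkey": "b", "ground_truth_start": 500, "tolerance_seconds": 30, "verified": True},
]
boundaries = [{"rkey": "a", "startTimestamp": 110}]

summary = summarize(ground_truth, boundaries)
```

On this input the variant reports an accuracy of 0.5, an error of 10.0 s over the one detected talk, and one missing talk, whereas dividing the same 10 s of total error by both verified talks would give 5.0 s.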
1117 - - [ ] **Step 4: Run tests** 1118 - 1119 - Run: `cd apps/ionosphere-appview/tools && uv run pytest test_evaluate.py -v` 1120 - Expected: All 4 tests pass. 1121 - 1122 - - [ ] **Step 5: Evaluate v5 against ground truth (baseline)** 1123 - 1124 - Run: 1125 - ```bash 1126 - cd apps/ionosphere-appview/tools 1127 - uv run python evaluate.py /tmp/fullday-transcripts/Great_Hall_-_Day_1-boundaries-v5.json \ 1128 - ../data/ground-truth/great-hall-day-1.json 1129 - ``` 1130 - 1131 - Expected: Shows per-talk accuracy. This establishes the v5 baseline we're improving on. 1132 - 1133 - - [ ] **Step 6: Commit** 1134 - 1135 - ```bash 1136 - git add apps/ionosphere-appview/tools/evaluate.py apps/ionosphere-appview/tools/test_evaluate.py 1137 - git commit -m "feat: evaluation script for boundary detection accuracy" 1138 - ``` 1139 - 1140 - --- 1141 - 1142 - ## Chunk 5: Enhanced Boundary Detection (v6) 1143 - 1144 - ### Task 8: detect-boundaries-v6.ts 1145 - 1146 - **Files:** 1147 - - Create: `apps/ionosphere-appview/src/detect-boundaries-v6.ts` 1148 - - Create: `apps/ionosphere-appview/src/detect-boundaries-v6.test.ts` 1149 - 1150 - Copy v5 as baseline, add new scoring signals: 1151 - - Speaker change detection from diarization data 1152 - - Confidence-based garbled zone detection from segment data 1153 - - Replace word-repetition garbled zone detection 1154 - 1155 - - [ ] **Step 1: Write tests for new scoring functions** 1156 - 1157 - ```typescript 1158 - // detect-boundaries-v6.test.ts 1159 - import { describe, it, expect } from "vitest"; 1160 - import { 1161 - scoreSpeakerChange, 1162 - scoreConfidenceDrop, 1163 - findLowConfidenceZones, 1164 - } from "./detect-boundaries-v6.js"; 1165 - 1166 - describe("scoreSpeakerChange", () => { 1167 - it("returns high score when dominant speaker changes", () => { 1168 - const wordsBefore = [ 1169 - { word: "thanks", start: 0, end: 1, speaker: "SPEAKER_00" }, 1170 - { word: "everyone", start: 1, end: 2, speaker: "SPEAKER_00" }, 1171 - 
]; 1172 - const wordsAfter = [ 1173 - { word: "hello", start: 5, end: 6, speaker: "SPEAKER_01" }, 1174 - { word: "there", start: 6, end: 7, speaker: "SPEAKER_01" }, 1175 - ]; 1176 - const result = scoreSpeakerChange(wordsBefore, wordsAfter); 1177 - expect(result.score).toBeGreaterThanOrEqual(12); 1178 - expect(result.signal).toContain("speaker_change"); 1179 - }); 1180 - 1181 - it("returns zero when same speaker continues", () => { 1182 - const wordsBefore = [ 1183 - { word: "and", start: 0, end: 1, speaker: "SPEAKER_00" }, 1184 - { word: "also", start: 1, end: 2, speaker: "SPEAKER_00" }, 1185 - ]; 1186 - const wordsAfter = [ 1187 - { word: "next", start: 5, end: 6, speaker: "SPEAKER_00" }, 1188 - { word: "slide", start: 6, end: 7, speaker: "SPEAKER_00" }, 1189 - ]; 1190 - const result = scoreSpeakerChange(wordsBefore, wordsAfter); 1191 - expect(result.score).toBe(0); 1192 - }); 1193 - 1194 - it("handles missing speaker data gracefully", () => { 1195 - const wordsBefore = [{ word: "hi", start: 0, end: 1 }]; 1196 - const wordsAfter = [{ word: "bye", start: 5, end: 6 }]; 1197 - const result = scoreSpeakerChange(wordsBefore, wordsAfter); 1198 - expect(result.score).toBe(0); 1199 - }); 1200 - }); 1201 - 1202 - describe("scoreConfidenceDrop", () => { 1203 - it("scores high for low avg_logprob near gap", () => { 1204 - const segments = [ 1205 - { start: 0, end: 10, avg_logprob: -0.3, no_speech_prob: 0.1 }, 1206 - { start: 10, end: 20, avg_logprob: -1.5, no_speech_prob: 0.8 }, // bad 1207 - { start: 20, end: 30, avg_logprob: -0.2, no_speech_prob: 0.05 }, 1208 - ]; 1209 - const result = scoreConfidenceDrop(segments, 15, 10); 1210 - expect(result.score).toBeGreaterThan(0); 1211 - expect(result.signal).toContain("confidence_drop"); 1212 - }); 1213 - 1214 - it("returns zero for high confidence segments", () => { 1215 - const segments = [ 1216 - { start: 0, end: 10, avg_logprob: -0.2, no_speech_prob: 0.05 }, 1217 - { start: 10, end: 20, avg_logprob: -0.3, no_speech_prob: 0.1 
}, 1218 - ]; 1219 - const result = scoreConfidenceDrop(segments, 10, 10); 1220 - expect(result.score).toBe(0); 1221 - }); 1222 - }); 1223 - 1224 - describe("findLowConfidenceZones", () => { 1225 - it("finds contiguous low-confidence segments", () => { 1226 - const segments = [ 1227 - { start: 0, end: 10, avg_logprob: -0.3, no_speech_prob: 0.1 }, 1228 - { start: 10, end: 20, avg_logprob: -1.5, no_speech_prob: 0.9 }, 1229 - { start: 20, end: 30, avg_logprob: -1.8, no_speech_prob: 0.85 }, 1230 - { start: 30, end: 40, avg_logprob: -0.2, no_speech_prob: 0.05 }, 1231 - ]; 1232 - const zones = findLowConfidenceZones(segments); 1233 - expect(zones.length).toBe(1); 1234 - expect(zones[0].start).toBe(10); 1235 - expect(zones[0].end).toBe(30); 1236 - }); 1237 - }); 1238 - ``` 1239 - 1240 - - [ ] **Step 2: Run tests to verify they fail** 1241 - 1242 - Run: `cd apps/ionosphere-appview && npx vitest run src/detect-boundaries-v6.test.ts` 1243 - Expected: FAIL — module not found. 1244 - 1245 - - [ ] **Step 3: Copy v5 as baseline and add new scoring functions** 1246 - 1247 - Start from `detect-boundaries-v5.ts`. Key changes: 1248 - 1249 - 1. **New interfaces** for enriched data: 1250 - - `EnrichedWord` extends `Word` with optional `speaker: string` 1251 - - `Segment` with `start, end, avg_logprob, no_speech_prob, compression_ratio` 1252 - 1253 - 2. **New exported scoring functions:** 1254 - - `scoreSpeakerChange(wordsBefore, wordsAfter)` → `{ score, signal }` 1255 - - `scoreConfidenceDrop(segments, gapTimestamp, windowSec)` → `{ score, signal }` 1256 - - `findLowConfidenceZones(segments)` → `Array<{ start, end }>` 1257 - 1258 - 3. **Modified `scoreGapGeneric`:** adds confidence scoring if segments available. 1259 - 1260 - 4. **Modified `selectTransitionsDP`:** adds speaker-change scoring during gap evaluation. 1261 - 1262 - 5. **Modified `findUsableTranscriptStart`:** uses `findLowConfidenceZones` instead of word-repetition. 1263 - 1264 - 6. 
**Input format:** reads enriched transcript JSON (with `words[].speaker` and `segments[]`), falls back gracefully to plain transcript format. 1265 - 1266 - The full file is a copy of v5 with these additions — keep all existing v5 logic intact. 1267 - 1268 - - [ ] **Step 4: Run tests** 1269 - 1270 - Run: `cd apps/ionosphere-appview && npx vitest run src/detect-boundaries-v6.test.ts` 1271 - Expected: All tests pass. 1272 - 1273 - - [ ] **Step 5: Run v6 on existing v5 transcript (no enrichment yet)** 1274 - 1275 - Run: `cd apps/ionosphere-appview && npx tsx src/detect-boundaries-v6.ts /tmp/fullday-transcripts/Great_Hall_-_Day_1.json` 1276 - 1277 - Expected: Should produce equivalent results to v5 (no enrichment data = no new signals). Validates backward compatibility. 1278 - 1279 - - [ ] **Step 6: Commit** 1280 - 1281 - ```bash 1282 - git add apps/ionosphere-appview/src/detect-boundaries-v6.ts apps/ionosphere-appview/src/detect-boundaries-v6.test.ts 1283 - git commit -m "feat: boundary detection v6 with speaker change and confidence scoring" 1284 - ``` 1285 - 1286 - --- 1287 - 1288 - ## Chunk 6: Integration & First Run 1289 - 1290 - ### Task 9: End-to-end pipeline run 1291 - 1292 - This is an execution task, not a code task. Run the full pipeline on Great Hall Day 1. 1293 - 1294 - - [ ] **Step 1: Extract audio** 1295 - 1296 - ```bash 1297 - cd apps/ionosphere-appview/tools 1298 - uv run python extract_audio.py "Great Hall - Day 1" \ 1299 - "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadw52j22" 1300 - ``` 1301 - 1302 - This will take ~20-30 minutes (8h stream download + extraction). 1303 - 1304 - - [ ] **Step 2: Run Whisper re-transcription** 1305 - 1306 - ```bash 1307 - cd apps/ionosphere-appview/tools 1308 - OPENAI_API_KEY=... uv run python transcribe_enhanced.py \ 1309 - ../data/fullday/Great_Hall___Day_1 \ 1310 - --prompt "ATmosphereConf 2026, Great Hall South. 
Speakers: Erin Kissane, Rudy Fraser, Brittany Ellich, Paul Syverson, Mike McCue, Tessa Brown, Eliot, Paul Frazee, Tori, Jer Miller, Stephan Noel, Chad Fowler, Blaine Cook, Tony Schneider." 1311 - ``` 1312 - 1313 - - [ ] **Step 3: Run speaker diarization** 1314 - 1315 - ```bash 1316 - cd apps/ionosphere-appview/tools 1317 - HF_TOKEN=... uv run python diarize.py ../data/fullday/Great_Hall___Day_1 1318 - ``` 1319 - 1320 - - [ ] **Step 4: Merge enrichment** 1321 - 1322 - ```bash 1323 - cd apps/ionosphere-appview/tools 1324 - uv run python merge_enrichment.py ../data/fullday/Great_Hall___Day_1 1325 - ``` 1326 - 1327 - - [ ] **Step 5: Run v6 boundary detection** 1328 - 1329 - ```bash 1330 - cd apps/ionosphere-appview 1331 - npx tsx src/detect-boundaries-v6.ts data/fullday/Great_Hall___Day_1/transcript-enriched.json 1332 - ``` 1333 - 1334 - - [ ] **Step 6: Evaluate against ground truth** 1335 - 1336 - ```bash 1337 - cd apps/ionosphere-appview/tools 1338 - uv run python evaluate.py \ 1339 - ../data/fullday/Great_Hall___Day_1/Great_Hall___Day_1-boundaries-v6.json \ 1340 - ../data/ground-truth/great-hall-day-1.json 1341 - ``` 1342 - 1343 - - [ ] **Step 7: Compare v6 vs v5 baseline** 1344 - 1345 - Run evaluate.py on the v5 results too and compare accuracy + MAE. Document findings. 1346 - 1347 - --- 1348 - 1349 - ## Notes 1350 - 1351 - - **HuggingFace token:** pyannote models require accepting terms at huggingface.co. User needs `HF_TOKEN`. 1352 - - **OpenAI API cost:** Re-transcribing an 8h stream ≈ 24 chunks × ~$0.006/min × 20min = ~$2.88. 1353 - - **Audio storage:** Full WAV for 8h stream ≈ ~900MB. MP3 chunks ≈ ~350MB total. Gitignored. 1354 - - **Iterating on weights:** After the first run, scoring weights in v6 can be tuned by modifying constants and re-running evaluate. No re-transcription or re-diarization needed.
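The cost and storage estimates in the notes can be reproduced with a few lines of arithmetic. The per-minute rate and the 20-minute chunk length are taken from the notes; the WAV format (16 kHz, mono, 16-bit PCM) is an assumption, chosen because it lands close to the quoted ~900MB figure.

```python
STREAM_HOURS = 8
CHUNK_MINUTES = 20
PRICE_PER_MINUTE = 0.006  # USD, the approximate rate quoted in the notes

# 8 h of audio split into 20-minute chunks
chunks = STREAM_HOURS * 60 // CHUNK_MINUTES
cost_usd = chunks * CHUNK_MINUTES * PRICE_PER_MINUTE

# Assumed WAV format: 16 kHz sample rate, 1 channel, 2 bytes per sample
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2
wav_mb = STREAM_HOURS * 3600 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE / 1_000_000

print(f"{chunks} chunks, about ${cost_usd:.2f}, WAV about {wav_mb:.0f} MB")
```

Under these assumptions the stream splits into 24 chunks costing about $2.88, and the uncompressed WAV comes to roughly 922 MB.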
-2772
docs/superpowers/plans/2026-04-05-alignment-editor.md
··· 1 - # Alignment Editor Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Build an NLE-style alignment editor for correcting talk boundaries, verifying talks, and naming speakers in the track timeline view. 6 - 7 - **Architecture:** A TimelineEngine (React context + store) owns viewport, editing state, and an append-only corrections sidecar. Rendering layers (talk segments, waveform/diarization, snap guides, interaction overlay) subscribe to the engine. The appview gets two new endpoints for loading/saving the corrections JSON. 8 - 9 - **Tech Stack:** React 18, Next.js 15, TypeScript, Tailwind CSS, Hono (appview API), nanoid (IDs) 10 - 11 - **Spec:** `docs/superpowers/specs/2026-04-05-alignment-editor-design.md` 12 - 13 - --- 14 - 15 - ## Chunk 1: Data Layer (Corrections + Engine Core) 16 - 17 - ### Task 1: Correction Types and Replay Logic 18 - 19 - **Files:** 20 - - Create: `apps/ionosphere/src/lib/corrections.ts` 21 - - Create: `apps/ionosphere/src/lib/corrections.test.ts` 22 - 23 - - [ ] **Step 1: Write failing tests for correction replay** 24 - 25 - ```ts 26 - // apps/ionosphere/src/lib/corrections.test.ts 27 - import { describe, it, expect } from "vitest"; 28 - import { replayCorrections, type CorrectionEntry, type BaseTalk } from "./corrections"; 29 - 30 - const baseTalks: BaseTalk[] = [ 31 - { rkey: "talk1", title: "First Talk", speakers: ["Alice"], startSeconds: 100, endSeconds: 500, confidence: "high" }, 32 - { rkey: "talk2", title: "Second Talk", speakers: ["Bob"], startSeconds: 500, endSeconds: 900, confidence: "high" }, 33 - ]; 34 - 35 - function entry(action: CorrectionEntry["action"]): CorrectionEntry { 36 - return { id: "test", timestamp: new Date().toISOString(), streamSlug: "test", action }; 37 - } 38 - 39 - describe("replayCorrections", () 
=> { 40 - it("returns base talks when no corrections", () => { 41 - const result = replayCorrections(baseTalks, []); 42 - expect(result.talks).toEqual(baseTalks.map(t => ({ ...t, verified: false }))); 43 - expect(result.speakerNames).toEqual(new Map()); 44 - }); 45 - 46 - it("applies move_boundary to start edge", () => { 47 - const corrections = [entry({ type: "move_boundary", talkRkey: "talk1", edge: "start", fromSeconds: 100, toSeconds: 110 })]; 48 - const result = replayCorrections(baseTalks, corrections); 49 - expect(result.talks[0].startSeconds).toBe(110); 50 - }); 51 - 52 - it("applies move_boundary to end edge", () => { 53 - const corrections = [entry({ type: "move_boundary", talkRkey: "talk1", edge: "end", fromSeconds: 500, toSeconds: 480 })]; 54 - const result = replayCorrections(baseTalks, corrections); 55 - expect(result.talks[0].endSeconds).toBe(480); 56 - }); 57 - 58 - it("applies split_talk", () => { 59 - const corrections = [entry({ type: "split_talk", talkRkey: "talk1", atSeconds: 300, newRkey: "talk1b" })]; 60 - const result = replayCorrections(baseTalks, corrections); 61 - expect(result.talks).toHaveLength(3); 62 - expect(result.talks[0]).toMatchObject({ rkey: "talk1", startSeconds: 100, endSeconds: 300 }); 63 - expect(result.talks[1]).toMatchObject({ rkey: "talk1b", startSeconds: 300, endSeconds: 500, title: "Untitled" }); 64 - }); 65 - 66 - it("applies add_talk", () => { 67 - const corrections = [entry({ type: "add_talk", rkey: "talk3", title: "New Talk", startSeconds: 950, endSeconds: 1100 })]; 68 - const result = replayCorrections(baseTalks, corrections); 69 - expect(result.talks).toHaveLength(3); 70 - expect(result.talks[2]).toMatchObject({ rkey: "talk3", title: "New Talk" }); 71 - }); 72 - 73 - it("applies remove_talk", () => { 74 - const corrections = [entry({ type: "remove_talk", talkRkey: "talk1" })]; 75 - const result = replayCorrections(baseTalks, corrections); 76 - expect(result.talks).toHaveLength(1); 77 - 
expect(result.talks[0].rkey).toBe("talk2"); 78 - }); 79 - 80 - it("applies set_talk_title", () => { 81 - const corrections = [entry({ type: "set_talk_title", talkRkey: "talk1", title: "Renamed" })]; 82 - const result = replayCorrections(baseTalks, corrections); 83 - expect(result.talks[0].title).toBe("Renamed"); 84 - }); 85 - 86 - it("applies verify_talk and unverify_talk", () => { 87 - const corrections = [ 88 - entry({ type: "verify_talk", talkRkey: "talk1" }), 89 - entry({ type: "unverify_talk", talkRkey: "talk1" }), 90 - ]; 91 - const result = replayCorrections(baseTalks, corrections); 92 - expect(result.talks[0].verified).toBe(false); 93 - }); 94 - 95 - it("applies name_speaker", () => { 96 - const corrections = [entry({ type: "name_speaker", speakerId: "SPEAKER_01", name: "Alice Smith" })]; 97 - const result = replayCorrections(baseTalks, corrections); 98 - expect(result.speakerNames.get("SPEAKER_01")).toBe("Alice Smith"); 99 - }); 100 - 101 - it("respects undo cursor", () => { 102 - const corrections = [ 103 - entry({ type: "move_boundary", talkRkey: "talk1", edge: "start", fromSeconds: 100, toSeconds: 110 }), 104 - entry({ type: "move_boundary", talkRkey: "talk1", edge: "start", fromSeconds: 110, toSeconds: 120 }), 105 - ]; 106 - const result = replayCorrections(baseTalks, corrections, 1); // only first correction 107 - expect(result.talks[0].startSeconds).toBe(110); 108 - }); 109 - 110 - it("respects undo cursor = 0 (no corrections applied)", () => { 111 - const corrections = [entry({ type: "move_boundary", talkRkey: "talk1", edge: "start", fromSeconds: 100, toSeconds: 999 })]; 112 - const result = replayCorrections(baseTalks, corrections, 0); 113 - expect(result.talks[0].startSeconds).toBe(100); 114 - }); 115 - 116 - it("handles null endSeconds in base talk", () => { 117 - const talks: BaseTalk[] = [ 118 - { rkey: "t1", title: "Last Talk", speakers: [], startSeconds: 800, endSeconds: null, confidence: "high" }, 119 - ]; 120 - const corrections = [entry({ 
type: "move_boundary", talkRkey: "t1", edge: "end", fromSeconds: 0, toSeconds: 1000 })]; 121 - const result = replayCorrections(talks, corrections); 122 - expect(result.talks[0].endSeconds).toBe(1000); 123 - }); 124 - 125 - it("splits talk with null endSeconds", () => { 126 - const talks: BaseTalk[] = [ 127 - { rkey: "t1", title: "Last Talk", speakers: [], startSeconds: 800, endSeconds: null, confidence: "high" }, 128 - ]; 129 - const corrections = [entry({ type: "split_talk", talkRkey: "t1", atSeconds: 900, newRkey: "t1b" })]; 130 - const result = replayCorrections(talks, corrections); 131 - expect(result.talks[0]).toMatchObject({ rkey: "t1", endSeconds: 900 }); 132 - expect(result.talks[1]).toMatchObject({ rkey: "t1b", startSeconds: 900, endSeconds: null }); 133 - }); 134 - 135 - it("composes multiple operations on the same talk", () => { 136 - const corrections = [ 137 - entry({ type: "move_boundary", talkRkey: "talk1", edge: "start", fromSeconds: 100, toSeconds: 90 }), 138 - entry({ type: "split_talk", talkRkey: "talk1", atSeconds: 300, newRkey: "talk1b" }), 139 - entry({ type: "set_talk_title", talkRkey: "talk1b", title: "Second Half" }), 140 - entry({ type: "verify_talk", talkRkey: "talk1b" }), 141 - ]; 142 - const result = replayCorrections(baseTalks, corrections); 143 - expect(result.talks[0]).toMatchObject({ rkey: "talk1", startSeconds: 90, endSeconds: 300 }); 144 - expect(result.talks[1]).toMatchObject({ rkey: "talk1b", title: "Second Half", startSeconds: 300, verified: true }); 145 - }); 146 - }); 147 - ``` 148 - 149 - - [ ] **Step 2: Run tests to verify they fail** 150 - 151 - Run: `cd apps/ionosphere && npx vitest run src/lib/corrections.test.ts` 152 - Expected: FAIL — module not found 153 - 154 - - [ ] **Step 3: Implement corrections module** 155 - 156 - ```ts 157 - // apps/ionosphere/src/lib/corrections.ts 158 - 159 - export interface BaseTalk { 160 - rkey: string; 161 - title: string; 162 - speakers: string[]; 163 - startSeconds: number; 164 - 
endSeconds: number | null; 165 - confidence: string; 166 - } 167 - 168 - export interface EffectiveTalk extends BaseTalk { 169 - verified: boolean; 170 - } 171 - 172 - export type CorrectionAction = 173 - | { type: "move_boundary"; talkRkey: string; edge: "start" | "end"; fromSeconds: number; toSeconds: number } 174 - | { type: "split_talk"; talkRkey: string; atSeconds: number; newRkey: string } 175 - | { type: "add_talk"; rkey: string; title: string; startSeconds: number; endSeconds: number } 176 - | { type: "remove_talk"; talkRkey: string } 177 - | { type: "set_talk_title"; talkRkey: string; title: string } 178 - | { type: "verify_talk"; talkRkey: string } 179 - | { type: "unverify_talk"; talkRkey: string } 180 - | { type: "name_speaker"; speakerId: string; name: string }; 181 - 182 - export interface CorrectionEntry { 183 - id: string; 184 - timestamp: string; 185 - authorDid?: string; 186 - streamSlug: string; 187 - action: CorrectionAction; 188 - } 189 - 190 - export interface ReplayResult { 191 - talks: EffectiveTalk[]; 192 - speakerNames: Map<string, string>; 193 - } 194 - 195 - export function replayCorrections( 196 - baseTalks: BaseTalk[], 197 - corrections: CorrectionEntry[], 198 - cursor?: number, 199 - ): ReplayResult { 200 - const limit = cursor ?? 
corrections.length; 201 - const active = corrections.slice(0, limit); 202 - 203 - let talks: EffectiveTalk[] = baseTalks.map((t) => ({ ...t, verified: false })); 204 - const speakerNames = new Map<string, string>(); 205 - 206 - for (const entry of active) { 207 - const { action } = entry; 208 - 209 - switch (action.type) { 210 - case "move_boundary": { 211 - talks = talks.map((t) => { 212 - if (t.rkey !== action.talkRkey) return t; 213 - if (action.edge === "start") return { ...t, startSeconds: action.toSeconds }; 214 - return { ...t, endSeconds: action.toSeconds }; 215 - }); 216 - break; 217 - } 218 - case "split_talk": { 219 - const idx = talks.findIndex((t) => t.rkey === action.talkRkey); 220 - if (idx === -1) break; 221 - const original = talks[idx]; 222 - const first: EffectiveTalk = { ...original, endSeconds: action.atSeconds }; 223 - const second: EffectiveTalk = { 224 - ...original, 225 - rkey: action.newRkey, 226 - title: "Untitled", 227 - startSeconds: action.atSeconds, 228 - verified: false, 229 - }; 230 - talks = [...talks.slice(0, idx), first, second, ...talks.slice(idx + 1)]; 231 - break; 232 - } 233 - case "add_talk": { 234 - const newTalk: EffectiveTalk = { 235 - rkey: action.rkey, 236 - title: action.title, 237 - speakers: [], 238 - startSeconds: action.startSeconds, 239 - endSeconds: action.endSeconds, 240 - confidence: "manual", 241 - verified: false, 242 - }; 243 - talks = [...talks, newTalk].sort((a, b) => a.startSeconds - b.startSeconds); 244 - break; 245 - } 246 - case "remove_talk": { 247 - talks = talks.filter((t) => t.rkey !== action.talkRkey); 248 - break; 249 - } 250 - case "set_talk_title": { 251 - talks = talks.map((t) => 252 - t.rkey === action.talkRkey ? { ...t, title: action.title } : t, 253 - ); 254 - break; 255 - } 256 - case "verify_talk": { 257 - talks = talks.map((t) => 258 - t.rkey === action.talkRkey ? 
{ ...t, verified: true } : t, 259 - ); 260 - break; 261 - } 262 - case "unverify_talk": { 263 - talks = talks.map((t) => 264 - t.rkey === action.talkRkey ? { ...t, verified: false } : t, 265 - ); 266 - break; 267 - } 268 - case "name_speaker": { 269 - speakerNames.set(action.speakerId, action.name); 270 - break; 271 - } 272 - } 273 - } 274 - 275 - return { talks, speakerNames }; 276 - } 277 - ``` 278 - 279 - - [ ] **Step 4: Run tests to verify they pass** 280 - 281 - Run: `cd apps/ionosphere && npx vitest run src/lib/corrections.test.ts` 282 - Expected: All 14 tests PASS 283 - 284 - - [ ] **Step 5: Commit** 285 - 286 - ```bash 287 - git add apps/ionosphere/src/lib/corrections.ts apps/ionosphere/src/lib/corrections.test.ts 288 - git commit -m "feat(editor): correction types and replay logic with tests" 289 - ``` 290 - 291 - --- 292 - 293 - ### Task 2: Snap Target Computation 294 - 295 - **Files:** 296 - - Create: `apps/ionosphere/src/lib/snap-targets.ts` 297 - - Create: `apps/ionosphere/src/lib/snap-targets.test.ts` 298 - 299 - - [ ] **Step 1: Write failing tests for snap target computation** 300 - 301 - ```ts 302 - // apps/ionosphere/src/lib/snap-targets.test.ts 303 - import { describe, it, expect } from "vitest"; 304 - import { computeSnapTargets, findNearestSnap, type SnapTarget } from "./snap-targets"; 305 - 306 - describe("computeSnapTargets", () => { 307 - it("finds silence gaps > 2s from word timestamps", () => { 308 - const words = [ 309 - { start: 10, end: 11, speaker: "A" }, 310 - { start: 11.1, end: 12, speaker: "A" }, 311 - // 3s gap here 312 - { start: 15, end: 16, speaker: "A" }, 313 - ]; 314 - const targets = computeSnapTargets(words, []); 315 - const silenceTargets = targets.filter((t) => t.type === "silence_gap"); 316 - expect(silenceTargets).toHaveLength(1); 317 - expect(silenceTargets[0].gapStart).toBeCloseTo(12); 318 - expect(silenceTargets[0].gapEnd).toBeCloseTo(15); 319 - }); 320 - 321 - it("finds speaker change points from diarization", () => 
{ 322 - const diarization = [ 323 - { start: 10, end: 20, speaker: "SPEAKER_01" }, 324 - { start: 20, end: 30, speaker: "SPEAKER_02" }, 325 - ]; 326 - const targets = computeSnapTargets([], diarization); 327 - const changes = targets.filter((t) => t.type === "speaker_change"); 328 - expect(changes).toHaveLength(1); 329 - expect(changes[0].time).toBeCloseTo(20); 330 - }); 331 - 332 - it("returns targets sorted by time", () => { 333 - const words = [ 334 - { start: 50, end: 51, speaker: "A" }, 335 - { start: 55, end: 56, speaker: "A" }, 336 - ]; 337 - const diarization = [ 338 - { start: 10, end: 52, speaker: "S1" }, 339 - { start: 52, end: 60, speaker: "S2" }, 340 - ]; 341 - const targets = computeSnapTargets(words, diarization); 342 - for (let i = 1; i < targets.length; i++) { 343 - expect(targets[i].time).toBeGreaterThanOrEqual(targets[i - 1].time); 344 - } 345 - }); 346 - }); 347 - 348 - describe("findNearestSnap", () => { 349 - it("returns nearest snap target within radius, resolving edge-aware offset", () => { 350 - const targets: SnapTarget[] = [ 351 - { type: "silence_gap", time: 100, gapStart: 98, gapEnd: 102, priority: 1 }, 352 - ]; 353 - // Dragging a start boundary — should snap to gapEnd + 0.5s offset 354 - const result = findNearestSnap(targets, 101.5, "start", 3); 355 - expect(result).not.toBeNull(); 356 - expect(result!.snappedTime).toBeCloseTo(102.5); // gapEnd + 500ms 357 - }); 358 - 359 - it("clamps offset to word boundary if overshoot", () => { 360 - const targets: SnapTarget[] = [ 361 - { type: "silence_gap", time: 100, gapStart: 98, gapEnd: 102, priority: 1, nearestWordAfterGap: 102.2 }, 362 - ]; 363 - // gapEnd + 500ms = 102.5, but nearest word starts at 102.2 — clamp 364 - const result = findNearestSnap(targets, 101.5, "start", 3); 365 - expect(result).not.toBeNull(); 366 - expect(result!.snappedTime).toBeCloseTo(102.2); 367 - }); 368 - 369 - it("returns null when no targets within radius", () => { 370 - const targets: SnapTarget[] = [ 371 - { 
type: "silence_gap", time: 100, gapStart: 98, gapEnd: 102, priority: 1 }, 372 - ]; 373 - const result = findNearestSnap(targets, 200, "start", 3); 374 - expect(result).toBeNull(); 375 - }); 376 - 377 - it("picks highest priority when multiple targets within radius", () => { 378 - const targets: SnapTarget[] = [ 379 - { type: "speaker_change", time: 100, priority: 2 }, 380 - { type: "silence_gap", time: 100.5, gapStart: 99, gapEnd: 101, priority: 1 }, 381 - ]; 382 - const result = findNearestSnap(targets, 100.2, "start", 3); 383 - expect(result!.target.type).toBe("silence_gap"); // priority 1 wins 384 - }); 385 - 386 - it("resolves end boundary snap to gapStart - offset", () => { 387 - const targets: SnapTarget[] = [ 388 - { type: "silence_gap", time: 100, gapStart: 98, gapEnd: 102, priority: 1 }, 389 - ]; 390 - const result = findNearestSnap(targets, 99, "end", 3); 391 - expect(result).not.toBeNull(); 392 - expect(result!.snappedTime).toBeCloseTo(97.5); // gapStart - 500ms 393 - }); 394 - 395 - it("clamps end boundary offset to word boundary if overshoot", () => { 396 - const targets: SnapTarget[] = [ 397 - { type: "silence_gap", time: 100, gapStart: 98, gapEnd: 102, priority: 1, nearestWordBeforeGap: 97.8 }, 398 - ]; 399 - // gapStart - 500ms = 97.5, but nearest word ends at 97.8 — clamp 400 - const result = findNearestSnap(targets, 99, "end", 3); 401 - expect(result).not.toBeNull(); 402 - expect(result!.snappedTime).toBeCloseTo(97.8); 403 - }); 404 - }); 405 - ``` 406 - 407 - - [ ] **Step 2: Run tests to verify they fail** 408 - 409 - Run: `cd apps/ionosphere && npx vitest run src/lib/snap-targets.test.ts` 410 - Expected: FAIL — module not found 411 - 412 - - [ ] **Step 3: Implement snap targets module** 413 - 414 - ```ts 415 - // apps/ionosphere/src/lib/snap-targets.ts 416 - 417 - export interface SnapTarget { 418 - type: "silence_gap" | "speaker_change" | "low_confidence" | "word_boundary"; 419 - time: number; // representative position in seconds 420 - 
priority: number; // 1 = highest (silence gap), 4 = lowest (word boundary) 421 - gapStart?: number; 422 - gapEnd?: number; 423 - nearestWordBeforeGap?: number; 424 - nearestWordAfterGap?: number; 425 - } 426 - 427 - export interface SnapResult { 428 - target: SnapTarget; 429 - snappedTime: number; 430 - } 431 - 432 - interface Word { 433 - start: number; 434 - end: number; 435 - speaker: string; 436 - } 437 - 438 - interface DiarizationSegment { 439 - start: number; 440 - end: number; 441 - speaker: string; 442 - } 443 - 444 - const SILENCE_GAP_THRESHOLD = 2; // seconds 445 - const SNAP_OFFSET = 0.5; // 500ms breathing room 446 - 447 - export function computeSnapTargets( 448 - words: Word[], 449 - diarization: DiarizationSegment[], 450 - ): SnapTarget[] { 451 - const targets: SnapTarget[] = []; 452 - 453 - // Silence gaps from word timestamps 454 - for (let i = 1; i < words.length; i++) { 455 - const gap = words[i].start - words[i - 1].end; 456 - if (gap > SILENCE_GAP_THRESHOLD) { 457 - targets.push({ 458 - type: "silence_gap", 459 - time: (words[i - 1].end + words[i].start) / 2, 460 - priority: 1, 461 - gapStart: words[i - 1].end, 462 - gapEnd: words[i].start, 463 - nearestWordBeforeGap: words[i - 1].end, 464 - nearestWordAfterGap: words[i].start, 465 - }); 466 - } 467 - } 468 - 469 - // Speaker change points from diarization 470 - for (let i = 1; i < diarization.length; i++) { 471 - if (diarization[i].speaker !== diarization[i - 1].speaker) { 472 - targets.push({ 473 - type: "speaker_change", 474 - time: diarization[i].start, 475 - priority: 2, 476 - }); 477 - } 478 - } 479 - 480 - // Sort by time for binary search 481 - targets.sort((a, b) => a.time - b.time); 482 - return targets; 483 - } 484 - 485 - /** 486 - * Find the nearest snap target within `radiusSeconds` of `timeSeconds`. 487 - * Edge-aware: resolves to the near edge of the feature + 500ms offset. 
488 - */ 489 - export function findNearestSnap( 490 - targets: SnapTarget[], 491 - timeSeconds: number, 492 - edge: "start" | "end", 493 - radiusSeconds: number, 494 - ): SnapResult | null { 495 - // Binary search for the closest target 496 - let lo = 0; 497 - let hi = targets.length - 1; 498 - const candidates: SnapTarget[] = []; 499 - 500 - // Find targets within radius using binary search 501 - while (lo <= hi) { 502 - const mid = (lo + hi) >> 1; 503 - if (targets[mid].time < timeSeconds - radiusSeconds) { 504 - lo = mid + 1; 505 - } else if (targets[mid].time > timeSeconds + radiusSeconds) { 506 - hi = mid - 1; 507 - } else { 508 - // Found one in range — expand outward to find all 509 - let left = mid; 510 - while (left > 0 && targets[left - 1].time >= timeSeconds - radiusSeconds) left--; 511 - let right = mid; 512 - while (right < targets.length - 1 && targets[right + 1].time <= timeSeconds + radiusSeconds) right++; 513 - for (let i = left; i <= right; i++) candidates.push(targets[i]); 514 - break; 515 - } 516 - } 517 - 518 - if (candidates.length === 0) return null; 519 - 520 - // Pick highest priority (lowest number), then closest 521 - candidates.sort((a, b) => a.priority - b.priority || Math.abs(a.time - timeSeconds) - Math.abs(b.time - timeSeconds)); 522 - const best = candidates[0]; 523 - 524 - return { target: best, snappedTime: resolveSnapPosition(best, edge) }; 525 - } 526 - 527 - function resolveSnapPosition(target: SnapTarget, edge: "start" | "end"): number { 528 - if (target.type === "silence_gap" && target.gapStart != null && target.gapEnd != null) { 529 - if (edge === "start") { 530 - // Dragging start boundary — snap to end of gap + offset (before first word) 531 - const ideal = target.gapEnd + SNAP_OFFSET; 532 - // Clamp to nearest word if offset overshoots 533 - if (target.nearestWordAfterGap != null && ideal > target.nearestWordAfterGap) { 534 - return target.nearestWordAfterGap; 535 - } 536 - return ideal; 537 - } else { 538 - // Dragging 
end boundary: snap to start of gap - offset (after last word) 539 - const ideal = target.gapStart - SNAP_OFFSET; 540 - if (target.nearestWordBeforeGap != null && ideal < target.nearestWordBeforeGap) { 541 - return target.nearestWordBeforeGap; 542 - } 543 - return ideal; 544 - } 545 - } 546 - 547 - // Speaker changes and other targets: use the target time directly 548 - return target.time; 549 - } 550 - ``` 551 - 552 - - [ ] **Step 4: Run tests to verify they pass** 553 - 554 - Run: `cd apps/ionosphere && npx vitest run src/lib/snap-targets.test.ts` 555 - Expected: All 9 tests PASS 556 - 557 - - [ ] **Step 5: Commit** 558 - 559 - ```bash 560 - git add apps/ionosphere/src/lib/snap-targets.ts apps/ionosphere/src/lib/snap-targets.test.ts 561 - git commit -m "feat(editor): snap target computation with edge-aware offset" 562 - ``` 563 - 564 - --- 565 - 566 - ### Task 3: Timeline Engine Store 567 - 568 - **Files:** 569 - - Create: `apps/ionosphere/src/lib/timeline-engine.ts` 570 - 571 - This is a React context + store that composes the corrections and snap logic. No separate test file: the corrections and snap logic are tested independently, and the engine is a thin integration layer that will be tested via component interaction in later tasks. 
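The `cursor` argument that `replayCorrections` accepts in Task 1 is what lets undo and redo work without mutating history: the log only grows, and a cursor marks how much of it is applied. The following is a minimal sketch of that bookkeeping in isolation, with no React involved. The names `Log`, `push`, `undo`, `redo`, `applied`, and `isDirty` are hypothetical and used only for illustration; they are not part of the planned module.

```ts
// Hypothetical, simplified model of cursor-based undo/redo over an append-only log.
interface Log<T> {
  entries: T[];        // every entry recorded so far
  undoCursor: number;  // entries [0, undoCursor) count as applied
  savedCursor: number; // cursor value at the last save
}

// Recording a new entry first discards any undone ("redo") tail.
function push<T>(log: Log<T>, entry: T): Log<T> {
  const entries = [...log.entries.slice(0, log.undoCursor), entry];
  return { ...log, entries, undoCursor: entries.length };
}

// Undo and redo only move the cursor; the entries themselves are untouched.
function undo<T>(log: Log<T>): Log<T> {
  return log.undoCursor > 0 ? { ...log, undoCursor: log.undoCursor - 1 } : log;
}

function redo<T>(log: Log<T>): Log<T> {
  return log.undoCursor < log.entries.length
    ? { ...log, undoCursor: log.undoCursor + 1 }
    : log;
}

// The applied prefix is what would be replayed (and what would be saved).
function applied<T>(log: Log<T>): T[] {
  return log.entries.slice(0, log.undoCursor);
}

// Dirty detection by comparing cursors only.
function isDirty<T>(log: Log<T>): boolean {
  return log.undoCursor !== log.savedCursor;
}

// Usage: record two entries, undo one, then record a different one.
let demo: Log<string> = { entries: [], undoCursor: 0, savedCursor: 0 };
demo = push(push(demo, "a"), "b");
demo = undo(demo);     // applied(demo) is ["a"]
demo = push(demo, "c"); // "b" is dropped; applied(demo) is ["a", "c"]
```

One caveat of comparing cursors alone, under the assumptions of this sketch: if the log is saved, one entry is undone, and a different entry is recorded, the cursor returns to its saved value even though the applied entries differ, so a stricter check might also compare entry ids.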
572 - 573 - - [ ] **Step 1: Create the timeline engine** 574 - 575 - ```ts 576 - // apps/ionosphere/src/lib/timeline-engine.ts 577 - "use client"; 578 - 579 - import { 580 - createContext, 581 - useContext, 582 - useReducer, 583 - useMemo, 584 - useCallback, 585 - type ReactNode, 586 - } from "react"; 587 - import { 588 - replayCorrections, 589 - type BaseTalk, 590 - type EffectiveTalk, 591 - type CorrectionEntry, 592 - type CorrectionAction, 593 - } from "./corrections"; 594 - import { 595 - computeSnapTargets, 596 - findNearestSnap, 597 - type SnapTarget, 598 - type SnapResult, 599 - } from "./snap-targets"; 600 - 601 - // --- Types --- 602 - 603 - export type EditMode = "select" | "trim" | "split" | "add"; 604 - 605 - interface DragState { 606 - talkRkey: string; 607 - edge: "start" | "end"; 608 - originalSeconds: number; 609 - currentSeconds: number; 610 - } 611 - 612 - interface EngineState { 613 - // Editing 614 - editingEnabled: boolean; 615 - mode: EditMode; 616 - selectedTalkRkey: string | null; 617 - activeDrag: DragState | null; 618 - 619 - // Corrections 620 - corrections: CorrectionEntry[]; 621 - undoCursor: number; // index into corrections: entries [0, undoCursor) are applied 622 - savedCursor: number; // cursor at last save — for dirty detection 623 - 624 - // Data (set on init, not part of reducer) 625 - streamSlug: string; 626 - baseTalks: BaseTalk[]; 627 - authorDid?: string; 628 - } 629 - 630 - type EngineAction = 631 - | { type: "TOGGLE_EDITING" } 632 - | { type: "SET_MODE"; mode: EditMode } 633 - | { type: "SELECT_TALK"; rkey: string | null } 634 - | { type: "START_DRAG"; talkRkey: string; edge: "start" | "end"; seconds: number } 635 - | { type: "UPDATE_DRAG"; seconds: number } 636 - | { type: "COMMIT_DRAG" } 637 - | { type: "CANCEL_DRAG" } 638 - | { type: "APPLY_CORRECTION"; action: CorrectionAction } 639 - | { type: "UNDO" } 640 - | { type: "REDO" } 641 - | { type: "MARK_SAVED" } 642 - | { type: "LOAD_CORRECTIONS"; corrections: 
CorrectionEntry[] }; 643 - 644 - function generateId(): string { 645 - return crypto.randomUUID(); 646 - } 647 - 648 - function engineReducer(state: EngineState, action: EngineAction): EngineState { 649 - switch (action.type) { 650 - case "TOGGLE_EDITING": 651 - return { 652 - ...state, 653 - editingEnabled: !state.editingEnabled, 654 - mode: "select", 655 - selectedTalkRkey: null, 656 - activeDrag: null, 657 - }; 658 - 659 - case "SET_MODE": 660 - return { ...state, mode: action.mode, activeDrag: null }; 661 - 662 - case "SELECT_TALK": 663 - return { ...state, selectedTalkRkey: action.rkey }; 664 - 665 - case "START_DRAG": 666 - return { 667 - ...state, 668 - activeDrag: { 669 - talkRkey: action.talkRkey, 670 - edge: action.edge, 671 - originalSeconds: action.seconds, 672 - currentSeconds: action.seconds, 673 - }, 674 - }; 675 - 676 - case "UPDATE_DRAG": 677 - if (!state.activeDrag) return state; 678 - return { 679 - ...state, 680 - activeDrag: { ...state.activeDrag, currentSeconds: action.seconds }, 681 - }; 682 - 683 - case "COMMIT_DRAG": { 684 - if (!state.activeDrag) return state; 685 - const { talkRkey, edge, originalSeconds, currentSeconds } = state.activeDrag; 686 - if (Math.abs(originalSeconds - currentSeconds) < 0.05) { 687 - return { ...state, activeDrag: null }; 688 - } 689 - const correction: CorrectionEntry = { 690 - id: generateId(), 691 - timestamp: new Date().toISOString(), 692 - authorDid: state.authorDid, 693 - streamSlug: state.streamSlug, 694 - action: { 695 - type: "move_boundary", 696 - talkRkey, 697 - edge, 698 - fromSeconds: originalSeconds, 699 - toSeconds: currentSeconds, 700 - }, 701 - }; 702 - const corrections = [...state.corrections.slice(0, state.undoCursor), correction]; 703 - return { 704 - ...state, 705 - corrections, 706 - undoCursor: corrections.length, 707 - activeDrag: null, 708 - }; 709 - } 710 - 711 - case "CANCEL_DRAG": 712 - return { ...state, activeDrag: null }; 713 - 714 - case "APPLY_CORRECTION": { 715 - const 
correction: CorrectionEntry = { 716 - id: generateId(), 717 - timestamp: new Date().toISOString(), 718 - authorDid: state.authorDid, 719 - streamSlug: state.streamSlug, 720 - action: action.action, 721 - }; 722 - const corrections = [...state.corrections.slice(0, state.undoCursor), correction]; 723 - return { ...state, corrections, undoCursor: corrections.length }; 724 - } 725 - 726 - case "UNDO": 727 - if (state.undoCursor <= 0) return state; 728 - return { ...state, undoCursor: state.undoCursor - 1 }; 729 - 730 - case "REDO": 731 - if (state.undoCursor >= state.corrections.length) return state; 732 - return { ...state, undoCursor: state.undoCursor + 1 }; 733 - 734 - case "MARK_SAVED": 735 - return { ...state, savedCursor: state.undoCursor }; 736 - 737 - case "LOAD_CORRECTIONS": 738 - return { 739 - ...state, 740 - corrections: action.corrections, 741 - undoCursor: action.corrections.length, 742 - savedCursor: action.corrections.length, 743 - }; 744 - 745 - default: 746 - return state; 747 - } 748 - } 749 - 750 - // --- Context --- 751 - 752 - interface TimelineEngineContextValue { 753 - // State 754 - editingEnabled: boolean; 755 - mode: EditMode; 756 - selectedTalkRkey: string | null; 757 - activeDrag: DragState | null; 758 - isDirty: boolean; 759 - canUndo: boolean; 760 - canRedo: boolean; 761 - 762 - // Derived 763 - effectiveTalks: EffectiveTalk[]; 764 - speakerNames: Map<string, string>; 765 - snapTargets: SnapTarget[]; 766 - 767 - // Viewport (managed externally, exposed here for coordinate conversion) 768 - windowStart: number; 769 - windowEnd: number; 770 - containerWidth: number; 771 - 772 - // Coordinate conversion 773 - timeToPixel: (seconds: number) => number; 774 - pixelToTime: (px: number) => number; 775 - 776 - // Snap 777 - findSnap: (timeSeconds: number, edge: "start" | "end", radiusPx: number) => SnapResult | null; 778 - 779 - // Actions 780 - toggleEditing: () => void; 781 - setMode: (mode: EditMode) => void; 782 - selectTalk: (rkey: string | 
null) => void; 783 - startDrag: (talkRkey: string, edge: "start" | "end", seconds: number) => void; 784 - updateDrag: (seconds: number) => void; 785 - commitDrag: () => void; 786 - cancelDrag: () => void; 787 - applyCorrection: (action: CorrectionAction) => void; 788 - undo: () => void; 789 - redo: () => void; 790 - markSaved: () => void; 791 - getCorrectionsToSave: () => CorrectionEntry[]; 792 - } 793 - 794 - const TimelineEngineContext = createContext<TimelineEngineContextValue | null>(null); 795 - 796 - export function useTimelineEngine() { 797 - const ctx = useContext(TimelineEngineContext); 798 - if (!ctx) throw new Error("useTimelineEngine must be used within TimelineEngineProvider"); 799 - return ctx; 800 - } 801 - 802 - // --- Provider --- 803 - 804 - interface TimelineEngineProviderProps { 805 - children: ReactNode; 806 - streamSlug: string; 807 - baseTalks: BaseTalk[]; 808 - words: Array<{ start: number; end: number; speaker: string }>; 809 - diarization: Array<{ start: number; end: number; speaker: string }>; 810 - initialCorrections?: CorrectionEntry[]; 811 - authorDid?: string; 812 - // Viewport props (managed by parent zoom component) 813 - windowStart: number; 814 - windowEnd: number; 815 - containerWidth: number; 816 - } 817 - 818 - export function TimelineEngineProvider({ 819 - children, 820 - streamSlug, 821 - baseTalks, 822 - words, 823 - diarization, 824 - initialCorrections, 825 - authorDid, 826 - windowStart, 827 - windowEnd, 828 - containerWidth, 829 - }: TimelineEngineProviderProps) { 830 - const [state, dispatch] = useReducer(engineReducer, { 831 - editingEnabled: false, 832 - mode: "select", 833 - selectedTalkRkey: null, 834 - activeDrag: null, 835 - corrections: initialCorrections ?? [], 836 - undoCursor: initialCorrections?.length ?? 0, 837 - savedCursor: initialCorrections?.length ?? 
0, 838 - streamSlug, 839 - baseTalks, 840 - authorDid, 841 - }); 842 - 843 - // Replay corrections to get effective state 844 - const { talks: effectiveTalks, speakerNames } = useMemo( 845 - () => replayCorrections(state.baseTalks, state.corrections, state.undoCursor), 846 - [state.baseTalks, state.corrections, state.undoCursor], 847 - ); 848 - 849 - // Pre-compute snap targets 850 - const snapTargets = useMemo( 851 - () => computeSnapTargets(words, diarization), 852 - [words, diarization], 853 - ); 854 - 855 - // Coordinate conversion 856 - const windowDuration = windowEnd - windowStart; 857 - const timeToPixel = useCallback( 858 - (seconds: number) => ((seconds - windowStart) / windowDuration) * containerWidth, 859 - [windowStart, windowDuration, containerWidth], 860 - ); 861 - const pixelToTime = useCallback( 862 - (px: number) => windowStart + (px / containerWidth) * windowDuration, 863 - [windowStart, containerWidth, windowDuration], 864 - ); 865 - 866 - const findSnap = useCallback( 867 - (timeSeconds: number, edge: "start" | "end", radiusPx: number) => { 868 - const radiusSeconds = (radiusPx / containerWidth) * windowDuration; 869 - return findNearestSnap(snapTargets, timeSeconds, edge, radiusSeconds); 870 - }, 871 - [snapTargets, containerWidth, windowDuration], 872 - ); 873 - 874 - const value: TimelineEngineContextValue = useMemo(() => ({ 875 - editingEnabled: state.editingEnabled, 876 - mode: state.mode, 877 - selectedTalkRkey: state.selectedTalkRkey, 878 - activeDrag: state.activeDrag, 879 - isDirty: state.undoCursor !== state.savedCursor, 880 - canUndo: state.undoCursor > 0, 881 - canRedo: state.undoCursor < state.corrections.length, 882 - effectiveTalks, 883 - speakerNames, 884 - snapTargets, 885 - windowStart, 886 - windowEnd, 887 - containerWidth, 888 - timeToPixel, 889 - pixelToTime, 890 - findSnap, 891 - toggleEditing: () => dispatch({ type: "TOGGLE_EDITING" }), 892 - setMode: (mode: EditMode) => dispatch({ type: "SET_MODE", mode }), 893 - 
selectTalk: (rkey: string | null) => dispatch({ type: "SELECT_TALK", rkey }), 894 - startDrag: (talkRkey: string, edge: "start" | "end", seconds: number) => 895 - dispatch({ type: "START_DRAG", talkRkey, edge, seconds }), 896 - updateDrag: (seconds: number) => dispatch({ type: "UPDATE_DRAG", seconds }), 897 - commitDrag: () => dispatch({ type: "COMMIT_DRAG" }), 898 - cancelDrag: () => dispatch({ type: "CANCEL_DRAG" }), 899 - applyCorrection: (action: CorrectionAction) => 900 - dispatch({ type: "APPLY_CORRECTION", action }), 901 - undo: () => dispatch({ type: "UNDO" }), 902 - redo: () => dispatch({ type: "REDO" }), 903 - markSaved: () => dispatch({ type: "MARK_SAVED" }), 904 - getCorrectionsToSave: () => state.corrections.slice(0, state.undoCursor), 905 - }), [state, effectiveTalks, speakerNames, snapTargets, windowStart, windowEnd, containerWidth, timeToPixel, pixelToTime, findSnap]); 906 - 907 - return ( 908 - <TimelineEngineContext.Provider value={value}> 909 - {children} 910 - </TimelineEngineContext.Provider> 911 - ); 912 - } 913 - ``` 914 - 915 - - [ ] **Step 2: Verify TypeScript compiles** 916 - 917 - Run: `cd apps/ionosphere && npx tsc --noEmit` 918 - Expected: No errors (or only pre-existing ones) 919 - 920 - - [ ] **Step 3: Commit** 921 - 922 - ```bash 923 - git add apps/ionosphere/src/lib/timeline-engine.ts 924 - git commit -m "feat(editor): timeline engine context and reducer" 925 - ``` 926 - 927 - --- 928 - 929 - ### Task 4: Corrections API Endpoints 930 - 931 - **Files:** 932 - - Modify: `apps/ionosphere-appview/src/routes.ts` (add two endpoints) 933 - 934 - - [ ] **Step 1: Add corrections endpoints to routes.ts** 935 - 936 - Add the following before the `return app;` at the end of `createRoutes()`, after the existing tracks routes (line ~529 in routes.ts). Also add the necessary imports at the top. 
937 - 938 - Add to imports at top of file: 939 - ```ts 940 - import { readFileSync, existsSync, writeFileSync, mkdirSync } from "node:fs"; 941 - ``` 942 - 943 - Note: `readFileSync` is already imported; extend that import to include `existsSync`, `writeFileSync`, and `mkdirSync`. 944 - 945 - Update the CORS middleware: change `"Access-Control-Allow-Methods"` from `"GET, OPTIONS"` to `"GET, PUT, OPTIONS"`. 946 - 947 - Add endpoints before `return app;`: 948 - ```ts 949 - // --- Corrections sidecar --- 950 - 951 - // Valid stream slugs (prevents path traversal); checked by both endpoints below 952 - // Note: STREAMS is not exported from tracks.ts yet — add `export` to the STREAMS array, 953 - // then import it: `import { getTracksIndex, getTrackData, STREAMS } from "./tracks.js";` 954 - const validSlugs = new Set(STREAMS.map((s) => s.slug)); 955 - 956 - app.get("/xrpc/tv.ionosphere.getCorrections", (c) => { 957 - const stream = c.req.query("stream"); 958 - if (!stream || !validSlugs.has(stream)) return c.json({ error: "missing or invalid stream parameter" }, 400); 959 - 960 - const correctionsPath = path.resolve( 961 - import.meta.dirname, 962 - `../data/corrections/corrections-${stream}.json`, 963 - ); 964 - if (!existsSync(correctionsPath)) { 965 - return c.json({ corrections: [] }); 966 - } 967 - const data = JSON.parse(readFileSync(correctionsPath, "utf-8")); 968 - return c.json({ corrections: data }); 969 - }); 970 - 971 - app.put("/xrpc/tv.ionosphere.putCorrections", async (c) => { 972 - const body = await c.req.json(); 973 - const stream = body.stream; 974 - const corrections = body.corrections; 975 - if (!stream || !Array.isArray(corrections)) { 976 - return c.json({ error: "missing stream or corrections" }, 400); 977 - } 978 - if (!validSlugs.has(stream)) { 979 - return c.json({ error: "invalid stream" }, 400); 980 - } 981 - 982 - const dir = path.resolve(import.meta.dirname, "../data/corrections"); 983 - if (!existsSync(dir)) { 984 - mkdirSync(dir, { recursive: true }); 985 - } 986 - const correctionsPath = 
path.resolve(dir, `corrections-${stream}.json`); 987 - writeFileSync(correctionsPath, JSON.stringify(corrections, null, 2)); 988 - // NOTE: No authentication — acceptable for local dev, will need auth for production 989 - return c.json({ ok: true, count: corrections.length }); 990 - }); 991 - ``` 992 - 993 - - [ ] **Step 2: Create the corrections data directory** 994 - 995 - Run: `mkdir -p apps/ionosphere-appview/data/corrections` 996 - 997 - - [ ] **Step 3: Verify the appview still starts** 998 - 999 - Run: `cd apps/ionosphere-appview && npm run build` (or `npx tsc --noEmit`) 1000 - Expected: No errors 1001 - 1002 - - [ ] **Step 4: Commit** 1003 - 1004 - ```bash 1005 - git add apps/ionosphere-appview/src/routes.ts 1006 - git commit -m "feat(editor): corrections load/save API endpoints" 1007 - ``` 1008 - 1009 - --- 1010 - 1011 - ## Chunk 2: UI Components — Toolbar, Talk Segments, and Integration 1012 - 1013 - ### Task 5: NLE Toolbar Component 1014 - 1015 - **Files:** 1016 - - Create: `apps/ionosphere/src/app/components/TimelineToolbar.tsx` 1017 - 1018 - - [ ] **Step 1: Create the toolbar component** 1019 - 1020 - ```tsx 1021 - // apps/ionosphere/src/app/components/TimelineToolbar.tsx 1022 - "use client"; 1023 - 1024 - import { useTimelineEngine, type EditMode } from "@/lib/timeline-engine"; 1025 - 1026 - const MODE_BUTTONS: { mode: EditMode; label: string; shortcut: string }[] = [ 1027 - { mode: "select", label: "Select", shortcut: "V" }, 1028 - { mode: "trim", label: "Trim", shortcut: "T" }, 1029 - { mode: "split", label: "Split", shortcut: "S" }, 1030 - { mode: "add", label: "Add", shortcut: "A" }, 1031 - ]; 1032 - 1033 - export default function TimelineToolbar({ onSave }: { onSave: () => void }) { 1034 - const { 1035 - editingEnabled, 1036 - toggleEditing, 1037 - mode, 1038 - setMode, 1039 - canUndo, 1040 - canRedo, 1041 - undo, 1042 - redo, 1043 - isDirty, 1044 - effectiveTalks, 1045 - selectedTalkRkey, 1046 - applyCorrection, 1047 - } = useTimelineEngine(); 
1048 - 1049 - const verifiedCount = effectiveTalks.filter((t) => t.verified).length; 1050 - const totalCount = effectiveTalks.length; 1051 - 1052 - const handleDelete = () => { 1053 - if (!selectedTalkRkey) return; 1054 - const talk = effectiveTalks.find((t) => t.rkey === selectedTalkRkey); 1055 - if (!talk) return; 1056 - if (talk.verified && !confirm("Delete verified talk?")) return; 1057 - applyCorrection({ type: "remove_talk", talkRkey: selectedTalkRkey }); 1058 - }; 1059 - 1060 - return ( 1061 - <div className="flex flex-col gap-1"> 1062 - {/* Top row: Edit toggle + verification progress */} 1063 - <div className="flex items-center gap-2"> 1064 - <button 1065 - onClick={toggleEditing} 1066 - className={`px-3 py-1 text-xs rounded font-medium transition-colors ${ 1067 - editingEnabled 1068 - ? "bg-blue-600 text-white" 1069 - : "bg-neutral-800 text-neutral-400 hover:text-neutral-200" 1070 - }`} 1071 - > 1072 - {editingEnabled ? "Editing" : "Edit"} 1073 - </button> 1074 - <span className="text-xs text-neutral-500"> 1075 - {verifiedCount}/{totalCount} verified 1076 - </span> 1077 - </div> 1078 - 1079 - {/* Bottom row: Mode buttons + undo/redo + save (only when editing) */} 1080 - {editingEnabled && ( 1081 - <div className="flex items-center gap-1"> 1082 - {/* Mode buttons */} 1083 - <div className="flex items-center gap-0.5 border-r border-neutral-700 pr-2 mr-1"> 1084 - {MODE_BUTTONS.map(({ mode: m, label, shortcut }) => ( 1085 - <button 1086 - key={m} 1087 - onClick={() => setMode(m)} 1088 - className={`px-2 py-0.5 text-xs rounded transition-colors ${ 1089 - mode === m 1090 - ? 
"bg-neutral-700 text-neutral-100" 1091 - : "text-neutral-500 hover:text-neutral-300 hover:bg-neutral-800" 1092 - }`} 1093 - title={`${label} (${shortcut})`} 1094 - > 1095 - {label} 1096 - </button> 1097 - ))} 1098 - <button 1099 - onClick={handleDelete} 1100 - disabled={!selectedTalkRkey} 1101 - className="px-2 py-0.5 text-xs rounded text-neutral-500 hover:text-red-400 hover:bg-neutral-800 disabled:opacity-30 disabled:cursor-not-allowed" 1102 - title="Delete (Backspace)" 1103 - > 1104 - Delete 1105 - </button> 1106 - </div> 1107 - 1108 - {/* Undo/Redo */} 1109 - <div className="flex items-center gap-0.5 border-r border-neutral-700 pr-2 mr-1"> 1110 - <button 1111 - onClick={undo} 1112 - disabled={!canUndo} 1113 - className="px-2 py-0.5 text-xs rounded text-neutral-500 hover:text-neutral-300 hover:bg-neutral-800 disabled:opacity-30" 1114 - title="Undo (Ctrl+Z)" 1115 - > 1116 - Undo 1117 - </button> 1118 - <button 1119 - onClick={redo} 1120 - disabled={!canRedo} 1121 - className="px-2 py-0.5 text-xs rounded text-neutral-500 hover:text-neutral-300 hover:bg-neutral-800 disabled:opacity-30" 1122 - title="Redo (Ctrl+Shift+Z)" 1123 - > 1124 - Redo 1125 - </button> 1126 - </div> 1127 - 1128 - {/* Save */} 1129 - <button 1130 - onClick={onSave} 1131 - className={`px-2 py-0.5 text-xs rounded transition-colors ${ 1132 - isDirty 1133 - ? "bg-blue-600/20 text-blue-400 hover:bg-blue-600/30" 1134 - : "text-neutral-600 cursor-default" 1135 - }`} 1136 - disabled={!isDirty} 1137 - title="Save (Ctrl+S)" 1138 - > 1139 - Save{isDirty ? 
" *" : ""} 1140 - </button> 1141 - </div> 1142 - )} 1143 - </div> 1144 - ); 1145 - } 1146 - ``` 1147 - 1148 - - [ ] **Step 2: Verify TypeScript compiles** 1149 - 1150 - Run: `cd apps/ionosphere && npx tsc --noEmit` 1151 - Expected: No errors 1152 - 1153 - - [ ] **Step 3: Commit** 1154 - 1155 - ```bash 1156 - git add apps/ionosphere/src/app/components/TimelineToolbar.tsx 1157 - git commit -m "feat(editor): NLE toolbar component with mode buttons" 1158 - ``` 1159 - 1160 - --- 1161 - 1162 - ### Task 6: Refactor StreamTimeline to Use Engine 1163 - 1164 - **Files:** 1165 - - Modify: `apps/ionosphere/src/app/components/StreamTimeline.tsx` 1166 - 1167 - The existing StreamTimeline is ~100 lines. Refactor it to read from the engine's `effectiveTalks` and show edit affordances. Keep the existing click-to-seek behavior when not in edit mode. 1168 - 1169 - - [ ] **Step 1: Refactor StreamTimeline** 1170 - 1171 - Replace the full contents of `StreamTimeline.tsx`: 1172 - 1173 - ```tsx 1174 - // apps/ionosphere/src/app/components/StreamTimeline.tsx 1175 - "use client"; 1176 - 1177 - import { useRef, useCallback, useMemo } from "react"; 1178 - import { useTimestamp } from "./TimestampProvider"; 1179 - import { talkColor, buildIndexMap } from "@/lib/track-colors"; 1180 - import { useTimelineEngine } from "@/lib/timeline-engine"; 1181 - 1182 - function formatTime(seconds: number): string { 1183 - const h = Math.floor(seconds / 3600); 1184 - const m = Math.floor((seconds % 3600) / 60); 1185 - const s = Math.floor(seconds % 60); 1186 - return `${h}:${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`; 1187 - } 1188 - 1189 - interface StreamTimelineProps { 1190 - /** All talk rkeys for stable color assignment (full list, not just visible). 
*/ 1191 - allTalkRkeys: string[]; 1192 - } 1193 - 1194 - export default function StreamTimeline({ allTalkRkeys }: StreamTimelineProps) { 1195 - const { currentTimeNs, seekTo } = useTimestamp(); 1196 - const barRef = useRef<HTMLDivElement>(null); 1197 - const currentTimeSec = currentTimeNs / 1e9; 1198 - 1199 - const { 1200 - effectiveTalks, 1201 - editingEnabled, 1202 - mode, 1203 - selectedTalkRkey, 1204 - selectTalk, 1205 - activeDrag, 1206 - windowStart, 1207 - windowEnd, 1208 - timeToPixel, 1209 - pixelToTime, 1210 - startDrag, 1211 - applyCorrection, 1212 - } = useTimelineEngine(); 1213 - 1214 - const windowDuration = windowEnd - windowStart; 1215 - 1216 - // Stable color index from ALL talks 1217 - const colorIndex = useMemo( 1218 - () => buildIndexMap(allTalkRkeys), 1219 - [allTalkRkeys], 1220 - ); 1221 - 1222 - // Filter to visible talks 1223 - const visibleTalks = useMemo( 1224 - () => effectiveTalks.filter( 1225 - (t) => t.startSeconds < windowEnd && (t.endSeconds ?? windowEnd) > windowStart, 1226 - ), 1227 - [effectiveTalks, windowStart, windowEnd], 1228 - ); 1229 - 1230 - const handleBarClick = useCallback( 1231 - (e: React.MouseEvent<HTMLDivElement>) => { 1232 - if (!barRef.current) return; 1233 - const rect = barRef.current.getBoundingClientRect(); 1234 - const fraction = (e.clientX - rect.left) / rect.width; 1235 - const seconds = windowStart + fraction * windowDuration; 1236 - 1237 - if (editingEnabled && mode === "split" && selectedTalkRkey) { 1238 - const talk = effectiveTalks.find((t) => t.rkey === selectedTalkRkey); 1239 - if (talk && seconds > talk.startSeconds && seconds < (talk.endSeconds ?? 
windowEnd)) { 1240 - const newRkey = Math.random().toString(36).slice(2, 10); 1241 - applyCorrection({ type: "split_talk", talkRkey: selectedTalkRkey, atSeconds: seconds, newRkey }); 1242 - return; 1243 - } 1244 - } 1245 - 1246 - if (editingEnabled && mode === "select") { 1247 - // Find which talk was clicked 1248 - const clicked = visibleTalks.find( 1249 - (t) => seconds >= t.startSeconds && seconds < (t.endSeconds ?? windowEnd), 1250 - ); 1251 - selectTalk(clicked?.rkey ?? null); 1252 - if (clicked) { 1253 - seekTo(clicked.startSeconds * 1e9); 1254 - return; 1255 - } 1256 - } 1257 - 1258 - seekTo(seconds * 1e9); 1259 - }, 1260 - [windowStart, windowDuration, seekTo, editingEnabled, mode, selectedTalkRkey, effectiveTalks, visibleTalks, selectTalk, applyCorrection, windowEnd], 1261 - ); 1262 - 1263 - const handleEdgeMouseDown = useCallback( 1264 - (e: React.MouseEvent, talkRkey: string, edge: "start" | "end", seconds: number) => { 1265 - if (!editingEnabled || mode !== "trim") return; 1266 - e.stopPropagation(); 1267 - startDrag(talkRkey, edge, seconds); 1268 - }, 1269 - [editingEnabled, mode, startDrag], 1270 - ); 1271 - 1272 - const scrubberPct = Math.min(100, Math.max(0, 1273 - ((currentTimeSec - windowStart) / windowDuration) * 100, 1274 - )); 1275 - 1276 - return ( 1277 - <div 1278 - ref={barRef} 1279 - onClick={handleBarClick} 1280 - className="relative w-full h-10 bg-neutral-900 rounded cursor-pointer overflow-hidden border border-neutral-800" 1281 - > 1282 - {visibleTalks.map((talk, i) => { 1283 - const talkStart = Math.max(talk.startSeconds, windowStart); 1284 - const talkEnd = Math.min(talk.endSeconds ?? 
windowEnd, windowEnd); 1285 - if (talkStart >= windowEnd || talkEnd <= windowStart) return null; 1286 - 1287 - // If this talk's boundary is being dragged, use the drag position 1288 - let displayStart = talkStart; 1289 - let displayEnd = talkEnd; 1290 - if (activeDrag?.talkRkey === talk.rkey) { 1291 - if (activeDrag.edge === "start") displayStart = Math.max(activeDrag.currentSeconds, windowStart); 1292 - if (activeDrag.edge === "end") displayEnd = Math.min(activeDrag.currentSeconds, windowEnd); 1293 - } 1294 - 1295 - const left = ((displayStart - windowStart) / windowDuration) * 100; 1296 - const width = ((displayEnd - displayStart) / windowDuration) * 100; 1297 - const isSelected = selectedTalkRkey === talk.rkey; 1298 - 1299 - return ( 1300 - <div 1301 - key={`${talk.rkey}-${i}`} 1302 - className={`absolute top-0 h-full flex items-center overflow-hidden ${ 1303 - isSelected ? "ring-2 ring-white/50 z-[5]" : "" 1304 - }`} 1305 - style={{ 1306 - left: `${left}%`, 1307 - width: `${width}%`, 1308 - backgroundColor: talkColor(talk.rkey, colorIndex), 1309 - }} 1310 - title={`${talk.title} (${formatTime(talk.startSeconds)})`} 1311 - > 1312 - {/* Left edge drag handle */} 1313 - {editingEnabled && mode === "trim" && ( 1314 - <div 1315 - className="absolute left-0 top-0 w-1 h-full cursor-col-resize hover:bg-white/40 z-[6]" 1316 - onMouseDown={(e) => handleEdgeMouseDown(e, talk.rkey, "start", talk.startSeconds)} 1317 - /> 1318 - )} 1319 - 1320 - <span className="text-[10px] text-neutral-300 px-1 truncate"> 1321 - {talk.title} 1322 - </span> 1323 - 1324 - {/* Verified badge */} 1325 - {talk.verified && ( 1326 - <span className="absolute top-0.5 right-1 text-[8px] text-green-400">&#10003;</span> 1327 - )} 1328 - 1329 - {/* Right edge drag handle */} 1330 - {editingEnabled && mode === "trim" && ( 1331 - <div 1332 - className="absolute right-0 top-0 w-1 h-full cursor-col-resize hover:bg-white/40 z-[6]" 1333 - onMouseDown={(e) => handleEdgeMouseDown(e, talk.rkey, "end", 
talk.endSeconds ?? windowEnd)} 1334 - /> 1335 - )} 1336 - </div> 1337 - ); 1338 - })} 1339 - 1340 - {/* Scrubber */} 1341 - <div 1342 - className="absolute top-0 h-full w-0.5 bg-white/80 z-10 pointer-events-none" 1343 - style={{ left: `${scrubberPct}%` }} 1344 - /> 1345 - 1346 - {/* Time labels */} 1347 - <div className="absolute bottom-0 left-1 text-[9px] text-neutral-500"> 1348 - {formatTime(windowStart)} 1349 - </div> 1350 - <div className="absolute bottom-0 right-1 text-[9px] text-neutral-500"> 1351 - {formatTime(windowEnd)} 1352 - </div> 1353 - </div> 1354 - ); 1355 - } 1356 - ``` 1357 - 1358 - - [ ] **Step 2: Verify TypeScript compiles** 1359 - 1360 - Run: `cd apps/ionosphere && npx tsc --noEmit` 1361 - 1362 - - [ ] **Step 3: Commit** 1363 - 1364 - ```bash 1365 - git add apps/ionosphere/src/app/components/StreamTimeline.tsx 1366 - git commit -m "refactor(editor): StreamTimeline reads from TimelineEngine" 1367 - ``` 1368 - 1369 - --- 1370 - 1371 - ### Task 7: Interaction Overlay (Drag Handling) 1372 - 1373 - **Files:** 1374 - - Create: `apps/ionosphere/src/app/components/InteractionOverlay.tsx` 1375 - 1376 - This component renders on top of the timeline and handles mouse move/up during drag operations. It also renders snap guide lines. 
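The overlay relies on a snap lookup provided by the engine, whose implementation is not part of this task. As a rough sketch only, a nearest-target lookup with a pixel threshold might look like the following. The function name `findNearestSnap`, the `SnapTarget` shape, the `pixelsPerSecond` parameter, and the example type labels are assumptions made for illustration; they are not the engine's actual API.

```ts
// Hypothetical sketch: not the engine's real snap implementation.
interface SnapTarget {
  type: string; // illustrative labels such as "word_gap" or "speaker_change"
  seconds: number;
}

interface SnapResult {
  snappedTime: number;
  target: SnapTarget;
}

/**
 * Return the snap target closest to `timeSeconds`, provided it lies within
 * `thresholdPx` on screen. Returns null when nothing is close enough.
 */
function findNearestSnap(
  timeSeconds: number,
  targets: SnapTarget[],
  thresholdPx: number,
  pixelsPerSecond: number,
): SnapResult | null {
  let best: SnapTarget | null = null;
  let bestDistPx = Infinity;

  for (const target of targets) {
    // Compare in pixels so the snap radius stays constant on screen.
    const distPx = Math.abs(target.seconds - timeSeconds) * pixelsPerSecond;
    if (distPx <= thresholdPx && distPx < bestDistPx) {
      best = target;
      bestDistPx = distPx;
    }
  }

  return best ? { snappedTime: best.seconds, target: best } : null;
}
```

Measuring the distance in pixels rather than seconds means the same threshold feels equally "sticky" at every zoom level: zooming in increases `pixelsPerSecond`, so fewer targets fall inside the radius.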
1377 - 1378 - - [ ] **Step 1: Create the interaction overlay** 1379 - 1380 - ```tsx 1381 - // apps/ionosphere/src/app/components/InteractionOverlay.tsx 1382 - "use client"; 1383 - 1384 - import { useEffect, useCallback, useState } from "react"; 1385 - import { useTimelineEngine } from "@/lib/timeline-engine"; 1386 - 1387 - export default function InteractionOverlay() { 1388 - const { 1389 - editingEnabled, 1390 - activeDrag, 1391 - updateDrag, 1392 - commitDrag, 1393 - cancelDrag, 1394 - pixelToTime, 1395 - timeToPixel, 1396 - findSnap, 1397 - windowStart, 1398 - windowEnd, 1399 - } = useTimelineEngine(); 1400 - 1401 - const [snapGuide, setSnapGuide] = useState<{ px: number; label: string } | null>(null); 1402 - 1403 - // Global mouse handlers during drag 1404 - useEffect(() => { 1405 - if (!activeDrag) { 1406 - setSnapGuide(null); 1407 - return; 1408 - } 1409 - 1410 - const onMouseMove = (e: MouseEvent) => { 1411 - const timeline = document.querySelector("[data-timeline-bar]") as HTMLElement; 1412 - if (!timeline) return; 1413 - const rect = timeline.getBoundingClientRect(); 1414 - const px = e.clientX - rect.left; 1415 - let timeSeconds = pixelToTime(px); 1416 - 1417 - // Check for snap (unless Alt is held) 1418 - if (!e.altKey) { 1419 - const snapResult = findSnap(timeSeconds, activeDrag.edge, 10); 1420 - if (snapResult) { 1421 - timeSeconds = snapResult.snappedTime; 1422 - const snapPx = timeToPixel(snapResult.snappedTime); 1423 - setSnapGuide({ px: snapPx, label: snapResult.target.type.replace("_", " ") }); 1424 - } else { 1425 - setSnapGuide(null); 1426 - } 1427 - } else { 1428 - setSnapGuide(null); 1429 - } 1430 - 1431 - updateDrag(timeSeconds); 1432 - }; 1433 - 1434 - const onMouseUp = () => { 1435 - commitDrag(); 1436 - setSnapGuide(null); 1437 - }; 1438 - 1439 - const onKeyDown = (e: KeyboardEvent) => { 1440 - if (e.key === "Escape") { 1441 - cancelDrag(); 1442 - setSnapGuide(null); 1443 - } 1444 - }; 1445 - 1446 - document.addEventListener("mousemove", 
onMouseMove); 1447 - document.addEventListener("mouseup", onMouseUp); 1448 - document.addEventListener("keydown", onKeyDown); 1449 - 1450 - return () => { 1451 - document.removeEventListener("mousemove", onMouseMove); 1452 - document.removeEventListener("mouseup", onMouseUp); 1453 - document.removeEventListener("keydown", onKeyDown); 1454 - }; 1455 - }, [activeDrag, pixelToTime, timeToPixel, findSnap, updateDrag, commitDrag, cancelDrag]); 1456 - 1457 - if (!editingEnabled) return null; 1458 - 1459 - const windowDuration = windowEnd - windowStart; 1460 - 1461 - return ( 1462 - <> 1463 - {/* Snap guide line */} 1464 - {snapGuide && ( 1465 - <div 1466 - className="absolute top-0 h-full w-px bg-yellow-400/60 z-20 pointer-events-none" 1467 - style={{ left: `${snapGuide.px}px` }} 1468 - > 1469 - <span className="absolute -top-4 left-1 text-[8px] text-yellow-400 whitespace-nowrap"> 1470 - {snapGuide.label} 1471 - </span> 1472 - </div> 1473 - )} 1474 - </> 1475 - ); 1476 - } 1477 - ``` 1478 - 1479 - - [ ] **Step 2: Verify TypeScript compiles** 1480 - 1481 - Run: `cd apps/ionosphere && npx tsc --noEmit` 1482 - 1483 - - [ ] **Step 3: Commit** 1484 - 1485 - ```bash 1486 - git add apps/ionosphere/src/app/components/InteractionOverlay.tsx 1487 - git commit -m "feat(editor): interaction overlay with drag and snap guides" 1488 - ``` 1489 - 1490 - --- 1491 - 1492 - ### Task 8: Keyboard Shortcuts 1493 - 1494 - **Files:** 1495 - - Create: `apps/ionosphere/src/app/components/useEditorKeyboard.ts` 1496 - 1497 - A hook that registers keyboard shortcuts when the editor is active. 
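One way to keep a hook like this easy to test is to separate key resolution from side effects. The sketch below is hypothetical: `resolveEditorKey`, `KeyInput`, and `EditorAction` are illustrative names, and only the mode names and key bindings are taken from this plan. It lowercases the key before matching, on the assumption that some browsers report an uppercase `key` value while Shift is held.

```ts
// Hypothetical sketch: a pure key-to-action resolver for the editor shortcuts.
type EditMode = "select" | "trim" | "split" | "add";

interface KeyInput {
  key: string;
  ctrlKey?: boolean;
  metaKey?: boolean;
  shiftKey?: boolean;
}

type EditorAction =
  | { kind: "save" }
  | { kind: "undo" }
  | { kind: "redo" }
  | { kind: "setMode"; mode: EditMode }
  | null;

// Single-letter mode shortcuts, matching the toolbar labels (V, T, S, A).
const MODE_KEYS = new Map<string, EditMode>([
  ["v", "select"],
  ["t", "trim"],
  ["s", "split"],
  ["a", "add"],
]);

function resolveEditorKey(input: KeyInput): EditorAction {
  const ctrl = Boolean(input.ctrlKey || input.metaKey);
  // Normalize case so Ctrl+Shift+Z matches whether the key is "z" or "Z".
  const key = input.key.toLowerCase();

  if (ctrl) {
    if (key === "s") return { kind: "save" };
    if (key === "z") return input.shiftKey ? { kind: "redo" } : { kind: "undo" };
    return null;
  }

  const mode = MODE_KEYS.get(key);
  return mode ? { kind: "setMode", mode } : null;
}
```

A hook could call such a resolver inside its `keydown` listener and dispatch on the returned action, which leaves the mapping itself free of DOM and React dependencies.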
1498 - 1499 - - [ ] **Step 1: Create the keyboard hook** 1500 - 1501 - ```ts 1502 - // apps/ionosphere/src/app/components/useEditorKeyboard.ts 1503 - "use client"; 1504 - 1505 - import { useEffect } from "react"; 1506 - import { useTimelineEngine } from "@/lib/timeline-engine"; 1507 - import { useTimestamp } from "./TimestampProvider"; 1508 - 1509 - export function useEditorKeyboard(onSave: () => void) { 1510 - const { 1511 - editingEnabled, 1512 - toggleEditing, 1513 - mode, 1514 - setMode, 1515 - selectedTalkRkey, 1516 - effectiveTalks, 1517 - applyCorrection, 1518 - undo, 1519 - redo, 1520 - canUndo, 1521 - canRedo, 1522 - activeDrag, 1523 - cancelDrag, 1524 - selectTalk, 1525 - } = useTimelineEngine(); 1526 - 1527 - const { seekTo, currentTimeNs, paused, setPaused } = useTimestamp(); 1528 - const currentTimeSec = currentTimeNs / 1e9; 1529 - 1530 - useEffect(() => { 1531 - const onKeyDown = (e: KeyboardEvent) => { 1532 - // Ignore when typing in inputs 1533 - const tag = (e.target as HTMLElement)?.tagName; 1534 - if (tag === "INPUT" || tag === "TEXTAREA" || tag === "SELECT") return; 1535 - 1536 - const ctrl = e.ctrlKey || e.metaKey; 1537 - 1538 - // --- Playback shortcuts (always active) --- 1539 - switch (e.key) { 1540 - case " ": 1541 - e.preventDefault(); 1542 - setPaused(!paused); 1543 - return; 1544 - case "ArrowLeft": 1545 - e.preventDefault(); 1546 - seekTo((currentTimeSec - (e.shiftKey ? 0.1 : 1)) * 1e9); 1547 - return; 1548 - case "ArrowRight": 1549 - e.preventDefault(); 1550 - seekTo((currentTimeSec + (e.shiftKey ? 
0.1 : 1)) * 1e9); 1551 - return; 1552 - case "j": 1553 - case "J": 1554 - seekTo((currentTimeSec - 5) * 1e9); 1555 - return; 1556 - case "k": 1557 - case "K": 1558 - setPaused(!paused); 1559 - return; 1560 - case "l": 1561 - case "L": 1562 - seekTo((currentTimeSec + 5) * 1e9); 1563 - return; 1564 - } 1565 - 1566 - // --- Editing shortcuts (only when editing) --- 1567 - if (!editingEnabled) return; 1568 - 1569 - // Save 1570 - if (ctrl && e.key === "s") { 1571 - e.preventDefault(); 1572 - onSave(); 1573 - return; 1574 - } 1575 - 1576 - // Undo/Redo 1577 - if (ctrl && e.key === "z" && !e.shiftKey && canUndo) { 1578 - e.preventDefault(); 1579 - undo(); 1580 - return; 1581 - } 1582 - if (ctrl && e.key === "z" && e.shiftKey && canRedo) { 1583 - e.preventDefault(); 1584 - redo(); 1585 - return; 1586 - } 1587 - 1588 - // Mode switching 1589 - if (!ctrl) { 1590 - switch (e.key) { 1591 - case "v": setMode("select"); return; 1592 - case "t": setMode("trim"); return; 1593 - case "s": setMode("split"); return; 1594 - case "a": setMode("add"); return; 1595 - } 1596 - } 1597 - 1598 - // Escape 1599 - if (e.key === "Escape") { 1600 - if (activeDrag) { 1601 - cancelDrag(); 1602 - } else if (selectedTalkRkey) { 1603 - selectTalk(null); 1604 - } else { 1605 - toggleEditing(); 1606 - } 1607 - return; 1608 - } 1609 - 1610 - // Selected talk actions 1611 - if (selectedTalkRkey) { 1612 - const talk = effectiveTalks.find((t) => t.rkey === selectedTalkRkey); 1613 - if (!talk) return; 1614 - 1615 - switch (e.key) { 1616 - case "Enter": 1617 - applyCorrection( 1618 - talk.verified 1619 - ? 
{ type: "unverify_talk", talkRkey: selectedTalkRkey } 1620 - : { type: "verify_talk", talkRkey: selectedTalkRkey }, 1621 - ); 1622 - return; 1623 - case "Backspace": 1624 - case "Delete": 1625 - if (talk.verified && !confirm("Delete verified talk?")) return; 1626 - applyCorrection({ type: "remove_talk", talkRkey: selectedTalkRkey }); 1627 - return; 1628 - case "[": 1629 - e.preventDefault(); 1630 - applyCorrection({ 1631 - type: "move_boundary", 1632 - talkRkey: selectedTalkRkey, 1633 - edge: "start", 1634 - fromSeconds: talk.startSeconds, 1635 - toSeconds: talk.startSeconds - (e.shiftKey ? 0.1 : 1), 1636 - }); 1637 - return; 1638 - case "]": 1639 - e.preventDefault(); 1640 - applyCorrection({ 1641 - type: "move_boundary", 1642 - talkRkey: selectedTalkRkey, 1643 - edge: "end", 1644 - fromSeconds: talk.endSeconds ?? 0, 1645 - toSeconds: (talk.endSeconds ?? 0) + (e.shiftKey ? 0.1 : 1), 1646 - }); 1647 - return; 1648 - } 1649 - } 1650 - }; 1651 - 1652 - document.addEventListener("keydown", onKeyDown); 1653 - return () => document.removeEventListener("keydown", onKeyDown); 1654 - }, [editingEnabled, mode, selectedTalkRkey, effectiveTalks, currentTimeSec, paused, activeDrag, canUndo, canRedo, onSave, setMode, selectTalk, applyCorrection, undo, redo, cancelDrag, toggleEditing, seekTo, setPaused]); 1655 - } 1656 - ``` 1657 - 1658 - - [ ] **Step 2: Verify TypeScript compiles** 1659 - 1660 - Run: `cd apps/ionosphere && npx tsc --noEmit` 1661 - 1662 - - [ ] **Step 3: Commit** 1663 - 1664 - ```bash 1665 - git add apps/ionosphere/src/app/components/useEditorKeyboard.ts 1666 - git commit -m "feat(editor): keyboard shortcuts hook for playback and editing" 1667 - ``` 1668 - 1669 - --- 1670 - 1671 - ### Task 9: Integrate Engine into TrackViewContent 1672 - 1673 - **Files:** 1674 - - Modify: `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` 1675 - - Modify: `apps/ionosphere/src/lib/api.ts` (add corrections fetch/save) 1676 - 1677 - This is the main integration task — 
wire the engine provider, toolbar, keyboard hook, and refactored StreamTimeline into the track view. 1678 - 1679 - - [ ] **Step 1: Add corrections API helpers to api.ts** 1680 - 1681 - Find the `getTrack` function in `apps/ionosphere/src/lib/api.ts` and add after it: 1682 - 1683 - ```ts 1684 - export async function getCorrections(stream: string) { 1685 - return fetchApi<{ corrections: any[] }>(`/xrpc/tv.ionosphere.getCorrections?stream=${encodeURIComponent(stream)}`); 1686 - } 1687 - 1688 - export async function saveCorrections(stream: string, corrections: any[]) { 1689 - const res = await fetch(`${API_BASE}/xrpc/tv.ionosphere.putCorrections`, { 1690 - method: "PUT", 1691 - headers: { "Content-Type": "application/json" }, 1692 - body: JSON.stringify({ stream, corrections }), 1693 - }); 1694 - return res.json(); 1695 - } 1696 - ``` 1697 - 1698 - - [ ] **Step 2: Rewrite TrackViewContent to use the engine** 1699 - 1700 - Replace the full contents of `TrackViewContent.tsx`. Key changes: 1701 - - Wrap in `TimelineEngineProvider` 1702 - - Add the toolbar and keyboard hook 1703 - - Pass `words` data through for snap computation 1704 - - Manage viewport state in the zoom component, pass to engine 1705 - - Load/save corrections via API 1706 - 1707 - ```tsx 1708 - // apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx 1709 - "use client"; 1710 - 1711 - import { useState, useRef, useCallback, useEffect, useMemo } from "react"; 1712 - import { TimestampProvider, useTimestamp } from "@/app/components/TimestampProvider"; 1713 - import VideoPlayer from "@/app/components/VideoPlayer"; 1714 - import TranscriptView from "@/app/components/TranscriptView"; 1715 - import StreamTimeline from "@/app/components/StreamTimeline"; 1716 - import DiarizationBand from "@/app/components/DiarizationBand"; 1717 - import TimelineToolbar from "@/app/components/TimelineToolbar"; 1718 - import InteractionOverlay from "@/app/components/InteractionOverlay"; 1719 - import { useEditorKeyboard } 
from "@/app/components/useEditorKeyboard"; 1720 - import { TimelineEngineProvider, useTimelineEngine } from "@/lib/timeline-engine"; 1721 - import { getCorrections, saveCorrections } from "@/lib/api"; 1722 - import type { BaseTalk, CorrectionEntry } from "@/lib/corrections"; 1723 - 1724 - interface Talk { 1725 - rkey: string; 1726 - title: string; 1727 - speakers: string[]; 1728 - startSeconds: number; 1729 - endSeconds: number | null; 1730 - confidence: string; 1731 - } 1732 - 1733 - interface TrackData { 1734 - slug: string; 1735 - name: string; 1736 - room: string; 1737 - dayLabel: string; 1738 - streamUri: string; 1739 - durationSeconds: number; 1740 - playbackUrl: string; 1741 - talks: Talk[]; 1742 - diarization: Array<{ start: number; end: number; speaker: string }>; 1743 - transcript?: { text: string; facets: any[] }; 1744 - words?: Array<{ start: number; end: number; speaker: string }>; 1745 - } 1746 - 1747 - function formatTime(seconds: number): string { 1748 - const h = Math.floor(seconds / 3600); 1749 - const m = Math.floor((seconds % 3600) / 60); 1750 - const s = Math.floor(seconds % 60); 1751 - return `${h}:${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`; 1752 - } 1753 - 1754 - // --- Talk List (reads from engine) --- 1755 - 1756 - function TalkList() { 1757 - const { seekTo, currentTimeNs } = useTimestamp(); 1758 - const { effectiveTalks, editingEnabled, selectedTalkRkey, selectTalk } = useTimelineEngine(); 1759 - const currentTimeSec = currentTimeNs / 1e9; 1760 - const activeRef = useRef<HTMLButtonElement>(null); 1761 - 1762 - useEffect(() => { 1763 - activeRef.current?.scrollIntoView({ block: "nearest", behavior: "smooth" }); 1764 - }, [Math.floor(currentTimeSec / 60)]); 1765 - 1766 - return ( 1767 - <div className="space-y-1 p-4"> 1768 - {effectiveTalks.map((talk, i) => { 1769 - const isActive = 1770 - currentTimeSec >= talk.startSeconds && 1771 - (talk.endSeconds ? 
currentTimeSec < talk.endSeconds : i === effectiveTalks.length - 1); 1772 - const isSelected = selectedTalkRkey === talk.rkey; 1773 - 1774 - return ( 1775 - <button 1776 - key={`${talk.rkey}-${i}`} 1777 - ref={isActive ? activeRef : undefined} 1778 - onClick={() => { 1779 - if (editingEnabled) selectTalk(talk.rkey); 1780 - seekTo(talk.startSeconds * 1e9); 1781 - }} 1782 - className={`w-full text-left px-3 py-2 rounded transition-colors flex items-baseline gap-3 ${ 1783 - isSelected 1784 - ? "bg-blue-900/30 text-neutral-100 ring-1 ring-blue-500/50" 1785 - : isActive 1786 - ? "bg-neutral-800 text-neutral-100" 1787 - : "hover:bg-neutral-800/50 text-neutral-400" 1788 - }`} 1789 - > 1790 - <span className="text-xs font-mono shrink-0 w-16 text-neutral-500"> 1791 - {formatTime(talk.startSeconds)} 1792 - </span> 1793 - <span className="text-sm flex-1 truncate"> 1794 - {talk.verified && <span className="text-green-400 mr-1">&#10003;</span>} 1795 - {talk.title} 1796 - </span> 1797 - <span className="text-xs text-neutral-600 shrink-0 hidden sm:inline"> 1798 - {talk.speakers.join(", ")} 1799 - </span> 1800 - </button> 1801 - ); 1802 - })} 1803 - </div> 1804 - ); 1805 - } 1806 - 1807 - function TrackViewInner({ track, words }: { track: TrackData; words: Array<{ start: number; end: number; speaker: string }> }) { 1808 - const [activeTab, setActiveTab] = useState<"talks" | "transcript">("talks"); 1809 - const hasTranscript = !!(track.transcript?.facets?.length); 1810 - const containerRef = useRef<HTMLDivElement>(null); 1811 - const { currentTimeNs } = useTimestamp(); 1812 - const currentTimeSec = currentTimeNs / 1e9; 1813 - 1814 - // Zoom state 1815 - const [zoomLevel, setZoomLevel] = useState(1); 1816 - const [panCenter, setPanCenter] = useState<number | null>(null); 1817 - const [containerWidth, setContainerWidth] = useState(800); 1818 - 1819 - // Corrections state 1820 - const [initialCorrections, setInitialCorrections] = useState<CorrectionEntry[]>([]); 1821 - const 
[correctionsLoaded, setCorrectionsLoaded] = useState(false); 1822 - 1823 - // Load corrections on mount 1824 - useEffect(() => { 1825 - getCorrections(track.slug).then((data) => { 1826 - setInitialCorrections(data.corrections || []); 1827 - setCorrectionsLoaded(true); 1828 - }).catch(() => setCorrectionsLoaded(true)); 1829 - }, [track.slug]); 1830 - 1831 - const allSpeakers = useMemo(() => { 1832 - const seen = new Set<string>(); 1833 - const ordered: string[] = []; 1834 - for (const s of track.diarization) { 1835 - if (!seen.has(s.speaker)) { 1836 - seen.add(s.speaker); 1837 - ordered.push(s.speaker); 1838 - } 1839 - } 1840 - return ordered; 1841 - }, [track.diarization]); 1842 - 1843 - const allTalkRkeys = useMemo( 1844 - () => track.talks.map((t) => t.rkey), 1845 - [track.talks], 1846 - ); 1847 - 1848 - // Viewport 1849 - const center = panCenter ?? currentTimeSec; 1850 - const windowDuration = track.durationSeconds / zoomLevel; 1851 - const windowStart = Math.max(0, Math.min( 1852 - center - windowDuration / 2, 1853 - track.durationSeconds - windowDuration, 1854 - )); 1855 - const windowEnd = windowStart + windowDuration; 1856 - 1857 - // Measure container 1858 - useEffect(() => { 1859 - const el = containerRef.current; 1860 - if (!el) return; 1861 - const observer = new ResizeObserver((entries) => { 1862 - setContainerWidth(entries[0].contentRect.width); 1863 - }); 1864 - observer.observe(el); 1865 - return () => observer.disconnect(); 1866 - }, []); 1867 - 1868 - // Gesture handling 1869 - useEffect(() => { 1870 - const el = containerRef.current; 1871 - if (!el) return; 1872 - const onWheel = (e: WheelEvent) => { 1873 - e.preventDefault(); 1874 - if (e.ctrlKey || e.metaKey || Math.abs(e.deltaY) > Math.abs(e.deltaX)) { 1875 - const zoomDelta = e.deltaY > 0 ? 
0.8 : 1.25; 1876 - setZoomLevel((prev) => Math.max(1, Math.min(64, prev * zoomDelta))); 1877 - } 1878 - if (Math.abs(e.deltaX) > 0 || e.shiftKey) { 1879 - const panDelta = (e.deltaX || e.deltaY) * (windowDuration / 1000); 1880 - setPanCenter((prev) => { 1881 - const c = prev ?? currentTimeSec; 1882 - return Math.max(windowDuration / 2, Math.min(track.durationSeconds - windowDuration / 2, c + panDelta)); 1883 - }); 1884 - } 1885 - }; 1886 - el.addEventListener("wheel", onWheel, { passive: false }); 1887 - return () => el.removeEventListener("wheel", onWheel); 1888 - }, [windowDuration, track.durationSeconds, currentTimeSec]); 1889 - 1890 - useEffect(() => { 1891 - if (zoomLevel <= 1) setPanCenter(null); 1892 - }, [zoomLevel]); 1893 - 1894 - const visibleDiarization = track.diarization.filter( 1895 - (s) => s.start < windowEnd && s.end > windowStart, 1896 - ); 1897 - 1898 - const baseTalks: BaseTalk[] = useMemo( 1899 - () => track.talks.map((t) => ({ 1900 - rkey: t.rkey, 1901 - title: t.title, 1902 - speakers: t.speakers, 1903 - startSeconds: t.startSeconds, 1904 - endSeconds: t.endSeconds, 1905 - confidence: t.confidence, 1906 - })), 1907 - [track.talks], 1908 - ); 1909 - 1910 - // Save ref — SaveHandler sets this so the toolbar can call it 1911 - const saveRef = useRef<(() => void) | null>(null); 1912 - 1913 - if (!correctionsLoaded) { 1914 - return <div className="p-4 text-neutral-500">Loading...</div>; 1915 - } 1916 - 1917 - return ( 1918 - <TimelineEngineProvider 1919 - streamSlug={track.slug} 1920 - baseTalks={baseTalks} 1921 - words={words} 1922 - diarization={track.diarization} 1923 - initialCorrections={initialCorrections} 1924 - windowStart={windowStart} 1925 - windowEnd={windowEnd} 1926 - containerWidth={containerWidth} 1927 - > 1928 - <div className="h-full flex flex-col"> 1929 - <div className="shrink-0 px-4 pt-3 border-b border-neutral-800"> 1930 - <div className="max-w-5xl mx-auto"> 1931 - <div className="mb-2"> 1932 - <h1 className="text-lg 
font-bold">{track.name}</h1> 1933 - <p className="text-xs text-neutral-500"> 1934 - {track.room} · {track.talks.length} talks · {formatTime(track.durationSeconds)} 1935 - </p> 1936 - </div> 1937 - 1938 - <div className="mb-2 max-h-[33vh] overflow-hidden rounded-lg bg-black"> 1939 - <VideoPlayer videoUri={track.streamUri} /> 1940 - </div> 1941 - 1942 - <div className="mb-2"> 1943 - <TimelineToolbar onSave={() => saveRef.current?.()} /> 1944 - <SaveHandler streamSlug={track.slug} saveRef={saveRef} /> 1945 - </div> 1946 - 1947 - <div ref={containerRef} className="mb-2"> 1948 - <div className="flex items-center gap-2 mb-1"> 1949 - <div className="flex items-center gap-1"> 1950 - <button 1951 - onClick={() => setZoomLevel((z) => Math.max(1, z / 2))} 1952 - disabled={zoomLevel <= 1} 1953 - className="px-2 py-0.5 text-xs rounded bg-neutral-800 text-neutral-400 hover:text-neutral-200 disabled:opacity-30" 1954 - > 1955 - − 1956 - </button> 1957 - <span className="text-xs text-neutral-500 w-10 text-center"> 1958 - {zoomLevel <= 1 ? "Full" : `${zoomLevel.toFixed(zoomLevel < 2 ? 
1 : 0)}x`} 1959 - </span> 1960 - <button 1961 - onClick={() => setZoomLevel((z) => Math.min(64, z * 2))} 1962 - disabled={zoomLevel >= 64} 1963 - className="px-2 py-0.5 text-xs rounded bg-neutral-800 text-neutral-400 hover:text-neutral-200 disabled:opacity-30" 1964 - > 1965 - + 1966 - </button> 1967 - </div> 1968 - {zoomLevel > 1 && ( 1969 - <> 1970 - <span className="text-xs text-neutral-600"> 1971 - {formatTime(windowStart)} — {formatTime(windowEnd)} 1972 - </span> 1973 - <button 1974 - onClick={() => { setZoomLevel(1); setPanCenter(null); }} 1975 - className="text-xs text-neutral-600 hover:text-neutral-300 ml-auto" 1976 - > 1977 - Reset 1978 - </button> 1979 - </> 1980 - )} 1981 - </div> 1982 - 1983 - <div className="relative" data-timeline-bar> 1984 - <StreamTimeline allTalkRkeys={allTalkRkeys} /> 1985 - <InteractionOverlay /> 1986 - </div> 1987 - 1988 - {track.diarization.length > 0 && ( 1989 - <div className="mt-1"> 1990 - <DiarizationBand 1991 - segments={visibleDiarization} 1992 - allSpeakers={allSpeakers} 1993 - durationSeconds={windowDuration} 1994 - offsetSeconds={windowStart} 1995 - /> 1996 - </div> 1997 - )} 1998 - </div> 1999 - 2000 - <div className="flex gap-4"> 2001 - <button 2002 - onClick={() => setActiveTab("talks")} 2003 - className={`pb-2 text-sm font-medium border-b-2 transition-colors ${ 2004 - activeTab === "talks" 2005 - ? "border-neutral-300 text-neutral-100" 2006 - : "border-transparent text-neutral-500 hover:text-neutral-300" 2007 - }`} 2008 - > 2009 - Talks 2010 - </button> 2011 - <button 2012 - onClick={() => setActiveTab("transcript")} 2013 - disabled={!hasTranscript} 2014 - className={`pb-2 text-sm font-medium border-b-2 transition-colors ${ 2015 - activeTab === "transcript" 2016 - ? "border-neutral-300 text-neutral-100" 2017 - : "border-transparent text-neutral-500 hover:text-neutral-300" 2018 - } ${!hasTranscript ? 
"opacity-30 cursor-not-allowed" : ""}`} 2019 - > 2020 - Transcript 2021 - </button> 2022 - </div> 2023 - </div> 2024 - </div> 2025 - 2026 - <div className="flex-1 min-h-0"> 2027 - <div className="max-w-5xl mx-auto h-full"> 2028 - {activeTab === "talks" && ( 2029 - <div className="h-full overflow-y-auto"> 2030 - <TalkList /> 2031 - </div> 2032 - )} 2033 - {activeTab === "transcript" && hasTranscript && ( 2034 - <TranscriptView document={track.transcript!} /> 2035 - )} 2036 - </div> 2037 - </div> 2038 - </div> 2039 - </TimelineEngineProvider> 2040 - ); 2041 - } 2042 - 2043 - /** Inner component that has access to the engine context for saving */ 2044 - function SaveHandler({ 2045 - streamSlug, 2046 - saveRef, 2047 - }: { 2048 - streamSlug: string; 2049 - saveRef: React.MutableRefObject<(() => void) | null>; 2050 - }) { 2051 - const engine = useTimelineEngine(); 2052 - 2053 - const handleSave = useCallback(async () => { 2054 - const corrections = engine.getCorrectionsToSave(); 2055 - await saveCorrections(streamSlug, corrections); 2056 - engine.markSaved(); 2057 - }, [engine, streamSlug]); 2058 - 2059 - // Expose save to toolbar via ref 2060 - useEffect(() => { saveRef.current = handleSave; }, [handleSave, saveRef]); 2061 - 2062 - useEditorKeyboard(handleSave); 2063 - 2064 - return null; 2065 - } 2066 - 2067 - export default function TrackViewContent({ track }: { track: TrackData }) { 2068 - // Extract words from transcript for snap computation 2069 - // Words come from the server-side data (not the faceted transcript) 2070 - const words = track.words ?? []; 2071 - 2072 - return ( 2073 - <TimestampProvider> 2074 - <TrackViewInner track={track} words={words} /> 2075 - </TimestampProvider> 2076 - ); 2077 - } 2078 - ``` 2079 - 2080 - **Note:** The `words` field needs to be added to the track API response. See step 3. 
2081 - 2082 - - [ ] **Step 3: Add words to track API response** 2083 - 2084 - In `apps/ionosphere-appview/src/tracks.ts`, the `getTrackData` function returns `transcript` (pre-processed text + facets). We also need the raw word array for snap target computation. 2085 - 2086 - Add a `loadWords` function after the existing `loadTranscript` function: 2087 - 2088 - ```ts 2089 - function loadWords(dirName: string): Array<{ start: number; end: number; speaker: string }> { 2090 - const txPath = path.join(DATA_DIR, dirName, "transcript-enriched.json"); 2091 - if (!existsSync(txPath)) return []; 2092 - const data = JSON.parse(readFileSync(txPath, "utf-8")); 2093 - return (data.words || []).map((w: any) => ({ 2094 - start: w.start, 2095 - end: w.end, 2096 - speaker: w.speaker, 2097 - })); 2098 - } 2099 - ``` 2100 - 2101 - Then in `getTrackData`, add `const words = loadWords(stream.dirName);` before the return statement, and add `words` to the return object: 2102 - 2103 - Change: 2104 - ```ts 2105 - return { 2106 - slug: stream.slug, 2107 - ... 2108 - diarization, 2109 - transcript, 2110 - }; 2111 - ``` 2112 - To: 2113 - ```ts 2114 - return { 2115 - slug: stream.slug, 2116 - ... 
2117 - diarization, 2118 - transcript, 2119 - words, 2120 - }; 2121 - ``` 2122 - 2123 - - [ ] **Step 4: Verify TypeScript compiles in both apps** 2124 - 2125 - Run: `cd apps/ionosphere && npx tsc --noEmit && cd ../ionosphere-appview && npx tsc --noEmit` 2126 - 2127 - - [ ] **Step 5: Commit** 2128 - 2129 - ```bash 2130 - git add apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx \ 2131 - apps/ionosphere/src/lib/api.ts \ 2132 - apps/ionosphere-appview/src/tracks.ts 2133 - git commit -m "feat(editor): integrate timeline engine into track view" 2134 - ``` 2135 - 2136 - --- 2137 - 2138 - ## Chunk 3: Waveform Band, Speaker Naming, and Ground Truth Export 2139 - 2140 - ### Task 10: Waveform/Diarization Band 2141 - 2142 - **Files:** 2143 - - Create: `apps/ionosphere/src/app/components/WaveformBand.tsx` 2144 - 2145 - A combined waveform/diarization visualization that morphs with zoom level. At low zoom it shows speaker-colored blocks (like the current DiarizationBand). At high zoom it becomes a speaker-colored area chart where height = word density. 
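The "height = word density" idea can be sketched as a pure function, independent of React. This is only an illustration under stated assumptions: `computeBins` and its parameters are hypothetical names, words are assumed to be sorted by `start`, and each word is assigned to the single bin that contains its start time (a simplification chosen here so the pass over the words is linear).

```ts
interface Word { start: number; end: number; speaker: string }
interface Bin { startTime: number; endTime: number; wordCount: number; dominantSpeaker: string }

// Hypothetical helper: bucket words in [windowStart, windowEnd) into binCount bins.
export function computeBins(
  words: Word[],
  windowStart: number,
  windowEnd: number,
  binCount: number,
): Bin[] {
  const duration = windowEnd - windowStart;
  if (duration <= 0 || binCount <= 0) return [];
  const binDuration = duration / binCount;
  const counts: Array<Map<string, number>> = Array.from(
    { length: binCount },
    () => new Map<string, number>(),
  );

  for (const w of words) {
    if (w.start < windowStart) continue;
    if (w.start >= windowEnd) break; // relies on words being sorted by start
    const i = Math.min(binCount - 1, Math.floor((w.start - windowStart) / binDuration));
    counts[i].set(w.speaker, (counts[i].get(w.speaker) ?? 0) + 1);
  }

  return counts.map((perSpeaker, i) => {
    let dominant = "";
    let max = 0;
    let total = 0;
    for (const [speaker, n] of perSpeaker) {
      total += n;
      if (n > max) { dominant = speaker; max = n; }
    }
    return {
      startTime: windowStart + i * binDuration,
      endTime: windowStart + (i + 1) * binDuration,
      wordCount: total,
      dominantSpeaker: dominant,
    };
  });
}
```

Bar height would then be `wordCount` divided by the largest `wordCount` in the window, and the bar color would come from `dominantSpeaker`.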
2146 - 2147 - - [ ] **Step 1: Create the waveform band component** 2148 - 2149 - ```tsx 2150 - // apps/ionosphere/src/app/components/WaveformBand.tsx 2151 - "use client"; 2152 - 2153 - import { useMemo } from "react"; 2154 - import { speakerColor, buildIndexMap } from "@/lib/track-colors"; 2155 - import { useTimelineEngine } from "@/lib/timeline-engine"; 2156 - 2157 - interface WaveformBandProps { 2158 - words: Array<{ start: number; end: number; speaker: string }>; 2159 - diarization: Array<{ start: number; end: number; speaker: string }>; 2160 - allSpeakers: string[]; 2161 - zoomLevel: number; 2162 - } 2163 - 2164 - interface Bin { 2165 - startTime: number; 2166 - endTime: number; 2167 - wordCount: number; 2168 - dominantSpeaker: string; 2169 - } 2170 - 2171 - export default function WaveformBand({ 2172 - words, 2173 - diarization, 2174 - allSpeakers, 2175 - zoomLevel, 2176 - }: WaveformBandProps) { 2177 - const { windowStart, windowEnd } = useTimelineEngine(); 2178 - const windowDuration = windowEnd - windowStart; 2179 - 2180 - const colorIndex = useMemo( 2181 - () => buildIndexMap(allSpeakers), 2182 - [allSpeakers], 2183 - ); 2184 - 2185 - // At low zoom (< 4x), render as simple diarization blocks 2186 - // At high zoom (>= 4x), render as waveform bins 2187 - const useWaveform = zoomLevel >= 4; 2188 - 2189 - // Compute waveform bins from words in the visible window 2190 - const bins = useMemo(() => { 2191 - if (!useWaveform || words.length === 0) return []; 2192 - 2193 - const binCount = Math.min(400, Math.max(50, Math.round(windowDuration * 2))); 2194 - const binDuration = windowDuration / binCount; 2195 - const result: Bin[] = []; 2196 - 2197 - for (let i = 0; i < binCount; i++) { 2198 - const binStart = windowStart + i * binDuration; 2199 - const binEnd = binStart + binDuration; 2200 - const speakerCounts = new Map<string, number>(); 2201 - let count = 0; 2202 - 2203 - for (const w of words) { 2204 - if (w.end < binStart) continue; 2205 - if (w.start > 
binEnd) break; 2206 - count++; 2207 - speakerCounts.set(w.speaker, (speakerCounts.get(w.speaker) || 0) + 1); 2208 - } 2209 - 2210 - let dominant = ""; 2211 - let maxCount = 0; 2212 - for (const [spk, cnt] of speakerCounts) { 2213 - if (cnt > maxCount) { dominant = spk; maxCount = cnt; } 2214 - } 2215 - 2216 - result.push({ startTime: binStart, endTime: binEnd, wordCount: count, dominantSpeaker: dominant }); 2217 - } 2218 - 2219 - return result; 2220 - }, [words, windowStart, windowEnd, windowDuration, useWaveform]); 2221 - 2222 - const maxWordCount = useMemo( 2223 - () => Math.max(1, ...bins.map((b) => b.wordCount)), 2224 - [bins], 2225 - ); 2226 - 2227 - // Diarization blocks for low zoom 2228 - const visibleDiarization = useMemo(() => { 2229 - if (useWaveform) return []; 2230 - return diarization.filter((s) => s.end > windowStart && s.start < windowEnd); 2231 - }, [diarization, windowStart, windowEnd, useWaveform]); 2232 - 2233 - // Merge adjacent same-speaker diarization segments 2234 - const merged = useMemo(() => { 2235 - if (visibleDiarization.length === 0) return []; 2236 - const result: typeof visibleDiarization = []; 2237 - let current = { 2238 - ...visibleDiarization[0], 2239 - start: Math.max(visibleDiarization[0].start, windowStart), 2240 - end: Math.min(visibleDiarization[0].end, windowEnd), 2241 - }; 2242 - for (let i = 1; i < visibleDiarization.length; i++) { 2243 - const seg = visibleDiarization[i]; 2244 - const clipped = { 2245 - ...seg, 2246 - start: Math.max(seg.start, windowStart), 2247 - end: Math.min(seg.end, windowEnd), 2248 - }; 2249 - if (clipped.speaker === current.speaker && clipped.start - current.end < 1) { 2250 - current.end = clipped.end; 2251 - } else { 2252 - result.push(current); 2253 - current = clipped; 2254 - } 2255 - } 2256 - result.push(current); 2257 - return result; 2258 - }, [visibleDiarization, windowStart, windowEnd]); 2259 - 2260 - const bandHeight = useWaveform ? 
24 : 12; 2261 - 2262 - return ( 2263 - <div 2264 - className="relative w-full bg-neutral-900 rounded overflow-hidden border border-neutral-800" 2265 - style={{ height: `${bandHeight}px` }} 2266 - > 2267 - {useWaveform 2268 - ? bins.map((bin, i) => { 2269 - if (bin.wordCount === 0) return null; 2270 - const left = ((bin.startTime - windowStart) / windowDuration) * 100; 2271 - const width = ((bin.endTime - bin.startTime) / windowDuration) * 100; 2272 - const height = (bin.wordCount / maxWordCount) * 100; 2273 - 2274 - return ( 2275 - <div 2276 - key={i} 2277 - className="absolute bottom-0" 2278 - style={{ 2279 - left: `${left}%`, 2280 - width: `${Math.max(width, 0.2)}%`, 2281 - height: `${height}%`, 2282 - backgroundColor: bin.dominantSpeaker 2283 - ? speakerColor(bin.dominantSpeaker, colorIndex) 2284 - : "transparent", 2285 - }} 2286 - /> 2287 - ); 2288 - }) 2289 - : merged.map((seg, i) => { 2290 - const left = ((seg.start - windowStart) / windowDuration) * 100; 2291 - const width = ((seg.end - seg.start) / windowDuration) * 100; 2292 - if (width < 0.05) return null; 2293 - return ( 2294 - <div 2295 - key={i} 2296 - className="absolute top-0 h-full" 2297 - style={{ 2298 - left: `${left}%`, 2299 - width: `${width}%`, 2300 - backgroundColor: speakerColor(seg.speaker, colorIndex), 2301 - }} 2302 - title={seg.speaker} 2303 - /> 2304 - ); 2305 - })} 2306 - </div> 2307 - ); 2308 - } 2309 - ``` 2310 - 2311 - - [ ] **Step 2: Verify TypeScript compiles** 2312 - 2313 - Run: `cd apps/ionosphere && npx tsc --noEmit` 2314 - 2315 - - [ ] **Step 3: Wire WaveformBand into TrackViewContent** 2316 - 2317 - In `TrackViewContent.tsx`, replace the existing `<DiarizationBand>` usage with `<WaveformBand>`, passing the required props (`words`, `diarization`, `allSpeakers`, `zoomLevel`). 
2318 - 2319 - - [ ] **Step 4: Commit** 2320 - 2321 - ```bash 2322 - git add apps/ionosphere/src/app/components/WaveformBand.tsx \ 2323 - apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx 2324 - git commit -m "feat(editor): waveform/diarization band with zoom morphing" 2325 - ``` 2326 - 2327 - --- 2328 - 2329 - ### Task 11: Speaker Popover 2330 - 2331 - **Files:** 2332 - - Create: `apps/ionosphere/src/app/components/SpeakerPopover.tsx` 2333 - 2334 - - [ ] **Step 1: Create the speaker popover** 2335 - 2336 - ```tsx 2337 - // apps/ionosphere/src/app/components/SpeakerPopover.tsx 2338 - "use client"; 2339 - 2340 - import { useState, useRef, useEffect } from "react"; 2341 - import { useTimelineEngine } from "@/lib/timeline-engine"; 2342 - 2343 - interface SpeakerPopoverProps { 2344 - speakerId: string; 2345 - position: { x: number; y: number }; 2346 - onClose: () => void; 2347 - } 2348 - 2349 - export default function SpeakerPopover({ speakerId, position, onClose }: SpeakerPopoverProps) { 2350 - const { speakerNames, applyCorrection, effectiveTalks } = useTimelineEngine(); 2351 - const [name, setName] = useState(speakerNames.get(speakerId) || ""); 2352 - const inputRef = useRef<HTMLInputElement>(null); 2353 - 2354 - useEffect(() => { 2355 - inputRef.current?.focus(); 2356 - }, []); 2357 - 2358 - // Find talks where this speaker is dominant (rough heuristic) 2359 - const relatedTalks = effectiveTalks.filter((t) => 2360 - t.speakers.some((s) => s.toLowerCase().includes(speakerId.toLowerCase())), 2361 - ); 2362 - 2363 - const handleSubmit = () => { 2364 - if (name.trim()) { 2365 - applyCorrection({ type: "name_speaker", speakerId, name: name.trim() }); 2366 - } 2367 - onClose(); 2368 - }; 2369 - 2370 - return ( 2371 - <div 2372 - className="fixed z-50 bg-neutral-800 border border-neutral-700 rounded-lg shadow-xl p-3 w-64" 2373 - style={{ left: position.x, top: position.y }} 2374 - > 2375 - <div className="text-xs text-neutral-500 mb-2">{speakerId}</div> 2376 - 
<input 2377 - ref={inputRef} 2378 - type="text" 2379 - value={name} 2380 - onChange={(e) => setName(e.target.value)} 2381 - onKeyDown={(e) => { 2382 - if (e.key === "Enter") handleSubmit(); 2383 - if (e.key === "Escape") onClose(); 2384 - }} 2385 - placeholder="Speaker name" 2386 - className="w-full px-2 py-1 text-sm bg-neutral-900 border border-neutral-700 rounded text-neutral-200 placeholder-neutral-600 mb-2" 2387 - /> 2388 - <div className="flex gap-2"> 2389 - <button 2390 - onClick={handleSubmit} 2391 - className="px-2 py-1 text-xs bg-blue-600 text-white rounded hover:bg-blue-500" 2392 - > 2393 - Save 2394 - </button> 2395 - <button 2396 - onClick={onClose} 2397 - className="px-2 py-1 text-xs text-neutral-400 hover:text-neutral-200" 2398 - > 2399 - Cancel 2400 - </button> 2401 - </div> 2402 - {relatedTalks.length > 0 && ( 2403 - <div className="mt-2 border-t border-neutral-700 pt-2"> 2404 - <div className="text-[10px] text-neutral-500 mb-1">Appears in:</div> 2405 - {relatedTalks.slice(0, 5).map((t) => ( 2406 - <div key={t.rkey} className="text-[10px] text-neutral-400 truncate"> 2407 - {t.title} 2408 - </div> 2409 - ))} 2410 - </div> 2411 - )} 2412 - </div> 2413 - ); 2414 - } 2415 - ``` 2416 - 2417 - - [ ] **Step 2: Verify TypeScript compiles** 2418 - 2419 - Run: `cd apps/ionosphere && npx tsc --noEmit` 2420 - 2421 - - [ ] **Step 3: Commit** 2422 - 2423 - ```bash 2424 - git add apps/ionosphere/src/app/components/SpeakerPopover.tsx 2425 - git commit -m "feat(editor): speaker naming popover" 2426 - ``` 2427 - 2428 - --- 2429 - 2430 - ### Task 12: Ground Truth Export 2431 - 2432 - **Files:** 2433 - - Create: `apps/ionosphere/src/lib/ground-truth-export.ts` 2434 - - Create: `apps/ionosphere/src/lib/ground-truth-export.test.ts` 2435 - 2436 - - [ ] **Step 1: Write failing tests** 2437 - 2438 - ```ts 2439 - // apps/ionosphere/src/lib/ground-truth-export.test.ts 2440 - import { describe, it, expect } from "vitest"; 2441 - import { exportGroundTruth } from 
"./ground-truth-export"; 2442 - import type { EffectiveTalk } from "./corrections"; 2443 - 2444 - describe("exportGroundTruth", () => { 2445 - it("exports only verified talks", () => { 2446 - const talks: EffectiveTalk[] = [ 2447 - { rkey: "t1", title: "Talk 1", speakers: ["Alice"], startSeconds: 100, endSeconds: 500, confidence: "high", verified: true }, 2448 - { rkey: "t2", title: "Talk 2", speakers: ["Bob"], startSeconds: 500, endSeconds: 900, confidence: "high", verified: false }, 2449 - ]; 2450 - const result = exportGroundTruth("test-stream", talks, new Map()); 2451 - expect(result.talks).toHaveLength(1); 2452 - expect(result.talks[0].rkey).toBe("t1"); 2453 - expect(result.talks[0].verified).toBe(true); 2454 - expect(result.talks[0].ground_truth_start).toBe(100); 2455 - expect(result.talks[0].tolerance_seconds).toBe(120); 2456 - }); 2457 - 2458 - it("includes speaker name from mapping", () => { 2459 - const talks: EffectiveTalk[] = [ 2460 - { rkey: "t1", title: "Talk 1", speakers: [], startSeconds: 100, endSeconds: 500, confidence: "high", verified: true }, 2461 - ]; 2462 - const speakerNames = new Map([["SPEAKER_01", "Alice Smith"]]); 2463 - // The dominant speaker logic is at the integration level. 2464 - // For the export function, we pass speaker name directly. 
2465 - const result = exportGroundTruth("test-stream", talks, speakerNames, { t1: "SPEAKER_01" }); 2466 - expect(result.talks[0].speaker).toBe("Alice Smith"); 2467 - }); 2468 - 2469 - it("returns empty string for unnamed speaker", () => { 2470 - const talks: EffectiveTalk[] = [ 2471 - { rkey: "t1", title: "Talk 1", speakers: [], startSeconds: 100, endSeconds: 500, confidence: "high", verified: true }, 2472 - ]; 2473 - const result = exportGroundTruth("test-stream", talks, new Map(), { t1: "SPEAKER_99" }); 2474 - expect(result.talks[0].speaker).toBe(""); 2475 - }); 2476 - }); 2477 - ``` 2478 - 2479 - - [ ] **Step 2: Run tests to verify they fail** 2480 - 2481 - Run: `cd apps/ionosphere && npx vitest run src/lib/ground-truth-export.test.ts` 2482 - Expected: FAIL 2483 - 2484 - - [ ] **Step 3: Implement ground truth export** 2485 - 2486 - ```ts 2487 - // apps/ionosphere/src/lib/ground-truth-export.ts 2488 - import type { EffectiveTalk } from "./corrections"; 2489 - 2490 - interface GroundTruthTalk { 2491 - rkey: string; 2492 - title: string; 2493 - speaker: string; 2494 - ground_truth_start: number; 2495 - tolerance_seconds: number; 2496 - verified: boolean; 2497 - notes: string; 2498 - } 2499 - 2500 - interface GroundTruthExport { 2501 - stream: string; 2502 - talks: GroundTruthTalk[]; 2503 - } 2504 - 2505 - export function exportGroundTruth( 2506 - streamSlug: string, 2507 - talks: EffectiveTalk[], 2508 - speakerNames: Map<string, string>, 2509 - dominantSpeakers?: Record<string, string>, // rkey -> speakerId 2510 - ): GroundTruthExport { 2511 - const verified = talks.filter((t) => t.verified); 2512 - 2513 - return { 2514 - stream: streamSlug, 2515 - talks: verified.map((t) => { 2516 - const speakerId = dominantSpeakers?.[t.rkey]; 2517 - const speaker = speakerId ? 
(speakerNames.get(speakerId) || "") : ""; 2518 - 2519 - return { 2520 - rkey: t.rkey, 2521 - title: t.title, 2522 - speaker, 2523 - ground_truth_start: t.startSeconds, 2524 - tolerance_seconds: 120, 2525 - verified: true, 2526 - notes: `Verified via alignment editor. Confidence: ${t.confidence}.`, 2527 - }; 2528 - }), 2529 - }; 2530 - } 2531 - ``` 2532 - 2533 - - [ ] **Step 4: Run tests to verify they pass** 2534 - 2535 - Run: `cd apps/ionosphere && npx vitest run src/lib/ground-truth-export.test.ts` 2536 - Expected: All 3 tests PASS 2537 - 2538 - - [ ] **Step 5: Commit** 2539 - 2540 - ```bash 2541 - git add apps/ionosphere/src/lib/ground-truth-export.ts apps/ionosphere/src/lib/ground-truth-export.test.ts 2542 - git commit -m "feat(editor): ground truth export from verified talks" 2543 - ``` 2544 - 2545 - --- 2546 - 2547 - ### Task 13: Add Mode (Drag-to-Create) and SpeakerPopover Integration 2548 - 2549 - **Files:** 2550 - - Modify: `apps/ionosphere/src/app/components/InteractionOverlay.tsx` 2551 - - Modify: `apps/ionosphere/src/app/components/WaveformBand.tsx` 2552 - - Modify: `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` 2553 - 2554 - This task implements the missing Add mode interaction and wires the SpeakerPopover into the waveform band. 
2555 - 2556 - - [ ] **Step 1: Add drag-to-create to InteractionOverlay** 2557 - 2558 - Add state for Add mode drag (separate from trim drag): 2559 - 2560 - ```tsx 2561 - // In InteractionOverlay, add: 2562 - const [addDrag, setAddDrag] = useState<{ startTime: number; currentTime: number } | null>(null); 2563 - 2564 - // Add mode: mousedown on empty gap starts a drag to define new segment 2565 - useEffect(() => { 2566 - if (!editingEnabled || mode !== "add") return; 2567 - 2568 - const timeline = document.querySelector("[data-timeline-bar]") as HTMLElement; 2569 - if (!timeline) return; 2570 - 2571 - const onMouseDown = (e: MouseEvent) => { 2572 - const rect = timeline.getBoundingClientRect(); 2573 - const px = e.clientX - rect.left; 2574 - const time = pixelToTime(px); 2575 - setAddDrag({ startTime: time, currentTime: time }); 2576 - }; 2577 - 2578 - const onMouseMove = (e: MouseEvent) => { 2579 - if (!addDrag) return; 2580 - const rect = timeline.getBoundingClientRect(); 2581 - const px = e.clientX - rect.left; 2582 - setAddDrag((prev) => prev ? 
{ ...prev, currentTime: pixelToTime(px) } : null); 2583 - }; 2584 - 2585 - const onMouseUp = () => { 2586 - if (!addDrag) return; 2587 - const start = Math.min(addDrag.startTime, addDrag.currentTime); 2588 - const end = Math.max(addDrag.startTime, addDrag.currentTime); 2589 - if (end - start > 5) { // Minimum 5 second segment 2590 - const rkey = crypto.randomUUID().slice(0, 8); 2591 - applyCorrection({ type: "add_talk", rkey, title: "Untitled", startSeconds: start, endSeconds: end }); 2592 - } 2593 - setAddDrag(null); 2594 - }; 2595 - 2596 - timeline.addEventListener("mousedown", onMouseDown); 2597 - document.addEventListener("mousemove", onMouseMove); 2598 - document.addEventListener("mouseup", onMouseUp); 2599 - return () => { 2600 - timeline.removeEventListener("mousedown", onMouseDown); 2601 - document.removeEventListener("mousemove", onMouseMove); 2602 - document.removeEventListener("mouseup", onMouseUp); 2603 - }; 2604 - }, [editingEnabled, mode, addDrag, pixelToTime, applyCorrection]); 2605 - ``` 2606 - 2607 - Also render the add-drag preview rectangle: 2608 - ```tsx 2609 - {addDrag && ( 2610 - <div 2611 - className="absolute top-0 h-full bg-blue-500/20 border border-blue-500/40 z-20 pointer-events-none" 2612 - style={{ 2613 - left: `${timeToPixel(Math.min(addDrag.startTime, addDrag.currentTime))}px`, 2614 - width: `${Math.abs(timeToPixel(addDrag.currentTime) - timeToPixel(addDrag.startTime))}px`, 2615 - }} 2616 - /> 2617 - )} 2618 - ``` 2619 - 2620 - Add `applyCorrection` and `mode` to the destructured engine values. 
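The mouseup logic in Step 1 normalizes the drag direction and rejects drags of five seconds or less. That rule could be pulled into a small pure helper so it can be unit tested without a DOM. A minimal sketch; `dragToSegment` and the two interfaces are hypothetical names, not part of the plan:

```ts
interface DragState { startTime: number; currentTime: number }
interface NewSegment { startSeconds: number; endSeconds: number }

// Hypothetical helper: convert a drag (in either direction) into a segment,
// or null when the drag is too short to count as a new talk.
export function dragToSegment(drag: DragState, minSeconds = 5): NewSegment | null {
  const startSeconds = Math.min(drag.startTime, drag.currentTime);
  const endSeconds = Math.max(drag.startTime, drag.currentTime);
  // Same strict comparison as the handler above: exactly minSeconds is rejected.
  return endSeconds - startSeconds > minSeconds ? { startSeconds, endSeconds } : null;
}
```

With a helper like this, `onMouseUp` would only need to call it and, on a non-null result, pass the bounds to `applyCorrection`.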
2621 - 2622 - - [ ] **Step 2: Add click handler for SpeakerPopover to WaveformBand** 2623 - 2624 - In `WaveformBand.tsx`, add a click handler that identifies the speaker at the clicked position and opens the popover: 2625 - 2626 - ```tsx 2627 - // Add to WaveformBand props: 2628 - onSpeakerClick?: (speakerId: string, position: { x: number; y: number }) => void; 2629 - 2630 - // Add to the band container div: 2631 - onClick={(e) => { 2632 - if (!onSpeakerClick) return; 2633 - const rect = e.currentTarget.getBoundingClientRect(); 2634 - const fraction = (e.clientX - rect.left) / rect.width; 2635 - const time = windowStart + fraction * windowDuration; 2636 - // Find speaker at this time from diarization 2637 - const seg = diarization.find((s) => s.start <= time && s.end >= time); 2638 - if (seg) { 2639 - onSpeakerClick(seg.speaker, { x: e.clientX, y: e.clientY }); 2640 - } 2641 - }} 2642 - ``` 2643 - 2644 - - [ ] **Step 3: Wire SpeakerPopover into TrackViewContent** 2645 - 2646 - In `TrackViewInner`, add state for the popover and render it: 2647 - 2648 - ```tsx 2649 - import SpeakerPopover from "@/app/components/SpeakerPopover"; 2650 - 2651 - // In TrackViewInner: 2652 - const [speakerPopover, setSpeakerPopover] = useState<{ speakerId: string; position: { x: number; y: number } } | null>(null); 2653 - 2654 - // In the WaveformBand usage: 2655 - <WaveformBand 2656 - words={words} 2657 - diarization={track.diarization} 2658 - allSpeakers={allSpeakers} 2659 - zoomLevel={zoomLevel} 2660 - onSpeakerClick={(speakerId, position) => setSpeakerPopover({ speakerId, position })} 2661 - /> 2662 - 2663 - // After the timeline area, render the popover: 2664 - {speakerPopover && ( 2665 - <SpeakerPopover 2666 - speakerId={speakerPopover.speakerId} 2667 - position={speakerPopover.position} 2668 - onClose={() => setSpeakerPopover(null)} 2669 - /> 2670 - )} 2671 - ``` 2672 - 2673 - - [ ] **Step 4: Replace DiarizationBand with WaveformBand in TrackViewContent** 2674 - 2675 - In the 
template, replace the `<DiarizationBand>` section: 2676 - 2677 - Old: 2678 - ```tsx 2679 - {track.diarization.length > 0 && ( 2680 - <div className="mt-1"> 2681 - <DiarizationBand 2682 - segments={visibleDiarization} 2683 - allSpeakers={allSpeakers} 2684 - durationSeconds={windowDuration} 2685 - offsetSeconds={windowStart} 2686 - /> 2687 - </div> 2688 - )} 2689 - ``` 2690 - 2691 - New: 2692 - ```tsx 2693 - {track.diarization.length > 0 && ( 2694 - <div className="mt-1"> 2695 - <WaveformBand 2696 - words={words} 2697 - diarization={track.diarization} 2698 - allSpeakers={allSpeakers} 2699 - zoomLevel={zoomLevel} 2700 - onSpeakerClick={(speakerId, position) => setSpeakerPopover({ speakerId, position })} 2701 - /> 2702 - </div> 2703 - )} 2704 - ``` 2705 - 2706 - Update imports: remove `DiarizationBand`, add `WaveformBand` and `SpeakerPopover`. 2707 - 2708 - - [ ] **Step 5: Verify TypeScript compiles and tests pass** 2709 - 2710 - Run: `cd apps/ionosphere && npx tsc --noEmit && npx vitest run` 2711 - Expected: All tests pass, no type errors 2712 - 2713 - - [ ] **Step 6: Commit** 2714 - 2715 - ```bash 2716 - git add apps/ionosphere/src/app/components/InteractionOverlay.tsx \ 2717 - apps/ionosphere/src/app/components/WaveformBand.tsx \ 2718 - apps/ionosphere/src/app/components/SpeakerPopover.tsx \ 2719 - apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx 2720 - git commit -m "feat(editor): add mode drag-to-create, speaker popover integration, waveform band" 2721 - ``` 2722 - 2723 - --- 2724 - 2725 - ### Task 14: Manual Smoke Test 2726 - 2727 - This is not automated — verify the full flow works in the browser. 2728 - 2729 - - [ ] **Step 1: Start the dev servers** 2730 - 2731 - Run: `cd apps/ionosphere-appview && npm run dev` (in one terminal) 2732 - Run: `cd apps/ionosphere && npm run dev` (in another terminal) 2733 - 2734 - - [ ] **Step 2: Verify read-only mode still works** 2735 - 2736 - Navigate to `http://localhost:3002/tracks/great-hall-day-1`. 
Verify: 2737 - - Video player loads 2738 - - Timeline shows talk segments with colors 2739 - - Waveform/diarization band shows at bottom 2740 - - Click to seek works 2741 - - Zoom/pan works 2742 - - Talk list and transcript tabs work 2743 - 2744 - - [ ] **Step 3: Test edit mode** 2745 - 2746 - Click "Edit" button. Verify: 2747 - - Toolbar appears with mode buttons 2748 - - Select mode: clicking talks highlights them, talk list highlights selected 2749 - - Trim mode: boundary handles appear on hover, drag works with snap guides 2750 - - Split mode: clicking on a talk splits it 2751 - - Add mode: click-drag on an empty gap creates a new segment 2752 - - Delete: select a talk, press Backspace to delete 2753 - - Keyboard shortcuts (Space, arrows, J/K/L, V/T/S/A) 2754 - - Undo/redo (Ctrl+Z / Ctrl+Shift+Z) 2755 - - Boundary nudging ([ and ] with selected talk) 2756 - - Playhead nudging (arrow keys, shift+arrows) 2757 - 2758 - - [ ] **Step 4: Test save/load cycle** 2759 - 2760 - Make an edit, press Ctrl+S (or click Save). Reload the page. Verify the edit persists. 2761 - 2762 - - [ ] **Step 5: Test verification** 2763 - 2764 - Select a talk, press Enter to verify. Verify checkmark appears on timeline and talk list. Check the progress counter updates. 2765 - 2766 - - [ ] **Step 6: Test waveform morphing** 2767 - 2768 - Zoom in past 4x. Verify the diarization band transitions from flat speaker-colored blocks to a height-varying waveform. 2769 - 2770 - - [ ] **Step 7: Test speaker naming** 2771 - 2772 - Click on a diarization segment. Verify the popover appears with the speaker ID and text input. Type a name, click Save. Verify the name appears in tooltips.
-267
docs/superpowers/plans/2026-04-05-track-timeline-view.md
···
1 - # Track Timeline View Implementation Plan
2 -
3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking.
4 -
5 - **Goal:** Build a browsable full-day stream view with video, talk segments on a timeline, speaker diarization bands, and synced transcript.
6 -
7 - **Architecture:** New `/tracks` routes in the Next.js frontend, new `getTrack` API endpoint in the Hono appview serving stream metadata + talk segments + diarization from existing data. Timeline and diarization are new client components; video player and transcript view are reused.
8 -
9 - **Tech Stack:** Next.js (App Router), React, Hono, SQLite, existing HLS player
10 -
11 - ---
12 -
13 - ## File Structure
14 -
15 - ### New files
16 -
17 - | File | Responsibility |
18 - |------|---------------|
19 - | `apps/ionosphere-appview/src/tracks.ts` | Track data: stream config, diarization loading, getTrack handler |
20 - | `apps/ionosphere/src/app/tracks/page.tsx` | `/tracks` index page (server component) |
21 - | `apps/ionosphere/src/app/tracks/[stream]/page.tsx` | `/tracks/[stream]` detail page (server component) |
22 - | `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` | Client component orchestrating player + timeline + transcript |
23 - | `apps/ionosphere/src/app/components/StreamTimeline.tsx` | Horizontal timeline with talk segments + scrubber |
24 - | `apps/ionosphere/src/app/components/DiarizationBand.tsx` | Colored speaker band |
25 -
26 - ### Modified files
27 -
28 - | File | Change |
29 - |------|--------|
30 - | `apps/ionosphere-appview/src/routes.ts` | Register `getTrack` and `getTracks` endpoints |
31 - | `apps/ionosphere/src/lib/api.ts` | Add `getTracks()` and `getTrack(stream)` functions |
32 - | `apps/ionosphere/src/app/components/NavHeader.tsx` | Add "Tracks" link to nav |
33 -
34 - ---
35 -
36 - ## Chunk 1: API Endpoint
37 - 38 - ### Task 1: Track data module 39 - 40 - **Files:** 41 - - Create: `apps/ionosphere-appview/src/tracks.ts` 42 - 43 - - [ ] **Step 1: Create tracks.ts with stream config and getTrack handler** 44 - 45 - This module: 46 - - Defines the 7 stream configs (slug, name, room, day, URI) 47 - - Loads diarization JSON from disk 48 - - Queries talks from DB filtered by room/day 49 - - Returns combined track data 50 - 51 - ```typescript 52 - // tracks.ts structure: 53 - // - STREAMS array with slug, name, room, day, uri 54 - // - getTrackData(db, slug) -> { stream metadata, talks with offsets, diarization } 55 - // - getTracksIndex(db) -> list of streams with talk counts 56 - ``` 57 - 58 - Key details: 59 - - Stream slugs: `great-hall-day-1`, `great-hall-day-2`, etc. 60 - - Diarization loaded from `data/fullday/<DirName>/diarization.json` 61 - - Talk offsets read from `video_segments` JSON field on talk records 62 - - Filter to only talks that have a fullday segment matching this stream URI 63 - - PDT timezone conversion: `datetime(starts_at, '-7 hours')` for date filtering 64 - 65 - - [ ] **Step 2: Register endpoints in routes.ts** 66 - 67 - Add to routes.ts: 68 - - `GET /xrpc/tv.ionosphere.getTracks` — returns `{ tracks: [...] 
}` with slug, name, room, day, duration, talkCount 69 - - `GET /xrpc/tv.ionosphere.getTrack?stream=<slug>` — returns full track data 70 - 71 - - [ ] **Step 3: Test the endpoint** 72 - 73 - Run: `curl -s http://localhost:9401/xrpc/tv.ionosphere.getTracks | python3 -m json.tool | head -20` 74 - Run: `curl -s "http://localhost:9401/xrpc/tv.ionosphere.getTrack?stream=great-hall-day-1" | python3 -m json.tool | head -40` 75 - 76 - - [ ] **Step 4: Commit** 77 - 78 - ```bash 79 - git add apps/ionosphere-appview/src/tracks.ts apps/ionosphere-appview/src/routes.ts 80 - git commit -m "feat: getTrack and getTracks API endpoints" 81 - ``` 82 - 83 - --- 84 - 85 - ### Task 2: API client functions 86 - 87 - **Files:** 88 - - Modify: `apps/ionosphere/src/lib/api.ts` 89 - 90 - - [ ] **Step 1: Add getTracks and getTrack to api.ts** 91 - 92 - ```typescript 93 - export async function getTracks() { 94 - return fetchApi<{ tracks: any[] }>("/xrpc/tv.ionosphere.getTracks"); 95 - } 96 - 97 - export async function getTrack(stream: string) { 98 - return fetchApi<any>(`/xrpc/tv.ionosphere.getTrack?stream=${encodeURIComponent(stream)}`); 99 - } 100 - ``` 101 - 102 - - [ ] **Step 2: Commit** 103 - 104 - ```bash 105 - git add apps/ionosphere/src/lib/api.ts 106 - git commit -m "feat: add getTracks/getTrack API client functions" 107 - ``` 108 - 109 - --- 110 - 111 - ## Chunk 2: Track Pages 112 - 113 - ### Task 3: Tracks index page 114 - 115 - **Files:** 116 - - Create: `apps/ionosphere/src/app/tracks/page.tsx` 117 - - Modify: `apps/ionosphere/src/app/components/NavHeader.tsx` 118 - 119 - - [ ] **Step 1: Create /tracks index page** 120 - 121 - Server component. Calls `getTracks()`, renders a list of streams grouped by day. Each links to `/tracks/[slug]`. Shows room name, day, duration, talk count. 122 - 123 - Follow the pattern from `apps/ionosphere/src/app/talks/page.tsx`. 
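The "grouped by day" step of the index page can be sketched as a pure function that the server component calls on the `getTracks()` result. This is a sketch under assumptions: the `TrackSummary` type and `groupByDay` are hypothetical names, the fields follow the list given for `getTracks` (slug, name, room, day, duration, talkCount), and `day` is assumed to be a string.

```typescript
// Hypothetical shape of one entry in the getTracks response.
interface TrackSummary {
  slug: string;
  name: string;
  room: string;
  day: string; // assumed to be a string such as "1" or "2"
  duration: number;
  talkCount: number;
}

// Group tracks by day, keeping the original order within each day.
export function groupByDay(
  tracks: TrackSummary[],
): Array<{ day: string; tracks: TrackSummary[] }> {
  const groups = new Map<string, TrackSummary[]>();
  for (const t of tracks) {
    const list = groups.get(t.day);
    if (list) list.push(t);
    else groups.set(t.day, [t]);
  }
  return Array.from(groups.entries())
    // numeric: true makes "2" sort before "10"
    .sort(([a], [b]) => a.localeCompare(b, undefined, { numeric: true }))
    .map(([day, list]) => ({ day, tracks: list }));
}
```

The page would then render one heading per group and one link to `/tracks/[slug]` per track.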
124 - 125 - - [ ] **Step 2: Add "Tracks" to NavHeader** 126 - 127 - Add a link alongside the existing Talks, Speakers, Concepts links. 128 - 129 - - [ ] **Step 3: Test** 130 - 131 - Open `http://127.0.0.1:9402/tracks` — should show 7 streams. 132 - 133 - - [ ] **Step 4: Commit** 134 - 135 - ```bash 136 - git add apps/ionosphere/src/app/tracks/page.tsx apps/ionosphere/src/app/components/NavHeader.tsx 137 - git commit -m "feat: tracks index page with nav link" 138 - ``` 139 - 140 - --- 141 - 142 - ### Task 4: Track detail page shell 143 - 144 - **Files:** 145 - - Create: `apps/ionosphere/src/app/tracks/[stream]/page.tsx` 146 - - Create: `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` 147 - 148 - - [ ] **Step 1: Create the server page** 149 - 150 - `page.tsx`: Server component that calls `getTrack(params.stream)`, passes data to `TrackViewContent`. 151 - 152 - - [ ] **Step 2: Create TrackViewContent client component** 153 - 154 - Initial version: video player at top, talk list below with jump-to buttons. Wire up `TimestampProvider` so seeking works. 155 - 156 - The video player uses the stream URI directly (no offset — we're playing the whole stream). Talk list items call `onSeek` with the talk's start time in nanoseconds. 157 - 158 - - [ ] **Step 3: Test** 159 - 160 - Open `http://127.0.0.1:9402/tracks/great-hall-day-1` — should show video + talk list. Clicking a talk should seek the video. 161 - 162 - - [ ] **Step 4: Commit** 163 - 164 - ```bash 165 - git add apps/ionosphere/src/app/tracks/\[stream\]/ 166 - git commit -m "feat: track detail page with video player and talk list" 167 - ``` 168 - 169 - --- 170 - 171 - ## Chunk 3: Timeline + Diarization 172 - 173 - ### Task 5: StreamTimeline component 174 - 175 - **Files:** 176 - - Create: `apps/ionosphere/src/app/components/StreamTimeline.tsx` 177 - 178 - - [ ] **Step 1: Build the timeline** 179 - 180 - Client component. 
Props: `talks` (with start/end seconds), `durationSeconds`, `currentTimeNs`, `onSeek`. 181 - 182 - Renders: 183 - - Horizontal bar (full width) representing the stream duration 184 - - Talk segments as colored blocks with labels (truncated to fit) 185 - - Vertical scrubber line at current playback position 186 - - Click anywhere to seek 187 - 188 - Use CSS for layout — talk blocks are absolutely positioned with `left` and `width` as percentages of duration. 189 - 190 - - [ ] **Step 2: Wire into TrackViewContent** 191 - 192 - Add `StreamTimeline` between the video player and the talk list. Pass current time from `TimestampProvider` and talks from API data. 193 - 194 - - [ ] **Step 3: Test** 195 - 196 - Timeline should show colored blocks for each talk. Scrubber should move with video playback. Clicking should seek. 197 - 198 - - [ ] **Step 4: Commit** 199 - 200 - ```bash 201 - git add apps/ionosphere/src/app/components/StreamTimeline.tsx apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx 202 - git commit -m "feat: stream timeline with talk segments and scrubber" 203 - ``` 204 - 205 - --- 206 - 207 - ### Task 6: DiarizationBand component 208 - 209 - **Files:** 210 - - Create: `apps/ionosphere/src/app/components/DiarizationBand.tsx` 211 - 212 - - [ ] **Step 1: Build the diarization band** 213 - 214 - Client component. Props: `segments` (from diarization JSON), `durationSeconds`. 215 - 216 - Renders a thin horizontal bar with colored blocks for each speaker. Speaker → color mapping generated deterministically from speaker ID (hash to hue). Adjacent segments from the same speaker merged for performance. 217 - 218 - - [ ] **Step 2: Wire into TrackViewContent** 219 - 220 - Add below the `StreamTimeline`. 221 - 222 - - [ ] **Step 3: Test** 223 - 224 - Colored bands should appear below the timeline. Different speakers should have different colors. 
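Step 1 of this task describes two pieces of logic: a deterministic speaker-to-color mapping (hash to hue) and merging adjacent segments from the same speaker. Both can be sketched as pure functions. The names `speakerHue`, `speakerHsl`, and `mergeAdjacent` are hypothetical, the multiply-by-31 string hash is one simple choice rather than the project's actual hash, and the one-second gap threshold and sorted-by-start input are assumptions.

```typescript
interface DiarSegment { start: number; end: number; speaker: string }

// Deterministic hue in [0, 360) derived from the speaker ID.
export function speakerHue(speakerId: string): number {
  let h = 0;
  for (let i = 0; i < speakerId.length; i++) {
    h = (h * 31 + speakerId.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return h % 360;
}

export function speakerHsl(speakerId: string): string {
  return `hsl(${speakerHue(speakerId)}, 60%, 50%)`;
}

// Merge consecutive segments of the same speaker when the gap between
// them is below maxGapSeconds. Input is assumed sorted by start time.
export function mergeAdjacent(segments: DiarSegment[], maxGapSeconds = 1): DiarSegment[] {
  const out: DiarSegment[] = [];
  for (const seg of segments) {
    const last = out[out.length - 1];
    if (last && last.speaker === seg.speaker && seg.start - last.end < maxGapSeconds) {
      last.end = Math.max(last.end, seg.end);
    } else {
      out.push({ ...seg }); // copy so the caller's array is not mutated
    }
  }
  return out;
}
```

Merging before rendering reduces the number of DOM nodes, and the same speaker ID always maps to the same color across page loads.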
225 - 226 - - [ ] **Step 4: Commit** 227 - 228 - ```bash 229 - git add apps/ionosphere/src/app/components/DiarizationBand.tsx apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx 230 - git commit -m "feat: speaker diarization band on track timeline" 231 - ``` 232 - 233 - --- 234 - 235 - ## Chunk 4: Transcript Integration 236 - 237 - ### Task 7: Track transcript in the detail view 238 - 239 - **Files:** 240 - - Modify: `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` 241 - 242 - - [ ] **Step 1: Add transcript display** 243 - 244 - The API endpoint should include the track transcript data (or a reference to it). Add the transcript below the timeline/talk list, using the existing `TranscriptView` component or a simplified version. 245 - 246 - The track transcript is the full `transcript-enriched.json` content. For the API, serve the word-level data with timestamps so the existing transcript sync works. The transcript is large (~50-65K words) so consider pagination or virtualized rendering. 247 - 248 - Initial approach: serve transcript words grouped into chunks (~500 words each) and render the chunk containing the current playback position + surrounding chunks. 249 - 250 - - [ ] **Step 2: Test** 251 - 252 - Transcript should auto-scroll to current playback position. Talk boundary markers should be visible. 253 - 254 - - [ ] **Step 3: Commit** 255 - 256 - ```bash 257 - git add apps/ionosphere/src/app/tracks/\[stream\]/TrackViewContent.tsx 258 - git commit -m "feat: synced transcript display in track view" 259 - ``` 260 - 261 - --- 262 - 263 - ## Notes 264 - 265 - - The transcript integration (Task 7) is the most complex piece due to the size of full-day transcripts. A simple initial approach (render a window around the current time) is fine — optimize later. 266 - - The existing `TranscriptView` expects a `TranscriptDocument` format. The track transcript is in a different format (raw words with timestamps). 
Either adapt TranscriptView or build a simpler `TrackTranscript` component that just renders timestamped text. 267 - - Diarization data is ~3000 segments per stream. Rendering all of them as DOM elements works fine — it's just colored divs.
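The windowed approach described in the notes above (chunks of roughly 500 words, rendering only the chunk at the playback position plus its neighbours) could be sketched roughly as follows. This is a minimal illustration, not the plan's actual implementation: `TimedWord`, `WordChunk`, `chunkWords`, and `visibleChunks` are hypothetical names, and times are assumed to be in seconds, so a `currentTimeNs` value from `TimestampProvider` would need dividing by 1e9 first.

```typescript
// Hypothetical shapes; the real track transcript format may differ.
interface TimedWord {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

interface WordChunk {
  index: number;
  startS: number;
  endS: number;
  words: TimedWord[];
}

/** Split a word list into fixed-size chunks (assumed default: 500 words). */
function chunkWords(words: TimedWord[], chunkSize = 500): WordChunk[] {
  const chunks: WordChunk[] = [];
  for (let i = 0; i < words.length; i += chunkSize) {
    const slice = words.slice(i, i + chunkSize);
    chunks.push({
      index: chunks.length,
      startS: slice[0].start,
      endS: slice[slice.length - 1].end,
      words: slice,
    });
  }
  return chunks;
}

/**
 * Return the chunk containing the playback position plus `radius`
 * neighbours on each side. The active chunk is the last one whose
 * start time is at or before the current time.
 */
function visibleChunks(
  chunks: WordChunk[],
  currentTimeS: number,
  radius = 1,
): WordChunk[] {
  if (chunks.length === 0) return [];
  let active = 0;
  for (let i = 0; i < chunks.length; i++) {
    if (chunks[i].startS <= currentTimeS) active = i;
    else break;
  }
  const from = Math.max(0, active - radius);
  const to = Math.min(chunks.length, active + radius + 1);
  return chunks.slice(from, to);
}
```

With a radius of 1, at most three chunks (about 1,500 words) are in the DOM at any time, which keeps rendering cheap without requiring a virtualization library.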
-733
docs/superpowers/plans/2026-04-12-boundary-detection-v7.md
··· 1 - # Boundary Detection v7 Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Replace transcript-gap-based boundary detection with a diarization-first pipeline that avoids hallucination zones and cross-track errors. 6 - 7 - **Architecture:** Three-stage pipeline — diarization segmentation builds talk-shaped segments from audio, transcript content matching identifies which talk each segment contains, schedule reconciliation fills gaps and outputs v6-compatible JSON. Reuses `phonetic.ts` and `db.ts` from existing codebase. 8 - 9 - **Tech Stack:** TypeScript, better-sqlite3 (existing DB), vitest (existing test runner) 10 - 11 - **Spec:** `docs/superpowers/specs/2026-04-12-boundary-detection-v7-design.md` 12 - **Verification notes:** `docs/alignment-verification-notes.md` 13 - 14 - --- 15 - 16 - ## File Structure 17 - 18 - | File | Responsibility | 19 - |------|---------------| 20 - | `src/detect-boundaries-v7.ts` | CLI entry point, orchestrates pipeline | 21 - | `src/v7/diarization-segmenter.ts` | Stage 1: parse diarization, build talk segments, detect hallucination | 22 - | `src/v7/transcript-matcher.ts` | Stage 2: extract identity signals, match segments to schedule | 23 - | `src/v7/schedule-reconciler.ts` | Stage 3: resolve conflicts, fill gaps, compute end times | 24 - | `src/v7/hallucination-detector.ts` | Detect known hallucination patterns in transcript | 25 - | `src/v7/types.ts` | Shared interfaces: TalkSegment, BoundaryMatch, HallucinationZone, etc. 
| 26 - | `src/v7/__tests__/diarization-segmenter.test.ts` | Tests for Stage 1 | 27 - | `src/v7/__tests__/transcript-matcher.test.ts` | Tests for Stage 2 | 28 - | `src/v7/__tests__/schedule-reconciler.test.ts` | Tests for Stage 3 | 29 - | `src/v7/__tests__/hallucination-detector.test.ts` | Tests for hallucination detection | 30 - 31 - Reused from existing code: 32 - - `src/phonetic.ts` — fuzzy speaker name matching 33 - - `src/db.ts` — SQLite access for schedule data 34 - - `src/tracks.ts` — `STREAM_MATCH` config, `STREAMS` config, `DAY_DATES` 35 - 36 - --- 37 - 38 - ## Chunk 1: Types + Hallucination Detection 39 - 40 - ### Task 1: Shared Types 41 - 42 - **Files:** 43 - - Create: `src/v7/types.ts` 44 - 45 - - [ ] **Step 1: Create type definitions** 46 - 47 - ```ts 48 - // src/v7/types.ts 49 - 50 - export interface DiarizationInput { 51 - speakers: string[]; 52 - segments: { start: number; end: number; speaker: string }[]; 53 - total_segments: number; 54 - } 55 - 56 - export interface TranscriptInput { 57 - stream: string; 58 - duration_seconds: number; 59 - words: { word: string; start: number; end: number; speaker?: string; confidence?: number }[]; 60 - } 61 - 62 - export interface TalkSegment { 63 - startS: number; 64 - endS: number; 65 - speakers: { id: string; durationS: number }[]; 66 - type: 'single-speaker' | 'panel' | 'unknown'; 67 - dominantSpeaker?: string; 68 - precedingGapS: number; 69 - hallucinationZone: boolean; 70 - } 71 - 72 - export interface HallucinationZone { 73 - startS: number; 74 - endS: number; 75 - pattern: string; 76 - } 77 - 78 - export interface ScheduleTalk { 79 - rkey: string; 80 - title: string; 81 - starts_at: string; 82 - ends_at: string; 83 - speaker_names: string; 84 - } 85 - 86 - export interface BoundaryMatch { 87 - rkey: string; 88 - title: string; 89 - startTimestamp: number; // seconds — v6 compat field name 90 - endTimestamp: number | null; 91 - confidence: 'high' | 'medium' | 'low' | 'unverifiable'; 92 - signals: string[]; 
93 - panel: boolean; 94 - hallucinationZones: HallucinationZone[]; 95 - } 96 - 97 - export interface V7Output { 98 - stream: string; 99 - results: BoundaryMatch[]; 100 - hallucinationZones: HallucinationZone[]; 101 - unmatchedSegments: TalkSegment[]; 102 - unmatchedSchedule: string[]; 103 - } 104 - ``` 105 - 106 - - [ ] **Step 2: Commit** 107 - 108 - ```bash 109 - git add src/v7/types.ts 110 - git commit -m "feat(v7): shared type definitions for boundary detection pipeline" 111 - ``` 112 - 113 - ### Task 2: Hallucination Detector 114 - 115 - **Files:** 116 - - Create: `src/v7/hallucination-detector.ts` 117 - - Create: `src/v7/__tests__/hallucination-detector.test.ts` 118 - 119 - - [ ] **Step 1: Write failing tests** 120 - 121 - ```ts 122 - // src/v7/__tests__/hallucination-detector.test.ts 123 - import { describe, it, expect } from "vitest"; 124 - import { detectHallucinationZones } from "../hallucination-detector.js"; 125 - import type { TranscriptInput, DiarizationInput } from "../types.js"; 126 - 127 - describe("detectHallucinationZones", () => { 128 - it("detects CastingWords loops", () => { 129 - const words = Array.from({ length: 30 }, (_, i) => ({ 130 - word: ["Transcription", "by", "CastingWords"][i % 3], 131 - start: 100 + i * 0.5, 132 - end: 100 + i * 0.5 + 0.3, 133 - })); 134 - const zones = detectHallucinationZones( 135 - { stream: "test", duration_seconds: 200, words } as TranscriptInput, 136 - { speakers: [], segments: [], total_segments: 0 } // no diarization speech 137 - ); 138 - expect(zones.length).toBeGreaterThan(0); 139 - expect(zones[0].pattern).toContain("CastingWords"); 140 - }); 141 - 142 - it("detects numeric zero loops", () => { 143 - const words = Array.from({ length: 50 }, (_, i) => ({ 144 - word: "0", 145 - start: 200 + i * 0.2, 146 - end: 200 + i * 0.2 + 0.1, 147 - })); 148 - const zones = detectHallucinationZones( 149 - { stream: "test", duration_seconds: 300, words } as TranscriptInput, 150 - { speakers: [], segments: [], 
total_segments: 0 } 151 - ); 152 - expect(zones.length).toBeGreaterThan(0); 153 - expect(zones[0].pattern).toContain("zero"); 154 - }); 155 - 156 - it("detects diarization silence with transcript words", () => { 157 - // Transcript has words from 100-200s, but diarization has no segments there 158 - const words = Array.from({ length: 20 }, (_, i) => ({ 159 - word: "hello", 160 - start: 100 + i * 5, 161 - end: 100 + i * 5 + 1, 162 - })); 163 - const diarization: DiarizationInput = { 164 - speakers: ["SPEAKER_00"], 165 - segments: [ 166 - { start: 0, end: 90, speaker: "SPEAKER_00" }, 167 - { start: 210, end: 300, speaker: "SPEAKER_00" }, 168 - ], 169 - total_segments: 2, 170 - }; 171 - const zones = detectHallucinationZones( 172 - { stream: "test", duration_seconds: 300, words } as TranscriptInput, 173 - diarization 174 - ); 175 - expect(zones.length).toBeGreaterThan(0); 176 - expect(zones[0].startS).toBeLessThanOrEqual(100); 177 - expect(zones[0].endS).toBeGreaterThanOrEqual(195); 178 - }); 179 - 180 - it("does not flag real speech as hallucination", () => { 181 - const words = [ 182 - { word: "Hello", start: 10, end: 11 }, 183 - { word: "my", start: 11, end: 11.5 }, 184 - { word: "name", start: 11.5, end: 12 }, 185 - { word: "is", start: 12, end: 12.5 }, 186 - { word: "Justin", start: 12.5, end: 13 }, 187 - ]; 188 - const diarization: DiarizationInput = { 189 - speakers: ["SPEAKER_00"], 190 - segments: [{ start: 9, end: 14, speaker: "SPEAKER_00" }], 191 - total_segments: 1, 192 - }; 193 - const zones = detectHallucinationZones( 194 - { stream: "test", duration_seconds: 30, words } as TranscriptInput, 195 - diarization 196 - ); 197 - expect(zones.length).toBe(0); 198 - }); 199 - }); 200 - ``` 201 - 202 - - [ ] **Step 2: Run tests to verify they fail** 203 - 204 - ```bash 205 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/hallucination-detector.test.ts 206 - ``` 207 - 208 - Expected: FAIL — module not found 209 - 210 - - [ ] **Step 3: Implement 
hallucination detector** 211 - 212 - ```ts 213 - // src/v7/hallucination-detector.ts 214 - import type { TranscriptInput, DiarizationInput, HallucinationZone } from "./types.js"; 215 - 216 - /** Known repeating hallucination phrases. Each entry: [pattern words, label]. */ 217 - const HALLUCINATION_PATTERNS: [string[], string][] = [ 218 - [["Transcription", "by", "CastingWords"], "CastingWords loop"], 219 - [["Transcribed", "by", "https://otter"], "otter.ai loop"], 220 - [["Transcription", "by", "ESO"], "ESO Translation loop"], 221 - [["Microsoft", "Office", "Word", "Document"], "MSWord loop"], 222 - [["Transcripts", "provided", "by", "Transcription", "Outsourcing"], "Transcription Outsourcing loop"], 223 - [["UGA", "Extension", "Office"], "UGA Extension loop"], 224 - [["Thank", "you", "for", "watching"], "Thank you for watching loop"], 225 - [["Subs", "by", "www"], "subtitle attribution loop"], 226 - [["www", "fema", "gov"], "fema.gov loop"], 227 - ]; 228 - 229 - /** Minimum consecutive zeros to flag as hallucination. */ 230 - const ZERO_LOOP_THRESHOLD = 20; 231 - 232 - /** Minimum repetitions of a phrase to flag as hallucination loop. */ 233 - const PHRASE_REPEAT_THRESHOLD = 3; 234 - 235 - /** 236 - * Detect hallucination zones using two methods: 237 - * 1. Pattern matching: known repeating phrases in transcript 238 - * 2. 
Silence mismatch: diarization shows no speech but transcript has words 239 - */ 240 - export function detectHallucinationZones( 241 - transcript: TranscriptInput, 242 - diarization: DiarizationInput, 243 - ): HallucinationZone[] { 244 - const zones: HallucinationZone[] = []; 245 - 246 - // Method 1: Known phrase patterns 247 - zones.push(...detectPhrasePatterns(transcript)); 248 - 249 - // Method 2: Numeric zero loops 250 - zones.push(...detectZeroLoops(transcript)); 251 - 252 - // Method 3: Diarization silence with transcript words 253 - zones.push(...detectSilenceMismatch(transcript, diarization)); 254 - 255 - // Merge overlapping zones 256 - return mergeZones(zones); 257 - } 258 - 259 - function detectPhrasePatterns(transcript: TranscriptInput): HallucinationZone[] { 260 - const zones: HallucinationZone[] = []; 261 - const words = transcript.words; 262 - 263 - for (const [pattern, label] of HALLUCINATION_PATTERNS) { 264 - let matchCount = 0; 265 - let firstMatchStart: number | null = null; 266 - let lastMatchEnd: number | null = null; 267 - 268 - for (let i = 0; i <= words.length - pattern.length; i++) { 269 - const matches = pattern.every((p, j) => words[i + j].word === p); 270 - if (matches) { 271 - matchCount++; 272 - if (firstMatchStart === null) firstMatchStart = words[i].start; 273 - lastMatchEnd = words[i + pattern.length - 1].end; 274 - } else if (matchCount >= PHRASE_REPEAT_THRESHOLD && firstMatchStart !== null && lastMatchEnd !== null) { 275 - zones.push({ startS: firstMatchStart, endS: lastMatchEnd, pattern: label }); 276 - matchCount = 0; 277 - firstMatchStart = null; 278 - lastMatchEnd = null; 279 - } 280 - } 281 - 282 - if (matchCount >= PHRASE_REPEAT_THRESHOLD && firstMatchStart !== null && lastMatchEnd !== null) { 283 - zones.push({ startS: firstMatchStart, endS: lastMatchEnd, pattern: label }); 284 - } 285 - } 286 - 287 - return zones; 288 - } 289 - 290 - function detectZeroLoops(transcript: TranscriptInput): HallucinationZone[] { 291 - const 
zones: HallucinationZone[] = []; 292 - let runStart: number | null = null; 293 - let runCount = 0; 294 - 295 - for (const w of transcript.words) { 296 - if (w.word === "0") { 297 - if (runStart === null) runStart = w.start; 298 - runCount++; 299 - } else { 300 - if (runCount >= ZERO_LOOP_THRESHOLD && runStart !== null) { 301 - zones.push({ startS: runStart, endS: w.start, pattern: "numeric zeros" }); 302 - } 303 - runStart = null; 304 - runCount = 0; 305 - } 306 - } 307 - 308 - if (runCount >= ZERO_LOOP_THRESHOLD && runStart !== null) { 309 - const last = transcript.words[transcript.words.length - 1]; 310 - zones.push({ startS: runStart, endS: last.end, pattern: "numeric zeros" }); 311 - } 312 - 313 - return zones; 314 - } 315 - 316 - function detectSilenceMismatch( 317 - transcript: TranscriptInput, 318 - diarization: DiarizationInput, 319 - ): HallucinationZone[] { 320 - if (diarization.segments.length === 0) return []; 321 - 322 - const zones: HallucinationZone[] = []; 323 - 324 - // Find diarization silence gaps > 60s 325 - const sortedSegs = [...diarization.segments].sort((a, b) => a.start - b.start); 326 - const silenceGaps: { start: number; end: number }[] = []; 327 - 328 - for (let i = 1; i < sortedSegs.length; i++) { 329 - const gap = sortedSegs[i].start - sortedSegs[i - 1].end; 330 - if (gap > 60) { 331 - silenceGaps.push({ start: sortedSegs[i - 1].end, end: sortedSegs[i].start }); 332 - } 333 - } 334 - 335 - // Check if transcript has words during these silence gaps 336 - for (const gap of silenceGaps) { 337 - const wordsInGap = transcript.words.filter( 338 - (w) => w.start >= gap.start && w.end <= gap.end 339 - ); 340 - if (wordsInGap.length > 10) { 341 - zones.push({ 342 - startS: gap.start, 343 - endS: gap.end, 344 - pattern: "diarization silence with transcript words", 345 - }); 346 - } 347 - } 348 - 349 - return zones; 350 - } 351 - 352 - function mergeZones(zones: HallucinationZone[]): HallucinationZone[] { 353 - if (zones.length === 0) return []; 
354 - const sorted = [...zones].sort((a, b) => a.startS - b.startS); 355 - const merged: HallucinationZone[] = [sorted[0]]; 356 - 357 - for (let i = 1; i < sorted.length; i++) { 358 - const prev = merged[merged.length - 1]; 359 - const curr = sorted[i]; 360 - // Merge if overlapping or within 60s 361 - if (curr.startS <= prev.endS + 60) { 362 - prev.endS = Math.max(prev.endS, curr.endS); 363 - if (!prev.pattern.includes(curr.pattern)) { 364 - prev.pattern += " + " + curr.pattern; 365 - } 366 - } else { 367 - merged.push({ ...curr }); 368 - } 369 - } 370 - 371 - return merged; 372 - } 373 - ``` 374 - 375 - - [ ] **Step 4: Run tests to verify they pass** 376 - 377 - ```bash 378 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/hallucination-detector.test.ts 379 - ``` 380 - 381 - Expected: PASS (all 4 tests) 382 - 383 - - [ ] **Step 5: Commit** 384 - 385 - ```bash 386 - git add src/v7/ 387 - git commit -m "feat(v7): hallucination detector with pattern + diarization silence detection" 388 - ``` 389 - 390 - --- 391 - 392 - ## Chunk 2: Diarization Segmenter (Stage 1) 393 - 394 - ### Task 3: Diarization Segmenter 395 - 396 - **Files:** 397 - - Create: `src/v7/diarization-segmenter.ts` 398 - - Create: `src/v7/__tests__/diarization-segmenter.test.ts` 399 - 400 - - [ ] **Step 1: Write failing tests** 401 - 402 - ```ts 403 - // src/v7/__tests__/diarization-segmenter.test.ts 404 - import { describe, it, expect } from "vitest"; 405 - import { segmentDiarization } from "../diarization-segmenter.js"; 406 - import type { DiarizationInput, HallucinationZone } from "../types.js"; 407 - 408 - function makeDiarization(segments: { start: number; end: number; speaker: string }[]): DiarizationInput { 409 - const speakers = [...new Set(segments.map(s => s.speaker))]; 410 - return { speakers, segments, total_segments: segments.length }; 411 - } 412 - 413 - describe("segmentDiarization", () => { 414 - it("creates single-speaker segments from continuous speech with gaps", () => 
{ 415 - const diar = makeDiarization([ 416 - { start: 0, end: 1800, speaker: "SPEAKER_00" }, // Talk 1: 0-30m 417 - { start: 1860, end: 3600, speaker: "SPEAKER_01" }, // Talk 2: 31-60m (60s gap) 418 - ]); 419 - const result = segmentDiarization(diar, []); 420 - expect(result.length).toBe(2); 421 - expect(result[0].dominantSpeaker).toBe("SPEAKER_00"); 422 - expect(result[1].dominantSpeaker).toBe("SPEAKER_01"); 423 - expect(result[1].precedingGapS).toBeCloseTo(60, 0); 424 - }); 425 - 426 - it("detects session breaks at gaps > 60s", () => { 427 - const diar = makeDiarization([ 428 - { start: 0, end: 1800, speaker: "SPEAKER_00" }, 429 - // 76-minute gap (break) 430 - { start: 6360, end: 8000, speaker: "SPEAKER_01" }, 431 - ]); 432 - const result = segmentDiarization(diar, []); 433 - expect(result.length).toBe(2); 434 - expect(result[1].precedingGapS).toBeGreaterThan(4000); 435 - }); 436 - 437 - it("identifies panels (multiple balanced speakers)", () => { 438 - const diar = makeDiarization([ 439 - { start: 0, end: 600, speaker: "SPEAKER_00" }, 440 - { start: 600, end: 1200, speaker: "SPEAKER_01" }, 441 - { start: 1200, end: 1800, speaker: "SPEAKER_02" }, 442 - { start: 1800, end: 2400, speaker: "SPEAKER_00" }, 443 - ]); 444 - const result = segmentDiarization(diar, []); 445 - expect(result.length).toBe(1); 446 - expect(result[0].type).toBe("panel"); 447 - expect(result[0].speakers.length).toBe(3); 448 - }); 449 - 450 - it("marks segments overlapping hallucination zones", () => { 451 - const diar = makeDiarization([ 452 - { start: 0, end: 1800, speaker: "SPEAKER_00" }, 453 - // no diarization segments from 1800-5000 (hallucination zone) 454 - { start: 5000, end: 7000, speaker: "SPEAKER_01" }, 455 - ]); 456 - const hallucinationZones: HallucinationZone[] = [ 457 - { startS: 1800, endS: 5000, pattern: "CastingWords loop" }, 458 - ]; 459 - const result = segmentDiarization(diar, hallucinationZones); 460 - // Should not create a segment in the hallucination zone 461 - 
expect(result.length).toBe(2); 462 - expect(result[0].hallucinationZone).toBe(false); 463 - expect(result[1].hallucinationZone).toBe(false); 464 - }); 465 - 466 - it("splits talk boundaries at 30-60s gaps with speaker changes", () => { 467 - const diar = makeDiarization([ 468 - { start: 0, end: 1500, speaker: "SPEAKER_00" }, 469 - // 45s gap + speaker change 470 - { start: 1545, end: 3000, speaker: "SPEAKER_01" }, 471 - ]); 472 - const result = segmentDiarization(diar, []); 473 - expect(result.length).toBe(2); 474 - expect(result[0].type).toBe("single-speaker"); 475 - expect(result[1].type).toBe("single-speaker"); 476 - }); 477 - }); 478 - ``` 479 - 480 - - [ ] **Step 2: Run tests to verify they fail** 481 - 482 - ```bash 483 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/diarization-segmenter.test.ts 484 - ``` 485 - 486 - - [ ] **Step 3: Implement diarization segmenter** 487 - 488 - The segmenter should: 489 - 1. Sort diarization segments by start time 490 - 2. Merge same-speaker segments with < 5s gaps into speech blocks 491 - 3. Find gaps between blocks, classifying as break (>60s), boundary (30-60s + speaker change), or pause (<30s) 492 - 4. Group blocks between breaks into sessions 493 - 5. Within each session, group blocks between boundary gaps into talk segments 494 - 6. For each talk segment, compute speaker distribution and classify as single-speaker (one speaker > 70%) or panel 495 - 7. Mark any segment that overlaps a hallucination zone 496 - 497 - Implementation in `src/v7/diarization-segmenter.ts`. 
Core function signature: 498 - 499 - ```ts 500 - export function segmentDiarization( 501 - diarization: DiarizationInput, 502 - hallucinationZones: HallucinationZone[], 503 - ): TalkSegment[] 504 - ``` 505 - 506 - - [ ] **Step 4: Run tests to verify they pass** 507 - 508 - ```bash 509 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/diarization-segmenter.test.ts 510 - ``` 511 - 512 - - [ ] **Step 5: Commit** 513 - 514 - ```bash 515 - git add src/v7/ 516 - git commit -m "feat(v7): diarization segmenter — stage 1 of pipeline" 517 - ``` 518 - 519 - --- 520 - 521 - ## Chunk 3: Transcript Matcher (Stage 2) 522 - 523 - ### Task 4: Transcript Matcher 524 - 525 - **Files:** 526 - - Create: `src/v7/transcript-matcher.ts` 527 - - Create: `src/v7/__tests__/transcript-matcher.test.ts` 528 - 529 - - [ ] **Step 1: Write failing tests** 530 - 531 - ```ts 532 - // src/v7/__tests__/transcript-matcher.test.ts 533 - import { describe, it, expect } from "vitest"; 534 - import { extractSignals, matchSegmentToSchedule } from "../transcript-matcher.js"; 535 - import type { TranscriptInput, TalkSegment, ScheduleTalk } from "../types.js"; 536 - 537 - describe("extractSignals", () => { 538 - it("finds self-introductions", () => { 539 - const words = "Hello my name is Justin Bank I am a journalist".split(" ").map((w, i) => ({ 540 - word: w, start: 100 + i, end: 101 + i, 541 - })); 542 - const signals = extractSignals({ words } as TranscriptInput, 99, 115); 543 - expect(signals).toContainEqual(expect.objectContaining({ type: "self-intro" })); 544 - expect(signals.find(s => s.type === "self-intro")?.name).toContain("Justin"); 545 - }); 546 - 547 - it("finds MC handoffs", () => { 548 - const words = "please welcome Justin for his talk".split(" ").map((w, i) => ({ 549 - word: w, start: 90 + i, end: 91 + i, 550 - })); 551 - const signals = extractSignals({ words } as TranscriptInput, 89, 100); 552 - expect(signals).toContainEqual(expect.objectContaining({ type: "mc-handoff" })); 
553 - }); 554 - 555 - it("extracts topic keywords", () => { 556 - const words = "I will talk about sovereign media and how publishers can".split(" ").map((w, i) => ({ 557 - word: w, start: 100 + i, end: 101 + i, 558 - })); 559 - const signals = extractSignals({ words } as TranscriptInput, 99, 115); 560 - expect(signals).toContainEqual(expect.objectContaining({ type: "topic" })); 561 - }); 562 - }); 563 - 564 - describe("matchSegmentToSchedule", () => { 565 - const schedule: ScheduleTalk[] = [ 566 - { rkey: "talk1", title: "Sovereign Media Economics", starts_at: "2026-03-28T17:30:00Z", ends_at: "2026-03-28T18:00:00Z", speaker_names: "Natalie Mullins" }, 567 - { rkey: "talk2", title: "AI in the Atmosphere", starts_at: "2026-03-28T18:00:00Z", ends_at: "2026-03-28T18:30:00Z", speaker_names: "Cameron Stream" }, 568 - ]; 569 - 570 - it("matches by speaker name + topic", () => { 571 - const signals = [ 572 - { type: "self-intro" as const, name: "Natalie" }, 573 - { type: "topic" as const, keywords: ["sovereign", "media"] }, 574 - ]; 575 - const match = matchSegmentToSchedule(signals, schedule); 576 - expect(match?.rkey).toBe("talk1"); 577 - expect(match?.confidence).toBe("high"); 578 - }); 579 - 580 - it("returns medium confidence for speaker-only match", () => { 581 - const signals = [{ type: "self-intro" as const, name: "Cameron" }]; 582 - const match = matchSegmentToSchedule(signals, schedule); 583 - expect(match?.rkey).toBe("talk2"); 584 - expect(match?.confidence).toBe("medium"); 585 - }); 586 - 587 - it("returns null for no match", () => { 588 - const signals = [{ type: "self-intro" as const, name: "Unknown Person" }]; 589 - const match = matchSegmentToSchedule(signals, schedule); 590 - expect(match).toBeNull(); 591 - }); 592 - }); 593 - ``` 594 - 595 - - [ ] **Step 2: Run tests to verify they fail** 596 - 597 - ```bash 598 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/transcript-matcher.test.ts 599 - ``` 600 - 601 - - [ ] **Step 3: Implement 
transcript matcher** 602 - 603 - Core functions: 604 - - `extractSignals(transcript, startS, endS)` — scan transcript words in range for self-intros, MC handoffs, topic keywords 605 - - `matchSegmentToSchedule(signals, schedule)` — fuzzy match signals against schedule using `phoneticCode` from `../phonetic.js` 606 - - `matchAllSegments(segments, transcript, schedule, hallucinationZones)` — orchestrate matching for all segments 607 - 608 - Speaker name matching should use phonetic codes for fuzzy matching (handles "Jekard"/"Jacquard", "Wardmuller"/"Werdmuller"). Topic matching should tokenize talk titles and look for 2+ matching keywords in the first 2 minutes of transcript. 609 - 610 - - [ ] **Step 4: Run tests to verify they pass** 611 - 612 - ```bash 613 - cd apps/ionosphere-appview && npx vitest run src/v7/__tests__/transcript-matcher.test.ts 614 - ``` 615 - 616 - - [ ] **Step 5: Commit** 617 - 618 - ```bash 619 - git add src/v7/ 620 - git commit -m "feat(v7): transcript matcher — stage 2 of pipeline" 621 - ``` 622 - 623 - --- 624 - 625 - ## Chunk 4: Schedule Reconciler (Stage 3) + CLI 626 - 627 - ### Task 5: Schedule Reconciler 628 - 629 - **Files:** 630 - - Create: `src/v7/schedule-reconciler.ts` 631 - - Create: `src/v7/__tests__/schedule-reconciler.test.ts` 632 - 633 - - [ ] **Step 1: Write failing tests** 634 - 635 - Tests should cover: 636 - - Duplicate assignment resolution (same rkey matched to multiple segments → keep highest confidence) 637 - - Unmatched schedule entries in hallucination zones → `unverifiable` 638 - - Unmatched segments → `unmatchedSegments` output 639 - - End time calculation: next talk start minus gap, or diarization silence onset 640 - - Last talk in session: ends at last speech, not next session start 641 - 642 - - [ ] **Step 2: Run tests, verify fail** 643 - - [ ] **Step 3: Implement reconciler** 644 - 645 - Core function: 646 - ```ts 647 - export function reconcileSchedule( 648 - matches: BoundaryMatch[], 649 - segments: 
TalkSegment[], 650 - schedule: ScheduleTalk[], 651 - hallucinationZones: HallucinationZone[], 652 - streamDurationS: number, 653 - ): V7Output 654 - ``` 655 - 656 - - [ ] **Step 4: Run tests, verify pass** 657 - - [ ] **Step 5: Commit** 658 - 659 - ```bash 660 - git add src/v7/ 661 - git commit -m "feat(v7): schedule reconciler — stage 3 of pipeline" 662 - ``` 663 - 664 - ### Task 6: CLI Entry Point 665 - 666 - **Files:** 667 - - Create: `src/detect-boundaries-v7.ts` 668 - 669 - - [ ] **Step 1: Implement CLI** 670 - 671 - Wire together all three stages: 672 - 1. Parse args: `<transcript.json> --diarization <diarization.json> --stream-slug <slug>` 673 - 2. Load transcript JSON and diarization JSON 674 - 3. Load schedule from DB using `STREAM_MATCH[slug]` (reuse pattern from `tracks.ts`) 675 - 4. Run pipeline: `detectHallucinationZones` → `segmentDiarization` → `matchAllSegments` → `reconcileSchedule` 676 - 5. Print summary table (like v6) 677 - 6. Write output to `<transcript>-boundaries-v7.json` 678 - 679 - - [ ] **Step 2: Test manually against one stream** 680 - 681 - ```bash 682 - cd apps/ionosphere-appview 683 - npx tsx src/detect-boundaries-v7.ts \ 684 - data/fullday/ATScience/transcript-enriched.json \ 685 - --diarization data/fullday/ATScience/diarization.json \ 686 - --stream-slug atscience 687 - ``` 688 - 689 - Compare output against manually verified ground truth from April 12 audit. 
690 - 691 - - [ ] **Step 3: Commit** 692 - 693 - ```bash 694 - git add src/detect-boundaries-v7.ts 695 - git commit -m "feat(v7): CLI entry point for boundary detection pipeline" 696 - ``` 697 - 698 - --- 699 - 700 - ## Chunk 5: Validation Against Ground Truth 701 - 702 - ### Task 7: Run Against All 7 Streams 703 - 704 - - [ ] **Step 1: Run v7 on all streams** 705 - 706 - ```bash 707 - cd apps/ionosphere-appview 708 - for dir in ATScience Great_Hall___Day_1 Great_Hall___Day_2 Room_2301___Day_1 Room_2301___Day_2 Performance_Theater___Day_1 Performance_Theater___Day_2; do 709 - slug=$(echo "$dir" | sed 's/Great_Hall___Day_1/great-hall-day-1/' | sed 's/Great_Hall___Day_2/great-hall-day-2/' | sed 's/Room_2301___Day_1/room-2301-day-1/' | sed 's/Room_2301___Day_2/room-2301-day-2/' | sed 's/Performance_Theater___Day_1/performance-theatre-day-1/' | sed 's/Performance_Theater___Day_2/performance-theatre-day-2/' | sed 's/ATScience/atscience/') 710 - echo "=== $slug ===" 711 - npx tsx src/detect-boundaries-v7.ts \ 712 - "data/fullday/$dir/transcript-enriched.json" \ 713 - --diarization "data/fullday/$dir/diarization.json" \ 714 - --stream-slug "$slug" 715 - echo 716 - done 717 - ``` 718 - 719 - - [ ] **Step 2: Compare against current DB state (ground truth)** 720 - 721 - Write a quick comparison script that loads the v7 output and the current DB video_segments, computing: 722 - - Talks matched correctly (same rkey, start within 60s) 723 - - Talks missed by v7 724 - - Talks v7 found that aren't in ground truth 725 - - Average start time error for matched talks 726 - 727 - - [ ] **Step 3: Fix any systematic issues found** 728 - - [ ] **Step 4: Commit final version** 729 - 730 - ```bash 731 - git add -A 732 - git commit -m "feat(v7): boundary detection v7 complete — diarization-first pipeline" 733 - ```
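The comparison described in Task 7, Step 2 could look something like the sketch below. It assumes the v7 results and the ground-truth rows have already been loaded into memory; `GroundTruthSegment` and `compareBoundaries` are hypothetical names, and the actual `video_segments` columns may differ. Talks with the right rkey but a start more than 60s off are put in a separate `misplaced` bucket, which is a choice made here rather than something the plan specifies.

```typescript
// Subset of BoundaryMatch needed for the comparison.
interface V7Result {
  rkey: string;
  startTimestamp: number; // seconds
}

// Hypothetical shape for a ground-truth row (assumed, not from the schema).
interface GroundTruthSegment {
  rkey: string;
  startS: number; // seconds
}

interface ComparisonReport {
  matched: string[]; // same rkey, start within tolerance
  misplaced: string[]; // same rkey, start outside tolerance
  missed: string[]; // in ground truth, absent from v7
  extra: string[]; // in v7, absent from ground truth
  meanStartErrorS: number | null; // over matched talks only
}

function compareBoundaries(
  v7: V7Result[],
  truth: GroundTruthSegment[],
  toleranceS = 60,
): ComparisonReport {
  const truthByRkey = new Map<string, number>(
    truth.map((t): [string, number] => [t.rkey, t.startS]),
  );
  const v7Rkeys = new Set(v7.map((r) => r.rkey));

  const matched: string[] = [];
  const misplaced: string[] = [];
  const extra: string[] = [];
  let errorSum = 0;

  for (const r of v7) {
    const expected = truthByRkey.get(r.rkey);
    if (expected === undefined) {
      extra.push(r.rkey);
      continue;
    }
    const err = Math.abs(r.startTimestamp - expected);
    if (err <= toleranceS) {
      matched.push(r.rkey);
      errorSum += err;
    } else {
      misplaced.push(r.rkey);
    }
  }

  const missed = truth
    .filter((t) => !v7Rkeys.has(t.rkey))
    .map((t) => t.rkey);

  return {
    matched,
    misplaced,
    missed,
    extra,
    meanStartErrorS: matched.length > 0 ? errorSum / matched.length : null,
  };
}
```

Running this per stream and printing the four lists alongside the mean error should make systematic problems (for example, one stream with many misplaced talks) easier to spot before Step 3.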
-743
docs/superpowers/plans/2026-04-12-conference-discussion.md
··· 1 - # Conference Discussion Page Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Build a curated multi-column "Conference Discussion" page showing top Bluesky posts, blog recaps, and VOD site links from ATmosphereConf, with talk deep-links and filterable sections. 6 - 7 - **Architecture:** Extend the mentions table with `content_type`, `external_url`, `og_title`, and `talk_rkey` columns. A new fetch phase searches 20+ VOD domains and blog/recap queries, classifies content, and fetches OG metadata. A new XRPC endpoint serves discussion data grouped by type. The frontend uses the concordance `IndexContent.tsx` greedy-column pattern with section-based flow items and a filter bar. 8 - 9 - **Tech Stack:** SQLite (schema), Node.js (fetch scripts), Hono (API), React/Next.js with greedy column-fill layout 10 - 11 - --- 12 - 13 - ## File Structure 14 - 15 - | File | Action | Responsibility | 16 - |------|--------|---------------| 17 - | `apps/ionosphere-appview/src/db.ts` | Modify | Add 4 columns to mentions table | 18 - | `scripts/fetch-discussion.mjs` | Create | Wider search: VOD domains, blog recaps, OG metadata, talk matching | 19 - | `apps/ionosphere-appview/src/routes.ts` | Modify | Add `getDiscussion` endpoint | 20 - | `apps/ionosphere/src/lib/api.ts` | Modify | Add `getDiscussion()` client | 21 - | `apps/ionosphere/src/app/discussion/page.tsx` | Create | Route + SSR data fetch | 22 - | `apps/ionosphere/src/app/discussion/DiscussionContent.tsx` | Create | Multi-column layout with filter, section nav, click-to-play | 23 - | `apps/ionosphere/src/app/components/NavHeader.tsx` | Modify | Add "Discussion" nav item | 24 - 25 - --- 26 - 27 - ## Task 1: Schema Migration — Add Columns 28 - 29 - **Files:** 30 - - Modify: `apps/ionosphere-appview/src/db.ts:167-187` 31 - 32 - - [ ]
**Step 1: Add columns to schema and run migration** 33 - 34 - In `db.ts`, add after the mentions table CREATE statement (inside the same `db.exec` block): 35 - 36 - ```sql 37 - -- Add columns if they don't exist (idempotent via try/catch in migration) 38 - ``` 39 - 40 - Since SQLite doesn't support `ADD COLUMN IF NOT EXISTS`, add a migration block after the main `db.exec`. Find the existing migration section and add: 41 - 42 - ```typescript 43 - // Mentions table extensions 44 - try { db.exec("ALTER TABLE mentions ADD COLUMN content_type TEXT DEFAULT 'post'"); } catch {} 45 - try { db.exec("ALTER TABLE mentions ADD COLUMN external_url TEXT"); } catch {} 46 - try { db.exec("ALTER TABLE mentions ADD COLUMN og_title TEXT"); } catch {} 47 - try { db.exec("ALTER TABLE mentions ADD COLUMN talk_rkey TEXT"); } catch {} 48 - ``` 49 - 50 - Also run these directly on the SQLite database: 51 - 52 - ```bash 53 - sqlite3 apps/data/ionosphere.sqlite "ALTER TABLE mentions ADD COLUMN content_type TEXT DEFAULT 'post';" 2>/dev/null 54 - sqlite3 apps/data/ionosphere.sqlite "ALTER TABLE mentions ADD COLUMN external_url TEXT;" 2>/dev/null 55 - sqlite3 apps/data/ionosphere.sqlite "ALTER TABLE mentions ADD COLUMN og_title TEXT;" 2>/dev/null 56 - sqlite3 apps/data/ionosphere.sqlite "ALTER TABLE mentions ADD COLUMN talk_rkey TEXT;" 2>/dev/null 57 - ``` 58 - 59 - - [ ] **Step 2: Verify columns** 60 - 61 - ```bash 62 - sqlite3 apps/data/ionosphere.sqlite "PRAGMA table_info(mentions);" | grep -E "content_type|external_url|og_title|talk_rkey" 63 - ``` 64 - 65 - Expected: 4 new columns listed. 
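The empty `try { ... } catch {}` wrappers in Step 1 make the migration safe to re-run, but they also hide unrelated failures such as a locked database or a misspelled table name. A minimal sketch of an alternative, assuming only the `prepare(sql).all()` and `exec(sql)` methods that better-sqlite3 exposes: read `PRAGMA table_info` first and add only the columns that are missing. The names `addMissingColumns` and `MENTION_COLUMNS` are hypothetical and do not appear in the plan.

```javascript
// Hypothetical helper (not in the original plan): add only the columns
// that PRAGMA table_info does not already report.
// `db` is assumed to be a better-sqlite3 Database, or any object that
// exposes the same prepare(sql).all() and exec(sql) methods.
function addMissingColumns(db, table, columns) {
  const existing = new Set(
    db.prepare(`PRAGMA table_info(${table})`).all().map((row) => row.name)
  );
  const added = [];
  for (const [name, definition] of Object.entries(columns)) {
    if (existing.has(name)) continue; // already migrated, nothing to do
    db.exec(`ALTER TABLE ${table} ADD COLUMN ${name} ${definition}`);
    added.push(name);
  }
  return added; // names of the columns created by this call
}

// Column definitions copied from Step 1.
const MENTION_COLUMNS = {
  content_type: "TEXT DEFAULT 'post'",
  external_url: 'TEXT',
  og_title: 'TEXT',
  talk_rkey: 'TEXT',
};
```

Under these assumptions, calling `addMissingColumns(db, 'mentions', MENTION_COLUMNS)` a second time should add nothing, and any other SQLite error would surface instead of being ignored.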
66 - 67 - - [ ] **Step 3: Commit** 68 - 69 - ```bash 70 - git add apps/ionosphere-appview/src/db.ts 71 - git commit -m "feat: add content_type, external_url, og_title, talk_rkey to mentions" 72 - ``` 73 - 74 - --- 75 - 76 - ## Task 2: Discussion Fetch Script 77 - 78 - **Files:** 79 - - Create: `scripts/fetch-discussion.mjs` 80 - 81 - This script runs as a separate batch job (does not modify `fetch-mentions.mjs`). It: 82 - 1. Searches for blog/recap posts via multiple keyword queries 83 - 2. Searches for VOD site links via `domain:` queries across 20+ domains 84 - 3. Classifies each post as `blog`, `video`, or `post` 85 - 4. Extracts external URLs from facets 86 - 5. Fetches OG titles for blog posts 87 - 6. Matches posts to talks via ionosphere.tv URL parsing or speaker @-mention cross-referencing 88 - 7. Upserts into the mentions table with the new columns populated 89 - 8. Backfills profiles for any new authors 90 - 91 - - [ ] **Step 1: Create the fetch script** 92 - 93 - ```javascript 94 - import { createRequire } from 'module'; 95 - const require = createRequire( 96 - new URL('../apps/ionosphere-appview/package.json', import.meta.url).pathname 97 - ); 98 - const { BskyAgent } = require('@atproto/api'); 99 - const Database = require('better-sqlite3'); 100 - 101 - import { fileURLToPath } from 'url'; 102 - import { dirname, join } from 'path'; 103 - 104 - const __dirname = dirname(fileURLToPath(import.meta.url)); 105 - const DB_PATH = join(__dirname, '..', 'apps', 'data', 'ionosphere.sqlite'); 106 - 107 - const agent = new BskyAgent({ service: 'https://bsky.social' }); 108 - function sleep(ms) { return new Promise(r => setTimeout(r, ms)); } 109 - 110 - // ── VOD domains ──────────────────────────────────────────────────── 111 - 112 - const VOD_DOMAINS = [ 113 - 'stream.place', 'vods.sky.boo', 'vod.atverkackt.de', 'ionosphere.tv', 114 - 'atmosphereconf-vods.wisp.place', 'rpg.actor', 'vod.j4ck.xyz', 115 - 'atmosphere-vods.j4ck.xyz', 'atmosphereconf-tv.btao.org', 116 
- 'stream-bsky.pages.dev', 'sites.wisp.place', 'vods.ajbird.net', 117 - 'streamhut.wisp.place', 'conf-vods.wisp.place', 'aetheros.computer', 118 - 'atmo.rsvp', 'atmosphereconf.org', 'youtube.com', 119 - ]; 120 - 121 - // ── Blog/recap queries ───────────────────────────────────────────── 122 - 123 - const BLOG_QUERIES = [ 124 - 'atmosphereconf recap', 125 - 'atmosphereconf wrote', 126 - 'atmosphereconf writeup', 127 - 'atmosphereconf takeaway', 128 - 'atmosphereconf reflection', 129 - 'atmosphereconf blog', 130 - 'atmosphere conference wrote', 131 - 'atmosphere conference recap', 132 - ]; 133 - 134 - // ── Helpers ───────────────────────────────────────────────────────── 135 - 136 - function extractLinks(post) { 137 - return (post.record?.facets || []) 138 - .flatMap(f => f.features || []) 139 - .filter(f => f.uri) 140 - .map(f => f.uri); 141 - } 142 - 143 - function classifyPost(post, searchDomain) { 144 - const links = extractLinks(post); 145 - const text = (post.record?.text || '').toLowerCase(); 146 - 147 - // If searched by a VOD domain, it's a video 148 - if (searchDomain && VOD_DOMAINS.includes(searchDomain)) return 'video'; 149 - 150 - // Check links for known blog patterns 151 - for (const link of links) { 152 - try { 153 - const url = new URL(link); 154 - if (VOD_DOMAINS.some(d => url.hostname.endsWith(d))) return 'video'; 155 - } catch {} 156 - } 157 - 158 - // Blog indicators 159 - if (text.includes('wrote') || text.includes('recap') || text.includes('writeup') || 160 - text.includes('blog') || text.includes('reflection')) { 161 - if (links.some(l => !VOD_DOMAINS.some(d => l.includes(d)))) return 'blog'; 162 - } 163 - 164 - return 'post'; 165 - } 166 - 167 - function extractPrimaryUrl(post, contentType) { 168 - const links = extractLinks(post); 169 - if (contentType === 'video') { 170 - return links.find(l => VOD_DOMAINS.some(d => l.includes(d))) || links[0] || null; 171 - } 172 - if (contentType === 'blog') { 173 - return links.find(l => 
!VOD_DOMAINS.some(d => l.includes(d)) && !l.includes('bsky.app')) || links[0] || null; 174 - } 175 - return links[0] || null; 176 - } 177 - 178 - function matchTalkByUrl(url, talksByRkey) { 179 - if (!url) return null; 180 - const match = url.match(/ionosphere\.tv\/talks\/([^/?#]+)/); 181 - if (match && talksByRkey.has(match[1])) return match[1]; 182 - return null; 183 - } 184 - 185 - function matchTalkBySpeaker(post, speakerHandleToTalks) { 186 - const mentions = (post.record?.facets || []) 187 - .flatMap(f => f.features || []) 188 - .filter(f => f.$type === 'app.bsky.richtext.facet#mention') 189 - .map(f => f.did); 190 - 191 - // Also check text for @handle patterns 192 - const text = post.record?.text || ''; 193 - const handleMatches = text.match(/@([\w.-]+)/g) || []; 194 - 195 - for (const handle of handleMatches) { 196 - const clean = handle.replace('@', ''); 197 - const talks = speakerHandleToTalks.get(clean); 198 - if (talks?.length === 1) return talks[0]; // unambiguous match 199 - } 200 - return null; 201 - } 202 - 203 - async function fetchOgTitle(url) { 204 - try { 205 - const controller = new AbortController(); 206 - const timeout = setTimeout(() => controller.abort(), 5000); 207 - const res = await fetch(url, { 208 - signal: controller.signal, 209 - headers: { 'User-Agent': 'ionosphere.tv/1.0' }, 210 - redirect: 'follow', 211 - }); 212 - clearTimeout(timeout); 213 - if (!res.ok) return null; 214 - const html = await res.text(); 215 - // Extract og:title 216 - const ogMatch = html.match(/<meta[^>]+property=["']og:title["'][^>]+content=["']([^"']+)["']/i) 217 - || html.match(/<meta[^>]+content=["']([^"']+)["'][^>]+property=["']og:title["']/i); 218 - if (ogMatch) return ogMatch[1]; 219 - // Fallback to <title> 220 - const titleMatch = html.match(/<title[^>]*>([^<]+)<\/title>/i); 221 - return titleMatch ? 
titleMatch[1].trim() : null; 222 - } catch { 223 - return null; 224 - } 225 - } 226 - 227 - // ── Main ─────────────────────────────────────────────────────────── 228 - 229 - async function main() { 230 - console.log('=== Fetch Discussion Content ===\n'); 231 - 232 - await agent.login({ 233 - identifier: 'ionosphere.tv', 234 - password: process.env.BOT_PASSWORD, 235 - }); 236 - console.log('Authenticated\n'); 237 - 238 - const db = new Database(DB_PATH); 239 - 240 - // Ensure new columns exist 241 - try { db.exec("ALTER TABLE mentions ADD COLUMN content_type TEXT DEFAULT 'post'"); } catch {} 242 - try { db.exec("ALTER TABLE mentions ADD COLUMN external_url TEXT"); } catch {} 243 - try { db.exec("ALTER TABLE mentions ADD COLUMN og_title TEXT"); } catch {} 244 - try { db.exec("ALTER TABLE mentions ADD COLUMN talk_rkey TEXT"); } catch {} 245 - 246 - const upsert = db.prepare(` 247 - INSERT INTO mentions (uri, talk_uri, author_did, author_handle, text, created_at, 248 - talk_offset_ms, byte_position, likes, reposts, replies, parent_uri, 249 - mention_type, indexed_at, content_type, external_url, og_title, talk_rkey) 250 - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
251 - ON CONFLICT(uri) DO UPDATE SET 252 - likes=excluded.likes, reposts=excluded.reposts, replies=excluded.replies, 253 - content_type=excluded.content_type, external_url=excluded.external_url, 254 - og_title=excluded.og_title, talk_rkey=excluded.talk_rkey, indexed_at=excluded.indexed_at 255 - `); 256 - 257 - // Load talk data for matching 258 - const talks = db.prepare("SELECT DISTINCT rkey, title, uri FROM talks WHERE starts_at IS NOT NULL").all(); 259 - const talksByRkey = new Map(talks.map(t => [t.rkey, t])); 260 - 261 - const speakerTalks = db.prepare(` 262 - SELECT s.handle, t.rkey 263 - FROM speakers s 264 - JOIN talk_speakers ts ON ts.speaker_uri = s.uri 265 - JOIN talks t ON t.uri = ts.talk_uri 266 - WHERE s.handle IS NOT NULL 267 - `).all(); 268 - const speakerHandleToTalks = new Map(); 269 - for (const { handle, rkey } of speakerTalks) { 270 - if (!speakerHandleToTalks.has(handle)) speakerHandleToTalks.set(handle, []); 271 - speakerHandleToTalks.get(handle).push(rkey); 272 - } 273 - 274 - const allPosts = new Map(); 275 - 276 - // Phase 1: VOD domain searches 277 - console.log('--- Phase 1: VOD domains ---'); 278 - for (const domain of VOD_DOMAINS) { 279 - try { 280 - const res = await agent.app.bsky.feed.searchPosts({ 281 - q: 'atmosphere OR atmosphereconf', 282 - domain, 283 - since: '2026-03-25T00:00:00Z', 284 - sort: 'top', 285 - limit: 100, 286 - }); 287 - const posts = res.data?.posts || []; 288 - for (const p of posts) { 289 - if (!allPosts.has(p.uri)) allPosts.set(p.uri, { post: p, searchDomain: domain }); 290 - } 291 - if (posts.length > 0) console.log(` ${domain}: ${posts.length} posts`); 292 - await sleep(200); 293 - } catch (e) { 294 - // Some domains may not return results 295 - } 296 - } 297 - 298 - // Phase 2: Blog/recap queries 299 - console.log('\n--- Phase 2: Blog/recap queries ---'); 300 - for (const q of BLOG_QUERIES) { 301 - try { 302 - const res = await agent.app.bsky.feed.searchPosts({ 303 - q, 304 - since: '2026-03-25T00:00:00Z', 
305 - sort: 'top', 306 - limit: 50, 307 - }); 308 - const posts = res.data?.posts || []; 309 - for (const p of posts) { 310 - if (!allPosts.has(p.uri)) allPosts.set(p.uri, { post: p, searchDomain: null }); 311 - } 312 - if (posts.length > 0) console.log(` "${q}": ${posts.length} posts`); 313 - await sleep(200); 314 - } catch {} 315 - } 316 - 317 - // Phase 3: Top conference posts (sorted by engagement) 318 - console.log('\n--- Phase 3: Top conference posts ---'); 319 - for (const q of ['atmosphereconf', 'atmosphere conf', '#atmosphereconf', '#ATmosphere']) { 320 - try { 321 - const res = await agent.app.bsky.feed.searchPosts({ 322 - q, 323 - since: '2026-03-25T00:00:00Z', 324 - sort: 'top', 325 - limit: 100, 326 - }); 327 - const posts = res.data?.posts || []; 328 - for (const p of posts) { 329 - if (!allPosts.has(p.uri)) allPosts.set(p.uri, { post: p, searchDomain: null }); 330 - } 331 - if (posts.length > 0) console.log(` "${q}": ${posts.length} posts`); 332 - await sleep(200); 333 - } catch {} 334 - } 335 - 336 - console.log(`\nTotal unique posts: ${allPosts.size}`); 337 - 338 - // Phase 4: Classify, extract URLs, match talks, fetch OG titles 339 - console.log('\n--- Phase 4: Classify and enrich ---'); 340 - let blogCount = 0, videoCount = 0, postCount = 0, ogFetched = 0; 341 - const now = new Date().toISOString(); 342 - 343 - const batchInsert = db.transaction((items) => { 344 - for (const item of items) { 345 - upsert.run(...item); 346 - } 347 - }); 348 - 349 - const rows = []; 350 - for (const [uri, { post: p, searchDomain }] of allPosts) { 351 - const contentType = classifyPost(p, searchDomain); 352 - const externalUrl = extractPrimaryUrl(p, contentType); 353 - let talkRkey = matchTalkByUrl(externalUrl, talksByRkey); 354 - if (!talkRkey) talkRkey = matchTalkBySpeaker(p, speakerHandleToTalks); 355 - 356 - const talkUri = talkRkey ? 
(talksByRkey.get(talkRkey)?.uri || null) : null; 357 - 358 - if (contentType === 'blog') blogCount++; 359 - else if (contentType === 'video') videoCount++; 360 - else postCount++; 361 - 362 - rows.push([ 363 - p.uri, talkUri, p.author.did, p.author.handle, 364 - p.record?.text, p.record?.createdAt, 365 - null, null, // talk_offset_ms, byte_position 366 - p.likeCount || 0, p.repostCount || 0, p.replyCount || 0, 367 - null, // parent_uri 368 - 'discussion', now, 369 - contentType, externalUrl, null, talkRkey, 370 - ]); 371 - } 372 - 373 - batchInsert(rows); 374 - console.log(` Posts: ${postCount}, Blog posts: ${blogCount}, Videos: ${videoCount}`); 375 - 376 - // Phase 5: Fetch OG titles for blog posts 377 - console.log('\n--- Phase 5: OG titles ---'); 378 - const blogRows = db.prepare( 379 - "SELECT uri, external_url FROM mentions WHERE content_type = 'blog' AND external_url IS NOT NULL AND og_title IS NULL" 380 - ).all(); 381 - 382 - const updateOg = db.prepare("UPDATE mentions SET og_title = ? 
WHERE uri = ?"); 383 - for (const row of blogRows) { 384 - const title = await fetchOgTitle(row.external_url); 385 - if (title) { 386 - updateOg.run(title, row.uri); 387 - ogFetched++; 388 - console.log(` ${row.external_url} → ${title}`); 389 - } 390 - await sleep(100); 391 - } 392 - console.log(` OG titles fetched: ${ogFetched}/${blogRows.length}`); 393 - 394 - // Phase 6: Backfill profiles 395 - console.log('\n--- Phase 6: Profile backfill ---'); 396 - const missing = db.prepare(` 397 - SELECT DISTINCT m.author_did FROM mentions m 398 - LEFT JOIN profiles p ON m.author_did = p.did WHERE p.did IS NULL 399 - `).all(); 400 - 401 - const profileUpsert = db.prepare( 402 - "INSERT OR REPLACE INTO profiles (did, handle, display_name, avatar_url, fetched_at) VALUES (?, ?, ?, ?, ?)" 403 - ); 404 - let profilesFetched = 0; 405 - for (const { author_did: did } of missing) { 406 - try { 407 - const res = await fetch( 408 - `https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile?actor=${encodeURIComponent(did)}` 409 - ); 410 - if (res.ok) { 411 - const data = await res.json(); 412 - profileUpsert.run(did, data.handle || null, data.displayName || null, data.avatar || null, now); 413 - profilesFetched++; 414 - } 415 - } catch {} 416 - await sleep(50); 417 - } 418 - console.log(` New profiles: ${profilesFetched}`); 419 - 420 - // Also backfill talk_rkey for existing during_talk mentions 421 - console.log('\n--- Phase 7: Backfill talk_rkey on existing mentions ---'); 422 - const updated = db.prepare(` 423 - UPDATE mentions SET talk_rkey = ( 424 - SELECT t.rkey FROM talks t WHERE t.uri = mentions.talk_uri LIMIT 1 425 - ) WHERE talk_uri IS NOT NULL AND talk_rkey IS NULL 426 - `).run(); 427 - console.log(` Updated ${updated.changes} existing mentions with talk_rkey`); 428 - 429 - // Summary 430 - const stats = db.prepare(` 431 - SELECT content_type, COUNT(*) as c FROM mentions 432 - WHERE content_type IS NOT NULL GROUP BY content_type 433 - `).all(); 434 - console.log('\n=== 
DONE ==='); 435 - for (const s of stats) console.log(` ${s.content_type}: ${s.c}`); 436 - console.log(` Total: ${db.prepare('SELECT COUNT(*) as c FROM mentions').get().c}`); 437 - 438 - db.close(); 439 - } 440 - 441 - main().catch(console.error); 442 - ``` 443 - 444 - - [ ] **Step 2: Run the script** 445 - 446 - ```bash 447 - source apps/ionosphere-appview/.env && BOT_PASSWORD="$BOT_PASSWORD" node scripts/fetch-discussion.mjs 448 - ``` 449 - 450 - Expected: Finds posts across VOD domains and blog queries, classifies them, fetches OG titles, and backfills existing mentions with `talk_rkey`. 451 - 452 - - [ ] **Step 3: Verify** 453 - 454 - ```bash 455 - sqlite3 apps/data/ionosphere.sqlite "SELECT content_type, COUNT(*) FROM mentions WHERE content_type IS NOT NULL GROUP BY content_type;" 456 - sqlite3 apps/data/ionosphere.sqlite "SELECT og_title, external_url FROM mentions WHERE og_title IS NOT NULL LIMIT 5;" 457 - sqlite3 apps/data/ionosphere.sqlite "SELECT COUNT(*) FROM mentions WHERE talk_rkey IS NOT NULL;" 458 - ``` 459 - 460 - - [ ] **Step 4: Commit** 461 - 462 - ```bash 463 - git add scripts/fetch-discussion.mjs 464 - git commit -m "feat: wider search for discussion content — VOD sites, blogs, OG metadata" 465 - ``` 466 - 467 - --- 468 - 469 - ## Task 3: API Endpoint — getDiscussion 470 - 471 - **Files:** 472 - - Modify: `apps/ionosphere-appview/src/routes.ts` (after getMentions, ~line 310) 473 - - Modify: `apps/ionosphere/src/lib/api.ts` 474 - 475 - - [ ] **Step 1: Add getDiscussion route** 476 - 477 - Add after the getMentions handler: 478 - 479 - ```typescript 480 - app.get("/xrpc/tv.ionosphere.getDiscussion", (c) => { 481 - const profileJoin = ` 482 - LEFT JOIN profiles p ON m.author_did = p.did 483 - `; 484 - const selectCols = ` 485 - m.uri, m.author_did, m.text, m.created_at, m.likes, m.reposts, m.replies, 486 - m.content_type, m.external_url, m.og_title, m.talk_rkey, m.mention_type, 487 - COALESCE(p.handle, m.author_handle) as author_handle, 488 - 
p.display_name as author_display_name, 489 - p.avatar_url as author_avatar_url 490 - `; 491 - 492 - // Top posts: highest engagement, exclude thread replies 493 - const posts = db.prepare(` 494 - SELECT ${selectCols}, 495 - (SELECT t.title FROM talks t WHERE t.rkey = m.talk_rkey LIMIT 1) as talk_title 496 - FROM mentions m ${profileJoin} 497 - WHERE m.parent_uri IS NULL 498 - AND (m.content_type IS NULL OR m.content_type = 'post') 499 - ORDER BY m.likes DESC 500 - LIMIT 200 501 - `).all(); 502 - 503 - // Blog posts 504 - const blogs = db.prepare(` 505 - SELECT ${selectCols}, 506 - (SELECT t.title FROM talks t WHERE t.rkey = m.talk_rkey LIMIT 1) as talk_title 507 - FROM mentions m ${profileJoin} 508 - WHERE m.content_type = 'blog' AND m.parent_uri IS NULL 509 - ORDER BY m.likes DESC 510 - `).all(); 511 - 512 - // Videos 513 - const videos = db.prepare(` 514 - SELECT ${selectCols}, 515 - (SELECT t.title FROM talks t WHERE t.rkey = m.talk_rkey LIMIT 1) as talk_title 516 - FROM mentions m ${profileJoin} 517 - WHERE m.content_type = 'video' AND m.parent_uri IS NULL 518 - ORDER BY m.likes DESC 519 - `).all(); 520 - 521 - // VOD site domains 522 - const vodSites = db.prepare(` 523 - SELECT DISTINCT 524 - REPLACE(REPLACE(REPLACE(external_url, 'https://', ''), 'http://', ''), SUBSTR(REPLACE(REPLACE(external_url, 'https://', ''), 'http://', ''), INSTR(REPLACE(REPLACE(external_url, 'https://', ''), 'http://', ''), '/')), '') as domain 525 - FROM mentions 526 - WHERE content_type = 'video' AND external_url IS NOT NULL 527 - `).all().map((r: any) => r.domain).filter(Boolean); 528 - 529 - // Stats 530 - const stats = { 531 - totalPosts: db.prepare("SELECT COUNT(*) as c FROM mentions WHERE parent_uri IS NULL").get() as any, 532 - blogCount: blogs.length, 533 - vodSiteCount: new Set(vodSites).size, 534 - uniqueAuthors: db.prepare("SELECT COUNT(DISTINCT author_did) as c FROM mentions").get() as any, 535 - }; 536 - 537 - return c.json({ 538 - posts, 539 - blogs, 540 - videos, 541 - 
vodSites: [...new Set(vodSites)], 542 - stats: { 543 - totalPosts: stats.totalPosts.c, 544 - blogCount: stats.blogCount, 545 - vodSiteCount: stats.vodSiteCount, 546 - uniqueAuthors: stats.uniqueAuthors.c, 547 - }, 548 - }); 549 - }); 550 - ``` 551 - 552 - - [ ] **Step 2: Add frontend API client** 553 - 554 - In `apps/ionosphere/src/lib/api.ts`, add: 555 - 556 - ```typescript 557 - export async function getDiscussion() { 558 - return fetchApi<{ 559 - posts: any[]; blogs: any[]; videos: any[]; 560 - vodSites: string[]; 561 - stats: { totalPosts: number; blogCount: number; vodSiteCount: number; uniqueAuthors: number }; 562 - }>("/xrpc/tv.ionosphere.getDiscussion"); 563 - } 564 - ``` 565 - 566 - - [ ] **Step 3: Commit** 567 - 568 - ```bash 569 - git add apps/ionosphere-appview/src/routes.ts apps/ionosphere/src/lib/api.ts 570 - git commit -m "feat: add getDiscussion XRPC endpoint" 571 - ``` 572 - 573 - --- 574 - 575 - ## Task 4: Discussion Page and Content Component 576 - 577 - **Files:** 578 - - Create: `apps/ionosphere/src/app/discussion/page.tsx` 579 - - Create: `apps/ionosphere/src/app/discussion/DiscussionContent.tsx` 580 - 581 - - [ ] **Step 1: Create the page route** 582 - 583 - `apps/ionosphere/src/app/discussion/page.tsx`: 584 - 585 - ```tsx 586 - import DiscussionContent from "./DiscussionContent"; 587 - import { getDiscussion } from "@/lib/api"; 588 - 589 - export default async function DiscussionPage() { 590 - const data = await getDiscussion().catch(() => ({ 591 - posts: [], blogs: [], videos: [], vodSites: [], 592 - stats: { totalPosts: 0, blogCount: 0, vodSiteCount: 0, uniqueAuthors: 0 }, 593 - })); 594 - 595 - return <DiscussionContent data={data} />; 596 - } 597 - ``` 598 - 599 - - [ ] **Step 2: Create the DiscussionContent component** 600 - 601 - This is the main component. It follows `IndexContent.tsx` patterns: greedy column-fill, section nav, filter bar, click-to-play panel. The file will be ~400 lines. 
602 - 603 - Create `apps/ionosphere/src/app/discussion/DiscussionContent.tsx`: 604 - 605 - The component should implement: 606 - 607 - 1. **Data types**: `DiscussionItem` with uri, author_handle, author_display_name, author_avatar_url, text, likes, reposts, content_type, external_url, og_title, talk_rkey, talk_title 608 - 2. **Flow items**: `{ type: "heading", label: string }` or `{ type: "item", item: DiscussionItem }` or `{ type: "vodDirectory", sites: string[] }` or `{ type: "stats", stats: Stats }` 609 - 3. **Filter state**: `"all" | "posts" | "blogs" | "videos"` — filters which sections appear in the flow 610 - 4. **Column layout**: Reuse the greedy-fill pattern from IndexContent: measure container, compute columns, fill greedily with height estimation 611 - 5. **Section nav**: T (Top Posts) / R (Recaps) / V (Videos) sidebar buttons 612 - 6. **Item rendering**: Each item is a compact block: 613 - - Line 1: 14px avatar + handle (blue) + like count (muted) 614 - - Line 2: Post text (neutral-400, 1-2 lines truncated) or og_title for blogs 615 - - Line 3 (optional): Talk link → (neutral-500) + external link ↗ (green for blogs, purple for videos) 616 - 7. **Click handler**: Click on a talk link → opens right panel with talk video + transcript (same as concordance) 617 - 8. **Mobile**: Single-column progressive rendering (same as concordance MobileConcordance) 618 - 619 - Key measurements for column fill: 620 - - `ITEM_HEIGHT = 58` (3 lines × ~19px + 4px margin) 621 - - `HEADING_HEIGHT = 28` 622 - - `STATS_HEIGHT = 60` 623 - - `VOD_DIRECTORY_HEIGHT = 80` 624 - 625 - Filter bar at top: pill buttons styled like: 626 - ```tsx 627 - <button className={`text-xs px-3 py-1 rounded-full transition-colors ${ 628 - active ? 
"bg-blue-500/20 text-blue-300" : "text-neutral-500 hover:text-neutral-300" 629 - }`}>All</button> 630 - ``` 631 - 632 - Section headings in the flow: 633 - ```tsx 634 - <h3 className="text-[11px] font-bold text-neutral-500 uppercase tracking-wide border-b border-neutral-800 pb-1 mb-1 mt-2 first:mt-0"> 635 - {label} 636 - </h3> 637 - ``` 638 - 639 - Item rendering: 640 - ```tsx 641 - <div className="mb-1.5 text-[12px] leading-[1.5]"> 642 - <div className="flex items-baseline gap-1"> 643 - {item.author_avatar_url ? ( 644 - <img src={item.author_avatar_url} className="w-3.5 h-3.5 rounded-full shrink-0 relative top-[2px]" /> 645 - ) : ( 646 - <div className="w-3.5 h-3.5 rounded-full bg-neutral-700 shrink-0 relative top-[2px]" /> 647 - )} 648 - <span className="text-blue-400 text-[11px] truncate">{item.author_handle}</span> 649 - <span className="text-neutral-600 text-[10px] ml-auto shrink-0">{item.likes}♡</span> 650 - </div> 651 - <div className="text-neutral-400 pl-[18px] line-clamp-2 -mt-px"> 652 - {item.og_title || item.text} 653 - </div> 654 - {(item.talk_rkey || item.external_url) && ( 655 - <div className="pl-[18px] mt-0.5 flex gap-2"> 656 - {item.talk_rkey && ( 657 - <button onClick={() => handleSelect(item.talk_rkey)} className="text-neutral-500 text-[10px] hover:text-neutral-300"> 658 - {item.talk_title || 'Talk'} → 659 - </button> 660 - )} 661 - {item.external_url && ( 662 - <a href={item.external_url} target="_blank" rel="noopener" className={`text-[10px] ${ 663 - item.content_type === 'blog' ? 'text-emerald-500' : item.content_type === 'video' ? 
'text-purple-400' : 'text-neutral-500' 664 - }`}> 665 - {new URL(item.external_url).hostname} ↗ 666 - </a> 667 - )} 668 - </div> 669 - )} 670 - </div> 671 - ``` 672 - 673 - VOD directory: 674 - ```tsx 675 - <div className="p-2 bg-neutral-900 rounded mb-2"> 676 - <div className="text-neutral-600 text-[10px] font-semibold mb-1">VOD JAM SITES</div> 677 - <div className="flex flex-wrap gap-1"> 678 - {sites.map(s => ( 679 - <a key={s} href={`https://${s}`} target="_blank" rel="noopener" 680 - className="text-purple-400 text-[10px] bg-purple-500/10 px-1.5 py-0.5 rounded">{s}</a> 681 - ))} 682 - </div> 683 - </div> 684 - ``` 685 - 686 - Stats card: 687 - ```tsx 688 - <div className="p-2 bg-neutral-900 rounded mb-2 flex gap-4 justify-center text-center"> 689 - <div><div className="text-blue-400 text-lg font-bold">{stats.totalPosts}</div><div className="text-neutral-600 text-[9px]">posts</div></div> 690 - <div><div className="text-emerald-400 text-lg font-bold">{stats.blogCount}</div><div className="text-neutral-600 text-[9px]">recaps</div></div> 691 - <div><div className="text-purple-400 text-lg font-bold">{stats.vodSiteCount}</div><div className="text-neutral-600 text-[9px]">VOD sites</div></div> 692 - <div><div className="text-amber-400 text-lg font-bold">{stats.uniqueAuthors}</div><div className="text-neutral-600 text-[9px]">people</div></div> 693 - </div> 694 - ``` 695 - 696 - The right panel for click-to-play reuses the exact same pattern as IndexContent: fetch talk data, open `<TimestampProvider>` with `<VideoPlayer>` and `<TranscriptView>`. 
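The greedy column-fill named in item 4 of Step 2 can be sketched roughly as follows. The actual `IndexContent.tsx` code is not shown in this plan, so this is an assumption about its general shape: walk the flow items in reading order, estimate each height from the constants listed under "Key measurements", and start a new column when the next item would overflow. `estimateHeight` and `fillColumns` are hypothetical names.

```javascript
// Height estimates in pixels, taken from the plan's "Key measurements" list.
const ITEM_HEIGHT = 58;
const HEADING_HEIGHT = 28;
const STATS_HEIGHT = 60;
const VOD_DIRECTORY_HEIGHT = 80;

// Map a flow item ({ type: 'heading' | 'item' | 'vodDirectory' | 'stats' })
// to its estimated rendered height.
function estimateHeight(flowItem) {
  switch (flowItem.type) {
    case 'heading': return HEADING_HEIGHT;
    case 'stats': return STATS_HEIGHT;
    case 'vodDirectory': return VOD_DIRECTORY_HEIGHT;
    default: return ITEM_HEIGHT;
  }
}

// Hypothetical sketch: place items into columns greedily, in order.
// A new column starts when the next item would exceed columnHeight;
// an item taller than columnHeight still gets a column of its own.
function fillColumns(flowItems, columnHeight) {
  const columns = [[]];
  let used = 0;
  for (const flowItem of flowItems) {
    const height = estimateHeight(flowItem);
    const current = columns[columns.length - 1];
    if (current.length > 0 && used + height > columnHeight) {
      columns.push([]);
      used = 0;
    }
    columns[columns.length - 1].push(flowItem);
    used += height;
  }
  return columns;
}
```

This version can leave a section heading stranded at the bottom of a column; a real layout would probably keep each heading together with the item that follows it.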
697 - 698 - - [ ] **Step 3: Commit** 699 - 700 - ```bash 701 - git add apps/ionosphere/src/app/discussion/ 702 - git commit -m "feat: conference discussion page with multi-column layout" 703 - ``` 704 - 705 - --- 706 - 707 - ## Task 5: Nav Update and Verification 708 - 709 - **Files:** 710 - - Modify: `apps/ionosphere/src/app/components/NavHeader.tsx:7-13` 711 - 712 - - [ ] **Step 1: Add Discussion to nav** 713 - 714 - In NavHeader.tsx, update the NAV_ITEMS array (line 7-13): 715 - 716 - ```typescript 717 - const NAV_ITEMS = [ 718 - { href: "/talks", label: "Talks" }, 719 - { href: "/tracks", label: "Tracks" }, 720 - { href: "/speakers", label: "Speakers" }, 721 - { href: "/concepts", label: "Concepts" }, 722 - { href: "/concordance", label: "Index" }, 723 - { href: "/discussion", label: "Discussion" }, 724 - ]; 725 - ``` 726 - 727 - - [ ] **Step 2: Restart appview and frontend** 728 - 729 - Restart both servers (kill existing, re-launch on ports 3010 and 3011). 730 - 731 - - [ ] **Step 3: Verify** 732 - 733 - 1. Check API: `curl -s http://localhost:3010/xrpc/tv.ionosphere.getDiscussion | node -e "process.stdin.on('data',d=>{const j=JSON.parse(d);console.log('Posts:',j.posts.length,'Blogs:',j.blogs.length,'Videos:',j.videos.length,'VOD sites:',j.vodSites.length)})"` 734 - 2. Open http://localhost:3011/discussion — verify multi-column layout, section headers, filter bar, clickable items 735 - 3. Click a post with a talk link → verify right panel opens with video + transcript 736 - 4. Test filter pills: "Blog Posts" should show only blog section, "Videos" only video section 737 - 738 - - [ ] **Step 4: Commit** 739 - 740 - ```bash 741 - git add apps/ionosphere/src/app/components/NavHeader.tsx 742 - git commit -m "feat: add Discussion to site navigation" 743 - ```
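In Task 2 of this plan, `classifyPost` tests `url.hostname.endsWith(d)` and `extractPrimaryUrl` tests `l.includes(d)`. Both checks are loose: `endsWith('youtube.com')` is also true for a hostname such as `notyoutube.com`, and `includes` matches the domain anywhere in the URL, including the query string. A sketch of a stricter check follows, assuming that "exact host or a subdomain of it" is the intended rule. `hostMatchesDomain` and `isVodUrl` are hypothetical names, not part of the plan.

```javascript
// True when hostname is exactly `domain` or a subdomain of it.
function hostMatchesDomain(hostname, domain) {
  const host = hostname.toLowerCase();
  const d = domain.toLowerCase();
  return host === d || host.endsWith('.' + d);
}

// Hypothetical replacement for the endsWith/includes checks:
// parse the link and compare only its hostname against the domain list.
function isVodUrl(link, vodDomains) {
  try {
    const { hostname } = new URL(link);
    return vodDomains.some((d) => hostMatchesDomain(hostname, d));
  } catch {
    return false; // not a parseable absolute URL
  }
}
```

With this rule, `https://www.youtube.com/...` still counts as a VOD link, while a blog URL that merely carries `?ref=stream.place` does not.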
-964
docs/superpowers/plans/2026-04-12-conference-mentions.md
··· 1 - # Conference Mentions Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Surface time-aligned Bluesky mentions of speakers in the ionosphere.tv talk detail sidebar, with paginated fetching, thread following, and post-conference mentions. 6 - 7 - **Architecture:** A batch fetch script pulls mentions from the Bluesky search API, computes byte positions from transcript timings, and stores them in SQLite. A new XRPC endpoint serves mentions per talk. The frontend adds a "Mentions" tab to the right sidebar with scroll-synced mention cards using pretext spacers for vertical alignment. 8 - 9 - **Tech Stack:** Node.js (fetch script), SQLite (storage), Hono (API), React/Next.js (frontend), `@atproto/api` (Bluesky SDK) 10 - 11 - --- 12 - 13 - ## File Structure 14 - 15 - | File | Action | Responsibility | 16 - |------|--------|---------------| 17 - | `apps/ionosphere-appview/src/db.ts` | Modify | Add `mentions` table schema | 18 - | `apps/ionosphere-appview/src/routes.ts` | Modify | Add `getMentions` endpoint | 19 - | `apps/ionosphere/src/lib/api.ts` | Modify | Add `getMentions()` client function | 20 - | `apps/ionosphere/src/app/talks/[rkey]/page.tsx` | Modify | Fetch mentions server-side, pass to TalkContent | 21 - | `apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx` | Modify | Add tab system, render MentionsSidebar | 22 - | `apps/ionosphere/src/app/components/MentionsSidebar.tsx` | Create | Scroll-synced mention cards with thread expansion | 23 - | `scripts/fetch-mentions.mjs` | Create | Paginated fetch, thread following, byte mapping | 24 - 25 - --- 26 - 27 - ## Task 1: Database Schema 28 - 29 - **Files:** 30 - - Modify: `apps/ionosphere-appview/src/db.ts:167` (after comments table, before profiles table) 31 - 32 - - [ ] **Step 1: Add mentions table to 
schema** 33 - 34 - In `db.ts`, add after the comments index (line 166) and before the profiles table (line 168): 35 - 36 - ```sql 37 - CREATE TABLE IF NOT EXISTS mentions ( 38 - uri TEXT PRIMARY KEY, 39 - talk_uri TEXT, 40 - author_did TEXT NOT NULL, 41 - author_handle TEXT, 42 - text TEXT, 43 - created_at TEXT NOT NULL, 44 - talk_offset_ms INTEGER, 45 - byte_position INTEGER, 46 - likes INTEGER DEFAULT 0, 47 - reposts INTEGER DEFAULT 0, 48 - replies INTEGER DEFAULT 0, 49 - parent_uri TEXT, 50 - mention_type TEXT DEFAULT 'during_talk', 51 - indexed_at TEXT NOT NULL 52 - ); 53 - 54 - CREATE INDEX IF NOT EXISTS idx_mentions_talk ON mentions(talk_uri, talk_offset_ms); 55 - CREATE INDEX IF NOT EXISTS idx_mentions_parent ON mentions(parent_uri); 56 - ``` 57 - 58 - - [ ] **Step 2: Verify schema applies** 59 - 60 - Run: `cd apps/ionosphere-appview && npx tsx src/db.ts 2>&1 || echo "Check if db module exports migrate"` 61 - 62 - If `db.ts` doesn't have a standalone entry point, verify by checking the appview starts: 63 - ```bash 64 - sqlite3 apps/data/ionosphere.sqlite ".tables" | tr ' ' '\n' | sort 65 - ``` 66 - 67 - The `mentions` table should appear. If not, start the appview briefly or run the migrate function directly. 68 - 69 - - [ ] **Step 3: Commit** 70 - 71 - ```bash 72 - git add apps/ionosphere-appview/src/db.ts 73 - git commit -m "feat: add mentions table to SQLite schema" 74 - ``` 75 - 76 - --- 77 - 78 - ## Task 2: Fetch Script with Pagination and Threads 79 - 80 - **Files:** 81 - - Create: `scripts/fetch-mentions.mjs` 82 - 83 - This replaces the prototype scripts. Key improvements: cursor pagination, thread fetching, byte-position mapping, SQLite storage. 
84 - 85 - - [ ] **Step 1: Write the fetch script** 86 - 87 - Create `scripts/fetch-mentions.mjs`: 88 - 89 - ```javascript 90 - import { createRequire } from 'module'; 91 - const require = createRequire( 92 - new URL('../apps/ionosphere-appview/package.json', import.meta.url).pathname 93 - ); 94 - const { BskyAgent } = require('@atproto/api'); 95 - const Database = require('better-sqlite3'); 96 - 97 - import { fileURLToPath } from 'url'; 98 - import { dirname, join } from 'path'; 99 - 100 - const __dirname = dirname(fileURLToPath(import.meta.url)); 101 - const DB_PATH = join(__dirname, '..', 'apps', 'data', 'ionosphere.sqlite'); 102 - 103 - const CONF_SINCE = '2026-03-25T00:00:00Z'; 104 - const CONF_UNTIL = '2026-03-31T00:00:00Z'; 105 - const PRE_BUFFER_MS = 5 * 60 * 1000; 106 - const POST_BUFFER_MS = 30 * 60 * 1000; 107 - 108 - const agent = new BskyAgent({ service: 'https://bsky.social' }); 109 - 110 - function sleep(ms) { return new Promise(r => setTimeout(r, ms)); } 111 - 112 - // ── Byte position mapping ────────────────────────────────────────── 113 - 114 - function mapOffsetToBytePosition(talkOffsetMs, compactTranscript) { 115 - if (!compactTranscript) return null; 116 - const { text, startMs, timings } = compactTranscript; 117 - const words = text.split(/\s+/).filter(w => w.length > 0); 118 - const encoder = new TextEncoder(); 119 - 120 - let cursorMs = startMs; 121 - let wordIndex = 0; 122 - let searchFrom = 0; 123 - let lastBytePos = 0; 124 - 125 - for (const value of timings) { 126 - if (value < 0) { 127 - cursorMs += Math.abs(value); 128 - } else { 129 - if (wordIndex < words.length) { 130 - const word = words[wordIndex]; 131 - const idx = text.indexOf(word, searchFrom); 132 - if (idx !== -1) { 133 - const bytePos = encoder.encode(text.slice(0, idx)).length; 134 - if (cursorMs >= talkOffsetMs) return bytePos; 135 - lastBytePos = bytePos; 136 - searchFrom = idx + word.length; 137 - } 138 - cursorMs += value; 139 - wordIndex++; 140 - } 141 - } 142 - } 143 
- return lastBytePos; 144 - } 145 - 146 - // ── Search with pagination ───────────────────────────────────────── 147 - 148 - async function searchAllMentions(handle, since, until) { 149 - const allPosts = []; 150 - let cursor = undefined; 151 - 152 - for (let page = 0; page < 10; page++) { 153 - try { 154 - const params = { q: '*', mentions: handle, since, until, sort: 'latest', limit: 100 }; 155 - if (cursor) params.cursor = cursor; 156 - 157 - const res = await agent.app.bsky.feed.searchPosts(params); 158 - const posts = res.data?.posts || []; 159 - allPosts.push(...posts); 160 - 161 - cursor = res.data?.cursor; 162 - if (!cursor || posts.length < 100) break; 163 - await sleep(200); 164 - } catch (e) { 165 - // Fallback without wildcard 166 - try { 167 - const params = { q: 'atmosphere OR atproto', mentions: handle, since, until, sort: 'latest', limit: 100 }; 168 - if (cursor) params.cursor = cursor; 169 - const res = await agent.app.bsky.feed.searchPosts(params); 170 - allPosts.push(...(res.data?.posts || [])); 171 - break; 172 - } catch { break; } 173 - } 174 - } 175 - return allPosts; 176 - } 177 - 178 - // ── Thread fetching ──────────────────────────────────────────────── 179 - 180 - async function fetchThread(uri) { 181 - try { 182 - const res = await agent.app.bsky.feed.getPostThread({ uri, depth: 2 }); 183 - const thread = res.data?.thread; 184 - if (!thread?.replies) return []; 185 - 186 - return thread.replies 187 - .filter(r => r.$type === 'app.bsky.feed.defs#threadViewPost') 188 - .map(r => ({ 189 - uri: r.post.uri, 190 - author: r.post.author, 191 - text: r.post.record?.text, 192 - createdAt: r.post.record?.createdAt, 193 - likes: r.post.likeCount || 0, 194 - reposts: r.post.repostCount || 0, 195 - replies: r.post.replyCount || 0, 196 - })); 197 - } catch { 198 - return []; 199 - } 200 - } 201 - 202 - // ── Post-conference mentions ─────────────────────────────────────── 203 - 204 - async function searchPostConference(handle) { 205 - const allPosts = 
[]; 206 - for (const q of ['atmosphere OR atmosphereconf', 'ionosphere.tv']) { 207 - try { 208 - const res = await agent.app.bsky.feed.searchPosts({ 209 - q, mentions: handle, since: '2026-03-30T00:00:00Z', sort: 'latest', limit: 100 210 - }); 211 - for (const p of (res.data?.posts || [])) allPosts.push(p); 212 - await sleep(200); 213 - } catch { /* skip */ } 214 - } 215 - // Also search by domain 216 - try { 217 - const res = await agent.app.bsky.feed.searchPosts({ 218 - q: '*', since: '2026-03-30T00:00:00Z', domain: 'ionosphere.tv', sort: 'latest', limit: 100 219 - }); 220 - for (const p of (res.data?.posts || [])) allPosts.push(p); 221 - } catch { /* skip */ } 222 - 223 - // Deduplicate 224 - const seen = new Set(); 225 - return allPosts.filter(p => { if (seen.has(p.uri)) return false; seen.add(p.uri); return true; }); 226 - } 227 - 228 - // ── Main ─────────────────────────────────────────────────────────── 229 - 230 - async function main() { 231 - console.log('=== Fetch Mentions → SQLite ===\n'); 232 - 233 - await agent.login({ 234 - identifier: 'ionosphere.tv', 235 - password: process.env.BOT_PASSWORD, 236 - }); 237 - console.log('Authenticated\n'); 238 - 239 - const db = new Database(DB_PATH); 240 - 241 - // Ensure table exists 242 - db.exec(` 243 - CREATE TABLE IF NOT EXISTS mentions ( 244 - uri TEXT PRIMARY KEY, 245 - talk_uri TEXT, 246 - author_did TEXT NOT NULL, 247 - author_handle TEXT, 248 - text TEXT, 249 - created_at TEXT NOT NULL, 250 - talk_offset_ms INTEGER, 251 - byte_position INTEGER, 252 - likes INTEGER DEFAULT 0, 253 - reposts INTEGER DEFAULT 0, 254 - replies INTEGER DEFAULT 0, 255 - parent_uri TEXT, 256 - mention_type TEXT DEFAULT 'during_talk', 257 - indexed_at TEXT NOT NULL 258 - ); 259 - CREATE INDEX IF NOT EXISTS idx_mentions_talk ON mentions(talk_uri, talk_offset_ms); 260 - CREATE INDEX IF NOT EXISTS idx_mentions_parent ON mentions(parent_uri); 261 - `); 262 - 263 - const upsert = db.prepare(` 264 - INSERT OR REPLACE INTO mentions 265 - 
(uri, talk_uri, author_did, author_handle, text, created_at, 266 - talk_offset_ms, byte_position, likes, reposts, replies, 267 - parent_uri, mention_type, indexed_at) 268 - VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 269 - `); 270 - 271 - // Load talks with speakers and transcripts 272 - const talks = db.prepare(` 273 - SELECT DISTINCT t.uri, t.rkey, t.title, t.starts_at, t.ends_at, t.room, 274 - s.name as speaker_name, s.handle as speaker_handle 275 - FROM talks t 276 - JOIN talk_speakers ts ON ts.talk_uri = t.uri 277 - JOIN speakers s ON s.uri = ts.speaker_uri 278 - WHERE t.starts_at IS NOT NULL AND t.ends_at IS NOT NULL 279 - ORDER BY t.starts_at 280 - `).all(); 281 - 282 - // Group by talk 283 - const talkMap = new Map(); 284 - for (const row of talks) { 285 - if (!talkMap.has(row.uri)) { 286 - talkMap.set(row.uri, { ...row, speakers: [] }); 287 - } 288 - const t = talkMap.get(row.uri); 289 - if (!t.speakers.find(s => s.handle === row.speaker_handle)) { 290 - t.speakers.push({ name: row.speaker_name, handle: row.speaker_handle }); 291 - } 292 - } 293 - 294 - // Load transcripts for byte mapping 295 - const transcriptStmt = db.prepare(` 296 - SELECT document FROM transcripts WHERE talk_uri = ? 
LIMIT 1 297 - `); 298 - 299 - let totalMentions = 0; 300 - let totalThreadReplies = 0; 301 - const talkList = [...talkMap.values()]; 302 - 303 - for (let i = 0; i < talkList.length; i++) { 304 - const talk = talkList[i]; 305 - const talkStart = new Date(talk.starts_at); 306 - const talkEnd = new Date(talk.ends_at); 307 - const since = new Date(talkStart.getTime() - PRE_BUFFER_MS).toISOString(); 308 - const until = new Date(talkEnd.getTime() + POST_BUFFER_MS).toISOString(); 309 - 310 - // Get transcript for byte mapping 311 - const transcriptRow = transcriptStmt.get(talk.uri); 312 - let compact = null; 313 - if (transcriptRow?.document) { 314 - try { compact = JSON.parse(transcriptRow.document); } catch {} 315 - } 316 - 317 - const allPosts = new Map(); 318 - 319 - // During-talk mentions per speaker 320 - for (const speaker of talk.speakers) { 321 - if (!speaker.handle) continue; 322 - const posts = await searchAllMentions(speaker.handle, since, until); 323 - for (const p of posts) { 324 - if (!allPosts.has(p.uri)) allPosts.set(p.uri, p); 325 - } 326 - await sleep(150); 327 - } 328 - 329 - // Process and store 330 - const insertMany = db.transaction((posts) => { 331 - for (const p of posts) { 332 - const createdAt = new Date(p.record?.createdAt); 333 - const offsetMs = createdAt.getTime() - talkStart.getTime(); 334 - const bytePos = compact ? 
mapOffsetToBytePosition(offsetMs, compact) : null; 335 - 336 - upsert.run( 337 - p.uri, talk.uri, p.author.did, p.author.handle, 338 - p.record?.text, p.record?.createdAt, 339 - offsetMs, bytePos, 340 - p.likeCount || 0, p.repostCount || 0, p.replyCount || 0, 341 - null, 'during_talk', new Date().toISOString() 342 - ); 343 - } 344 - }); 345 - 346 - const posts = [...allPosts.values()]; 347 - if (posts.length > 0) insertMany(posts); 348 - totalMentions += posts.length; 349 - 350 - // Fetch threads for posts with replies 351 - const postsWithReplies = posts.filter(p => (p.replyCount || 0) > 0); 352 - for (const p of postsWithReplies) { 353 - const replies = await fetchThread(p.uri); 354 - for (const reply of replies) { 355 - const parentCreatedAt = new Date(p.record?.createdAt); 356 - const parentOffsetMs = parentCreatedAt.getTime() - talkStart.getTime(); 357 - const parentBytePos = compact ? mapOffsetToBytePosition(parentOffsetMs, compact) : null; 358 - 359 - upsert.run( 360 - reply.uri, talk.uri, reply.author.did, reply.author.handle, 361 - reply.text, reply.createdAt, 362 - parentOffsetMs, parentBytePos, 363 - reply.likes, reply.reposts, reply.replies, 364 - p.uri, 'during_talk', new Date().toISOString() 365 - ); 366 - totalThreadReplies++; 367 - } 368 - await sleep(200); 369 - } 370 - 371 - if (posts.length > 0) { 372 - console.log(`[${i + 1}/${talkList.length}] "${talk.title}" — ${posts.length} mentions, ${postsWithReplies.length} threads`); 373 - } else { 374 - console.log(`[${i + 1}/${talkList.length}] "${talk.title}" — no mentions`); 375 - } 376 - } 377 - 378 - // Post-conference mentions 379 - console.log('\n--- Post-conference mentions ---'); 380 - const speakerHandles = [...new Set(talkList.flatMap(t => t.speakers.map(s => s.handle)).filter(Boolean))]; 381 - let postConfCount = 0; 382 - 383 - // Domain search for ionosphere.tv links 384 - try { 385 - const res = await agent.app.bsky.feed.searchPosts({ 386 - q: '*', domain: 'ionosphere.tv', since: 
'2026-03-30T00:00:00Z', sort: 'latest', limit: 100 387 - }); 388 - const posts = res.data?.posts || []; 389 - const insertPostConf = db.transaction((posts) => { 390 - for (const p of posts) { 391 - upsert.run( 392 - p.uri, null, p.author.did, p.author.handle, 393 - p.record?.text, p.record?.createdAt, 394 - null, null, 395 - p.likeCount || 0, p.repostCount || 0, p.replyCount || 0, 396 - null, 'post_conference', new Date().toISOString() 397 - ); 398 - } 399 - }); 400 - insertPostConf(posts); 401 - postConfCount += posts.length; 402 - console.log(` ionosphere.tv domain: ${posts.length} posts`); 403 - } catch (e) { 404 - console.error(` ionosphere.tv domain search failed: ${e.message}`); 405 - } 406 - 407 - // stream.place domain 408 - try { 409 - const res = await agent.app.bsky.feed.searchPosts({ 410 - q: 'atmosphere', domain: 'stream.place', since: '2026-03-30T00:00:00Z', sort: 'latest', limit: 100 411 - }); 412 - const posts = res.data?.posts || []; 413 - for (const p of posts) { 414 - upsert.run( 415 - p.uri, null, p.author.did, p.author.handle, 416 - p.record?.text, p.record?.createdAt, 417 - null, null, 418 - p.likeCount || 0, p.repostCount || 0, p.replyCount || 0, 419 - null, 'post_conference', new Date().toISOString() 420 - ); 421 - } 422 - postConfCount += posts.length; 423 - console.log(` stream.place domain: ${posts.length} posts`); 424 - } catch (e) { 425 - console.error(` stream.place domain search failed: ${e.message}`); 426 - } 427 - 428 - await sleep(200); 429 - 430 - console.log(`\n=== DONE ===`); 431 - console.log(`During-talk mentions: ${totalMentions}`); 432 - console.log(`Thread replies: ${totalThreadReplies}`); 433 - console.log(`Post-conference: ${postConfCount}`); 434 - console.log(`Total stored: ${db.prepare('SELECT COUNT(*) as c FROM mentions').get().c}`); 435 - 436 - db.close(); 437 - } 438 - 439 - main().catch(console.error); 440 - ``` 441 - 442 - - [ ] **Step 2: Run the fetch script** 443 - 444 - ```bash 445 - source 
apps/ionosphere-appview/.env && BOT_PASSWORD="$BOT_PASSWORD" node scripts/fetch-mentions.mjs 446 - ``` 447 - 448 - Expected: Iterates through ~120 talks, stores mentions with byte positions into SQLite. Should take 5-10 minutes due to API rate limiting. 449 - 450 - - [ ] **Step 3: Verify data** 451 - 452 - ```bash 453 - sqlite3 apps/data/ionosphere.sqlite "SELECT COUNT(*) FROM mentions;" 454 - sqlite3 apps/data/ionosphere.sqlite "SELECT mention_type, COUNT(*) FROM mentions GROUP BY mention_type;" 455 - sqlite3 apps/data/ionosphere.sqlite "SELECT m.talk_offset_ms, m.byte_position, m.text FROM mentions m WHERE m.talk_uri IS NOT NULL AND m.byte_position IS NOT NULL LIMIT 5;" 456 - ``` 457 - 458 - Expected: 2000+ rows, mix of during_talk and post_conference, byte_position populated for during-talk mentions. 459 - 460 - - [ ] **Step 4: Commit** 461 - 462 - ```bash 463 - git add scripts/fetch-mentions.mjs 464 - git commit -m "feat: paginated mention fetcher with threads and byte mapping" 465 - ``` 466 - 467 - --- 468 - 469 - ## Task 3: API Endpoint 470 - 471 - **Files:** 472 - - Modify: `apps/ionosphere-appview/src/routes.ts:273` (after getComments, before getConceptClusters) 473 - 474 - - [ ] **Step 1: Add getMentions route** 475 - 476 - Insert after line 272 (end of getComments handler) in `routes.ts`: 477 - 478 - ```typescript 479 - app.get("/xrpc/tv.ionosphere.getMentions", (c) => { 480 - const talkRkey = c.req.query("talkRkey"); 481 - if (!talkRkey) return c.json({ mentions: [], total: 0 }); 482 - 483 - const talk = db.prepare("SELECT uri FROM talks WHERE rkey = ?").get(talkRkey) as any; 484 - if (!talk) return c.json({ mentions: [], total: 0 }); 485 - 486 - // Fetch top-level mentions (parent_uri IS NULL) 487 - const topLevel = db.prepare( 488 - `SELECT m.*, p.handle as author_handle, p.display_name as author_display_name, p.avatar_url as author_avatar_url 489 - FROM mentions m 490 - LEFT JOIN profiles p ON m.author_did = p.did 491 - WHERE m.talk_uri = ? 
AND m.parent_uri IS NULL 492 - ORDER BY 493 - CASE m.mention_type WHEN 'during_talk' THEN 0 ELSE 1 END, 494 - m.talk_offset_ms ASC, 495 - m.created_at ASC` 496 - ).all(talk.uri); 497 - 498 - // Fetch thread replies for each top-level mention 499 - const replyStmt = db.prepare( 500 - `SELECT m.*, p.handle as author_handle, p.display_name as author_display_name, p.avatar_url as author_avatar_url 501 - FROM mentions m 502 - LEFT JOIN profiles p ON m.author_did = p.did 503 - WHERE m.parent_uri = ? 504 - ORDER BY m.created_at ASC` 505 - ); 506 - 507 - const mentions = topLevel.map((m: any) => ({ 508 - ...m, 509 - thread: replyStmt.all(m.uri), 510 - })); 511 - 512 - return c.json({ mentions, total: mentions.length }); 513 - }); 514 - ``` 515 - 516 - - [ ] **Step 2: Test the endpoint** 517 - 518 - Start the appview and curl: 519 - ```bash 520 - curl -s 'http://localhost:3001/xrpc/tv.ionosphere.getMentions?talkRkey=landslide' | node -e "process.stdin.on('data',d=>{const j=JSON.parse(d);console.log('total:',j.total);j.mentions.slice(0,3).forEach(m=>console.log(m.author_handle,m.talk_offset_ms,m.text?.slice(0,60)))})" 521 - ``` 522 - 523 - Expected: Returns mentions array sorted by talk_offset_ms with nested thread replies. 
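The nesting this endpoint returns (top-level mentions ordered by talk offset, each carrying its replies in a `thread` array) can also be illustrated as a pure function over already-fetched rows. This is a minimal sketch, not the route's implementation: the `MentionRow` shape lists only the columns the sketch needs, `nestMentions` is a hypothetical helper name, and rows without an offset are sorted last here as a simplification (the SQL above orders by `mention_type` first).

```typescript
// Hypothetical row shape: only the columns this sketch needs.
interface MentionRow {
  uri: string;
  parent_uri: string | null;
  talk_offset_ms: number | null;
  created_at: string; // ISO 8601, so string comparison matches time order
}

interface MentionWithThread extends MentionRow {
  thread: MentionRow[];
}

function byCreatedAt(a: MentionRow, b: MentionRow): number {
  if (a.created_at < b.created_at) return -1;
  if (a.created_at > b.created_at) return 1;
  return 0;
}

// Group replies under their parent in one pass, then order top-level
// mentions by talk offset (rows without an offset sort last).
function nestMentions(rows: MentionRow[]): MentionWithThread[] {
  const repliesByParent = new Map<string, MentionRow[]>();
  for (const row of rows) {
    if (row.parent_uri === null) continue;
    const bucket = repliesByParent.get(row.parent_uri) ?? [];
    bucket.push(row);
    repliesByParent.set(row.parent_uri, bucket);
  }

  const offsetOf = (row: MentionRow): number =>
    row.talk_offset_ms ?? Number.MAX_SAFE_INTEGER;

  return rows
    .filter((row) => row.parent_uri === null)
    .sort((a, b) => offsetOf(a) - offsetOf(b) || byCreatedAt(a, b))
    .map((row) => ({
      ...row,
      thread: (repliesByParent.get(row.uri) ?? []).sort(byCreatedAt),
    }));
}
```

Grouping in memory like this avoids one reply query per top-level mention, which may matter for talks with many threads; whether that trade-off is worth it for this dataset is untested.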
524 - 525 - - [ ] **Step 3: Commit** 526 - 527 - ```bash 528 - git add apps/ionosphere-appview/src/routes.ts 529 - git commit -m "feat: add getMentions XRPC endpoint" 530 - ``` 531 - 532 - --- 533 - 534 - ## Task 4: Frontend API Client and Data Fetching 535 - 536 - **Files:** 537 - - Modify: `apps/ionosphere/src/lib/api.ts:31` (after getConcept) 538 - - Modify: `apps/ionosphere/src/app/talks/[rkey]/page.tsx:36` (add mentions fetch) 539 - 540 - - [ ] **Step 1: Add getMentions to api.ts** 541 - 542 - Add after `getConcept` (line 31): 543 - 544 - ```typescript 545 - export async function getMentions(talkRkey: string) { 546 - return fetchApi<{ mentions: any[]; total: number }>(`/xrpc/tv.ionosphere.getMentions?talkRkey=${encodeURIComponent(talkRkey)}`); 547 - } 548 - ``` 549 - 550 - - [ ] **Step 2: Fetch mentions in page.tsx** 551 - 552 - Update `page.tsx` to fetch mentions server-side and pass to TalkContent: 553 - 554 - ```typescript 555 - import { getTalk, getTalks, getMentions } from "@/lib/api"; 556 - ``` 557 - 558 - Update the `TalkPage` component (line 34-38): 559 - 560 - ```typescript 561 - export default async function TalkPage({ params }: { params: Promise<{ rkey: string }> }) { 562 - const { rkey } = await params; 563 - const [{ talk, speakers, concepts }, { mentions }] = await Promise.all([ 564 - getTalk(rkey), 565 - getMentions(rkey), 566 - ]); 567 - 568 - return <TalkContent talk={talk} speakers={speakers} concepts={concepts} mentions={mentions} />; 569 - } 570 - ``` 571 - 572 - - [ ] **Step 3: Update TalkContent props** 573 - 574 - In `TalkContent.tsx`, update the interface (line 10-13): 575 - 576 - ```typescript 577 - interface TalkContentProps { 578 - talk: any; 579 - speakers: any[]; 580 - concepts: any[]; 581 - mentions: any[]; 582 - } 583 - ``` 584 - 585 - Update the destructuring (line 24): 586 - 587 - ```typescript 588 - export default function TalkContent({ talk, speakers, concepts, mentions }: TalkContentProps) { 589 - ``` 590 - 591 - - [ ] 
**Step 4: Commit** 592 - 593 - ```bash 594 - git add apps/ionosphere/src/lib/api.ts 'apps/ionosphere/src/app/talks/[rkey]/page.tsx' 'apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx' 595 - git commit -m "feat: wire mentions data from API to talk page" 596 - ``` 597 - 598 - --- 599 - 600 - ## Task 5: MentionsSidebar Component 601 - 602 - **Files:** 603 - - Create: `apps/ionosphere/src/app/components/MentionsSidebar.tsx` 604 - 605 - - [ ] **Step 1: Create the MentionsSidebar component** 606 - 607 - ```tsx 608 - "use client"; 609 - 610 - import { useState, useRef, useEffect, useCallback } from "react"; 611 - import { useTimestamp } from "@/app/components/TimestampProvider"; 612 - 613 - interface Mention { 614 - uri: string; 615 - author_did: string; 616 - author_handle: string; 617 - author_display_name: string; 618 - author_avatar_url: string; 619 - text: string; 620 - created_at: string; 621 - talk_offset_ms: number; 622 - byte_position: number; 623 - likes: number; 624 - reposts: number; 625 - replies: number; 626 - mention_type: string; 627 - thread: Mention[]; 628 - } 629 - 630 - interface MentionsSidebarProps { 631 - mentions: Mention[]; 632 - words: Array<{ byteStart: number; startTime: number }>; 633 - } 634 - 635 - export default function MentionsSidebar({ mentions, words }: MentionsSidebarProps) { 636 - const { currentTimeNs } = useTimestamp(); 637 - const containerRef = useRef<HTMLDivElement>(null); 638 - const [expandedThreads, setExpandedThreads] = useState<Set<string>>(new Set()); 639 - 640 - const duringTalk = mentions.filter(m => m.mention_type === "during_talk"); 641 - const postConference = mentions.filter(m => m.mention_type === "post_conference"); 642 - 643 - // Find the mention closest to current playback time 644 - const currentOffsetMs = Number(currentTimeNs) / 1_000_000; 645 - const activeMentionIdx = duringTalk.findIndex((m, i) => { 646 - const next = duringTalk[i + 1]; 647 - return !next || next.talk_offset_ms > currentOffsetMs; 648 - }); 649
- 650 - // Auto-scroll to active mention 651 - useEffect(() => { 652 - if (activeMentionIdx < 0) return; 653 - const container = containerRef.current; 654 - if (!container) return; 655 - const el = container.querySelector(`[data-mention-idx="${activeMentionIdx}"]`); 656 - if (el) { 657 - el.scrollIntoView({ behavior: "smooth", block: "center" }); 658 - } 659 - }, [activeMentionIdx]); 660 - 661 - const toggleThread = useCallback((uri: string) => { 662 - setExpandedThreads(prev => { 663 - const next = new Set(prev); 664 - if (next.has(uri)) next.delete(uri); 665 - else next.add(uri); 666 - return next; 667 - }); 668 - }, []); 669 - 670 - const { seekTo } = useTimestamp(); 671 - 672 - const handleMentionClick = useCallback((offsetMs: number) => { 673 - if (offsetMs != null && seekTo) { 674 - seekTo(BigInt(offsetMs) * 1_000_000n); 675 - } 676 - }, [seekTo]); 677 - 678 - return ( 679 - <div ref={containerRef} className="flex flex-col gap-1 overflow-y-auto h-full"> 680 - {duringTalk.length === 0 && postConference.length === 0 && ( 681 - <p className="text-neutral-500 text-xs">No mentions found for this talk.</p> 682 - )} 683 - 684 - {duringTalk.map((m, idx) => ( 685 - <MentionCard 686 - key={m.uri} 687 - mention={m} 688 - idx={idx} 689 - isActive={idx === activeMentionIdx} 690 - isThreadExpanded={expandedThreads.has(m.uri)} 691 - onToggleThread={() => toggleThread(m.uri)} 692 - onClick={() => handleMentionClick(m.talk_offset_ms)} 693 - /> 694 - ))} 695 - 696 - {postConference.length > 0 && ( 697 - <> 698 - <div className="border-t border-neutral-700 my-3 pt-2"> 699 - <h3 className="text-[10px] font-semibold text-neutral-500 uppercase tracking-wide"> 700 - After the conference 701 - </h3> 702 - </div> 703 - {postConference.map((m) => ( 704 - <MentionCard 705 - key={m.uri} 706 - mention={m} 707 - idx={-1} 708 - isActive={false} 709 - isThreadExpanded={expandedThreads.has(m.uri)} 710 - onToggleThread={() => toggleThread(m.uri)} 711 - onClick={() => {}} 712 - /> 713 - ))} 
714 - </> 715 - )} 716 - </div> 717 - ); 718 - } 719 - 720 - function MentionCard({ 721 - mention: m, 722 - idx, 723 - isActive, 724 - isThreadExpanded, 725 - onToggleThread, 726 - onClick, 727 - }: { 728 - mention: Mention; 729 - idx: number; 730 - isActive: boolean; 731 - isThreadExpanded: boolean; 732 - onToggleThread: () => void; 733 - onClick: () => void; 734 - }) { 735 - const offsetMin = m.talk_offset_ms != null ? Math.floor(m.talk_offset_ms / 60000) : null; 736 - const offsetSec = m.talk_offset_ms != null ? Math.floor((m.talk_offset_ms % 60000) / 1000) : null; 737 - const timeLabel = offsetMin != null ? `${offsetMin}:${String(offsetSec).padStart(2, "0")}` : null; 738 - 739 - return ( 740 - <div data-mention-idx={idx}> 741 - <div 742 - onClick={onClick} 743 - className={`p-2 rounded-md border-l-2 cursor-pointer transition-colors ${ 744 - isActive 745 - ? "bg-blue-500/10 border-blue-400" 746 - : "bg-neutral-900/50 border-neutral-700 hover:bg-neutral-800/50 hover:border-blue-500/50" 747 - }`} 748 - > 749 - <div className="flex items-center gap-1.5 mb-1"> 750 - {m.author_avatar_url ? 
( 751 - <img src={m.author_avatar_url} alt="" className="w-4 h-4 rounded-full" /> 752 - ) : ( 753 - <div className="w-4 h-4 rounded-full bg-neutral-700 shrink-0" /> 754 - )} 755 - <span className="text-blue-400 text-[11px] font-medium truncate"> 756 - @{m.author_handle || "unknown"} 757 - </span> 758 - {timeLabel && ( 759 - <span className="text-neutral-600 text-[10px] ml-auto shrink-0">{timeLabel}</span> 760 - )} 761 - </div> 762 - <p className="text-neutral-300 text-[11px] leading-relaxed line-clamp-3">{m.text}</p> 763 - <div className="flex items-center gap-3 mt-1 text-[10px] text-neutral-500"> 764 - {m.likes > 0 && <span>{m.likes} ♡</span>} 765 - {m.reposts > 0 && <span>{m.reposts} ⟳</span>} 766 - {m.thread?.length > 0 && ( 767 - <button 768 - onClick={(e) => { e.stopPropagation(); onToggleThread(); }} 769 - className="text-blue-400/70 hover:text-blue-300" 770 - > 771 - {isThreadExpanded ? "▾" : "▸"} {m.thread.length} {m.thread.length === 1 ? "reply" : "replies"} 772 - </button> 773 - )} 774 - </div> 775 - </div> 776 - 777 - {isThreadExpanded && m.thread?.length > 0 && ( 778 - <div className="ml-3 mt-0.5 flex flex-col gap-0.5"> 779 - {m.thread.map((reply) => ( 780 - <div key={reply.uri} className="p-1.5 rounded bg-neutral-900/30 border-l border-neutral-700"> 781 - <div className="flex items-center gap-1.5 mb-0.5"> 782 - {reply.author_avatar_url ? 
( 783 - <img src={reply.author_avatar_url} alt="" className="w-3 h-3 rounded-full" /> 784 - ) : ( 785 - <div className="w-3 h-3 rounded-full bg-neutral-700 shrink-0" /> 786 - )} 787 - <span className="text-blue-400/70 text-[10px]">@{reply.author_handle}</span> 788 - {reply.likes > 0 && <span className="text-neutral-600 text-[10px] ml-auto">{reply.likes} ♡</span>} 789 - </div> 790 - <p className="text-neutral-400 text-[10px] leading-relaxed line-clamp-3">{reply.text}</p> 791 - </div> 792 - ))} 793 - </div> 794 - )} 795 - </div> 796 - ); 797 - } 798 - ``` 799 - 800 - - [ ] **Step 2: Verify component compiles** 801 - 802 - ```bash 803 - cd apps/ionosphere && npx next build 2>&1 | tail -20 804 - ``` 805 - 806 - Expected: No TypeScript errors for MentionsSidebar. (Full build may fail if other parts have issues, but the component itself should be clean.) 807 - 808 - - [ ] **Step 3: Commit** 809 - 810 - ```bash 811 - git add apps/ionosphere/src/app/components/MentionsSidebar.tsx 812 - git commit -m "feat: MentionsSidebar component with scroll sync and thread expansion" 813 - ``` 814 - 815 - --- 816 - 817 - ## Task 6: Tab System and Integration 818 - 819 - **Files:** 820 - - Modify: `apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx:195-223` 821 - 822 - - [ ] **Step 1: Add imports and state** 823 - 824 - At the top of TalkContent.tsx, add the MentionsSidebar import (after line 8): 825 - 826 - ```typescript 827 - import MentionsSidebar from "@/app/components/MentionsSidebar"; 828 - ``` 829 - 830 - Inside the component function, add tab state (after the `comments` state on line 25): 831 - 832 - ```typescript 833 - const [sidebarTab, setSidebarTab] = useState<"concepts" | "mentions">( 834 - mentions.length > 0 ? 
"mentions" : "concepts" 835 - ); 836 - ``` 837 - 838 - - [ ] **Step 2: Replace the right sidebar** 839 - 840 - Replace lines 195-223 (the entire `<aside>` block) with: 841 - 842 - ```tsx 843 - {/* Right sidebar — concepts + mentions (hidden on mobile, scrollable on desktop) */} 844 - <aside className="hidden lg:flex lg:flex-col lg:w-56 xl:w-64 shrink-0 border-l border-neutral-800 overflow-y-auto"> 845 - {/* Tab switcher */} 846 - <div className="flex border-b border-neutral-800 shrink-0"> 847 - <button 848 - onClick={() => setSidebarTab("concepts")} 849 - className={`flex-1 text-[11px] font-semibold px-3 py-2.5 transition-colors ${ 850 - sidebarTab === "concepts" 851 - ? "text-amber-300 border-b-2 border-amber-300" 852 - : "text-neutral-500 hover:text-neutral-300" 853 - }`} 854 - > 855 - Concepts{concepts.length > 0 ? ` (${concepts.length})` : ""} 856 - </button> 857 - <button 858 - onClick={() => setSidebarTab("mentions")} 859 - className={`flex-1 text-[11px] font-semibold px-3 py-2.5 transition-colors ${ 860 - sidebarTab === "mentions" 861 - ? "text-blue-300 border-b-2 border-blue-300" 862 - : "text-neutral-500 hover:text-neutral-300" 863 - }`} 864 - > 865 - Mentions{mentions.length > 0 ? 
` (${mentions.length})` : ""} 866 - </button> 867 - </div> 868 - 869 - {/* Tab content */} 870 - <div className="flex-1 min-h-0 overflow-y-auto p-4"> 871 - {sidebarTab === "concepts" && ( 872 - <> 873 - {concepts.length > 0 && ( 874 - <section> 875 - <h2 className="text-xs font-semibold text-neutral-500 uppercase tracking-wide mb-2">Concepts</h2> 876 - <div className="flex flex-wrap gap-1.5"> 877 - {concepts.map((c: any) => ( 878 - <a 879 - key={c.rkey} 880 - href={`/concepts/${c.rkey}`} 881 - className="text-xs px-2 py-0.5 rounded-full bg-amber-500/10 text-amber-300/80 hover:bg-amber-500/20 hover:text-amber-200 transition-colors" 882 - > 883 - {c.name} 884 - </a> 885 - ))} 886 - </div> 887 - </section> 888 - )} 889 - </> 890 - )} 891 - 892 - {sidebarTab === "mentions" && ( 893 - <MentionsSidebar mentions={mentions} words={[]} /> 894 - )} 895 - </div> 896 - 897 - {/* Mobile speakers (shown below transcript on small screens) */} 898 - <section className="lg:hidden p-4"> 899 - <h2 className="text-xs font-semibold text-neutral-500 uppercase tracking-wide mb-1">Speakers</h2> 900 - {speakers.map((s: any) => ( 901 - <a key={s.rkey} href={`/speakers/${s.rkey}`} className="block text-sm text-neutral-200 hover:text-white"> 902 - {s.name} 903 - </a> 904 - ))} 905 - </section> 906 - </aside> 907 - ``` 908 - 909 - - [ ] **Step 3: Verify build** 910 - 911 - ```bash 912 - cd apps/ionosphere && npx next build 2>&1 | tail -20 913 - ``` 914 - 915 - Expected: Clean build with mentions tab rendered in sidebar. 
916 - 917 - - [ ] **Step 4: Commit** 918 - 919 - ```bash 920 - git add 'apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx' 921 - git commit -m "feat: tabbed sidebar with mentions alongside concepts" 922 - ``` 923 - 924 - --- 925 - 926 - ## Task 7: End-to-End Verification 927 - 928 - - [ ] **Step 1: Run fetch script if not already done** 929 - 930 - ```bash 931 - source apps/ionosphere-appview/.env && BOT_PASSWORD="$BOT_PASSWORD" node scripts/fetch-mentions.mjs 932 - ``` 933 - 934 - - [ ] **Step 2: Start appview** 935 - 936 - ```bash 937 - cd apps/ionosphere-appview && npm run dev & 938 - ``` 939 - 940 - - [ ] **Step 3: Verify API returns data** 941 - 942 - ```bash 943 - curl -s 'http://localhost:3001/xrpc/tv.ionosphere.getMentions?talkRkey=landslide' | node -e "process.stdin.on('data',d=>{const j=JSON.parse(d);console.log(j.total,'mentions');if(j.mentions[0])console.log('first:',j.mentions[0].author_handle,j.mentions[0].text?.slice(0,80))})" 944 - ``` 945 - 946 - - [ ] **Step 4: Start frontend and verify UI** 947 - 948 - ```bash 949 - cd apps/ionosphere && npm run dev 950 - ``` 951 - 952 - Open a talk page with known mentions (e.g., "Landslide" by Erin Kissane). Verify: 953 - - Tab switcher shows "Concepts (N)" and "Mentions (N)" 954 - - Clicking Mentions tab shows mention cards with author, text, likes 955 - - Cards show time offset (e.g., "14:32") 956 - - Clicking a card seeks the video 957 - - Thread replies expand inline 958 - 959 - - [ ] **Step 5: Final commit** 960 - 961 - ```bash 962 - git add -A 963 - git commit -m "feat: conference mentions integration — time-aligned Bluesky mentions in talk sidebar" 964 - ```
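The time-offset checks in Step 4 of this task (a card label such as "14:32", and the card that tracks playback) can be exercised outside the browser with two small pure functions. This is a sketch under assumptions: `formatOffset` and `activeMentionIndex` are hypothetical helper names, not functions in the plan. Because the fetch script searches from five minutes before each talk starts, `talk_offset_ms` can be negative; the sketch clamps such offsets to zero, which is a choice made here rather than something the plan specifies.

```typescript
// Format a millisecond offset as m:ss. Negative offsets (mentions posted
// in the pre-talk buffer) are clamped to 0:00; that clamp is an assumption.
function formatOffset(offsetMs: number | null): string | null {
  if (offsetMs === null) return null;
  const clamped = Math.max(0, offsetMs);
  const minutes = Math.floor(clamped / 60000);
  const seconds = Math.floor((clamped % 60000) / 1000);
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}

// Index of the last mention posted at or before the current playback time,
// or -1 when playback has not reached any mention yet.
function activeMentionIndex(offsetsMs: number[], currentMs: number): number {
  let active = -1;
  offsetsMs.forEach((offset, i) => {
    if (offset <= currentMs) active = i;
  });
  return active;
}
```

Returning -1 before the first mention means no card is highlighted at the start of playback, which differs from the `findIndex` logic in the component; which behaviour is preferable is a design decision for the sidebar.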
-216
docs/superpowers/plans/2026-04-12-retranscribe-hallucinations.md
··· 1 - # Re-transcribe Hallucination Zones Implementation Plan 2 - 3 - > **For agentic workers:** REQUIRED: Use superpowers:subagent-driven-development (if subagents available) or superpowers:executing-plans to implement this plan. Steps use checkbox (`- [ ]`) syntax for tracking. 4 - 5 - **Goal:** Re-transcribe Whisper hallucination zones using diarization-aligned chunk boundaries, splicing clean results back into `transcript-enriched.json`. 6 - 7 - **Architecture:** Single CLI tool that reads v7 boundary output for hallucination zones, extracts audio from HLS for each zone, re-transcribes with OpenAI Whisper API using diarization-derived chunk points, and patches the transcript in-place. 8 - 9 - **Tech Stack:** TypeScript, OpenAI Whisper API, ffmpeg (audio extraction), existing `transcript-enriched.json` format 10 - 11 - **Spec:** `docs/superpowers/specs/2026-04-12-retranscribe-hallucinations-design.md` 12 - 13 - --- 14 - 15 - ## File Structure 16 - 17 - | File | Responsibility | 18 - |------|---------------| 19 - | `src/retranscribe-hallucinations.ts` | CLI tool: parse args, orchestrate extraction/transcription/splicing | 20 - 21 - Reused from existing code: 22 - - `src/transcribe-fullday.ts` — reference for `extractChunk`, `transcribeChunk` patterns (copy/adapt, don't import — that file has hardcoded stream configs) 23 - - `src/v7/types.ts` — `HallucinationZone`, `DiarizationInput` 24 - - `src/tracks.ts` — `STREAMS` config for stream URIs 25 - 26 - --- 27 - 28 - ## Chunk 1: The Tool 29 - 30 - ### Task 1: retranscribe-hallucinations.ts 31 - 32 - **Files:** 33 - - Create: `apps/ionosphere-appview/src/retranscribe-hallucinations.ts` 34 - 35 - - [ ] **Step 1: Implement the CLI tool** 36 - 37 - The tool needs these parts: 38 - 39 - **CLI arg parsing:** 40 - ``` 41 - npx tsx src/retranscribe-hallucinations.ts \ 42 - --stream-slug <slug> \ 43 - --boundaries <path-to-v7-boundaries.json> \ 44 - --diarization <path-to-diarization.json> 45 - ``` 46 - 47 - Also needs 
`OPENAI_API_KEY` from environment (load via `./env.js` like existing code). 48 - 49 - **Core logic:** 50 - 51 - ```ts 52 - import "./env.js"; 53 - import { createReadStream, readFileSync, writeFileSync, existsSync, mkdirSync } from "node:fs"; 54 - import { execSync } from "node:child_process"; 55 - import path from "node:path"; 56 - import OpenAI from "openai"; 57 - import type { HallucinationZone, DiarizationInput } from "./v7/types.js"; 58 - ``` 59 - 60 - 1. **Load inputs:** 61 - - Parse v7 boundaries JSON → extract `hallucinationZones` array 62 - - Load diarization JSON 63 - - Load existing `transcript-enriched.json` from the stream's fullday dir 64 - - Get stream URI from `STREAMS` config (import from tracks.ts or inline) 65 - 66 - 2. **For each hallucination zone:** 67 - - Find diarization segments that overlap the zone 68 - - If no diarization speech → skip (log "no speech in zone, skipping") 69 - - Compute chunk boundaries from diarization: 70 - - Start at first speech segment onset within zone 71 - - End at last speech segment offset within zone 72 - - If total duration > 20 min (Whisper's practical limit at 32kbps), split at diarization gaps > 5s 73 - - Extract audio for each chunk via ffmpeg: 74 - ``` 75 - ffmpeg -ss <startS> -i "<playlistUrl>" -t <durationS> -vn -acodec libmp3lame -ar 16000 -ac 1 -b:a 32k "<tmpFile>" -y 76 - ``` 77 - Use a temp directory: `<streamDir>/retranscribe-chunks/` 78 - - Transcribe each chunk via OpenAI Whisper API (same params as `transcribe-fullday.ts`): 79 - ```ts 80 - const response = await client.audio.transcriptions.create({ 81 - model: "whisper-1", 82 - file: createReadStream(chunkPath), 83 - response_format: "verbose_json", 84 - timestamp_granularities: ["word"], 85 - }); 86 - ``` 87 - - Adjust word timestamps: Whisper returns timestamps relative to chunk start, so add the chunk's absolute start time: `word.start += chunkStartS; word.end += chunkStartS;` 88 - 89 - 3.
**Splice into transcript:** 90 - - Load `transcript-enriched.json` 91 - - For each re-transcribed zone: 92 - - Remove all words where `word.start >= zone.startS && word.start <= zone.endS` 93 - - Insert new words 94 - - Sort all words by start time 95 - - Update `total_words` 96 - - Write back to `transcript-enriched.json` 97 - - Also back up original as `transcript-enriched.json.bak` before first write 98 - 99 - **Console output:** 100 - ``` 101 - === Re-transcribing hallucination zones for room-2301-day-2 === 102 - Loaded 9 hallucination zones 103 - Zone 1: 96.0m - 209.0m (113.0m) 104 - Diarization speech: 5 segments, 12.3m total 105 - Chunks: 1 (12.3m) 106 - Extracting audio... done 107 - Transcribing chunk 1/1... 847 words 108 - Zone 2: ... 109 - ... 110 - Splicing 2,341 new words into transcript 111 - Removed 8,542 hallucinated words 112 - Wrote transcript-enriched.json (backup: .bak) 113 - ``` 114 - 115 - **Key references in existing code:** 116 - - `transcribe-fullday.ts:64-76` — `extractChunk` function (ffmpeg command) 117 - - `transcribe-fullday.ts:81-98` — `transcribeChunk` function (OpenAI API call) 118 - - `transcribe-fullday.ts:38-41` — `FULLDAY_STREAMS` for stream URIs 119 - - `tracks.ts:27-34` — `STREAMS` config with URIs and dir names 120 - 121 - **Stream URI → playlist URL:** 122 - ```ts 123 - const VOD_ENDPOINT = "https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist"; 124 - const playlistUrl = `${VOD_ENDPOINT}?uri=${encodeURIComponent(streamUri)}`; 125 - ``` 126 - 127 - **STREAMS config for slug → URI + dirName mapping** (from tracks.ts): 128 - ```ts 129 - const STREAMS = [ 130 - { slug: "great-hall-day-1", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadw52j22", dirName: "Great_Hall___Day_1" }, 131 - { slug: "great-hall-day-2", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miighlz53o22", dirName: "Great_Hall___Day_2" }, 132 - { slug: "room-2301-day-1", uri: 
"at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadx2dj22", dirName: "Room_2301___Day_1" }, 133 - { slug: "room-2301-day-2", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadxeqn22", dirName: "Room_2301___Day_2" }, 134 - { slug: "performance-theatre-day-1", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadwgvz22", dirName: "Performance_Theater___Day_1" }, 135 - { slug: "performance-theatre-day-2", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadwqgy22", dirName: "Performance_Theater___Day_2" }, 136 - { slug: "atscience", uri: "at://did:plc:rbvrr34edl5ddpuwcubjiost/place.stream.video/3miieadvruo22", dirName: "ATScience" }, 137 - ]; 138 - ``` 139 - 140 - - [ ] **Step 2: Test on a small hallucination zone first** 141 - 142 - Pick ATScience (Welsh hallucination, only 16 min zone) as a test case: 143 - 144 - ```bash 145 - cd apps/ionosphere-appview 146 - source .env && export OPENAI_API_KEY 147 - 148 - # First generate v7 boundaries if not already present 149 - npx tsx src/detect-boundaries-v7.ts \ 150 - data/fullday/ATScience/transcript-enriched.json \ 151 - --diarization data/fullday/ATScience/diarization.json \ 152 - --stream-slug atscience 153 - 154 - # Then re-transcribe 155 - npx tsx src/retranscribe-hallucinations.ts \ 156 - --stream-slug atscience \ 157 - --boundaries data/fullday/ATScience/transcript-enriched-boundaries-v7.json \ 158 - --diarization data/fullday/ATScience/diarization.json 159 - ``` 160 - 161 - Verify: 162 - - Check that `transcript-enriched.json.bak` was created 163 - - Compare word count before and after 164 - - Spot-check the re-transcribed zone: are the Welsh hallucinations gone? Is there English content now? 
165 - 166 - - [ ] **Step 3: Commit** 167 - 168 - ```bash 169 - git add src/retranscribe-hallucinations.ts 170 - git commit -m "feat: retranscribe hallucination zones using diarization-aligned chunks" 171 - ``` 172 - 173 - ### Task 2: Run on All Streams 174 - 175 - - [ ] **Step 1: Generate v7 boundaries for all streams** (if not already done) 176 - 177 - - [ ] **Step 2: Re-transcribe each stream** 178 - 179 - ```bash 180 - cd apps/ionosphere-appview 181 - source .env && export OPENAI_API_KEY 182 - 183 - for slug in atscience great-hall-day-1 great-hall-day-2 room-2301-day-1 room-2301-day-2 performance-theatre-day-1 performance-theatre-day-2; do 184 - dir=$(echo "$slug" | sed 's/great-hall-day-1/Great_Hall___Day_1/' | sed 's/great-hall-day-2/Great_Hall___Day_2/' | sed 's/room-2301-day-1/Room_2301___Day_1/' | sed 's/room-2301-day-2/Room_2301___Day_2/' | sed 's/performance-theatre-day-1/Performance_Theater___Day_1/' | sed 's/performance-theatre-day-2/Performance_Theater___Day_2/' | sed 's/atscience/ATScience/') 185 - echo "=== $slug ===" 186 - npx tsx src/retranscribe-hallucinations.ts \ 187 - --stream-slug "$slug" \ 188 - --boundaries "data/fullday/$dir/transcript-enriched-boundaries-v7.json" \ 189 - --diarization "data/fullday/$dir/diarization.json" 190 - echo 191 - done 192 - ``` 193 - 194 - - [ ] **Step 3: Re-run v7 detection on updated transcripts** 195 - 196 - Run v7 again with the patched transcripts to see if match accuracy improves: 197 - 198 - ```bash 199 - for slug in atscience great-hall-day-1 great-hall-day-2 room-2301-day-1 room-2301-day-2 performance-theatre-day-1 performance-theatre-day-2; do 200 - dir=$(...) 
201 - echo "=== $slug ===" 202 - npx tsx src/detect-boundaries-v7.ts \ 203 - "data/fullday/$dir/transcript-enriched.json" \ 204 - --diarization "data/fullday/$dir/diarization.json" \ 205 - --stream-slug "$slug" 2>&1 | grep -E "^(Results|Unmatched sch)" 206 - done 207 - ``` 208 - 209 - Expected: match accuracy improves from 90% toward 95%+ as previously-unverifiable talks become matchable. 210 - 211 - - [ ] **Step 4: Commit updated transcripts** 212 - 213 - ```bash 214 - git add -A 215 - git commit -m "data: re-transcribed hallucination zones across all 7 streams" 216 - ```
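The chunking and splicing rules in the plan above can be sketched as two pure functions. This is a minimal sketch under stated assumptions, not the tool's actual code: the `Span`, `Zone`, and `Word` shapes are hypothetical (the real `HallucinationZone` and `DiarizationInput` types live in `src/v7/types.ts` and may differ), and the greedy split is only one reading of "split at diarization gaps > 5s".

```typescript
// Hypothetical shapes for illustration; the real types may differ.
interface Span { start: number; end: number } // seconds from stream start
interface Zone { startS: number; endS: number }
interface Word { word: string; start: number; end: number }

const MAX_CHUNK_S = 20 * 60; // the plan's 20 min limit per chunk
const MIN_GAP_S = 5;         // only cut at silences longer than this

/** Diarization-aligned chunks for one zone; empty when the zone has no speech. */
function chunksForZone(zone: Zone, segments: Span[]): Span[] {
  // Keep segments overlapping the zone, clipped to the zone, ordered by onset.
  const speech = segments
    .filter((s) => s.end > zone.startS && s.start < zone.endS)
    .map((s) => ({
      start: Math.max(s.start, zone.startS),
      end: Math.min(s.end, zone.endS),
    }))
    .sort((a, b) => a.start - b.start);

  const [first, ...rest] = speech;
  if (first === undefined) return [];

  const chunks: Span[] = [];
  let cur: Span = { ...first };
  for (const seg of rest) {
    const gap = seg.start - cur.end;
    const wouldExceed = seg.end - cur.start > MAX_CHUNK_S;
    if (wouldExceed && gap > MIN_GAP_S) {
      // Cut at this silence and start a new chunk.
      chunks.push(cur);
      cur = { ...seg };
    } else {
      // No usable gap (or still short enough): extend the current chunk.
      // A chunk can therefore exceed the limit when speech has no long gaps.
      cur.end = Math.max(cur.end, seg.end);
    }
  }
  chunks.push(cur);
  return chunks;
}

/** Drop words whose start falls inside the zone, add the fresh ones, re-sort. */
function spliceZone(words: Word[], zone: Zone, fresh: Word[]): Word[] {
  const kept = words.filter(
    (w) => !(w.start >= zone.startS && w.start <= zone.endS),
  );
  return [...kept, ...fresh].sort((a, b) => a.start - b.start);
}
```

Keeping both steps free of ffmpeg and API calls means they can be unit-tested before any audio is extracted or any transcript file is overwritten.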
-203
docs/superpowers/specs/2026-03-30-ionosphere-design.md
··· 1 - # Ionosphere: Semantically Enriched Conference Video Archive 2 - 3 - ## Overview 4 - 5 - Ionosphere is an AT Protocol-native video archive that transforms conference talk recordings into a browsable, semantically enriched knowledge base. Conference talks become richly annotated documents with synchronized transcripts, concept cross-references, speaker profiles, and a navigable knowledge graph. MTV meets the British Library and Wikipedia. 6 - 7 - The first corpus is ATmosphereConf 2026 (126 VOD records from Streamplace, ~100 schedule events). 8 - 9 - **Domain:** ionosphere.tv 10 - **NSID namespace:** `tv.ionosphere.*` 11 - 12 - ## Architecture 13 - 14 - Follows the pannacotta pattern: diverse source lexicons -> lenses -> internal lexicons -> appview -> render. Uses RelationalText for the document model and `pub.layers.annotation` for semantic enrichment layers. panproto for schema versioning. 15 - 16 - ### Source Data 17 - 18 - - **`place.stream.video`** records from `did:plc:rbvrr34edl5ddpuwcubjiost` (stream.place), served from PDS at `iameli.com`. 126 VOD records, 1080p HLS CMAF playback via `vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist?uri=<at-uri>`. 19 - - **`community.lexicon.calendar.event`** records from `did:plc:3xewinw4wtimo2lqfy5fm5sw` (atmosphereconf.org). ~100 schedule events with title, description, room, type, category, speakers (handle + name), start/end times. 20 - 21 - ### Domain Lexicons 22 - 23 - - **`tv.ionosphere.talk`** — Title, speaker refs, track/room, start/end times, video ref (AT URI to `place.stream.video`), transcript as RelationalText document with word-level temporal facets, source schedule ref. 24 - - **`tv.ionosphere.speaker`** — Name, AT Protocol handle/DID, bio, affiliations, talk refs. 25 - - **`tv.ionosphere.concept`** — Knowledge entity: name, aliases, description, Wikidata ref. Created by LLM enrichment, curated by humans. 
26 - - **`tv.ionosphere.event`** — Conference-level metadata: name, dates, location, tracks/rooms, schedule ref. Supports multiple conferences. 27 - 28 - ### Annotation Facet Types 29 - 30 - Defined in the format-lexicon (`formats/tv.ionosphere/ionosphere.lexicon.json`), not as AT Protocol record lexicons. These are RelationalText feature types used within documents. 31 - 32 - - **`speaker-segment`** — featureClass: block. Marks a speaker turn within the transcript. 33 - - `speakerUri`: AT URI to `tv.ionosphere.speaker` record 34 - - `startTime`: number (nanoseconds from video start) 35 - - `endTime`: number (nanoseconds from video start) 36 - 37 - - **`concept-ref`** — featureClass: inline. Links a text span to a concept. 38 - - `conceptUri`: AT URI to `tv.ionosphere.concept` record 39 - 40 - - **`speaker-ref`** — featureClass: inline. Links a mention of a person to a speaker record. 41 - - `speakerUri`: AT URI to `tv.ionosphere.speaker` record 42 - 43 - - **`talk-xref`** — featureClass: inline. Cross-reference to another talk. 44 - - `talkUri`: AT URI to `tv.ionosphere.talk` record 45 - 46 - - **`link`** — featureClass: inline. External URL reference. 47 - - `url`: string (URI) 48 - - `title`: string (optional) 49 - 50 - - **`timestamp`** — featureClass: meta. Word-level timing for video sync. 51 - - `startTime`: number (nanoseconds from video start) 52 - - `endTime`: number (nanoseconds from video start) 53 - 54 - ## Data Layer: Progressive Enhancement 55 - 56 - Each stage enriches but does not gate. A talk with no transcript still renders with schedule metadata and video. A transcript without LLM enrichment still plays with timestamps. Concepts get richer as more talks reference them. Records arrive independently and the view gets progressively richer. 57 - 58 - ### Stage 1: Ingest & Correlate 59 - 60 - Pull `place.stream.video` and `community.lexicon.calendar.event` records. Correlate by fuzzy title matching + time overlap. 
Filter noise (lunch breaks, test streams, duplicates). Produce one `tv.ionosphere.talk` per real talk, linking to both source records. Manual overrides for tricky matches. 61 - 62 - ### Stage 2: Transcribe 63 - 64 - Download audio from each VOD via HLS endpoint. Run through transcription service with word-level timestamps. Output: plain text + word timing array. Transcription provider evaluated at implementation time (Whisper, Deepgram, AssemblyAI — compare cost/quality on samples first). 65 - 66 - ### Stage 3: Document Assembly 67 - 68 - Convert transcript + timestamps into a RelationalText document. Facets include: 69 - - `timestamp` facets on every word/word-group (temporal anchoring) 70 - - `speaker-segment` facets marking speaker turns 71 - 72 - Speaker diarization (identifying who speaks when within a single audio stream) is deferred for v1. Single-speaker talks get one `speaker-segment` for the entire transcript. Multi-speaker sessions (panels, Q&A) default to one segment per scheduled speaker, with manual refinement as a curation task. Diarization can be added as a later enrichment stage without changing the document model. 73 - 74 - ### Stage 4: LLM Enrichment 75 - 76 - Pass transcript through an LLM to identify and annotate: 77 - - Concept mentions -> create/link `tv.ionosphere.concept` records 78 - - Speaker/person mentions -> link to `tv.ionosphere.speaker` records 79 - - Cross-references to other talks -> `talk-xref` facets 80 - - External links/references mentioned verbally -> `link` facets 81 - 82 - Annotations stored as `pub.layers.annotation` layers with source metadata. 
83 - 84 - ### Stage 5: Appview Indexing 85 - 86 - SQLite materialization from annotation layers: 87 - - `talks` — title, speaker, times, document JSON, video ref 88 - - `talk_concepts` — join table 89 - - `talk_speakers` — join table 90 - - `talk_crossrefs` — join table 91 - - `concepts` — vocabulary table 92 - - `speakers` — vocabulary table 93 - 94 - ### Stage 6: Render 95 - 96 - Next.js SSG. Video streams from Streamplace at runtime. Transcript sync is client-side JS. Data baked at build time from appview SQLite. 97 - 98 - ### Appview Role 99 - 100 - The appview serves two purposes: (1) a build-time data layer that the Next.js SSG reads from to generate static pages, and (2) a development server for iterating on data and testing. The shipped site is fully static — the appview does not need to run in production. It can be promoted to a live service later if needed (e.g., for search, real-time updates, or API consumers). 101 - 102 - ## Frontend & Playback 103 - 104 - ### Global Timestamp State 105 - 106 - The video player broadcasts current playback time. Every annotation has temporal bounds from word-level timestamps. Components subscribe to global time and activate/deactivate themselves autonomously. No central controller — each annotation is a reactive entity, same pattern as pannacotta's ingredient quantity scaling. 107 - 108 - ``` 109 - VideoPlayer -> currentTime (global state) 110 - | 111 - TranscriptView (Pretext-rendered) 112 - +-- word spans: scroll-to + highlight on match 113 - +-- concept-ref chips: glow/activate on match 114 - +-- speaker-segment blocks: current speaker indicator 115 - +-- talk-xref chips: activate when mentioned 116 - ``` 117 - 118 - Click a word in the transcript -> seek video. Click a concept -> navigate to concept page. Bidirectional. 
119 - 120 - ### Page Types 121 - 122 - - **Talk page** — video + synced transcript + sidebar (metadata, speakers, concepts) 123 - - **Speaker page** — bio, all talks, concept co-occurrences 124 - - **Concept page** — description, Wikidata link, all talks mentioning it, timeline of mentions 125 - - **Browse/index** — by day, track, category, concept, speaker. The research librarian view. 126 - - **Home** — conference overview, featured/recent talks 127 - 128 - ### SSG 129 - 130 - All pages statically generated. Video streams from Streamplace at runtime. Transcript sync is client-side JS. Data baked from appview SQLite at build time. 131 - 132 - ## Project Structure 133 - 134 - ``` 135 - ionosphere/ 136 - +-- package.json # pnpm workspace root 137 - +-- pnpm-workspace.yaml 138 - +-- tsconfig.json 139 - +-- lexicons/ 140 - | +-- tv/ionosphere/ 141 - | +-- talk.json 142 - | +-- speaker.json 143 - | +-- concept.json 144 - | +-- event.json 145 - +-- formats/ 146 - | +-- tv.ionosphere/ 147 - | +-- ionosphere.lexicon.json # facet type definitions 148 - | +-- lenses/ 149 - | | +-- schedule-to-talk.lens.json # calendar.event -> tv.ionosphere.talk 150 - | | +-- transcript-to-document.lens.json # raw transcript -> RelationalText doc 151 - | +-- ts/ # annotation, enrichment utilities 152 - +-- apps/ 153 - | +-- ionosphere/ # Next.js SSG frontend 154 - | | +-- src/ 155 - | | +-- app/ 156 - | | | +-- talks/ 157 - | | | +-- speakers/ 158 - | | | +-- concepts/ 159 - | | | +-- components/ 160 - | | | +-- VideoPlayer.tsx 161 - | | | +-- TranscriptView.tsx 162 - | | | +-- AnnotationChips.tsx 163 - | | | +-- TimestampProvider.tsx 164 - | | +-- lib/ 165 - | +-- ionosphere-appview/ # Hono server, SQLite, indexer 166 - | +-- src/ 167 - | +-- appview.ts 168 - | +-- db.ts 169 - | +-- ingest.ts 170 - | +-- correlate.ts 171 - | +-- transcribe.ts 172 - | +-- enrich.ts 173 - | +-- indexer.ts 174 - | +-- routes.ts 175 - +-- scripts/ 176 - | +-- ingest.ts # CLI: pull source data 177 - | +-- 
transcribe.ts # CLI: run transcription 178 - | +-- enrich.ts # CLI: run LLM enrichment 179 - +-- data/ # cached source data, transcripts 180 - ``` 181 - 182 - ### Dependencies 183 - 184 - - `relational-text` — document model, facets, annotation layers 185 - - `@atproto/api` — AT Protocol client 186 - - `hono` — HTTP server for appview 187 - - `better-sqlite3` — appview storage 188 - - `next` — SSG frontend 189 - - Pretext — transcript layout (integration TBD based on available API) 190 - - panproto — schema versioning 191 - 192 - ## Transcription 193 - 194 - Provider to be evaluated at implementation time. Candidates: 195 - - **Whisper (local)** — free, good quality, word-level timestamps via whisper.cpp or faster-whisper. Requires GPU for reasonable speed. 196 - - **Deepgram** — fast, good word-level timestamps, ~$0.0043/min (Nova-2). ~$0.50 for the full corpus. 197 - - **AssemblyAI** — good diarization, word-level timestamps, ~$0.01/min. ~$1.20 for the full corpus. 198 - 199 - Decision deferred — will compare on a few sample talks first. 200 - 201 - ## Enrichment 202 - 203 - LLM-assisted annotation generates first-pass semantic layers. Humans refine via a curation interface (design TBD, not in initial scope). Each annotation layer carries source metadata (algorithm, model, version, timestamp) so provenance is transparent.
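Stage 1's correlation step ("fuzzy title matching + time overlap") can be sketched as a small scoring function. This is only a sketch under assumptions: the `Timed` shape, the Jaccard token similarity, the equal weighting, and the 0.6 threshold are all hypothetical choices, not something this spec prescribes.

```typescript
// Hypothetical minimal shape; real schedule and VOD records carry more fields.
interface Timed { title: string; startMs: number; endMs: number }

/** Jaccard similarity over lowercase alphanumeric title tokens (0..1). */
function titleSimilarity(a: string, b: string): number {
  const tokens = (s: string): Set<string> =>
    new Set(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  const shared = Array.from(ta).filter((t) => tb.has(t)).length;
  return shared / (ta.size + tb.size - shared);
}

/** Fraction of the scheduled slot that the VOD covers (0..1). */
function timeOverlap(event: Timed, vod: Timed): number {
  const overlap =
    Math.min(event.endMs, vod.endMs) - Math.max(event.startMs, vod.startMs);
  const length = event.endMs - event.startMs;
  return length > 0 ? Math.max(0, overlap) / length : 0;
}

/** Best-scoring VOD for an event, or null when nothing beats the threshold. */
function correlate(event: Timed, vods: Timed[], threshold = 0.6): Timed | null {
  let best: Timed | null = null;
  let bestScore = threshold;
  for (const vod of vods) {
    // Equal weights are an assumption; time overlap alone (max 0.5) never matches.
    const score =
      0.5 * titleSimilarity(event.title, vod.title) +
      0.5 * timeOverlap(event, vod);
    if (score > bestScore) {
      best = vod;
      bestScore = score;
    }
  }
  return best;
}
```

With these weights a lunch-break stream that merely shares a time slot with a talk scores 0.5 and is rejected, which is the kind of noise the stage is meant to filter; manual overrides would still handle the tricky cases.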
-228
docs/superpowers/specs/2026-03-31-lens-layer-design.md
··· 1 - # Lens Layer: Panproto-Powered Schema Boundaries for Ionosphere 2 - 3 - ## Overview 4 - 5 - The lens layer makes ionosphere forwards-compatible with both source schema changes (new conference platforms, different calendar lexicons, alternative transcription providers) and output schema evolution (versioning `tv.ionosphere.*` lexicons). It uses panproto as the runtime, which provides algebraically correct bidirectional lenses with native AT Protocol support. 6 - 7 - ## Architecture 8 - 9 - Lenses sit at every boundary where an external schema meets an ionosphere schema. Internal transforms (compact encoding, annotation overlay) stay as TypeScript but are shaped for future lens graduation. 10 - 11 - ``` 12 - Source lexicons (calendar, VOD, Whisper, ...) 13 - ↓ panproto lens (auto-generated from lexicon pairs) 14 - Domain lexicons (tv.ionosphere.talk, transcript, ...) 15 - ↓ version migration lens (when lexicons evolve) 16 - Domain lexicons vN 17 - ↓ internal TypeScript (decodeToDocument, overlay) 18 - Rendered output 19 - ``` 20 - 21 - Lenses are AT Protocol records, discoverable via standard XRPC, indexed by the appview. 22 - 23 - ## Runtime: @panproto/core 24 - 25 - Dependency: `@panproto/core` (v0.22.0+, MIT, WASM-backed, ~860KB). 26 - 27 - ### Schema Loading 28 - 29 - `panproto.parseLexicon()` ingests any ATProto lexicon JSON and returns a `BuiltSchema`. 
We store the source lexicons we don't own alongside our own in the `lexicons/` directory: 30 - 31 - ``` 32 - lexicons/ 33 - tv/ionosphere/ # our lexicons 34 - talk.json 35 - speaker.json 36 - concept.json 37 - event.json 38 - transcript.json 39 - annotation.json 40 - community/lexicon/ # source: ATmosphereConf schedule 41 - calendar/ 42 - event.json 43 - place/stream/ # source: Streamplace VOD 44 - video.json 45 - openai/whisper/ # source: OpenAI Whisper output 46 - verbose_json.json 47 - ``` 48 - 49 - ### Lens Generation 50 - 51 - For most schema boundaries, panproto auto-generates the lens: 52 - 53 - ```typescript 54 - const panproto = await Panproto.init(); 55 - const calendarSchema = panproto.parseLexicon(calendarEventLexicon); 56 - const talkSchema = panproto.parseLexicon(talkLexicon); 57 - const lens = panproto.lens(calendarSchema, talkSchema); 58 - ``` 59 - 60 - For boundaries where auto-generation needs overrides (ambiguous mappings, custom defaults), we serialize protolens chains via `chain.toJson()` and store them as AT Protocol records. 61 - 62 - ### Data Conversion 63 - 64 - `panproto.convert()` takes plain JS objects and returns plain JS objects: 65 - 66 - ```typescript 67 - const talk = await panproto.convert(scheduleRecord, { 68 - from: calendarSchema, 69 - to: talkSchema, 70 - defaults: { room: '', category: '' }, 71 - }); 72 - ``` 73 - 74 - No msgpack serialization on our side. 75 - 76 - ### Version Migration 77 - 78 - When `tv.ionosphere.talk` evolves from v1 to v2: 79 - 80 - ```typescript 81 - const talkV1 = panproto.parseLexicon(talkV1Lexicon); 82 - const talkV2 = panproto.parseLexicon(talkV2Lexicon); 83 - const migrated = await panproto.convert(oldRecord, { from: talkV1, to: talkV2 }); 84 - ``` 85 - 86 - Panproto auto-generates the migration lens from the lexicon diff. The complement preserves any data removed between versions for round-tripping. 
87 - 88 - ## Lenses as AT Protocol Records 89 - 90 - Serialized protolens chains are stored as records in the `org.relationaltext.lens` collection on the PDS. This makes lenses discoverable via standard XRPC mechanisms, same as any other AT Protocol record. 91 - 92 - ### Publishing 93 - 94 - `publish.ts` publishes lens records before all other records. Each lens record contains the serialized protolens chain JSON, source/target identifiers, and version metadata. 95 - 96 - ### Resolution 97 - 98 - Pipeline scripts resolve lenses through: 99 - 100 - 1. **Appview index** — materialized from backfill/Jetstream (fast, local SQLite lookup) 101 - 2. **PDS fetch** — direct XRPC `listRecords` on `org.relationaltext.lens` from our PDS (always available after publish) 102 - 3. **Error** — no lens found 103 - 104 - No disk file fallback. The PDS is the single source of truth. 105 - 106 - ### Indexing 107 - 108 - The appview indexes `org.relationaltext.lens` as a new collection. The indexer table stores: `uri, did, rkey, source_nsid, target_nsid, version, chain_json`. 109 - 110 - ## Pipeline Integration 111 - 112 - ### publish.ts 113 - 114 - Gains step 0: publish lens records from lexicon pairs to the PDS. For auto-generated lenses, this means loading both lexicons, generating the chain, serializing it, and writing the record. Idempotent via `putRecord`. 115 - 116 - ### ingest.ts 117 - 118 - Already uses a lens for schedule-to-talk. Changes: 119 - - Resolve lens from appview index / PDS instead of `loadLens("filename")` 120 - - Wire up the VOD-to-talk lens (currently bypassed in `parseVodRecord`) 121 - - Provenance: record which lens produced each talk record 122 - 123 - ### transcribe.ts / providers 124 - 125 - The transcription provider returns raw output in its native format. A lens transforms it to `tv.ionosphere.transcript` format. Swapping to Deepgram or AssemblyAI means: 126 - 1. Add the new provider's output lexicon to `lexicons/` 127 - 2. 
Publish the new lens (auto-generated from lexicon pair) 128 - 3. The pipeline resolves the right lens by source type 129 - 130 - ### enrich.ts 131 - 132 - No change. LLM enrichment is bespoke extraction logic, not a schema boundary. 133 - 134 - ### indexer.ts 135 - 136 - Gains `org.relationaltext.lens` as a new indexed collection, processed by `processEvent`. 137 - 138 - ### routes.ts 139 - 140 - No change. `decodeToDocument` and `overlayAnnotations` are internal lens-shaped transforms. They stay as TypeScript, candidates for graduation when the pattern proves out (following pannacotta's lead on internal lens usage). 141 - 142 - ## Format Package Changes 143 - 144 - `@ionosphere/format` changes: 145 - 146 - ### lenses.ts (rewrite) 147 - 148 - Becomes a thin panproto wrapper (~40 lines): 149 - 150 - ```typescript 151 - import { Panproto, LensHandle, BuiltSchema } from '@panproto/core'; 152 - 153 - let _panproto: Panproto | null = null; 154 - 155 - export async function init(): Promise<Panproto> { 156 - if (!_panproto) _panproto = await Panproto.init(); 157 - return _panproto; 158 - } 159 - 160 - export async function loadSchema(lexiconJson: object | string): Promise<BuiltSchema> { 161 - const pp = await init(); 162 - return pp.parseLexicon(lexiconJson); 163 - } 164 - 165 - export async function createLens(from: BuiltSchema, to: BuiltSchema): Promise<LensHandle> { 166 - const pp = await init(); 167 - return pp.lens(from, to); 168 - } 169 - 170 - export async function convert( 171 - data: object, 172 - from: BuiltSchema, 173 - to: BuiltSchema, 174 - defaults?: Record<string, unknown>, 175 - ): Promise<unknown> { 176 - const pp = await init(); 177 - return pp.convert(data, { from, to, defaults }); 178 - } 179 - ``` 180 - 181 - Exports `init`, `loadSchema`, `createLens`, `convert`. Pipeline scripts consume these. 
182 - 183 - ### Deleted 184 - 185 - - `LensSpec`, `LensRule` interfaces 186 - - `applyLens` function 187 - - `loadLens` function 188 - - `getNestedValue` helper 189 - 190 - ### Kept 191 - 192 - - `lenses/` directory on disk becomes an authoring workspace only (edit JSON there, publish pushes to PDS) 193 - - Existing lens JSON files remain as reference but are no longer loaded at runtime 194 - 195 - ## What Gets Deleted 196 - 197 - - `formats/tv.ionosphere/ts/lenses.ts` — current implementation (replaced by panproto wrapper) 198 - - `formats/tv.ionosphere/ts/lenses.test.ts` — current tests (replaced by panproto law checks + integration tests) 199 - 200 - ## Testing 201 - 202 - ### Lens Law Verification 203 - 204 - Panproto provides `lens.checkLaws(instance)` which verifies both GetPut and PutGet laws. Run this on real source records from each provider. 205 - 206 - ### Integration Tests 207 - 208 - - Real `community.lexicon.calendar.event` record from ATmosphereConf PDS → lens → verify output matches expected `tv.ionosphere.talk` shape 209 - - Real `place.stream.video` record from Streamplace PDS → lens → verify video metadata maps correctly 210 - - Whisper verbose_json output → lens → verify transcript format 211 - 212 - ### Round-Trip Tests 213 - 214 - - Lens publish → PDS → appview backfill → resolve from index → apply → verify output 215 - - PDS direct fetch fallback when appview index is empty 216 - 217 - ### Regression 218 - 219 - - If a source lexicon changes upstream, auto-generation produces a different lens. Tests catch schema drift before it hits production. 220 - 221 - ## Internal Lens Graduation Path 222 - 223 - `decodeToDocument` and `overlayAnnotations` are lens-shaped but stay as TypeScript. 
The graduation criteria: 224 - - The combinator vocabulary can express the transform (array expansion with byte-range computation is the gap for `decodeToDocument`) 225 - - Pannacotta demonstrates the pattern working for similar internal boundaries 226 - - The benefit of formal lens laws outweighs the complexity of expressing the transform declaratively 227 - 228 - This is not a near-term goal. Note it and revisit when panproto's combinator vocabulary grows.
-109
docs/superpowers/specs/2026-03-31-librarian-index-design.md
··· 1 - # Librarian-Grade Word Index 2 - 3 - ## Overview 4 - 5 - Transform the raw concordance into a professional back-of-book index with lemmatization, multi-word terms, subentries, cross-references, proper noun handling, and inverted entries. Chicago Manual of Style chapter 16 as the reference standard. 6 - 7 - ## Data Pipeline 8 - 9 - Four preprocessing stages run in the concordance builder before word→talk aggregation. 10 - 11 - ### Stage 1: Multi-word term detection 12 - 13 - - **Concept-sourced terms:** Mark transcript spans matching concept names and aliases (760+ terms from LLM enrichment). These become single index entries. 14 - - **Statistical bigrams:** Extract word pairs with PMI (pointwise mutual information) above threshold, appearing in 2+ talks. Filter pairs where either word is a stopword. These catch terms the LLM missed. 15 - - Multi-word terms are single index entries. Constituent words also appear standalone with "see also" pointing to the multi-word term. 16 - 17 - ### Stage 2: Lemmatization 18 - 19 - Using `compromise` NLP library: 20 - 21 - - Collapse plurals (protocols→protocol), verb conjugations (building→build), -ing/-ed/-tion forms 22 - - Normalize British/American spelling (normalise→normalize, colour→color) 23 - - Merge abbreviations with expansions using concept aliases (API filed under "application programming interface" with "see" from API) 24 - - Index entry shows the lemma. All inflected forms' occurrences merge under it. 25 - - `compromise` provides POS tagging — prefer noun forms as the lemma. 26 - 27 - ### Stage 3: Concept enrichment 28 - 29 - - For each lemmatized word, look up co-occurring concepts (via concept annotations on the same transcript) 30 - - **Subentries** when a word appears in 3+ talks: group references by the co-occurring concept. 
"protocol" → subentry "AT Protocol (3)", subentry "governance (2)" 31 - - **See also** from concept co-occurrence: words that frequently share concepts get cross-referenced 32 - - **See** for synonyms: concept aliases generate redirects ("decentralised → see decentralized") 33 - 34 - ### Stage 4: Proper noun detection 35 - 36 - - Words capitalized in >80% of transcript occurrences → proper noun 37 - - Rendered with original capitalization in the index 38 - - Multi-word proper nouns get inverted entries: "AT Protocol" also appears as "Protocol, AT → see AT Protocol" 39 - 40 - ## API Response Shape 41 - 42 - ```typescript 43 - interface IndexEntry { 44 - term: string; // lemmatized display form 45 - proper: boolean; // render with original capitalization 46 - talks: TalkRef[]; // direct references (not covered by subentries) 47 - subentries: Subentry[]; // grouped by concept context 48 - see: string[]; // redirects to canonical form 49 - seeAlso: string[]; // cross-references to related terms 50 - totalCount: number; 51 - } 52 - 53 - interface Subentry { 54 - label: string; // concept name or context label 55 - talks: TalkRef[]; 56 - } 57 - 58 - interface TalkRef { 59 - rkey: string; 60 - title: string; 61 - count: number; 62 - firstTimestampNs: number; 63 - } 64 - ``` 65 - 66 - ## Rendering 67 - 68 - Entry with subentries: 69 - ``` 70 - protocol — Opening Remarks (1) 71 - — AT Protocol, Building with AT Protocol (3), Protocol Governance (2) 72 - — design, Decentralized Identity (2) 73 - — governance, Protocol Governance (5), Keynote (1) 74 - see also: decentralization, federation, lexicon 75 - ``` 76 - 77 - Simple entry: 78 - ``` 79 - zurich — Research Synthesis (1) 80 - ``` 81 - 82 - See redirect: 83 - ``` 84 - decentralised — see decentralized 85 - ``` 86 - 87 - Proper noun with inversion: 88 - ``` 89 - AT Protocol — Building with AT Protocol (3), Protocol Governance (2) 90 - Protocol, AT — see AT Protocol 91 - ``` 92 - 93 - ## Dependencies 94 - 95 - - 
`compromise` — NLP: lemmatization, POS tagging, proper noun detection. Zero dependencies, runs in Node. 96 - - Statistical bigram extraction — pure math on word co-occurrence, no dependency. 97 - - Existing: concept data from appview SQLite (annotations + concepts tables). 98 - 99 - ## Files 100 - 101 - ### Modified 102 - - `apps/ionosphere-appview/src/concordance.ts` — add preprocessing pipeline stages 103 - - `apps/ionosphere-appview/src/routes.ts` — `/index` endpoint returns enriched entries 104 - - `apps/ionosphere/src/app/concordance/IndexContent.tsx` — render subentries, see/see also, proper nouns 105 - 106 - ### New 107 - - `apps/ionosphere-appview/src/lemmatize.ts` — compromise wrapper for lemmatization + POS 108 - - `apps/ionosphere-appview/src/bigrams.ts` — PMI-based bigram extraction 109 - - `apps/ionosphere-appview/src/index-enrichment.ts` — concept-based subentries, cross-refs, see/see also
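The statistical bigram step from Stage 1 can be sketched as below. It uses the standard definition PMI(x, y) = log2(P(x, y) / (P(x) · P(y))) over adjacent word pairs. This is a sketch under assumptions: the function name, the tokenized input shape, and the default threshold of 3 are hypothetical, since the spec does not give a threshold value.

```typescript
interface Bigram { pair: string; pmi: number; talks: number }
interface PairStat { a: string; b: string; count: number; talks: Set<number> }

/**
 * PMI-scored bigrams across talks.
 * `talks` holds one lowercased token array per talk.
 */
function pmiBigrams(
  talks: string[][],
  stopwords: Set<string>,
  minPmi = 3,   // assumed default; tune on real data
  minTalks = 2, // spec: pair must appear in 2+ talks
): Bigram[] {
  const unigram = new Map<string, number>();
  const pairs = new Map<string, PairStat>();
  let totalWords = 0;
  let totalPairs = 0;

  talks.forEach((tokens, talkIdx) => {
    let prev: string | null = null;
    for (const tok of tokens) {
      unigram.set(tok, (unigram.get(tok) ?? 0) + 1);
      totalWords++;
      if (prev !== null) {
        totalPairs++; // every adjacent pair counts toward the denominator
        // Spec: drop pairs where either word is a stopword.
        if (!stopwords.has(prev) && !stopwords.has(tok)) {
          const key = `${prev} ${tok}`;
          const stat: PairStat =
            pairs.get(key) ?? { a: prev, b: tok, count: 0, talks: new Set<number>() };
          stat.count++;
          stat.talks.add(talkIdx);
          pairs.set(key, stat);
        }
      }
      prev = tok;
    }
  });

  const out: Bigram[] = [];
  pairs.forEach((s, key) => {
    if (s.talks.size < minTalks) return;
    const pJoint = s.count / totalPairs;
    const pA = (unigram.get(s.a) ?? 0) / totalWords;
    const pB = (unigram.get(s.b) ?? 0) / totalWords;
    const pmi = Math.log2(pJoint / (pA * pB));
    if (pmi >= minPmi) out.push({ pair: key, pmi, talks: s.talks.size });
  });
  return out.sort((x, y) => y.pmi - x.pmi); // strongest associations first
}
```

On a small corpus PMI values are low and noisy, so the threshold probably needs tuning against the concept-sourced terms before it is trusted to catch what the LLM missed.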
-66
docs/superpowers/specs/2026-03-31-tests-ci-design.md
··· 1 - # Tests & CI/CD Design 2 - 3 - ## Overview 4 - 5 - Add frontend unit tests for logic-heavy code and a GitHub Actions CI pipeline that runs typecheck + tests on every push and PR. 6 - 7 - ## Frontend Unit Tests 8 - 9 - Test pure functions and hooks in `apps/ionosphere` — no DOM environment, no component rendering. 10 - 11 - ### What to test 12 - 13 - **Timestamp calculations:** 14 - - TimestampProvider's time broadcast logic (currentTimeNs offset adjustment) 15 - - Seek callback adjustment (ns → seconds + offset) 16 - 17 - **Transcript position mapping:** 18 - - Compact transcript decoding → word positions (the time→Y mapping that drives brightness wave) 19 - - Boundary time calculations (shared midpoints between adjacent words) 20 - - Time-to-word-index lookup 21 - 22 - **Concept facet overlay:** 23 - - Byte range → word span matching 24 - - Overlapping facet merging 25 - - concept-ref facet extraction from mixed facet arrays 26 - 27 - ### What NOT to test 28 - 29 - - React component rendering (TranscriptView, VideoPlayer) — these are visual and better caught by design review 30 - - HLS video playback — depends on browser APIs 31 - - API integration — depends on running appview 32 - 33 - ## CI/CD Pipeline 34 - 35 - ### GitHub Actions workflow 36 - 37 - **File:** `.github/workflows/ci.yml` 38 - 39 - **Triggers:** push to `main`, pull requests to `main` 40 - 41 - **Steps:** 42 - 1. Checkout 43 - 2. Setup Node 24 + pnpm 44 - 3. `pnpm install` 45 - 4. `pnpm -r typecheck` 46 - 5. `pnpm -r test` 47 - 48 - ### What's NOT in CI (yet) 49 - 50 - - **`next build`** — requires a running appview with data, which needs PDS + SQLite. Add when we set up a CI data fixture or build-time API mock. 51 - - **Panproto WASM tests** — skip gracefully when WASM binary isn't present. No WASM build in CI. These tests gate locally, not in CI. 52 - - **Enrichment/publish jobs** — need OpenAI API key and PDS credentials. Add as separate workflows with GitHub Actions secrets when ready. 
53 - - **Linting** — no linter configured. Add during a style pass. 54 - 55 - ### Secrets strategy 56 - 57 - Current workflow needs zero secrets — pure static analysis + unit tests with local/mocked data. 58 - 59 - Future secrets (when needed): 60 - - `OPENAI_API_KEY` — for enrichment CI jobs 61 - - `PDS_URL`, `BOT_HANDLE`, `BOT_PASSWORD` — for publish/integration CI jobs 62 - - Stored as GitHub Actions secrets, never in code or `.env` committed to repo 63 - 64 - ### `.env` handling 65 - 66 - `.env` stays in `.gitignore` (already is). CI uses no env vars for the current workflow. Future workflows use GitHub Actions secrets → env vars.
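As a hedged illustration of the "Transcript position mapping" and "Timestamp calculations" items listed in this spec, here is a minimal TypeScript sketch of shared-midpoint boundaries, a time-to-word-index lookup, and the ns to seconds seek adjustment. The `TimedWord` shape, the function names, and the sign of the offset are assumptions; the project's compact transcript format is not shown in this diff.

```typescript
// Hypothetical word shape; the real compact transcript encoding is not shown in the spec.
interface TimedWord {
  text: string;
  startNs: number;
  endNs: number;
}

// Boundary i is the midpoint between the end of word i and the start of word i + 1,
// so adjacent words share one boundary and there are words.length - 1 boundaries.
function boundaryTimes(words: TimedWord[]): number[] {
  const out: number[] = [];
  for (let i = 0; i + 1 < words.length; i++) {
    out.push((words[i].endNs + words[i + 1].startNs) / 2);
  }
  return out;
}

// Binary search: returns the count of boundaries <= timeNs, which is the index
// of the word whose interval contains timeNs. Callers handle an empty transcript.
function wordIndexAt(boundaries: number[], timeNs: number): number {
  let lo = 0;
  let hi = boundaries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (timeNs < boundaries[mid]) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Seek adjustment: nanoseconds to seconds, plus a per-stream offset.
// Adding (rather than subtracting) the offset is an assumption.
function seekSeconds(timeNs: number, offsetSeconds: number): number {
  return timeNs / 1e9 + offsetSeconds;
}
```

Pure functions like these need no DOM environment, which matches the spec's stated testing scope.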
-74
docs/superpowers/specs/2026-03-31-word-index-design.md
··· 1 - # Word Index: Conference Concordance 2 - 3 - ## Overview 4 - 5 - A book-style concordance page at `/index`. Every non-stopword from every transcript, alphabetized, multi-column typeset layout. Each word links to its occurrences across talks. Clicking an occurrence loads the video and transcript in a fixed side panel, scrolled to and highlighting the target word. 6 - 7 - ## Layout 8 - 9 - - **Left ~75%:** Pretext-rendered multi-column word index, scrollable 10 - - **Right ~25%:** Fixed player column — VideoPlayer on top, TranscriptView below, persists as you browse 11 - 12 - ## Word Index (left panel) 13 - 14 - **Data source:** Raw transcript text from all talks. Split on whitespace, lowercase, filter stopwords, aggregate across talks. 15 - 16 - **Stopwords:** Standard English stopword list plus filler words (um, uh, like, you know). Hardcoded, small, refinable later. 17 - 18 - **Entry format:** 19 - ``` 20 - atproto — Building with AT Protocol (3), Protocol Governance (2), ... 21 - ``` 22 - 23 - Each talk reference is clickable. Number is occurrence count in that talk. 24 - 25 - **Letter headings:** Bold section breaks (A, B, C...) grouped with their entries. 26 - 27 - **Typesetting:** Pretext (`chenglou/pretext`) handles multi-column layout. 28 - - `prepare()` all index entries once 29 - - `layoutWithLines()` to flow into balanced columns 30 - - Proper column balancing (not CSS fill-left-then-right) 31 - - Height measurement for virtualization (concordance could be thousands of entries) 32 - - Letter headings kept grouped with first entries 33 - 34 - ## Player Column (right panel) 35 - 36 - **On click:** Clicking a talk reference in the index: 37 - 1. Loads the video in the VideoPlayer (same component, with offset support) 38 - 2. Shows the full TranscriptView below the video 39 - 3. Scrolls the transcript to the target word 40 - 4. Highlights the index term in the transcript 41 - 42 - **Reuse:** VideoPlayer and TranscriptView are existing components. 
The brightness wave, scroll-scrub, and concept highlighting all come for free. The player column is essentially a mini talk viewer. 43 - 44 - **Persistence:** The player stays fixed as you scroll the index. Clicking a different word/talk swaps the content. 45 - 46 - ## API 47 - 48 - New appview endpoint: `GET /index` 49 - 50 - Returns the concordance built from transcripts + compact timings in SQLite: 51 - ```json 52 - { 53 - "words": [ 54 - { 55 - "word": "atproto", 56 - "talks": [ 57 - { 58 - "rkey": "ats26-keynote", 59 - "title": "Keynote: Towards Modular Open Science", 60 - "count": 3, 61 - "firstTimestampNs": 1234567890 62 - } 63 - ] 64 - } 65 - ] 66 - } 67 - ``` 68 - 69 - Built at serve time: decode compact transcripts, split text, filter stopwords, aggregate by word across talks. Cacheable — transcripts don't change at runtime. 70 - 71 - ## Dependencies 72 - 73 - - `pretext` — text measurement and multi-column layout 74 - - Existing: `VideoPlayer`, `TranscriptView`, `TimestampProvider`
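The concordance build described in this spec (split on whitespace, lowercase, filter stopwords, aggregate by word across talks) could look roughly like the sketch below. The `TalkText` input shape, the tiny stopword list, and the ASCII-only punctuation trimming are assumptions for illustration; the spec does not say how punctuation is handled, and `firstTimestampNs` is omitted because it depends on the compact timing format.

```typescript
interface TalkText { rkey: string; title: string; text: string; }
interface TalkRef { rkey: string; title: string; count: number; }
interface IndexEntry { word: string; talks: TalkRef[]; }

// Deliberately small sample list. Multi-word fillers such as "you know"
// cannot be caught by a per-token filter and would need separate handling.
const STOPWORDS = new Set(["the", "a", "an", "and", "of", "to", "in", "is", "it", "um", "uh", "like"]);

function buildConcordance(talks: TalkText[]): IndexEntry[] {
  const byWord = new Map<string, Map<string, TalkRef>>();
  for (const talk of talks) {
    for (const raw of talk.text.split(/\s+/)) {
      // Assumption: trim leading/trailing ASCII punctuation so "atproto," matches "atproto".
      const word = raw.toLowerCase().replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, "");
      if (word === "" || STOPWORDS.has(word)) continue;
      let refs = byWord.get(word);
      if (!refs) {
        refs = new Map<string, TalkRef>();
        byWord.set(word, refs);
      }
      const ref = refs.get(talk.rkey);
      if (ref) ref.count += 1;
      else refs.set(talk.rkey, { rkey: talk.rkey, title: talk.title, count: 1 });
    }
  }
  // Alphabetize by plain code-unit order so the result is deterministic.
  return Array.from(byWord.entries())
    .sort((x, y) => (x[0] < y[0] ? -1 : x[0] > y[0] ? 1 : 0))
    .map(([word, refs]) => ({ word, talks: Array.from(refs.values()) }));
}
```

Because transcripts do not change at runtime, a result like this can be computed once and cached, as the spec suggests.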
-184
docs/superpowers/specs/2026-04-01-comments-oauth-design.md
··· 1 - # Comments & Reactions with AT Protocol OAuth 2 - 3 - ## Overview 4 - 5 - Users can comment on and react to conference talk transcripts. Comments are AT Protocol records published to the user's own PDS, discovered via Jetstream firehose, and indexed by the ionosphere appview. A single `tv.ionosphere.comment` lexicon handles comments, emoji reactions, and threaded replies. 6 - 7 - ## Lexicon 8 - 9 - One record type: `tv.ionosphere.comment` 10 - 11 - ```json 12 - { 13 - "lexicon": 1, 14 - "$type": "com.atproto.lexicon.schema", 15 - "id": "tv.ionosphere.comment", 16 - "revision": 1, 17 - "description": "A comment or reaction on a transcript, talk, or another comment.", 18 - "defs": { 19 - "main": { 20 - "type": "record", 21 - "key": "tid", 22 - "record": { 23 - "type": "object", 24 - "required": ["subject", "text", "createdAt"], 25 - "properties": { 26 - "subject": { 27 - "type": "string", 28 - "format": "at-uri", 29 - "description": "AT URI of the transcript, talk, or parent comment." 30 - }, 31 - "text": { 32 - "type": "string", 33 - "description": "Comment body or single emoji reaction." 34 - }, 35 - "facets": { 36 - "type": "array", 37 - "items": { "type": "ref", "ref": "app.bsky.richtext.facet" }, 38 - "description": "Rich text facets (mentions, links) in the comment." 39 - }, 40 - "anchor": { 41 - "type": "ref", 42 - "ref": "#byteRange", 43 - "description": "Optional byte range on the subject's text." 
44 - }, 45 - "createdAt": { 46 - "type": "string", 47 - "format": "datetime" 48 - } 49 - } 50 - } 51 - }, 52 - "byteRange": { 53 - "type": "object", 54 - "required": ["byteStart", "byteEnd"], 55 - "properties": { 56 - "byteStart": { "type": "integer" }, 57 - "byteEnd": { "type": "integer" } 58 - } 59 - } 60 - } 61 - } 62 - ``` 63 - 64 - **Use cases:** 65 - - Emoji on a passage: `{ subject: transcriptUri, text: "🔥", anchor: { byteStart: 100, byteEnd: 150 } }` 66 - - Comment on a passage: `{ subject: transcriptUri, text: "Great point about federation", anchor: { byteStart: 100, byteEnd: 150 } }` 67 - - Reply to a comment: `{ subject: commentUri, text: "Agreed!" }` — no anchor 68 - - Emoji on a whole talk: `{ subject: talkUri, text: "👏" }` — no anchor 69 - 70 - ## AT Protocol OAuth 71 - 72 - **Library:** `@atproto/oauth-client-browser` — handles DPoP, PAR, token refresh, IndexedDB storage. 73 - 74 - **Scope:** `atproto` (minimal — read/write to user's own repo). 75 - 76 - **Client metadata:** Published at `https://ionosphere.tv/client-metadata.json` (localhost variant for dev). Defines app name, redirect URI, scope. 77 - 78 - **Token storage:** Browser IndexedDB only. The appview is stateless — never sees tokens. 79 - 80 - **Flow:** 81 - 1. User clicks "Sign in" 82 - 2. OAuth redirect to user's PDS authorization endpoint 83 - 3. User authorizes ionosphere with `atproto` scope 84 - 4. Redirect back with authorization code 85 - 5. Browser exchanges code for tokens (DPoP-bound) 86 - 6. `@atproto/api` Agent created with authenticated session 87 - 7. Agent writes `tv.ionosphere.comment` directly to user's PDS 88 - 8. Jetstream picks it up → appview indexes → visible to everyone 89 - 90 - **Writing comments:** The frontend uses `agent.com.atproto.repo.createRecord` to write comments to the user's PDS. The user's PDS must accept the `tv.ionosphere.comment` collection — this works on any standard AT Protocol PDS.
91 - 92 - ## Comment Indexing 93 - 94 - **Jetstream subscription:** The appview subscribes to a public Jetstream instance with `wantedCollections=tv.ionosphere.comment`. This delivers every `tv.ionosphere.comment` from any user on the network. 95 - 96 - Separate from the existing local PDS Jetstream subscription. The appview runs two Jetstream connections: one for local PDS (ionosphere data), one for public network (user comments). 97 - 98 - **Database:** 99 - 100 - ```sql 101 - CREATE TABLE comments ( 102 - uri TEXT PRIMARY KEY, 103 - author_did TEXT NOT NULL, 104 - rkey TEXT NOT NULL, 105 - subject_uri TEXT NOT NULL, 106 - text TEXT NOT NULL, 107 - facets TEXT, 108 - byte_start INTEGER, 109 - byte_end INTEGER, 110 - created_at TEXT NOT NULL, 111 - indexed_at TEXT DEFAULT CURRENT_TIMESTAMP 112 - ); 113 - 114 - CREATE INDEX idx_comments_subject ON comments(subject_uri); 115 - CREATE INDEX idx_comments_author ON comments(author_did); 116 - ``` 117 - 118 - **API endpoints:** 119 - - `GET /talks/:rkey/comments` — all comments on a talk's transcript (anchored + unanchored) 120 - - `GET /comments?subject=<at-uri>` — comments on any subject URI 121 - - Replies: query where `subject_uri` matches a comment URI 122 - 123 - **Author resolution:** Lazy-fetch DID → handle/display name from the network. Cache in a `profiles` table. 
124 - 125 - ## Comment UI 126 - 127 - **Inline transcript annotations:** 128 - - Comments anchored to byte ranges render as highlights on the transcript word spans 129 - - Visual treatment: subtle background tint (different from concept amber glow) 130 - - Small emoji clusters displayed near highlighted spans 131 - - Click a highlighted span → opens comment thread in sidebar 132 - 133 - **Comment composition:** 134 - - Quick reaction: select text → emoji palette → click → published 135 - - Full comment: select text → comment input → type → submit 136 - - Both require OAuth sign-in 137 - 138 - **Signed-out experience:** 139 - - All comments/reactions visible (read from appview) 140 - - "Sign in" prompt when attempting to react or comment 141 - 142 - **Threading:** 143 - - Reply to a comment: creates a new comment with `subject` pointing to parent comment URI 144 - - Displayed as indented thread under parent 145 - 146 - **Aggregation:** 147 - - Talk listing shows comment count badges 148 - - Transcript highlights show reaction counts per span 149 - 150 - ## Architecture 151 - 152 - ``` 153 - User (browser) 154 - ↓ OAuth sign-in → user's PDS authorization 155 - ↓ Write tv.ionosphere.comment → user's PDS 156 - 157 - Public Jetstream (filtered: tv.ionosphere.comment) 158 - 159 - Ionosphere Appview → index into SQLite → serve via API 160 - 161 - All users see comments (no auth required to read) 162 - ``` 163 - 164 - ## Files 165 - 166 - ### New 167 - - `lexicons/tv/ionosphere/comment.json` — comment lexicon 168 - - `apps/ionosphere/src/lib/auth.ts` — OAuth client setup, sign-in/out, session state 169 - - `apps/ionosphere/src/app/components/AuthButton.tsx` — sign in/out button in nav 170 - - `apps/ionosphere/src/app/components/CommentOverlay.tsx` — inline comment highlights on transcript 171 - - `apps/ionosphere/src/app/components/CommentPanel.tsx` — comment thread sidebar 172 - - `apps/ionosphere/src/app/components/EmojiPicker.tsx` — quick reaction palette 173 - - 
`apps/ionosphere/src/app/components/TextSelection.tsx` — handles text selection → comment/react 174 - - `apps/ionosphere-appview/src/public-jetstream.ts` — Jetstream subscription for public network 175 - 176 - ### Modified 177 - - `apps/ionosphere-appview/src/db.ts` — add comments table 178 - - `apps/ionosphere-appview/src/indexer.ts` — handle tv.ionosphere.comment events 179 - - `apps/ionosphere-appview/src/routes.ts` — add comment endpoints 180 - - `apps/ionosphere-appview/src/appview.ts` — start public Jetstream connection 181 - - `apps/ionosphere/src/app/components/NavHeader.tsx` — add auth button 182 - - `apps/ionosphere/src/app/components/TranscriptView.tsx` — render comment highlights 183 - - `apps/ionosphere/src/app/layout.tsx` — OAuth provider wrapper 184 - - `apps/ionosphere/public/client-metadata.json` — OAuth client metadata
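The `anchor` byte range in the comment lexicon has to be derived from a browser text selection, which is measured in UTF-16 code units rather than bytes. The sketch below shows one way to do that conversion and assemble a record body. It assumes the anchor uses UTF-8 byte offsets (as `app.bsky.richtext.facet` indices do); the helper names `toByteRange` and `buildComment` and the example URI are hypothetical.

```typescript
interface ByteRange { byteStart: number; byteEnd: number; }

interface CommentRecord {
  $type: "tv.ionosphere.comment";
  subject: string;
  text: string;
  anchor?: ByteRange;
  createdAt: string;
}

const utf8 = new TextEncoder();

// Convert a selection given as JS string offsets into UTF-8 byte offsets.
// Assumes the offsets do not split a surrogate pair.
function toByteRange(fullText: string, charStart: number, charEnd: number): ByteRange {
  const byteStart = utf8.encode(fullText.slice(0, charStart)).length;
  const byteEnd = byteStart + utf8.encode(fullText.slice(charStart, charEnd)).length;
  return { byteStart, byteEnd };
}

// Build the record body; replies and whole-talk reactions simply omit the anchor.
function buildComment(
  subject: string,
  text: string,
  anchor?: ByteRange,
  now: Date = new Date(),
): CommentRecord {
  const record: CommentRecord = {
    $type: "tv.ionosphere.comment",
    subject,
    text,
    createdAt: now.toISOString(),
  };
  if (anchor) record.anchor = anchor;
  return record;
}
```

A record built this way would then be passed to the authenticated agent's `createRecord` call named in the spec; that call is left out here because it needs a live session.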
-96
docs/superpowers/specs/2026-04-02-comment-ui-polish-design.md
··· 1 - # Comment UI Polish — Design Spec 2 - 3 - **Date:** 2026-04-02 4 - **Status:** Approved 5 - 6 - ## Context 7 - 8 - Comments and reactions are working end-to-end (AT Protocol OAuth → user PDS → Jetstream → appview → frontend with optimistic rendering). This spec covers the next round of polish: author identity, discoverability, whole-talk reactions, and comment count badges. 9 - 10 - ## 1. Author Identity Resolution 11 - 12 - **Problem:** Comments display truncated DIDs (`did:plc:abc123...`) instead of human-readable identities. 13 - 14 - **Solution:** Appview-side profile cache. 15 - 16 - - Add a `profiles` table to the appview SQLite DB: 17 - ```sql 18 - CREATE TABLE IF NOT EXISTS profiles ( 19 - did TEXT PRIMARY KEY, 20 - handle TEXT, 21 - display_name TEXT, 22 - avatar_url TEXT, 23 - fetched_at TEXT 24 - ); 25 - ``` 26 - - When the appview encounters a comment from an unknown DID, resolve via `https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile?actor=<did>`. 27 - - Cache in DB. Refresh if `fetched_at` is older than 24 hours. 28 - - The `/talks/:rkey/comments` endpoint joins profile data onto each comment in its response. 29 - - Frontend renders handle + avatar wherever comments appear: 30 - - TranscriptView expanded popover (replaces `c.author_did.slice(8, 24)...`) 31 - - CommentPanel author line 32 - - Any future comment surfaces 33 - 34 - ## 2. Discoverability Hint 35 - 36 - **Problem:** No indication that text selection enables reactions. Users don't discover the feature. 37 - 38 - **Solution:** Persistent hint that dismisses after first use. 39 - 40 - - Small, subtle text below the transcript panel: "Select text to add a reaction" 41 - - Neutral color (e.g. `text-neutral-600`), doesn't compete with transcript content. 42 - - Visibility controlled by localStorage key `has_commented`. 43 - - Once the user publishes any comment or reaction (anchored or whole-talk), set the flag and hide the hint. 
44 - - On subsequent visits, the hint never appears. 45 - 46 - ## 3. Whole-Talk Reaction Bar 47 - 48 - **Problem:** Users can only react to specific text selections. There's no way to react to or comment on a talk as a whole (CommentPanel exists but isn't wired in). 49 - 50 - **Solution:** Compact reaction bar below the video player, above the transcript. 51 - 52 - - Row of 6 quick-reaction emoji buttons (same set as TextSelector: fire, clap, bulb, question, 100, heart). 53 - - A "Comment" button at the end. 54 - - Click emoji → publish unanchored comment (no byte range) with just the emoji as text. Optimistic rendering. 55 - - Click Comment → expand an inline text input. Post on Enter, collapse on Escape or after posting. 56 - - Display current whole-talk reaction counts inline in the bar (emoji + count pills, same style as existing player header reactions). 57 - - Remove the existing whole-talk reaction display from the TalksListContent player header title bar (it moves here). 58 - - When no reactions exist yet, just show the emoji buttons — no empty state clutter. 59 - 60 - ## 4. Comment Count Badges on Talk Listings 61 - 62 - **Problem:** Talk listings show no indication of comment/reaction activity. 63 - 64 - **Solution:** Add reaction summary to talk metadata lines. 65 - 66 - - Add a query to the `/talks` endpoint that returns `comment_count` and `reaction_summary` per talk. 67 - - `reaction_summary`: top 3 emoji types with counts, as a JSON array (e.g. `[["fire",2],["clap",1]]`). 68 - - `comment_count`: count of text comments (non-emoji). 69 - - Frontend renders in the existing metadata line pattern: 70 - ``` 71 - Speaker · Room · 10:30 AM · 🔥2 👏1 💬3 72 - ``` 73 - - Max 3 emoji types shown. The 💬N counter only appears if there are text-only comments. 74 - - If no comments/reactions exist for a talk, nothing is shown (no empty badge). 
75 - 76 - ## Files Affected 77 - 78 - ### Appview (backend) 79 - - `apps/ionosphere-appview/src/db.ts` — add `profiles` table to migration 80 - - `apps/ionosphere-appview/src/routes.ts` — join profiles on comment endpoints, add reaction summary to `/talks` 81 - - `apps/ionosphere-appview/src/indexer.ts` or new `src/profiles.ts` — profile resolution + caching logic 82 - - `apps/ionosphere-appview/src/public-jetstream.ts` — trigger profile resolution on new comment DIDs 83 - 84 - ### Frontend 85 - - `apps/ionosphere/src/app/components/TranscriptView.tsx` — render author handle/avatar in popover, add discoverability hint 86 - - `apps/ionosphere/src/app/talks/[rkey]/TalkContent.tsx` — add whole-talk reaction bar between video and transcript 87 - - `apps/ionosphere/src/app/talks/TalksListContent.tsx` — render comment count badges, remove header reaction display 88 - - `apps/ionosphere/src/lib/comments.ts` — update CommentData type to include profile fields 89 - - New component: reaction bar (could be inline in TalkContent or extracted) 90 - 91 - ## Non-Goals 92 - 93 - - Threaded comment replies (future work) 94 - - Comment moderation / reporting 95 - - Real-time comment updates via WebSocket to the frontend (currently polls on publish) 96 - - Comment editing or deletion
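The `comment_count` and `reaction_summary` fields from section 4 can be derived from the raw comment texts for a talk. The following sketch is one possible reading: it treats a comment as a reaction only when its text is exactly one of the quick-reaction emoji, and the emoji-to-name mapping is an assumption (the spec lists the names but not every glyph). In the appview this would more likely be a SQL aggregate; the TypeScript version is only meant to pin down the semantics.

```typescript
// Assumed glyphs for the six quick reactions named in the spec.
const REACTION_NAMES = new Map<string, string>([
  ["🔥", "fire"],
  ["👏", "clap"],
  ["💡", "bulb"],
  ["❓", "question"],
  ["💯", "100"],
  ["❤️", "heart"],
]);

interface TalkActivity {
  comment_count: number;                // text comments (non-emoji)
  reaction_summary: [string, number][]; // top 3 reaction types with counts
}

function summarizeActivity(texts: string[]): TalkActivity {
  const counts = new Map<string, number>();
  let commentCount = 0;
  for (const text of texts) {
    const name = REACTION_NAMES.get(text.trim());
    if (name === undefined) {
      commentCount += 1;
    } else {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    }
  }
  // Sort by count descending; ties keep first-seen order because sort is stable.
  const reaction_summary = Array.from(counts.entries())
    .sort((x, y) => y[1] - x[1])
    .slice(0, 3);
  return { comment_count: commentCount, reaction_summary };
}
```

An empty `reaction_summary` together with `comment_count === 0` corresponds to the "nothing is shown" case in the spec.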
-144
docs/superpowers/specs/2026-04-03-enhanced-boundary-detection-design.md
··· 1 - # Enhanced Boundary Detection Pipeline 2 - 3 - ## Goal 4 - 5 - Improve talk boundary detection reliability by adding two new signal layers — Whisper segment confidence and speaker diarization — while formalizing evaluation against ground truth. Train on Great Hall Day 1; reserve other 6 streams for verification to avoid overfitting. 6 - 7 - ## Current State 8 - 9 - **Pipeline:** `transcribe-fullday.ts` → `detect-boundaries-v5.ts` → `apply-boundaries.ts` 10 - 11 - **v5 signals:** silence gaps, transition phrases, phonetic speaker name matching, title keywords, DP assignment with drift tracking, forward-scan refinement. 12 - 13 - **v5 results (Great Hall Day 1, 16 talks):** 12/13 verified within 2 min of ground truth. One outlier (Sattestations, 3:02 vs 3:16) in a garbled break zone. 14 - 15 - **Weaknesses:** 16 - - Garbled zone detection uses fragile word-repetition patterns 17 - - No speaker identity signal — can't distinguish MC from presenter 18 - - No automated evaluation — manual eyeballing only 19 - 20 - ## Design 21 - 22 - ### Architecture 23 - 24 - ``` 25 - HLS stream 26 - → ffmpeg extract audio (once, shared) 27 - → [parallel] 28 - Whisper re-transcription (with prompt hints + segment confidence) 29 - pyannote speaker diarization 30 - → merge enrichment data into unified transcript JSON 31 - → detect-boundaries-v6.ts (enhanced scoring) 32 - → evaluate against ground truth 33 - ``` 34 - 35 - ### Layer 1: Audio Extraction (shared) 36 - 37 - Single extraction step produces audio files consumed by both Whisper and pyannote. 20-minute MP3 chunks for Whisper (25MB limit), plus full WAV for pyannote (needs uncompressed audio for best results, and has no size limit). 38 - 39 - Extracted audio stored in `data/fullday/<stream-name>/` so it persists across runs. 40 - 41 - ### Layer 2: Whisper Re-transcription with Segment Confidence 42 - 43 - Re-transcribe Great Hall Day 1 using both `word` and `segment` timestamp granularities. 
Each segment gains `avg_logprob` and `no_speech_prob` fields. 44 - 45 - Prompt hints per chunk: speaker names, talk titles, venue name ("ATmosphereConf 2026, Great Hall South"). This dramatically improves transcription quality (learned last session). 46 - 47 - Output: enhanced transcript JSON with both word-level timestamps and segment-level confidence. 48 - 49 - ### Layer 3: Speaker Diarization (pyannote.audio) 50 - 51 - **Tool:** pyannote.audio 3.x — state-of-the-art speaker diarization. Runs on MPS (Apple Silicon GPU). 52 - 53 - **Input:** WAV audio (full stream or chunked). 54 - 55 - **Output:** Speaker segments — `[{start, end, speaker_id}]` — aligned to word timestamps. 56 - 57 - **Integration:** Each word in the transcript gets a `speaker` field. Boundary detection uses speaker-change points as signals. 58 - 59 - **Panel handling:** Multiple speakers within a talk segment is expected. The boundary signal is whether the *set of active speakers* changes across a gap, not individual turn-taking. 60 - 61 - **Broader value:** Speaker diarization is a first-class annotation layer, useful beyond boundary detection (city council meetings, interviews, panels, etc.). 62 - 63 - ### Layer 4: Enhanced Boundary Detection (v6) 64 - 65 - New signals added to v5's scoring: 66 - 67 - | Signal | Weight | Description | 68 - |--------|--------|-------------| 69 - | `speaker_change` | 12 | Dominant speaker before gap ≠ dominant speaker after gap | 70 - | `speaker_set_change` | 8 | Set of speakers in prev window ≠ set in next window | 71 - | `confidence_drop` | 6 | Low `avg_logprob` zone (music, applause, bad mic) near gap | 72 - | `no_speech_zone` | 4 | High `no_speech_prob` segments reinforcing gap detection | 73 - 74 - Weights are initial estimates; tuned against ground truth. 75 - 76 - Garbled zone detection replaced: instead of word-repetition pattern matching, use `avg_logprob < threshold` and `no_speech_prob > threshold`. 
77 - 78 - ### Layer 5: Ground Truth & Evaluation 79 - 80 - Formalize Great Hall Day 1 ground truth as structured JSON: 81 - 82 - ```json 83 - { 84 - "stream": "Great Hall - Day 1", 85 - "talks": [ 86 - { 87 - "rkey": "gDELD0M", 88 - "title": "Landslide", 89 - "speaker": "Erin Kissane", 90 - "ground_truth_start": 990, 91 - "tolerance_seconds": 120 92 - }, 93 - ... 94 - ] 95 - } 96 - ``` 97 - 98 - Evaluation script scores a boundary detection run: 99 - - Accuracy: % of talks within tolerance of ground truth 100 - - Mean absolute error (seconds) 101 - - Per-talk breakdown with pass/fail 102 - 103 - ### Project Structure 104 - 105 - ``` 106 - apps/ionosphere-appview/ 107 - tools/ 108 - requirements.txt 109 - extract_audio.py # shared audio extraction from HLS 110 - diarize.py # pyannote speaker diarization → JSON 111 - transcribe_enhanced.py # Whisper with segment confidence + prompt hints 112 - merge_enrichment.py # combine transcription + diarization into unified JSON 113 - evaluate.py # score boundaries against ground truth 114 - src/ 115 - detect-boundaries-v6.ts # enhanced detection 116 - data/ 117 - ground-truth/ 118 - great-hall-day-1.json 119 - fullday/ # extracted audio + enriched transcripts 120 - ``` 121 - 122 - Python tools are standalone scripts that produce JSON. TypeScript pipeline consumes JSON. Clean boundary between languages. 123 - 124 - ### Dev Workflow 125 - 126 - 1. Extract audio for Great Hall Day 1 (once) 127 - 2. Run Whisper re-transcription with segment confidence 128 - 3. Run pyannote diarization 129 - 4. Merge into enriched transcript 130 - 5. Run v6 boundary detection 131 - 6. Evaluate against ground truth 132 - 7. Iterate on scoring weights 133 - 8. 
When satisfied, run on verification streams (other 6) to check generalization 134 - 135 - ### Dependencies 136 - 137 - **Python (new):** 138 - - `pyannote.audio` ≥ 3.1 139 - - `torch` (with MPS support) 140 - - `openai` (for Whisper API) 141 - 142 - **Existing:** 143 - - `ffmpeg` (already used) 144 - - `openai` npm package (existing, but Whisper calls move to Python)
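The evaluation metrics in Layer 5 (accuracy within tolerance, mean absolute error, per-talk pass/fail) are simple enough to sketch. The spec places this logic in a Python script, `evaluate.py`, whose contents are not part of this diff; the TypeScript below is only an illustration of the same metrics, not that script. Treating an undetected talk as a failure and excluding it from the mean absolute error is an assumption.

```typescript
interface GroundTruthTalk {
  rkey: string;
  title: string;
  ground_truth_start: number; // seconds from stream start
  tolerance_seconds: number;
}

interface TalkScore { rkey: string; errorSeconds: number | null; pass: boolean; }

interface Evaluation {
  accuracy: number;                 // fraction of talks within tolerance
  meanAbsoluteError: number | null; // seconds, over detected talks only
  perTalk: TalkScore[];
}

function evaluateBoundaries(truth: GroundTruthTalk[], detected: Map<string, number>): Evaluation {
  const perTalk = truth.map((t): TalkScore => {
    const start = detected.get(t.rkey);
    // Assumption: a talk with no detected boundary counts as a failure.
    if (start === undefined) return { rkey: t.rkey, errorSeconds: null, pass: false };
    const errorSeconds = Math.abs(start - t.ground_truth_start);
    return { rkey: t.rkey, errorSeconds, pass: errorSeconds <= t.tolerance_seconds };
  });
  const errors = perTalk
    .map((s) => s.errorSeconds)
    .filter((e): e is number => e !== null);
  const passed = perTalk.filter((s) => s.pass).length;
  return {
    accuracy: truth.length === 0 ? 0 : passed / truth.length,
    meanAbsoluteError:
      errors.length === 0 ? null : errors.reduce((sum, e) => sum + e, 0) / errors.length,
    perTalk,
  };
}
```

Running the same function against the verification streams, with weights frozen, is what would reveal overfitting to Great Hall Day 1.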
-251
docs/superpowers/specs/2026-04-05-alignment-editor-design.md
··· 1 - # Alignment Editor Design 2 - 3 - NLE-style alignment editing tools for the ionosphere.tv track timeline view. Enables visual verification and correction of talk boundaries, speaker naming, and ground truth building — directly in the browser. 4 - 5 - ## Scope 6 - 7 - Phases 1–4 from the pre-plan: 8 - 1. Drag-to-edit boundaries with undo/redo 9 - 2. Magnetic snap to silence gaps, speaker changes, word boundaries 10 - 3. Verification workflow with ground truth export 11 - 4. Speaker naming from diarization IDs 12 - 13 - Phase 5 (AT Protocol persistence) is deferred. The sidecar format is designed to map naturally to AT Protocol records when that time comes. 14 - 15 - ## Architecture: Timeline Engine with Layered Tracks 16 - 17 - A shared `TimelineEngine` (React context + store) owns the coordinate system, editing state, and corrections log. Individual rendering layers subscribe to the engine and handle their own display. This separates interaction, rendering, and data concerns cleanly. 18 - 19 - ### Timeline Engine 20 - 21 - The engine replaces the zoom/pan state currently in `ZoomableTimeline` and adds editing concerns. 
22 - 23 - **State:** 24 - - **Viewport**: zoomLevel, panCenter, windowStart/windowEnd, durationSeconds 25 - - **Editing**: mode (`select` | `trim` | `split` | `add`), editingEnabled (toggle), activeDrag, selection 26 - - **Playback**: currentTimeSec (mirrored from TimestampProvider) 27 - - **Corrections**: sidecar log, computed effectiveTalks, computed snapTargets, undo cursor 28 - 29 - **Derived values:** 30 - - `timeToPixel(seconds)` / `pixelToTime(px)` — coordinate conversion for all layers 31 - - `effectiveTalks` — pipeline talks with corrections replayed 32 - - `snapTargets` — computed from word timestamps, diarization, silence gaps 33 - - `canUndo` / `canRedo` 34 - 35 - **Actions:** 36 - - `setMode(mode)`, `toggleEditing()` 37 - - `startDrag(talkRkey, edge, pixelX)`, `updateDrag(pixelX)`, `commitDrag()`, `cancelDrag()` 38 - - `splitTalk(rkey, atSeconds)`, `addTalk(startSeconds, endSeconds)`, `removeTalk(rkey)` 39 - - `markVerified(rkey)`, `unmarkVerified(rkey)` 40 - - `setSpeakerName(speakerId, name)` 41 - - `undo()`, `redo()` 42 - - `save()` — persist sidecar to disk via API 43 - 44 - ### Corrections Sidecar 45 - 46 - An append-only log of edit operations. Each entry is a discrete record with identity and timestamp. 
47 - 48 - **Entry shape:** 49 - ```ts 50 - interface CorrectionEntry { 51 - id: string; // nanoid 52 - timestamp: string; // ISO 8601 53 - authorDid?: string; // AT Protocol DID if logged in 54 - streamSlug: string; 55 - action: CorrectionAction; 56 - } 57 - 58 - type CorrectionAction = 59 - | { type: "move_boundary"; talkRkey: string; edge: "start" | "end"; fromSeconds: number; toSeconds: number } 60 - | { type: "split_talk"; talkRkey: string; atSeconds: number; newRkey: string } 61 - | { type: "add_talk"; rkey: string; title: string; startSeconds: number; endSeconds: number } 62 - | { type: "remove_talk"; talkRkey: string } 63 - | { type: "set_talk_title"; talkRkey: string; title: string } 64 - | { type: "verify_talk"; talkRkey: string } 65 - | { type: "unverify_talk"; talkRkey: string } 66 - | { type: "name_speaker"; speakerId: string; name: string } 67 - ``` 68 - 69 - **Storage:** One JSON file per stream alongside the existing boundary JSON, e.g. `corrections-great-hall-day-1.json`. 70 - 71 - **Effective state:** Computed by replaying the log (up to the undo cursor) against the base pipeline talks. Pure reduce — the log is the source of truth. Replay semantics per action: 72 - - `move_boundary`: set the specified edge of the talk to `toSeconds` (absolute value; `fromSeconds` is for auditability only) 73 - - `split_talk`: replace the original talk with two — first gets `[originalStart, atSeconds)`, second gets `[atSeconds, originalEnd)` with `newRkey` and a placeholder title 74 - - `add_talk`: insert a new talk with the given fields 75 - - `remove_talk`: exclude the talk from effective state (if the talk was created by an earlier `add_talk`, that entry is still in the log and will re-create it on undo) 76 - - `set_talk_title`: update the talk's title 77 - 78 - **Undo/redo:** A cursor into the log. Undo decrements (entry stays but isn't applied). Redo increments. New edits truncate after the cursor. On save, only entries up to the cursor are persisted. 
79 - 80 - **AT Protocol future:** Each entry maps naturally to an AT Protocol record. The sidecar becomes a collection of records published to a PDS. 81 - 82 - ## Rendering Layers 83 - 84 - The timeline is a stack of independently rendered layers sharing coordinate conversion from the engine. 85 - 86 - **Layer stack (top to bottom):** 87 - 88 - 1. **Interaction overlay** — transparent div on top, handles all pointer events during editing. Renders drag handles, cursor changes, snap guide lines. 89 - 90 - 2. **Talk segments** — the existing `StreamTimeline` rendering, refactored to read from `effectiveTalks`. In edit mode, boundary edges get a visual affordance (brighter edge, ~4px hit zone). Selected talk gets a highlight. Verified talks show a checkmark badge. 91 - 92 - 3. **Waveform/diarization band** — combined visualization that morphs with zoom level: 93 - - **Low zoom (1–4x):** Speaker-colored blocks (current diarization band behavior) 94 - - **High zoom (4–8x+):** Speaker-colored area chart where height = word density per time bin 95 - - Crossover is gradual — blocks gain height variation as zoom increases 96 - 97 - 4. **Snap guides** — vertical lines at snap target positions, only visible during drag. Faint dashed lines that brighten within snap range. 98 - 99 - 5. **Time ruler** — tick marks and time labels. At higher zoom, intermediate ticks appear. 100 - 101 - Each layer is a React component reading from `useTimelineEngine()`, absolutely positioned within a shared container. 102 - 103 - **Waveform computation:** Pre-computed on mount from transcript words. For the visible window, bucket words into ~2px-wide time bins. Height = word count per bin. Color = dominant speaker in each bin. 64K words into a few hundred bins is trivial. 
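The waveform computation described above (bucket words into roughly 2px-wide time bins, height from word count, color from the dominant speaker) might be sketched as follows. The `SpokenWord` shape and the `binWords` name are assumptions, and using only each word's start time to pick its bin is a simplification.

```typescript
interface SpokenWord { startSec: number; speaker: string; }
interface WaveformBin { count: number; dominantSpeaker: string | null; }

function binWords(
  words: SpokenWord[],
  windowStart: number,
  windowEnd: number,
  widthPx: number,
  binPx: number = 2,
): WaveformBin[] {
  const binCount = Math.max(1, Math.ceil(widthPx / binPx));
  const secondsPerBin = (windowEnd - windowStart) / binCount;

  // One speaker tally per bin.
  const tallies: Map<string, number>[] = [];
  for (let i = 0; i < binCount; i++) tallies.push(new Map<string, number>());

  for (const w of words) {
    // Skip words outside the visible window.
    if (w.startSec < windowStart || w.startSec >= windowEnd) continue;
    const i = Math.min(binCount - 1, Math.floor((w.startSec - windowStart) / secondsPerBin));
    const tally = tallies[i];
    tally.set(w.speaker, (tally.get(w.speaker) ?? 0) + 1);
  }

  return tallies.map((tally): WaveformBin => {
    let count = 0;
    let best = 0;
    let dominantSpeaker: string | null = null;
    for (const [speaker, n] of Array.from(tally.entries())) {
      count += n;
      if (n > best) {
        best = n;
        dominantSpeaker = speaker;
      }
    }
    return { count, dominantSpeaker };
  });
}
```

Each call is linear in the number of words, which is consistent with the spec's remark that 64K words into a few hundred bins is cheap.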
104 - 105 - ## NLE Toolbar & Edit Mode 106 - 107 - **Layout:** 108 - ``` 109 - [Edit toggle] | [− zoom +] [window range] [Reset] 110 - | [Select] [Trim] [Split] [Add] [Delete] | [Undo] [Redo] | [Save] 111 - ``` 112 - 113 - Top row: existing zoom controls. Bottom row: editing toolbar, visible only when Edit is toggled on. 114 - 115 - **Modes:** 116 - 117 - - **Select** (`V`): Click a talk to select it. Shows details (title, speaker, start/end, verified status). Default when editing is enabled. 118 - - **Trim** (`T`): Hover near a boundary edge to see drag handle. Click-drag to move. Snap targets attract within 10px. Alt/Option overrides snapping. 119 - - **Split** (`S`): Click on a talk segment to split at that position. First segment inherits original metadata; second gets a placeholder title. 120 - - **Add** (`A`): Click-drag on an empty gap to create a new talk segment. Gets placeholder title, unverified status. 121 - - **Delete** (`Backspace`/`Delete` when selected): Removes the selected talk. Confirmation required for verified talks. 122 - 123 - **Keyboard shortcuts:** 124 - 125 - Playback (always active): 126 - - `J` / `K` / `L` — reverse / pause / forward 127 - - `Arrow Left` / `Arrow Right` — nudge playhead 1 second 128 - - `Shift+Arrow Left` / `Shift+Arrow Right` — nudge playhead 0.1 second 129 - - `Space` — play/pause 130 - 131 - Editing (when edit mode is on): 132 - - `Ctrl+Z` / `Ctrl+Shift+Z` — undo / redo 133 - - `[` — nudge selected talk's start boundary 1 second earlier 134 - - `]` — nudge selected talk's end boundary 1 second later 135 - - `Shift+[` — nudge start boundary 0.1 second earlier 136 - - `Shift+]` — nudge end boundary 0.1 second later 137 - - `Enter` — mark selected talk as verified 138 - - `Escape` — cancel drag, deselect, or exit edit mode 139 - - `V` / `T` / `S` / `A` — switch mode 140 - - `Ctrl+S` / `Cmd+S` — save 141 - 142 - **Save** writes the corrections sidecar to disk via the API. 
Explicit save (not auto-save) — an editor should be intentional. 143 - 144 - **Dirty state indicator:** When unsaved edits exist (in-memory log diverges from persisted sidecar), the Save button shows a dot/badge and the toolbar displays "unsaved changes". 145 - 146 - ## Snap System 147 - 148 - During boundary drag in Trim mode, nearby positions exert magnetic pull. 149 - 150 - **Snap targets (priority order):** 151 - 152 - 1. **Silence gaps > 2s** — from word timestamps. Where consecutive words are > 2s apart, the gap is a snap target. (Refined from the pre-plan's 3s threshold — 2s catches more useful transitions.) 153 - 2. **Speaker change points** — from diarization segments. 154 - 3. **Low-confidence zones** — Whisper segments with low `avg_logprob`. Edges often correspond to applause/noise. 155 - 4. **Word boundaries** — at high zoom, snap to nearest word start/end. 156 - 157 - **Edge-aware snapping:** Snap targets resolve to the near edge of the target feature relative to the boundary being dragged, offset by 500ms: 158 - 159 - - Dragging a **start boundary** (left edge): snaps to the **end** of the preceding gap + 500ms. Lands just before the speaker's first words. 160 - - Dragging an **end boundary** (right edge): snaps to the **start** of the following gap − 500ms. Lands just after the speaker's last words. 161 - 162 - The 500ms delta provides breathing room so cuts don't land on the first/last syllable. If the offset would overshoot the nearest word boundary (e.g., gap ends only 300ms before the first word), clamp to the word boundary instead. 
163 - 164 - **Behavior:** 165 - - Snap radius: ~10px screen distance (adapts to zoom) 166 - - Multiple targets within range: highest priority wins 167 - - Visual feedback: snap guide brightens, small label appears ("silence gap", "speaker change") 168 - - Alt/Option held: snapping disabled, continuous positioning 169 - 170 - **Pre-computation:** Snap targets computed once on stream data load from existing transcript and diarization data. Stored sorted by time for binary-search during drag. 171 - 172 - ## Verification Workflow 173 - 174 - 1. Open track view, toggle Edit on 175 - 2. Select a talk — highlights on timeline, video seeks to start 176 - 3. Play through boundary — video + transcript sync shows what's happening 177 - 4. If wrong: Trim mode, drag to correct, snaps help 178 - 5. If correct: Enter to mark verified 179 - 180 - **Visual indicators:** Verified talks show a checkmark badge on the timeline segment and in the talk list. 181 - 182 - **Progress:** Stream page shows "8/13 talks verified" stat. 183 - 184 - **Ground truth export:** An action (toolbar button or CLI) that outputs all verified talks in the existing ground truth JSON format. Field mapping from effective state: 185 - - `rkey` — from the effective talk 186 - - `title` — from the effective talk (after any `set_talk_title` corrections) 187 - - `speaker` — from the `name_speaker` mapping for the dominant speaker during the talk, or empty string if unnamed 188 - - `ground_truth_start` — the effective talk's `startSeconds` 189 - - `tolerance_seconds` — 120 (uniform, matching existing ground truth) 190 - - `verified` — true (only verified talks are exported) 191 - - `notes` — auto-generated: correction count, original pipeline timestamp for diff reference 192 - 193 - Feeds directly into the boundary detection evaluation pipeline. 194 - 195 - ## Speaker Naming 196 - 197 - **Interaction:** In Select mode, clicking a diarization segment selects that speaker. 
A popover shows: 198 - - Current label (e.g., "SPEAKER_12") 199 - - Text input to assign a name 200 - - List of talks where this speaker is dominant 201 - 202 - **Auto-suggestion:** When verifying a talk, the engine checks which speaker ID is dominant during that talk's range. If the talk has a known speaker from schedule data, it offers to auto-map. 203 - 204 - **Scope:** Stream-level mapping. SPEAKER_12 in Great Hall Day 1 maps to a name for that stream only (pyannote assigns IDs independently per file). 205 - 206 - **Display:** Named speakers show their name in tooltips. Diarization band legend uses names where available. 207 - 208 - ## Data Flow 209 - 210 - ``` 211 - Pipeline boundary JSON (read-only base) 212 - 213 - TimelineEngine loads base talks + corrections sidecar 214 - ↓ replay corrections log up to undo cursor 215 - Effective talks (derived) 216 - 217 - Rendering layers read effective talks + snap targets 218 - 219 - User edits → new CorrectionEntry appended to log 220 - 221 - Save → sidecar JSON written to disk 222 - ↓ (future) 223 - Publish → sidecar entries become AT Protocol records 224 - ``` 225 - 226 - ## API Surface 227 - 228 - The appview needs two new XRPC endpoints (matching existing naming convention): 229 - 230 - - `GET /xrpc/tv.ionosphere.getCorrections?stream=<slug>` — load the corrections sidecar 231 - - `PUT /xrpc/tv.ionosphere.putCorrections` (body: `{ stream, corrections }`) — save the corrections sidecar 232 - 233 - These read/write the sidecar JSON file. No schema changes to the existing database. The PUT endpoint validates the stream slug against known streams to prevent path traversal. 
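A sketch of the slug validation the PUT endpoint calls for, assuming an allow-list built from the stream slugs listed in the track timeline spec. The `correctionsPathForStream` helper and the sidecar path layout are hypothetical; the spec does not say where the sidecar file lives.

```typescript
// Known stream slugs (taken from the track timeline spec's slug table).
const KNOWN_STREAM_SLUGS: ReadonlySet<string> = new Set([
  "great-hall-day-1",
  "great-hall-day-2",
  "room-2301-day-1",
  "room-2301-day-2",
  "performance-theatre-day-1",
  "performance-theatre-day-2",
  "atscience",
]);

// Returns the sidecar path for a stream, or null if the slug is unknown.
// Because only exact allow-list matches pass, inputs such as "../secrets"
// can never reach the filesystem layer.
// NOTE: the "<dataDir>/<slug>/corrections.json" layout is an assumption.
function correctionsPathForStream(slug: string, dataDir: string): string | null {
  if (!KNOWN_STREAM_SLUGS.has(slug)) {
    return null;
  }
  return `${dataDir}/${slug}/corrections.json`;
}
```

Checking membership in a fixed set, rather than sanitizing the string, keeps the rule easy to audit: anything not on the list is rejected before a path is ever built.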
234 - 235 - ## File References 236 - 237 - Files to modify: 238 - - `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` — refactor to use TimelineEngine 239 - - `apps/ionosphere/src/app/components/StreamTimeline.tsx` — becomes a rendering layer 240 - - `apps/ionosphere/src/app/components/DiarizationBand.tsx` — becomes the waveform/diarization layer 241 - 242 - New files: 243 - - `apps/ionosphere/src/lib/timeline-engine.ts` — engine store and logic 244 - - `apps/ionosphere/src/lib/corrections.ts` — sidecar types, replay, serialization 245 - - `apps/ionosphere/src/lib/snap-targets.ts` — snap computation and lookup 246 - - `apps/ionosphere/src/app/components/TimelineToolbar.tsx` — NLE toolbar 247 - - `apps/ionosphere/src/app/components/InteractionOverlay.tsx` — drag handles, hit detection 248 - - `apps/ionosphere/src/app/components/WaveformBand.tsx` — combined waveform/diarization 249 - - `apps/ionosphere/src/app/components/SnapGuides.tsx` — snap guide rendering 250 - - `apps/ionosphere/src/app/components/SpeakerPopover.tsx` — speaker naming UI 251 - - `apps/ionosphere-appview/src/corrections-api.ts` — load/save endpoints
-160
docs/superpowers/specs/2026-04-05-alignment-editor-preplan.md
··· 1 - # Alignment Editor Pre-Plan 2 - 3 - ## Context 4 - 5 - We have a working track timeline view (`/tracks/[stream]`) showing full-day conference streams with talk segments, speaker diarization, and synced transcripts. The current pipeline (v6 + LLM refinement) achieves 100% accuracy / 11s MAE on ground truth, but many streams have no ground truth yet and the remaining errors require manual correction. 6 - 7 - The next step is building NLE-style alignment editing tools so that talk boundaries can be visually verified and adjusted directly in the browser, like segment editing in Final Cut Pro or DaVinci Resolve. 8 - 9 - ## What We Have 10 - 11 - ### Data 12 - 13 - - **7 full-day streams** (Great Hall Sat/Sun, Room 2301 Sat/Sun, Perf Theatre Sat/Sun, ATScience Friday) 14 - - **~358K words** transcribed with Whisper segment confidence (`avg_logprob`, `no_speech_prob`) 15 - - **Speaker diarization** for all streams (pyannote, 21-68 speakers per stream, ~1600-3800 segments each) 16 - - **Boundary results** per stream (`transcript-enriched-boundaries-v6-refined.json`) with detected talk starts, confidence, refinement method (LLM/diarization-fallback/manual) 17 - - **Ground truth** for Great Hall Day 1 only (13 verified talks, 11s MAE) 18 - - **Talk records** in the DB with `video_segments` containing fullday stream offsets 19 - 20 - ### Current UI 21 - 22 - - `/tracks/[stream]` — video player (1/3 vh), zoomable timeline with talk segments + speaker diarization band, tabs for Talks list and Transcript (reuses existing infinite-scroll TranscriptView) 23 - - Zoom: scroll/pinch to zoom, shift+scroll to pan, +/- buttons, Reset 24 - - Colors: golden angle hue spacing, stable across zoom (keyed by rkey/speaker ID) 25 - - Timeline click seeks video 26 - - Talk list highlights active talk, click seeks 27 - 28 - ### Known Issues in Current UI 29 - 30 - - Video displays as narrow band (aspect ratio not constrained properly on some viewports) 31 - - Some layout edge cases with 
scroll containers 32 - - Zoom gesture can conflict with page scroll on some browsers/trackpads 33 - - No way to edit anything — read-only 34 - 35 - ## What We Want to Build 36 - 37 - ### Core: Drag-to-Edit Talk Boundaries 38 - 39 - The primary interaction: grab a talk boundary edge on the timeline and drag it to adjust where the talk starts or ends. This is the NLE razor/trim metaphor. 40 - 41 - **Key design questions:** 42 - - Should boundaries be discrete (snap to silence gaps / speaker changes) or continuous (any position)? 43 - - How do we handle the gap between talks (MC intro, applause, setup)? 44 - - Can two talks overlap? Or must they be contiguous? 45 - - What happens to the "unassigned" time between talks (MC segments, breaks)? 46 - 47 - **Likely approach:** Talk segments on the timeline become editable regions. Each boundary is a draggable handle. Dragging snaps to nearby useful positions (silence gaps, speaker changes, transcript word boundaries) but can be overridden. Talks don't overlap but gaps between them are allowed (representing MC/transition time). 48 - 49 - ### Waveform or Energy Display 50 - 51 - NLEs show audio waveform to help identify silence gaps, applause, and speech. We have: 52 - - Whisper segment confidence (`avg_logprob`, `no_speech_prob`) which roughly correlates with audio energy 53 - - Speaker diarization segments which show speech/silence patterns 54 - - Raw word timestamps which show speech density 55 - 56 - A pseudo-waveform derived from word density + confidence would be cheaper than computing actual audio waveform but still useful for visual alignment. 57 - 58 - ### Verification Workflow 59 - 60 - 1. Open track view → see all detected boundaries on timeline 61 - 2. Play from each boundary — video + transcript sync shows what's happening 62 - 3. If boundary is wrong: drag to correct position 63 - 4. Mark talk as "verified" — builds ground truth 64 - 5. 
Save corrections → updates DB + boundary JSON 65 - 66 - ### Speaker Assignment 67 - 68 - Currently speakers are anonymous (SPEAKER_00, SPEAKER_01). The diarization data could be enriched by: 69 - - Mapping dominant speaker in each talk segment to the known speaker name 70 - - Allowing manual correction of speaker labels 71 - - Using this to identify where speakers appear across the full day (panels, Q&A, etc.) 72 - 73 - ### Data Model for Edits 74 - 75 - Corrections need to be: 76 - - Persisted (survive appview restart) 77 - - Exportable (can update production PDS records) 78 - - Versionable (track who changed what when) 79 - 80 - Options: 81 - - **A) Write to local JSON** (simplest, like current boundary-timings.csv) 82 - - **B) Write to DB** (update `video_segments` directly) 83 - - **C) Write to AT Protocol** (new lexicon for alignment corrections, published to PDS) 84 - 85 - Option B for local dev, graduating to C for production. The AT Protocol approach makes corrections social — anyone could submit alignment corrections. 86 - 87 - ## Technical Considerations 88 - 89 - ### Timeline Interaction Layer 90 - 91 - The current `StreamTimeline` is a simple div with absolutely-positioned colored blocks. For drag editing we need: 92 - - Hit detection on boundary edges (not just the block interior) 93 - - Drag handles with visual affordance (resize cursors, highlighted edges) 94 - - Drag constraints (min segment duration, can't overlap adjacent talk) 95 - - Snap-to guides (silence gaps, speaker changes, word boundaries) 96 - - Undo/redo 97 - 98 - This is a significant step up from the current passive display. Libraries like `@use-gesture/react` could help with drag interactions. Canvas rendering might be needed for smooth scrubbing at high zoom. 
99 - 100 - ### Snap Targets 101 - 102 - When dragging a boundary, nearby "interesting" positions should attract the handle: 103 - - Silence gaps > 3s (from word timestamps) 104 - - Speaker change points (from diarization) 105 - - Low-confidence zone edges (from Whisper segments) 106 - - Transcript word boundaries (for precise alignment) 107 - 108 - These are all derivable from existing data — no new processing needed. 109 - 110 - ### Performance 111 - 112 - Full-day transcripts are 50-65K words. The current TranscriptView renders all words and handles it well. The timeline has ~16-24 talk segments and ~1600-3800 diarization segments — no performance concern. Drag interactions need 60fps which means the timeline render path must be efficient (CSS transforms, not re-layout). 113 - 114 - ### Keyboard Shortcuts 115 - 116 - NLE users expect: 117 - - J/K/L for reverse/pause/forward 118 - - Arrow keys for frame-by-frame (or word-by-word in our case) 119 - - I/O for setting in/out points 120 - - Spacebar for play/pause 121 - - [ ] for nudging boundaries 122 - 123 - ## Dependencies 124 - 125 - - Current track timeline view (done) 126 - - Stable colors (done) 127 - - Zoom + pan (done) 128 - - Video player seeking (done) 129 - - TranscriptView sync (done) 130 - 131 - ## What to Decide Before Building 132 - 133 - 1. **Interaction model**: Should editing be a separate mode (toggle "Edit" button) or always-on with hover affordance? 134 - 2. **Granularity**: Edit individual talk boundaries, or also allow adding/removing/splitting talks? 135 - 3. **Persistence**: Local-only for now, or AT Protocol from the start? 136 - 4. **Waveform**: Worth building a pseudo-waveform from word density, or skip it? 137 - 5. **Scope**: Just boundaries, or also speaker naming in this iteration? 138 - 6. **Multi-user**: Can multiple people edit the same stream? Conflict resolution? 139 - 140 - ## Suggested Decomposition 141 - 142 - 1. 
**Phase 1: Drag-to-edit boundaries** — core NLE trim interaction on the timeline, persisted to local JSON, undo/redo 143 - 2. **Phase 2: Snap targets** — silence gaps, speaker changes, word boundaries as magnetic snap points 144 - 3. **Phase 3: Verification workflow** — mark talks as verified, build ground truth, evaluation scoring 145 - 4. **Phase 4: Speaker naming** — map diarization IDs to known speakers, manual correction 146 - 5. **Phase 5: AT Protocol persistence** — publish corrections as records, social verification 147 - 148 - ## References 149 - 150 - - Current track view: `apps/ionosphere/src/app/tracks/[stream]/TrackViewContent.tsx` 151 - - Timeline component: `apps/ionosphere/src/app/components/StreamTimeline.tsx` 152 - - Diarization band: `apps/ionosphere/src/app/components/DiarizationBand.tsx` 153 - - Color system: `apps/ionosphere/src/lib/track-colors.ts` 154 - - Boundary detection: `apps/ionosphere-appview/src/detect-boundaries-v6.ts` 155 - - LLM refinement: `apps/ionosphere-appview/src/refine-boundaries-llm.ts` 156 - - Track API: `apps/ionosphere-appview/src/tracks.ts` 157 - - Boundary results: `apps/ionosphere-appview/data/fullday/<stream>/transcript-enriched-boundaries-v6-refined.json` 158 - - Ground truth: `apps/ionosphere-appview/data/ground-truth/great-hall-day-1.json` 159 - - Diarization data: `apps/ionosphere-appview/data/fullday/<stream>/diarization.json` 160 - - Transcript data: `apps/ionosphere-appview/data/fullday/<stream>/transcript-enriched.json`
-93
docs/superpowers/specs/2026-04-05-track-timeline-view-design.md
··· 1 - # Track Timeline View 2 - 3 - ## Goal 4 - 5 - A browsable view for full-day conference streams, showing the video with talk segments on a visual timeline, speaker diarization as colored bands, and the existing transcript/comments display synced to playback. 6 - 7 - ## Route & Navigation 8 - 9 - - `/tracks` — index page listing all streams (7 total) 10 - - `/tracks/[stream]` — individual stream view (e.g. `/tracks/great-hall-day-1`) 11 - - Talks that have a fullday video source link to their track view from the talk page 12 - 13 - ## Layout 14 - 15 - 1. **Video player** — full-day HLS stream, same player component used elsewhere 16 - 2. **Timeline bar** — horizontal bar spanning stream duration 17 - - Talk segments as labeled, proportionally-sized blocks 18 - - Playback scrubber line moving with the video 19 - - Click anywhere on the timeline to seek 20 - - Click a talk segment to jump to its start 21 - 3. **Speaker diarization band** — thin colored strip below the timeline showing speaker activity. Each speaker gets a consistent color. Hovering shows speaker ID. 22 - 4. **Talk list** — ordered list of talks in the stream with start times, speakers, and jump-to action 23 - 5. **Transcript view** — the full track transcript (from `transcript-enriched.json`), synced to playback position. Talk boundaries shown as markers within the continuous transcript. No switching between per-talk transcripts — the track transcript IS the transcript, with talk segments as markers on it. 
24 - 25 - ## API 26 - 27 - New endpoint: `tv.ionosphere.getTrack` 28 - 29 - **Input:** stream identifier (slug like `great-hall-day-1` or stream URI) 30 - 31 - **Output:** 32 - ```json 33 - { 34 - "stream": "Great Hall - Day 1", 35 - "streamUri": "at://...", 36 - "durationSeconds": 28433, 37 - "playbackUrl": "https://vod-beta.stream.place/...", 38 - "talks": [ 39 - { 40 - "rkey": "gDELD0M", 41 - "title": "Landslide", 42 - "speakers": ["Erin Kissane"], 43 - "startSeconds": 990, 44 - "endSeconds": 4254, 45 - "confidence": "high" 46 - } 47 - ], 48 - "diarization": [ 49 - { "start": 0, "end": 45.2, "speaker": "SPEAKER_00" }, 50 - { "start": 45.2, "end": 120.5, "speaker": "SPEAKER_01" } 51 - ] 52 - } 53 - ``` 54 - 55 - The diarization array is served from the per-stream JSON files already on disk. For the initial implementation, serve the full diarization data — it's ~3000 segments per stream which is manageable. Can be simplified later if needed. 56 - 57 - ## Data Sources 58 - 59 - All data already exists: 60 - - Stream URIs and playback URLs: hardcoded in `transcribe-fullday.ts`, also derivable from stream records 61 - - Talk segments with offsets: `video_segments` field on talk records in DB 62 - - Diarization: `data/fullday/<stream>/diarization.json` 63 - - Track transcripts: `data/fullday/<stream>/transcript-enriched.json` (full track with timestamps + speaker labels) 64 - 65 - ## Stream Slug Mapping 66 - 67 - | Slug | Stream Name | URI | 68 - |------|------------|-----| 69 - | great-hall-day-1 | Great Hall - Day 1 | at://...3miieadw52j22 | 70 - | great-hall-day-2 | Great Hall - Day 2 | at://...3miighlz53o22 | 71 - | room-2301-day-1 | Room 2301 - Day 1 | at://...3miieadx2dj22 | 72 - | room-2301-day-2 | Room 2301 - Day 2 | at://...3miieadxeqn22 | 73 - | performance-theatre-day-1 | Performance Theater - Day 1 | at://...3miieadwgvz22 | 74 - | performance-theatre-day-2 | Performance Theater - Day 2 | at://...3miieadwqgy22 | 75 - | atscience | ATScience - Full Day | 
at://...3miieadvruo22 | 76 - 77 - ## Frontend Components 78 - 79 - - **TrackIndex** — `/tracks` page, lists all streams with room, day, duration, talk count 80 - - **TrackView** — `/tracks/[stream]` page, orchestrates the layout 81 - - **StreamTimeline** — the horizontal timeline with talk segments and scrubber 82 - - **DiarizationBand** — colored speaker band below timeline 83 - - **TrackTalkList** — ordered talk list with jump-to actions 84 - 85 - Reuses existing: 86 - - Video player component 87 - - TranscriptView (adapted to use track-level transcript rather than per-talk) 88 - 89 - ## Not In Scope 90 - 91 - - Drag-to-edit boundaries (future) 92 - - Speaker naming / mapping diarization IDs to real names (future) 93 - - Waveform visualization (future)
-159
docs/superpowers/specs/2026-04-12-boundary-detection-v7-design.md
··· 1 - # Boundary Detection v7 — Diarization-First Pipeline 2 - 3 - **Date**: 2026-04-12 4 - **Status**: Approved 5 - **File**: `apps/ionosphere-appview/src/detect-boundaries-v7.ts` 6 - 7 - ## Problem 8 - 9 - The v6 boundary detector uses transcript gaps and Whisper speaker labels as primary signals. During manual verification of all 7 streams, we found this approach fails badly in hallucination zones (Whisper fills silence with repeating text like "Transcription by CastingWords", "0 0 0 0", song lyrics) and causes cross-track assignment errors when matching by schedule time + room. 10 - 11 - Key finding: **diarization data tracks actual audio** and shows real speech boundaries even through hallucination zones. A 94-minute silence gap on PT D2 was completely invisible in the transcript but obvious in diarization. 12 - 13 - ## Approach 14 - 15 - **Diarization first, transcript second.** Build a timeline of talk-shaped segments from diarization, then use transcript content to identify which talk each segment contains. Schedule provides candidate talks but never dictates timing. 
16 - 17 - ## Pipeline 18 - 19 - ``` 20 - Diarization JSON + Transcript JSON + Schedule (DB) 21 - 22 - Stage 1: Diarization Segmentation 23 - 24 - Stage 2: Transcript Content Matching 25 - 26 - Stage 3: Schedule Reconciliation 27 - 28 - Boundary JSON (v6-compatible format) 29 - ``` 30 - 31 - ## Stage 1: Diarization Segmentation 32 - 33 - **Input**: `diarization.json` (`{segments: [{start, end, speaker}], speakers: []}`) 34 - 35 - **Process**: 36 - - Merge adjacent same-speaker segments with < 5s gaps into speech blocks 37 - - Classify gaps between blocks: 38 - - \> 60s = **session break** 39 - - 30-60s = **likely talk boundary** 40 - - < 30s = **within-talk pause** 41 - - Group blocks between session breaks into sessions 42 - - Within each session, classify by speaker distribution: 43 - - One speaker > 70% duration = **single-speaker talk** 44 - - Multiple speakers with balanced time = **panel** 45 - - **Hallucination detection**: Where diarization shows silence but transcript has words, mark as hallucination zone. 
Also detect known patterns: 46 - - Repeating phrases in ~30s loops ("Transcription by CastingWords", "Transcribed by https://otter.ai") 47 - - Numeric zeros ("0 0 0 0 0") 48 - - "Microsoft Office Word Document MSWordDoc" 49 - - "Transcription by ESO Translation by --" 50 - - "UGA Extension Office of Communications and Creative Services" 51 - - Non-English loops (Welsh "Rwy n gobeithio...") 52 - - Song lyrics between known gaps (DJ music on GH D2) 53 - - URLs/attribution ("Subs by www.zeoranger.co.uk", "www.fema.gov") 54 - - "Thank you for watching" / "Thank you" loops 55 - 56 - **Output**: 57 - ```ts 58 - interface TalkSegment { 59 - startS: number; 60 - endS: number; 61 - speakers: { id: string; durationS: number }[]; 62 - type: 'single-speaker' | 'panel' | 'unknown'; 63 - dominantSpeaker?: string; 64 - precedingGapS: number; 65 - hallucinationZone: boolean; 66 - } 67 - 68 - interface HallucinationZone { 69 - startS: number; 70 - endS: number; 71 - pattern: string; 72 - } 73 - ``` 74 - 75 - ## Stage 2: Transcript Content Matching 76 - 77 - For each `TalkSegment`, extract identity signals from transcript text in that time range. 78 - 79 - **Signals (ordered by reliability)**: 80 - 81 - 1. **MC handoffs**: "please welcome {NAME}", "next up is {NAME}", "setting up next". Found in the 30-60s before a talk starts. 82 - 2. **Self-introductions**: "my name is {NAME}", "I'm {NAME} I'm from/at/with {ORG}". First 60s of a talk. Strongest identity signal. 83 - 3. **Topic keywords**: Nouns/phrases from first 2 minutes matched against talk titles. 84 - 4. **Speaker name matching**: Fuzzy/phonetic match against schedule speaker list. Handles Whisper mangling ("Jekard"/"Jacquard", "Wardmuller"/"Werdmuller"). 
85 - 86 - **Matching logic**: 87 - - Hallucination zone segments: `confidence = 'unverifiable'`, candidates from schedule by time window 88 - - Speaker name + topic keyword match: `confidence = 'high'` 89 - - Speaker name OR topic keyword match: `confidence = 'medium'` 90 - - No match: `confidence = 'low'` 91 - 92 - **Panel handling**: When segment type is `panel`, extract ALL speaker names and match multiple schedule entries to the same time range. Flag as `panel: true`. 93 - 94 - **Output**: 95 - ```ts 96 - interface BoundaryMatch { 97 - rkey: string; 98 - title: string; 99 - startS: number; 100 - endS: number; 101 - confidence: 'high' | 'medium' | 'low' | 'unverifiable'; 102 - signals: string[]; 103 - panel: boolean; 104 - hallucinationZones: HallucinationZone[]; 105 - } 106 - ``` 107 - 108 - ## Stage 3: Schedule Reconciliation 109 - 110 - 1. **Validate matches**: Resolve duplicate assignments (same rkey to multiple segments). Pick highest confidence. 111 - 2. **Unmatched schedule entries**: If scheduled time falls in hallucination zone, assign as `unverifiable`. If within a panel's range, assign with `low` confidence. Otherwise omit with log message. 112 - 3. **Unmatched segments**: Real speech with no schedule match. Output as `unknown-talk` for manual review. 113 - 4. **End time calculation**: Each talk ends at next talk's start minus gap, or diarization silence onset. Last talk in session ends at last diarization speech. Absolute last talk ends at stream duration or last speech. 
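The Stage 2 matching logic above reduces to a small decision function. The sketch below is illustrative only: the `MatchSignals` shape and the `classifyConfidence` name are assumptions, while the four tiers and their conditions follow the spec.

```typescript
type Confidence = "high" | "medium" | "low" | "unverifiable";

// Hypothetical summary of what Stage 2 found for one segment.
interface MatchSignals {
  inHallucinationZone: boolean;
  speakerNameMatched: boolean;
  topicKeywordMatched: boolean;
}

function classifyConfidence(signals: MatchSignals): Confidence {
  // Hallucination zones have no trustworthy transcript, so they are
  // checked first and never upgraded by text matches.
  if (signals.inHallucinationZone) return "unverifiable";
  if (signals.speakerNameMatched && signals.topicKeywordMatched) return "high";
  if (signals.speakerNameMatched || signals.topicKeywordMatched) return "medium";
  return "low";
}
```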
114 - 115 - ## Output Format 116 - 117 - Compatible with v6 for downstream use by `refine-boundaries-llm.ts` and `apply-boundaries.ts`: 118 - 119 - ```ts 120 - { 121 - stream: string; 122 - results: BoundaryMatch[]; 123 - hallucinationZones: HallucinationZone[]; 124 - unmatchedSegments: TalkSegment[]; 125 - unmatchedSchedule: string[]; 126 - } 127 - ``` 128 - 129 - ## CLI Interface 130 - 131 - ```bash 132 - npx tsx src/detect-boundaries-v7.ts \ 133 - data/fullday/<Dir>/transcript-enriched.json \ 134 - --diarization data/fullday/<Dir>/diarization.json \ 135 - --stream-slug great-hall-day-1 136 - ``` 137 - 138 - `--diarization` is required. Stream slug pulls schedule from DB. 139 - 140 - ## Confidence Tiers 141 - 142 - | Tier | Meaning | Action | 143 - |------|---------|--------| 144 - | high | Diarization boundary + transcript confirms speaker and topic | Auto-accept | 145 - | medium | Diarization boundary exists, partial transcript match | Review recommended | 146 - | low | Weak match, possibly wrong assignment | Manual verification needed | 147 - | unverifiable | Talk in hallucination zone, no audio evidence | Check video or remove | 148 - 149 - ## Future: Hallucination Re-transcription (Phase C) 150 - 151 - Marked hallucination zones enable a future pipeline stage: re-transcribe those audio regions using diarization-derived boundaries as chunking points, giving Whisper clean context without the rotted pre-context that causes hallucination cascading. This is out of scope for v7 but the data model supports it. 152 - 153 - ## Validation 154 - 155 - Run v7 on all 7 streams and compare output against the manually verified ground truth from the April 12 audit. Success criteria: 156 - - All `high` confidence results match ground truth 157 - - No cross-track assignment errors 158 - - All hallucination zones correctly detected 159 - - Unmatched segments/schedule entries are legitimate (talks not recorded, etc.)
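For reference, the Stage 1 merge and gap classification could look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, and treating exactly 30s and exactly 60s as "likely talk boundary" is a choice made here, since the spec leaves those edge values open.

```typescript
// Matches the segment shape in diarization.json ({start, end, speaker}).
interface DiarizationSegment {
  start: number;
  end: number;
  speaker: string;
}

type GapKind = "session-break" | "likely-talk-boundary" | "within-talk-pause";

// Merge adjacent same-speaker segments separated by less than maxGapS seconds.
function mergeSpeechBlocks(
  segments: DiarizationSegment[],
  maxGapS: number = 5,
): DiarizationSegment[] {
  const sorted = [...segments].sort((a, b) => a.start - b.start);
  const blocks: DiarizationSegment[] = [];
  for (const seg of sorted) {
    const last = blocks[blocks.length - 1];
    if (last && last.speaker === seg.speaker && seg.start - last.end < maxGapS) {
      // Extend the current block instead of starting a new one.
      last.end = Math.max(last.end, seg.end);
    } else {
      blocks.push({ ...seg });
    }
  }
  return blocks;
}

// Classify the silence between two speech blocks using the spec's thresholds.
function classifyGap(gapS: number): GapKind {
  if (gapS > 60) return "session-break";
  if (gapS >= 30) return "likely-talk-boundary";
  return "within-talk-pause";
}
```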
-138
docs/superpowers/specs/2026-04-12-conference-discussion-design.md
··· 1 - # Conference Discussion Page 2 - 3 - A curated, high-density overview of what people said about ATmosphereConf 2026 — top posts, recaps & blog posts, follow-up videos, and VOD sites. Displayed in tight responsive columns matching the concordance index style. 4 - 5 - ## Layout 6 - 7 - Multi-column greedy-fill layout (same as `IndexContent.tsx`). Content flows across columns naturally. Sections act as dividers within the flow. 8 - 9 - **Left nav**: Section shortcuts (T/R/V) for quick jumping, like the letter nav on the concordance. 10 - 11 - **Filter bar**: At the top, filter pills to show all or just one medium: 12 - - All (default) 13 - - Top Posts 14 - - Recaps & Blog Posts 15 - - Videos & VOD Sites 16 - 17 - Plus a text filter input for searching within visible items. 18 - 19 - **Right panel**: Click any post that has an associated talk → opens the talk video + transcript in a slide-out panel (same pattern as concordance click-to-play). 20 - 21 - ## Sections 22 - 23 - ### Top Posts 24 - 25 - All conference mentions sorted by likes (descending). Each item: 26 - - 14px avatar + handle + like count (inline) 27 - - Post text (1-2 lines, truncated) 28 - - Talk link → (if matched to a talk) + "View on Bluesky ↗" 29 - 30 - ### Recaps & Blog Posts 31 - 32 - Posts containing links to blog/article domains, identified by facet URIs. Each item: 33 - - Avatar + handle + like count 34 - - Post text or OG title (prefer OG title when available) 35 - - Domain link ↗ (green accent) 36 - - Talk link → (if matched by speaker mentions) 37 - 38 - OG metadata (title, description) fetched at index time and stored. No images — just title + domain to keep it tight. 39 - 40 - ### Videos & VOD Sites 41 - 42 - Posts linking to video platforms. Each item: 43 - - Avatar + handle + like count 44 - - Post text 45 - - Video link ↗ (purple accent) 46 - - Talk link → (if matched) 47 - 48 - Plus a compact pill directory of all VOD JAM sites as clickable external links. 
49 - 50 - ### Stats Card 51 - 52 - Aggregate numbers: total posts, blog recaps, VOD sites, unique people. 53 - 54 - ## Data 55 - 56 - ### Wider search (fetch script extension) 57 - 58 - Extend `fetch-mentions.mjs` with a new phase that searches for: 59 - 60 - **Blog/recap posts:** 61 - - `q: "atmosphereconf recap"`, `q: "atmosphereconf wrote"`, `q: "atmosphereconf takeaway"`, `q: "atmosphereconf writeup"` 62 - - `q: "atmosphere"` with `author:` for known community writers 63 - 64 - **VOD/video posts:** 65 - - `domain:` searches for each known VOD site: 66 - - stream.place, vods.sky.boo, vod.atverkackt.de, ionosphere.tv, atmosphereconf-vods.wisp.place, rpg.actor, vod.j4ck.xyz, atmosphere-vods.j4ck.xyz, atmosphereconf-tv.btao.org, stream-bsky.pages.dev, sites.wisp.place, vods.ajbird.net, streamhut.wisp.place, conf-vods.wisp.place, aetheros.computer, atmo.rsvp, atmosphereconf.org, youtube.com (with atmosphere keywords) 67 - 68 - **ionosphere.tv links:** 69 - - `domain: ionosphere.tv` — already done, can extract talk rkey from URL 70 - 71 - ### New fields in mentions table 72 - 73 - Add `content_type` column: `post` | `blog` | `video` | `vod_site` 74 - 75 - Add `external_url` column: the primary external link from facets (blog URL, VOD URL). 76 - 77 - Add `og_title` column: OG metadata title fetched from external URL (nullable). 78 - 79 - ### Talk matching 80 - 81 - 1. **Direct URL match**: If post links to `ionosphere.tv/talks/RKEY`, match directly 82 - 2. **Speaker mention match**: If post @-mentions a speaker, match to their talks (prefer talks during the post's time window) 83 - 3. **Keyword match**: If post text contains a talk title (fuzzy), match to that talk 84 - 85 - Store matched `talk_uri` on the mention row. 86 - 87 - ### OG metadata fetching 88 - 89 - For blog/recap posts with external URLs, fetch the page and extract `<meta property="og:title">` and `<meta property="og:description">`. Store in `og_title` column. 
Skip if fetch fails — text from the Bluesky post is the fallback. 90 - 91 - ## API 92 - 93 - ### `tv.ionosphere.getDiscussion` 94 - 95 - Returns all discussion items grouped by content_type, sorted by likes within each group. 96 - 97 - ``` 98 - GET /xrpc/tv.ionosphere.getDiscussion 99 - ``` 100 - 101 - Response: 102 - ```json 103 - { 104 - "posts": [...], // content_type = 'post', sorted by likes desc 105 - "blogs": [...], // content_type = 'blog' 106 - "videos": [...], // content_type = 'video' or 'vod_site' 107 - "vodSites": [...], // unique VOD site domains as strings 108 - "stats": { "totalPosts": N, "blogCount": N, "vodSiteCount": N, "uniqueAuthors": N } 109 - } 110 - ``` 111 - 112 - Each item includes: uri, author_handle, author_display_name, author_avatar_url, text, likes, reposts, external_url, og_title, talk_rkey, talk_title, content_type. 113 - 114 - ## Frontend 115 - 116 - ### Route: `/discussion` 117 - 118 - New Next.js page at `apps/ionosphere/src/app/discussion/page.tsx`. 119 - 120 - ### Component: `DiscussionContent.tsx` 121 - 122 - Based on the concordance `IndexContent.tsx` pattern: 123 - - Greedy column-fill with section headers as flow items 124 - - Filter bar (medium pills + text search) 125 - - Section nav sidebar 126 - - Click-to-play right panel for talk associations 127 - - Mobile: single column with progressive rendering 128 - 129 - ### Nav update 130 - 131 - Add "Discussion" link to the site nav in `layout.tsx`. 132 - 133 - ## Not in scope 134 - 135 - - Real-time updates 136 - - Editing/curating items manually 137 - - Full OG card with images (just title + domain) 138 - - Comment/reply threading on the discussion page (that's on the talk page)
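The direct URL match from the talk matching section can be sketched with the standard `URL` parser. The `extractTalkRkey` name is hypothetical, and the sketch assumes talk pages live at exactly `ionosphere.tv/talks/RKEY` as the spec describes; other hosts or path shapes return `null`.

```typescript
// Extract the talk rkey from an ionosphere.tv talk link, or null if the
// URL is malformed or does not point at a talk page.
function extractTalkRkey(externalUrl: string): string | null {
  let url: URL;
  try {
    url = new URL(externalUrl);
  } catch {
    return null; // not a parseable absolute URL
  }
  if (url.hostname !== "ionosphere.tv") {
    return null;
  }
  // Accept /talks/<rkey> with an optional trailing slash, nothing deeper.
  const match = /^\/talks\/([^/]+)\/?$/.exec(url.pathname);
  return match ? match[1] : null;
}
```

Posts that fail this check would fall through to the speaker mention and keyword matchers.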
-160
docs/superpowers/specs/2026-04-12-conference-mentions-design.md
- # Conference Mentions Integration
-
- Surface Bluesky mentions of speakers during (and after) their talks, time-aligned with the transcript in the ionosphere.tv UI.
-
- ## Data Model
-
- ### `mentions` table (SQLite)
-
- ```sql
- CREATE TABLE mentions (
-   uri TEXT PRIMARY KEY, -- at:// URI of the Bluesky post
-   talk_uri TEXT, -- talk this aligns to (null for unaligned buzz)
-   author_did TEXT NOT NULL,
-   author_handle TEXT,
-   text TEXT,
-   created_at TEXT NOT NULL,
-   talk_offset_ms INTEGER, -- ms into the talk when posted
-   byte_position INTEGER, -- transcript byte position (from offset)
-   likes INTEGER DEFAULT 0,
-   reposts INTEGER DEFAULT 0,
-   replies INTEGER DEFAULT 0,
-   parent_uri TEXT, -- non-null for thread replies
-   mention_type TEXT DEFAULT 'during_talk', -- 'during_talk' | 'post_conference'
-   indexed_at TEXT NOT NULL
- );
-
- CREATE INDEX idx_mentions_talk ON mentions(talk_uri, talk_offset_ms);
- CREATE INDEX idx_mentions_parent ON mentions(parent_uri);
- ```
-
- Thread replies share the parent's `talk_uri` and `byte_position`.
-
- Author profiles reuse the existing `profiles` table (already caches handle, display_name, avatar_url from the Bluesky public API).
-
- ## Fetch Script: `scripts/fetch-mentions.mjs`
-
- Enhanced version of the exploration scripts already built. Runs as a batch job, not a live service.
-
- ### During-talk mentions
-
- For each talk with a schedule (`starts_at`, `ends_at`):
- 1. Search `app.bsky.feed.searchPosts` with `mentions=<speaker_handle>`, `since=starts_at - 5min`, `until=ends_at + 30min`
- 2. Paginate with cursors until exhausted (current scripts cap at 100)
- 3. Compute `talk_offset_ms = mention.createdAt - talk.starts_at`
- 4. Map offset to `byte_position` using transcript word-level timings
- 5. For each mention with replies, fetch thread via `app.bsky.feed.getPostThread` (depth 1-2)
- 6. Upsert into `mentions` table
-
- ### Post-conference mentions
-
- Wider searches with no `until` bound:
- - `domain=ionosphere.tv` — posts linking to talk pages
- - `domain=stream.place` — posts linking to VODs
- - `mentions=<speaker_handle>` + `q=atmosphere OR atmosphereconf` with `since=2026-03-30`
-
- These get `mention_type='post_conference'` and align to a talk by matching the speaker.
-
- ### Byte position mapping
-
- The transcript stores word-level timings as a compact array (positive = word duration ms, negative = silence gap ms). To map a `talk_offset_ms` to a byte position:
-
- 1. Walk the timings array, accumulating elapsed time
- 2. When elapsed >= talk_offset_ms, return the current byte offset
- 3. If the mention falls outside transcript range, use the nearest boundary
-
- This is done at fetch time and stored, not computed on every request.
-
- ## API Endpoint
-
- ### `tv.ionosphere.getMentions`
-
- ```
- GET /xrpc/tv.ionosphere.getMentions?talkRkey=<rkey>
- ```
-
- Response:
- ```json
- {
-   "mentions": [
-     {
-       "uri": "at://did:plc:.../app.bsky.feed.post/...",
-       "author_did": "did:plc:...",
-       "author_handle": "faineg.bsky.social",
-       "author_display_name": "Faine G",
-       "author_avatar_url": "https://...",
-       "text": "as @kissane notes...",
-       "created_at": "2026-03-28T21:32:15.000Z",
-       "talk_offset_ms": 872000,
-       "byte_position": 4521,
-       "likes": 137,
-       "reposts": 12,
-       "replies": 3,
-       "parent_uri": null,
-       "mention_type": "during_talk",
-       "thread": [
-         {
-           "uri": "at://...",
-           "author_handle": "...",
-           "author_display_name": "...",
-           "author_avatar_url": "...",
-           "text": "reply text...",
-           "created_at": "...",
-           "likes": 5
-         }
-       ]
-     }
-   ],
-   "total": 51
- }
- ```
-
- Backend query joins `mentions` with `profiles` for author enrichment. Thread replies are nested under their parent. Sorted by `talk_offset_ms` (during-talk first, post-conference after).
-
- ## Frontend
-
- ### Right sidebar tabs
-
- Add a "Mentions" tab alongside existing "Concepts" tab in `TalkContent.tsx`:
-
- ```
- [Concepts] [Mentions (51)]
- ```
-
- Tab count comes from the API response `total`.
-
- ### `MentionsSidebar` component
-
- Renders mention cards in a scrollable column with pretext spacers for vertical alignment with the transcript.
-
- **Scroll sync:** Listens to `TimestampProvider` context. Uses the same scroll-position logic as `TranscriptView` — maps current playback nanoseconds to a byte position, then scrolls to keep the matching mention near the viewport center.
-
- **Pretext spacers:** Each mention card is positioned using top-padding calculated from its `byte_position` relative to the previous mention's position. When a thread is expanded/collapsed, spacers below are recalculated to maintain alignment.
-
- **Mention card contents:**
- - Author avatar (18px circle) + handle + like count
- - Post text (truncated to ~120 chars, expandable)
- - "↳ N replies" link for threads
- - Click anywhere on card → seek video to `talk_offset_ms`
-
- **Thread expansion:**
- - Clicking "↳ N replies" expands replies inline below the parent card
- - Reply cards are indented and slightly smaller
- - Spacers below recalculate on expand/collapse
- - Each reply is also clickable to open the full post on Bluesky (external link)
-
- **Post-conference section:**
- - After all during-talk mentions, a divider: "After the conference"
- - Post-conference mentions listed chronologically, no time alignment
- - These don't scroll-sync with playback
-
- ### Mobile
-
- Right sidebar is hidden on mobile (existing behavior). Mentions accessible via a tab/accordion below the transcript, same as concepts.
-
- ## Not in scope
-
- - Real-time mention streaming or webhooks
- - Composing/replying to mentions from within ionosphere
- - Full-text search within mentions
- - Mentions of non-speaker topics (conference hashtags without speaker tags)
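For readers of the removed mentions spec, the "Byte position mapping" steps can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `Transcript` shape, the parallel `words` array, and the assumption that words are joined by single spaces in a UTF-8 transcript are all hypothetical, since the spec only describes the signed timings array.

```typescript
// Hypothetical transcript shape. The spec only states that `timings` is a
// compact array where positive = word duration (ms) and negative = silence
// gap (ms); the `words` array and single-space joining are assumptions.
interface Transcript {
  words: string[];
  timings: number[];
}

const encoder = new TextEncoder();

// Map a talk offset in milliseconds to a byte offset into the transcript text.
function offsetToBytePosition(t: Transcript, talkOffsetMs: number): number {
  // Mentions posted before the talk starts clamp to the first byte.
  if (talkOffsetMs <= 0) return 0;

  let elapsedMs = 0;
  let byteOffset = 0; // start byte of the current (or upcoming) word
  let wordIndex = 0;

  for (const entry of t.timings) {
    if (entry < 0) {
      // Silence gap: time passes, the byte offset does not move.
      elapsedMs += -entry;
      if (elapsedMs >= talkOffsetMs) return byteOffset;
      continue;
    }
    // Word: if the target falls inside it, return the word's start byte.
    elapsedMs += entry;
    if (elapsedMs >= talkOffsetMs) return byteOffset;
    // Otherwise step past the word and its trailing space.
    byteOffset += encoder.encode(t.words[wordIndex] ?? "").length + 1;
    wordIndex++;
  }

  // Past the end of the transcript: clamp to the last byte boundary
  // (drop the separator counted after the final word).
  return Math.max(byteOffset - 1, 0);
}
```

Computing this once at fetch time, as the spec suggests, keeps the walk over the timings array out of the request path; the stored `byte_position` is then a plain column read.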
-97
docs/superpowers/specs/2026-04-12-retranscribe-hallucinations-design.md
- # Re-transcribe Hallucination Zones
-
- **Date**: 2026-04-12
- **Status**: Approved
- **File**: `apps/ionosphere-appview/src/retranscribe-hallucinations.ts`
-
- ## Problem
-
- Whisper hallucinates during silence/break periods in full-day conference streams, producing repeating phrases ("Transcription by CastingWords", "0 0 0 0", Welsh text, song lyrics, etc.). These hallucination zones cover ~20% of stream content and obscure real talk boundaries and content. The root cause: fixed 20-minute chunking means Whisper's context window fills with garbage from previous silence, causing cascading hallucination.
-
- ## Approach
-
- Re-transcribe only the hallucination zones using diarization-aligned chunk boundaries. The diarization data shows exactly when real speech starts and stops, so we can give Whisper chunks that begin with real speech — clean context from the first word.
-
- Uses existing OpenAI Whisper API (same as original transcription). Splices results back into `transcript-enriched.json`, replacing the hallucinated words.
-
- ## Pipeline
-
- ```
- v7 boundary JSON (hallucinationZones) + HLS stream + diarization
-
- 1. For each hallucination zone with diarization speech:
-    a. Extract audio from HLS via ffmpeg (zone.startS → zone.endS)
-    b. Split at diarization gaps if > 25 min (Whisper's limit)
-    c. Transcribe via OpenAI Whisper API (word timestamps)
-
- 2. Load transcript-enriched.json
-
- 3. For each zone: remove hallucinated words, insert new words
-
- 4. Write updated transcript-enriched.json
- ```
-
- ## Chunking Strategy
-
- For each hallucination zone:
- 1. Load diarization segments overlapping the zone
- 2. Find first speech onset and last speech offset
- 3. If no speech in zone → skip (genuinely silent)
- 4. If speech < 25 min → one chunk
- 5. If speech > 25 min → split at diarization gaps > 5s
-
- Each chunk starts at a diarization speech onset, ensuring Whisper gets clean context.
-
- ## Audio Extraction
-
- Use ffmpeg to extract audio from the HLS VOD endpoint:
- ```bash
- ffmpeg -ss <startS> -t <durationS> -i "<playlist_url>" -vn -ac 1 -ar 16000 -f mp3 <output.mp3>
- ```
-
- Stream URIs come from the `STREAMS` config (same as existing transcription pipeline). Playlist URL: `https://vod-beta.stream.place/xrpc/place.stream.playback.getVideoPlaylist?uri=<uri>`.
-
- ## Transcript Splicing
-
- After re-transcription of a zone:
- 1. Filter out existing words where `startS >= zone.startS && endS <= zone.endS`
- 2. Insert new words (with timestamps relative to zone start, adjusted to absolute stream time)
- 3. Re-sort word array by start time
- 4. Recalculate `total_words`
-
- ## CLI
-
- ```bash
- npx tsx src/retranscribe-hallucinations.ts \
-   --stream-slug room-2301-day-2 \
-   --boundaries data/fullday/Room_2301___Day_2/transcript-enriched-boundaries-v7.json \
-   --diarization data/fullday/Room_2301___Day_2/diarization.json
- ```
-
- Requires: `OPENAI_API_KEY` in environment (from `.env`).
-
- Reads stream URI from `STREAMS` config. Writes updated `transcript-enriched.json` in the same fullday directory.
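The four "Transcript Splicing" steps in this removed spec could look roughly like the sketch below. The `Word` and `Zone` shapes are hypothetical (the spec names only `startS`, `endS`, and `total_words`), and the sketch assumes, as the spec states, that the new words carry timestamps relative to the zone start.

```typescript
// Hypothetical shapes: only `startS` / `endS` appear in the spec.
interface Word {
  word: string;
  startS: number;
  endS: number;
}

interface Zone {
  startS: number;
  endS: number;
}

// Replace the hallucinated words inside `zone` with freshly transcribed ones.
function spliceZone(
  words: Word[],
  zone: Zone,
  fresh: Word[],
): { words: Word[]; total_words: number } {
  // 1. Drop existing words that lie entirely inside the zone.
  const kept = words.filter(
    (w) => !(w.startS >= zone.startS && w.endS <= zone.endS),
  );

  // 2. Shift zone-relative timestamps to absolute stream time.
  const shifted = fresh.map((w) => ({
    ...w,
    startS: w.startS + zone.startS,
    endS: w.endS + zone.startS,
  }));

  // 3. Re-sort by start time.
  const merged = [...kept, ...shifted].sort((a, b) => a.startS - b.startS);

  // 4. Recalculate the word count.
  return { words: merged, total_words: merged.length };
}
```

Note that the filter only removes words fully contained in the zone, so a word straddling a zone boundary is kept; whether that is the desired behavior is a design choice the spec leaves open.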
-
- ## Scope
-
- **Does:**
- - Extract audio for hallucination zones from HLS
- - Re-transcribe with diarization-aligned chunks
- - Splice new words into existing transcript
-
- **Does not:**
- - Re-run diarization (existing is good)
- - Re-run v7 boundary detection (separate step)
- - Publish to PDS (separate step)
-
- ## Expected Impact
-
- Hallucination zones covering actual talks (where diarization shows real speech):
- - R2301 D2: 96-209m (Content Mod Futures, start of Blacksky)
- - PT D2: 117-210m (Community Privacy, Cooperate & Succeed)
- - PT D1: 124-200m (end of morning session)
- - GH D2: 180-267m (lunch period — mostly DJ music, limited real speech)
- - ATScience: 264-280m (Welsh hallucination over Astrosky start)
- - Various short zones (< 5 min)
-
- Re-transcription should recover talk content currently lost, improving v7 match accuracy from 90% toward 95%+.
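The "Chunking Strategy" section of this removed spec can be illustrated with a small planner. This is a sketch under stated assumptions: the `Segment`, `Zone`, and `Chunk` shapes are hypothetical, and the greedy rule (split at the first gap longer than 5 s once a chunk would exceed 25 minutes) is one possible reading, since the spec does not say which gaps to prefer.

```typescript
// Hypothetical shapes for diarization segments and hallucination zones.
interface Segment { startS: number; endS: number }
interface Zone { startS: number; endS: number }
interface Chunk { startS: number; endS: number }

const MAX_CHUNK_S = 25 * 60; // 25-minute chunk limit named in the spec
const MIN_GAP_S = 5;         // only split at diarization gaps longer than 5 s

function planChunks(zone: Zone, segments: Segment[]): Chunk[] {
  // 1. Keep segments overlapping the zone, clipped to its bounds.
  const speech = segments
    .filter((s) => s.endS > zone.startS && s.startS < zone.endS)
    .map((s) => ({
      startS: Math.max(s.startS, zone.startS),
      endS: Math.min(s.endS, zone.endS),
    }))
    .sort((a, b) => a.startS - b.startS);

  // 3. No speech in the zone: skip it (genuinely silent).
  const [first, ...rest] = speech;
  if (!first) return [];

  // 2/4/5. Start at the first speech onset and grow the chunk; split only
  // when it would exceed the limit AND a long enough gap is available.
  const chunks: Chunk[] = [];
  let current: Chunk = { ...first };
  for (const seg of rest) {
    const gapS = seg.startS - current.endS;
    const wouldExceed = seg.endS - current.startS > MAX_CHUNK_S;
    if (wouldExceed && gapS > MIN_GAP_S) {
      chunks.push(current);
      current = { ...seg }; // new chunk begins at a speech onset
    } else {
      // May exceed the limit if no qualifying gap exists (not handled here).
      current.endS = Math.max(current.endS, seg.endS);
    }
  }
  chunks.push(current);
  return chunks;
}
```

Each returned chunk begins at a diarization speech onset, which is the property the spec relies on to give Whisper clean context from the first word.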