Add speakers segment insight for meeting participant detection

New segment-level insight that analyzes audio + screen context to:
- Detect active video meetings (Zoom, Teams, Meet, etc.)
- Extract participant names from visible UI and conversation
- Handle audio-only imports via conversation pattern detection
- Support segments spanning two distinct meetings

Output structured for occurrence extraction with participant lists.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Jer Miller 4 months ago f22f8177 11fa9c01

+85

2 changed files



think/insights/speakers.json

+ {
+   "title": "Meeting Speakers",
+   "description": "Detects video meetings in the segment and extracts participant names from screen and conversation.",
+   "frequency": "segment",
+   "occurrences": "For each meeting detected, extract start/end times and the list of participant names. If segment spans two meetings, create separate occurrences.",
+   "contains": "Meeting detection with time bounds and participant name lists extracted from visible participant panels and conversation.",
+   "color": "#ff5722"
+ }
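As a rough illustration of how a config like this could be sanity-checked before use, here is a minimal Python sketch. The function name `check_insight_config`, the idea that all six keys are required, and the hex-color check are assumptions made for this example; the commit does not show how the project actually loads insight files.

```python
import json
import re

# Keys observed in speakers.json in this commit. Treating all of them as
# required is an assumption made for this sketch.
EXPECTED_KEYS = {"title", "description", "frequency", "occurrences", "contains", "color"}


def check_insight_config(text: str) -> dict:
    """Parse an insight config and verify the keys seen in speakers.json (hypothetical helper)."""
    config = json.loads(text)
    missing = EXPECTED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    # Assumption: "color" is a 6-digit hex color such as "#ff5722".
    if not re.fullmatch(r"#[0-9a-fA-F]{6}", config["color"]):
        raise ValueError(f"unexpected color value: {config['color']!r}")
    return config


SAMPLE = """
{
  "title": "Meeting Speakers",
  "description": "Detects video meetings in the segment and extracts participant names from screen and conversation.",
  "frequency": "segment",
  "occurrences": "For each meeting detected, extract start/end times and the list of participant names. If segment spans two meetings, create separate occurrences.",
  "contains": "Meeting detection with time bounds and participant name lists extracted from visible participant panels and conversation.",
  "color": "#ff5722"
}
"""

if __name__ == "__main__":
    config = check_insight_config(SAMPLE)
    print(config["title"], config["frequency"])
```

A check like this fails fast on a typo in a key name, which is otherwise easy to miss in a hand-written JSON file.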

+77

think/insights/speakers.txt

+ # Meeting Speaker Detection
+
+ ## Objective
+
+ Identify meetings in this segment and extract participant names. This output feeds into occurrence extraction to populate meeting participant lists.
+
+ ## Input Types
+
+ ### With Screen Activity (Live Capture)
+
+ A meeting is detected when screen activity shows a video conferencing application:
+ - Zoom, Google Meet, Microsoft Teams, Webex, or similar
+ - Participant panels or video tiles visible
+ - Meeting controls or UI elements present
+
+ Extract participant names from:
+ 1. **Visible participant list/panel** on screen (highest confidence)
+ 2. **Names spoken in conversation** - direct address ("Thanks, Sarah"), mentions ("John was saying...")
+ 3. **Self-introductions** ("Hi, I'm Alex from...")
+
+ ### Audio Only (Import)
+
+ When no screen activity is present, a meeting is detected when audio shows:
+ - Multiple speakers with conversational turn-taking
+ - Meeting-style patterns: agenda items, discussion, decisions, action items
+ - Professional/work context evident from topics
+
+ Extract names from:
+ 1. **Names spoken in conversation** - direct address and mentions
+ 2. **Self-introductions**
+ 3. **Setting/topics metadata** if available from transcription
+
+ ## Boundary Detection
+
+ If the segment spans two distinct meetings (one ending, another starting):
+ - Look for farewell patterns followed by new greetings
+ - Different participant set appears
+ - Significant topic shift with new introductions
+
+ Report each meeting separately with its own time range and participants.
+
+ ## Output Format
+
+ Structure your output clearly for occurrence extraction:
+
+ ### Meeting Detection
+
+ State whether meeting(s) were detected and the detection basis:
+ - **Detected**: Yes / No
+ - **Source**: Screen (video call UI visible) / Audio-only (conversation patterns)
+
+ ### Meeting 1
+
+ - **Time**: HH:MM:SS - HH:MM:SS
+ - **Platform**: Zoom / Meet / Teams / etc. (if visible, omit for audio-only)
+ - **Participants**:
+   - Name (evidence: where/how detected)
+   - Name (evidence: where/how detected)
+
+ ### Meeting 2 (if segment contains two meetings)
+
+ Same structure as above.
+
+ ## What to Ignore
+
+ - Podcasts, videos, or streaming content being watched
+ - Music or background audio
+ - Single-speaker recordings without meeting context
+ - Phone calls (no video conferencing UI)
+
+ ## No Meeting Case
+
+ If no meeting is detected, output:
+
+ ### Meeting Detection
+ - **Detected**: No
+ - **Reason**: [Brief explanation - e.g., "No video call UI visible, single speaker only"]
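To make the "Output Format" section of speakers.txt concrete, the following Python sketch parses a report written in that shape into simple records. It is only an illustration: the `Meeting` dataclass, `parse_meetings`, and the sample report are hypothetical, and the commit does not show how occurrence extraction actually consumes the output (it may well be another model pass rather than a parser).

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Meeting:
    """One meeting block from the report (hypothetical structure for this sketch)."""
    start: str = ""
    end: str = ""
    platform: Optional[str] = None
    # Each participant is stored as (name, evidence).
    participants: List[Tuple[str, str]] = field(default_factory=list)


# Patterns follow the field labels used in the prompt's Output Format section.
HEADING = re.compile(r"^### Meeting \d+")  # does not match "### Meeting Detection"
TIME = re.compile(r"^- \*\*Time\*\*: (\d{2}:\d{2}:\d{2}) - (\d{2}:\d{2}:\d{2})")
PLATFORM = re.compile(r"^- \*\*Platform\*\*: (.+)")
# Any bullet that is not a bold field label and ends with "(evidence: ...)".
PARTICIPANT = re.compile(r"^\s*- (?!\*\*)(.+?) \(evidence: (.+)\)\s*$")


def parse_meetings(report: str) -> List[Meeting]:
    """Collect the meeting blocks found in a report; returns [] when none are present."""
    meetings: List[Meeting] = []
    current: Optional[Meeting] = None
    for line in report.splitlines():
        if HEADING.match(line):
            current = Meeting()
            meetings.append(current)
            continue
        if current is None:
            continue  # still inside the "Meeting Detection" preamble
        if m := TIME.match(line):
            current.start, current.end = m.groups()
        elif m := PLATFORM.match(line):
            current.platform = m.group(1).strip()
        elif m := PARTICIPANT.match(line):
            current.participants.append((m.group(1), m.group(2)))
    return meetings


# Hypothetical report text, invented for this example.
SAMPLE_REPORT = """\
### Meeting Detection
- **Detected**: Yes
- **Source**: Screen (video call UI visible)

### Meeting 1

- **Time**: 09:00:00 - 09:04:30
- **Platform**: Zoom
- **Participants**:
  - Sarah (evidence: visible participant panel)
  - Alex (evidence: self-introduction)
"""

if __name__ == "__main__":
    for meeting in parse_meetings(SAMPLE_REPORT):
        print(meeting)
```

Because the "No Meeting Case" output contains only the "Meeting Detection" heading, the same function returns an empty list for it without special handling.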
