personal memory agent


Iterate Sense instruction: fix density, facets, entity extraction, meeting detection

v2 improvements:
- Density now 100% (was 82.1%) — active definition relaxed for screen content
- Facets doubled to 64.3% (was 32.1%) — added technical-work, tighter definitions
- Meeting detection up to 96.4% (was 85.7%) — require identified speakers
- Entity recall improved to 0.444 (was 0.368) — still below 0.90 AC target
- Bumped max_output_tokens to 4096 for thorough entity extraction

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
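The percentages in the notes above come from comparing model output against hand-labeled gold segments. A minimal sketch of how such scoring might work; the function and field names here are illustrative assumptions, not code from this repo:

```python
# Hypothetical scoring sketch: compares predicted segment labels against
# hand-labeled gold segments, paired in order. Field names are illustrative.

def score_segments(predicted: list[dict], gold: list[dict]) -> dict:
    """Return accuracy/recall metrics over paired (predicted, gold) segments."""
    density_hits = sum(
        p["density"] == g["density"] for p, g in zip(predicted, gold)
    )
    meeting_hits = sum(
        p["meeting_detected"] == g["meeting_detected"]
        for p, g in zip(predicted, gold)
    )
    # Entity recall: fraction of gold entity names the model recovered,
    # pooled across all segments (case-insensitive exact match).
    gold_entities = {e.lower() for g in gold for e in g.get("entities", [])}
    pred_entities = {e.lower() for p in predicted for e in p.get("entities", [])}
    recall = (
        len(gold_entities & pred_entities) / len(gold_entities)
        if gold_entities
        else 1.0
    )
    n = len(gold)
    return {
        "density_acc": density_hits / n,
        "meeting_acc": meeting_hits / n,
        "entity_recall": recall,
    }
```

Under this framing, "entity recall improved to 0.444" means fewer than half of the gold entity names were recovered, which is why the gap to the 0.90 acceptance target is called out.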

+28 -20
+24 -17
muse/sense.md
···
 8  8     "priority": 10,
 9  9     "tier": 3,
10 10     "thinking_budget": 4096,
11    -   "max_output_tokens": 3072,
   11 +   "max_output_tokens": 4096,
12 12     "output": "json",
13 13     "instructions": {
14 14       "sources": {"transcripts": true, "percepts": true, "agents": false},
···
53 53  ## Field-by-Field Instructions
54 54
55 55  ### density
56    - Classify based on content volume:
57    - - **active**: Meaningful transcript content (>10 lines or >100 words) OR meaningful screen changes (>5 distinct frames with different visual descriptions)
58    - - **low_change**: Some content but minimal change — fewer than 10 transcript lines AND fewer than 5 distinct screen states. Something is happening but it's repetitive or minimal.
59    - - **idle**: Near-zero content — fewer than 3 transcript lines AND fewer than 3 distinct screen frames. Static screen, silence, or system noise only.
   56 + Classify based on whether meaningful human activity occurred:
   57 + - **active**: ANY of these: transcript has >5 lines or >50 words, screen shows the user interacting with content (browsing pages, typing, reading articles, using applications, scrolling), or screen descriptions mention different pages/views/applications. **Default to active if there is any user-directed activity, even if the screen looks similar across frames.** Web browsing and document reading ARE active.
   58 + - **low_change**: Minimal new content AND no user interaction — same static screen unchanged across all frames, fewer than 5 transcript words, no scrolling or navigation evident.
   59 + - **idle**: Near-zero content — fewer than 3 transcript lines AND fewer than 3 distinct screen frames. Static screen with no user activity, silence, or system noise only.
60 60
61 61  ### content_type
62 62  The dominant activity type observed:
···
73 73  Describe what $preferred did during this segment using action verbs. Be specific — name the tools, people, projects, and actions. Ban passive words: never use "reviewing", "monitoring", "tracking", "checking", "observing", "maintaining", "managing." Use instead: wrote, sent, discussed, created, switched to, typed, said, decided, asked, proposed.
74 74
75 75  ### entities
76    - Extract named entities. Four types only:
77    - - **Person**: Individual people by name. Prefer full names. Consolidate variants ("JB" + "John Borthwick" → one entity "John Borthwick"). Skip ambiguous first-name-only references.
78    - - **Company**: Businesses and organizations.
79    - - **Project**: Named projects, products, or codebases.
80    - - **Tool**: Software applications and services.
   76 + Extract ALL named entities mentioned in the content. Be thorough — extract every entity you can identify, not just the most prominent ones. Four types only:
   77 + - **Person**: Individual people by name. Prefer full names. Consolidate variants ("JB" + "John Borthwick" → one entity "John Borthwick"). Skip ambiguous first-name-only references. Include historical figures, authors, scientists, politicians — anyone mentioned by name.
   78 + - **Company**: Businesses and organizations. Include companies, government agencies (NASA, NOAA), universities, media outlets.
   79 + - **Project**: Named projects, products, or codebases. Include missions (OSIRIS-REx), initiatives, specific product models.
   80 + - **Tool**: Software applications and services. Include websites (Fox News, Wikipedia, Amazon), browser extensions, developer tools, hardware products mentioned by name.
   81 +
   82 + **For screen content specifically:** Extract entities from visible text in screen descriptions — article headlines, page titles, product names, people mentioned in articles, organizations referenced. If the user is browsing a website about the Renaissance, extract the specific historical figures, art movements, and institutions mentioned.
81 83
82 84  Skip URLs, domains, filenames, paths. Each entity needs type, name, and context (brief description of the entity's role in this segment).
83 85
84 86  ### facets
85    - Classify into the owner's configured facets. Only include facets with clear evidence of activity. For each:
86    - - `facet`: The facet ID slug
   87 + Classify into the owner's configured facets. Only include facets with clear, direct evidence of activity. Be precise — assign exactly ONE primary facet in most cases. Only add a second facet if there is genuinely distinct secondary activity. For each:
   88 + - `facet`: The facet ID slug — MUST be one of the configured facets listed in the input
87 89  - `activity`: 1-sentence description of what was observed for this facet
88 90  - `level`: "high" (primary focus), "medium" (significant), "low" (brief/peripheral)
   91 +
   92 + **Facet assignment rules:**
   93 + - "meetings" — ONLY for live synchronous group meetings where $preferred is a participant. NOT for listening to lectures, podcasts, press conferences, or recorded media.
   94 + - "learning" — Educational content: lectures, tutorials, articles, research, Wikipedia, podcasts. Includes listening to press conferences or recorded briefings.
   95 + - "technical-work" — Hands-on technical tasks: coding, using developer tools, configuring software, installing extensions, product research for work purposes, technical document writing, online tool usage, price comparison shopping.
89 96
90 97  ### meeting_detected
91    - `true` if any of these conditions are met:
92    - - Screen shows a video conferencing app (Zoom, Meet, Teams, Webex) with participant panels
93    - - Audio shows multiple speakers with conversational turn-taking
94    - - Meeting-style patterns: greetings, introductions, agenda items, discussion, decisions
   98 + `true` ONLY if you can identify distinct, named participants in a live multi-person interaction:
   99 + - Screen shows a video conferencing app with participant panels showing names
  100 + - Audio has multiple distinct speakers who can be identified by name (from introductions, direct address, or context)
  101 + - The interaction is live/synchronous — NOT a recording, podcast, lecture, news conference, or media playback
95 102
96    - `false` otherwise. Podcasts, streaming content, and recorded media do NOT count.
  103 + `false` for: podcasts, press conferences, recorded interviews, solo narration, streaming content, lectures, or any audio where the speakers are media personalities rather than meeting participants. Even if multiple people are speaking, if they are NOT in a meeting with $preferred, set this to `false`.
97 104
98 105  ### speakers
99 106  If `meeting_detected` is true, extract participant names from:
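The tightened rules in this diff lend themselves to mechanical checking before scoring. A small validator sketch, assuming the JSON field names shown in the instructions; the helper itself is hypothetical, not part of the repo:

```python
# Hypothetical post-hoc validator for one Sense output record. Enforces the
# tightened v2 rules: facet slugs must be configured, entity types are
# restricted to four, and speakers appear only when a meeting was detected.

CONFIGURED_FACETS = {"meetings", "learning", "technical-work"}
ENTITY_TYPES = {"Person", "Company", "Project", "Tool"}

def validate_record(record: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    for facet in record.get("facets", []):
        if facet["facet"] not in CONFIGURED_FACETS:
            errors.append(f"unconfigured facet: {facet['facet']}")
        if facet["level"] not in {"high", "medium", "low"}:
            errors.append(f"bad facet level: {facet['level']}")
    for entity in record.get("entities", []):
        if entity["type"] not in ENTITY_TYPES:
            errors.append(f"bad entity type: {entity['type']}")
    # speakers are only meaningful for a detected live meeting
    if record.get("speakers") and not record.get("meeting_detected"):
        errors.append("speakers present but meeting_detected is false")
    return errors
```

Running a check like this over harness output separates schema violations (fixable with instruction wording) from genuine classification misses before the accuracy numbers are computed.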
+4 -3
scratch/sense_rd/harness.py
···
 38  38  DEFAULT_OUTPUT = Path(__file__).resolve().parent / "results"
 39  39  MANIFEST_PATH = Path("/home/jer/projects/field_journal/manifest.json")
 40  40  SENSE_MD_PATH = Path("/home/jer/projects/solstone/muse/sense.md")
 41     - CONFIGURED_FACETS = ["meetings", "learning"]
     41 + CONFIGURED_FACETS = ["meetings", "learning", "technical-work"]
 42  42
 43  43
 44  44  # ---------------------------------------------------------------------------
···
251 251
252 252      parts.append(
253 253          f"\n## Configured Facets\n\n"
254     -          f"- meetings\n"
255     -          f"- learning"
    254 +          f"- meetings — live synchronous group meetings\n"
    255 +          f"- learning — educational content, research, lectures, reading\n"
    256 +          f"- technical-work — coding, developer tools, technical tasks, product research"
256 257      )
257 258
258 259      return "\n".join(parts)
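The harness change above can be exercised standalone. A reconstruction of the configured-facets prompt fragment the updated code emits; the helper function and description table are hypothetical, with the descriptions copied from the hunk:

```python
# Standalone reconstruction of the "## Configured Facets" prompt fragment
# built in scratch/sense_rd/harness.py after this change. The facet
# descriptions are copied from the diff hunk above; the function shape
# is an assumption for illustration.

CONFIGURED_FACETS = ["meetings", "learning", "technical-work"]

FACET_DESCRIPTIONS = {
    "meetings": "live synchronous group meetings",
    "learning": "educational content, research, lectures, reading",
    "technical-work": "coding, developer tools, technical tasks, product research",
}

def configured_facets_block(facets: list[str]) -> str:
    """Render the facet list as the markdown block appended to the prompt."""
    lines = [f"- {f} — {FACET_DESCRIPTIONS[f]}" for f in facets]
    return "\n## Configured Facets\n\n" + "\n".join(lines)
```

Giving each slug a one-line description in the prompt, rather than a bare name, is what lets the model apply the facet assignment rules from `sense.md` without guessing what a slug means.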