personal memory agent

refactor(events): retire occurrence/timeline/flow/knowledge_graph producer talents and think/hooks.py

Sprint 4 Lode A for parent req_ouq77ho6.

Retire the occurrence/event producer side by deleting talent/timeline.md,
talent/flow.md, talent/knowledge_graph.md, talent/occurrence.md, and
talent/occurrence.py.

Scrub the occurrence-producing frontmatter from talent/meetings.md,
talent/decisions.md, talent/followups.md, and talent/messaging.md by
removing the occurrences metadata and hook.post=occurrence contract.
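
A minimal sketch of the frontmatter scrub described above, assuming a talent's frontmatter has already been parsed into a plain dict. The helper name `scrub_occurrence_frontmatter` is hypothetical and not part of this change; only the `occurrences` key and the `hook.post=occurrence` value come from the diff.

```python
def scrub_occurrence_frontmatter(meta: dict) -> dict:
    """Return a copy of the frontmatter without occurrence-producing keys.

    Hypothetical helper for illustration; the commit edits the files by hand.
    """
    # Drop the free-text extraction instructions.
    cleaned = {key: value for key, value in meta.items() if key != "occurrences"}

    # Drop hook.post only when it points at the retired occurrence hook.
    hook = cleaned.get("hook")
    if isinstance(hook, dict) and hook.get("post") == "occurrence":
        remaining = {key: value for key, value in hook.items() if key != "post"}
        if remaining:
            cleaned["hook"] = remaining  # keep unrelated hooks such as "pre"
        else:
            del cleaned["hook"]
    return cleaned
```

Under these assumptions a hook that also declares a pre-hook keeps it, which mirrors the commit's intent of removing only the occurrence contract.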

Create think/_extraction_utils.py with only log_extraction_failure,
move think/talents.py to import it from there, and delete think/hooks.py.
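
For orientation, a sketch of what a single-function `think/_extraction_utils.py` could look like. The call shape `log_extraction_failure(e, name)` is taken from the deleted `talent/occurrence.py`; the body, the log message, and the module-level `logger` are assumptions, since the real implementation is not shown in this diff.

```python
import logging

logger = logging.getLogger(__name__)


def log_extraction_failure(error: Exception, name: str) -> None:
    """Log a failed extraction for talent `name` without raising.

    Assumed behavior: record the exception type and message at ERROR level.
    """
    logger.error(
        "Extraction failed for %s: %s: %s", name, type(error).__name__, error
    )
```

With this shape, `think/talents.py` would only need `from think._extraction_utils import log_extraction_failure` in place of the old `think.hooks` import.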

Trim tests/test_output_hooks.py per triage:
- Kept the generic post-hook, pre-hook, and template-var coverage
(~lines 50-306 and ~594-end).
- Deleted the 3 occurrence post-process tests
(drop-26-participants / keep-25 / keep-non-meeting-large).
- Deleted the 5 write_events_jsonl tests because they depended on the
deleted talent/occurrence.py or think/hooks.write_events_jsonl.
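
The three deleted post-process tests covered the participant cap in the retired hook. A sketch of that rule, reconstructed from the deleted `talent/occurrence.py` (meeting occurrences with more than 25 participants are dropped); the function and constant names here are hypothetical.

```python
MAX_MEETING_PARTICIPANTS = 25  # threshold used by the deleted hook


def drop_megameetings(events: list[dict]) -> list[dict]:
    """Keep every event except meetings that exceed the participant cap."""
    kept = []
    for event in events:
        is_meeting = event.get("type") == "meeting"
        too_large = len(event.get("participants", [])) > MAX_MEETING_PARTICIPANTS
        if is_meeting and too_large:
            continue  # drop-26-participants case
        kept.append(event)  # keep-25 and keep-non-meeting-large cases
    return kept
```

The cap only applies to `type == "meeting"`, so a large non-meeting event passes through unchanged.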

Remove the stale think/hooks.py stealth-writes row from
docs/coding-standards.md.

Refresh the API baselines and 5 collateral tests to reflect the deleted
talents and the scrubbed frontmatter on the surviving activity talents.

talent/schedule.md and talent/schedule.py were verified untouched,
correcting the CPO spec error called out in section 3.

make ci now passes with 3249 passed / 1 skipped, 8 fewer passing tests than
the Sprint 3 baseline of 3257, matching the 8 tests deleted above.

Out-of-scope followups for the remaining lodes:
- Lode B: think/events.py rename and formatters registry cleanup.
- Lode C: read-side cleanup in apps/, stats, and briefing surfaces.
- Lode D: remaining docs/baselines scrub.

+92 -1224
-1
docs/coding-standards.md
···
 | Facets (`facets/*/facet.json`, `facets/*/relationships/`) | `think/facets.py` + `apps/facets/*` (if/when created) |
 | Observations (`observations.jsonl`) | `think/entities/observations.py` |
 | Activities (`facets/*/activities/*.jsonl`) | `think/activities.py` |
-| Facet events (`facets/*/events/*.jsonl`) | `think/hooks.py::write_events_jsonl`, called only via declared hook contract |
 | Chronicle day content (`chronicle/YYYYMMDD/**`) | The capturing module (observer, importer) per its declared outputs |
 | Index (SQLite, `indexer/*`) | `think/indexer/*` |
-2
talent/decisions.md
···
   "title": "Decision Actions",
   "description": "Tracks consequential decision-actions that change state, plans, resources, responsibilities, or timing in ways that affect other people.",
-  "occurrences": "Create an occurrence for every decision-action observed. Include the time span, decision type, actors involved, entities affected, and impact assessment. Each occurrence should capture both the intent and enactment of the decision.",
-  "hook": {"post": "occurrence"},
   "color": "#dc3545",
   "schedule": "activity",
   "activities": ["meeting", "call", "messaging", "email"],
-142
talent/flow.md
···
-{
-  "type": "generate",
-
-  "title": "Day Overview",
-  "description": "Summarizes the overall flow of the workday. Looks for patterns in focus, energy, context switching and highlights productivity insights in a Markdown report.",
-  "occurrences": "Create an occurrence for noteworthy shifts in work rhythms or focus. Include timestamps when deep work starts or ends, or when energy levels noticeably change. Classify each as work or personal based on the surrounding context.",
-  "hook": {"post": "occurrence"},
-  "color": "#17a2b8",
-  "schedule": "daily",
-  "priority": 10,
-  "output": "md",
-  "load": {"transcripts": true, "percepts": false, "talents": {"screen": true}}
-}
-
-$facets
-
-$daily_preamble
-
-# Workday Productivity & Flow Analysis
-
-## Objective
-
-Analyze the full workday transcript to provide deep insights into work patterns, energy management, and productivity optimization, focusing on the holistic flow and rhythm of the workday.
-
-## Analysis Framework
-
-### 1. Workday Architecture & Time Patterns
-- **Work Session Analysis**:
-  - Map focused work blocks vs. fragmented segments
-  - Identify optimal session lengths for different work types
-  - Analyze startup/shutdown rituals and $pronouns_possessive effectiveness
-- **Rhythm & Cadence**:
-  - Natural work cycles throughout the day
-  - Productive vs. recovery segments
-  - Time between breaks and $pronouns_possessive impact on subsequent focus
-- **Context Switching Cost**:
-  - Frequency and depth of task switches
-  - Recovery time after interruptions
-  - Strategies used to maintain or regain focus
-
-### 2. Deep Work & Flow States
-- **Flow Identification**:
-  - Segments of sustained, uninterrupted focus
-  - Environmental conditions that enabled flow
-  - Types of work that achieved flow state
-- **Flow Blockers**:
-  - What disrupted deep work sessions
-  - Attempted deep work that failed and why
-  - Patterns in when deep work is most/least successful
-- **Attention Residue**:
-  - Impact of incomplete tasks on subsequent work
-  - How well transitions between different work modes were managed
-
-### 3. Energy & Cognitive Load Management
-- **Energy Patterns**:
-  - Peak performance windows and what characterized them
-  - Energy dips and $pronouns_possessive triggers
-  - Correlation between task type and energy levels
-- **Cognitive Load Indicators**:
-  - Signs of mental fatigue (errors, slower responses, frustration)
-  - Complexity handling throughout the day
-  - Decision fatigue patterns
-- **Recovery & Renewal**:
-  - Quality and timing of breaks
-  - Activities that restored vs. drained energy
-  - Missed opportunities for strategic renewal
-
-### 4. Work Style & Personal Productivity Patterns
-- **Working Memory Management**:
-  - How information is tracked across tasks
-  - Use of external memory systems (notes, tabs, etc.)
-  - Information loss between sessions
-- **Momentum Patterns**:
-  - How work sessions build on each other
-  - When momentum is gained or lost
-  - Impact of starting tasks vs. completing them
-- **Personal Productivity Style**:
-  - Preference for batching vs. interleaving
-  - Monotropic vs. polytropic work patterns
-  - Procrastination patterns and triggers
-
-### 5. Environmental & Contextual Factors
-- **Physical Environment Impact**:
-  - References to workspace, comfort, distractions
-  - Environmental changes and $pronouns_possessive effects
-- **Psychological State**:
-  - Mood indicators throughout the day
-  - Stress responses and coping mechanisms
-  - Enthusiasm/engagement fluctuations
-- **External Pressures**:
-  - Deadline impacts on work patterns
-  - How external demands shaped the day's flow
-
-### 6. Productivity Optimization Opportunities
-
-#### Work Design Improvements:
-- **Time Blocking Recommendations**: Specific suggestions for structuring future days
-- **Energy-Task Matching**: Aligning high-energy segments with complex work
-- **Focus Protection Strategies**: Concrete ways to preserve deep work time
-- **Transition Rituals**: Practices to improve task switching
-
-#### Systemic Improvements:
-- **Workflow Bottlenecks**: Recurring friction points in work processes
-- **Attention Leaks**: Where focus consistently gets diverted
-- **Decision Fatigue Reduction**: Opportunities to streamline or automate decisions
-- **Cognitive Load Distribution**: Better ways to spread mental effort
-
-#### Personal Effectiveness:
-- **Leverage Points**: Small changes that could yield big productivity gains
-- **Habit Stacking Opportunities**: Beneficial routines to develop
-- **Energy Investment ROI**: Where effort produces best/worst returns
-
-### 7. Comparative Performance Analysis
-- **Performance Variability**: What distinguished high vs. low productivity segments
-- **Success Pattern Recognition**: Common elements in productive sequences
-- **Failure Pattern Analysis**: Recurring productivity obstacles
-- **Personal Best Practices**: Effective strategies observed in action
-
-### 8. Strategic Insights & Meta-Patterns
-- **Work-Life Integration**: How personal and professional elements affected each other
-- **Capacity Utilization**: Whether operating at sustainable levels
-- **Growth Edge Identification**: Where pushing comfort zone vs. playing safe
-- **Systemic Issues**: Patterns suggesting organizational or structural problems
-
-## Output Requirements
-
-Create a comprehensive markdown report that:
-- Uses clear headers and visual hierarchy for scannability
-- Includes specific timestamps for key examples
-- Balances detailed analysis with actionable insights
-- Provides both immediate tactics and long-term strategies
-- Uses data visualization descriptions where helpful (e.g., "Energy levels would show a bell curve peaking at 10am-12pm")
-
-## Special Considerations
-
-- $Preferred often multi-tasks with background meetings while doing other work
-- Consider both explicit productivity (tasks completed) and implicit productivity (relationship building, learning, thinking)
-- Note the difference between "busy" and "productive" time
-- Identify both successful strategies to replicate and problematic patterns to address
-- Consider productivity in context of overall wellbeing and sustainability
-
-Remember: The goal is to reveal the hidden architecture of the workday—the rhythms, patterns, and dynamics that shape productivity—providing insights that help optimize not just what gets done, but how work flows throughout the day.
-2
talent/followups.md
···
   "title": "Follow-Up Items",
   "description": "Detects promised tasks, commitments, and reminders for future action within each activity. Outputs a concise Markdown list of follow-ups with context.",
-  "occurrences": "Whenever a future task or commitment is mentioned, create an occurrence with the expected action and deadline if known. Note who requested it and whether it is work or personal.",
-  "hook": {"post": "occurrence"},
   "color": "#ffc107",
   "schedule": "activity",
   "activities": ["meeting", "call", "messaging", "email"],
-67
talent/knowledge_graph.md
···
-{
-  "type": "generate",
-
-  "title": "Knowledge Graph",
-  "description": "Extracts people, projects, tools and other entities from the transcript and maps how they relate. Produces a Markdown report plus narrative describing network hubs and bridges discovered during the day.",
-  "occurrences": "For each entity interaction or relationship mentioned, create an occurrence describing the connection. Include start and end times when the relationship is visible, and capture the type of link such as works-on or discusses-with.",
-  "hook": {"post": "occurrence"},
-  "color": "#6f42c1",
-  "schedule": "daily",
-  "priority": 10,
-  "output": "md",
-  "load": {"transcripts": true, "percepts": false, "talents": {"screen": true}}
-}
-
-$facets
-
-$daily_preamble
-
-# Comprehensive Workday Knowledge Graph and Network Analysis from Transcripts
-
-**Input:** A markdown file containing a chronologically ordered transcript of a workday for $name. The transcript is organized by recording segments, each combining information from audio recordings and screen activity for that time period.
-
-**Objective:** Generate a comprehensive knowledge graph and perform a network analysis based on the provided workday transcripts.
-
-**Instructions:**
-
-1. **Entity Extraction and Profiling:**
-    * Identify all distinct entities. Categorize entities using appropriate types such as: `Person`, `Organization` (companies, groups, teams), `Project`, `Task`, `Concept` (abstract ideas, features, problem domains), `Tool` (software, applications, physical tools), `Location`, `Event`, and `Topic` (general subjects of discussion), or other contextually relevant categories.
-    * For each identified entity, provide:
-        * `Entity Name`: The canonical name of the entity.
-        * `Entity Type`: Its category from the list above.
-        * `First Appearance`: Time.
-        * `Total Engagement`: Approximately how many times was it mentioned throughout the day.
-        * `Context`: Provide a concise (1-2 sentence) summary of its role or context in how it was used, referencing key actions or discussions from both audio and screen data if distinguishable.
-    * **Concept Quality Filter:** Distinguish `Concept` from `Topic`. A Concept must be a genuinely reusable idea — a mental model, framework, strategic insight, or principle — valuable to recall in a different context. "Discussed the migration timeline" is a Topic (use `Topic` type). "Conway's Law applies to their API design" is a Concept (use `Concept` type). When extracting Concepts, capture WHY the concept matters in the `Context` field, not just that it was mentioned.
-
-2. **Relationship Mapping:**
-    * Identify and map all significant connections between entities.
-    * For each connection, define:
-        * `Source Name`
-        * `Target Name`
-        * `Relationship Type`: Use descriptive labels (e.g., `works-on`, `discusses-with`, `uses-tool-for-project`, `reports-to`, `blocked-by`, `enables`, `references-concept`, `mentions-organization`). Be prepared to infer novel relationship types if the provided examples are insufficient, and briefly justify any novel types.
-
-3. **Network Analysis and Insights:**
-    * Based on the extracted entities and relationships, identify and describe:
-        * `Hub Entities`: Entities with the highest number of diverse connections, acting as central points in the workday. List the top 3-5.
-        * `Bridge Entities`: Entities that uniquely connect disparate clusters of entities or topics that would otherwise be disconnected.
-        * `Orphan Idea`: Concepts, tasks, or topics mentioned but not substantially connected to other entities or followed up upon.
-        * `Temporal Flows`: Describe how attention and activity move through the network over time. For example, "Morning focused on Project X involving Person A and Tool Y, shifting to Concept Z discussions with Person B in the afternoon."
-
-4. **Output Format:**
-    * **Part 1: Friendly Markdown Report:**
-        * A list of all entities with their profiles (from step 1).
-        * A list of all relationships (from step 2).
-    * **Part 2: Narrative Analysis (Markdown):**
-        * A summary of the key findings from the Network Analysis (step 3).
-        * A qualitative description of what a visual network diagram of this day would highlight. Include specific examples of the 2-3 most interesting or unexpected connections discovered, explaining why they are noteworthy (e.g., "An interesting connection is Person A using Tool Z, typically associated with Project Q, for an ad-hoc task related to Concept R. This suggests a novel application or workaround.").
-
-**Key Considerations:**
-* Synthesize information from all transcript content within each chunk.
-* Disambiguate entities by consolidating name variants to a single canonical entity using the most complete name when the same entity is referenced by different names within the transcript (e.g., "John", "John D.", "John Doe" → use "John Doe" throughout).
-* When first-name-only references are ambiguous, note the ambiguity in the entity's Context field rather than guessing identity.
-* Cross-reference names against attendees and participants mentioned earlier in the transcript for spelling corrections and consistent naming.
-* Infer implicit relationships where explicit statements are lacking but context strongly suggests a connection.
-* Focus on the most relevant and significant entities and relationships to avoid an overly noisy graph.
-* For live capture, $preferred often multi-tasks — e.g., joined on a team zoom in the background while working on an unrelated task — so different content streams may not always align.
-* Take time to consider all of the nuance of the interactions from the day, deeply think through how best to prioritize the most important aspects and understandings, formulate the best approach for each step of the analysis.
-2
talent/meetings.md
···
   "title": "Meeting Notes",
   "description": "Produces detailed meeting notes for each meeting activity, including participants, topics discussed, action items, and presentation details.",
-  "occurrences": "Each meeting should generate an occurrence with start and end times, list of participants and a concise summary. If slides are present, mention them in the details field.",
-  "hook": {"post": "occurrence"},
   "color": "#e83e8c",
   "schedule": "activity",
   "activities": ["meeting"],
-2
talent/messaging.md
···
   "title": "Messaging Summary",
   "description": "Extracts contacts, channels, apps, and message content from completed messaging and email activities.",
-  "occurrences": "Create an occurrence for every distinct message interaction. Include the time block, app name, contacts or channels involved, whether $preferred was reading or replying, and a summary of visible content.",
-  "hook": {"post": "occurrence"},
   "color": "#78909c",
   "schedule": "activity",
   "activities": ["messaging", "email"],
-74
talent/occurrence.md
···
-{
-
-  "title": "Occurrence Extraction",
-  "description": "Extracts structured occurrence events from insight summaries.",
-  "color": "#37474f"
-
-}
-
-# Occurrence JSON Conversion
-
-## Objective
-
-Extract events from a Markdown summary generated from daily transcripts and convert them into structured JSON occurrences.
-
-## Instructions
-1. **Extract every distinct event** mentioned in the summary - no matter how brief
-2. **Be comprehensive** - capture meetings, messages, file activities, follow-ups, documentation work, research, media consumption, etc.
-3. **Preserve timing information** when available in the source
-4. **Infer missing details** reasonably when context provides clues
-5. **Assign facets** - for every occurrence, always choose the best matching facet from the Available Facets context. Use the facet name/ID (e.g., `facet_name`) based on entities mentioned, subject matter, or context. This field is required.
-6. **Return only valid JSON** - no commentary, explanations, or wrapper objects
-7. **Handle empty sources** - if the source indicates no events occurred (e.g., "No meetings detected"), return an empty array: `[]`
-8. **Separate concurrent activities** - if the owner is engaged in two unrelated activities simultaneously (e.g., a meeting on one screen and a text conversation on another), these must be separate events with separate participant lists. A person texting during a meeting is NOT a participant in that meeting. Signals of concurrent-but-unrelated activity include: different communication channels (Signal vs Webex), different subject matter, different facets.
-9. **Respect facet boundaries for participants** - participants should only be listed for events within their relevant facet. A personal contact who appears in a concurrent personal activity should not be listed in a work event's participants. When the source shows interleaved activities across facets, separate them into distinct occurrences and assign participants only to the occurrence they actually belong to.
-
-## Occurrence Fields
-- **type** – the kind of occurrence such as `meeting`, `message`, `file`, `followup`, `documentation`, `research`, `media`, etc.
-- **start** and **end** – HH:MM:SS timestamps containing the occurrence (use "00:00:00" if unknown)
-- **title** – short descriptive title for display
-- **summary** – concise one-sentence recap of what happened
-- **work** – boolean classification: `true` for work-related, `false` for personal or not work related
-- **participants** – optional list of people or entities involved (empty array if none)
-- **facet** – required facet identifier; use the facet name/ID from Available Facets context that best matches the occurrence based on entities, subject matter, or context
-- **details** – free-form string capturing all additional context, specifics, outcomes, or other relevant information from the original document that is not covered already by the other fields
-
-## Output Format
-Return a JSON array of occurrences only. Each occurrence must include all required fields.
-
-## Example
-[
-  {
-    "type": "meeting",
-    "start": "09:00:00",
-    "end": "09:30:00",
-    "title": "Team stand-up",
-    "summary": "Daily status update with the engineering team discussing sprint progress and blockers.",
-    "work": true,
-    "participants": ["$name", "Alice", "Bob"],
-    "facet": "work_project_alpha",
-    "details": "Alice reported database optimization complete ahead of schedule. Bob mentioned UI testing delays due to missing design assets. Scheduled follow-up meeting for authentication module review. Sprint velocity tracking discussed."
-  },
-  {
-    "type": "message",
-    "start": "14:22:00",
-    "end": "14:22:00",
-    "title": "Slack message to design team",
-    "summary": "Requested updated mockups for the login flow redesign.",
-    "work": true,
-    "participants": ["$name", "Design Team"],
-    "facet": "work_project_alpha",
-    "details": "Specifically asked for mobile responsive versions and accessibility considerations. Mentioned deadline of end of week."
-  },
-  {
-    "type": "research",
-    "start": "15:30:00",
-    "end": "16:15:00",
-    "title": "Authentication framework comparison",
-    "summary": "Researched OAuth 2.0 vs Auth0 implementation options for the new user system.",
-    "work": true,
-    "participants": [],
-    "facet": "work_project_alpha",
-    "details": "Compared security features, pricing models, and integration complexity. Created comparison spreadsheet with pros/cons. Leaning toward Auth0 for faster implementation."
-  }
-]
-119
talent/occurrence.py
···
-# SPDX-License-Identifier: AGPL-3.0-only
-# Copyright (c) 2026 sol pbc
-
-"""Hook for extracting occurrence events from generator output results.
-
-This hook is invoked via "hook": {"post": "occurrence"} in generator frontmatter.
-It extracts structured JSON events from markdown summaries and writes
-them to facet-based JSONL files.
-"""
-
-import json
-import logging
-from pathlib import Path
-
-from think.facets import facet_summaries
-from think.hooks import (
-    compute_output_source,
-    log_extraction_failure,
-    should_skip_extraction,
-    write_events_jsonl,
-)
-from think.models import generate
-from think.prompts import load_prompt
-from think.talent import get_output_name
-
-
-def post_process(result: str, context: dict) -> str | None:
-    """Extract occurrence events from generator output result.
-
-    This hook extracts structured JSON events from markdown output summaries
-    and writes them to facet-based JSONL files.
-
-    Args:
-        result: The generated output markdown content.
-        context: Config dict with keys including day, segment, name,
-            output_path, meta, transcript, span, span_mode.
-
-    Returns:
-        None - this hook does not modify the output result.
-    """
-    # Check skip conditions
-    skip_reason = should_skip_extraction(result, context)
-    if skip_reason:
-        logging.info("Skipping occurrence extraction: %s", skip_reason)
-        return None
-
-    # Load extraction prompt
-    prompt_content = load_prompt("occurrence", base_dir=Path(__file__).parent)
-
-    # Build context with facets + agent-specific instructions
-    facets_context = facet_summaries(detailed=True)
-    agent_instructions = context.get("meta", {}).get("occurrences")
-    if agent_instructions and isinstance(agent_instructions, str):
-        extra_instructions = f"{facets_context}\n\n{agent_instructions}"
-    else:
-        extra_instructions = facets_context
-
-    # Extract events
-    name = context.get("name", "unknown")
-    contents = [extra_instructions, result]
-
-    try:
-        response_text = generate(
-            contents=contents,
-            context=f"talent.system.{name}",
-            temperature=0.3,
-            max_output_tokens=24576,
-            thinking_budget=0,
-            system_instruction=prompt_content.text,
-            json_output=True,
-        )
-    except Exception as e:
-        log_extraction_failure(e, name)
-        return None
-
-    try:
-        events = json.loads(response_text)
-    except json.JSONDecodeError as e:
-        logging.error("Invalid JSON from occurrence extraction: %s", e)
-        return None
-
-    if not isinstance(events, list):
-        logging.error("Extraction did not return array")
-        return None
-
-    filtered_events = []
-    for event in events:
-        if event.get("type") == "meeting" and len(event.get("participants", [])) > 25:
-            logging.warning(
-                "Dropping megameeting occurrence: title=%r agent=%s participants=%d",
-                event.get("title", ""),
-                name,
-                len(event.get("participants", [])),
-            )
-            continue
-        filtered_events.append(event)
-    events = filtered_events
-
-    # Write to facet JSONL files
-    source_output = compute_output_source(context)
-    output_name = get_output_name(name)
-    day = context.get("day", "")
-
-    written_paths = write_events_jsonl(
-        events=events,
-        agent=output_name,
-        occurred=True,
-        source_output=source_output,
-        capture_day=day,
-    )
-
-    if written_paths:
-        print(f"Events written to {len(written_paths)} JSONL file(s):")
-        for p in written_paths:
-            print(f" {p}")
-    else:
-        print("No events with valid facets to write")
-
-    return None  # Don't modify insight result
-82
talent/timeline.md
···
-{
-  "type": "generate",
-
-  "title": "Day Timeline",
-  "description": "Constructs a detailed chronological timeline documenting every activity, task shift, and event throughout the workday. Creates a comprehensive historical record with rich descriptions of what happened when.",
-  "occurrences": "Create an occurrence for each hour segment, don't break down hours into any smaller segments the goal for timeline occurrences is for them to capture whatever happened within each hour of the day where there was activity.",
-  "hook": {"post": "occurrence"},
-  "color": "#7b1fa2",
-  "schedule": "daily",
-  "priority": 10,
-  "output": "md",
-  "load": {"transcripts": true, "percepts": false, "talents": {"screen": true}}
-
-}
-
-$daily_preamble
-
-# Comprehensive Workday Timeline Documentation
-
-## Objective
-
-Create a detailed, chronological timeline of the entire workday, documenting everything that happened with rich, descriptive detail. The transcript combines audio conversations and screen activity organized by recording segments.
-
-## Documentation Approach
-
-### Timeline Structure
-- Organize by hour (e.g., 09:00-10:00)
-- Within each hour, create sub-sections for every activity shift
-- Include precise timestamps in HHMMSS format for all transitions
-- Document parallel activities (e.g., coding while in a background meeting)
-- Capture even brief activities if they represent a focus shift
-
-### What to Capture
-For each time block, capture notes such as:
-- **Primary Activity**: What was being actively worked on
-- **Tools & Applications**: All software/websites being used
-- **Files & Documents**: Specific files opened, edited, or referenced
-- **Collaborator Context**: Mention collaborators in prose only when materially relevant to the activity; do not emit standalone lists of names
-- **Content Details**: Topics discussed, code written, problems solved
-- **Parallel Activities**: Background meetings, music, notifications
-- **Physical Context**: Any mentions of location, movement, breaks
-
-### Level of Detail
-- Include specific project names and file names; mention collaborators in prose only when materially relevant, never as standalone name lists
-- Describe the work or activity (what code was written, what was discussed)
-- Note transitions between activities, even small ones
-- Capture the substance of meetings and conversations
-- Include break activities (media, games, etc)
-- Document both completed and attempted tasks
-
-## Format
-
-Output a nicely formatted markdown document with per-hour sections and sub-sections within each hour for every task or focus shift. Only include time segments when there was activity in the transcripts.
-
-## Documentation Guidelines
-
-1. **Be Descriptive, Not Analytical**
-   - Write what happened, not why or how well
-   - Include details that paint a picture of the day
-   - Avoid productivity assessments or recommendations
-
-2. **Capture Everything**
-   - Don't omit "minor" activities if they represent focus shifts
-   - Include failed attempts and dead ends
-   - Document personal moments that interrupt work
-
-3. **Use Precise Language**
-   - Specific file names, not "worked on code"
-   - Actual meeting topics, not "discussed project"
-   - Real error messages and solutions found
-
-4. **Maintain Chronological Integrity**
-   - If activities overlap, note both
-   - Don't group similar activities from different times
-   - Preserve the actual flow of the day
-
-5. **Rich Context**
-   - Include enough detail that someone could understand what was accomplished
-   - Note tools, files, collaborators, and resources involved in prose when materially relevant; do not emit standalone lists of names
-   - Capture the substance of work, not just categories
-
-Remember: The goal is to create a detailed historical record of the day that captures not just what activities occurred, but the rich detail of how the work actually unfolded. This timeline should serve as a comprehensive reference that could help reconstruct any part of the day's work.
+4 -4
tests/baselines/api/activities/day-events.json
···
 [
   {
     "agent": "flow",
-    "color": "#17a2b8",
+    "color": "#6c757d",
     "details": "Attended keynote on unified API gateways",
     "endTime": "2026-03-04T12:00:00",
     "facet": "montague",
···
   },
   {
     "agent": "flow",
-    "color": "#17a2b8",
+    "color": "#6c757d",
     "details": "Juliet and Romeo exchanged Signal contacts",
     "endTime": "2026-03-04T20:00:00",
     "facet": "capulet",
···
   },
   {
     "agent": "flow",
-    "color": "#17a2b8",
+    "color": "#6c757d",
     "details": "Standing ovation for architecture presentation",
     "endTime": "2026-03-04T10:00:00",
     "facet": "capulet",
···
   },
   {
     "agent": "flow",
-    "color": "#17a2b8",
+    "color": "#6c757d",
     "details": "Tybalt confronted Romeo",
     "endTime": "2026-03-04T18:00:00",
     "facet": "montague",
-30
tests/baselines/api/settings/generators.json
···
     },
     {
       "app": null,
-      "description": "Constructs a detailed chronological timeline documenting every activity, task shift, and event throughout the workday. Creates a comprehensive historical record with rich descriptions of what happened when.",
-      "disabled": false,
-      "extract": true,
-      "has_extraction": true,
-      "key": "timeline",
-      "source": "system",
-      "title": "Day Timeline"
-    },
-    {
-      "app": null,
       "description": "Extracts future scheduled events and calendar activities into structured anticipation records. Captures dates, times, participants, and cancellation state.",
       "disabled": false,
       "extract": null,
···
       "key": "schedule",
       "source": "system",
       "title": "Upcoming Schedule"
-    },
-    {
-      "app": null,
-      "description": "Extracts people, projects, tools and other entities from the transcript and maps how they relate. Produces a Markdown report plus narrative describing network hubs and bridges discovered during the day.",
-      "disabled": false,
-      "extract": true,
-      "has_extraction": true,
-      "key": "knowledge_graph",
-      "source": "system",
-      "title": "Knowledge Graph"
-    },
-    {
-      "app": null,
-      "description": "Summarizes the overall flow of the workday. Looks for patterns in focus, energy, context switching and highlights productivity insights in a Markdown report.",
-      "disabled": false,
-      "extract": true,
-      "has_extraction": true,
-      "key": "flow",
-      "source": "system",
-      "title": "Day Overview"
     }
   ],
   "segment": [
-38
tests/baselines/api/settings/providers.json
··· 223 223 }, 224 224 "talent.system.decisions": { 225 225 "disabled": false, 226 - "extract": true, 227 226 "group": "Think", 228 227 "label": "Decision Actions", 229 228 "schedule": "activity", ··· 252 251 "label": "facet_newsletter", 253 252 "tier": 2, 254 253 "type": null 255 - }, 256 - "talent.system.flow": { 257 - "disabled": false, 258 - "extract": true, 259 - "group": "Think", 260 - "label": "Day Overview", 261 - "schedule": "daily", 262 - "tier": 2, 263 - "type": "generate" 264 254 }, 265 255 "talent.system.followups": { 266 256 "disabled": false, 267 - "extract": true, 268 257 "group": "Think", 269 258 "label": "Follow-Up Items", 270 259 "schedule": "activity", ··· 287 276 "tier": 2, 288 277 "type": "cogitate" 289 278 }, 290 - "talent.system.knowledge_graph": { 291 - "disabled": false, 292 - "extract": true, 293 - "group": "Think", 294 - "label": "Knowledge Graph", 295 - "schedule": "daily", 296 - "tier": 2, 297 - "type": "generate" 298 - }, 299 279 "talent.system.meetings": { 300 280 "disabled": false, 301 - "extract": true, 302 281 "group": "Think", 303 282 "label": "Meeting Notes", 304 283 "schedule": "activity", ··· 307 286 }, 308 287 "talent.system.messaging": { 309 288 "disabled": false, 310 - "extract": true, 311 289 "group": "Think", 312 290 "label": "Messaging Summary", 313 291 "schedule": "activity", ··· 328 306 "label": "Naming", 329 307 "tier": 2, 330 308 "type": "cogitate" 331 - }, 332 - "talent.system.occurrence": { 333 - "disabled": false, 334 - "group": "Think", 335 - "label": "Occurrence Extraction", 336 - "tier": 2, 337 - "type": null 338 309 }, 339 310 "talent.system.participation": { 340 311 "disabled": false, ··· 414 385 "label": "test_missing_type_output_1568d299dc474aa9ba42401c7b1b75e2", 415 386 "tier": 2, 416 387 "type": null 417 - }, 418 - "talent.system.timeline": { 419 - "disabled": false, 420 - "extract": true, 421 - "group": "Think", 422 - "label": "Day Timeline", 423 - "schedule": "daily", 424 - "tier": 2, 425 - "type": 
"generate" 426 388 }, 427 389 "talent.system.triage": { 428 390 "disabled": false,
-44
tests/baselines/api/sol/talents-day.json
··· 203 203 "title": "facet_newsletter", 204 204 "type": null 205 205 }, 206 - "flow": { 207 - "app": null, 208 - "color": "#17a2b8", 209 - "description": "Summarizes the overall flow of the workday. Looks for patterns in focus, energy, context switching and highlights productivity insights in a Markdown report.", 210 - "multi_facet": false, 211 - "output_format": "md", 212 - "schedule": "daily", 213 - "source": "system", 214 - "title": "Day Overview", 215 - "type": "generate" 216 - }, 217 206 "followups": { 218 207 "app": null, 219 208 "color": "#ffc107", ··· 290 279 "source": "system", 291 280 "title": "Joke Bot", 292 281 "type": "cogitate" 293 - }, 294 - "knowledge_graph": { 295 - "app": null, 296 - "color": "#6f42c1", 297 - "description": "Extracts people, projects, tools and other entities from the transcript and maps how they relate. Produces a Markdown report plus narrative describing network hubs and bridges discovered during the day.", 298 - "multi_facet": false, 299 - "output_format": "md", 300 - "schedule": "daily", 301 - "source": "system", 302 - "title": "Knowledge Graph", 303 - "type": "generate" 304 282 }, 305 283 "meetings": { 306 284 "app": null, ··· 346 324 "title": "Naming", 347 325 "type": "cogitate" 348 326 }, 349 - "occurrence": { 350 - "app": null, 351 - "color": "#37474f", 352 - "description": "Extracts structured occurrence events from insight summaries.", 353 - "multi_facet": false, 354 - "output_format": null, 355 - "schedule": null, 356 - "source": "system", 357 - "title": "Occurrence Extraction", 358 - "type": null 359 - }, 360 327 "participation": { 361 328 "app": null, 362 329 "color": "#6c757d", ··· 477 444 "source": "system", 478 445 "title": "test_missing_type_output_1568d299dc474aa9ba42401c7b1b75e2", 479 446 "type": null 480 - }, 481 - "timeline": { 482 - "app": null, 483 - "color": "#7b1fa2", 484 - "description": "Constructs a detailed chronological timeline documenting every activity, task shift, and event throughout the 
workday. Creates a comprehensive historical record with rich descriptions of what happened when.", 485 - "multi_facet": false, 486 - "output_format": "md", 487 - "schedule": "daily", 488 - "source": "system", 489 - "title": "Day Timeline", 490 - "type": "generate" 491 447 }, 492 448 "todos:daily": { 493 449 "app": "todos",
-85
tests/baselines/api/stats/stats.json
··· 32 32 ], 33 33 "color": "#dc3545", 34 34 "description": "Tracks consequential decision-actions that change state, plans, resources, responsibilities, or timing in ways that affect other people.", 35 - "hook": { 36 - "post": "occurrence" 37 - }, 38 35 "load": { 39 36 "percepts": false, 40 37 "talents": { ··· 43 40 "transcripts": true 44 41 }, 45 42 "mtime": 0, 46 - "occurrences": "Create an occurrence for every decision-action observed. Include the time span, decision type, actors involved, entities affected, and impact assessment. Each occurrence should capture both the intent and enactment of the decision.", 47 43 "output": "md", 48 44 "path": "<PROJECT>/talent/decisions.md", 49 45 "priority": 10, ··· 122 118 "title": "Entity Observer", 123 119 "type": "generate" 124 120 }, 125 - "flow": { 126 - "color": "#17a2b8", 127 - "description": "Summarizes the overall flow of the workday. Looks for patterns in focus, energy, context switching and highlights productivity insights in a Markdown report.", 128 - "hook": { 129 - "post": "occurrence" 130 - }, 131 - "load": { 132 - "percepts": false, 133 - "talents": { 134 - "screen": true 135 - }, 136 - "transcripts": true 137 - }, 138 - "mtime": 0, 139 - "occurrences": "Create an occurrence for noteworthy shifts in work rhythms or focus. Include timestamps when deep work starts or ends, or when energy levels noticeably change. Classify each as work or personal based on the surrounding context.", 140 - "output": "md", 141 - "path": "<PROJECT>/talent/flow.md", 142 - "priority": 10, 143 - "schedule": "daily", 144 - "source": "system", 145 - "title": "Day Overview", 146 - "type": "generate" 147 - }, 148 121 "followups": { 149 122 "activities": [ 150 123 "meeting", ··· 154 127 ], 155 128 "color": "#ffc107", 156 129 "description": "Detects promised tasks, commitments, and reminders for future action within each activity. 
Outputs a concise Markdown list of follow-ups with context.", 157 - "hook": { 158 - "post": "occurrence" 159 - }, 160 130 "load": { 161 131 "percepts": false, 162 132 "talents": { ··· 165 135 "transcripts": true 166 136 }, 167 137 "mtime": 0, 168 - "occurrences": "Whenever a future task or commitment is mentioned, create an occurrence with the expected action and deadline if known. Note who requested it and whether it is work or personal.", 169 138 "output": "md", 170 139 "path": "<PROJECT>/talent/followups.md", 171 140 "priority": 10, ··· 174 143 "title": "Follow-Up Items", 175 144 "type": "generate" 176 145 }, 177 - "knowledge_graph": { 178 - "color": "#6f42c1", 179 - "description": "Extracts people, projects, tools and other entities from the transcript and maps how they relate. Produces a Markdown report plus narrative describing network hubs and bridges discovered during the day.", 180 - "hook": { 181 - "post": "occurrence" 182 - }, 183 - "load": { 184 - "percepts": false, 185 - "talents": { 186 - "screen": true 187 - }, 188 - "transcripts": true 189 - }, 190 - "mtime": 0, 191 - "occurrences": "For each entity interaction or relationship mentioned, create an occurrence describing the connection. 
Include start and end times when the relationship is visible, and capture the type of link such as works-on or discusses-with.", 192 - "output": "md", 193 - "path": "<PROJECT>/talent/knowledge_graph.md", 194 - "priority": 10, 195 - "schedule": "daily", 196 - "source": "system", 197 - "title": "Knowledge Graph", 198 - "type": "generate" 199 - }, 200 146 "meetings": { 201 147 "activities": [ 202 148 "meeting" 203 149 ], 204 150 "color": "#e83e8c", 205 151 "description": "Produces detailed meeting notes for each meeting activity, including participants, topics discussed, action items, and presentation details.", 206 - "hook": { 207 - "post": "occurrence" 208 - }, 209 152 "load": { 210 153 "percepts": false, 211 154 "talents": { ··· 214 157 "transcripts": true 215 158 }, 216 159 "mtime": 0, 217 - "occurrences": "Each meeting should generate an occurrence with start and end times, list of participants and a concise summary. If slides are present, mention them in the details field.", 218 160 "output": "md", 219 161 "path": "<PROJECT>/talent/meetings.md", 220 162 "priority": 10, ··· 230 172 ], 231 173 "color": "#78909c", 232 174 "description": "Extracts contacts, channels, apps, and message content from completed messaging and email activities.", 233 - "hook": { 234 - "post": "occurrence" 235 - }, 236 175 "load": { 237 176 "percepts": false, 238 177 "talents": { ··· 241 180 "transcripts": true 242 181 }, 243 182 "mtime": 0, 244 - "occurrences": "Create an occurrence for every distinct message interaction. 
Include the time block, app name, contacts or channels involved, whether $preferred was reading or replying, and a summary of visible content.", 245 183 "output": "md", 246 184 "path": "<PROJECT>/talent/messaging.md", 247 185 "priority": 10, ··· 380 318 "schedule": "segment", 381 319 "source": "system", 382 320 "title": "Speaker Attribution", 383 - "type": "generate" 384 - }, 385 - "timeline": { 386 - "color": "#7b1fa2", 387 - "description": "Constructs a detailed chronological timeline documenting every activity, task shift, and event throughout the workday. Creates a comprehensive historical record with rich descriptions of what happened when.", 388 - "hook": { 389 - "post": "occurrence" 390 - }, 391 - "load": { 392 - "percepts": false, 393 - "talents": { 394 - "screen": true 395 - }, 396 - "transcripts": true 397 - }, 398 - "mtime": 0, 399 - "occurrences": "Create an occurrence for each hour segment, don't break down hours into any smaller segments the goal for timeline occurrences is for them to capture whatever happened within each hour of the day where there was activity.", 400 - "output": "md", 401 - "path": "<PROJECT>/talent/timeline.md", 402 - "priority": 10, 403 - "schedule": "daily", 404 - "source": "system", 405 - "title": "Day Timeline", 406 321 "type": "generate" 407 322 } 408 323 },
+2 -2
tests/test_generate_full.py
@@ -320,8 +320,8 @@
     from think.talent import load_post_hook
 
     # Config with named hook (new format)
-    config = {"hook": {"post": "occurrence"}}
+    config = {"hook": {"post": "schedule"}}
     hook_fn = load_post_hook(config)
 
-    # Should resolve to talent/occurrence.py and be callable
+    # Should resolve to talent/schedule.py and be callable
     assert callable(hook_fn)
+6 -6
tests/test_generate_scan_day.py
@@ -18,7 +18,7 @@
     copytree_tracked(src, dest)
     talents_dir = dest / "talents"
     talents_dir.mkdir(exist_ok=True)  # Allow existing directory
-    (talents_dir / "flow.md").write_text("done")
+    (talents_dir / "schedule.json").write_text("[]")
     return dest
 
 
@@ -28,10 +28,10 @@
     monkeypatch.setenv("_SOLSTONE_JOURNAL_OVERRIDE", str(tmp_path))
 
     info = mod.scan_day("20240101")
-    assert "talents/flow.md" in info["processed"]
-    assert "talents/timeline.md" in info["repairable"]
+    assert "talents/schedule.json" in info["processed"]
+    assert "talents/daily_schedule.json" in info["repairable"]
 
-    (day_dir / "talents" / "timeline.md").write_text("done")
+    (day_dir / "talents" / "daily_schedule.json").write_text("[]")
     info_after = mod.scan_day("20240101")
-    assert "talents/timeline.md" in info_after["processed"]
-    assert "talents/timeline.md" not in info_after["repairable"]
+    assert "talents/daily_schedule.json" in info_after["processed"]
+    assert "talents/daily_schedule.json" not in info_after["repairable"]
+4 -4
tests/test_generators.py
@@ -13,13 +13,13 @@
     """Test that system generators are discovered with source field."""
     talent = importlib.import_module("think.talent")
     generators = talent.get_talent_configs(type="generate")
-    assert "flow" in generators
-    info = generators["flow"]
-    assert os.path.basename(info["path"]) == "flow.md"
+    assert "schedule" in generators
+    info = generators["schedule"]
+    assert os.path.basename(info["path"]) == "schedule.md"
     assert isinstance(info["color"], str)
     assert isinstance(info["mtime"], int)
     assert "title" in info
-    assert "occurrences" in info
+    assert "occurrences" not in info
     # New: check source field
     assert info.get("source") == "system"
 
+2 -285
tests/test_output_hooks.py
··· 12 12 import importlib 13 13 import io 14 14 import json 15 - import logging 16 15 import os 17 16 from pathlib import Path 18 17 19 - import talent.occurrence as occurrence 20 18 from tests.conftest import copytree_tracked 21 - from think.hooks import write_events_jsonl 22 19 from think.talent import load_post_hook, load_pre_hook 23 20 from think.talents import _apply_template_vars 24 21 from think.utils import day_path ··· 121 118 122 119 def test_load_post_hook_named_resolution(): 123 120 """Test that named hooks resolve to talent/{name}.py.""" 124 - # occurrence.py exists in talent/ 125 - config = {"hook": {"post": "occurrence"}} 121 + # schedule.py exists in talent/ 122 + config = {"hook": {"post": "schedule"}} 126 123 hook_fn = load_post_hook(config) 127 124 assert callable(hook_fn) 128 125 ··· 304 301 finish_events = [e for e in events if e["event"] == "finish"] 305 302 assert len(finish_events) == 1 306 303 assert finish_events[0]["result"] == MOCK_RESULT["text"] 307 - 308 - 309 - def test_occurrence_post_process_drops_meeting_with_26_participants( 310 - monkeypatch, caplog 311 - ): 312 - """Test megameeting occurrences are dropped before writing.""" 313 - captured = {} 314 - participants = [f"Person {i}" for i in range(26)] 315 - 316 - def mock_generate(**kwargs): 317 - return json.dumps( 318 - [ 319 - { 320 - "type": "meeting", 321 - "title": "All Hands", 322 - "summary": "Large meeting", 323 - "work": True, 324 - "participants": participants, 325 - "facet": "capulet", 326 - "details": "", 327 - } 328 - ] 329 - ) 330 - 331 - def mock_write_events_jsonl(**kwargs): 332 - captured.update(kwargs) 333 - return [] 334 - 335 - monkeypatch.setattr(occurrence, "generate", mock_generate) 336 - monkeypatch.setattr( 337 - occurrence, "compute_output_source", lambda context: "source.md" 338 - ) 339 - monkeypatch.setattr(occurrence, "write_events_jsonl", mock_write_events_jsonl) 340 - caplog.set_level(logging.WARNING) 341 - 342 - result = occurrence.post_process( 
343 - "x" * 60, 344 - { 345 - "name": "meetings", 346 - "day": "20240101", 347 - "meta": {}, 348 - "output_path": "ignored", 349 - }, 350 - ) 351 - 352 - assert result is None 353 - assert captured["events"] == [] 354 - assert "Dropping megameeting occurrence" in caplog.text 355 - assert "All Hands" in caplog.text 356 - assert "meetings" in caplog.text 357 - assert "26" in caplog.text 358 - 359 - 360 - def test_occurrence_post_process_keeps_meeting_with_25_participants( 361 - monkeypatch, caplog 362 - ): 363 - """Test meetings at the participant threshold are preserved.""" 364 - captured = {} 365 - event = { 366 - "type": "meeting", 367 - "title": "Planning", 368 - "summary": "Planning meeting", 369 - "work": True, 370 - "participants": [f"Person {i}" for i in range(25)], 371 - "facet": "capulet", 372 - "details": "", 373 - } 374 - 375 - monkeypatch.setattr(occurrence, "generate", lambda **kwargs: json.dumps([event])) 376 - monkeypatch.setattr( 377 - occurrence, "compute_output_source", lambda context: "source.md" 378 - ) 379 - monkeypatch.setattr( 380 - occurrence, 381 - "write_events_jsonl", 382 - lambda **kwargs: captured.update(kwargs) or [], 383 - ) 384 - caplog.set_level(logging.WARNING) 385 - 386 - result = occurrence.post_process( 387 - "x" * 60, 388 - { 389 - "name": "meetings", 390 - "day": "20240101", 391 - "meta": {}, 392 - "output_path": "ignored", 393 - }, 394 - ) 395 - 396 - assert result is None 397 - assert captured["events"] == [event] 398 - assert "Dropping megameeting occurrence" not in caplog.text 399 - 400 - 401 - def test_occurrence_post_process_keeps_non_meeting_with_large_participants_list( 402 - monkeypatch, caplog 403 - ): 404 - """Test non-meeting events are not filtered by participant count.""" 405 - captured = {} 406 - event = { 407 - "type": "message", 408 - "title": "Inbox review", 409 - "summary": "Reviewed messages", 410 - "work": True, 411 - "participants": [f"Person {i}" for i in range(100)], 412 - "facet": "capulet", 413 - 
"details": "", 414 - } 415 - 416 - monkeypatch.setattr(occurrence, "generate", lambda **kwargs: json.dumps([event])) 417 - monkeypatch.setattr( 418 - occurrence, "compute_output_source", lambda context: "source.md" 419 - ) 420 - monkeypatch.setattr( 421 - occurrence, 422 - "write_events_jsonl", 423 - lambda **kwargs: captured.update(kwargs) or [], 424 - ) 425 - caplog.set_level(logging.WARNING) 426 - 427 - result = occurrence.post_process( 428 - "x" * 60, 429 - { 430 - "name": "timeline", 431 - "day": "20240101", 432 - "meta": {}, 433 - "output_path": "ignored", 434 - }, 435 - ) 436 - 437 - assert result is None 438 - assert captured["events"] == [event] 439 - assert "Dropping megameeting occurrence" not in caplog.text 440 - 441 - 442 - def test_write_events_jsonl_skips_trailing_comma_facet(journal_copy, caplog): 443 - """Test invalid trailing punctuation facets are rejected.""" 444 - caplog.set_level(logging.WARNING) 445 - 446 - written = write_events_jsonl( 447 - events=[ 448 - { 449 - "type": "message", 450 - "title": "Chat", 451 - "summary": "Sent a chat", 452 - "work": True, 453 - "participants": [], 454 - "facet": "kognova,", 455 - "details": "", 456 - } 457 - ], 458 - agent="timeline", 459 - occurred=True, 460 - source_output="20240101/agents/timeline.md", 461 - capture_day="20240101", 462 - ) 463 - 464 - assert written == [] 465 - assert "Skipping event with unknown facet" in caplog.text 466 - assert "kognova," in caplog.text 467 - assert "timeline" in caplog.text 468 - assert "20240101/agents/timeline.md" in caplog.text 469 - assert not (journal_copy / "facets" / "kognova," / "events").exists() 470 - 471 - 472 - def test_write_events_jsonl_skips_unknown_person_facet(journal_copy, caplog): 473 - """Test unknown person-like facet names are rejected.""" 474 - caplog.set_level(logging.WARNING) 475 - 476 - written = write_events_jsonl( 477 - events=[ 478 - { 479 - "type": "message", 480 - "title": "Chat", 481 - "summary": "Sent a chat", 482 - "work": True, 483 
- "participants": [], 484 - "facet": "Person", 485 - "details": "", 486 - } 487 - ], 488 - agent="timeline", 489 - occurred=True, 490 - source_output="20240101/agents/timeline.md", 491 - capture_day="20240101", 492 - ) 493 - 494 - assert written == [] 495 - assert "Skipping event with unknown facet" in caplog.text 496 - assert "Person" in caplog.text 497 - assert not (journal_copy / "facets" / "Person" / "events").exists() 498 - 499 - 500 - def test_write_events_jsonl_skips_mixed_case_known_facet(journal_copy, caplog): 501 - """Test mixed-case facet values are rejected when not exact registry matches.""" 502 - caplog.set_level(logging.WARNING) 503 - 504 - written = write_events_jsonl( 505 - events=[ 506 - { 507 - "type": "message", 508 - "title": "Chat", 509 - "summary": "Sent a chat", 510 - "work": True, 511 - "participants": [], 512 - "facet": "Capulet", 513 - "details": "", 514 - } 515 - ], 516 - agent="timeline", 517 - occurred=True, 518 - source_output="20240101/agents/timeline.md", 519 - capture_day="20240101", 520 - ) 521 - 522 - assert written == [] 523 - assert "Skipping event with unknown facet" in caplog.text 524 - assert "Capulet" in caplog.text 525 - assert not (journal_copy / "facets" / "Capulet" / "events").exists() 526 - 527 - 528 - def test_write_events_jsonl_writes_valid_registry_facet(journal_copy): 529 - """Test valid registry facets are written normally.""" 530 - written = write_events_jsonl( 531 - events=[ 532 - { 533 - "type": "message", 534 - "title": "Chat", 535 - "summary": "Sent a chat", 536 - "work": True, 537 - "participants": [], 538 - "facet": "capulet", 539 - "details": "", 540 - } 541 - ], 542 - agent="timeline", 543 - occurred=True, 544 - source_output="20240101/agents/timeline.md", 545 - capture_day="20240101", 546 - ) 547 - 548 - jsonl_path = journal_copy / "facets" / "capulet" / "events" / "20240101.jsonl" 549 - 550 - assert written == [jsonl_path] 551 - rows = [ 552 - json.loads(line) 553 - for line in 
jsonl_path.read_text(encoding="utf-8").splitlines() 554 - if line 555 - ] 556 - assert len(rows) == 1 557 - assert rows[0]["facet"] == "capulet" 558 - assert rows[0]["agent"] == "timeline" 559 - assert rows[0]["source"] == "20240101/agents/timeline.md" 560 - 561 - 562 - def test_write_events_jsonl_skips_empty_facet(journal_copy, caplog): 563 - """Test missing facets are skipped and logged.""" 564 - caplog.set_level(logging.WARNING) 565 - 566 - written = write_events_jsonl( 567 - events=[ 568 - { 569 - "type": "message", 570 - "title": "Chat", 571 - "summary": "Sent a chat", 572 - "work": True, 573 - "participants": [], 574 - "details": "", 575 - } 576 - ], 577 - agent="timeline", 578 - occurred=True, 579 - source_output="20240101/agents/timeline.md", 580 - capture_day="20240101", 581 - ) 582 - 583 - assert written == [] 584 - assert "Skipping event with unknown facet" in caplog.text 585 - assert "timeline" in caplog.text 586 - assert not (journal_copy / "facets" / "" / "events").exists() 587 304 588 305 589 306 # =============================================================================
+1 -1
tests/test_stats_contract.py
@@ -82,7 +82,7 @@
     )
 
     (seg2 / "audio.flac").write_bytes(b"fLaC")
-    (day / "talents" / "flow.md").write_text("")
+    (day / "talents" / "schedule.json").write_text("[]")
 
     events_dir = journal / "facets" / "work" / "events"
     events_dir.mkdir(parents=True)
+16 -16
tests/test_talent_cli.py
··· 25 25 def test_collect_configs_returns_prompts(): 26 26 """All configs include known system prompts.""" 27 27 configs = _collect_configs(include_disabled=True) 28 - assert "flow" in configs 28 + assert "schedule" in configs 29 29 assert "sense" in configs 30 30 assert "chat" in configs 31 31 ··· 36 36 with_disabled = _collect_configs(include_disabled=True) 37 37 # include_disabled should return at least as many configs 38 38 assert len(with_disabled) >= len(without) 39 - assert "flow" in without 40 - assert "flow" in with_disabled 39 + assert "schedule" in without 40 + assert "schedule" in with_disabled 41 41 42 42 43 43 def test_collect_configs_filter_schedule(): ··· 128 128 129 129 # Prompt names 130 130 assert "activity" in output 131 - assert "flow" in output 131 + assert "schedule" in output 132 132 133 133 # Last run column is present 134 134 assert "LAST RUN" in output ··· 150 150 output = capsys.readouterr().out 151 151 152 152 # all agents should appear in the listing 153 - assert "flow" in output 153 + assert "schedule" in output 154 154 155 155 156 156 def test_show_prompt_known(capsys): 157 157 """Detail view shows expected fields for a known prompt.""" 158 - show_prompt("flow") 158 + show_prompt("schedule") 159 159 output = capsys.readouterr().out 160 160 161 - assert "talent/flow.md" in output 161 + assert "talent/schedule.md" in output 162 162 assert "title:" in output 163 163 assert "schedule:" in output 164 164 assert "daily" in output 165 165 assert "hook:" in output 166 - assert "occurrence" in output 166 + assert "schedule" in output 167 167 assert "variables:" in output 168 168 assert "$daily_preamble" in output 169 169 assert "body:" in output ··· 200 200 201 201 records = [json.loads(x) for x in output.strip().splitlines() if x.strip()] 202 202 files = {r["file"] for r in records} 203 - assert any("flow.md" in f for f in files) 203 + assert any("schedule.md" in f for f in files) 204 204 assert any("sense.md" in f for f in files) 205 205 
206 206 # Check a specific record has expected fields 207 - flow = next(r for r in records if "flow.md" in r["file"]) 208 - assert "title" in flow 209 - assert "schedule" in flow 207 + schedule = next(r for r in records if "schedule.md" in r["file"]) 208 + assert "title" in schedule 209 + assert "schedule" in schedule 210 210 211 211 212 212 def test_json_output_schedule_filter(capsys): ··· 221 221 222 222 def test_show_prompt_as_json(capsys): 223 223 """Detail view with --json outputs single JSONL record.""" 224 - show_prompt("flow", as_json=True) 224 + show_prompt("schedule", as_json=True) 225 225 output = capsys.readouterr().out 226 226 227 227 lines = [x for x in output.strip().splitlines() if x.strip()] 228 228 assert len(lines) == 1 229 229 230 230 record = json.loads(lines[0]) 231 - assert record["file"].endswith("flow.md") 231 + assert record["file"].endswith("schedule.md") 232 232 assert "title" in record 233 233 assert "schedule" in record 234 234 # Should not contain expanded instruction text ··· 291 291 292 292 # Too short 293 293 with pytest.raises(SystemExit): 294 - show_prompt_context("flow", day="2026") 294 + show_prompt_context("schedule", day="2026") 295 295 296 296 output = capsys.readouterr().err 297 297 assert "invalid --day format" in output.lower() 298 298 299 299 # Non-numeric 300 300 with pytest.raises(SystemExit): 301 - show_prompt_context("flow", day="abcdefgh") 301 + show_prompt_context("schedule", day="abcdefgh") 302 302 303 303 output = capsys.readouterr().err 304 304 assert "invalid --day format" in output.lower()
+56
think/_extraction_utils.py
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: AGPL-3.0-only
+# Copyright (c) 2026 sol pbc
+
+"""Generic helper for logging extraction (IncompleteJSONError) failures."""
+
+import logging
+
+
+def log_extraction_failure(e: Exception, name: str) -> None:
+    """Log enhanced diagnostics for extraction generation failures.
+
+    Handles IncompleteJSONError specially by logging a single-line summary
+    with a head+tail sample and degenerate repetition detection.
+
+    Args:
+        e: The exception from generate().
+        name: Generator name for log context.
+    """
+    from think.models import IncompleteJSONError
+
+    if not isinstance(e, IncompleteJSONError):
+        logging.error("Extraction generation failed for %s: %s", name, e)
+        return
+
+    partial = e.partial_text
+    length = len(partial)
+
+    # Build single-line head+tail sample (newlines collapsed for log grep)
+    def _collapse(s: str) -> str:
+        return s.replace("\n", "\\n").replace("\r", "")
+
+    if length <= 300:
+        sample = _collapse(partial)
+    else:
+        sample = f"{_collapse(partial[:150])} ... {_collapse(partial[-150:])}"
+
+    # Repetition detection: count unique chars in last 1000
+    tail = partial[-1000:] if length >= 1000 else partial
+    unique_count = len(set(tail))
+    repetition_flag = ""
+    if unique_count < 20:
+        repetition_flag = (
+            f" [POSSIBLE DEGENERATE REPETITION: "
+            f"{unique_count} unique chars in last {len(tail)}]"
+        )
+
+    logging.error(
+        "Extraction generation failed for %s: %s "
+        "(partial_text: %d chars, %d unique in tail%s) sample: %s",
+        name,
+        e,
+        length,
+        unique_count,
+        repetition_flag,
+        sample,
+    )
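Reviewer note: the head+tail sampling and repetition check in `log_extraction_failure` can be tried in isolation with the sketch below. It is not part of the commit. `summarize_partial` and `PartialSummary` are hypothetical names introduced here for illustration, and the sketch deliberately leaves out `think.models.IncompleteJSONError` and the `logging` call so that it runs without the project installed. The thresholds (150-character head and tail, 1000-character window, fewer than 20 unique characters) are copied from the file above.

```python
from typing import NamedTuple


class PartialSummary(NamedTuple):
    """Hypothetical container for the values the real helper logs."""

    sample: str
    unique_in_tail: int
    looks_degenerate: bool


def summarize_partial(
    partial: str,
    head_tail: int = 150,
    tail_window: int = 1000,
    min_unique: int = 20,
) -> PartialSummary:
    """Mirror the sampling logic of log_extraction_failure without logging."""

    def collapse(s: str) -> str:
        # Keep the sample on one line so it stays grep-friendly.
        return s.replace("\n", "\\n").replace("\r", "")

    if len(partial) <= 2 * head_tail:
        sample = collapse(partial)
    else:
        sample = f"{collapse(partial[:head_tail])} ... {collapse(partial[-head_tail:])}"

    # Slicing past the start of a short string simply returns the whole string.
    tail = partial[-tail_window:]
    unique = len(set(tail))
    return PartialSummary(sample, unique, unique < min_unique)


if __name__ == "__main__":
    runaway = '{"a": "' + "x" * 2000
    print(summarize_partial(runaway))
```

One consequence worth noting during review: because the check only counts unique characters in the tail, any short partial text with fewer than 20 distinct characters is also flagged, which is why the log message says "POSSIBLE".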
-215
think/hooks.py
··· 1 - # SPDX-License-Identifier: AGPL-3.0-only 2 - # Copyright (c) 2026 sol pbc 3 - 4 - """Shared utilities for output-side event hooks.""" 5 - 6 - import json 7 - import logging 8 - import os 9 - from pathlib import Path 10 - 11 - from think.facets import get_facets 12 - 13 - # Minimum content length for meaningful event extraction 14 - MIN_EXTRACTION_CHARS = 50 15 - 16 - 17 - def should_skip_extraction(result: str, context: dict) -> str | None: 18 - """Check if extraction should be skipped and return reason, or None to proceed. 19 - 20 - Args: 21 - result: The generated output markdown content. 22 - context: Hook context dict with meta and span. 23 - 24 - Returns: 25 - Skip reason string if extraction should be skipped, None otherwise. 26 - """ 27 - meta = context.get("meta", {}) 28 - 29 - # Skip if extraction disabled via journal config 30 - if meta.get("extract") is False: 31 - return "extraction disabled via journal config" 32 - 33 - # Skip for JSON output (output IS the structured data) 34 - if meta.get("output") == "json": 35 - return "JSON output (already structured)" 36 - 37 - # Skip in span mode (multiple sequential segments) 38 - if context.get("span"): 39 - return "span mode" 40 - 41 - # Skip for minimal content 42 - if len(result.strip()) < MIN_EXTRACTION_CHARS: 43 - return f"minimal content ({len(result.strip())} chars < {MIN_EXTRACTION_CHARS})" 44 - 45 - return None 46 - 47 - 48 - def log_extraction_failure(e: Exception, name: str) -> None: 49 - """Log enhanced diagnostics for extraction generation failures. 50 - 51 - Handles IncompleteJSONError specially by logging a single-line summary 52 - with a head+tail sample and degenerate repetition detection. 53 - 54 - Args: 55 - e: The exception from generate(). 56 - name: Generator name for log context. 
57 - """ 58 - from think.models import IncompleteJSONError 59 - 60 - if not isinstance(e, IncompleteJSONError): 61 - logging.error("Extraction generation failed for %s: %s", name, e) 62 - return 63 - 64 - partial = e.partial_text 65 - length = len(partial) 66 - 67 - # Build single-line head+tail sample (newlines collapsed for log grep) 68 - def _collapse(s: str) -> str: 69 - return s.replace("\n", "\\n").replace("\r", "") 70 - 71 - if length <= 300: 72 - sample = _collapse(partial) 73 - else: 74 - sample = f"{_collapse(partial[:150])} ... {_collapse(partial[-150:])}" 75 - 76 - # Repetition detection: count unique chars in last 1000 77 - tail = partial[-1000:] if length >= 1000 else partial 78 - unique_count = len(set(tail)) 79 - repetition_flag = "" 80 - if unique_count < 20: 81 - repetition_flag = ( 82 - f" [POSSIBLE DEGENERATE REPETITION: " 83 - f"{unique_count} unique chars in last {len(tail)}]" 84 - ) 85 - 86 - logging.error( 87 - "Extraction generation failed for %s: %s " 88 - "(partial_text: %d chars, %d unique in tail%s) sample: %s", 89 - name, 90 - e, 91 - length, 92 - unique_count, 93 - repetition_flag, 94 - sample, 95 - ) 96 - 97 - 98 - def write_events_jsonl( 99 - events: list[dict], 100 - agent: str, 101 - occurred: bool, 102 - source_output: str, 103 - capture_day: str, 104 - ) -> list[Path]: 105 - """Write events to facet-based JSONL files. 106 - 107 - Groups events by facet and writes each to the appropriate file: 108 - facets/{facet}/events/{event_day}.jsonl 109 - 110 - Args: 111 - events: List of event dictionaries from extraction. 112 - agent: Source generator agent (e.g., "meetings", "flow"). 113 - occurred: True for occurrence rows, False for future-dated event rows. 114 - source_output: Relative path to source output file. 115 - capture_day: Day the output was captured (YYYYMMDD). 116 - 117 - Returns: 118 - List of paths to written JSONL files. 
119 - """ 120 - from think.utils import get_journal 121 - 122 - journal = get_journal() 123 - known_facets = set(get_facets().keys()) 124 - 125 - # Group events by (facet, event_day) 126 - grouped: dict[tuple[str, str], list[dict]] = {} 127 - 128 - for event in events: 129 - raw_facet = event.get("facet", "") 130 - facet = raw_facet.strip().lower() 131 - if facet not in known_facets or raw_facet != facet: 132 - logging.warning( 133 - "Skipping event with unknown facet: facet=%r agent=%s source=%s", 134 - raw_facet, 135 - agent, 136 - source_output, 137 - ) 138 - continue 139 - 140 - # Determine the event day 141 - if occurred: 142 - # Occurrences use capture day 143 - event_day = capture_day 144 - else: 145 - # Future-dated event rows use their scheduled date 146 - event_date = event.get("date", "") 147 - # Convert YYYY-MM-DD to YYYYMMDD 148 - event_day = event_date.replace("-", "") if event_date else capture_day 149 - 150 - if not event_day: 151 - continue 152 - 153 - key = (facet, event_day) 154 - if key not in grouped: 155 - grouped[key] = [] 156 - 157 - # Enrich event with metadata 158 - enriched = dict(event) 159 - enriched["agent"] = agent 160 - enriched["occurred"] = occurred 161 - enriched["source"] = source_output 162 - 163 - grouped[key].append(enriched) 164 - 165 - # Write each group to its JSONL file 166 - written_paths: list[Path] = [] 167 - 168 - for (facet, event_day), facet_events in grouped.items(): 169 - events_dir = Path(journal) / "facets" / facet / "events" 170 - events_dir.mkdir(parents=True, exist_ok=True) 171 - 172 - jsonl_path = events_dir / f"{event_day}.jsonl" 173 - with open(jsonl_path, "a", encoding="utf-8") as f: 174 - for event in facet_events: 175 - f.write(json.dumps(event, ensure_ascii=False) + "\n") 176 - 177 - written_paths.append(jsonl_path) 178 - 179 - return written_paths 180 - 181 - 182 - def compute_output_source(context: dict) -> str: 183 - """Compute relative source output path from hook context. 
184 - 185 - Args: 186 - context: Hook context dict with day, segment, name, output_path, meta. 187 - 188 - Returns: 189 - Relative path like "20240101/talents/meetings.md". 190 - """ 191 - from think.talent import get_output_name 192 - from think.utils import CHRONICLE_DIR, get_journal 193 - 194 - day = context.get("day", "") 195 - output_path = context.get("output_path", "") 196 - name = context.get("name", "unknown") 197 - journal = get_journal() 198 - 199 - try: 200 - rel = os.path.relpath(output_path, journal).replace("\\", "/") 201 - return rel.removeprefix(f"{CHRONICLE_DIR}/") 202 - except ValueError: 203 - segment = context.get("segment") 204 - output_name = get_output_name(name) 205 - # Check for facet in meta (for multi-facet talents) 206 - meta = context.get("meta", {}) 207 - facet = meta.get("facet") if meta else None 208 - filename = f"{output_name}.md" 209 - if segment and facet: 210 - return os.path.join(day, segment, "talents", facet, filename) 211 - if segment: 212 - return os.path.join(day, segment, "talents", filename) 213 - if facet: 214 - return os.path.join(day, "talents", facet, filename) 215 - return os.path.join(day, "talents", filename)
+1 -1
think/talents.py
··· 1276 1276 "ts": now_ms(), 1277 1277 } 1278 1278 if isinstance(e, IncompleteJSONError): 1279 - from think.hooks import log_extraction_failure 1279 + from think._extraction_utils import log_extraction_failure 1280 1280 1281 1281 event["partial_text_length"] = len(e.partial_text) 1282 1282 event["partial_text_tail"] = e.partial_text[-500:]