personal memory agent

activity_state: move since tracking from LLM to tooling

The LLM produced `since` values in 8+ inconsistent formats and broke
continuity ~40% of the time. Restructure so the LLM only reports
semantic state (continuing/new/ended) and the post-hook stamps `since`
from the previous segment's stored output.

- Flatten output schema from {active:[], ended:[]} to a single array
with per-item `state` field
- Add post-hook that resolves `since` by matching against previous
segment, with rapidfuzz disambiguation for concurrent same-type
activities
- Remove `since` from pre-hook context shown to LLM
- Add active-vs-visible rule and explicit ended-reporting guidance
- 35 tests (was 22)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
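
The division of labor described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the code in this commit: `stamp_since` is a hypothetical name, and it matches by activity type only, leaving out the fuzzy description matching.

```python
def stamp_since(items, prev_active, segment):
    """Hypothetical helper: attach `since` to LLM items from stored state."""
    claimed = set()  # indices of previous items already matched
    resolved = []
    for item in items:
        state = item.get("state", "new")
        since = segment  # default: the activity starts in the current segment
        if state in ("continuing", "ended"):
            # Take the first unclaimed previous activity of the same type.
            for idx, prev in enumerate(prev_active):
                if idx not in claimed and prev.get("activity") == item.get("activity"):
                    claimed.add(idx)
                    since = prev.get("since", segment)
                    break
        resolved.append(
            {
                "activity": item.get("activity", ""),
                # The LLM's continuing/new both collapse to the stored "active".
                "state": "ended" if state == "ended" else "active",
                "since": since,
                "description": item.get("description", ""),
            }
        )
    return resolved
```

The LLM never sees or emits a segment key here; the only source of `since` is the previous segment's stored output or the current segment key.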

+635 -125
+28 -63
muse/activity_state.md
···
   "priority": 95,
   "multi_facet": true,
   "output": "json",
-  "hook": {"pre": "activity_state"},
+  "hook": {"pre": "activity_state", "post": "activity_state"},
   "tier": 3,
   "thinking_budget": 2048,
   "max_output_tokens": 512,
···

 ## Task

-Analyze the current segment and determine:
-- Which configured activities are happening at the END of this segment (active)
-- Which previously active activities ended DURING this segment (ended)
-
-Consider continuity: if an activity was active in the previous segment and you see evidence of it continuing, keep tracking it with the same `since` value and update the description if context has evolved.
+Analyze the current segment and determine the state of each detected activity:
+- **continuing** - Was active in the previous segment AND you see evidence it is still happening
+- **new** - Just started in this segment (or same type restarted — e.g., one meeting ended, another began)
+- **ended** - Was active in the previous segment but stopped during this segment

 ## Output Format

-Return a JSON object with two arrays:
+Return a JSON array of activity objects:

 ```json
-{
-  "active": [
-    {
-      "activity": "meeting",
-      "since": "143000_300",
-      "description": "Design review with UX team discussing navigation patterns",
-      "level": "high"
-    }
-  ],
-  "ended": [
-    {
-      "activity": "email",
-      "since": "140000_300",
-      "description": "Replied to deployment notification from ops team"
-    }
-  ]
-}
+[
+  {"activity": "meeting", "state": "continuing", "description": "Design review with UX team, now discussing navigation", "level": "high"},
+  {"activity": "messaging", "state": "new", "description": "Slack thread about deployment", "level": "low"},
+  {"activity": "email", "state": "ended", "description": "Replied to deployment notification from ops team"}
+]
 ```

 ### Field Definitions

-**active** - Activities ongoing at segment end:
 - `activity`: Activity ID from the configured list
-- `since`: Segment key when this activity instance started (copy from previous if continuing, use current segment if new)
+- `state`: One of `"continuing"`, `"new"`, or `"ended"`
 - `description`: Brief description of what this activity involves (update as context evolves)
-- `level`: Engagement level - "high" (primary focus), "medium" (secondary), "low" (background)
-
-**ended** - Activities that stopped during this segment:
-- `activity`: Activity ID
-- `since`: When this instance started (for duration tracking)
-- `description`: Final summary of what the activity was
+- `level`: Engagement level — `"high"` (primary focus), `"medium"` (secondary), `"low"` (background). Only for continuing/new, omit for ended.

 ## Rules

-1. **Only detect configured activities** - Ignore activity that doesn't match the facet's list
-2. **One instance per type** - If a meeting ends and another starts, the first goes to `ended`, the new one to `active`
-3. **Preserve `since`** - For continuing activities, keep the original start segment
-4. **Update descriptions** - As activities continue, refine the description with new context
-5. **Empty is valid** - `{"active": [], "ended": []}` is correct when no activities detected
+1. **Only detect configured activities** — Ignore activity that doesn't match the facet's list
+2. **Active vs. visible** — Only report an activity if the user is actively interacting with it during this segment. An application merely visible on screen but unchanged is NOT active. Look for evidence of interaction: typing, clicking, new content, spoken discussion.
+3. **Report endings** — If a previously active activity is no longer happening, always report it as `"ended"` so it can be tracked
+4. **Same-type transitions** — If a meeting ends and a different meeting starts, report both: the old one as `"ended"` and the new one as `"new"`
+5. **Update descriptions** — As activities continue, refine the description with new context
+6. **Empty is valid** — `[]` is correct when no activities are detected

 ## Examples

 **New activity starts:**
 ```json
-{
-  "active": [{"activity": "coding", "since": "143500_300", "description": "Implementing user auth flow", "level": "high"}],
-  "ended": []
-}
+[{"activity": "coding", "state": "new", "description": "Implementing user auth flow", "level": "high"}]
 ```

 **Activity continues from previous:**
 ```json
-{
-  "active": [{"activity": "meeting", "since": "140000_300", "description": "Sprint planning - now discussing blockers", "level": "high"}],
-  "ended": []
-}
+[{"activity": "meeting", "state": "continuing", "description": "Sprint planning - now discussing blockers", "level": "high"}]
 ```

-**One activity ends, another starts (same type):**
+**One meeting ends, another starts:**
 ```json
-{
-  "active": [{"activity": "meeting", "since": "144500_300", "description": "1:1 with manager", "level": "high"}],
-  "ended": [{"activity": "meeting", "since": "140000_300", "description": "Sprint planning completed"}]
-}
-```
-
-**Multiple concurrent activities:**
-```json
-{
-  "active": [
-    {"activity": "meeting", "since": "143000_300", "description": "Team standup", "level": "high"},
-    {"activity": "messaging", "since": "143000_300", "description": "Slack thread about deployment", "level": "low"}
-  ],
-  "ended": []
-}
+[
+  {"activity": "meeting", "state": "ended", "description": "Sprint planning completed"},
+  {"activity": "meeting", "state": "new", "description": "1:1 with manager", "level": "high"}
+]
 ```

 **No activities detected:**
 ```json
-{"active": [], "ended": []}
+[]
 ```

-Return ONLY the JSON object, no other text.
+Return ONLY the JSON array, no other text.
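
The field definitions in the new prompt could be checked mechanically before the output reaches the post-hook. The sketch below is an assumption for illustration only: `validate_activity_output` is a hypothetical helper that does not appear in this commit, and the commit's own post-hook is more lenient (it treats any unrecognized state as `new`).

```python
import json

ALLOWED_STATES = ("continuing", "new", "ended")
ALLOWED_LEVELS = ("high", "medium", "low")


def validate_activity_output(raw):
    """Hypothetical check: return a list of problems; empty means schema-conformant."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(items, list):
        return ["top-level value is not an array"]

    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        if not item.get("activity"):
            problems.append(f"item {i}: missing activity")
        state = item.get("state")
        if state not in ALLOWED_STATES:
            problems.append(f"item {i}: unexpected state {state!r}")
        if "since" in item:
            # The prompt no longer asks for `since`; tooling stamps it later.
            problems.append(f"item {i}: `since` should not come from the LLM")
        level = item.get("level")
        if state == "ended":
            if level is not None:
                problems.append(f"item {i}: level should be omitted for ended")
        elif level not in ALLOWED_LEVELS:
            problems.append(f"item {i}: unexpected level {level!r}")
    return problems
```

A check like this would mainly be useful for measuring how often the model drifts from the schema, rather than for rejecting output outright.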
+200 -13
muse/activity_state.py
···
 # SPDX-License-Identifier: AGPL-3.0-only
 # Copyright (c) 2026 sol pbc

-"""Pre-hook for activity_state generator.
+"""Pre/post hooks for activity_state generator.

-Builds context for activity detection by:
+Pre-hook builds context for activity detection by:
 1. Loading the facet's configured activities
 2. Finding and loading previous segment's activity state
 3. Formatting both as context for the prompt
+
+Post-hook resolves timing metadata:
+1. Stamps `since` (segment key) from tooling — never from LLM
+2. Normalizes `state` from LLM values (continuing/new) to stored values (active)
+3. Matches continuing/ended activities to previous state via activity type + fuzzy description
 """

 import json
···

 def load_previous_state(
     day: str, segment: str, facet: str
-) -> tuple[dict | None, str | None]:
+) -> tuple[list | None, str | None]:
     """Load activity state from a previous segment.

-    Returns tuple of (state, segment_key) where state is the parsed JSON
-    or None if not found/invalid.
+    Returns tuple of (state_list, segment_key) where state_list is the
+    parsed JSON array or None if not found/invalid.
     """
     state_path = day_path(day) / segment / f"activity_state_{facet}.json"
     if not state_path.exists():
···
             return None, segment

         data = json.loads(content)
-        return data, segment
+        if isinstance(data, list):
+            return data, segment
+        # Unexpected format
+        logger.warning("activity_state is not an array: %s", state_path)
+        return None, segment
     except (json.JSONDecodeError, OSError) as e:
         logger.warning("Failed to load previous state from %s: %s", state_path, e)
         return None, segment
···


 def format_previous_state(
-    state: dict | None,
+    state: list | None,
     segment: str | None,
     current_segment: str,
     timed_out: bool,
···
     """Format previous state as context for the prompt.

     Args:
-        state: Previous segment's activity state dict
+        state: Previous segment's activity state list (flat array)
         segment: Previous segment key
         current_segment: Current segment key
         timed_out: Whether the gap exceeded timeout threshold
···

     lines = [f"## Previous State (from {segment}){time_note}", ""]

-    active = state.get("active", [])
-    ended = state.get("ended", [])
+    active = [item for item in state if item.get("state") == "active"]
+    ended = [item for item in state if item.get("state") == "ended"]

     if active:
         lines.append("**Active activities (may be continuing):**")
         for item in active:
             activity_id = item.get("activity", "")
-            since = item.get("since", "")
             description = item.get("description", "")
             level = item.get("level", "")

             parts = [f"- {activity_id}"]
-            if since:
-                parts.append(f"(since {since})")
             if level:
                 parts.append(f"[{level}]")
             if description:
···
         lines.append("No activities were detected in the previous segment.")

     return "\n".join(lines)
+
+
+# ---------------------------------------------------------------------------
+# Pre-hook
+# ---------------------------------------------------------------------------


 def pre_process(context: dict) -> dict | None:
···
     ]

     return {"transcript": "\n".join(enriched_parts)}
+
+
+# ---------------------------------------------------------------------------
+# Post-hook
+# ---------------------------------------------------------------------------
+
+
+def _find_best_match(
+    activity_id: str,
+    description: str,
+    candidates: list[tuple[int, dict]],
+) -> tuple[int, dict] | None:
+    """Find the best matching previous activity by type, then description.
+
+    If only one candidate matches the activity type, returns it directly.
+    If multiple match (rare — concurrent same-type activities), uses fuzzy
+    description matching to pick the best one.
+
+    Args:
+        activity_id: Activity type to match (e.g., "meeting")
+        description: Description from LLM output for fuzzy matching
+        candidates: List of (index, item) tuples from previous active state
+
+    Returns:
+        (index, item) tuple of the best match, or None if no match found.
+    """
+    matches = [(i, c) for i, c in candidates if c.get("activity") == activity_id]
+
+    if not matches:
+        return None
+    if len(matches) == 1:
+        return matches[0]
+
+    # Multiple same-type activities — use fuzzy matching on description
+    if not description:
+        return matches[0]
+
+    try:
+        from rapidfuzz import fuzz, process
+
+        # Build list of (description, index into matches) for fuzzy comparison.
+        # Using a list avoids key collision when two activities share a description.
+        desc_list: list[tuple[str, int]] = []
+        for mi, (idx, m) in enumerate(matches):
+            desc = m.get("description", "")
+            if desc:
+                desc_list.append((desc, mi))
+
+        if not desc_list:
+            return matches[0]
+
+        result = process.extractOne(
+            description,
+            [d for d, _ in desc_list],
+            scorer=fuzz.token_sort_ratio,
+        )
+        if result:
+            _matched_str, _score, list_idx = result
+            _, mi = desc_list[list_idx]
+            return matches[mi]
+    except ImportError:
+        pass
+
+    return matches[0]
+
+
+def post_process(result: str, context: dict) -> str | None:
+    """Resolve timing metadata on LLM activity state output.
+
+    Stamps `since` field from tooling (never from LLM) and normalizes
+    state values from LLM format (continuing/new/ended) to stored format
+    (active/ended).
+
+    Args:
+        result: Raw LLM JSON output (flat array of activities)
+        context: HookContext with day, segment, output_path, meta
+
+    Returns:
+        Transformed JSON string with since fields resolved,
+        or None to keep original on error.
+    """
+    segment = context.get("segment")
+    if not segment:
+        logger.warning("activity_state post-hook requires segment")
+        return None
+
+    day = context.get("day")
+    output_path = context.get("output_path", "")
+
+    # Parse LLM output
+    try:
+        items = json.loads(result.strip())
+    except (json.JSONDecodeError, ValueError) as e:
+        logger.warning("Failed to parse activity_state LLM output: %s", e)
+        return None
+
+    if not isinstance(items, list):
+        logger.warning("activity_state output is not an array")
+        return None
+
+    # Load previous state for since resolution
+    prev_active: list[dict] = []
+    if day:
+        facet = _extract_facet_from_output_path(output_path)
+        if facet:
+            previous_segment = find_previous_segment(day, segment)
+            if previous_segment:
+                prev_state, _ = load_previous_state(day, previous_segment, facet)
+                if prev_state:
+                    prev_active = [
+                        item for item in prev_state if item.get("state") == "active"
+                    ]
+
+    # Track which previous items have been claimed to avoid double-matching
+    claimed: set[int] = set()
+
+    resolved: list[dict] = []
+    for item in items:
+        activity_id = item.get("activity", "")
+        state = item.get("state", "new")
+        description = item.get("description", "")
+
+        # Build unclaimed candidates with their original indices
+        unclaimed = [(i, c) for i, c in enumerate(prev_active) if i not in claimed]
+
+        if state == "continuing":
+            result = _find_best_match(activity_id, description, unclaimed)
+            if result:
+                idx, matched = result
+                claimed.add(idx)
+                since = matched.get("since", segment)
+            else:
+                # No previous match — treat as new
+                since = segment
+
+            resolved.append(
+                {
+                    "activity": activity_id,
+                    "state": "active",
+                    "since": since,
+                    "description": description,
+                    "level": item.get("level", "medium"),
+                }
+            )
+
+        elif state == "ended":
+            result = _find_best_match(activity_id, description, unclaimed)
+            if result:
+                idx, matched = result
+                claimed.add(idx)
+                since = matched.get("since", segment)
+            else:
+                since = segment
+
+            resolved.append(
+                {
+                    "activity": activity_id,
+                    "state": "ended",
+                    "since": since,
+                    "description": description,
+                }
+            )
+
+        else:
+            # "new" or any unrecognized state — stamp current segment
+            resolved.append(
+                {
+                    "activity": activity_id,
+                    "state": "active",
+                    "since": segment,
+                    "description": description,
+                    "level": item.get("level", "medium"),
+                }
+            )
+
+    return json.dumps(resolved, ensure_ascii=False)
+407 -49
tests/test_activity_state.py
···
 # SPDX-License-Identifier: AGPL-3.0-only
 # Copyright (c) 2026 sol pbc

-"""Tests for the activity_state pre-hook module."""
+"""Tests for the activity_state pre/post hook module."""

 import json
 import os
···
             os.environ["JOURNAL_PATH"] = tmpdir

             try:
-                # Create state file
+                # Create state file (new flat format)
                 segment_dir = Path(tmpdir) / "20260130" / "100000_300"
                 segment_dir.mkdir(parents=True)

-                state = {
-                    "active": [
-                        {
-                            "activity": "meeting",
-                            "since": "100000_300",
-                            "description": "Standup",
-                            "level": "high",
-                        }
-                    ],
-                    "ended": [],
-                }
+                state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "100000_300",
+                        "description": "Standup",
+                        "level": "high",
+                    }
+                ]
                 (segment_dir / "activity_state_work.json").write_text(json.dumps(state))

                 loaded, segment = load_previous_state("20260130", "100000_300", "work")
                 assert segment == "100000_300"
-                assert loaded["active"][0]["activity"] == "meeting"
+                assert isinstance(loaded, list)
+                assert loaded[0]["activity"] == "meeting"

             finally:
                 if original_path:
···
                 loaded, segment = load_previous_state("20260130", "100000_300", "work")
                 assert loaded is None
                 assert segment is None
+
+            finally:
+                if original_path:
+                    os.environ["JOURNAL_PATH"] = original_path
+
+    def test_rejects_non_array(self):
+        from muse.activity_state import load_previous_state
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            original_path = os.environ.get("JOURNAL_PATH")
+            os.environ["JOURNAL_PATH"] = tmpdir
+
+            try:
+                segment_dir = Path(tmpdir) / "20260130" / "100000_300"
+                segment_dir.mkdir(parents=True)
+
+                # Write a dict (old format) — should be rejected
+                (segment_dir / "activity_state_work.json").write_text(
+                    '{"active": [], "ended": []}'
+                )
+
+                loaded, segment = load_previous_state("20260130", "100000_300", "work")
+                assert loaded is None
+                assert segment == "100000_300"

             finally:
                 if original_path:
···
     def test_formats_active_activities(self):
         from muse.activity_state import format_previous_state

-        state = {
-            "active": [
-                {
-                    "activity": "meeting",
-                    "since": "100000_300",
-                    "description": "Team standup",
-                    "level": "high",
-                }
-            ],
-            "ended": [],
-        }
+        state = [
+            {
+                "activity": "meeting",
+                "state": "active",
+                "since": "100000_300",
+                "description": "Team standup",
+                "level": "high",
+            }
+        ]

         result = format_previous_state(
             state, "100000_300", "100500_300", timed_out=False
         )
         assert "Previous State" in result
         assert "meeting" in result
-        assert "since 100000_300" in result
         assert "Team standup" in result
+        # since should NOT appear in context shown to LLM
+        assert "since" not in result

     def test_formats_ended_activities(self):
         from muse.activity_state import format_previous_state

-        state = {
-            "active": [],
-            "ended": [
-                {
-                    "activity": "email",
-                    "since": "093000_300",
-                    "description": "Replied to boss",
-                }
-            ],
-        }
+        state = [
+            {
+                "activity": "email",
+                "state": "ended",
+                "since": "093000_300",
+                "description": "Replied to boss",
+            }
+        ]

         result = format_previous_state(
             state, "100000_300", "100500_300", timed_out=False
···
     def test_handles_timeout(self):
         from muse.activity_state import format_previous_state

-        state = {"active": [{"activity": "meeting"}], "ended": []}
+        state = [{"activity": "meeting", "state": "active"}]
         result = format_previous_state(
             state, "100000_300", "120000_300", timed_out=True
         )
···

         result = format_previous_state(None, None, "100000_300", timed_out=False)
         assert "No previous segment state" in result
+
+    def test_handles_empty_list(self):
+        from muse.activity_state import format_previous_state
+
+        result = format_previous_state([], "100000_300", "100500_300", timed_out=False)
+        assert "No activities were detected" in result


 class TestPreProcess:
···
                     '{"id": "meeting"}\n{"id": "coding"}'
                 )

-                # Create previous state
-                prev_state = {
-                    "active": [
-                        {
-                            "activity": "meeting",
-                            "since": "100000_300",
-                            "description": "Standup",
-                            "level": "high",
-                        }
-                    ],
-                    "ended": [],
-                }
+                # Create previous state (new flat format)
+                prev_state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "100000_300",
+                        "description": "Standup",
+                        "level": "high",
+                    }
+                ]
                 (day_dir / "100000_300" / "activity_state_work.json").write_text(
                     json.dumps(prev_state)
                 )
···
             "output_path": "/path/to/something_else.json",
         }
         assert pre_process(context) is None
+
+
+class TestPostProcess:
+    """Tests for the post_process hook function."""
+
+    def test_new_activity_gets_current_segment(self):
+        from muse.activity_state import post_process
+
+        llm_output = json.dumps(
+            [
+                {
+                    "activity": "coding",
+                    "state": "new",
+                    "description": "Writing tests",
+                    "level": "high",
+                }
+            ]
+        )
+
+        result = post_process(llm_output, {"segment": "143000_300"})
+        assert result is not None
+        items = json.loads(result)
+        assert len(items) == 1
+        assert items[0]["state"] == "active"
+        assert items[0]["since"] == "143000_300"
+        assert items[0]["level"] == "high"
+
+    def test_continuing_activity_copies_since(self):
+        from muse.activity_state import post_process
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            original_path = os.environ.get("JOURNAL_PATH")
+            os.environ["JOURNAL_PATH"] = tmpdir
+
+            try:
+                day_dir = Path(tmpdir) / "20260130"
+                day_dir.mkdir()
+
+                # Previous segment with active meeting
+                prev_dir = day_dir / "100000_300"
+                prev_dir.mkdir()
+                prev_state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "093000_300",
+                        "description": "Sprint planning",
+                        "level": "high",
+                    }
+                ]
+                (prev_dir / "activity_state_work.json").write_text(
+                    json.dumps(prev_state)
+                )
+
+                # Current segment
+                (day_dir / "100500_300").mkdir()
+
+                llm_output = json.dumps(
+                    [
+                        {
+                            "activity": "meeting",
+                            "state": "continuing",
+                            "description": "Sprint planning - discussing blockers",
+                            "level": "high",
+                        }
+                    ]
+                )
+
+                context = {
+                    "day": "20260130",
+                    "segment": "100500_300",
+                    "output_path": f"{tmpdir}/20260130/100500_300/activity_state_work.json",
+                }
+
+                result = post_process(llm_output, context)
+                items = json.loads(result)
+                assert items[0]["state"] == "active"
+                assert items[0]["since"] == "093000_300"  # Copied from previous
+
+            finally:
+                if original_path:
+                    os.environ["JOURNAL_PATH"] = original_path
+
+    def test_ended_activity_copies_since(self):
+        from muse.activity_state import post_process
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            original_path = os.environ.get("JOURNAL_PATH")
+            os.environ["JOURNAL_PATH"] = tmpdir
+
+            try:
+                day_dir = Path(tmpdir) / "20260130"
+                day_dir.mkdir()
+
+                # Previous segment with active meeting
+                prev_dir = day_dir / "100000_300"
+                prev_dir.mkdir()
+                prev_state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "093000_300",
+                        "description": "Sprint planning",
+                        "level": "high",
+                    }
+                ]
+                (prev_dir / "activity_state_work.json").write_text(
+                    json.dumps(prev_state)
+                )
+
+                (day_dir / "100500_300").mkdir()
+
+                llm_output = json.dumps(
+                    [
+                        {
+                            "activity": "meeting",
+                            "state": "ended",
+                            "description": "Sprint planning completed",
+                        }
+                    ]
+                )
+
+                context = {
+                    "day": "20260130",
+                    "segment": "100500_300",
+                    "output_path": f"{tmpdir}/20260130/100500_300/activity_state_work.json",
+                }
+
+                result = post_process(llm_output, context)
+                items = json.loads(result)
+                assert items[0]["state"] == "ended"
+                assert items[0]["since"] == "093000_300"
+                assert "level" not in items[0]
+
+            finally:
+                if original_path:
+                    os.environ["JOURNAL_PATH"] = original_path
+
+    def test_no_previous_state_all_new(self):
+        from muse.activity_state import post_process
+
+        llm_output = json.dumps(
+            [
+                {
+                    "activity": "coding",
+                    "state": "continuing",
+                    "description": "Writing code",
+                    "level": "high",
+                },
+                {
+                    "activity": "meeting",
+                    "state": "ended",
+                    "description": "Standup ended",
+                },
+            ]
+        )
+
+        # No day/output_path — no previous state available
+        result = post_process(llm_output, {"segment": "143000_300"})
+        items = json.loads(result)
+
+        # "continuing" with no match falls back to current segment
+        assert items[0]["since"] == "143000_300"
+        assert items[0]["state"] == "active"
+
+        # "ended" with no match also uses current segment
+        assert items[1]["since"] == "143000_300"
+        assert items[1]["state"] == "ended"
+
+    def test_empty_array_passthrough(self):
+        from muse.activity_state import post_process
+
+        result = post_process("[]", {"segment": "143000_300"})
+        assert result is not None
+        assert json.loads(result) == []
+
+    def test_malformed_json_returns_none(self):
+        from muse.activity_state import post_process
+
+        result = post_process("not json", {"segment": "143000_300"})
+        assert result is None
+
+    def test_non_array_returns_none(self):
+        from muse.activity_state import post_process
+
+        result = post_process('{"active": []}', {"segment": "143000_300"})
+        assert result is None
+
+    def test_missing_segment_returns_none(self):
+        from muse.activity_state import post_process
+
+        result = post_process("[]", {})
+        assert result is None
+
+    def test_same_type_transition_end_and_new(self):
+        """One meeting ends, another starts — both get correct since."""
+        from muse.activity_state import post_process
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            original_path = os.environ.get("JOURNAL_PATH")
+            os.environ["JOURNAL_PATH"] = tmpdir
+
+            try:
+                day_dir = Path(tmpdir) / "20260130"
+                day_dir.mkdir()
+
+                prev_dir = day_dir / "100000_300"
+                prev_dir.mkdir()
+                prev_state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "093000_300",
+                        "description": "Sprint planning",
+                        "level": "high",
+                    }
+                ]
+                (prev_dir / "activity_state_work.json").write_text(
+                    json.dumps(prev_state)
+                )
+
+                (day_dir / "100500_300").mkdir()
+
+                llm_output = json.dumps(
+                    [
+                        {
+                            "activity": "meeting",
+                            "state": "ended",
+                            "description": "Sprint planning completed",
+                        },
+                        {
+                            "activity": "meeting",
+                            "state": "new",
+                            "description": "1:1 with manager",
+                            "level": "high",
+                        },
+                    ]
+                )
+
+                context = {
+                    "day": "20260130",
+                    "segment": "100500_300",
+                    "output_path": f"{tmpdir}/20260130/100500_300/activity_state_work.json",
+                }
+
+                result = post_process(llm_output, context)
+                items = json.loads(result)
+
+                ended = [i for i in items if i["state"] == "ended"]
+                active = [i for i in items if i["state"] == "active"]
+
+                assert len(ended) == 1
+                assert ended[0]["since"] == "093000_300"  # From previous
+
+                assert len(active) == 1
+                assert active[0]["since"] == "100500_300"  # Current segment
+
+            finally:
+                if original_path:
+                    os.environ["JOURNAL_PATH"] = original_path
+
+    def test_default_level_for_new(self):
+        """New activity without level gets default 'medium'."""
+        from muse.activity_state import post_process
+
+        llm_output = json.dumps(
+            [{"activity": "coding", "state": "new", "description": "Writing code"}]
+        )
+
+        result = post_process(llm_output, {"segment": "143000_300"})
+        items = json.loads(result)
+        assert items[0]["level"] == "medium"
+
+    def test_fuzzy_match_disambiguates_same_type(self):
+        """Multiple same-type previous activities matched by description."""
+        from muse.activity_state import post_process
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            original_path = os.environ.get("JOURNAL_PATH")
+            os.environ["JOURNAL_PATH"] = tmpdir
+
+            try:
+                day_dir = Path(tmpdir) / "20260130"
+                day_dir.mkdir()
+
+                prev_dir = day_dir / "100000_300"
+                prev_dir.mkdir()
+                prev_state = [
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "090000_300",
+                        "description": "Sprint planning with engineering team",
+                        "level": "high",
+                    },
+                    {
+                        "activity": "meeting",
+                        "state": "active",
+                        "since": "093000_300",
+                        "description": "Customer support standup",
+                        "level": "medium",
+                    },
+                ]
+                (prev_dir / "activity_state_work.json").write_text(
+                    json.dumps(prev_state)
+                )
+
+                (day_dir / "100500_300").mkdir()
+
+                llm_output = json.dumps(
+                    [
+                        {
+                            "activity": "meeting",
+                            "state": "continuing",
+                            "description": "Customer support standup - discussing tickets",
+                            "level": "medium",
+                        }
+                    ]
+                )
+
+                context = {
+                    "day": "20260130",
+                    "segment": "100500_300",
+                    "output_path": f"{tmpdir}/20260130/100500_300/activity_state_work.json",
+                }
+
+                result = post_process(llm_output, context)
+                items = json.loads(result)
+                # Should match the standup (093000_300), not sprint planning
+                assert items[0]["since"] == "093000_300"
+
+            finally:
+                if original_path:
+                    os.environ["JOURNAL_PATH"] = original_path
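
Most of the tests above repeat the same set-up: create a temporary directory, point `JOURNAL_PATH` at it, and restore the variable in `finally`. One possible way to factor that out is a small context manager. This is a sketch under assumptions, not part of the commit: `temp_journal` is a hypothetical name, and unlike the tests shown it also removes `JOURNAL_PATH` afterwards when the variable was not set to begin with.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temp_journal():
    """Hypothetical helper: point JOURNAL_PATH at a temp dir, then restore it."""
    original = os.environ.get("JOURNAL_PATH")
    with tempfile.TemporaryDirectory() as tmpdir:
        os.environ["JOURNAL_PATH"] = tmpdir
        try:
            yield Path(tmpdir)
        finally:
            if original is None:
                # Variable was unset before: remove it instead of leaving
                # it pointing at a directory that is about to be deleted.
                os.environ.pop("JOURNAL_PATH", None)
            else:
                os.environ["JOURNAL_PATH"] = original
```

A test body would then start with `with temp_journal() as journal:` and build its day and segment directories under `journal`.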