story hook: add talent attribution; normalize topics; clamp confidence

-265

docs/design/story-talent-refactor.md

···
- # Story-Talent Refactor
- This refactor replaces the old storyteller span-row write path with activity-record
- story merges. It is a clean break:
- - storytellers stop writing `facets/*/spans/*.jsonl`
- - story content lives on the activity record itself
- - `think/activities.py` remains the only writer for activity records
- - priority ordering, not extra locking, serializes participation before story
- ## 1. New writer: `merge_story_fields`
- Location:
- - add `merge_story_fields(...)` in `think/activities.py`
- - place it next to `update_record_fields()` and `update_activity_record()`
- - this is the only activity-record write added by the refactor, so L2 stays
-   satisfied inside the domain owner
- Signature and docstring:
- ```python
- def merge_story_fields(
-     facet: str,
-     day: str,
-     record_id: str,
-     *,
-     story: dict,
-     commitments: list[dict],
-     closures: list[dict],
-     decisions: list[dict],
-     actor: str,
-     note: str | None = None,
- ) -> bool:
-     """Replace story-derived fields on an activity record and append one edit."""
- ```
- Behavior and return semantics:
- - use one `locked_modify(...)` call only, following the same pattern as
-   `update_activity_record()` and `_set_activity_hidden_state()`
-   (`think/activities.py:762-790`, `1046-1089`, `1092-1133`)
- - inside the callback, find the record with the same `record.get("id") == record_id`
-   match used by `update_activity_record()`
- - if found:
-   - normalize the current record
-   - replace `story`, `commitments`, `closures`, and `decisions` wholesale
-   - call `append_edit(...)` exactly once with
-     `fields=["story", "commitments", "closures", "decisions"]`
-   - pass through `actor`
-   - pass through `note`
-   - return `True`
- - if the day file is missing or the record is absent:
-   - log a warning
-   - return `False`
-   - do not raise
- Why this shape:
- - `update_activity_record()` is intentionally narrow and only allows
-   `title`, `description`, and `details` (`think/activities.py:1046-1062`)
- - the CLI mirrors that exact scope
-   (`apps/activities/call.py:503-532`,
-   `apps/activities/talent/activities/SKILL.md:116-130`)
- - `_set_activity_hidden_state()` already establishes the specialized-writer
-   pattern in this module (`think/activities.py:1092-1133`)
- - `update_record_fields()` stays the generic no-edit helper used by participation
-   (`think/activities.py:979-1011`, `talent/participation.py:99-105`)
- ## 2. `story.py` `post_process` flow
- The new hook lives at `talent/story.py` and always returns `""` so the JSON
- generator artifact is suppressed.
- Dispatcher context shape, confirmed:
- - `run_activity_prompts()` sends `facet`, `day`, `span`, `activity`, and
-   `output_path` in the activity request (`think/thinking.py:2064-2083`)
- - `prepare_config()` merges those request keys into the full talent config and
-   always carries `name` (`think/talents.py:438-520`)
- - `_run_post_hooks()` passes the full prepared config dict directly to the hook
-   (`think/talents.py:712-734`)
- The hook can rely on:
- - `context["name"]`
- - `context["facet"]`
- - `context["day"]`
- - `context["activity"]`
- - `context["span"]`
- - `context["output_path"]`
- Execution order:
- 1. Parse `result` with `json.loads(result.strip())`.
-    On failure: log and return `""`.
- 2. Require a top-level `dict`.
-    Otherwise: log and return `""`.
- 3. Validate required top-level fields.
-    - `body`: `str`, non-empty after strip
-    - `topics`: `list[str]`, may be empty
-    - `confidence`: numeric in `0.0..1.0`
-    - `commitments`: `list`
-    - `closures`: `list`
-    - `decisions`: `list`
-    Any missing field, wrong type, or out-of-range `confidence` logs and returns
-    `""`.
- 4. Validate required context.
-    - `context["activity"]` must be a `dict`
-    - `context["activity"]["id"]` must exist
-    - `context["facet"]` and `context["day"]` must exist
-    Missing context logs and returns `""`.
- 5. Load entities once with:
-    `load_entities(facet=context["facet"], day=context["day"])`.
- 6. Validate `commitments` entry by entry.
-    - each entry must be a `dict`
-    - required keys: `owner`, `action`, `counterparty`, `when`, `context`
-    - each required value must be a `str`
-    - invalid entries are skipped with a per-entry log
- 7. Validate `closures` entry by entry.
-    - each entry must be a `dict`
-    - required keys: `owner`, `action`, `counterparty`, `resolution`, `context`
-    - each required value must be a `str`
-    - `resolution` must be one of:
-      `sent`, `done`, `signed`, `dropped`, `deferred`
-    - invalid entries are skipped with a per-entry log
- 8. Validate `decisions` entry by entry.
-    - each entry must be a `dict`
-    - required keys: `owner`, `action`, `context`
-    - each required value must be a `str`
-    - invalid entries are skipped with a per-entry log
- 9. Resolve entity ids for every valid entry with
-    `find_matching_entity(name, entities, fuzzy_threshold=90)`.
-    - commitments: add `owner_entity_id` and `counterparty_entity_id`
-    - closures: add `owner_entity_id` and `counterparty_entity_id`
-    - decisions: add `owner_entity_id`
-    - unmatched values become `None`
-    - preserve the original `owner` and `counterparty` strings
- 10. Build:
-     `story = {"body": body, "topics": topics, "confidence": confidence}`.
- 11. Extract:
-     - `record_id = context["activity"]["id"]`
-     - `facet = context["facet"]`
-     - `day = context["day"]`
- 12. Call:
-     `merge_story_fields(facet, day, record_id, story=..., commitments=..., closures=..., decisions=..., actor="story", note=None)`.
-     If it returns `False`: log a warning and return `""`.
- 13. Return `""`.
-     This is required because `_execute_with_tools()` only writes the output file
-     when `result` is truthy (`think/talents.py:837-846`); returning `None` would
-     fall back to the original JSON result (`think/talents.py:726-734`).
- Intentional differences from `talent/spans.py`:
- - no spans-file write
- - no topic dedupe/normalization
- - no confidence clamping
- - no fence-stripping carryover unless explicitly added during implementation
- ## 3. Activity-record formatter extension
- Target:
- - extend `think/activities.py::format_activities()`
- - current order is:
-   title, activity, facet, day, time, level, description, details, participation,
-   hidden (`think/activities.py:1271-1307`)
- Chosen insertion point:
- - add the story block after participation and before hidden
- Behavior:
- - if `record.get("story")` is not a `dict`, do nothing
- - if `story["body"]` is a non-empty string, render it as prose rather than
-   `- Story: ...`
- - if `story["topics"]` is a non-empty list of strings, render one line as
-   `Topics: a, b, c`
- - if `body` is missing/empty, skip the prose block
- - if `topics` is missing, non-list, or empty, skip the topics line
- - keep all other formatter output unchanged
- - keep the existing activity formatter registration; no new registry entry is
-   needed because activities are already mapped to `format_activities()`
-   (`think/formatters.py:143-144`)
- Why this insertion point is best:
- - description/details remain raw activity metadata
- - participation remains the structured who-was-involved summary
- - story reads naturally after those structured fields
- - hidden stays last because it is record state, not content
- ## 4. Storyteller prompt changes
- Common frontmatter changes for all three storyteller talents:
- - `priority: 10` -> `priority: 20`
- - `hook: {"post": "spans"}` -> `hook: {"post": "story"}`
- - keep `schedule: "activity"`
- - keep `output: "json"`
- - keep existing activity filters per talent
- Common schema changes for all three:
- - require exactly:
-   `body`, `topics`, `confidence`, `commitments`, `closures`, `decisions`
- - all six fields are required on every response
- - `topics` may be `[]`
- - `commitments`, `closures`, and `decisions` may be `[]`
- - add the explicit instruction:
-   `Return [] if you do not observe a clear commitment / closure / decision. Better to omit than invent.`
- - state the controlled closure `resolution` vocabulary exactly:
-   `sent`, `done`, `signed`, `dropped`, `deferred`
- `talent/conversation.md`
- - keep the meeting/call/messaging/email narrative focus
- - expand the schema block to the six-field JSON shape
- - inline examples:
-   - commitment: send a follow-up, draft, or deck by a date
-   - closure: an open item was `sent` or `done`
-   - decision: the group chose a direction, owner, or timing
- - keep the current guidance that brief quotes are allowed when they sharpen a
-   decision, commitment, or disagreement
- `talent/work.md`
- - keep the coding/browsing/reading progress focus
- - expand the schema block to the six-field JSON shape
- - inline examples:
-   - commitment: ship a patch, benchmark, or send results
-   - closure: a task was `done` or a review was `sent`
-   - decision: a code-path, retry strategy, or API choice was made
- - keep the instruction to emphasize actual work performed over UI description
- `talent/event.md`
- - keep the appointment/event/travel/errand outcome focus
- - expand the schema block to the six-field JSON shape
- - inline examples:
-   - commitment: a travel or logistics follow-up
-   - closure: a form was `signed`, a reservation was `done`, or a task was
-     `deferred`
-   - decision: a route, plan, or next-step choice was made
- - keep the guidance to prefer what actually happened over generic event labels
- ## 5. Test matrix
- | test name | file | pins |
- | --- | --- | --- |
- | `test_story_hook_parses_and_writes` | `tests/test_story_hook.py` | Valid JSON writes `story`, `commitments`, `closures`, `decisions` onto the activity record and appends one edit with actor `story`. |
- | `test_story_hook_empty_arrays` | `tests/test_story_hook.py` | Empty `commitments`/`closures`/`decisions` still persist alongside the story payload. |
- | `test_story_hook_bad_resolution_skipped` | `tests/test_story_hook.py` | Invalid closure `resolution` is dropped while valid sibling closures survive. |
- | `test_story_hook_missing_required_field_skipped` | `tests/test_story_hook.py` | Missing required per-entry fields skip only the bad item. |
- | `test_story_hook_resolves_entities` | `tests/test_story_hook.py` | `owner`/`counterparty` resolve to `*_entity_id` with `fuzzy_threshold=90`; misses become `None`. |
- | `test_story_hook_idempotent_rerun` | `tests/test_story_hook.py` | Second run replaces story/list fields wholesale and appends one more edit entry. |
- | `test_story_hook_missing_record_logs_and_returns` | `tests/test_story_hook.py` | `merge_story_fields()` returns `False`, hook logs warning, nothing raises. |
- | `test_story_hook_no_json_file_written` | `tests/test_story_hook.py` | Returning `""` suppresses the storyteller JSON artifact. |
- | `test_format_activities_renders_story` | `tests/test_activities.py` | Story prose and `Topics:` line appear when present and disappear cleanly when absent. |
- | `test_no_spans_formatter_registered` | `tests/test_formatters.py` | `"facets/*/spans/*.jsonl"` is removed from `FORMATTERS`; spans paths no longer resolve to a formatter. |
- | `test_no_spans_writes` | `tests/test_formatters.py` | Search-style assertion that no `format_spans` or `spans/` write targets remain in `think/`, `talent/`, or `apps/`. |
- Existing test templates to reuse:
- - `tests/test_activity_record_merge.py` for temp-journal setup, activity seeding,
-   hook execution, and record reload assertions
- - `tests/test_schedule_hook.py` for per-entry skip behavior and entity-resolution
-   patterns
- ## 6. Files touched / deleted
- Create:
- - `docs/design/story-talent-refactor.md`
- - `talent/story.py`
- - `tests/test_story_hook.py`
- Modify:
- - `think/activities.py`
- - `think/formatters.py`
- - `talent/conversation.md`
- - `talent/work.md`
- - `talent/event.md`
- - `tests/test_activities.py`
- - `tests/test_formatters.py`
- - `tests/baselines/api/stats/stats.json`
- - `talent/journal/references/captures.md`
- Delete:
- - `talent/spans.py`
- - `think/spans.py`
- - `tests/test_spans_hook.py`
- - `tests/test_spans_formatter.py`
- Intentionally untouched:
- - `think/thinking.py` because priority-group serialization already does the job
- - `apps/activities/call.py` because no CLI surface change is needed
- ## 7. Risks / gotchas
- - Preserve the hook-return behavior exactly: `""`, not `None`.
-   `None` would fall back to the original JSON result and write a generator file
-   (`think/talents.py:726-734`, `837-846`).
- - Preserve the missing-record behavior of `update_record_fields()`:
-   no raise, but the story path must log the failure like participation does
-   (`think/activities.py:1007-1011`, `talent/participation.py:104-105`).
- - `format_activities()` is already registered for
-   `"facets/*/activities/*.jsonl"` (`think/formatters.py:144`).
-   Do not add a new formatter entry for story data.
- - Layer hygiene L2 stays strict:
-   only `think/activities.py` writes the activity record.
-   `talent/story.py` imports and calls the new writer; it does not perform raw
-   file I/O.
- - Story merges serialize after participation via priority ordering, not new locks.
-   Keep participation at `10`, storytellers at `20`, and rely on the existing
-   group ordering/drain in `run_activity_prompts()`
-   (`think/thinking.py:1925-1928`, `2150-2183`).
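
The story-block rendering rules from section 3 of the removed design doc can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `think/activities.py`: `render_story_block` is a hypothetical helper name, and dropping non-string or blank topic entries is an assumption, since the doc only says the topics line renders for a non-empty list of strings.

```python
from __future__ import annotations

from typing import Any


def render_story_block(record: dict[str, Any]) -> list[str]:
    """Return formatter lines for a record's story, or [] if nothing applies.

    Hypothetical helper for illustration; the design doc places this logic
    inside format_activities(), after participation and before hidden.
    """
    story = record.get("story")
    if not isinstance(story, dict):
        return []

    lines: list[str] = []

    body = story.get("body")
    if isinstance(body, str) and body.strip():
        # Rendered as prose, not as a "- Story: ..." bullet.
        lines.append(body.strip())

    topics = story.get("topics")
    if isinstance(topics, list):
        # Assumption: non-string or blank entries are dropped silently.
        names = [t for t in topics if isinstance(t, str) and t.strip()]
        if names:
            lines.append("Topics: " + ", ".join(names))

    return lines
```

A record without a `story` dict yields no lines, so all other formatter output stays unchanged, which matches the doc's requirement.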

+54 -12

talent/story.py

···

  import json
  import logging
+ import math
  from typing import Any

  from think.activities import merge_story_fields
···
  ALLOWED_RESOLUTIONS = frozenset({"sent", "done", "signed", "dropped", "deferred"})


+ def _normalize_topics(value: Any) -> list[str] | None:
+     if not isinstance(value, list):
+         logger.warning("story hook: missing topics list")
+         return None
+
+     topics: list[str] = []
+     seen: set[str] = set()
+     for item in value:
+         if not isinstance(item, str):
+             logger.warning("story hook: invalid topics list")
+             return None
+         topic = item.strip().lower()
+         if not topic or topic in seen:
+             continue
+         seen.add(topic)
+         topics.append(topic)
+         if len(topics) >= 10:
+             break
+
+     if not topics:
+         logger.warning("story hook: empty topics after normalization")
+         return None
+
+     return topics
+
+
+ def _normalize_confidence(value: Any) -> float | None:
+     if isinstance(value, bool) or not isinstance(value, (int, float)):
+         logger.warning("story hook: invalid confidence value")
+         return None
+
+     confidence = float(value)
+     if math.isnan(confidence):
+         logger.warning("story hook: invalid confidence value")
+         return None
+
+     clamped = min(1.0, max(0.0, confidence))
+     if clamped != confidence:
+         logger.warning("story hook: clamped confidence %s to %s", confidence, clamped)
+     return clamped
+
+
  def _resolve_entity_id(name: str, entities: list[dict[str, Any]]) -> str | None:
      match = find_matching_entity(name, entities, fuzzy_threshold=90)
      return match.get("id") if match else None
···
      if not isinstance(body, str) or not body.strip():
          logger.warning("story hook: missing body")
          return ""
-     if not isinstance(topics, list) or any(
-         not isinstance(topic, str) for topic in topics
-     ):
-         logger.warning("story hook: invalid topics")
+     topics = _normalize_topics(topics)
+     if topics is None:
          return ""
-     if (
-         isinstance(confidence, bool)
-         or not isinstance(confidence, (int, float))
-         or not 0.0 <= float(confidence) <= 1.0
-     ):
-         logger.warning("story hook: invalid confidence")
+     confidence = _normalize_confidence(confidence)
+     if confidence is None:
          return ""
      if not isinstance(commitments, list):
          logger.warning("story hook: missing commitments list")
···
          )
          resolved_decisions.append(resolved_decision)

+     talent_name = context.get("name") or ""
+     if not talent_name:
+         logger.warning("story hook: missing talent name in context")
+
      story = {
+         "talent": talent_name,
          "body": body.strip(),
-         "topics": list(topics),
-         "confidence": float(confidence),
+         "topics": topics,
+         "confidence": confidence,
      }

      merge_story_fields(
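
To make the effect of the two new helpers concrete, here is a standalone sketch that mirrors their logic with the logging calls removed. The names `normalize_topics`, `normalize_confidence`, and `MAX_TOPICS` are stand-ins chosen for this illustration and are not part of `talent/story.py`. Two consequences are worth noticing: a topics list that is empty after normalization is rejected rather than stored as `[]`, and infinite confidence values are clamped while NaN is rejected.

```python
from __future__ import annotations

import math
from typing import Any

MAX_TOPICS = 10  # stand-in name for the literal cap used in the diff


def normalize_topics(value: Any) -> list[str] | None:
    """Lowercase, strip, dedupe, and cap topics; None means 'reject'."""
    if not isinstance(value, list):
        return None
    topics: list[str] = []
    seen: set[str] = set()
    for item in value:
        if not isinstance(item, str):
            return None  # one bad entry rejects the whole list
        topic = item.strip().lower()
        if not topic or topic in seen:
            continue  # skip blanks and case-insensitive duplicates
        seen.add(topic)
        topics.append(topic)
        if len(topics) >= MAX_TOPICS:
            break
    # An empty result is treated as invalid, not as an empty list.
    return topics or None


def normalize_confidence(value: Any) -> float | None:
    """Clamp numeric confidence into [0.0, 1.0]; None means 'reject'."""
    # bool is a subclass of int, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    confidence = float(value)
    if math.isnan(confidence):
        return None
    return min(1.0, max(0.0, confidence))
```

Under this sketch, `normalize_topics(["Launch Plan", "LAUNCH plan", " "])` yields `["launch plan"]`, and `normalize_confidence(1.4)` yields `1.0`. Returning `None` for rejection keeps the caller's check to a single `is None` test, as the hook does above.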

+89

tests/test_story_hook.py

···
  # Copyright (c) 2026 sol pbc

  import json
+ import logging
  from pathlib import Path


···
      facet: str = "work",
      day: str = "20260418",
      record_id: str = "meeting_090000_300",
+     name: str = "conversation",
  ) -> dict:
      return {
          "facet": facet,
          "day": day,
+         "name": name,
          "activity": {"id": record_id},
          "output_path": str(
              tmp_path / "facets" / facet / "activities" / day / record_id / "story.json"
···
      record = _load_record("work", "20260418")
      assert returned == ""
      assert record["story"] == {
+         "talent": "conversation",
          "body": "Wrapped the launch prep and assigned follow-up.",
          "topics": ["launch", "follow-up"],
          "confidence": 0.82,
···

      second = _load_record("work", "20260418")
      assert second["story"] == {
+         "talent": "conversation",
          "body": "Second pass with a clearer summary.",
          "topics": ["handoff"],
          "confidence": 0.82,
···
          }
      ]
      assert len(second["edits"]) == 2
+
+
+ def test_story_hook_normalizes_topics(tmp_path, monkeypatch):
+     from talent.story import post_process
+     from think.activities import append_activity_record
+
+     monkeypatch.setenv("_SOLSTONE_JOURNAL_OVERRIDE", str(tmp_path))
+     append_activity_record("work", "20260418", _activity_record())
+
+     post_process(
+         _valid_result(
+             topics=["Launch Plan", "follow-up", "LAUNCH plan", " ", "follow-up"]
+         ),
+         _context(tmp_path),
+     )
+
+     record = _load_record("work", "20260418")
+     assert record["story"]["topics"] == ["launch plan", "follow-up"]
+
+
+ def test_story_hook_clamps_confidence(tmp_path, monkeypatch, caplog):
+     from talent.story import post_process
+     from think.activities import append_activity_record, load_activity_records
+
+     monkeypatch.setenv("_SOLSTONE_JOURNAL_OVERRIDE", str(tmp_path))
+     caplog.set_level(logging.WARNING)
+
+     append_activity_record(
+         "work",
+         "20260418",
+         _activity_record(record_id="meeting_090000_300"),
+     )
+     post_process(_valid_result(confidence=1.4), _context(tmp_path))
+     first = _load_record("work", "20260418")
+     assert first["story"]["confidence"] == 1.0
+     assert "clamped" in caplog.text
+
+     append_activity_record(
+         "work",
+         "20260418",
+         _activity_record(record_id="meeting_100000_300"),
+     )
+     post_process(
+         _valid_result(confidence=-0.2),
+         _context(tmp_path, record_id="meeting_100000_300"),
+     )
+     records = load_activity_records("work", "20260418", include_hidden=True)
+     by_id = {record["id"]: record for record in records}
+     assert by_id["meeting_090000_300"]["story"]["confidence"] == 1.0
+     assert by_id["meeting_100000_300"]["story"]["confidence"] == 0.0
+
+
+ def test_story_hook_rejects_nan_confidence(tmp_path, monkeypatch):
+     from talent.story import post_process
+     from think.activities import append_activity_record
+
+     monkeypatch.setenv("_SOLSTONE_JOURNAL_OVERRIDE", str(tmp_path))
+     append_activity_record("work", "20260418", _activity_record())
+
+     returned = post_process(
+         _valid_result(confidence=float("nan")),
+         _context(tmp_path),
+     )
+
+     record = _load_record("work", "20260418")
+     assert returned == ""
+     assert "story" not in record
+
+
+ def test_story_hook_rejects_non_numeric_confidence(tmp_path, monkeypatch):
+     from talent.story import post_process
+     from think.activities import append_activity_record
+
+     monkeypatch.setenv("_SOLSTONE_JOURNAL_OVERRIDE", str(tmp_path))
+     append_activity_record("work", "20260418", _activity_record())
+
+     returned = post_process(
+         _valid_result(confidence="high"),
+         _context(tmp_path),
+     )
+
+     record = _load_record("work", "20260418")
+     assert returned == ""
+     assert "story" not in record


  def test_story_hook_missing_record_logs_and_returns(tmp_path, monkeypatch, caplog):
