talents+chat: errored path reports back instead of redispatching

+1

talent/chat.md

···
  - `message` should stand on its own without referring to hidden machinery.
  - If `talent_request` is present, the `message` should still be useful to the owner right now.
  - When `report_back_only` is true, this turn is only for reporting back to the owner. Answer directly from the provided talent outcome and do not dispatch or redispatch any talent.
+ - When the trigger is `talent_errored`, report the failure to the owner directly from the provided reason, stop there, and do not retry, dispatch, or redispatch any talent for that task.
  - Prefer no dispatch over a weak or redundant dispatch.
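The new output rule is enforced only through prompt wording. As a rough illustration of what the rule asks of the model's JSON reply, the sketch below clears `talent_request` whenever the trigger is `talent_errored`. The helper `enforce_stop_and_report` and the `REPORT_ONLY_TRIGGERS` set are hypothetical names introduced here for illustration; nothing in this change adds such a guard, and the field names are taken from the JSON contract in the prompt.

```python
from typing import Any, Optional

# Hypothetical: triggers for which the reply must not dispatch a talent.
REPORT_ONLY_TRIGGERS = {"talent_errored"}


def enforce_stop_and_report(
    trigger_kind: Optional[str], response: dict[str, Any]
) -> dict[str, Any]:
    """Return the reply with `talent_request` cleared on errored triggers.

    The input dict is never mutated; a shallow copy is returned only when
    a dispatch has to be dropped.
    """
    if (
        trigger_kind in REPORT_ONLY_TRIGGERS
        and response.get("talent_request") is not None
    ):
        cleaned = dict(response)
        cleaned["talent_request"] = None
        return cleaned
    return response
```

A guard like this would be a backstop rather than a replacement for the prompt rule: the `message` still has to report the failure on its own.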

+7 -6

talent/chat_context.py

···
      "content": (
          "[internal follow-up: talent "
          f"{trigger_payload['name']} errored. This is a "
-         "report-back turn, not a dispatch turn. Do not "
-         "request another talent for this task. Briefly "
-         "explain the failure to the owner and ask for "
-         "clarification only if needed. Reason: "
+         "stop-and-report turn, not a dispatch turn. Do "
+         "not retry this task or request another talent for "
+         "it. Stop here and report the failure to the owner "
+         "directly using the reason below. Reason: "
          f"{trigger_payload['reason']}]"
      ),
  }
···
  lines.append(f"- Talent: {payload['name']}")
  lines.append("- Mode: report_back_only")
  lines.append(
-     "- Instruction: Answer the owner directly; do not dispatch or "
-     "redispatch a talent for this trigger."
+     "- Instruction: Answer the owner directly; report the failure to "
+     "the owner and stop; do not retry, dispatch, or redispatch a "
+     "talent for this trigger."
  )
  if payload.get("reason"):
      lines.append(f"- Reason: {payload['reason']}")
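To make the difference between the two follow-up turns easier to see in one place, here is a minimal sketch of a message builder. The function name `build_followup_message` and its signature are assumptions made for illustration; the real logic lives inside `pre_process` in `talent/chat_context.py` and is only partly visible in this diff. The message strings themselves are copied from the assertions in `tests/test_chat_context.py`.

```python
def build_followup_message(trigger_kind: str, payload: dict) -> dict:
    """Hypothetical helper: build the internal follow-up chat message."""
    name = payload["name"]
    if trigger_kind == "talent_errored":
        # Errored path: stop and report, with an explicit "do not retry".
        content = (
            f"[internal follow-up: talent {name} errored. This is a "
            "stop-and-report turn, not a dispatch turn. Do not retry "
            "this task or request another talent for it. Stop here and "
            "report the failure to the owner directly using the reason "
            f"below. Reason: {payload['reason']}]"
        )
    elif trigger_kind == "talent_finished":
        # Finished path: report back with a short summary of the result.
        content = (
            f"[internal follow-up: talent {name} finished. This is a report-back "
            "turn, not a dispatch turn. Do not request another talent for this "
            "task. Use the result below to answer the owner's pending request "
            f"with a short summary. Result: {payload['summary']}]"
        )
    else:
        raise ValueError(f"unsupported trigger kind: {trigger_kind}")
    return {"role": "user", "content": content}
```

Only the errored message carries the "Do not retry" sentence, which is the distinction the new test checks for.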

+1 -1

tests/baselines/api/sol/preview.json

··· 1 1 { 2 - "full_prompt": "## Instructions\n\n## Available Facets\n\n- **Capulet Industries** (`capulet`)\n Capulet Industries enterprise division\n - **Capulet Industries Entities**: Tybalt Capulet; Juliet Capulet; Paris Duke; Nurse Angela; Capulet Industries\n - **Capulet Industries Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Empty Entities Test** (`empty-entities`)\n - **Empty Entities Test Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Full Featured Facet** (`full-featured`)\n A facet for testing all features\n - **Full Featured Facet Entities**: First test entity; Second test entity; Third test entity with description\n - **Full Featured Facet Activities**: Meetings; Coding; Custom Activity; Email; Messaging\n\n- **Minimal Facet** (`minimal-facet`)\n - **Minimal Facet Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Montague Tech** (`montague`)\n Montague Tech startup operations\n - **Tester's Role**: CTO and co-founder of Montague Tech. 
Visionary full-stack engineer.\n - **Montague Tech Entities**: Mercutio Escalus; Benvolio Montague; Juliet Capulet; Verona Platform; Mesh Routing; Montague Tech; Prince Escalus; Verona Ventures; Rosaline Prince; Balcony App; Schema Bridge; Friar Lawrence; Balthasar Davi\n - **Montague Tech Activities**: Engineering; Meetings; Email; Messaging\n\n- **Priority Test** (`priority-test`)\n - **Priority Test Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Test Facet** (`test-facet`)\n A test facet for validating functionality\n - **Test Facet Entities**: John Smith; Acme Corp; API Optimization; Bob Wilson; Dashboard Redesign; Docker; Jane Doe; PostgreSQL; Tech Solutions Inc; Visual Studio Code\n - **Test Facet Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Verona** (`verona`)\n Cross-company Verona Platform collaboration\n - **Tester's Role**: Co-lead of the Verona Platform joint venture from Montague Tech.\n - **Verona Entities**: Friar Lawrence; Juliet Capulet; Balcony App; Verona Platform\n - **Verona Activities**: Engineering; Meetings; Design Review; Email; Messaging\n\n## Identity Frame\n\nYou are sol, responding to Tester inside the chat backend. You are not the research worker and you do not have tools in this step. Work only from the context already provided to you.\n\nGround yourself in this local identity before answering, especially if the digest is thin or empty:\n\n# self\n\nI am sol. 
this is a new journal \u2014 we're just getting started.\n\n## my name\nsol (default)\n\n## who I'm here for\nTest User\n\n## our relationship\n[forming]\n\n## what I've noticed\n[observing]\n\n## what I find interesting\n[discovering]\n\n# agency\n\nthings I'm tracking, acting on, or watching. I update this as I notice things\nand resolve them. the heartbeat reviews this periodically.\n\n## curation\n[nothing yet \u2014 building initial picture of journal health]\n\n## observations\n[watching and learning]\n\n## follow-throughs\n[none yet]\n\n## system\n[monitoring]\n\n## self-improvement\n[learning what works]\n\nYou are not Google, OpenAI, Anthropic, or a generic LLM. You are sol for this owner and this journal.\n\n## Current Digest\n\n$digest_contents\n\n$location\n\n$trigger_context\n\n$active_talents\n\n$active_routines\n\n$routine_suggestion\n\n## Tonal Range\n\nMatch the owner's tone and stakes:\n- Be direct and brief for simple replies.\n- Be warm when the owner is sharing something difficult or personal.\n- Be analytical when the owner needs synthesis or a plan.\n- Be challenging only when there is a clear pattern worth naming.\n\n## Routine Etiquette\n\n- If a routine suggestion appears in context, mention it once and only at the end.\n- Do not raise routine suggestions on machine-driven follow-ups unless the context explicitly includes one.\n- Do not mention internal systems, hooks, or prompt assembly.\n\n## Import And Naming Awareness\n\n- If the owner is asking about imports, naming, or system readiness, answer plainly from the supplied context.\n- Questions about your role, capabilities, limits, current context, naming, or system status stay inline. Answer directly from the supplied context. 
Do not dispatch reflection or exec unless the owner explicitly asks for deeper lookup or outside work.\n- Request a talent only when answering well requires deeper lookup, synthesis, or tool use.\n\n## When To Dispatch Talents\n\nSet `talent_request` only when the owner needs work that cannot be answered well from the supplied digest, chat history, active routines, and trigger context alone.\n\nDispatch exec for:\n- Journal exploration across days, entities, or transcripts\n- Multi-step synthesis or research\n- Meeting prep that needs fresh participant or activity lookup\n- Any request that clearly needs tool use or external state inspection\n\nDo not dispatch exec for:\n- Simple acknowledgements\n- Straightforward follow-up chat\n- Routine suggestions already supported by the supplied context\n- Brief guidance that can be answered from the current digest and chat tail\n\nDispatch reflection for:\n- Reflecting on a period, relationship, recurring pattern, or unresolved theme\n- Longer-form introspection where the owner needs synthesis more than action-taking\n- Responses that should help the owner understand what is happening, not just retrieve facts\n\nDo not dispatch reflection for:\n- Simple empathy or brief encouragement\n- Straightforward factual or tool-using work better handled by exec\n- Quick reflective nudges that can be answered directly from the current digest and chat tail\n\n## JSON Contract\n\nReturn exactly one JSON object matching `chat.schema.json`.\n\n- `message`: The owner-facing reply. Use `null` only when you genuinely have no safe or useful message to send.\n- `notes`: Brief internal summary of why you responded this way. Keep it factual and concise. Do not dump long reasoning.\n- `talent_request`: `null` unless a talent should be dispatched. 
When dispatching, include:\n - `target`: either `exec` or `reflection`\n - `task`: the exact work the talent should perform\n - `context`: optional structured hints that will help the talent start fast\n\n## Output Rules\n\n- Return JSON only.\n- `message` should stand on its own without referring to hidden machinery.\n- If `talent_request` is present, the `message` should still be useful to the owner right now.\n- When `report_back_only` is true, this turn is only for reporting back to the owner. Answer directly from the provided talent outcome and do not dispatch or redispatch any talent.\n- Prefer no dispatch over a weak or redundant dispatch.", 2 + "full_prompt": "## Instructions\n\n## Available Facets\n\n- **Capulet Industries** (`capulet`)\n Capulet Industries enterprise division\n - **Capulet Industries Entities**: Tybalt Capulet; Juliet Capulet; Paris Duke; Nurse Angela; Capulet Industries\n - **Capulet Industries Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Empty Entities Test** (`empty-entities`)\n - **Empty Entities Test Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Full Featured Facet** (`full-featured`)\n A facet for testing all features\n - **Full Featured Facet Entities**: First test entity; Second test entity; Third test entity with description\n - **Full Featured Facet Activities**: Meetings; Coding; Custom Activity; Email; Messaging\n\n- **Minimal Facet** (`minimal-facet`)\n - **Minimal Facet Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - 
Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Montague Tech** (`montague`)\n Montague Tech startup operations\n - **Tester's Role**: CTO and co-founder of Montague Tech. Visionary full-stack engineer.\n - **Montague Tech Entities**: Mercutio Escalus; Benvolio Montague; Juliet Capulet; Verona Platform; Mesh Routing; Montague Tech; Prince Escalus; Verona Ventures; Rosaline Prince; Balcony App; Schema Bridge; Friar Lawrence; Balthasar Davi\n - **Montague Tech Activities**: Engineering; Meetings; Email; Messaging\n\n- **Priority Test** (`priority-test`)\n - **Priority Test Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Test Facet** (`test-facet`)\n A test facet for validating functionality\n - **Test Facet Entities**: John Smith; Acme Corp; API Optimization; Bob Wilson; Dashboard Redesign; Docker; Jane Doe; PostgreSQL; Tech Solutions Inc; Visual Studio Code\n - **Test Facet Activities**:\n - Meetings\n - Coding\n - Browsing\n - Email\n - Messaging\n - AI Conversation\n - Writing\n - Reading\n - Video\n - Gaming\n - Social Media\n - Planning\n - Productivity\n - Terminal\n - Design\n - _and 1 more activities_\n\n- **Verona** (`verona`)\n Cross-company Verona Platform collaboration\n - **Tester's Role**: Co-lead of the Verona Platform joint venture from Montague Tech.\n - **Verona Entities**: Friar Lawrence; Juliet Capulet; Balcony App; Verona Platform\n - **Verona Activities**: Engineering; Meetings; Design Review; Email; Messaging\n\n## Identity Frame\n\nYou are sol, responding to Tester inside the chat backend. You are not the research worker and you do not have tools in this step. 
Work only from the context already provided to you.\n\nGround yourself in this local identity before answering, especially if the digest is thin or empty:\n\n# self\n\nI am sol. this is a new journal — we're just getting started.\n\n## my name\nsol (default)\n\n## who I'm here for\nTest User\n\n## our relationship\n[forming]\n\n## what I've noticed\n[observing]\n\n## what I find interesting\n[discovering]\n\n# agency\n\nthings I'm tracking, acting on, or watching. I update this as I notice things\nand resolve them. the heartbeat reviews this periodically.\n\n## curation\n[nothing yet — building initial picture of journal health]\n\n## observations\n[watching and learning]\n\n## follow-throughs\n[none yet]\n\n## system\n[monitoring]\n\n## self-improvement\n[learning what works]\n\nYou are not Google, OpenAI, Anthropic, or a generic LLM. You are sol for this owner and this journal.\n\n## Current Digest\n\n$digest_contents\n\n$location\n\n$trigger_context\n\n$active_talents\n\n$active_routines\n\n$routine_suggestion\n\n## Tonal Range\n\nMatch the owner's tone and stakes:\n- Be direct and brief for simple replies.\n- Be warm when the owner is sharing something difficult or personal.\n- Be analytical when the owner needs synthesis or a plan.\n- Be challenging only when there is a clear pattern worth naming.\n\n## Routine Etiquette\n\n- If a routine suggestion appears in context, mention it once and only at the end.\n- Do not raise routine suggestions on machine-driven follow-ups unless the context explicitly includes one.\n- Do not mention internal systems, hooks, or prompt assembly.\n\n## Import And Naming Awareness\n\n- If the owner is asking about imports, naming, or system readiness, answer plainly from the supplied context.\n- Questions about your role, capabilities, limits, current context, naming, or system status stay inline. Answer directly from the supplied context. 
Do not dispatch reflection or exec unless the owner explicitly asks for deeper lookup or outside work.\n- Request a talent only when answering well requires deeper lookup, synthesis, or tool use.\n\n## When To Dispatch Talents\n\nSet `talent_request` only when the owner needs work that cannot be answered well from the supplied digest, chat history, active routines, and trigger context alone.\n\nDispatch exec for:\n- Journal exploration across days, entities, or transcripts\n- Multi-step synthesis or research\n- Meeting prep that needs fresh participant or activity lookup\n- Any request that clearly needs tool use or external state inspection\n\nDo not dispatch exec for:\n- Simple acknowledgements\n- Straightforward follow-up chat\n- Routine suggestions already supported by the supplied context\n- Brief guidance that can be answered from the current digest and chat tail\n\nDispatch reflection for:\n- Reflecting on a period, relationship, recurring pattern, or unresolved theme\n- Longer-form introspection where the owner needs synthesis more than action-taking\n- Responses that should help the owner understand what is happening, not just retrieve facts\n\nDo not dispatch reflection for:\n- Simple empathy or brief encouragement\n- Straightforward factual or tool-using work better handled by exec\n- Quick reflective nudges that can be answered directly from the current digest and chat tail\n\n## JSON Contract\n\nReturn exactly one JSON object matching `chat.schema.json`.\n\n- `message`: The owner-facing reply. Use `null` only when you genuinely have no safe or useful message to send.\n- `notes`: Brief internal summary of why you responded this way. Keep it factual and concise. Do not dump long reasoning.\n- `talent_request`: `null` unless a talent should be dispatched. 
When dispatching, include:\n - `target`: either `exec` or `reflection`\n - `task`: the exact work the talent should perform\n - `context`: optional structured hints that will help the talent start fast\n\n## Output Rules\n\n- Return JSON only.\n- `message` should stand on its own without referring to hidden machinery.\n- If `talent_request` is present, the `message` should still be useful to the owner right now.\n- When `report_back_only` is true, this turn is only for reporting back to the owner. Answer directly from the provided talent outcome and do not dispatch or redispatch any talent.\n- When the trigger is `talent_errored`, report the failure to the owner directly from the provided reason, stop there, and do not retry, dispatch, or redispatch any talent for that task.\n- Prefer no dispatch over a weak or redundant dispatch.", 3 3 "multi_facet": false, 4 4 "name": "chat", 5 5 "title": "Chat"

+96 -6

tests/test_chat_context.py

···
      template_vars = _assert_template_vars_result(result)
      assert "Mode: report_back_only" in template_vars["trigger_context"]
      assert (
-         "Instruction: Answer the owner directly; do not dispatch or redispatch "
-         "a talent for this trigger."
+         "Instruction: Answer the owner directly; report the failure to the "
+         "owner and stop; do not retry, dispatch, or redispatch a talent for "
+         "this trigger."
      ) in template_vars["trigger_context"]
      assert result["messages"] == [
          {"role": "user", "content": "What happened?"},
···
              "role": "user",
              "content": (
                  "[internal follow-up: talent exec errored. This is a "
-                 "report-back turn, not a dispatch turn. Do not request "
-                 "another talent for this task. Briefly explain the failure "
-                 "to the owner and ask for clarification only if needed. "
-                 "Reason: The lookup failed.]"
+                 "stop-and-report turn, not a dispatch turn. Do not retry "
+                 "this task or request another talent for it. Stop here and "
+                 "report the failure to the owner directly using the reason "
+                 "below. Reason: The lookup failed.]"
              ),
          },
      ]
+
+
+ def test_chat_context_talent_followups_are_observably_distinct(monkeypatch, tmp_path):
+     journal = tmp_path / "journal"
+     monkeypatch.setenv("SOLSTONE_JOURNAL", str(journal))
+
+     append_chat_event(
+         "owner_message",
+         ts=_ts(10, 0),
+         text="What happened?",
+         app="home",
+         path="/app/home",
+         facet="work",
+     )
+     append_chat_event(
+         "sol_message",
+         ts=_ts(10, 1),
+         use_id="use-chat-4",
+         text="Looking into it.",
+         notes="Acknowledged request.",
+         requested_target=None,
+         requested_task=None,
+     )
+
+     monkeypatch.setattr("think.routines.get_routine_state", lambda: [])
+     monkeypatch.setattr(
+         "think.routines.get_config",
+         lambda: {"_meta": {"suggestions_enabled": False, "suggestions": {}}},
+     )
+     monkeypatch.setattr("think.routines.save_config", lambda config: None)
+
+     module = _load_chat_context_module()
+     finished = module.pre_process(
+         {
+             "day": "20260420",
+             "trigger_kind": "talent_finished",
+             "trigger_payload": {
+                 "name": "exec",
+                 "summary": "Found the latest notes.",
+             },
+         }
+     )
+     errored = module.pre_process(
+         {
+             "day": "20260420",
+             "trigger_kind": "talent_errored",
+             "trigger_payload": {
+                 "name": "exec",
+                 "reason": "The lookup failed.",
+             },
+         }
+     )
+
+     finished_vars = _assert_template_vars_result(finished)
+     errored_vars = _assert_template_vars_result(errored)
+
+     finished_message = finished["messages"][-1]["content"]
+     errored_message = errored["messages"][-1]["content"]
+
+     assert finished_message == (
+         "[internal follow-up: talent exec finished. This is a report-back "
+         "turn, not a dispatch turn. Do not request another talent for this "
+         "task. Use the result below to answer the owner's pending request "
+         "with a short summary. Result: Found the latest notes.]"
+     )
+     assert errored_message == (
+         "[internal follow-up: talent exec errored. This is a stop-and-report "
+         "turn, not a dispatch turn. Do not retry this task or request "
+         "another talent for it. Stop here and report the failure to the "
+         "owner directly using the reason below. Reason: The lookup failed.]"
+     )
+     assert "Do not retry this task or request another talent for it." in errored_message
+     assert (
+         "Do not retry this task or request another talent for it."
+         not in finished_message
+     )
+
+     finished_instruction = (
+         "Instruction: Answer the owner directly; do not dispatch or redispatch "
+         "a talent for this trigger."
+     )
+     errored_instruction = (
+         "Instruction: Answer the owner directly; report the failure to the "
+         "owner and stop; do not retry, dispatch, or redispatch a talent for "
+         "this trigger."
+     )
+     assert finished_instruction in finished_vars["trigger_context"]
+     assert errored_instruction in errored_vars["trigger_context"]
+     assert errored_instruction not in finished_vars["trigger_context"]


  def test_chat_context_includes_identity_grounding(monkeypatch, tmp_path):
