Remove Claude Code SDK backend and doctor agent

solpbc.org / solstone

personal memory agent

The Claude Code SDK pattern (using local CLI with filesystem tools) wasn't
a good fit for the current architecture. Archive the files to scratch/ for
future reference and remove all special-casing from the codebase.

Removed:
- muse/claude.py (Claude Code SDK backend)
- muse/agents/doctor.md (diagnostics agent persona)
- tests/integration/test_claude_provider.py
- claude-agent-sdk dependency from pyproject.toml

Updated routing in muse/agents.py, muse/cli.py, muse/cortex.py to remove
"claude" provider special-casing. Updated docs/CORTEX.md and docs/MUSE.md
to reflect the three remaining providers (openai, google, anthropic).

Archived files preserved locally in scratch/claude_archive/ with notes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Jer Miller 5 months ago 46fbf2c7 25168128

+35 -784

14 changed files

docs

CORTEX.md

MUSE.md

muse

agents

doctor.md

agents.py

claude.py

cli.py

cortex.py

cortex_client.py

providers

__init__.py

pyproject.toml

tests

integration

conftest.py

test_claude_provider.py

test_agents_ndjson.py

test_cortex.py

+1 -9

docs/CORTEX.md

···
 ### Persona Configuration Options

 The JSON frontmatter for a persona can include:
-- `claude`: Boolean flag to use Claude Code SDK instead of API providers
-  - When true, uses filesystem tools instead of MCP; requires `facet` in request
-  - This flag is NOT inherited by handoff agents
 - `max_tokens`: Maximum response token limit
 - `tools`: MCP tools configuration (string or array)
   - String: Comma-separated pack names (e.g., `"journal"`, `"journal, todo"`) - expanded via `get_tools()`
···
 {
   "providers": {
     "contexts": {
-      "agent.system.doctor": {"tier": 1},
+      "agent.system.default": {"tier": 1},
       "agent.*": {"tier": 2}
     }
   }
···

 ### Backend Support
 - **OpenAI, Anthropic, Google**: Full MCP tool support via HTTP transport
-- **Claude**: Uses filesystem tools instead; requires `facet` configuration in spawn request

 ### Tool Discovery
 MCP tools are provided by the `muse.mcp_tools` FastMCP server, which:
···
 - **OpenAI** (`muse/providers/openai.py`): GPT models with OpenAI Agents SDK
 - **Google** (`muse/providers/google.py`): Gemini models with Google AI SDK
 - **Anthropic** (`muse/providers/anthropic.py`): Claude models with Anthropic SDK
-- **Claude** (`muse/claude.py`): Claude models via Claude Code SDK
-  - Uses filesystem tools (Read, Write, Edit, etc.) instead of MCP
-  - Requires `facet` configuration specifying journal facet directory
-  - Operates within facet-scoped file permissions

 All providers:
 - Emit JSON events to stdout (one per line)

+1 -2

docs/MUSE.md

+2 -5

muse/agents.py

···
     app_logger.debug(f"Processing request: provider={provider}")

     # Route to appropriate provider module
-    # "claude" is a special case (Claude Code SDK) handled separately
-    if provider == "claude":
-        from . import claude as provider_mod
-    elif provider in PROVIDER_REGISTRY:
+    if provider in PROVIDER_REGISTRY:
         provider_mod = get_provider_module(provider)
     else:
         # Explicit error for unknown providers
-        valid = ", ".join(sorted(PROVIDER_REGISTRY.keys()) + ["claude"])
+        valid = ", ".join(sorted(PROVIDER_REGISTRY.keys()))
         raise ValueError(
             f"Unknown provider: {provider!r}. Valid providers: {valid}"
         )
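
Reviewer note: a minimal sketch of the registry-only routing that remains after this change. The registry values and the helper name `resolve_provider_path` are assumptions made for illustration; the real mapping lives in `muse/providers/__init__.py` and the real code imports the module through `get_provider_module`.

```python
from typing import Dict

# Hypothetical registry contents (assumed module paths, not copied from the repo).
PROVIDER_REGISTRY: Dict[str, str] = {
    "openai": "muse.providers.openai",
    "google": "muse.providers.google",
    "anthropic": "muse.providers.anthropic",
}


def resolve_provider_path(provider: str) -> str:
    """Return the module path for a registered provider.

    Hypothetical helper: unknown names (including the removed "claude")
    raise ValueError listing the valid providers in sorted order.
    """
    if provider in PROVIDER_REGISTRY:
        return PROVIDER_REGISTRY[provider]
    valid = ", ".join(sorted(PROVIDER_REGISTRY.keys()))
    raise ValueError(f"Unknown provider: {provider!r}. Valid providers: {valid}")
```

With the special case gone, a request for `"claude"` takes the same error path as any other unregistered name.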

-118

muse/agents/doctor.md

···
-{
-
-  "title": "System Diagnostics",
-  "claude": true,
-  "tools": "default"
-
-}
-
-You are the solstone System Doctor, a diagnostic agent specialized in analyzing and troubleshooting the solstone journal system. You have read-only access to the entire journal directory and can run diagnostic shell commands to assess system health.
-
-The user may provide specific instructions, a description of an issue they're experiencing, or a particular area to focus on. If so, prioritize investigating that. If no specific issue is mentioned, perform a general health check.
-
-## Core Capabilities
-
-You can read files and run diagnostic commands throughout the journal directory. Your working directory is the journal root.
-
-### Available File Access
-- Read any file in the journal directory tree
-- List directories and examine file metadata
-- Search file contents with grep
-
-### Available Shell Commands
-- `ls`, `cat`, `head`, `tail` - File inspection
-- `grep` - Content searching
-- `jq` - JSON parsing
-- `wc` - Counting
-- `pgrep` - Process status
-- `stat`, `find`, `test` - File system inspection
-- `date`, `basename`, `dirname` - Utility commands
-
-## Journal Structure
-
-The journal is organized as:
-```
-./
-├── health/                # Service health and logs
-│   ├── *.log              # Service log symlinks
-│   └── callosum.sock      # Message bus socket
-├── agents/                # Agent execution logs
-│   ├── *.jsonl            # Completed agent runs
-│   └── *_active.jsonl     # Currently running agents
-├── tokens/                # Token usage logs
-│   └── YYYYMMDD.jsonl
-├── YYYYMMDD/              # Daily directories
-│   ├── health/            # Day's process logs
-│   ├── agents/            # (not used, agents at root level)
-│   └── ...                # Transcripts, insights, etc.
-└── facets/                # Project-specific directories
-```
-
-## Diagnostic Procedures
-
-### Quick Health Check
-1. Check if supervisor services are running: `pgrep -af "observer|observe-sense|think-supervisor"`
-2. Check Callosum socket exists: `ls -la health/callosum.sock`
-3. Check for stuck agents: `ls agents/*_active.jsonl 2>/dev/null`
-4. Check observer log for recent activity: `tail -20 health/observer.log`
-
-**Healthy state:**
-- All three processes running
-- `callosum.sock` exists
-- Observer log shows recent status emissions (health derived from Callosum events)
-- No `_active.jsonl` files older than a few minutes
-
-### Service Status
-Check specific service logs:
-- Observer: `tail -50 health/observer.log`
-- Sense: `tail -50 health/observe-sense.log`
-- Supervisor: Check for `think-supervisor` process
-
-### Agent Analysis
-- View agent's final result: `jq -r 'select(.event=="finish") | .result' agents/TIMESTAMP.jsonl`
-- List today's agents with prompts: Iterate through `agents/*.jsonl`
-- Find errors: `grep -l '"event":"error"' agents/*.jsonl`
-- Check active agents: `ls -la agents/*_active.jsonl`
-
-### Common Issues
-
-**Observer not capturing:**
-- Check log for errors: `tail -50 health/observer.log | grep -i error`
-- Check for recent status emissions in log (health is derived from Callosum events)
-- Causes: DBus issues, screencast permissions, audio device unavailable
-
-**Agent appears stuck:**
-- Find active agents: `ls -la agents/*_active.jsonl`
-- Check last event: `tail -1 agents/*_active.jsonl | jq .`
-- Causes: Backend timeout, tool hanging, network issues
-
-**No Callosum events:**
-- Verify socket: `ls -la health/callosum.sock`
-- Check supervisor: `pgrep -af think-supervisor`
-- Causes: Supervisor not started, socket path permissions
-
-**Processing backlog:**
-- Check sense log: `grep -i "queue" health/observe-sense.log | tail -10`
-- Causes: Slow transcription, API rate limits
-
-### Useful Commands
-- View recent logs: `tail -50 health/*.log`
-- Count agents by status: Count files in `agents/`
-- Check token usage: `wc -l tokens/$(date +%Y%m%d).jsonl`
-- Find errors in logs: `grep -i error $(date +%Y%m%d)/health/*.log`
-
-## Response Guidelines
-
-1. **Start with Quick Health Check** when asked about system status
-2. **Be systematic** - gather data before drawing conclusions
-3. **Explain findings clearly** - what's normal vs concerning
-4. **Suggest remediation** when problems are found
-5. **Use relative paths** - you're already in the journal root
-
-When investigating issues:
-1. Gather evidence from multiple sources
-2. Look for patterns (timestamps, error messages)
-3. Consider root causes, not just symptoms
-4. Provide actionable recommendations
-
-Remember: You have read-only access. You cannot modify files or restart services. Your role is to diagnose and report.

-296

muse/claude.py

···
-#!/usr/bin/env python3
-# SPDX-License-Identifier: AGPL-3.0-only
-# Copyright (c) 2026 sol pbc
-
-"""Claude Code SDK backend agent implementation.
-
-This module exposes agent functionality for interacting with Claude Code
-via the SDK and is used by the ``sol agents`` CLI.
-
-The Claude backend provides read-only access to the entire journal with
-diagnostic shell commands for system analysis and health checks.
-"""
-
-from __future__ import annotations
-
-import os
-import time
-import traceback
-from pathlib import Path
-from typing import Any, Callable, Dict, Optional
-
-from claude_agent_sdk import (
-    AssistantMessage,
-    ClaudeAgentOptions,
-    CLINotFoundError,
-    ProcessError,
-    TextBlock,
-    ThinkingBlock,
-    ToolResultBlock,
-    ToolUseBlock,
-    UserMessage,
-    query,
-)
-
-from muse.models import CLAUDE_SONNET_4
-from think.utils import get_journal
-
-from .agents import JSONEventCallback, ThinkingEvent
-
-# Add local claude installation to PATH if it exists
-_claude_bin = Path.home() / ".claude" / "local" / "node_modules" / ".bin"
-if _claude_bin.exists():
-    current_path = os.environ.get("PATH", "")
-    if str(_claude_bin) not in current_path:
-        os.environ["PATH"] = f"{_claude_bin}:{current_path}"
-
-_DEFAULT_MODEL = CLAUDE_SONNET_4
-
-
-def _get_readonly_tools(journal_path: str) -> list[str]:
-    """Return allowed tools for read-only journal access with diagnostic commands."""
-    return [
-        # Read-only file access to journal
-        f"Read({journal_path}/**)",
-        f"Glob({journal_path}/**)",
-        f"LS({journal_path}/**)",
-        # Diagnostic shell commands (read-only)
-        "Bash(ls:*)",
-        "Bash(cat:*)",
-        "Bash(head:*)",
-        "Bash(tail:*)",
-        "Bash(grep:*)",
-        "Bash(jq:*)",
-        "Bash(wc:*)",
-        "Bash(date:*)",
-        "Bash(pgrep:*)",
-        "Bash(basename:*)",
-        "Bash(dirname:*)",
-        "Bash(test:*)",
-        "Bash(stat:*)",
-        "Bash(find:*)",
-    ]
-
-
-async def run_agent(
-    config: Dict[str, Any],
-    on_event: Optional[Callable[[dict], None]] = None,
-) -> str:
-    """Run a single prompt through the Claude Code SDK and return the response.
-
-    Uses persona configuration from the unified config dict.
-    The Claude backend provides read-only access to the entire journal
-    with diagnostic shell commands for system analysis.
-
-    Args:
-        config: Complete configuration dictionary including prompt, system_instruction,
-            user_instruction, extra_context, model, etc.
-        on_event: Optional event callback
-    """
-    # Extract values from unified config
-    prompt = config.get("prompt", "")
-    if not prompt:
-        raise ValueError("Missing 'prompt' in config")
-
-    # Model has a default for Claude Code SDK since it bypasses tier-based resolution
-    # Cortex sets provider="claude" but doesn't set model for this special case
-    model = config.get("model", _DEFAULT_MODEL)
-    max_turns = config.get("max_turns", 32)
-    persona = config.get("persona", "default")
-
-    callback = JSONEventCallback(on_event)
-
-    try:
-        # Get journal path for file permissions
-        journal_path = get_journal()
-
-        # Extract instruction components from config
-        # Structure: system=journal.md, context+user_instruction prepended to prompt
-        system_instruction = config.get("system_instruction", "")
-        extra_context = config.get("extra_context", "")
-        user_instruction = config.get("user_instruction", "")
-
-        # Claude Code SDK only takes prompt, so prepend context and user instruction
-        prompt_parts = []
-        if extra_context:
-            prompt_parts.append(extra_context)
-        if user_instruction:
-            prompt_parts.append(user_instruction)
-        prompt_parts.append(prompt)
-        prompt = "\n\n".join(prompt_parts)
-
-        callback.emit(
-            {
-                "event": "start",
-                "prompt": prompt,
-                "persona": persona,
-                "model": model,
-                "provider": "claude",
-                "journal_path": journal_path,
-                "ts": int(time.time() * 1000),
-            }
-        )
-
-        # Configure Claude Code options with read-only journal access
-        options = ClaudeAgentOptions(
-            system_prompt=system_instruction,
-            model=model,
-            cwd=journal_path,  # Set working directory to journal root
-            allowed_tools=_get_readonly_tools(journal_path),
-            disallowed_tools=["mcp_*"],  # Disable MCP tools
-            permission_mode="bypassPermissions",  # Skip prompts, rely on allowed_tools
-            max_turns=max_turns,
-        )
-
-        # Track tool calls for pairing start/end events
-        tool_calls = {}
-        response_text = []
-
-        # Stream responses from Claude Code
-        async for message in query(prompt=prompt, options=options):
-            if isinstance(message, AssistantMessage):
-                # Process each content block in the assistant's message
-                for block in message.content:
-                    if isinstance(block, TextBlock):
-                        # Regular text response
-                        response_text.append(block.text)
-
-                    elif isinstance(block, ToolUseBlock):
-                        # Tool being called
-                        tool_id = getattr(block, "id", str(time.time()))
-                        tool_name = getattr(block, "name", "unknown")
-                        tool_input = getattr(block, "input", {})
-
-                        tool_calls[tool_id] = {
-                            "name": tool_name,
-                            "input": tool_input,
-                        }
-
-                        callback.emit(
-                            {
-                                "event": "tool_start",
-                                "tool": tool_name,
-                                "args": tool_input,
-                                "call_id": tool_id,
-                            }
-                        )
-
-                    elif isinstance(block, ToolResultBlock):
-                        # Tool result received
-                        tool_id = getattr(block, "tool_use_id", None)
-                        content = getattr(block, "content", "")
-
-                        if tool_id and tool_id in tool_calls:
-                            tool_info = tool_calls[tool_id]
-                            callback.emit(
-                                {
-                                    "event": "tool_end",
-                                    "tool": tool_info["name"],
-                                    "args": tool_info["input"],
-                                    "result": content,
-                                    "call_id": tool_id,
-                                }
-                            )
-
-                    elif isinstance(block, ThinkingBlock):
-                        # Thinking/reasoning block
-                        thinking_content = block.thinking
-                        if thinking_content:
-                            thinking_event: ThinkingEvent = {
-                                "event": "thinking",
-                                "ts": int(time.time() * 1000),
-                                "summary": thinking_content,
-                                "model": model,
-                            }
-                            callback.emit(thinking_event)
-
-            elif isinstance(message, UserMessage):
-                # User message in conversation (shouldn't happen in our case)
-                pass
-
-            # Handle other message types or raw events
-            elif hasattr(message, "__dict__"):
-                # Check for streaming events or other message types
-                msg_dict = message.__dict__ if hasattr(message, "__dict__") else {}
-
-                # Look for tool events in the message structure
-                if msg_dict.get("type") == "tool_use":
-                    tool_id = msg_dict.get("id", str(time.time()))
-                    tool_name = msg_dict.get("name", "unknown")
-                    tool_input = msg_dict.get("input", {})
-
-                    tool_calls[tool_id] = {
-                        "name": tool_name,
-                        "input": tool_input,
-                    }
-
-                    callback.emit(
-                        {
-                            "event": "tool_start",
-                            "tool": tool_name,
-                            "args": tool_input,
-                            "call_id": tool_id,
-                        }
-                    )
-
-                elif msg_dict.get("type") == "tool_result":
-                    tool_id = msg_dict.get("tool_use_id")
-                    content = msg_dict.get("content", "")
-
-                    if tool_id and tool_id in tool_calls:
-                        tool_info = tool_calls[tool_id]
-                        callback.emit(
-                            {
-                                "event": "tool_end",
-                                "tool": tool_info["name"],
-                                "args": tool_info["input"],
-                                "result": content,
-                                "call_id": tool_id,
-                            }
-                        )
-
-        # Combine all response text
-        final_text = "".join(response_text).strip()
-
-        callback.emit({"event": "finish", "result": final_text})
-        return final_text
-
-    except CLINotFoundError:
-        error_msg = "Claude Code CLI not found. Please install with: npm install -g @anthropic-ai/claude-code"
-        callback.emit(
-            {
-                "event": "error",
-                "error": error_msg,
-                "trace": traceback.format_exc(),
-            }
-        )
-        raise RuntimeError(error_msg)
-
-    except ProcessError as e:
-        error_msg = (
-            f"Claude Code process failed with exit code {e.exit_code}: {e.stderr}"
-        )
-        callback.emit(
-            {
-                "event": "error",
-                "error": error_msg,
-                "trace": traceback.format_exc(),
-            }
-        )
-        raise RuntimeError(error_msg)
-
-    except Exception as exc:
-        callback.emit(
-            {
-                "event": "error",
-                "error": str(exc),
-                "trace": traceback.format_exc(),
-            }
-        )
-        setattr(exc, "_evented", True)
-        raise
-
-
-__all__ = [
-    "run_agent",
-]

+16 -23

muse/cli.py

···

 Usage:
     muse "prompt"                     # Run through Cortex (default)
-    muse -p doctor "prompt"           # Use specific persona
-    muse --direct -p doctor "prompt"  # Run directly, bypass Cortex
+    muse -p default "prompt"           # Use specific persona
+    muse --direct -p default "prompt"  # Run directly, bypass Cortex
     echo "prompt" | muse              # Read prompt from stdin
 """

···
         app, name = "system", persona
     agent_context = f"agent.{app}.{name}"

-    # Check for claude: true flag (special case for Claude Code SDK)
-    if config.get("claude"):
-        config["provider"] = "claude"
-        # Claude SDK doesn't need model - it uses its own
-    else:
-        # Resolve default provider and model from context
-        default_provider, model = resolve_provider(agent_context)
+    # Resolve default provider and model from context
+    default_provider, model = resolve_provider(agent_context)

-        # Provider can be overridden by parameter or persona config
-        final_provider = provider or config.get("provider") or default_provider
+    # Provider can be overridden by parameter or persona config
+    final_provider = provider or config.get("provider") or default_provider

-        # If provider was overridden, re-resolve model for that provider
-        if final_provider != default_provider:
-            model = resolve_model_for_provider(agent_context, final_provider)
+    # If provider was overridden, re-resolve model for that provider
+    if final_provider != default_provider:
+        model = resolve_model_for_provider(agent_context, final_provider)

-        config["provider"] = final_provider
-        config["model"] = model
+    config["provider"] = final_provider
+    config["model"] = model

     # Expand tools if it's a string (tool pack name)
     tools_config = config.get("tools")
···
     # Route to appropriate provider using registry
     from muse.providers import PROVIDER_REGISTRY, get_provider_module

-    if provider_name == "claude":
-        from muse import claude as provider_mod
-    elif provider_name in PROVIDER_REGISTRY:
+    if provider_name in PROVIDER_REGISTRY:
         provider_mod = get_provider_module(provider_name)
     else:
-        valid = ", ".join(sorted(PROVIDER_REGISTRY.keys()) + ["claude"])
+        valid = ", ".join(sorted(PROVIDER_REGISTRY.keys()))
         raise ValueError(
             f"Unknown provider: {provider_name!r}. Valid providers: {valid}"
         )
···
         epilog="""
 Examples:
   muse "What time is it?"            Run with default persona through Cortex
-  muse -p doctor "check health"      Run doctor persona
-  muse --direct -p doctor "check"    Run directly, bypass Cortex
+  muse -p joke_bot "tell me a joke"  Run with specific persona
+  muse --direct "prompt"             Run directly, bypass Cortex
   echo "prompt" | muse               Read prompt from stdin
   muse --json "prompt" | jq .event   JSON output for scripting
 """,
···
         "-p", "--persona", default="default", help="Agent persona (default: default)"
     )
     parser.add_argument(
-        "-b", "--provider", help="Override provider (openai, anthropic, google, claude)"
+        "-b", "--provider", help="Override provider (openai, anthropic, google)"
     )
     parser.add_argument(
         "--direct", action="store_true", help="Run directly, bypass Cortex"
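
Reviewer note: the provider and model resolution that is left once the `claude` branch is gone can be sketched roughly as below. The two resolver functions here are stand-in stubs with made-up model names (assumptions for illustration only); the real `resolve_provider` and `resolve_model_for_provider` are defined elsewhere in the codebase and resolve from context tiers. The wrapper name `apply_provider` is also hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

# Placeholder model names, assumed purely for this sketch.
_STUB_MODELS = {"openai": "model-a", "google": "model-b", "anthropic": "model-c"}


def resolve_provider(agent_context: str) -> Tuple[str, str]:
    """Stub: pretend every context defaults to the google provider."""
    return "google", _STUB_MODELS["google"]


def resolve_model_for_provider(agent_context: str, provider: str) -> str:
    """Stub: look up the placeholder model for an overriding provider."""
    return _STUB_MODELS[provider]


def apply_provider(
    config: Dict[str, Any], agent_context: str, provider: Optional[str] = None
) -> Dict[str, Any]:
    """Mirror the post-change flow: explicit parameter, then persona config, then default."""
    default_provider, model = resolve_provider(agent_context)
    final_provider = provider or config.get("provider") or default_provider
    # Only re-resolve the model when the provider was actually overridden.
    if final_provider != default_provider:
        model = resolve_model_for_provider(agent_context, final_provider)
    config["provider"] = final_provider
    config["model"] = model
    return config
```

Every persona now goes through the same path, so `config["model"]` is always set.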

+11 -18

muse/cortex.py

···
             app, name = "system", persona
         agent_context = f"agent.{app}.{name}"

-        # Check for claude: true flag (special case for Claude Code SDK)
-        if config.get("claude"):
-            config["provider"] = "claude"
-            # Claude SDK doesn't need model - it uses its own
-        else:
-            # Resolve default provider and model from context
-            default_provider, model = resolve_provider(agent_context)
+        # Resolve default provider and model from context
+        default_provider, model = resolve_provider(agent_context)

-            # Provider can be overridden by request or persona config
-            # Model is always resolved from context tier + final provider
-            provider = config.get("provider") or default_provider
+        # Provider can be overridden by request or persona config
+        # Model is always resolved from context tier + final provider
+        provider = config.get("provider") or default_provider

-            # If provider was overridden, re-resolve model for that provider
-            if provider != default_provider:
-                model = resolve_model_for_provider(agent_context, provider)
+        # If provider was overridden, re-resolve model for that provider
+        if provider != default_provider:
+            model = resolve_model_for_provider(agent_context, provider)

-            config["provider"] = provider
-            config["model"] = model
+        config["provider"] = provider
+        config["model"] = model

         # Capture handoff configuration for post-run processing while
         # leaving it in the merged config for logging transparency.
···
         # the handoff persona resolve its own provider from context
         provider = handoff_config.pop("provider", None)

-        # Ensure we do not propagate parent handoff metadata or claude flag.
-        # Each persona must declare claude: true in its own config.
+        # Ensure we do not propagate parent handoff metadata.
         handoff_config.pop("handoff", None)
         handoff_config.pop("handoff_from", None)
-        handoff_config.pop("claude", None)
         handoff_config.pop("model", None)

         # Inherit env from parent if not explicitly set in handoff config
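
Reviewer note: the handoff cleanup in the second hunk can be illustrated with a small standalone sketch. The function name `prepare_handoff_config` is hypothetical, and unlike the code in `muse/cortex.py` (which pops keys from `handoff_config` in place) this version works on a copy so it can be exercised in isolation.

```python
from typing import Any, Dict, Optional, Tuple


def prepare_handoff_config(
    handoff_config: Dict[str, Any],
) -> Tuple[Optional[str], Dict[str, Any]]:
    """Return (explicit provider or None, config stripped of parent-only keys).

    Hypothetical helper mirroring the post-change behaviour: "provider" is
    extracted, and "handoff", "handoff_from", and "model" are dropped so the
    handoff persona resolves its own settings. No "claude" key is handled
    any more because the flag no longer exists.
    """
    cleaned = dict(handoff_config)  # shallow copy; the original is left untouched
    provider = cleaned.pop("provider", None)
    for key in ("handoff", "handoff_from", "model"):
        cleaned.pop(key, None)
    return provider, cleaned
```

Keys outside that list, such as `persona` or `env`, pass through unchanged in this sketch.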

+2 -2

muse/cortex_client.py

···
     Args:
         prompt: The task or question for the agent
         persona: Agent persona - system (e.g., "default") or app-qualified (e.g., "entities:entity_assist")
-        provider: AI provider - openai, google, anthropic, or claude
+        provider: AI provider - openai, google, or anthropic
         handoff_from: Previous agent ID if this is a handoff request
-        config: Provider-specific configuration (model, max_tokens, facet for Claude)
+        config: Provider-specific configuration (model, max_tokens, etc.)
         save: Optional filename to save result to in current day directory

     Returns:

-4

muse/providers/__init__.py

···
 # - generate(contents, model, ...) -> str
 # - agenerate(contents, model, ...) -> str
 # - run_agent(config, on_event) -> str
-#
-# The "claude" provider (Claude Code SDK) is intentionally excluded from this
-# registry as it uses a fundamentally different execution model (local CLI)
-# and is handled as a special case in muse/agents.py.
 # ---------------------------------------------------------------------------

 PROVIDER_REGISTRY: Dict[str, str] = {

-3

pyproject.toml

···
     # Media processing
     "opencv-python",

-    # Additional AI SDKs
-    "claude-agent-sdk>=0.1.0",
-
     # Development tools
     "black",
     "flake8",

-19

tests/integration/conftest.py

···
     config.addinivalue_line(
         "markers", "requires_api: mark test as requiring external API access"
     )
-    config.addinivalue_line(
-        "markers", "requires_claude_sdk: mark test as requiring Claude Agent SDK"
-    )
-
-
-def pytest_collection_modifyitems(config, items):
-    """Skip tests that require claude_agent_sdk if it's not installed."""
-    try:
-        import claude_agent_sdk  # noqa: F401
-
-        sdk_available = True
-    except ImportError:
-        sdk_available = False
-
-    if not sdk_available:
-        skip_claude = pytest.mark.skip(reason="claude_agent_sdk not installed")
-        for item in items:
-            if "requires_claude_sdk" in item.keywords:
-                item.add_marker(skip_claude)


 @pytest.fixture(scope="session")

-278

tests/integration/test_claude_provider.py

···
-# SPDX-License-Identifier: AGPL-3.0-only
-# Copyright (c) 2026 sol pbc
-
-"""Integration test for Claude provider with SDK integration."""
-
-import json
-import os
-import subprocess
-from pathlib import Path
-
-import pytest
-from dotenv import load_dotenv
-
-from muse.models import CLAUDE_SONNET_4
-
-# --- Shared Test Helpers ---
-
-
-def get_fixtures_env():
-    """Load the fixtures/.env file and return the environment."""
-    fixtures_env = Path(__file__).parent.parent.parent / "fixtures" / ".env"
-    if not fixtures_env.exists():
-        return None, None
-
-    load_dotenv(fixtures_env, override=True)
-    journal_path = os.getenv("JOURNAL_PATH")
-    return fixtures_env, journal_path
-
-
-def skip_if_claude_cli_unavailable():
-    """Check if Claude Code CLI is available, skip test if not."""
-    claude_path = Path.home() / ".claude" / "local" / "node_modules" / ".bin" / "claude"
-    if not claude_path.exists():
-        pytest.skip(f"Claude Code CLI not found at {claude_path}")
-
-    try:
-        result = subprocess.run(
-            [str(claude_path), "--version"],
-            capture_output=True,
-            text=True,
-            timeout=5,
-        )
-        if result.returncode != 0:
-            pytest.skip("Claude Code CLI not working properly")
-    except (subprocess.TimeoutExpired, FileNotFoundError):
-        pytest.skip("Claude Code CLI not found")
-
-
-def setup_test_journal(journal_path: str) -> Path:
-    """Create journal structure for testing. Returns journal_dir path."""
-    journal_dir = Path(journal_path)
-    journal_dir.mkdir(parents=True, exist_ok=True)
-
-    agents_dir = journal_dir / "agents"
-    agents_dir.mkdir(exist_ok=True)
-
-    health_dir = journal_dir / "health"
-    health_dir.mkdir(exist_ok=True)
-
-    return journal_dir
-
-
-def prepare_test_env(journal_path: str, max_tokens: int = 100) -> dict:
-    """Prepare environment variables for test subprocess."""
-    env = os.environ.copy()
-    env["JOURNAL_PATH"] = journal_path
-    claude_bin_dir = str(Path.home() / ".claude" / "local" / "node_modules" / ".bin")
-    env["PATH"] = claude_bin_dir + ":" + env.get("PATH", "")
-    env["CLAUDE_AGENT_MODEL"] = CLAUDE_SONNET_4
-    env["CLAUDE_AGENT_MAX_TOKENS"] = str(max_tokens)
-    return env
-
-
-def parse_events(stdout: str, strict: bool = True) -> list:
-    """Parse JSONL events from stdout."""
-    events = []
-    for line in stdout.strip().split("\n"):
-        if line:
-            try:
-                events.append(json.loads(line))
-            except json.JSONDecodeError as e:
-                if strict:
-                    pytest.fail(f"Failed to parse JSON line: {line}\nError: {e}")
-                # Non-strict: skip non-JSON lines (verbose output)
-    return events
-
-
-def handle_error_event(finish_event: dict):
-    """Handle error events, skipping for known intermittent issues."""
-    if finish_event.get("event") == "error":
-        error_msg = finish_event.get("error", "Unknown error")
-        if "CLI not found" in error_msg:
-            pytest.skip(f"Claude Code CLI issue: {error_msg}")
-        elif "rate" in error_msg.lower() or "retry" in error_msg.lower():
-            pytest.skip(f"Intermittent Claude API error: {error_msg}")
-        else:
-            pytest.fail(f"Unexpected error: {finish_event}")
-
-
-# --- Tests ---
-
-
-@pytest.mark.integration
-@pytest.mark.requires_claude_sdk
-def test_claude_provider_real_sdk():
-    """Test Claude provider with real SDK call if Claude Code CLI is available."""
-    fixtures_env, journal_path = get_fixtures_env()
-    if not fixtures_env:
-        pytest.skip("fixtures/.env not found")
-    if not journal_path:
-        pytest.skip("JOURNAL_PATH not found in fixtures/.env file")
-
-    skip_if_claude_cli_unavailable()
-    setup_test_journal(journal_path)
-    env = prepare_test_env(journal_path, max_tokens=100)
-
-    ndjson_input = json.dumps(
-        {
-            "prompt": "what is 2+2? Just give me the number.",
-            "provider": "claude",
-            "persona": "default",
-            "model": CLAUDE_SONNET_4,
-            "max_tokens": 100,
-        }
-    )
-
-    result = subprocess.run(
-        ["sol", "agents"],
-        env=env,
-        input=ndjson_input,
-        capture_output=True,
-        text=True,
-        timeout=30,
-    )
-
-    assert result.returncode == 0, f"Command failed with stderr: {result.stderr}"
-
-    events = parse_events(result.stdout, strict=True)
-    assert (
-        len(events) >= 2
-    ), f"Expected at least start and finish events, got {len(events)}"
-
-    # Check start event
-    start_event = events[0]
-    assert start_event["event"] == "start"
-    assert start_event["prompt"] == "what is 2+2? Just give me the number."
-    assert start_event["model"] == CLAUDE_SONNET_4
-    assert start_event["persona"] == "default"
-    assert start_event["provider"] == "claude"
-    # Claude provider now emits journal_path instead of facet
-    assert "journal_path" in start_event
-    if "ts" in start_event:
-        assert isinstance(start_event["ts"], int)
-
-    # Check finish event
-    finish_event = events[-1]
-    handle_error_event(finish_event)
-
-    assert (
-        finish_event["event"] == "finish"
-    ), f"Expected finish event, got: {finish_event}"
-    if "ts" in finish_event:
-        assert isinstance(finish_event["ts"], int)
-    assert "result" in finish_event
-
-    result_text = finish_event["result"].lower()
-    assert (
-        "4" in result_text or "four" in result_text
-    ), f"Expected '4' in response, got: {finish_event['result']}"
-
-    error_events = [e for e in events if e.get("event") == "error"]
-    assert len(error_events) == 0, f"Found error events: {error_events}"
-    assert result.stderr == "", f"Expected empty stderr, got: {result.stderr}"
-
-
-@pytest.mark.integration
-@pytest.mark.requires_claude_sdk
-def test_claude_provider_with_tool_calls():
-    """Test Claude provider with tool calls (read-only file access)."""
-    fixtures_env, journal_path = get_fixtures_env()
-    if not fixtures_env:
-        pytest.skip("fixtures/.env not found")
-    if not journal_path:
-        pytest.skip("JOURNAL_PATH not found in fixtures/.env file")
-
-    skip_if_claude_cli_unavailable()
-    journal_dir = setup_test_journal(journal_path)
-    env = prepare_test_env(journal_path, max_tokens=200)
-
-    # Create a test file in the journal (not in a facet)
-    test_file = journal_dir / "test_file.txt"
-    test_file.write_text("Hello from test file!")
-
-    try:
-        ndjson_input = json.dumps(
-            {
-                "prompt": f"Read the file at {test_file} and tell me what it says.",
-                "provider": "claude",
-                "persona": "default",
-                "model": CLAUDE_SONNET_4,
-                "max_tokens": 200,
-            }
-        )
-
-        result = subprocess.run(
-            ["sol", "agents", "-v"],
-            env=env,
-            input=ndjson_input,
-            capture_output=True,
-            text=True,
-            timeout=30,
-        )
-    finally:
-        test_file.unlink(missing_ok=True)
-
-    assert result.returncode == 0, f"Command failed with stderr: {result.stderr}"
-
-    events = parse_events(result.stdout, strict=False)
-
-    finish_event = events[-1]
-    handle_error_event(finish_event)
-
-    assert (
-        finish_event["event"] == "finish"
-    ), f"Expected finish event, got: {finish_event}"
-    result_text = finish_event["result"].lower()
-    assert (
-        "hello" in result_text
-    ), f"Expected 'hello' in response, got: {finish_event['result']}"
-
-
-@pytest.mark.integration
-@pytest.mark.requires_claude_sdk
-def test_claude_provider_with_thinking():
-    """Test Claude provider thinking/reasoning events."""
-    fixtures_env, journal_path = get_fixtures_env()
-    if not fixtures_env:
-        pytest.skip("fixtures/.env not found")
-    if not journal_path:
-        pytest.skip("JOURNAL_PATH not found in fixtures/.env file")
-
-    skip_if_claude_cli_unavailable()
-    setup_test_journal(journal_path)
-    env = prepare_test_env(journal_path, max_tokens=200)
-
-    ndjson_input = json.dumps(
-        {
-            "prompt": "Think step by step: If I have 3 apples and give away 1, how many do I have left?
Just give the number.", 249 249 - "provider": "claude", 250 250 - "persona": "default", 251 251 - "model": CLAUDE_SONNET_4, 252 252 - "max_tokens": 200, 253 253 - } 254 254 - ) 255 255 - 256 256 - result = subprocess.run( 257 257 - ["sol", "agents"], 258 258 - env=env, 259 259 - input=ndjson_input, 260 260 - capture_output=True, 261 261 - text=True, 262 262 - timeout=30, 263 263 - ) 264 264 - 265 265 - assert result.returncode == 0, f"Command failed with stderr: {result.stderr}" 266 266 - 267 267 - events = parse_events(result.stdout, strict=False) 268 268 - 269 269 - finish_event = events[-1] 270 270 - handle_error_event(finish_event) 271 271 - 272 272 - assert ( 273 273 - finish_event["event"] == "finish" 274 274 - ), f"Expected finish event, got: {finish_event}" 275 275 - result_text = finish_event["result"].lower() 276 276 - assert ( 277 277 - "2" in result_text or "two" in result_text 278 278 - ), f"Expected '2' in response, got: {finish_event['result']}"

-5

tests/test_agents_ndjson.py


@@ -65,11 +65,6 @@
         mock_module.run_agent = mock_run_agent
         monkeypatch.setitem(sys.modules, f"muse.providers.{provider_name}", mock_module)
 
-    # Mock claude which is still in muse/ (not a full provider)
-    mock_module = MagicMock()
-    mock_module.run_agent = mock_run_agent
-    monkeypatch.setitem(sys.modules, "muse.claude", mock_module)
-
     monkeypatch.setitem(sys.modules, "agents", MagicMock())
 
 

+2 -2

tests/test_cortex.py


@@ -397,7 +397,7 @@
     result = "Create a new matter for AI research"
     handoff = {
         "persona": "matter_editor",
-        "provider": "claude",
+        "provider": "anthropic",
         "facet": "test",
         "max_turns": 5,
     }
@@ -412,7 +412,7 @@
     mock_request.assert_called_once_with(
         prompt=result,
         persona="matter_editor",
-        provider="claude",
+        provider="anthropic",
         handoff_from=parent_id,
         config={"facet": "test", "max_turns": 5},
     )