Move the tool surface to FastMCP and tighten the input schemas

The hand-rolled MCP server (mcp.server.lowlevel + a custom uvicorn/SSE
shim) and the 160-line execute_tool dispatcher were costing more than
they were earning. This swaps them out for stock FastMCP 3.2:
each tools/*.py module owns a module-level FastMCP instance with
@mcp.tool decorators, and storied.mcp_server composes a per-role
top-level server by mounting them and applying tag-based visibility
filters.

Tags do most of the work now. Every tool carries role tags ("dm",
"planner", "seeder", "advancement") and category/mode tags, and
per-role filtering becomes a single disable+enable pair on the parent
server. The dynamic combat-tools-when-initiative-is-active swap moves
to mcp.enable/disable(tags=...) calls inside enter_initiative and
end_initiative themselves, where the state actually changes. The
PLANNER_TOOLS / SEEDER_TOOLS / ADVANCEMENT_TOOLS allowlists, the
*_execute_tool dispatchers, and the hand-maintained character_schemas
dict are gone.
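
To make the disable+enable idea concrete, here is a toy model of
tag-based visibility in plain Python. It is only a sketch and not
FastMCP's API or implementation: ToyServer, the "combat" and
"exploration" mode tags, and the seed_world tool are made up for
illustration, and the real tag assignments may differ.

```python
from typing import Optional


class ToyServer:
    """Toy model of tag-based tool visibility (not FastMCP's real API)."""

    def __init__(self) -> None:
        self.tools: dict[str, set[str]] = {}  # tool name -> tags
        self.hidden: set[str] = set()

    def add_tool(self, name: str, tags: set[str]) -> None:
        self.tools[name] = set(tags)

    def disable(self, tags: Optional[set[str]] = None) -> None:
        # With no tags, hide everything; otherwise hide tools sharing a tag.
        for name, tool_tags in self.tools.items():
            if tags is None or tool_tags & tags:
                self.hidden.add(name)

    def enable(self, tags: set[str]) -> None:
        # Un-hide every tool that carries at least one of the given tags.
        for name, tool_tags in self.tools.items():
            if tool_tags & tags:
                self.hidden.discard(name)

    def visible(self) -> list[str]:
        return sorted(n for n in self.tools if n not in self.hidden)


server = ToyServer()
server.add_tool("recall", {"dm", "planner"})
server.add_tool("enter_initiative", {"dm", "exploration"})  # assumed tags
server.add_tool("end_initiative", {"dm", "combat"})         # assumed tags
server.add_tool("seed_world", {"seeder"})                   # hypothetical tool

# Per-role filter: one disable + enable pair, then start outside combat.
server.disable()
server.enable({"dm"})
server.disable({"combat"})
dm_tools_exploring = server.visible()

# Roughly what enter_initiative would do when the state changes.
server.enable({"combat"})
server.disable({"exploration"})
dm_tools_in_combat = server.visible()
```

In this toy, enabling a mode tag would also un-hide tools belonging to
other roles that share it; none exist here, so that case is not shown.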

ToolContext is now a process-global initialised once via init_ctx(),
and each of its fields is exposed via a small Dependency subclass
(Combat, Lore, Timekeeper, World, Player, StorageRoot, Entities).
Tools declare only the slices they actually need at the
parameter site, and FastMCP introspection skips the Dependency-default
parameters so the LLM never sees them. While I was in there, the loose
input types got tightened too: enter_initiative.combatants is a
CombatantInput Pydantic model, create_character.abilities/purse are
Abilities/Purse, adjust_coins.deltas is CoinDelta, update_character.updates
is dict[str, JsonValue], and a handful of conceptually-enum string
parameters became Literal types so they show up as JSON Schema enums
(rest type, set_item_status, condition action/ends_on, recall scope,
the entity_type/content_type families).
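
As a rough illustration of both points above (hidden dependency
parameters and Literal-as-enum), here is a stdlib-only sketch of schema
introspection. It is not how FastMCP or Pydantic do it internally: the
Dependency and Combat classes below are stand-ins, llm_visible_params is
a hypothetical helper, and the "short"/"long" rest values are assumed.

```python
import inspect
from typing import Literal, get_args, get_origin


class Dependency:
    """Stand-in marker base for the real Dependency class."""


class Combat(Dependency):
    """Stand-in for one slice of the global tool context."""


def llm_visible_params(fn) -> dict:
    """Build a tiny schema of the parameters an LLM would be shown."""
    props = {}
    for name, param in inspect.signature(fn).parameters.items():
        if isinstance(param.default, Dependency):
            continue  # injected at call time, so left out of the schema
        ann = param.annotation
        if get_origin(ann) is Literal:
            props[name] = {"enum": list(get_args(ann))}
        else:
            props[name] = {"type": getattr(ann, "__name__", str(ann))}
    return props


def rest(
    kind: Literal["short", "long"],  # assumed values, for illustration
    hours: int,
    combat: Combat = Combat(),       # dependency default, hidden from the LLM
) -> str:
    return f"{kind} rest for {hours}h"


schema = llm_visible_params(rest)
```

Here schema contains only kind (as an enum) and hours; the combat
parameter never appears.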

The character.yaml shape gets a Pydantic schema too. update_character
validates the result before saving and rejects bad writes with a
DM-readable error (so the LLM can correct itself), and load_character
runs a small coercer that heals known mis-shapes from older sessions
(e.g. resources written as a list-of-pools instead of a dict-of-pools).
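
The list-of-pools case can be sketched like this. It is a minimal
illustration, not the actual coercer: coerce_resources is a hypothetical
name, and the assumption that each pool carries a "name" key alongside
fields like "current" and "max" is mine.

```python
from typing import Any


def coerce_resources(character: dict[str, Any]) -> dict[str, Any]:
    """Return a copy whose 'resources' is a dict of pools keyed by name."""
    healed = dict(character)  # shallow copy; the input is left untouched
    resources = healed.get("resources")
    if isinstance(resources, list):
        # Older shape (assumed): [{"name": "ki", "current": 2, "max": 3}]
        healed["resources"] = {
            pool["name"]: {k: v for k, v in pool.items() if k != "name"}
            for pool in resources
            if isinstance(pool, dict) and "name" in pool
        }
    return healed


old = {"name": "Mira", "resources": [{"name": "ki", "current": 2, "max": 3}]}
new = coerce_resources(old)
```

Data already in the dict-of-pools shape passes through unchanged, so the
coercer is safe to run on every load.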

A few related runtime crashes that surfaced in sandbox sessions are
fixed: the format_* helpers now tolerate wrong-shaped containers
instead of crashing the turn, /me threads base_path through to
load_character so sandbox sessions don't read the cwd character, the
sandbox dir gets a symlink to rules/ so recall(scope="rules") works,
and enter_initiative's combatant fields are now in the schema instead
of having to be guessed.
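
For the format_* tolerance, the general shape is something like the
following. This is a sketch only: format_conditions is a hypothetical
helper and the output format is invented; the real helpers may handle
different shapes.

```python
def format_conditions(conditions: object) -> str:
    """Render conditions for display without raising on odd shapes."""
    if isinstance(conditions, dict):
        parts = [f"{name} ({detail})" for name, detail in conditions.items()]
    elif isinstance(conditions, (list, tuple)):
        parts = [str(item) for item in conditions]
    elif conditions is None or conditions == "":
        parts = []
    else:
        # A bare scalar where a container was expected: show it as-is.
        parts = [str(conditions)]
    return ", ".join(parts) if parts else "none"
```

The point is that a wrong-shaped value degrades to readable text instead
of raising in the middle of a turn.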

Test coverage went from 86% to 96% in the process. A bunch of new
tests cover the FastMCP composition + tag filtering, the in-memory
client call paths through every tool wrapper, the deferred-notification
formatters, the schema validation/coercion, and the engine helpers.
The bits that drive a real `claude` subprocess (stream_action,
stream_with_tools, run_prompt, tick_world, BackgroundTicker._run,
BackgroundAdvancement._run, start_server, MCPServerHandle.stop) are
marked `# pragma: no cover` with notes explaining why: mocking them
would test the mock, not the launcher. The coverage threshold in
pyproject.toml is bumped from 85% to 95% so this can't slip back.

This has mostly held together so far, but it's a lot of moving pieces.
Please yell if anything in the sandbox or normal play feels off.