Storied: Architecture
Concept
A text-based solo 5e RPG where a frontier LLM plays the DM. The LLM runs the world, generates content lazily as the player explores, and persists everything as markdown files so the campaign accumulates continuity across sessions.
The Core Invariant: Tools Support, They Don't Implement
The DM's toolkit is a bookkeeping and state-tracking surface. It is not a 5e rules engine.
The DM — the LLM — is in charge of rules. When it needs the text of
Fireball or the rules for grappling, it uses recall to look them up from
the SRD. When it needs to decide whether exhaustion applies to a Wisdom
save, it reads the character sheet, knows the rule (or looks it up), and
rolls appropriately. When the player is hit with a damage type they're
resistant to, the DM does the math and records the result.
Tools track state: HP, conditions, effects, inventory, resources, coins, campaign log, entity history. Tools do not apply rules — no automatic resistance halving, no forced concentration-save DCs, no condition-derived disadvantage markers, no exhaustion math folded into skill totals. When a tool looks like it's "helping" by doing 5e math, it's actually taking authority away from the DM and making homebrew impossible.
This invariant has two practical payoffs:
- Homebrew and variant rulesets just work. The tools don't care whether you're on 2024 rules, 2014 rules, a retroclone, or your own house rules — they record what the DM says happened.
- The DM never fights the tools. There's no surprise halving, no silent concentration drop, no "wait, why did passive perception change?" Tools do exactly what they're asked, and nothing more.
When in doubt: make the tool dumber. A dumber tool is one the DM can trust.
Stack
- Language: Python 3.12+, fully typed (mypy strict)
- Interface: CLI with Rich-rendered streaming output and readline history; slash commands (/me, /status, /context, /note, /save, …)
- LLM: Anthropic Claude via Claude Code subprocess driver, stream-json I/O
- Tool protocol: FastMCP in-process HTTP server (SSE transport), composed per-role with tag-filtered visibility
- Sandbox: Pydantic Monty for DM-authored Python execution (no filesystem, no network, 5 s timeout, 10 MB memory cap)
- Search: sqlite-vec + fastembed BGE-small for semantic search over rules, world, and player content
- Persistence: Markdown files with YAML frontmatter under worlds/, players/, rules/
- Campaign log: GameTime anchors (#dX-HHMM) with append-only event history
- Rules reference: 5e SRD 5.2.1 processed into per-section markdown
Overview
┌──────────────────────────────────────────────────────────┐
│                     CLI (streaming)                      │
│  Player types → stream DM output → render via Rich       │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                  DM Engine (engine.py)                   │
│  • Build context (character + session + log + memories)  │
│  • Spawn Claude subprocess with MCP config               │
│  • Stream response, route tool calls, display output     │
└──────────────────────────────────────────────────────────┘
         │                   │                   │
         ▼                   ▼                   ▼
┌─────────────────┐ ┌────────────────┐ ┌──────────────────┐
│ FastMCP Server  │ │ Background     │ │ Vector Index     │
│ (per-role,      │ │ Agents         │ │ (sqlite-vec)     │
│  tag-filtered)  │ │ (seeder,       │ │                  │
│                 │ │  ticker,       │ │ rules + world    │
│                 │ │  advancement)  │ │ + player notes   │
└─────────────────┘ └────────────────┘ └──────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                 State (markdown + yaml)                  │
│  worlds/{world}/   players/{player}/   rules/srd-5.2.1/  │
└──────────────────────────────────────────────────────────┘
Component Responsibilities
Cold-Start Onboarding (cli.py + prompts/new-game.md)
When storied play launches against a fresh world — no character
sheet or no style.md — it enters an onboarding sequence rather than
the normal DM loop. The sequence runs entirely in one invocation:
- Onboarding conversation. A DM engine loads prompts/new-game.md and weaves character creation with worldbuilding preferences. The DM has two mandatory deliverables before end_session: tune() (writes worlds/{world}/style.md capturing tone, themes, pacing) and create_character(). The prompt forbids set_scene, establish, and any attempt to start the adventure; those belong to the seeder and the live DM.
- Artifact gate. After the onboarding engine exits, cmd_play re-checks both files. If either is missing, it prints a friendly "onboarding incomplete" message and exits; re-running storied play re-enters onboarding to fill the gap. The DM notices what already exists in context and only works on the missing deliverable.
- Synchronous planning phase. seed_world() runs as a blocking Claude subprocess using prompts/world-seed.md. It now reads style.md and prepends it as a ## Player Preferences block to the seeder's user message, so the 12–16 entities it establishes and the opening set_scene reflect the player's preferences rather than a lowest-common-denominator interpretation of the character sheet alone.
- Live play. A fresh DMEngine starts with prompts/dm-system.md, BackgroundTicker and BackgroundAdvancement come online, and the player drops into the opening scene.
cli.py extracts the input/stream/ticker/advancement loop into
_run_engine_loop so both the onboarding engine and the play engine
share the same interactive shell. Onboarding runs with
is_onboarding=True and no background agents; play runs with
is_onboarding=False and full ticker + advancement. --sandbox
skips cold-start entirely and goes straight to the DM system prompt.
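The artifact gate described above amounts to a pair of existence checks. The following is a minimal sketch, assuming the character sheet lives at players/{player}/character.yaml and the style file at worlds/{world}/style.md (as listed elsewhere in this document). The function name `missing_onboarding_artifacts` is hypothetical and is not the actual cmd_play code.

```python
from pathlib import Path


def missing_onboarding_artifacts(world_dir: Path, player_dir: Path) -> list[str]:
    """Return the onboarding deliverables that do not exist yet.

    Hypothetical helper: the real gate lives in cmd_play and may differ.
    """
    required = {
        "style.md": world_dir / "style.md",
        "character.yaml": player_dir / "character.yaml",
    }
    # Dict order is preserved, so the result lists style.md before the sheet.
    return [name for name, path in required.items() if not path.is_file()]
```

An empty list would mean onboarding is complete; a non-empty one tells the caller which deliverable the next onboarding run still has to produce.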
DM Engine (engine.py)
The agentic loop: builds the turn's context, spawns Claude with the MCP config, streams the response, and routes tool calls back into the FastMCP server. Context building pulls from the character sheet, session state, recent campaign log entries, style tuning, world memories, and pending background notifications. Every turn refreshes dynamic tool visibility (combat tags, advancement gates) before issuing the prompt.
FastMCP Composition (mcp_server.py + tools/*.py)
Each tools/*.py module defines its own module-level FastMCP instance.
At startup, _compose_server(role) builds a top-level server by mounting
the six tool modules and applying tag filters: other-role tags are
disabled, then the active role's tags are re-enabled so shared tools (like
update_character, tagged for both dm and advancement) survive.
For DM mode, two additional gates apply at compose time:
- Combat: tools tagged combat but not combat_control are disabled at startup. enter_initiative flips them on via _root.enable(keys=…) from inside combat._flip_into_combat; end_initiative flips them back off.
- Advancement: tools tagged advancement_available (currently just level_up) are disabled at startup. The engine calls character.refresh_advancement_visibility at the top of each turn, which enables them iff the character sheet has advancement_ready set.
Dynamic visibility mutations target the top-level composed server — that's what Claude sees, so changes propagate immediately without re-composition.
Roles: dm, planner, seeder, advancement. Each sees a different
subset of the tool surface.
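The visibility rules can be modelled without FastMCP at all. The sketch below is plain Python and does not use the FastMCP API; the `Tool` dataclass, the `visible_tools` function, and the tag sets attached to the example tools are illustrative assumptions. Only the role names, the combat, combat_control, and advancement_available tags, and the tool names come from the text above.

```python
from dataclasses import dataclass

ROLES = frozenset({"dm", "planner", "seeder", "advancement"})


@dataclass(frozen=True)
class Tool:
    name: str
    tags: frozenset[str]


def visible_tools(
    tools: list[Tool],
    role: str,
    *,
    in_combat: bool = False,
    advancement_ready: bool = False,
) -> list[str]:
    """Names of the tools a role can see under the rules described above."""
    names = []
    for tool in tools:
        other_roles = (tool.tags & ROLES) - {role}
        # Another role's tag disables the tool unless the active role's
        # tag re-enables it (this is how shared tools survive).
        if other_roles and role not in tool.tags:
            continue
        if role == "dm":
            # Combat gate: combat_control tools stay visible so combat can start.
            combat_only = "combat" in tool.tags and "combat_control" not in tool.tags
            if combat_only and not in_combat:
                continue
            # Advancement gate: hidden until advancement_ready is set.
            if "advancement_available" in tool.tags and not advancement_ready:
                continue
        names.append(tool.name)
    return names


# Example tag assignments (assumed for illustration only).
TOOLS = [
    Tool("roll", frozenset({"dm"})),
    Tool("update_character", frozenset({"dm", "advancement"})),
    Tool("enter_initiative", frozenset({"dm", "combat", "combat_control"})),
    Tool("next_turn", frozenset({"dm", "combat"})),
    Tool("level_up", frozenset({"dm", "advancement_available"})),
    Tool("establish", frozenset({"dm", "seeder", "planner"})),
]
```

With these assumed tags, `visible_tools(TOOLS, "dm")` hides next_turn and level_up until `in_combat` or `advancement_ready` is passed as true, and `visible_tools(TOOLS, "advancement")` sees only update_character.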
Tool Modules
| Module | Tools |
|---|---|
| tools/mechanics.py | roll (dumb dice), recall (vector search with scope filter + recency decay) |
| tools/scene.py | set_scene (keystone: logs event, advances clock, updates session state), tune, end_session, notify_dm |
| tools/entities.py | establish, mark, amend_mark, note_discovery: CRUD for world entities with Is/Was/Knows/Wants/Will structure |
| tools/character.py | Character sheet bookkeeping, raw state operations only (damage, heal, conditions, effects, inventory, resources, rest, level_up, notes, …) |
| tools/combat.py | Initiative tracker: enter_initiative, end_initiative, next_turn, add_combatant, remove_combatant, condition |
| tools/run_code.py | Sandboxed Python execution: all DM tools exposed as sync functions for orchestration |
The character tools are deliberately narrow: each one records exactly
what the DM tells it. damage(amount, type="fire") subtracts amount
from HP. It does not consult resistances. It does not halve anything. The
DM already knows the character is fire-resistant, pre-applied the math,
and the tool just records the result. The type field is metadata for
narration and the campaign log.
Background Agents (planner.py)
Three independent Claude-subprocess workers run between turns, each with its own role and a narrow tool surface:
- Seeder (seeder role): cold-start world building from the character sheet. Establishes 12–16 initial entities with Knows/Wants/Will hooks.
- Ticker (planner role): small off-screen world motion between turns, such as firing Will triggers, advancing thread deadlines, and adding beats to entities whose state should change.
- Advancement (advancement role): holistic level-up evaluation based on pacing, triumphs, narrative beats. When it decides the character has earned the next level, it sets advancement_ready on the sheet and notifies the DM. The DM picks the narrative moment to apply it.
Each agent writes back to the world state or notifies the DM via
notify_dm, which appends to a notifications queue the DM sees at the
top of its next turn.
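The queue behaviour can be sketched as an append on one side and a read-and-clear on the other. This is an illustration only: the signature of `notify_dm`, the `drain_notifications` helper, and the one-bullet-per-notification file format are assumptions, not the project's actual implementation.

```python
from pathlib import Path


def notify_dm(queue_file: Path, source: str, message: str) -> None:
    """Append one notification as a markdown bullet (assumed format)."""
    with queue_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{source}] {message}\n")


def drain_notifications(queue_file: Path) -> list[str]:
    """Return pending notifications in arrival order and empty the queue."""
    if not queue_file.exists():
        return []
    text = queue_file.read_text(encoding="utf-8")
    pending = [line for line in text.splitlines() if line.strip()]
    queue_file.write_text("", encoding="utf-8")
    return pending
```

Draining at the top of a turn means each notification is shown to the DM once; a second drain in the same turn returns an empty list.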
Campaign Log (log.py)
Canonical clock. Every set_scene appends a GameTime-anchored entry. Time
only advances when the DM logs it — the narrative and clock must agree, or
the DM is lying to the player. Event history is the DM's memory between
turns.
GameTime parses and renders #dX-HHMM anchors; the tool layer uses
from_anchor / to_anchor so the DM can backdate mark events with a
when parameter without re-computing the format by hand.
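Anchor parsing and rendering could be sketched as follows. The names `GameTime`, `from_anchor`, and `to_anchor` come from the text; the field layout and the reading of #dX-HHMM as an unpadded day number followed by a 24-hour clock are assumptions.

```python
import re
from dataclasses import dataclass

# Assumed grammar: "#d" + day number + "-" + HHMM on a 24-hour clock.
_ANCHOR = re.compile(r"#d(\d+)-([01]\d|2[0-3])([0-5]\d)")


@dataclass(frozen=True, order=True)
class GameTime:
    day: int
    hour: int
    minute: int

    @classmethod
    def from_anchor(cls, anchor: str) -> "GameTime":
        """Parse an anchor such as '#d3-1430' into day 3, 14:30."""
        match = _ANCHOR.fullmatch(anchor)
        if match is None:
            raise ValueError(f"not a GameTime anchor: {anchor!r}")
        day, hour, minute = (int(part) for part in match.groups())
        return cls(day, hour, minute)

    def to_anchor(self) -> str:
        """Render back to the '#dX-HHMM' form."""
        return f"#d{self.day}-{self.hour:02d}{self.minute:02d}"
```

Declaring the fields in day, hour, minute order with `order=True` makes instances sort chronologically, which is convenient for checking that a backdated event does not land after the current clock.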
Vector Index (search.py)
sqlite-vec + BGE-small embeddings over content tagged by source —
srd (shipped), user (homebrew), world (campaign-specific), and
player (player knowledge / discoveries). Age-decay scoring favors
recent world beats. recall accepts a scope filter:
- scope="rules" → searches srd + user + world (rule content can live at any of the three layers)
- scope="world" → narrative-only world content
- scope="all" → no filter
VectorIndex.search accepts source_filter: str | list[str] | None
for OR semantics across multiple sources. The index reseeds from a
pre-built SRD sqlite snapshot when present, otherwise re-indexes from
markdown sections, then appends user homebrew and world content on
top.
Path Configuration (paths.py)
Storied is "one process, one game." Filesystem paths live in module
globals on storied.paths, set once at CLI startup via configure(...).
Library code reads via getters — data_home(), worlds_path(),
players_path(), world_path(id), player_path(id), user_rules_path(),
shipped_rules_path(). There is no base_path parameter threaded
through anything; each function asks the module for the specific path
it needs.
from pathlib import Path

from storied.paths import configure, player_path

# CLI startup (once)
configure(data_home=Path.home() / ".storied")

# Library code (anywhere): ask the module for the specific path you need
def load_session(player_id: str) -> dict | None:
    path = player_path(player_id) / "session.md"
    ...
Two independent globals back the configuration:
- _data_home (default ~/.storied) holds worlds, players, sessions, transcripts, vector indices.
- _user_rules_home (default <data_home>/rules) holds personal homebrew. Sandbox mode points the data home at a tempdir but leaves the user-rules home at the real ~/.storied/rules/ so your homebrew is still available in throwaway sessions.
The STORIED_HOME env var overrides the default data home, and the
--base-path CLI flag overrides that. The
storied.paths.using_data_home(path) context manager is the
test-friendly form (used by the autouse _isolate_storied_paths
fixture in tests/conftest.py to point each test at its own
tmp_path).
Threads inherit module globals automatically — no contextvars
gymnastics. Subprocesses inherit via STORIED_HOME in the env dict
(exported in cmd_play for any future child process that re-enters
storied).
Content Layers
See content-layers.md for the full spec. Three layers, priority
world > user > shipped (first match wins):
~/.storied/worlds/{world}/ → campaign-specific (top priority)
~/.storied/rules/ → personal homebrew (overrides shipped)
<repo>/rules/srd-5.2.1/sections/ → stock 5e SRD
ContentResolver.find() walks all three layers for any content type
that might appear at multiple layers (monsters, spells, etc.).
Narrative content (npcs, locations, factions, threads, lore, maps)
only exists at the world layer; the lookups at the user and shipped
layers miss harmlessly.
Entity model (Is/Was/Knows/Wants/Will) is narrative, not mechanical. A door can "want" to stay closed. A coin can "know" where it was forged. The model gives the DM rich material to act on without encoding literal consciousness or rule triggers.
Lazy World Generation
The world expands as the player explores: entities move through phases — referenced → sketched → detailed → living. Each phase writes more to the file; once written, details become constraints on future generation. The seeder agent populates the initial layer; the ticker and the DM add to it over time.
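The phase progression can be pictured as a one-way ladder. Whether the project tracks phases explicitly is not stated, so the `Phase` enum and the `advance` helper below are purely illustrative.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Entity detail levels, in the order named above."""

    REFERENCED = 0
    SKETCHED = 1
    DETAILED = 2
    LIVING = 3


def advance(phase: Phase) -> Phase:
    """Move one phase forward; LIVING is terminal and nothing moves backward."""
    return Phase(min(phase + 1, Phase.LIVING))
```

The absence of a "demote" operation mirrors the rule that written details become constraints: an entity can gain detail but never lose it.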
Player State
Lives under players/{player_id}/:
- character.yaml: structured bookkeeping (abilities, HP, resources, conditions, effects, inventory, magic_items, features, defenses)
- character.md: free-text backstory, personality, aliases, voice
- notes.md: append-only timestamped journal
- session.md: current situation, location, present NPCs, open threads
- worlds/{world}/{npcs,locations,…}/*.md: the player's knowledge of the world, which may differ from DM truth
Player state is separate from world state so multiple characters can live in the same world (future multiplayer) and so the player's partial knowledge stays distinct from the DM's omniscient view.
What Lives Where
| Concern | Home |
|---|---|
| SRD rules text | rules/srd-5.2.1/sections/* (read-only, searchable via recall) |
| World state | worlds/{world}/{npcs,locations,items,factions,threads,lore,maps}/*.md |
| Player character | players/{player}/character.yaml + character.md |
| Campaign history | worlds/{world}/campaign-log.md (GameTime-anchored) |
| DM style tuning | worlds/{world}/style.md |
| Background notifications | worlds/{world}/notifications.md (queue for next turn) |
| Session state | players/{player}/session.md |
| Vector index | worlds/{world}/search.db (sqlite-vec) |
Open Questions
- Multiplayer: How do we present multiple characters' state to the DM without bloating context?
- Rules drift: The SRD will update; how do we manage the index when rules text changes?
- Session compaction: The campaign log is append-only; eventually we'll need windowing or summarization for very long campaigns.