A 5e storytelling engine with an LLM DM

Storied: Architecture#

Concept#

A text-based solo 5e RPG where a frontier LLM plays the DM. The LLM runs the world, generates content lazily as the player explores, and persists everything as markdown files so the campaign accumulates continuity across sessions.

The Core Invariant: Tools Support, They Don't Implement#

The DM's toolkit is a bookkeeping and state-tracking surface. It is not a 5e rules engine.

The DM — the LLM — is in charge of rules. When it needs the text of Fireball or the rules for grappling, it uses recall to look them up from the SRD. When it needs to decide whether exhaustion applies to a Wisdom save, it reads the character sheet, knows the rule (or looks it up), and rolls appropriately. When the player is hit with a damage type they're resistant to, the DM does the math and records the result.

Tools track state: HP, conditions, effects, inventory, resources, coins, campaign log, entity history. Tools do not apply rules — no automatic resistance halving, no forced concentration-save DCs, no condition-derived disadvantage markers, no exhaustion math folded into skill totals. When a tool looks like it's "helping" by doing 5e math, it's actually taking authority away from the DM and making homebrew impossible.

This invariant has two practical payoffs:

  1. Homebrew and variant rulesets just work. The tools don't care whether you're on 2024 rules, 2014 rules, a retroclone, or your own house rules — they record what the DM says happened.
  2. The DM never fights the tools. There's no surprise halving, no silent concentration drop, no "wait, why did passive perception change?" Tools do exactly what they're asked, and nothing more.

When in doubt: make the tool dumber. A dumber tool is one the DM can trust.

Stack#

  • Language: Python 3.12+, fully typed (mypy strict)
  • Interface: CLI with Rich-rendered streaming output and readline history; slash commands (/me, /status, /context, /note, /save, …)
  • LLM: Anthropic Claude via Claude Code subprocess driver, stream-json I/O
  • Tool protocol: FastMCP in-process HTTP server (SSE transport), composed per-role with tag-filtered visibility
  • Sandbox: Pydantic Monty for DM-authored Python execution — no filesystem, no network, 5 s timeout, 10 MB memory cap
  • Search: sqlite-vec + fastembed BGE-small for semantic search over rules, world, and player content
  • Persistence: Markdown files with YAML frontmatter under worlds/, players/, rules/
  • Campaign log: GameTime anchors (#dX-HHMM) with append-only event history
  • Rules reference: 5e SRD 5.2.1 processed into per-section markdown

Overview#

┌──────────────────────────────────────────────────────────┐
│                     CLI (streaming)                       │
│  Player types → stream DM output → render via Rich       │
└──────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────┐
│                    DM Engine (engine.py)                  │
│  • Build context (character + session + log + memories)  │
│  • Spawn Claude subprocess with MCP config                │
│  • Stream response, route tool calls, display output     │
└──────────────────────────────────────────────────────────┘
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────┐  ┌────────────────┐  ┌──────────────────┐
│  FastMCP Server │  │  Background    │  │  Vector Index    │
│  (per-role,     │  │  Agents        │  │  (sqlite-vec)    │
│   tag-filtered) │  │  (seeder,      │  │                  │
│                 │  │  ticker,       │  │  rules + world   │
│                 │  │  advancement)  │  │  + player notes  │
└─────────────────┘  └────────────────┘  └──────────────────┘
         │
         ▼
┌──────────────────────────────────────────────────────────┐
│                  State (markdown + yaml)                  │
│  worlds/{world}/   players/{player}/   rules/srd-5.2.1/  │
└──────────────────────────────────────────────────────────┘

Component Responsibilities#

Cold-Start Onboarding (cli.py + prompts/new-game.md)#

When storied play launches against a fresh world — no character sheet or no style.md — it enters an onboarding sequence rather than the normal DM loop. The sequence runs entirely in one invocation:

  1. Onboarding conversation. A DM engine loads prompts/new-game.md and weaves character creation with worldbuilding preferences. The DM has two mandatory deliverables before end_session: tune() (writes worlds/{world}/style.md capturing tone, themes, pacing) and create_character(). The prompt forbids set_scene, establish, and any attempt to start the adventure — those belong to the seeder and the live DM.
  2. Artifact gate. After the onboarding engine exits, cmd_play re-checks both files. If either is missing, it prints a friendly "onboarding incomplete" message and exits; re-running storied play re-enters onboarding to fill the gap. The DM notices what already exists in context and only works on the missing deliverable.
  3. Synchronous planning phase. seed_world() runs as a blocking Claude subprocess using prompts/world-seed.md. It reads style.md and prepends it as a ## Player Preferences block to the seeder's user message, so the 12-16 entities it establishes and the opening set_scene reflect the player's stated preferences rather than a lowest-common-denominator reading of the character sheet alone.
  4. Live play. A fresh DMEngine starts with prompts/dm-system.md, BackgroundTicker and BackgroundAdvancement come online, and the player drops into the opening scene.

cli.py extracts the input/stream/ticker/advancement loop into _run_engine_loop so both the onboarding engine and the play engine share the same interactive shell. Onboarding runs with is_onboarding=True and no background agents; play runs with is_onboarding=False and full ticker + advancement. --sandbox skips cold-start entirely and goes straight to the DM system prompt.
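
The artifact gate from step 2 can be sketched as a pure check over the two deliverables. This is a minimal illustration, not the code in cli.py: the helper name `missing_onboarding_artifacts` is hypothetical, and it assumes the character sheet lives at players/{player}/character.yaml and the style file at worlds/{world}/style.md, as listed elsewhere in this document.

```python
from pathlib import Path


def missing_onboarding_artifacts(world_dir: Path, player_dir: Path) -> list[str]:
    """Return labels for the onboarding deliverables that do not exist yet.

    Hypothetical helper: an empty list means onboarding is complete.
    """
    required = {
        "style.md (written by tune)": world_dir / "style.md",
        "character.yaml (written by create_character)": player_dir / "character.yaml",
    }
    # Keep the declared order so the message to the player is stable.
    return [label for label, path in required.items() if not path.is_file()]
```

A caller in the spirit of cmd_play would print the "onboarding incomplete" message and exit whenever the returned list is non-empty, and the next run would re-enter onboarding for only the missing items.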

DM Engine (engine.py)#

The agentic loop: builds the turn's context, spawns Claude with the MCP config, streams the response, and routes tool calls back into the FastMCP server. Context building pulls from the character sheet, session state, recent campaign log entries, style tuning, world memories, and pending background notifications. Every turn refreshes dynamic tool visibility (combat tags, advancement gates) before issuing the prompt.
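
As a rough sketch of the context-building step, the function below joins the named sources into one prompt block in a fixed order and skips anything empty. The section titles, the `CONTEXT_ORDER` constant, and the `build_context` signature are assumptions made for illustration; the real engine.py may structure its context differently.

```python
# Hypothetical section titles, in the order the sources are listed above.
CONTEXT_ORDER = [
    "Character",
    "Session",
    "Recent Log",
    "Style",
    "World Memories",
    "Notifications",
]


def build_context(sections: dict[str, str]) -> str:
    """Join the non-empty sections under '## Title' headings, in a fixed order."""
    blocks = []
    for title in CONTEXT_ORDER:
        body = sections.get(title, "").strip()
        if body:  # Missing or blank sources are omitted entirely.
            blocks.append(f"## {title}\n{body}")
    return "\n\n".join(blocks)
```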

FastMCP Composition (mcp_server.py + tools/*.py)#

Each tools/*.py module defines its own module-level FastMCP instance. At startup, _compose_server(role) builds a top-level server by mounting the six tool modules and applying tag filters: other-role tags are disabled, then the active role's tags are re-enabled so shared tools (like update_character, tagged for both dm and advancement) survive.
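
The two-pass filter (disable other roles' tags, then re-enable the active role's tags) can be modeled with plain sets. The sketch below does not use the FastMCP API; `visible_tools` and the tag assignments in `EXAMPLE_TOOLS` are hypothetical, except that update_character is tagged for both dm and advancement as described above. It also assumes a tool with no role tags stays visible.

```python
ROLES = {"dm", "planner", "seeder", "advancement"}


def visible_tools(tools: dict[str, set[str]], role: str) -> set[str]:
    """Model the compose-time filter: disable other roles, re-enable the active one."""
    other_roles = ROLES - {role}
    # Pass 1: a tool carrying any other role's tag is switched off.
    enabled = {name: not (tags & other_roles) for name, tags in tools.items()}
    # Pass 2: anything tagged for the active role is switched back on,
    # which is what lets shared tools survive pass 1.
    for name, tags in tools.items():
        if role in tags:
            enabled[name] = True
    return {name for name, on in enabled.items() if on}


# Illustrative tags only; the real modules may tag these tools differently.
EXAMPLE_TOOLS = {
    "roll": {"dm"},
    "update_character": {"dm", "advancement"},
    "establish": {"dm", "seeder"},
}
```

With these example tags, the advancement role sees only update_character, while the dm role sees all three tools.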

For DM mode, two additional gates apply at compose time:

  • Combat: tools tagged combat but not combat_control are disabled at startup. enter_initiative flips them on via _root.enable(keys=…) from inside combat._flip_into_combat; end_initiative flips them back off.
  • Advancement: tools tagged advancement_available (currently just level_up) are disabled at startup. The engine calls character.refresh_advancement_visibility at the top of each turn, which enables them iff the character sheet has advancement_ready set.

Dynamic visibility mutations target the top-level composed server — that's what Claude sees, so changes propagate immediately without re-composition.
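
The combat and advancement gates amount to a small piece of mutable state on the composed server. The following sketch models that state without FastMCP; the `ToolGates` class and the tags in `GATED_TOOLS` are assumptions for illustration (only the tag names combat, combat_control, and advancement_available come from the description above).

```python
from dataclasses import dataclass, field


def _combat_gated(tags: set[str]) -> bool:
    """Combat tools are gated unless they also control combat itself."""
    return "combat" in tags and "combat_control" not in tags


@dataclass
class ToolGates:
    """Tracks which tools are hidden on a composed server (illustrative only)."""

    tools: dict[str, set[str]]
    disabled: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Startup state: combat-only and advancement-gated tools start hidden.
        for name, tags in self.tools.items():
            if _combat_gated(tags) or "advancement_available" in tags:
                self.disabled.add(name)

    def _toggle(self, names: set[str], enable: bool) -> None:
        if enable:
            self.disabled -= names
        else:
            self.disabled |= names

    def set_combat(self, active: bool) -> None:
        """Stand-in for what enter_initiative / end_initiative flip."""
        names = {n for n, t in self.tools.items() if _combat_gated(t)}
        self._toggle(names, active)

    def refresh_advancement(self, advancement_ready: bool) -> None:
        """Stand-in for the per-turn advancement visibility refresh."""
        names = {n for n, t in self.tools.items() if "advancement_available" in t}
        self._toggle(names, advancement_ready)

    def visible(self) -> set[str]:
        return set(self.tools) - self.disabled


# Illustrative tags; the real tool modules may differ.
GATED_TOOLS = {
    "roll": set(),
    "enter_initiative": {"combat", "combat_control"},
    "end_initiative": {"combat", "combat_control"},
    "next_turn": {"combat"},
    "level_up": {"advancement_available"},
}
```

Because the state lives on one object that every caller shares, flipping a gate takes effect on the next tool listing with no rebuild, which mirrors the point about mutating the top-level composed server.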

Roles: dm, planner, seeder, advancement. Each sees a different subset of the tool surface.

Tool Modules#

| Module | Tools |
|---|---|
| tools/mechanics.py | roll (dumb dice), recall (vector search with scope filter + recency decay) |
| tools/scene.py | set_scene (keystone: logs event, advances clock, updates session state), tune, end_session, notify_dm |
| tools/entities.py | establish, mark, amend_mark, note_discovery: CRUD for world entities with Is/Was/Knows/Wants/Will structure |
| tools/character.py | Character sheet bookkeeping: raw state operations only (damage, heal, conditions, effects, inventory, resources, rest, level_up, notes, …) |
| tools/combat.py | Initiative tracker: enter_initiative, end_initiative, next_turn, add_combatant, remove_combatant, condition |
| tools/run_code.py | Sandboxed Python execution: all DM tools exposed as sync functions for orchestration |

The character tools are deliberately narrow: each one records exactly what the DM tells it. damage(amount, type="fire") subtracts amount from HP. It does not consult resistances. It does not halve anything. The DM already knows the character is fire-resistant, pre-applied the math, and the tool just records the result. The type field is metadata for narration and the campaign log.

Background Agents (planner.py)#

Three independent Claude-subprocess workers, each with its own role and a narrow tool surface. The seeder runs once at cold start; the ticker and advancement agents run between turns:

  • Seeder (seeder role): cold-start world building from the character sheet. Establishes 12-16 initial entities with Knows/Wants/Will hooks.
  • Ticker (planner role): small off-screen world motion between turns — firing Will triggers, advancing thread deadlines, adding beats to entities whose state should change.
  • Advancement (advancement role): holistic level-up evaluation based on pacing, triumphs, narrative beats. When it decides the character has earned the next level, it sets advancement_ready on the sheet and notifies the DM. The DM picks the narrative moment to apply it.

Each agent writes back to the world state or notifies the DM via notify_dm, which appends to a notifications queue the DM sees at the top of its next turn.
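
The notification queue can be pictured as an append-only file that the DM drains at the start of a turn. The sketch below is an assumption about the mechanics, not the real implementation: the line format, the `drain_notifications` helper, and the choice to clear the file after reading are all hypothetical.

```python
from pathlib import Path


def notify_dm(queue_file: Path, source: str, message: str) -> None:
    """Append one notification line to the queue file."""
    with queue_file.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{source}] {message}\n")


def drain_notifications(queue_file: Path) -> list[str]:
    """Return all pending lines and empty the queue (assumed behavior)."""
    if not queue_file.exists():
        return []
    text = queue_file.read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]
    queue_file.write_text("", encoding="utf-8")
    return lines
```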

Campaign Log (log.py)#

Canonical clock. Every set_scene appends a GameTime-anchored entry. Time only advances when the DM logs it — the narrative and clock must agree, or the DM is lying to the player. Event history is the DM's memory between turns.

GameTime parses and renders #dX-HHMM anchors; the tool layer uses from_anchor / to_anchor so the DM can backdate mark events with a when parameter without re-computing the format by hand.
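
A minimal sketch of anchor parsing and rendering follows. It assumes X is a non-negative integer day and HHMM is a 24-hour clock time, and it places from_anchor and to_anchor on the class itself; the real log.py may define the format or the helpers differently.

```python
import re
from dataclasses import dataclass

# Assumed anchor shape: "#d<day>-<HH><MM>", e.g. "#d3-1430".
_ANCHOR = re.compile(r"^#d(\d+)-([01]\d|2[0-3])([0-5]\d)$")


@dataclass(frozen=True, order=True)
class GameTime:
    """Day plus time of day; field order makes instances sort chronologically."""

    day: int
    hour: int
    minute: int

    @classmethod
    def from_anchor(cls, anchor: str) -> "GameTime":
        match = _ANCHOR.match(anchor)
        if match is None:
            raise ValueError(f"not a GameTime anchor: {anchor!r}")
        return cls(int(match[1]), int(match[2]), int(match[3]))

    def to_anchor(self) -> str:
        return f"#d{self.day}-{self.hour:02d}{self.minute:02d}"
```

Round-tripping through these two methods is what would let a tool accept a `when` value as a string and still compare it against the current clock.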

Vector Index (search.py)#

sqlite-vec + BGE-small embeddings over content tagged by source — srd (shipped), user (homebrew), world (campaign-specific), and player (player knowledge / discoveries). Age-decay scoring favors recent world beats. recall accepts a scope filter:

  • scope="rules" → searches srd + user + world (rule content can live at any of the three layers)
  • scope="world" → narrative-only world content
  • scope="all" → no filter

VectorIndex.search accepts source_filter: str | list[str] | None for OR semantics across multiple sources. The index reseeds from a pre-built SRD sqlite snapshot when present, otherwise re-indexes from markdown sections, then appends user homebrew and world content on top.

Path Configuration (paths.py)#

Storied is "one process, one game." Filesystem paths live in module globals on storied.paths, set once at CLI startup via configure(...). Library code reads via getters — data_home(), worlds_path(), players_path(), world_path(id), player_path(id), user_rules_path(), shipped_rules_path(). There is no base_path parameter threaded through anything; each function asks the module for the specific path it needs.

from pathlib import Path

from storied.paths import configure, player_path

# CLI startup (once): point every getter at the data home
configure(data_home=Path.home() / ".storied")

# Library code (anywhere): ask the module for the specific path it needs
def load_session(player_id: str) -> dict | None:
    path = player_path(player_id) / "session.md"
    ...

Two independent globals back the configuration:

  • _data_home (default ~/.storied) holds worlds, players, sessions, transcripts, vector indices.
  • _user_rules_home (default <data_home>/rules) holds personal homebrew. Sandbox mode points the data home at a tempdir but leaves the user-rules home at the real ~/.storied/rules/ so your homebrew is still available in throwaway sessions.

The STORIED_HOME environment variable overrides the default data home, and the --base-path CLI flag overrides that in turn. The storied.paths.using_data_home(path) context manager is the test-friendly form (used by the autouse _isolate_storied_paths fixture in tests/conftest.py to point each test at its own tmp_path).

Threads inherit module globals automatically — no contextvars gymnastics. Subprocesses inherit via STORIED_HOME in the env dict (exported in cmd_play for any future child process that re-enters storied).

Content Layers#

See content-layers.md for the full spec. Three layers, priority world > user > shipped (first match wins):

~/.storied/worlds/{world}/        → campaign-specific (top priority)
~/.storied/rules/                 → personal homebrew (overrides shipped)
<repo>/rules/srd-5.2.1/sections/  → stock 5e SRD

ContentResolver.find() walks all three layers for any content type that might appear at multiple layers (monsters, spells, etc.). Narrative content (npcs, locations, factions, threads, lore, maps) exists only at the world layer, so for those types the lookups at the user and shipped layers miss harmlessly.

Entity model (Is/Was/Knows/Wants/Will) is narrative, not mechanical. A door can "want" to stay closed. A coin can "know" where it was forged. The model gives the DM rich material to act on without encoding literal consciousness or rule triggers.

Lazy World Generation#

The world expands as the player explores: entities move through phases — referenced → sketched → detailed → living. Each phase writes more to the file; once written, details become constraints on future generation. The seeder agent populates the initial layer; the ticker and the DM add to it over time.
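
The phase progression can be modeled as an ordered enum. The sketch below assumes an entity never moves back to an earlier phase, which follows from written details becoming constraints but is not stated outright; the `Phase` and `promote` names are hypothetical.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Entity detail levels, ordered from least to most developed."""

    REFERENCED = 0
    SKETCHED = 1
    DETAILED = 2
    LIVING = 3


def promote(current: Phase, target: Phase) -> Phase:
    """Move an entity forward to `target`, never backward (assumed rule)."""
    return max(current, target)
```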

Player State#

Lives under players/{player_id}/:

  • character.yaml: structured bookkeeping (abilities, HP, resources, conditions, effects, inventory, magic_items, features, defenses)
  • character.md: free-text backstory, personality, aliases, voice
  • notes.md: append-only timestamped journal
  • session.md: current situation, location, present NPCs, open threads
  • worlds/{world}/{npcs,locations,…}/*.md: the player's knowledge of the world, which may differ from DM truth

Player state is separate from world state so multiple characters can live in the same world (future multiplayer) and so the player's partial knowledge stays distinct from the DM's omniscient view.

What Lives Where#

| Concern | Home |
|---|---|
| SRD rules text | rules/srd-5.2.1/sections/* (read-only, searchable via recall) |
| World state | worlds/{world}/{npcs,locations,items,factions,threads,lore,maps}/*.md |
| Player character | players/{player}/character.yaml + character.md |
| Campaign history | worlds/{world}/campaign-log.md (GameTime-anchored) |
| DM style tuning | worlds/{world}/style.md |
| Background notifications | worlds/{world}/notifications.md (queue for next turn) |
| Session state | players/{player}/session.md |
| Vector index | worlds/{world}/search.db (sqlite-vec) |

Open Questions#

  • Multiplayer: How do we present multiple characters' state to the DM without bloating context?
  • Rules drift: The SRD will update; how do we manage the index when rules text changes?
  • Session compaction: The campaign log is append-only; eventually we'll need windowing or summarization for very long campaigns.