# 9plan Agent Validation System

This directory contains infrastructure for validating the 9plan MCP server using the **"Claude sandwich"** pattern - an outer AI agent controls an inner AI agent to test the server in realistic conditions.

## Overview

The validation system tests 9plan by having an outer Claude:
1. Spawn an inner Claude that thinks it's just building an app
2. Guide the inner Claude step-by-step through building "Notekeeper" (a test project)
3. Verify the inner Claude correctly uses 9plan tools
4. Check invariants and state between steps

This provides more realistic testing than unit tests because the inner agent behaves naturally, not knowing it's being tested.

## Directory Structure

```
validation/
├── README.md                      # This file
├── outer-claude-guide.md          # Instructions for the outer Claude
├── sandbox/
│   └── notekeeper/                # Test project for inner Claude to build
│       ├── package.json           # Pre-configured
│       ├── tsconfig.json          # Pre-configured
│       └── src/                   # Empty - inner Claude populates this
└── scenarios/
    ├── schema.md                  # Scenario file format documentation
    ├── notekeeper-full.yaml       # Complete Notekeeper build scenario
    ├── decomposition-test.yaml    # Test plan decomposition workflow
    ├── error-conditions.yaml      # Test error handling
    ├── dependency-resolution.yaml # Test semantic search for dependencies
    └── session-recovery.yaml      # Test session resume after context loss
```

## How It Works

### The "Claude Sandwich" Pattern

```
┌─────────────────────────────────────────────────────┐
│  OUTER CLAUDE (the puppetmaster)                    │
│     │                                               │
│     ├─→ Reads outer-claude-guide.md                 │
│     ├─→ Loads scenario file (e.g., notekeeper-full) │
│     │                                               │
│     ├─→ Runs: claude -p "Create session" --json     │
│     │     └─→ INNER CLAUDE creates 9plan session    │
│     │                                               │
│     ├─→ Validates state (admin tools, filesystem)   │
│     │                                               │
│     ├─→ Runs: claude -p "Add plans" --resume $sid   │
│     │     └─→ INNER CLAUDE adds plans to queue      │
│     │                                               │
│     ├─→ Validates state...                          │
│     │                                               │
│     └─→ Continues until scenario complete           │
└─────────────────────────────────────────────────────┘
```

### Key Components

1. **Outer Claude** - Reads scenarios, runs commands, validates state
2. **Inner Claude** - Builds the app using 9plan (doesn't know it's a test)
3. **Notekeeper** - Simple CLI app used as test project
4. **Scenarios** - YAML files describing what to test and expected outcomes
5. **Admin Tools** - `9plan_admin_*` tools for state verification

## Running Validation

### Prerequisites

1. 9plan MCP server is built (`npm run build`)
2. `.mcp.json` is configured in project root
3. `claude` CLI is available

### Running a Scenario

Ask your outer Claude (the one you're talking to):

```
Please run the 9plan validation scenario at validation/scenarios/notekeeper-full.yaml
```

The outer Claude will:
1. Read `outer-claude-guide.md` for instructions
2. Parse the scenario file
3. Execute step-by-step, spawning inner Claude instances
4. Validate state between steps
5. Report pass/fail with details

### Manual Testing

You can also run individual steps manually. **Important**: You must include `--mcp-config` and `--allowedTools` flags:

```powershell
# Define allowed tools
$ALLOWED = "Read,Write,mcp__9plan__9plan_session_create,mcp__9plan__9plan_session_resume,mcp__9plan__9plan_queue_add,mcp__9plan__9plan_queue_pull,mcp__9plan__9plan_plan_complete,mcp__9plan__9plan_plan_defer,mcp__9plan__9plan_plan_discard,mcp__9plan__9plan_history_search,mcp__9plan__9plan_history_get"

# Start a session
$r = claude -p "Create a 9plan session for Notekeeper" --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json
$sid = ($r | ConvertFrom-Json).session_id

# Continue with more prompts (using --resume)...
claude -p "Add a plan for the storage module" --resume $sid --mcp-config ".mcp.json" --allowedTools $ALLOWED
```

**Note on `--allowedTools`**: MCP tools use the format `mcp__<server>__<toolname>`. The inner Claude cannot prompt for permissions in non-interactive mode, so all needed tools must be pre-approved.
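
The naming rule above can be sketched as a small helper that assembles the `--allowedTools` value instead of typing it by hand. This is a hypothetical TypeScript utility, not part of 9plan: the function name `buildAllowedTools` and its parameters are assumptions made for illustration.

```typescript
// Hypothetical helper (not part of 9plan): builds the comma-separated
// value for --allowedTools from built-in tool names and MCP tool names.
function buildAllowedTools(
  builtins: string[],
  server: string,
  mcpTools: string[],
): string {
  // MCP tools follow the mcp__<server>__<toolname> pattern described above.
  const qualified = mcpTools.map((tool) => `mcp__${server}__${tool}`);
  return [...builtins, ...qualified].join(",");
}

// Example with a subset of the tools listed in this README.
const allowed = buildAllowedTools(
  ["Read", "Write"],
  "9plan",
  ["9plan_session_create", "9plan_queue_add", "9plan_queue_pull"],
);
console.log(allowed);
```

Generating the string from a list makes it harder to forget the `mcp__9plan__` prefix on a newly added tool.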

## Scenarios

| Scenario | Description |
|----------|-------------|
| `notekeeper-full` | Build complete Notekeeper app from scratch |
| `decomposition-test` | Test plan decomposition and parent aggregation |
| `error-conditions` | Verify error handling (e.g., pulling while a plan is already active) |
| `dependency-resolution` | Test semantic search for input dependencies |
| `session-recovery` | Test resuming session after context loss |

## Admin Tools

These tools help the outer Claude verify state between steps:

| Tool | Purpose |
|------|---------|
| `9plan_admin_validate` | Check all invariants hold |
| `9plan_admin_sessions` | List all sessions |
| `9plan_admin_state` | Dump detailed session state |

## Success Criteria

Validation passes when:
- [ ] All scenarios execute without errors
- [ ] Invariants hold at every step (checked via admin tools)
- [ ] Inner Claude successfully builds Notekeeper
- [ ] Notekeeper CLI works (`add`, `list`, `search`, `delete` commands)
- [ ] History search returns expected results
- [ ] Decomposition/aggregation workflow completes correctly
validation/outer-claude-guide.md
# Outer Claude Validation Guide

You are the **outer Claude** - the puppetmaster who will control an inner Claude to validate the 9plan MCP server. This guide explains how to run validation scenarios.

## Pre-Validation Checklist

Before starting, verify:

- [ ] 9plan MCP server is built: Run `npm run build` in project root
- [ ] `.mcp.json` exists in project root with 9plan configured
- [ ] The `claude` CLI is available and authenticated
- [ ] The sandbox project exists at `validation/sandbox/notekeeper/`

## Critical: CLI Flags for Inner Claude

Every `claude -p` command MUST include these flags:

### `--mcp-config ".mcp.json"`
Loads the 9plan MCP server configuration so the inner Claude can use 9plan tools.

### `--allowedTools "..."`
Pre-approves tools for non-interactive use. The inner Claude cannot prompt for permissions in `-p` mode, so you must pre-approve all tools it will need.

**Tool naming format**: MCP tools use the pattern `mcp__<server>__<tool>`. For 9plan, this is `mcp__9plan__<toolname>`.

**Standard allowedTools for full validation:**
```
--allowedTools "Read,Write,mcp__9plan__9plan_session_create,mcp__9plan__9plan_session_resume,mcp__9plan__9plan_queue_add,mcp__9plan__9plan_queue_pull,mcp__9plan__9plan_plan_complete,mcp__9plan__9plan_plan_defer,mcp__9plan__9plan_plan_discard,mcp__9plan__9plan_history_search,mcp__9plan__9plan_history_get"
```

**Why `Read,Write`?** The inner Claude needs file access to read plan files and create implementation files.

## How to Run Validation

### Step 1: Load the Scenario

Read the scenario file you want to run (e.g., `validation/scenarios/notekeeper-full.yaml`). This contains:
- `task_description`: What to tell the inner Claude
- `steps`: Sequence of prompts and expected outcomes
- `verification`: Commands to run at the end

### Step 2: Start Inner Claude Session

Run the first prompt to create a session:

```powershell
$ALLOWED = "Read,Write,mcp__9plan__9plan_session_create,mcp__9plan__9plan_session_resume,mcp__9plan__9plan_queue_add,mcp__9plan__9plan_queue_pull,mcp__9plan__9plan_plan_complete,mcp__9plan__9plan_plan_defer,mcp__9plan__9plan_plan_discard,mcp__9plan__9plan_history_search,mcp__9plan__9plan_history_get"

$result = claude -p "<first prompt from scenario>" --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json
$sessionId = ($result | ConvertFrom-Json).session_id
```

Save the `session_id` - you'll need it for `--resume` in subsequent steps. Also note: defining `$ALLOWED` once makes subsequent commands cleaner.
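
If the outer agent drives the CLI from a script rather than PowerShell, the same `session_id` extraction might look like the TypeScript sketch below. It assumes only what the step above states, namely that the JSON output carries a `session_id` field; the function name `extractSessionId` and the sample payload are made up for illustration.

```typescript
// Sketch: pull session_id out of the JSON printed by --output-format json.
// Only the session_id field is taken from this guide; other fields are ignored.
function extractSessionId(rawOutput: string): string {
  const parsed: unknown = JSON.parse(rawOutput);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("CLI output is not a JSON object");
  }
  const id = (parsed as Record<string, unknown>)["session_id"];
  if (typeof id !== "string" || id.length === 0) {
    // Fail loudly: without an id, later --resume calls cannot work.
    throw new Error("CLI output has no session_id");
  }
  return id;
}

// Made-up sample payload; the "result" field is illustrative only.
const sampleOutput = '{"session_id":"abc-123","result":"Session created"}';
console.log(extractSessionId(sampleOutput));
```

Throwing when the field is missing stops the run at step 2 instead of letting a blank id reach `--resume`.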

### Step 3: Validate State

Between each step, check that the inner Claude did the right thing:

1. **Check filesystem** - Use `ls`, `cat`, or Read tool to verify files were created/modified
2. **Use admin tools** - Call `9plan_admin_state` to see queue/active plan state
3. **Check for errors** - Look for error messages in the inner Claude's response

### Step 4: Send Next Prompt

Continue the conversation with `--resume`:

```powershell
$result = claude -p "<next prompt from scenario>" --resume $sessionId --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json
```

### Step 5: Repeat Until Done

Repeat steps 3 and 4 until every prompt in the scenario has been sent.
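
The send-then-validate loop in steps 2 to 5 can be sketched in TypeScript as follows. Everything here is an assumption made for illustration: the `Step` and `StepOutcome` types, the injected `runPrompt` and `validate` callbacks (stand-ins for the `claude -p` call and the state checks), and the choice to stop at the first failing step, which this guide does not mandate.

```typescript
// Hypothetical types: the real scenario schema has more fields.
interface Step {
  id: string;
  prompt: string;
}

interface StepOutcome {
  stepId: string;
  passed: boolean;
  issues: string[];
}

// Injected so the loop can be exercised without the claude CLI.
type RunPrompt = (
  prompt: string,
  sessionId: string | null,
) => { sessionId: string; response: string };
type ValidateState = (stepId: string, response: string) => string[];

function runScenario(
  steps: Step[],
  runPrompt: RunPrompt,
  validate: ValidateState,
): StepOutcome[] {
  const outcomes: StepOutcome[] = [];
  let sessionId: string | null = null;
  for (const step of steps) {
    // The first call has no session; later calls resume the same one.
    const result = runPrompt(step.prompt, sessionId);
    sessionId = result.sessionId;
    const issues = validate(step.id, result.response);
    outcomes.push({ stepId: step.id, passed: issues.length === 0, issues });
    // Assumption: stop early so later steps do not run on bad state.
    if (issues.length > 0) break;
  }
  return outcomes;
}

// Usage with stubbed dependencies.
const seenSessions: Array<string | null> = [];
const demoOutcomes = runScenario(
  [
    { id: "create_session", prompt: "Create a session" },
    { id: "bootstrap_plans", prompt: "Add plans" },
    { id: "execute_storage", prompt: "Pull the first plan" },
  ],
  (_prompt, sessionId) => {
    seenSessions.push(sessionId);
    return { sessionId: "demo-session", response: "ok" };
  },
  (stepId) => (stepId === "bootstrap_plans" ? ["queue_length mismatch"] : []),
);
for (const outcome of demoOutcomes) {
  console.log(`${outcome.stepId}: ${outcome.passed ? "PASS" : "FAIL"}`);
}
```

Passing the CLI call and the validator in as functions keeps the loop itself free of process spawning, so it can be checked with stubs like the ones above.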

### Step 6: Final Verification

Run the verification commands from the scenario:

```powershell
# Example: Test the Notekeeper CLI
cd validation/sandbox/notekeeper
npm install   # the scenario's verification installs dependencies before building
npm run build
node dist/index.js add "Test note"
node dist/index.js list
```

## Handling Failures

### Inner Claude Makes a Mistake

If the inner Claude does something unexpected:
1. Note what went wrong
2. Decide if this is a 9plan bug or expected agent behavior
3. You may need to guide the inner Claude with a corrective prompt

### State Verification Fails

If `9plan_admin_validate` returns issues:
1. This is likely a 9plan bug
2. Document the exact state and what went wrong
3. Report the failure

### Inner Claude Gets Stuck

If the inner Claude seems confused or stuck:
1. Try a more specific prompt
2. Check if the scenario description is unclear
3. You may need to provide hints

## Prompting the Inner Claude

### Good Prompts

- Be specific about what you want done
- Reference 9plan tools naturally (e.g., "create a session", "add a plan")
- Don't mention that this is a test

### Bad Prompts

- "Test the 9plan server" (reveals it's a test)
- "Use 9plan_session_create" (too prescriptive about tool names)
- Vague instructions that could be interpreted multiple ways

## Example Validation Flow

```powershell
# Define allowed tools once
$ALLOWED = "Read,Write,mcp__9plan__9plan_session_create,mcp__9plan__9plan_session_resume,mcp__9plan__9plan_queue_add,mcp__9plan__9plan_queue_pull,mcp__9plan__9plan_plan_complete,mcp__9plan__9plan_plan_defer,mcp__9plan__9plan_plan_discard,mcp__9plan__9plan_history_search,mcp__9plan__9plan_history_get"

# Step 1: Create session
$r1 = claude -p "I want to build a simple CLI note-taking app called Notekeeper. Start by creating a 9plan session to track this work." --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json
$sid = ($r1 | ConvertFrom-Json).session_id

# Validate: Session should be created
# - Check $env:LOCALAPPDATA/9plan/Data/sessions/ for new directory
# - The inner Claude should tell you the session name

# Step 2: Add plans
$r2 = claude -p "Now let's plan out the work. Add plans for: 1) the storage module, 2) add command, 3) list command, 4) search command, 5) delete command. Add them in the order they should be executed." --resume $sid --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json

# Validate: Plans should be in queue
# - Check plans/ directory in the session folder
# - Verify plan files exist (e.g., k7f3m.txt)

# Step 3: Execute first plan
$r3 = claude -p "Pull the first plan and implement the storage module in validation/sandbox/notekeeper/src/storage.ts" --resume $sid --mcp-config ".mcp.json" --allowedTools $ALLOWED --output-format json

# Validate: Storage module created
# - Check validation/sandbox/notekeeper/src/storage.ts exists
# - Check plan file was deleted from plans/
# - History should now contain the completed plan

# Continue with remaining plans...
```

## Using Admin Tools

Between steps, use these tools to verify state:

### 9plan_admin_validate

Returns whether all invariants hold:
```json
{
  "valid": true,
  "invariants": {
    "single_active_plan": true,
    "queue_order_preserved": true,
    "files_match_database": true
  },
  "issues": []
}
```
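
One way to act on this result is sketched below in TypeScript. The `ValidateResult` shape is copied from the example output above; the helper name `summarizeValidation`, and the rule that all three signals must agree before the state counts as healthy, are assumptions rather than documented 9plan behavior.

```typescript
// Shape copied from the example 9plan_admin_validate output above.
interface ValidateResult {
  valid: boolean;
  invariants: Record<string, boolean>;
  issues: string[];
}

// Lists the invariants that do not hold and gives an overall verdict.
function summarizeValidation(result: ValidateResult): {
  ok: boolean;
  failed: string[];
} {
  const failed = Object.keys(result.invariants).filter(
    (name) => !result.invariants[name],
  );
  // Assumption: treat the state as healthy only if every signal agrees.
  const ok = result.valid && failed.length === 0 && result.issues.length === 0;
  return { ok, failed };
}

const healthyResult: ValidateResult = {
  valid: true,
  invariants: {
    single_active_plan: true,
    queue_order_preserved: true,
    files_match_database: true,
  },
  issues: [],
};

// Made-up failing result for illustration.
const brokenResult: ValidateResult = {
  valid: false,
  invariants: {
    single_active_plan: false,
    queue_order_preserved: true,
    files_match_database: true,
  },
  issues: ["two plans are marked active"],
};

console.log(summarizeValidation(healthyResult));
console.log(summarizeValidation(brokenResult));
```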

### 9plan_admin_state

Returns detailed state dump:
```json
{
  "session_name": "copper-velvet-morning",
  "queue": [
    {"id": "k7f3m", "goal": "Create storage module", "position": 1}
  ],
  "active_plan": null,
  "completed_count": 0,
  "plan_files": ["k7f3m.txt"]
}
```
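
A state dump like this can be compared against the `state` block of a scenario step. The TypeScript sketch below is hypothetical: `AdminState` mirrors the example above, `ExpectedState` covers only a few of the keys that appear in the scenario files, and the mapping of `queue_length` to the length of `queue` is an assumption.

```typescript
// Shape copied from the example 9plan_admin_state output above.
interface AdminState {
  session_name: string;
  queue: Array<{ id: string; goal: string; position: number }>;
  active_plan: unknown; // null when no plan is active
  completed_count: number;
  plan_files: string[];
}

// Hypothetical subset of the `state` keys a scenario step may list.
interface ExpectedState {
  queue_length?: number;
  completed_count?: number;
  active_plan?: boolean;
}

// Returns one message per mismatch; an empty array means the state matches.
function diffState(expected: ExpectedState, actual: AdminState): string[] {
  const mismatches: string[] = [];
  // Assumption: queue_length refers to the number of entries in `queue`.
  if (
    expected.queue_length !== undefined &&
    expected.queue_length !== actual.queue.length
  ) {
    mismatches.push(
      `queue_length: expected ${expected.queue_length}, got ${actual.queue.length}`,
    );
  }
  if (
    expected.completed_count !== undefined &&
    expected.completed_count !== actual.completed_count
  ) {
    mismatches.push(
      `completed_count: expected ${expected.completed_count}, got ${actual.completed_count}`,
    );
  }
  // Assumption: any non-null active_plan value means a plan is active.
  const hasActive =
    actual.active_plan !== null && actual.active_plan !== undefined;
  if (expected.active_plan !== undefined && expected.active_plan !== hasActive) {
    mismatches.push(`active_plan: expected ${expected.active_plan}, got ${hasActive}`);
  }
  return mismatches;
}

const sampleState: AdminState = {
  session_name: "copper-velvet-morning",
  queue: [{ id: "k7f3m", goal: "Create storage module", position: 1 }],
  active_plan: null,
  completed_count: 0,
  plan_files: ["k7f3m.txt"],
};

console.log(diffState({ queue_length: 1, completed_count: 0 }, sampleState));
console.log(diffState({ queue_length: 2, active_plan: true }, sampleState));
```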

### 9plan_admin_sessions

Lists all sessions:
```json
{
  "sessions": [
    {"name": "copper-velvet-morning", "created": "2024-01-15", "plans": 5}
  ]
}
```

## Success Criteria

A scenario passes when:

1. **All prompts execute** - Inner Claude responds to each step
2. **State is valid at each step** - `9plan_admin_validate` returns no issues
3. **Expected outcomes match** - Files created, plans completed, etc.
4. **Verification passes** - Final commands work as expected
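
The four criteria above combine into a single verdict. The following TypeScript sketch shows one possible way to fold them together; the `ScenarioRun` record and the `evaluateRun` function are hypothetical names, not part of 9plan or the scenario schema.

```typescript
// Hypothetical record of what the outer agent observed during a run.
interface ScenarioRun {
  scenarioName: string;
  stepsCompleted: number;
  stepsTotal: number;
  validationIssues: string[]; // from 9plan_admin_validate
  outcomeMismatches: string[]; // expected outcomes that did not match
  verificationFailures: string[]; // final commands that failed
}

// Applies the four criteria in order and collects every reason for failure.
function evaluateRun(run: ScenarioRun): {
  status: "PASS" | "FAIL";
  reasons: string[];
} {
  const reasons: string[] = [];
  if (run.stepsCompleted < run.stepsTotal) {
    reasons.push(`only ${run.stepsCompleted}/${run.stepsTotal} prompts executed`);
  }
  for (const issue of run.validationIssues) {
    reasons.push(`invalid state: ${issue}`);
  }
  for (const mismatch of run.outcomeMismatches) {
    reasons.push(`outcome mismatch: ${mismatch}`);
  }
  for (const failure of run.verificationFailures) {
    reasons.push(`verification failed: ${failure}`);
  }
  return { status: reasons.length === 0 ? "PASS" : "FAIL", reasons };
}

// Made-up runs for illustration.
const cleanRun: ScenarioRun = {
  scenarioName: "notekeeper-full",
  stepsCompleted: 4,
  stepsTotal: 4,
  validationIssues: [],
  outcomeMismatches: [],
  verificationFailures: [],
};
const partialRun: ScenarioRun = {
  ...cleanRun,
  stepsCompleted: 3,
  verificationFailures: ["list command printed nothing"],
};

console.log(evaluateRun(cleanRun).status);
console.log(evaluateRun(partialRun).reasons.join("\n"));
```

Collecting every reason, rather than returning at the first failure, gives the validation report a complete issue list.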

## Reporting Results

After running a scenario, report:

```
## Validation Report: [scenario-name]

**Status**: PASS / FAIL

**Steps Completed**: X/Y

**Issues Found**:
- (list any problems)

**Inner Claude Behavior**:
- (notes on how the inner Claude performed)

**9plan Bugs Found**:
- (list any bugs in the MCP server)
```
schema_version: "1.0"
scenario_id: decomposition-test
description: |
  Test plan decomposition and parent aggregation workflow.
  Verifies that parent plans can be deferred, child plans executed,
  and parent plans re-pulled for aggregation.

prerequisites:
  9plan_built: true

steps:
  # Step 1: Create session
  - id: create_session
    description: "Create a session for decomposition testing"
    prompt: |
      Create a 9plan session to test building a small utility library.

    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true

  # Step 2: Add parent plan that's too big
  - id: add_parent_plan
    description: "Add a plan that needs decomposition"
    prompt: |
      Add a plan for "Build complete utility library" that includes:
      - String utilities (capitalize, truncate, etc.)
      - Array utilities (unique, flatten, etc.)
      - Date utilities (format, parse, etc.)

      This is a big plan that covers multiple areas.

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 1

  # Step 3: Pull and recognize need for decomposition
  - id: pull_and_decompose
    description: "Pull the plan and decompose it"
    prompt: |
      Pull the plan. This is too big to do in one go - let's break it down.

      Add three sub-plans:
      1. String utilities module
      2. Array utilities module
      3. Date utilities module

      Then defer the parent plan to the back of the queue so we can
      aggregate the results later.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_queue_add  # Adding subplans
        - 9plan_plan_defer
      state:
        # After: 3 subplans at front, parent at back = 4 total
        queue_length: 4

    validation:
      admin_validate: true
      # Verify parent was deferred with reason
      custom_check: |
        # The parent plan's notes should mention decomposition

  # Step 4: Execute first subplan
  - id: execute_subplan_1
    description: "Execute string utilities subplan"
    prompt: |
      Pull and complete the string utilities plan.
      Just describe what would be in it - no need to actually implement.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 3
        completed_count: 1

  # Step 5: Execute second subplan
  - id: execute_subplan_2
    description: "Execute array utilities subplan"
    prompt: |
      Pull and complete the array utilities plan.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 2
        completed_count: 2

  # Step 6: Execute third subplan
  - id: execute_subplan_3
    description: "Execute date utilities subplan"
    prompt: |
      Pull and complete the date utilities plan.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 1  # Only parent remains
        completed_count: 3

  # Step 7: Re-pull parent and aggregate
  - id: aggregate_parent
    description: "Re-pull parent plan and aggregate child outcomes"
    prompt: |
      Pull the remaining plan (the parent).
      Use 9plan_history_search or 9plan_history_get to find the child outcomes.
      Then complete the parent with an aggregated summary of all the work done.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_history_search  # or 9plan_history_get
        - 9plan_plan_complete
      state:
        queue_length: 0
        completed_count: 4
        queue_empty: true

    validation:
      admin_validate: true

verification:
  history_searches:
    - query: "string utilities"
      min_results: 1

    - query: "array utilities"
      min_results: 1

    - query: "date utilities"
      min_results: 1

    - query: "utility library"
      min_results: 1
      description: "Should find the parent plan"

  final_state:
    queue_empty: true
    all_plans_completed: true
    completed_count: 4
validation/scenarios/dependency-resolution.yaml
schema_version: "1.0"
scenario_id: dependency-resolution
description: |
  Test semantic search for resolving input dependencies between plans.
  Verifies that 9plan_history_search correctly finds completed plans
  that match input descriptions.

prerequisites:
  9plan_built: true

steps:
  # Step 1: Create session
  - id: create_session
    description: "Create a session for dependency testing"
    prompt: |
      Create a 9plan session to build a simple API client library.

    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true

  # Step 2: Add foundation plan
  - id: add_foundation
    description: "Add a plan that produces outputs other plans will need"
    prompt: |
      Add a plan for creating an HTTP client wrapper.
      The outputs should include:
      - httpClient module with get(), post(), put(), delete() methods
      - Error handling types (ApiError, NetworkError)

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 1

  # Step 3: Add dependent plan
  - id: add_dependent
    description: "Add a plan that depends on the foundation"
    prompt: |
      Add a plan for creating user API methods.
      The inputs should reference the httpClient module from the previous plan.
      The outputs should include:
      - userApi module with getUser(), createUser(), updateUser() methods

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 2

  # Step 4: Add another dependent plan
  - id: add_another_dependent
    description: "Add another plan that also depends on the foundation"
    prompt: |
      Add a plan for creating posts API methods.
      The inputs should also reference the httpClient module.
      The outputs should include:
      - postsApi module with getPosts(), createPost() methods

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 3

  # Step 5: Execute foundation plan
  - id: execute_foundation
    description: "Pull and complete the HTTP client plan"
    prompt: |
      Pull the first plan (HTTP client) and complete it.
      Describe what was created in the outcome - mention the httpClient module,
      the get/post/put/delete methods, and the error types.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 2
        completed_count: 1

  # Step 6: Execute dependent plan with dependency resolution
  - id: execute_with_resolution
    description: "Pull user API plan and resolve its dependency"
    prompt: |
      Pull the next plan (user API).
      Before implementing, search the history for "httpClient module" to find
      where the HTTP client was created. Then complete the plan, mentioning
      that you found and used the httpClient from the previous work.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_history_search
        - 9plan_plan_complete
      state:
        queue_length: 1
        completed_count: 2
      response_contains:
        - "httpClient"  # Should mention finding the dependency

    validation:
      # Verify the search actually found the foundation plan
      custom_check: |
        # history_search for "httpClient" should return 1 result

  # Step 7: Execute final plan with same dependency
  - id: execute_final
    description: "Pull posts API plan and resolve same dependency"
    prompt: |
      Pull the final plan (posts API).
      Search history for the httpClient again and complete the plan.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_history_search
        - 9plan_plan_complete
      state:
        queue_length: 0
        completed_count: 3

verification:
  # Verify semantic search works for various queries
  history_searches:
    - query: "httpClient get post"
      min_results: 1
      description: "Should find HTTP client plan"

    - query: "user API getUser"
      min_results: 1
      description: "Should find user API plan"

    - query: "posts API createPost"
      min_results: 1
      description: "Should find posts API plan"

    - query: "ApiError NetworkError"
      min_results: 1
      description: "Should find HTTP client by error types"

  final_state:
    queue_empty: true
    all_plans_completed: true
    completed_count: 3
validation/scenarios/error-conditions.yaml
schema_version: "1.0"
scenario_id: error-conditions
description: |
  Test error handling for invalid operations.
  Verifies that 9plan returns appropriate errors for invalid state transitions.

prerequisites:
  9plan_built: true

steps:
  # Step 1: Create session
  - id: create_session
    description: "Create a session for error testing"
    prompt: |
      Create a 9plan session for testing error conditions.

    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true

  # Step 2: Try to pull from empty queue
  - id: pull_empty_queue
    description: "Try to pull when queue is empty"
    prompt: |
      Try to pull a plan from the queue.
      The queue should be empty, so this should fail or indicate there's nothing to pull.

    expected:
      tools_called:
        - 9plan_queue_pull
      response_contains:
        - "empty"  # Should mention queue is empty

  # Step 3: Try to complete without active plan
  - id: complete_no_active
    description: "Try to complete when no plan is active"
    prompt: |
      Try to complete a plan with outcome "test".
      There's no active plan, so this should fail.

    expected:
      tools_called:
        - 9plan_plan_complete
      response_contains:
        - "no active"  # Should mention no active plan

  # Step 4: Try to defer without active plan
  - id: defer_no_active
    description: "Try to defer when no plan is active"
    prompt: |
      Try to defer a plan with reason "testing".
      There's no active plan, so this should fail.

    expected:
      tools_called:
        - 9plan_plan_defer
      response_contains:
        - "no active"

  # Step 5: Add a plan and pull it
  - id: setup_active_plan
    description: "Add and pull a plan to set up active state"
    prompt: |
      Add a simple test plan and then pull it so we have an active plan.

    expected:
      tools_called:
        - 9plan_queue_add
        - 9plan_queue_pull
      state:
        active_plan: true

  # Step 6: Try to pull again while plan is active
  - id: pull_while_active
    description: "Try to pull when a plan is already active"
    prompt: |
      Try to pull another plan.
      We already have an active plan, so this should fail.

    expected:
      tools_called:
        - 9plan_queue_pull
      response_contains:
        - "already active"  # Should mention plan already active

  # Step 7: Clean up - complete the active plan
  - id: cleanup
    description: "Complete the active plan to clean up"
    prompt: |
      Complete the active plan with outcome "test completed".

    expected:
      tools_called:
        - 9plan_plan_complete
      state:
        active_plan: false
        completed_count: 1

  # Step 8: Try to get non-existent plan from history
  - id: history_get_invalid
    description: "Try to get a plan that doesn't exist"
    prompt: |
      Try to get plan details for a non-existent plan ID like "xxxxx".

    expected:
      tools_called:
        - 9plan_history_get
      response_contains:
        - "not found"  # Should indicate plan not found

  # Step 9: Try to resume non-existent session
  - id: resume_invalid_session
    description: "Try to resume a session that doesn't exist"
    prompt: |
      Try to resume a session called "nonexistent-fake-session".

    expected:
      tools_called:
        - 9plan_session_resume
      response_contains:
        - "not found"  # Should indicate session not found

verification:
  final_state:
    # After all tests, should have 1 completed plan
    completed_count: 1
    queue_empty: true

  # All error conditions should have been tested
  error_tests_passed:
    - pull_empty_queue
    - complete_no_active
    - defer_no_active
    - pull_while_active
    - history_get_invalid
    - resume_invalid_session
validation/scenarios/notekeeper-full.yaml
schema_version: "1.0"
scenario_id: notekeeper-full
description: |
  Complete Notekeeper CLI application build from scratch.
  Tests the full 9plan workflow: session creation, planning, execution, and completion.

prerequisites:
  9plan_built: true
  sandbox_clean: true  # validation/sandbox/notekeeper/src/ should be empty

# The overall task the inner Claude is working on
task_description: |
  Build a simple CLI note-taking application called Notekeeper.
  It should support: add, list, search, and delete commands.
  Notes are stored in a JSON file.

steps:
  # Step 1: Create the session
  - id: create_session
    description: "Create a 9plan session for the Notekeeper project"
    prompt: |
      I want to build a simple CLI note-taking app called Notekeeper.
      It should let users add notes, list all notes, search notes, and delete notes.
      Notes will be stored in a JSON file.

      Start by creating a 9plan session to track this work.

    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true
        queue_length: 0
      response_contains:
        - "Session"

    validation:
      admin_validate: true

  # Step 2: Bootstrap plans
  - id: bootstrap_plans
    description: "Add initial plans for all components"
    prompt: |
      Now let's plan out the work. Think at the FEATURE level, not individual functions.

      We need two main pieces:
      1. Storage layer - Note type definition and JSON file persistence (loadNotes, saveNotes, generateId)
      2. CLI layer - All commands (add, list, search, delete) plus the entry point that routes to them

      Add these as 2 plans in the order they should be executed.
      The storage module should be first since the CLI depends on it.

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 2  # 2 plans: storage layer, CLI layer
      files:
        - pattern: "~/.9plan/sessions/*/plans/*.txt"
          count: 2

    validation:
      admin_validate: true

  # Step 3: Execute storage layer
  - id: execute_storage
    description: "Pull and implement the storage layer"
    prompt: |
      Pull the first plan and implement the storage layer.
      Create the files in validation/sandbox/notekeeper/src/.

      The Note type should have: id (string), content (string), createdAt (string).
      The storage module should export: loadNotes(), saveNotes(), generateId().

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 1
        completed_count: 1
      files:
        - pattern: "validation/sandbox/notekeeper/src/types.ts"
          exists: true
          contains:
            - "interface Note"
            - "id"
            - "content"
        - pattern: "validation/sandbox/notekeeper/src/storage.ts"
          exists: true
          contains:
            - "loadNotes"
            - "saveNotes"

    validation:
      admin_validate: true

  # Step 4: Execute CLI layer (all commands + entry point)
  - id: execute_cli
    description: "Pull and implement all CLI commands and entry point"
    prompt: |
      Pull the final plan and implement the complete CLI layer:
      - add command: takes content string, saves new note
      - list command: displays all notes with IDs and content
      - search command: finds notes by keyword
      - delete command: removes note by ID
      - Entry point: parses args and routes to the right command

      Usage: notekeeper <command> [args]
      Commands: add <content>, list, search <keyword>, delete <id>

      Create these in validation/sandbox/notekeeper/src/commands/ and src/index.ts.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 0
        completed_count: 2
        queue_empty: true
      files:
        - pattern: "validation/sandbox/notekeeper/src/commands/add.ts"
          exists: true
        - pattern: "validation/sandbox/notekeeper/src/commands/list.ts"
          exists: true
        - pattern: "validation/sandbox/notekeeper/src/commands/search.ts"
          exists: true
        - pattern: "validation/sandbox/notekeeper/src/commands/delete.ts"
          exists: true
        - pattern: "validation/sandbox/notekeeper/src/index.ts"
          exists: true

    validation:
      admin_validate: true

# Final verification after all steps
verification:
  # Build the project
  commands:
    - command: "cd validation/sandbox/notekeeper && npm install"
      description: "Install dependencies"
      success: true

    - command: "cd validation/sandbox/notekeeper && npm run build"
      description: "Build TypeScript"
      success: true

    - command: "node validation/sandbox/notekeeper/dist/index.js add \"Test note from validation\""
      description: "Test add command"
      output_contains: "added"

    - command: "node validation/sandbox/notekeeper/dist/index.js list"
      description: "Test list command"
      output_contains: "Test note from validation"

    - command: "node validation/sandbox/notekeeper/dist/index.js search test"
      description: "Test search command"
      output_contains: "Test note from validation"

  # Verify history search works
  history_searches:
    - query: "storage loadNotes saveNotes"
      min_results: 1
      description: "Should find the storage layer plan"

    - query: "CLI commands add list search delete"
      min_results: 1
      description: "Should find the CLI layer plan"

  # Final state check
  final_state:
    queue_empty: true
    all_plans_completed: true
    completed_count: 2
-232
validation/scenarios/schema.md
# Validation Scenario File Schema

This document describes the YAML format for validation scenario files.

## Overview

Scenario files define a sequence of steps for the outer Claude to execute against an inner Claude, along with expected outcomes and verification commands.

## Schema Version

All scenario files must specify the schema version:

```yaml
schema_version: "1.0"
```

## Top-Level Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `schema_version` | string | Yes | Schema version (currently "1.0") |
| `scenario_id` | string | Yes | Unique identifier for this scenario |
| `description` | string | Yes | Human-readable description |
| `prerequisites` | object | No | What must be true before running |
| `steps` | array | Yes | Sequence of prompts and validations |
| `verification` | object | No | Final verification commands |
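
As a rough illustration of how a loader might enforce the table above, the following Python sketch checks a scenario that has already been parsed into a plain dict (for example by a YAML library). The helper name `check_top_level` and the message wording are hypothetical and are not part of 9plan.

```python
# Hypothetical helper: 9plan does not ship this function.
REQUIRED_TOP_LEVEL = {
    "schema_version": str,
    "scenario_id": str,
    "description": str,
    "steps": list,
}
OPTIONAL_TOP_LEVEL = {"prerequisites": dict, "verification": dict}


def check_top_level(scenario: dict) -> list:
    """Return a list of problems; an empty list means the top level looks valid."""
    problems = []
    for name, expected_type in REQUIRED_TOP_LEVEL.items():
        if name not in scenario:
            problems.append(f"missing required field: {name}")
        elif not isinstance(scenario[name], expected_type):
            problems.append(f"{name} should be of type {expected_type.__name__}")
    for name, expected_type in OPTIONAL_TOP_LEVEL.items():
        if name in scenario and not isinstance(scenario[name], expected_type):
            problems.append(f"{name} should be of type {expected_type.__name__}")
    # The schema document lists "1.0" as the only current version.
    if scenario.get("schema_version") not in (None, "1.0"):
        problems.append('unsupported schema_version (expected "1.0")')
    return problems
```

Returning a list of problems rather than raising on the first one lets a scenario author see every issue in a single pass.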

## Prerequisites

Optional conditions that must be met before running:

```yaml
prerequisites:
  9plan_built: true           # npm run build completed
  sandbox_clean: true         # sandbox/notekeeper/src/ is empty
  no_existing_sessions: true  # no 9plan sessions exist
```

## Steps

Each step contains a prompt to send and expected outcomes:

```yaml
steps:
  - id: create_session
    description: "Create a 9plan session for the project"
    prompt: |
      I want to build a simple CLI note-taking app called Notekeeper.
      Start by creating a 9plan session to track this work.

    expected:
      # What the inner Claude should do
      tools_called:
        - 9plan_session_create

      # State after this step
      state:
        session_exists: true
        queue_length: 0
        active_plan: null

      # Files that should exist
      files:
        - pattern: "~/.9plan/sessions/*/session.db"
          exists: true
        - pattern: "~/.9plan/sessions/*/plans/"
          is_directory: true

    # Optional: How to validate
    validation:
      admin_validate: true  # Run 9plan_admin_validate
      custom_check: |
        # PowerShell to run for custom validation
        Test-Path ~/.9plan/sessions/*/session.db
```

## Step Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | Yes | Unique step identifier |
| `description` | string | Yes | What this step does |
| `prompt` | string | Yes | Prompt to send to inner Claude |
| `expected` | object | No | Expected outcomes |
| `validation` | object | No | How to validate this step |
| `on_failure` | string | No | What to do if step fails ("abort", "continue", "retry") |
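
The schema names the three `on_failure` values but does not spell out their behaviour, so the Python sketch below is only one plausible reading. It assumes that a missing `on_failure` means "abort", that "retry" allows one extra attempt, and that an exhausted retry aborts. `run_with_policy`, `ScenarioAborted`, and the `run_once` callback are hypothetical names.

```python
class ScenarioAborted(Exception):
    """Raised when a failed step asks for the whole scenario to stop."""


def run_with_policy(step: dict, run_once, max_retries: int = 1) -> bool:
    """Run one step and apply its on_failure policy.

    run_once is a caller-supplied callable that takes the step and returns
    True on success. The default policy and retry count are assumptions.
    """
    policy = step.get("on_failure", "abort")
    if policy not in ("abort", "continue", "retry"):
        raise ValueError(f"unknown on_failure value: {policy!r}")
    attempts = 1 + (max_retries if policy == "retry" else 0)
    for _ in range(attempts):
        if run_once(step):
            return True
    if policy == "continue":
        return False  # record the failure and move on to the next step
    raise ScenarioAborted(step["id"])
```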

## Expected Outcomes

### tools_called

List of MCP tools the inner Claude should call:

```yaml
expected:
  tools_called:
    - 9plan_session_create
    - 9plan_queue_add
```
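
One way to compare this list against what the inner Claude actually did is sketched below in Python. It assumes the outer Claude has already collected the observed tool names into a list (how it does so is not specified here), that order and extra calls do not matter, and that observed names may carry the `mcp__9plan__` prefix used in the `invoke-inner.ps1` allow-list. `missing_tools` is a hypothetical helper.

```python
MCP_PREFIX = "mcp__9plan__"  # prefix seen in invoke-inner.ps1's allow-list


def missing_tools(expected: list, observed: list) -> list:
    """Return expected tool names that never appear in the observed calls."""
    normalized = {
        name[len(MCP_PREFIX):] if name.startswith(MCP_PREFIX) else name
        for name in observed
    }
    return [name for name in expected if name not in normalized]
```

An empty result means every expected tool was called at least once.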

### state

Expected 9plan state after this step:

```yaml
expected:
  state:
    session_exists: true
    queue_length: 3
    active_plan: null
    completed_count: 0
```
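
A minimal Python sketch of the comparison, assuming the actual state is available as a dict (for instance from an admin tool's output; that source is an assumption). Only the keys listed under `expected.state` are checked, so extra keys in the actual state are ignored. `state_mismatches` is a hypothetical name.

```python
MISSING = "<missing>"  # marker for keys absent from the actual state


def state_mismatches(expected: dict, actual: dict) -> dict:
    """Map each mismatched key to an (expected, actual) pair."""
    diffs = {}
    for key, want in expected.items():
        got = actual.get(key, MISSING)
        if got != want:
            diffs[key] = (want, got)
    return diffs
```

Using a marker for absent keys keeps `active_plan: null` (an explicit "no active plan") distinct from a state report that omits the key altogether.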

### files

Expected file system state:

```yaml
expected:
  files:
    - pattern: "validation/sandbox/notekeeper/src/storage.ts"
      exists: true
      contains:
        - "export function loadNotes"
        - "export function saveNotes"
    - pattern: "~/.9plan/sessions/*/plans/*.txt"
      count: 3  # Exactly 3 plan files
```
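
The following Python sketch shows how one `files` entry could be evaluated with the standard library. It assumes `pattern` is a shell-style glob with `~` expansion and that `contains` means a plain substring match in every matching file; neither detail is confirmed by the schema. `check_file_expectation` is a hypothetical helper.

```python
import glob
import os


def check_file_expectation(spec: dict) -> list:
    """Return problems for one entry of expected.files (empty list = pass)."""
    pattern = os.path.expanduser(spec["pattern"])  # handles "~/.9plan/..."
    matches = glob.glob(pattern)
    problems = []
    if "exists" in spec and bool(matches) != spec["exists"]:
        problems.append(f"{spec['pattern']}: exists should be {spec['exists']}")
    if spec.get("is_directory") and not any(os.path.isdir(m) for m in matches):
        problems.append(f"{spec['pattern']}: no matching directory")
    if "count" in spec and len(matches) != spec["count"]:
        problems.append(
            f"{spec['pattern']}: expected {spec['count']} matches, found {len(matches)}"
        )
    for path in matches:
        if not os.path.isfile(path):
            continue  # substring checks only apply to regular files
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        for needle in spec.get("contains", []):
            if needle not in text:
                problems.append(f"{path}: missing {needle!r}")
    return problems
```

Note that `contains` alone reports nothing when no file matches, so pairing it with `exists: true`, as the example above does, is what catches a missing file.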

### response_contains

Keywords that should appear in inner Claude's response:

```yaml
expected:
  response_contains:
    - "session created"
    - "Session:"
```

## Verification

Final verification after all steps complete:

```yaml
verification:
  # Commands to run
  commands:
    - command: "cd validation/sandbox/notekeeper && npm run build"
      success: true

    - command: "node validation/sandbox/notekeeper/dist/index.js add 'Test note'"
      output_contains: "Note added"

    - command: "node validation/sandbox/notekeeper/dist/index.js list"
      output_contains: "Test note"

  # History searches to verify
  history_searches:
    - query: "storage module"
      min_results: 1

    - query: "add command"
      results_contain:
        goal_keywords: ["add", "command"]

  # Final state check
  final_state:
    queue_empty: true
    all_plans_completed: true
```
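
To make the `commands` entries concrete, here is a small Python sketch that runs one entry through the system shell. It assumes `success: true` means a zero exit code and that `output_contains` is a case-sensitive substring match against stdout and stderr combined; both are guesses about the intended semantics. `run_verification_command` is a hypothetical helper.

```python
import subprocess


def run_verification_command(spec: dict) -> list:
    """Run one verification.commands entry and return any problems found."""
    result = subprocess.run(
        spec["command"], shell=True, capture_output=True, text=True
    )
    problems = []
    if spec.get("success") and result.returncode != 0:
        problems.append(f"exit code {result.returncode}")
    needle = spec.get("output_contains")
    if needle is not None and needle not in result.stdout + result.stderr:
        problems.append(f"output does not contain {needle!r}")
    return problems
```

`shell=True` is used because the scenario commands rely on shell syntax such as `cd ... && npm run build`; that is acceptable only because scenario files are trusted input.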

## Complete Example

```yaml
schema_version: "1.0"
scenario_id: simple-session-test
description: "Test basic session creation and plan lifecycle"

prerequisites:
  9plan_built: true

steps:
  - id: create_session
    description: "Create a session"
    prompt: "Create a 9plan session for testing"
    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true
    validation:
      admin_validate: true

  - id: add_plan
    description: "Add a test plan"
    prompt: "Add a plan to test something simple"
    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 1

  - id: pull_plan
    description: "Pull the plan"
    prompt: "Pull the plan and mark it complete with a simple outcome"
    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 0
        completed_count: 1

verification:
  history_searches:
    - query: "test"
      min_results: 1

  final_state:
    queue_empty: true
```

## Scenario Files in This Directory

| File | Description |
|------|-------------|
| `notekeeper-full.yaml` | Complete Notekeeper build scenario |
| `decomposition-test.yaml` | Test plan decomposition and aggregation |
| `error-conditions.yaml` | Test error handling |
| `dependency-resolution.yaml` | Test semantic search for dependencies |
| `session-recovery.yaml` | Test session resume after context loss |
-153
validation/scenarios/session-recovery.yaml
schema_version: "1.0"
scenario_id: session-recovery
description: |
  Test session resume after context loss.
  Simulates a scenario where the agent loses context and needs to resume
  a session using 9plan_session_resume.

prerequisites:
  9plan_built: true

# Special instruction for outer Claude:
# After step 3, you will start a NEW inner Claude session (new --resume chain)
# to simulate context loss. The new session should use session_resume.

steps:
  # Phase 1: Initial work (first inner Claude session)

  - id: create_session
    description: "Create a session and do some work"
    prompt: |
      Create a 9plan session for building a calculator app.

    expected:
      tools_called:
        - 9plan_session_create
      state:
        session_exists: true
    # IMPORTANT: Outer Claude must save the session name for later

  - id: add_plans
    description: "Add several plans"
    prompt: |
      Add plans for:
      1. Basic operations (add, subtract, multiply, divide)
      2. Advanced operations (power, sqrt, log)
      3. Memory functions (store, recall, clear)

    expected:
      tools_called:
        - 9plan_queue_add
      state:
        queue_length: 3

  - id: start_work
    description: "Pull and complete the first plan"
    prompt: |
      Pull the first plan (basic operations) and complete it.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 2
        completed_count: 1

  # --- CONTEXT LOSS SIMULATION ---
  # Outer Claude: Start a new inner Claude session here (no --resume)
  # This simulates the agent losing context mid-task

  - id: context_loss
    description: "Simulate context loss by starting fresh inner Claude"
    special_instruction: |
      OUTER CLAUDE: Start a completely new inner Claude session.
      Do NOT use --resume. This simulates the agent losing all context.
      Save the session name from step 1 to give to the new session.
    prompt: null  # No prompt - this is an instruction for outer Claude

  # Phase 2: Recovery (new inner Claude session)

  - id: resume_session
    description: "Resume the session using session name"
    # Outer Claude should tell the new inner Claude about the session
    prompt: |
      You were working on a calculator app but lost context.
      Resume the 9plan session named "{SESSION_NAME_FROM_STEP_1}".
      (Outer Claude: substitute the actual session name here)

    expected:
      tools_called:
        - 9plan_session_resume
      state:
        session_exists: true
      response_contains:
        - "resumed"

  - id: check_state_after_resume
    description: "Check the queue state after resuming"
    prompt: |
      Check what plans are in the queue. You should see some plans remaining
      from before the context loss.

    expected:
      state:
        queue_length: 2     # 2 plans should remain
        completed_count: 1  # 1 was completed before

  - id: continue_work
    description: "Continue working after resume"
    prompt: |
      Pull the next plan and complete it.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 1
        completed_count: 2

  - id: finish_work
    description: "Complete the remaining work"
    prompt: |
      Pull and complete the final plan.

    expected:
      tools_called:
        - 9plan_queue_pull
        - 9plan_plan_complete
      state:
        queue_length: 0
        completed_count: 3

verification:
  # Verify all plans were completed despite context loss
  history_searches:
    - query: "basic operations add subtract"
      min_results: 1
      description: "First plan should be in history"

    - query: "advanced operations power sqrt"
      min_results: 1
      description: "Second plan should be in history"

    - query: "memory functions store recall"
      min_results: 1
      description: "Third plan should be in history"

  final_state:
    queue_empty: true
    all_plans_completed: true
    completed_count: 3

# Notes for outer Claude:
notes: |
  This scenario is special because it requires starting a new inner Claude
  session mid-way through to simulate context loss.

  Steps 1-3: Use one inner Claude session (with --resume between steps)
  Step 4: Instruction to outer Claude - no inner Claude action
  Steps 5-8: Use a NEW inner Claude session (fresh start, no --resume from before)

  The key validation is that session_resume allows recovery and work continues.
-113
validation/scripts/invoke-inner.ps1
<#
.SYNOPSIS
    Invoke inner Claude with pre-approved 9plan tools
.DESCRIPTION
    Helper script for the "Claude sandwich" validation pattern.
    Bundles the --mcp-config and --allowedTools flags so you don't have to type them every time.
.PARAMETER Prompt
    The prompt to send to inner Claude
.PARAMETER Resume
    Optional session ID to resume a previous conversation
.PARAMETER OutputFormat
    Output format: "text" (default) or "json"
.PARAMETER Bootstrap
    Optional switch: append the condensed 9plan bootstrap prompt via --append-system-prompt
.EXAMPLE
    .\invoke-inner.ps1 -Prompt "Create a 9plan session for Notekeeper"
.EXAMPLE
    .\invoke-inner.ps1 -Prompt "Add plans for storage module" -Resume "abc123" -OutputFormat json
#>
param(
    [Parameter(Mandatory=$true)]
    [string]$Prompt,

    [Parameter(Mandatory=$false)]
    [string]$Resume,

    [Parameter(Mandatory=$false)]
    [ValidateSet("text", "json")]
    [string]$OutputFormat = "text",

    [Parameter(Mandatory=$false)]
    [switch]$Bootstrap
)

# All 9plan MCP tools + file access for implementation
$ALLOWED_TOOLS = @(
    # File operations (needed for reading plans and writing code)
    "Read",
    "Write",
    # 9plan session tools
    "mcp__9plan__9plan_session_create",
    "mcp__9plan__9plan_session_resume",
    # 9plan queue tools
    "mcp__9plan__9plan_queue_add",
    "mcp__9plan__9plan_queue_pull",
    # 9plan plan lifecycle tools
    "mcp__9plan__9plan_plan_complete",
    "mcp__9plan__9plan_plan_defer",
    "mcp__9plan__9plan_plan_discard",
    # 9plan history tools
    "mcp__9plan__9plan_history_search",
    "mcp__9plan__9plan_history_get",
    # 9plan admin tools (for validation)
    "mcp__9plan__9plan_admin_validate",
    "mcp__9plan__9plan_admin_sessions",
    "mcp__9plan__9plan_admin_state"
) -join ","

# Bootstrap prompt content (condensed from src/prompts/bootstrap.ts)
$BOOTSTRAP_PROMPT = @"
You have access to 9plan, a work queue system for tracking complex tasks.

WORKFLOW:
1. CREATE SESSION: Use 9plan_session_create with a task description
2. DECOMPOSE: Break the task into discrete, self-contained plans
3. ENQUEUE: Add plans with 9plan_queue_add (use "back" position, add in dependency order)
4. EXECUTE: Pull plans with 9plan_queue_pull, implement, then complete with 9plan_plan_complete

PLAN SCOPE - AVOID OVER-DECOMPOSITION:
Plans should be at the FEATURE level, not the function level.
- GOOD: "Implement CLI commands for notes (add, list, search, delete)" - groups related functionality
- BAD: "Implement add command" then "Implement list command" - too granular, should be ONE plan
- NEVER A PLAN: "Write tests for X", "Commit changes", "Run the build" - these are part of completing plans, not separate plans
Rule of thumb: A simple CLI app needs 1-2 plans, not 6-7.

PLAN STRUCTURE - Each plan MUST have:
- Context: Where this fits in the overall task
- Goal: Specific, measurable objective (at FEATURE level, not function level)
- Inputs: What this plan needs from other plans (by description)
- Outputs: What this plan produces that others may need
- Approach: Concrete, actionable steps. Include enough detail (code samples, structure, edge cases) that the plan is executable without additional context.
- Testing: REQUIRED! Specific commands to verify the implementation works (e.g., "Run npm run build - should compile without errors")
- Success Criteria: How you'll know it's done

TESTING IS REQUIRED - Every plan must have specific verification commands. Before completing a plan, RUN THE TESTS.
Good: "Run node dist/index.js list - should show all notes"
Bad: "(none)" or "verify it works" - TOO VAGUE, will be rejected

DEPENDENCY RESOLUTION:
- Plans reference dependencies by description, not ID
- Use 9plan_history_search to find completed plan outputs when needed

Use these tools to track work formally so progress survives context limits.
"@

# Build the command
$cmdArgs = @(
    "-p", $Prompt,
    "--mcp-config", ".mcp.json",
    "--allowedTools", $ALLOWED_TOOLS,
    "--output-format", $OutputFormat
)

if ($Bootstrap) {
    $cmdArgs += "--append-system-prompt"
    $cmdArgs += $BOOTSTRAP_PROMPT
}

if ($Resume) {
    $cmdArgs += "--resume"
    $cmdArgs += $Resume
}

# Execute
& claude @cmdArgs