Reference implementation for the Phoenix Architecture. Work in progress. aicoding.leaflet.pub/
ai coding crazy

feat: harder multi-resource spec + improved architecture target

Expanded todo spec: categories with FK relationships, query filtering,
stats endpoint. Fixed route mounting to derive paths from IU names.
Strengthened DB import rules in architecture prompt.

Score: 42% (8/19) — categories CRUD works, todos partially work.
Remaining: JOINs, filtering, stats, delete cascade. Ready for
autoresearch prompt optimization.

+182 -87
+30 -14
examples/todo-app/spec/todos.md
···
 # Todo API

-A REST API for managing todo items backed by SQLite.
+A REST API for managing todo lists and items, with categories and basic stats.

-## Todos Resource
+## Categories

-- A todo has: id (integer, auto-increment primary key), title (text, required), completed (integer 0 or 1, default 0), and created_at (timestamp, set automatically on creation)
-- GET / must return all todos as a JSON array ordered by created_at descending
-- GET /:id must return a single todo as a JSON object, or 404 if not found
-- POST / must create a new todo from a JSON request body containing a title field and return it with status 201
-- PATCH /:id must update a todo's title and/or completed fields from a JSON request body, or 404 if not found
-- DELETE /:id must delete a todo and return 204 with no body, or 404 if not found
-- Title must not be empty
-- Title must be at most 200 characters
-- Completed must be 0 or 1
-- All error responses must be JSON objects with an "error" field containing a human-readable message
-- Invalid JSON request bodies must return 400
-- Validation failures must return 400 with a description of what failed
+- A category has: id (integer, auto-increment), name (text, required, unique), color (text, default '#888888')
+- GET /categories must return all categories as a JSON array
+- POST /categories must create a category and return it with 201
+- DELETE /categories/:id must delete a category and return 204; if the category has todos, return 400 with an error
+- Category name must not be empty and must be at most 50 characters
+
+## Todos
+
+- A todo has: id (integer, auto-increment), title (text, required), completed (integer 0 or 1, default 0), category_id (integer, nullable foreign key to categories), created_at (timestamp, default now)
+- GET /todos must return all todos ordered by created_at descending, each todo must include its category name (as category_name) if it has one
+- GET /todos?completed=1 must filter to only completed todos; GET /todos?completed=0 must filter to only incomplete todos
+- GET /todos?category_id=N must filter to only todos in that category
+- GET /todos/:id must return a single todo with category_name included, or 404
+- POST /todos must create a todo with title and optional category_id, return 201
+- PATCH /todos/:id must update title, completed, and/or category_id
+- DELETE /todos/:id must delete a todo and return 204, or 404
+- Title must not be empty and must be at most 200 characters
+- If category_id is provided, it must reference an existing category; return 400 otherwise
+
+## Stats
+
+- GET /stats must return a JSON object with: total (total todo count), completed (completed count), incomplete (incomplete count), by_category (array of {category_name, count} ordered by count descending)
+
+## Error Handling
+
+- All error responses must be JSON with an "error" field
+- Invalid JSON bodies must return 400
+- Validation failures must return 400
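The list and filter rules in the Todos section can be sketched as a single query builder. This is only an illustration of one way to satisfy the spec, not the code the generator produces: the function name `buildTodoListQuery`, the `TodoFilters` and `SqlQuery` shapes, and the table aliases are assumptions. Table and column names are taken from the spec.

```typescript
// Hypothetical helper: names and return shape are assumptions for illustration.
interface TodoFilters {
  completed?: string;   // raw query-string value, e.g. "1"
  category_id?: string; // raw query-string value, e.g. "3"
}

interface SqlQuery {
  sql: string;
  params: number[];
}

function buildTodoListQuery(filters: TodoFilters): SqlQuery {
  const where: string[] = [];
  const params: number[] = [];

  // Only "0" and "1" are meaningful values for the completed filter.
  if (filters.completed === '0' || filters.completed === '1') {
    where.push('t.completed = ?');
    params.push(Number(filters.completed));
  }

  // Accept category_id only when it is a plain non-negative integer.
  if (filters.category_id !== undefined && /^\d+$/.test(filters.category_id)) {
    where.push('t.category_id = ?');
    params.push(Number(filters.category_id));
  }

  // LEFT JOIN keeps todos whose category_id is NULL; their category_name is NULL.
  const sql = [
    'SELECT t.*, c.name AS category_name',
    'FROM todos t',
    'LEFT JOIN categories c ON c.id = t.category_id',
    where.length > 0 ? `WHERE ${where.join(' AND ')}` : '',
    'ORDER BY t.created_at DESC',
  ].filter(part => part.length > 0).join(' ');

  return { sql, params };
}
```

Values are passed as bound parameters rather than interpolated into the SQL string, so the query text stays fixed for any input. Whether an unrecognized filter value should be ignored (as here) or rejected with 400 is not stated in the spec.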
+130 -64
experiments/eval-runner-arch.ts
···
 execSync(`node ${CLI} init --arch=sqlite-web-api`, { cwd: TODO_APP, stdio: 'pipe' });

 console.log('Bootstrapping (LLM generation)...');
-execSync(`node ${CLI} bootstrap`, { cwd: TODO_APP, stdio: 'pipe', timeout: 300000 });
+execSync(`node ${CLI} bootstrap`, { cwd: TODO_APP, stdio: 'pipe', timeout: 600000 });

 console.log('Installing dependencies...');
 execSync('npm install', { cwd: TODO_APP, stdio: 'pipe', timeout: 60000 });
···
   }
 }

-console.log('\nRunning CRUD tests:');
+console.log('\nRunning tests:');
+
+// ─── Categories ─────────────────────────────────────────────────────────────
+
+let catId: number | null = null;
+
+await test('POST /categories creates category', async () => {
+  const res = await fetch(`${BASE}/categories`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ name: 'Work', color: '#ff0000' }),
+  });
+  if (res.status !== 201) return false;
+  const body = await res.json() as Record<string, unknown>;
+  catId = body.id as number;
+  return body.name === 'Work' && typeof body.id === 'number';
+});
+
+await test('POST /categories rejects empty name', async () => {
+  const res = await fetch(`${BASE}/categories`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ name: '' }),
+  });
+  return res.status === 400;
+});
+
+await test('GET /categories returns array', async () => {
+  const res = await fetch(`${BASE}/categories`);
+  if (res.status !== 200) return false;
+  const body = await res.json() as unknown[];
+  return Array.isArray(body) && body.length >= 1;
+});
+
+// ─── Todos with categories ──────────────────────────────────────────────────
+
+let todoId: number | null = null;

-// POST /todos — create
-let createdId: number | null = null;
-await test('POST /todos returns 201 with todo', async () => {
+await test('POST /todos creates todo with category', async () => {
   const res = await fetch(`${BASE}/todos`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ title: 'Test todo' }),
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ title: 'Finish report', category_id: catId }),
   });
   if (res.status !== 201) return false;
   const body = await res.json() as Record<string, unknown>;
-  createdId = body.id as number;
-  return typeof body.id === 'number' && body.title === 'Test todo' && 'created_at' in body;
+  todoId = body.id as number;
+  return body.title === 'Finish report' && typeof body.id === 'number';
 });

-// POST /todos — validation
-await test('POST /todos rejects empty title with 400', async () => {
+await test('POST /todos creates todo without category', async () => {
   const res = await fetch(`${BASE}/todos`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ title: 'Buy milk' }),
+  });
+  return res.status === 201;
+});
+
+await test('POST /todos rejects invalid category_id', async () => {
+  const res = await fetch(`${BASE}/todos`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ title: 'Bad category', category_id: 9999 }),
+  });
+  return res.status === 400;
+});
+
+await test('POST /todos rejects empty title', async () => {
+  const res = await fetch(`${BASE}/todos`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ title: '' }),
   });
-  if (res.status !== 400) return false;
-  const body = await res.json() as Record<string, unknown>;
-  return typeof body.error === 'string';
+  return res.status === 400;
 });

-// GET /todos — list
-await test('GET /todos returns array with created todo', async () => {
+await test('GET /todos returns todos with category_name', async () => {
   const res = await fetch(`${BASE}/todos`);
   if (res.status !== 200) return false;
-  const body = await res.json() as unknown[];
-  return Array.isArray(body) && body.length >= 1;
+  const body = await res.json() as Array<Record<string, unknown>>;
+  const withCat = body.find(t => t.title === 'Finish report');
+  return withCat?.category_name === 'Work';
 });

-// GET /todos/:id — get one
-await test('GET /todos/:id returns the todo', async () => {
-  if (!createdId) return false;
-  const res = await fetch(`${BASE}/todos/${createdId}`);
+await test('GET /todos/:id returns todo with category_name', async () => {
+  if (!todoId) return false;
+  const res = await fetch(`${BASE}/todos/${todoId}`);
   if (res.status !== 200) return false;
   const body = await res.json() as Record<string, unknown>;
-  return body.title === 'Test todo';
+  return body.category_name === 'Work';
 });

-// GET /todos/999 — 404
 await test('GET /todos/999 returns 404', async () => {
-  const res = await fetch(`${BASE}/todos/999`);
-  return res.status === 404;
+  return (await fetch(`${BASE}/todos/999`)).status === 404;
 });

-// PATCH /todos/:id — update
-await test('PATCH /todos/:id updates completed', async () => {
-  if (!createdId) return false;
-  const res = await fetch(`${BASE}/todos/${createdId}`, {
-    method: 'PATCH',
-    headers: { 'Content-Type': 'application/json' },
+// ─── Filtering ──────────────────────────────────────────────────────────────
+
+await test('PATCH /todos/:id marks completed', async () => {
+  if (!todoId) return false;
+  const res = await fetch(`${BASE}/todos/${todoId}`, {
+    method: 'PATCH', headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ completed: 1 }),
   });
   if (res.status !== 200) return false;
···
   return body.completed === 1;
 });

-// PATCH /todos/:id — update title
-await test('PATCH /todos/:id updates title', async () => {
-  if (!createdId) return false;
-  const res = await fetch(`${BASE}/todos/${createdId}`, {
-    method: 'PATCH',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ title: 'Updated title' }),
-  });
+await test('GET /todos?completed=1 filters completed', async () => {
+  const res = await fetch(`${BASE}/todos?completed=1`);
+  if (res.status !== 200) return false;
+  const body = await res.json() as Array<Record<string, unknown>>;
+  return body.length >= 1 && body.every(t => t.completed === 1);
+});
+
+await test('GET /todos?completed=0 filters incomplete', async () => {
+  const res = await fetch(`${BASE}/todos?completed=0`);
+  if (res.status !== 200) return false;
+  const body = await res.json() as Array<Record<string, unknown>>;
+  return body.length >= 1 && body.every(t => t.completed === 0);
+});
+
+await test('GET /todos?category_id=N filters by category', async () => {
+  if (!catId) return false;
+  const res = await fetch(`${BASE}/todos?category_id=${catId}`);
+  if (res.status !== 200) return false;
+  const body = await res.json() as Array<Record<string, unknown>>;
+  return body.length >= 1;
+});
+
+// ─── Stats ──────────────────────────────────────────────────────────────────
+
+await test('GET /stats returns counts', async () => {
+  const res = await fetch(`${BASE}/stats`);
   if (res.status !== 200) return false;
   const body = await res.json() as Record<string, unknown>;
-  return body.title === 'Updated title';
+  return typeof body.total === 'number' && typeof body.completed === 'number' && typeof body.incomplete === 'number';
 });

-// Create another to delete
-let deleteId: number | null = null;
-await test('POST /todos creates second todo for deletion', async () => {
-  const res = await fetch(`${BASE}/todos`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ title: 'Delete me' }),
-  });
-  if (res.status !== 201) return false;
+await test('GET /stats includes by_category', async () => {
+  const res = await fetch(`${BASE}/stats`);
+  if (res.status !== 200) return false;
   const body = await res.json() as Record<string, unknown>;
-  deleteId = body.id as number;
-  return true;
+  const byCat = body.by_category as Array<Record<string, unknown>> | undefined;
+  return Array.isArray(byCat) && byCat.length >= 1 && typeof byCat[0].category_name === 'string';
 });

-// DELETE /todos/:id
+// ─── Delete ─────────────────────────────────────────────────────────────────
+
 await test('DELETE /todos/:id returns 204', async () => {
-  if (!deleteId) return false;
-  const res = await fetch(`${BASE}/todos/${deleteId}`, { method: 'DELETE' });
-  return res.status === 204;
+  if (!todoId) return false;
+  return (await fetch(`${BASE}/todos/${todoId}`, { method: 'DELETE' })).status === 204;
 });

-// Verify deletion
-await test('GET /todos/:id returns 404 after delete', async () => {
-  if (!deleteId) return false;
-  const res = await fetch(`${BASE}/todos/${deleteId}`);
-  return res.status === 404;
+await test('DELETE /categories/:id with todos returns 400', async () => {
+  // "Buy milk" has no category, but create one with a category to test
+  const res = await fetch(`${BASE}/categories`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ name: 'Temp' }),
+  });
+  const cat = await res.json() as Record<string, unknown>;
+  await fetch(`${BASE}/todos`, {
+    method: 'POST', headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ title: 'Temp todo', category_id: cat.id }),
+  });
+  const delRes = await fetch(`${BASE}/categories/${cat.id}`, { method: 'DELETE' });
+  return delRes.status === 400;
+});
+
+await test('DELETE /categories/:id without todos returns 204', async () => {
+  if (!catId) return false;
+  // catId's todos were already deleted
+  return (await fetch(`${BASE}/categories/${catId}`, { method: 'DELETE' })).status === 204;
 });

 // ─── Step 4: Score ──────────────────────────────────────────────────────────
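The diff calls a `test(name, fn)` helper whose definition falls outside the shown hunks. A minimal sketch of how such a helper and the scoring step could fit together follows. The names `runTest`, `tally`, and `TestResult` are hypothetical, and the real runner may record results differently.

```typescript
// Hypothetical reconstruction: the runner's real test() helper is outside the
// diff, so the names and shapes here are assumptions.
interface TestResult {
  name: string;
  passed: boolean;
}

// Runs one check and records the outcome; a thrown error counts as a failure,
// so a refused connection or malformed JSON body does not abort the whole run.
async function runTest(
  results: TestResult[],
  name: string,
  fn: () => Promise<boolean>,
): Promise<void> {
  let passed = false;
  try {
    passed = (await fn()) === true;
  } catch {
    passed = false;
  }
  results.push({ name, passed });
}

// Reduces the recorded results to pass count, total, score, and failure names.
function tally(results: TestResult[]) {
  const failures = results.filter(r => !r.passed).map(r => r.name);
  const total = results.length;
  const passed = total - failures.length;
  return { passed, total, score: total === 0 ? 0 : passed / total, failures };
}
```

Because several tests depend on state set by earlier ones (`catId`, `todoId`), one early failure cascades: when `POST /categories` fails, every test guarded by `if (!catId) return false` fails too. That coupling is worth keeping in mind when reading a low score.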
+2
experiments/results-arch.tsv
···
 timestamp	score	passed	total	failures
 2026-03-27T05:28:30.104Z	1.00	10	10	none
 2026-03-27T05:29:23.199Z	1.00	10	10	none
+2026-03-27T05:33:56.314Z	0.16	3	19	POST /categories creates category; POST /categories rejects empty name; GET /categories returns array; POST /todos creates todo with category; POST /todos creates todo without category; GET /todos returns todos with category_name; GET /todos/:id returns todo with category_name; PATCH /todos/:id marks completed; GET /todos?completed=1 filters completed; GET /todos?completed=0 filters incomplete; GET /todos?category_id=N filters by category; GET /stats returns counts; GET /stats includes by_category; DELETE /todos/:id returns 204; DELETE /categories/:id with todos returns 400; DELETE /categories/:id without todos returns 204
+2026-03-27T06:10:38.517Z	0.42	8	19	POST /todos creates todo with category; GET /todos returns todos with category_name; GET /todos/:id returns todo with category_name; PATCH /todos/:id marks completed; GET /todos?completed=1 filters completed; GET /todos?category_id=N filters by category; GET /stats returns counts; GET /stats includes by_category; DELETE /todos/:id returns 204; DELETE /categories/:id with todos returns 400; DELETE /categories/:id without todos returns 204
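Each results row can be checked for internal consistency: passed plus the number of listed failures should equal total, and score should be passed/total rounded to two decimals. The sketch below assumes tab-separated columns (suggested by the `.tsv` extension) and failures joined by `"; "` (as seen in the rows). `parseResultRow` and `isConsistent` are hypothetical names, not part of the repository.

```typescript
// Hypothetical helpers for reading experiments/results-arch.tsv rows.
interface ResultRow {
  timestamp: string;
  score: number;
  passed: number;
  total: number;
  failures: string[];
}

// Splits one TSV line into typed fields; "none" means an empty failure list.
function parseResultRow(line: string): ResultRow {
  const [timestamp = '', score = '', passed = '', total = '', failures = ''] = line.split('\t');
  return {
    timestamp,
    score: Number(score),
    passed: Number(passed),
    total: Number(total),
    failures: failures === 'none' || failures === '' ? [] : failures.split('; '),
  };
}

// True when the counts add up and the score matches passed/total to 2 decimals.
function isConsistent(row: ResultRow): boolean {
  const countsAgree = row.passed + row.failures.length === row.total;
  const scoreAgrees =
    row.total > 0 && (row.passed / row.total).toFixed(2) === row.score.toFixed(2);
  return countsAgree && scoreAgrees;
}
```

Applied by hand to the latest row, the check holds: 8 passes plus 11 listed failures gives 19, and 8/19 rounds to 0.42.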
+14 -6
src/architectures/sqlite-web-api.ts
···

 You are generating a route handler module for a Hono REST API backed by SQLite.

-### CRITICAL import rules — follow EXACTLY
-- Import \`{ Hono }\` from 'hono'
-- Import \`{ db, registerMigration }\` from '../../db.js' — the shared database module is TWO levels up from the generated module.
-- Import \`{ z }\` from 'zod'
-- NEVER create your own Database instance. NEVER write \`new Database(...)\` or \`import Database from 'better-sqlite3'\`. The shared db.js provides the single db connection.
-- NEVER define your own Hono class or Database type. Import them from the packages.
+### CRITICAL import rules — follow EXACTLY, no exceptions
+
+Your file MUST start with these EXACT three import lines:
+\`\`\`
+import { Hono } from 'hono';
+import { db, registerMigration } from '../../db.js';
+import { z } from 'zod';
+\`\`\`
+
+### FORBIDDEN — these will cause the build to fail
+- FORBIDDEN: \`import Database from 'better-sqlite3'\` — NEVER import better-sqlite3 directly
+- FORBIDDEN: \`new Database(...)\` — NEVER create a database instance
+- FORBIDDEN: \`const db = ...\` — the db variable comes from the import above
+- Use ONLY the \`db\` import from \`../../db.js\` for ALL database operations

 ### Module structure
 - Export a Hono router instance as the DEFAULT export: \`export default router;\`
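Prompt rules like these can also be enforced mechanically after generation, so a violation is caught before the build rather than only discouraged. The following is a sketch of such a check, not something the commit contains: `checkGeneratedModule`, `REQUIRED_IMPORTS`, and `FORBIDDEN` are hypothetical names, and the patterns are simple text matches rather than a real parse of the module.

```typescript
// Hypothetical post-generation lint; the rules mirror the prompt text above.
const REQUIRED_IMPORTS: string[] = [
  "import { Hono } from 'hono';",
  "import { db, registerMigration } from '../../db.js';",
  "import { z } from 'zod';",
];

const FORBIDDEN: Array<{ pattern: RegExp; reason: string }> = [
  { pattern: /from\s+['"]better-sqlite3['"]/, reason: 'imports better-sqlite3 directly' },
  { pattern: /new\s+Database\s*\(/, reason: 'creates its own Database instance' },
  { pattern: /\b(?:const|let|var)\s+db\s*=/, reason: 'redeclares the shared db variable' },
];

// Returns a list of human-readable problems; an empty list means the module passes.
function checkGeneratedModule(source: string): string[] {
  const problems: string[] = [];

  // Compare the first three non-empty lines against the required imports.
  const lines = source.split('\n').map(l => l.trim()).filter(l => l.length > 0);
  REQUIRED_IMPORTS.forEach((expected, i) => {
    if (lines[i] !== expected) problems.push(`line ${i + 1} should be: ${expected}`);
  });

  // Flag any forbidden construct anywhere in the file.
  for (const { pattern, reason } of FORBIDDEN) {
    if (pattern.test(source)) problems.push(reason);
  }
  return problems;
}
```

A check of this kind could feed its problem list back into a retry prompt. Being text-based, it would also flag a forbidden pattern that appears only inside a comment or string.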
+6 -3
src/scaffold.ts
···
   const routeImports: string[] = [];
   const routeMounts: string[] = [];
   for (const svc of services) {
-    for (const mod of svc.modules) {
+    for (let i = 0; i < svc.modules.length; i++) {
+      const mod = svc.modules[i];
+      const iu = svc.ius[i];
       const modName = mod.replace('.ts', '').replace(/-/g, '_').replace(/[^a-zA-Z0-9_]/g, '_');
       const importPath = `./generated/${svc.dir}/${mod.replace('.ts', '.js')}`;
       routeImports.push(`import ${modName} from '${importPath}';`);
-      // Use the service dir as the route prefix
-      const prefix = `/${svc.dir}`;
+      // Derive mount path from IU name: "Todos" → "/todos", "Categories" → "/categories"
+      const iuName = iu?.name ?? mod.replace('.ts', '');
+      const prefix = '/' + iuName.toLowerCase().replace(/\s+/g, '-').replace(/[^a-z0-9-]/g, '');
       routeMounts.push(`mount('${prefix}', ${modName});`);
     }
   }
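The prefix derivation in this hunk can be pulled out into a standalone function to see how different IU names map to mount paths. The transformation is copied from the diff. The wrapper name `mountPrefix` and its signature are assumptions for illustration.

```typescript
// Same normalization as the diff: lowercase, whitespace runs to "-",
// then drop anything that is not a-z, 0-9, or "-".
// Falls back to the module file name when no IU name is available.
function mountPrefix(iuName: string | undefined, moduleFile: string): string {
  const name = iuName ?? moduleFile.replace('.ts', '');
  return '/' + name.toLowerCase().replace(/\s+/g, '-').replace(/[^a-z0-9-]/g, '');
}

// Example inputs: the first two come from the comment in the diff.
const examplePrefixes = [
  mountPrefix('Todos', 'todos.ts'),           // "/todos"
  mountPrefix('Categories', 'categories.ts'), // "/categories"
  mountPrefix('Todo Items', 'todo-items.ts'), // "/todo-items"
  mountPrefix(undefined, 'stats.ts'),         // "/stats"
];
```

Two edge cases follow from the normalization: a name made only of punctuation collapses to `/`, and two IU names that differ only in stripped characters map to the same prefix. The loop also pairs `svc.modules[i]` with `svc.ius[i]` by index, which is correct only when the two arrays are parallel.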