personal memory agent

think/detect_created: constrain generation with detect_created.schema.json

Add think/detect_created.schema.json (Draft 2020-12) and pass it as
json_schema= to the existing generate(...) call in detect_created().
This is the first non-talent-dispatcher consumer of the L1 json_schema
kwarg threaded through in c030248d.

Approach mirrors the L3 talent migrations (8c952dc4 sense, 50693752
story, 0e098e7b daily_schedule) but applied via direct generate() rather
than the talent dispatcher: detect_created.md is loaded through
think.prompts.load_prompt, not think.talent.get_talent, so the schema
lives co-located at think/detect_created.schema.json and is passed
explicitly.

Schema uses the provider-intersection subset only (type, enum, pattern,
required, additionalProperties, properties, minLength), with root
additionalProperties: false and required: [day, time, confidence, source,
utc]. The module loads the schema once at import into a module-level
_SCHEMA constant; a missing or unparseable schema file fails import loudly.
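
One way to keep a schema inside such a subset is a small keyword walk. The
sketch below is illustrative and not part of this commit: ALLOWED_KEYWORDS is
copied from the list above (plus "$schema", which the file also declares), and
disallowed_keywords is a hypothetical helper name.

```python
import json

# Keywords named in the commit message, plus "$schema", which the file declares.
ALLOWED_KEYWORDS = {
    "$schema", "type", "enum", "pattern", "required",
    "additionalProperties", "properties", "minLength",
}


def disallowed_keywords(schema: dict, path: str = "#") -> list[str]:
    """Return pointer-like paths of keywords outside the allowed subset."""
    found = []
    for key, value in schema.items():
        if key not in ALLOWED_KEYWORDS:
            found.append(f"{path}/{key}")
            continue
        if key == "properties" and isinstance(value, dict):
            # Property names are data, not keywords; recurse into each subschema.
            for name, sub in value.items():
                if isinstance(sub, dict):
                    found.extend(
                        disallowed_keywords(sub, f"{path}/properties/{name}")
                    )
        elif key == "additionalProperties" and isinstance(value, dict):
            # additionalProperties may itself be a subschema rather than a bool.
            found.extend(disallowed_keywords(value, f"{path}/additionalProperties"))
    return found


example = json.loads(
    '{"type": "object", "properties": {"day": {"type": "string", "format": "date"}}}'
)
print(disallowed_keywords(example))  # ['#/properties/day/format']
```

Run against think/detect_created.schema.json, a check like this would be
expected to return an empty list.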

Caller wiring, parsing, UTC->local conversion, and the return shape are
unchanged. think/models.py performs advisory validation via
Draft202012Validator and logs violations without raising; no provider
plumbing or caller edits were needed.
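
The advisory behaviour (validate, log, do not raise) can be sketched roughly as
below. This is an assumption about the general shape, not the code in
think/models.py: validate_advisorily and the logger name are hypothetical, and
the validator is only assumed to expose iter_errors(), as
jsonschema's Draft202012Validator does.

```python
import logging
from typing import Any, Iterable, Protocol

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("schema_validation")


class SupportsIterErrors(Protocol):
    """Anything with an iter_errors() method, e.g. Draft202012Validator."""

    def iter_errors(self, instance: Any) -> Iterable[Any]: ...


def validate_advisorily(instance: Any, validator: SupportsIterErrors) -> int:
    """Log each schema violation and return the count; never raise on violations."""
    count = 0
    for error in validator.iter_errors(instance):
        count += 1
        # jsonschema's ValidationError exposes .message; fall back to str() otherwise.
        logger.warning("schema violation: %s", getattr(error, "message", str(error)))
    return count
```

With jsonschema installed, a call would look like
validate_advisorily(payload, Draft202012Validator(schema)); the caller still
receives the payload regardless of the count returned.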

Live provider validation deferred: the worktree has no .env and provider
keys are unavailable. Advisory schema_validation will engage on the next
real run against google (primary) and anthropic (backup), matching the
0e098e7b precedent.

Tests: tests/test_detect_created_schema.py adds (1) Draft202012Validator
schema-validity, (2) accept/reject matrix covering each field's
constraint, and (3) a wiring assertion that detect_created() passes
_SCHEMA to generate() via monkeypatched think.models.generate. Existing
tests/test_importer.py mocks are unaffected (they return plain dicts and
bypass the schema path). make ci green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

+112
+80
tests/test_detect_created_schema.py
+# SPDX-License-Identifier: AGPL-3.0-only
+# Copyright (c) 2026 sol pbc
+
+import importlib
+import json
+from pathlib import Path
+
+from jsonschema import Draft202012Validator
+
+import think.models as models
+
+detect_created_mod = importlib.import_module("think.detect_created")
+
+DETECT_CREATED_SCHEMA_PATH = (
+    Path(__file__).resolve().parents[1] / "think" / "detect_created.schema.json"
+)
+
+
+def _load_detect_created_schema() -> dict:
+    return json.loads(DETECT_CREATED_SCHEMA_PATH.read_text(encoding="utf-8"))
+
+
+def test_detect_created_schema_file_is_valid_draft_2020_12():
+    Draft202012Validator.check_schema(_load_detect_created_schema())
+
+
+def test_detect_created_schema_accepts_and_rejects_expected_values():
+    schema = _load_detect_created_schema()
+    validator = Draft202012Validator(schema)
+    valid = {
+        "day": "20240315",
+        "time": "143052",
+        "confidence": "high",
+        "source": "QuickTime:CreateDate",
+        "utc": True,
+    }
+
+    assert validator.is_valid(valid)
+    assert not validator.is_valid(
+        {
+            "day": "20240315",
+            "time": "143052",
+            "confidence": "high",
+            "source": "QuickTime:CreateDate",
+        }
+    )
+    assert not validator.is_valid({**valid, "day": "2024-03-15"})
+    assert not validator.is_valid({**valid, "time": "14:30:52"})
+    assert not validator.is_valid({**valid, "confidence": "certain"})
+    assert not validator.is_valid({**valid, "extra": "x"})
+    assert not validator.is_valid({**valid, "source": ""})
+
+
+def test_detect_created_passes_schema_to_generate(monkeypatch):
+    captured = {}
+
+    def fake_generate(**kwargs):
+        captured.update(kwargs)
+        return (
+            '{"day": "20240315", "time": "143052", "confidence": "high", '
+            '"source": "QuickTime:CreateDate", "utc": false}'
+        )
+
+    monkeypatch.setattr(models, "generate", fake_generate)
+    monkeypatch.setattr(
+        detect_created_mod,
+        "_extract_metadata",
+        lambda path: "QuickTime Create Date : 2024:03:15 14:30:52",
+    )
+
+    result = detect_created_mod.detect_created("/dev/null")
+
+    assert captured["json_schema"] is detect_created_mod._SCHEMA
+    assert result == {
+        "day": "20240315",
+        "time": "143052",
+        "confidence": "high",
+        "source": "QuickTime:CreateDate",
+        "utc": False,
+    }
+5
think/detect_created.py
···
 from .prompts import load_prompt

+_SCHEMA = json.loads(
+    (Path(__file__).parent / "detect_created.schema.json").read_text(encoding="utf-8")
+)
+

 def _load_system_prompt() -> str:
     """Load the system prompt from detect_created.txt file."""
···
         thinking_budget=4096,
         system_instruction=_load_system_prompt(),
         json_output=True,
+        json_schema=_SCHEMA,
     )

     try:
+27
think/detect_created.schema.json
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "type": "object",
+  "additionalProperties": false,
+  "required": ["day", "time", "confidence", "source", "utc"],
+  "properties": {
+    "day": {
+      "type": "string",
+      "pattern": "^\\d{8}$"
+    },
+    "time": {
+      "type": "string",
+      "pattern": "^\\d{6}$"
+    },
+    "confidence": {
+      "type": "string",
+      "enum": ["high", "medium", "low"]
+    },
+    "source": {
+      "type": "string",
+      "minLength": 1
+    },
+    "utc": {
+      "type": "boolean"
+    }
+  }
+}
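
The day and time patterns constrain shape only: ^\d{8}$ accepts any eight
digits, including strings that are not calendar dates. If a caller ever wanted
a stricter check, a follow-up parse is one option. The sketch below is an
illustration under that assumption; is_real_timestamp is a hypothetical helper
and nothing like it is added by this commit.

```python
import re
from datetime import datetime

# Same patterns as the schema's "day" and "time" properties.
DAY_PATTERN = re.compile(r"^\d{8}$")
TIME_PATTERN = re.compile(r"^\d{6}$")


def is_real_timestamp(day: str, time: str) -> bool:
    """True if day/time match the schema patterns AND form a real date-time."""
    # search() mirrors JSON Schema's unanchored pattern semantics; the
    # patterns carry their own ^ and $ anchors.
    if not (DAY_PATTERN.search(day) and TIME_PATTERN.search(time)):
        return False
    try:
        datetime.strptime(day + time, "%Y%m%d%H%M%S")
    except ValueError:
        # Shape was fine, but the digits are not a valid calendar date or time.
        return False
    return True


print(is_real_timestamp("20240315", "143052"))  # True
print(is_real_timestamp("20240230", "143052"))  # False: February 30 does not exist
```

Keeping this outside the schema preserves the provider-intersection subset,
since calendar validity cannot be expressed with pattern alone.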