commits
Migrate entity_observer to an external Draft 2020-12 schema and flip the
observations payload from a dynamic-key dict ({slug: [...]}) to a typed
list of groups ([{entity_id, items: [...]}]). Typed lists validate under
OpenAI strict mode where patternProperties does not (founder decision #4,
2026-04-19 audit).
Clean break: no dict-of-lists fallback in the post-hook; schema + prompt
+ hook flip together. Stats baseline updated to reflect the new schema
field surfacing for this talent (matches the speaker_attribution pattern).
Hook tests rewritten to the new shape; a new schema test file covers
validator + positive/negative payloads.
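The shape flip can be illustrated with a small sketch. `to_groups` is a
hypothetical helper for illustration only (the commit ships no
dict-of-lists fallback), and string items are an assumption; only
`entity_id` and `items` come from the description above.

```python
from typing import Any


def to_groups(observations: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Re-express the old {slug: [...]} payload as a typed list of groups."""
    return [
        {"entity_id": slug, "items": list(items)}
        for slug, items in observations.items()
    ]


# Old shape: entity slugs are dynamic keys, so a schema needs patternProperties.
old_shape = {"alice": ["prefers async updates"], "bob": []}

# New shape: every group has the same fixed keys, so plain `properties` suffice.
new_shape = to_groups(old_shape)
```

Because each group carries fixed keys, the list form can be described
with `properties` and `required` alone.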
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scaffolding chunks 1-4 of the spl-solstone integration (vpe req_xhoetsh6).
Complete fork from github.com/solpbc/spl home/ — no pip dep, no submodule,
no sync. The two projects are fully independent from here.
- think/link/: tunnel service (service.py, relay_client.py, wsgi_bridge.py,
ca.py, auth.py, nonces.py, mux.py, framing.py, tls_adapter.py, paths.py).
pair_server.py dropped — pair runs through convey's existing listener.
CA has no passphrase layer per spec (journal/link/ca/private.pem mode 0600).
Added last_seen_at column + touch_last_seen() on AuthorizedClients.
WSGI bridge pipes tunnel bytes to convey's real Flask app.
- apps/link/: dashboard (workspace.html), Flask routes (/pair-start,
/pair, /unpair, /api/devices, /api/status), Typer CLI (pair/list/
unpair/status). All spec literal-copy strings landed verbatim.
- supervisor: link service launches alongside cortex (--no-link opt-out).
- sol.py: 'sol link' command + GROUPS entry.
- pyproject.toml: pyOpenSSL + websockets deps.
- tests/link/test_framing.py: 17 tests ported from spl-repo (all green).
Privacy invariant: no payload bytes in logs. Rendezvous-only (method,
path, status, byte counts, tunnel_id, stream_id). Callosum tract 'link'
emits enrolled/connecting/connected/disconnect/tunnel_pair/tunnel_close/
last_seen.
Remaining chunks (ca/auth/mux/nonces/wsgi unit tests; in-tree test
client; end-to-end integration test; blindness grep; spl-repo
cross-reference PR) delegated to a continuation hopper lode.
Full solstone suite: 3498 tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply founder decision #3 from the 2026-04-19 audit by schema-constraining speaker_attribution output to a top-level array, loading that schema from prompt frontmatter, and removing post_process tolerance for legacy wrapper shapes; this keeps the migration a clean break per CLAUDE.md §8, adds focused schema and post-hook coverage, and mirrors the new schema field in the stats API baseline.
Next structured-outputs migration: constrain the schedule talent with an
external Draft 2020-12 schema. This introduces
`talent/schedule.schema.json`, wires it via
`"schema": "schedule.schema.json"` in the frontmatter, and encodes
every field the post-hook reads. `talent/schedule.py` is unchanged:
strict hook validation remains authoritative, and the schema adds
defense-in-depth plus provider-side constraint.
The single semantic tightening is `activity`: it now uses a closed enum
of 10 values (`meeting`, `call`, `deadline`, `appointment`, `event`,
`travel`, `reminder`, `errand`, `celebration`,
`doctor_appointment`) instead of the prompt's prior "not a restricted
enum" disclaimer. The scope's seed list held: a fixture scan across
`tests/fixtures/` and `tests/baselines/` found zero real anticipated
records, so no widening was required.
The inline participation entry deliberately omits `entity_id` (5 fields
instead of the shared fragment's 6) because the schedule hook assigns it
itself via `find_matching_entity`. A drift test in
`tests/test_schedule_schema.py` asserts the inline shape equals the
shared fragment minus `entity_id`, so any future fragment change forces
an explicit update here. The loader still has no cross-file `$ref`
plumbing, so the shape remains inlined rather than referenced.
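A minimal sketch of what such a drift check could look like. The
property types below are assumptions, and `without_entity_id` is a
hypothetical helper; the real test in `tests/test_schedule_schema.py`
may compare the shapes differently.

```python
import copy


def without_entity_id(fragment: dict) -> dict:
    """Return a copy of the fragment with entity_id removed everywhere."""
    trimmed = copy.deepcopy(fragment)
    trimmed["properties"].pop("entity_id", None)
    trimmed["required"] = [k for k in trimmed["required"] if k != "entity_id"]
    return trimmed


# Stand-in for the shared fragment (6 fields); types are assumed.
SHARED_FRAGMENT = {
    "type": "object",
    "additionalProperties": False,
    "required": ["name", "role", "source", "confidence", "context", "entity_id"],
    "properties": {
        "name": {"type": "string"},
        "role": {"type": "string"},
        "source": {"type": "string"},
        "confidence": {"type": "number"},
        "context": {"type": "string"},
        "entity_id": {"type": ["string", "null"]},
    },
}

# Stand-in for the entry inlined in the schedule schema (5 fields).
INLINE_ENTRY = {
    "type": "object",
    "additionalProperties": False,
    "required": ["name", "role", "source", "confidence", "context"],
    "properties": {
        "name": {"type": "string"},
        "role": {"type": "string"},
        "source": {"type": "string"},
        "confidence": {"type": "number"},
        "context": {"type": "string"},
    },
}

drifted = INLINE_ENTRY != without_entity_id(SHARED_FRAGMENT)
```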
A few schema decisions are worth recording. `participation_confidence`
is required but typed `["number", "null"]`, matching the hook's
`.get(...)` and null-tolerant prompt contract. `details` is required but
typed `"string"` with no `minLength`, because empty string is valid per
both the prompt and `str(... or "")` in the hook. `start` and `end`
use `["string", "null"]` plus an `HH:MM:SS` pattern; under Draft
2020-12 that pattern only applies to string instances.
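The `start`/`end` rule can be mimicked in a few lines. This is a sketch
under assumptions: `TIME_PATTERN` is a guess at an `HH:MM:SS` regex (the
commit does not quote the real one) and `time_field_ok` is a
hypothetical stand-in for what a Draft 2020-12 validator would decide.

```python
import re

# Assumed pattern; the schema's actual regex is not quoted in the commit.
TIME_PATTERN = r"^\d{2}:\d{2}:\d{2}$"


def time_field_ok(value: object) -> bool:
    """Mirror type ["string", "null"] plus a pattern that only constrains strings."""
    if value is None:
        return True  # null satisfies the type; pattern is not consulted
    if isinstance(value, str):
        return re.search(TIME_PATTERN, value) is not None
    return False  # numbers, lists, etc. fail the type check
```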
This continues the structured-outputs series after participation
(07bc7b8e), detect_created (b58bd862), daily_schedule (0e098e7b), story
(50693752), and sense (8c952dc4). Live provider validation remains
deferred per precedent in those prior lodes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a read-only meta-surface that reports journal trust state: capture
coverage, synthesis completeness, consumer-signal health. Emits explicit
HealthNotes for missing signals and stale state rather than silently
zeroing.
- think/surfaces/types.py: append 5 frozen dataclasses per LOCKED shape
(CaptureHealth, SynthesisHealth, ConsumerSignalHealth, HealthNote,
HealthReport). Field names/types/order fixed by scope contract.
- think/surfaces/health.py: summary/full/for_range returning HealthReport.
One shared range scan across get_facets() × days feeds capture and
synthesis builders; consumer-signals composes ledger.list() twice.
Silent-facet ladder emits at most one note per facet (highest
severity); stale/missing indexer and missing talent day-index emit
dedicated notes. Two structural info notes (coverage_ratio and
corrections-ledger) always emitted per scope.
- think/tools/health.py: Typer sub-app mounting summary / full /
for-range / pipeline under sol call health. Pipeline subcommand
relocated verbatim from apps/health/call.py (local-time default
preserved). Top-level help disambiguates from sol health (service
liveness).
- think/call.py: re-alphabetize built-in import/mount block; register
health_app once.
- apps/health/call.py + apps/health/tests/test_call.py deleted to free
the sol call health namespace for the new built-in. Web surface
(routes.py, workspace.html, app.json, talent/health/) left intact.
- tests/test_surfaces_health.py: 23 tests covering every LOCKED metric,
note-emission rule, range validation, deterministic sort order, CLI
JSON round-trip, and pipeline subcommand behavior preservation.
Note sort order: (severity_rank, category, message) with
critical=0/warn=1/info=2, applied once in _build_report. Never silently
zero a missing signal. coverage_ratio always None in V1; hours_total
denominator shipped, but the ratio is withheld until Sprint 5+ per
cpo/specs/in-flight/consumer-surface-health.md.
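The note ordering can be sketched as follows. The `HealthNote` here is a
reduced stand-in with only the three fields the sort key names; the real
dataclass is fixed by the scope contract and may carry more.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"critical": 0, "warn": 1, "info": 2}


@dataclass(frozen=True)
class HealthNote:
    # Reduced stand-in: only the fields used by the sort key.
    severity: str
    category: str
    message: str


def sort_notes(notes: list[HealthNote]) -> list[HealthNote]:
    """Order by (severity_rank, category, message) for deterministic output."""
    return sorted(
        notes,
        key=lambda n: (SEVERITY_RANK[n.severity], n.category, n.message),
    )


notes = [
    HealthNote("info", "coverage", "coverage_ratio withheld"),
    HealthNote("critical", "indexer", "indexer missing"),
    HealthNote("warn", "capture", "facet silent"),
    HealthNote("warn", "capture", "day index missing"),
]
ordered = sort_notes(notes)
```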
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Next L3 structured-outputs migration: constrain the participation talent
with an external schema.
Follows sense (8c952dc4), story (50693752), daily_schedule (0e098e7b),
and detect_created (b58bd862).
Introduces talent/participation.schema.json plus a shared
participation_entry.schema.json fragment. The top-level schema inlines
that fragment shape (Option B) instead of using cross-file $ref because
think/talent.py::_load_talent_schema has no cross-file registry
plumbing; drift is guarded by
tests/test_participation_schema.py::test_participation_schema_items_match_fragment.
No post-hook changes: talent/participation.py is byte-identical, and the
prompt body in talent/participation.md is untouched.
Provider-side validation remains deferred per precedent; advisory local
validation continues through the existing dispatcher path.
Forward note: talent/schedule.md:55-63 is the logical next adoption site
for the shared fragment, but its current participation example does not
include entity_id, which the fragment requires, so that follow-up lode
will need to update the example or introduce a variant.
Add think/detect_created.schema.json (Draft 2020-12) and pass it as
json_schema= to the existing generate(...) call in detect_created().
This is the first non-talent-dispatcher consumer of the L1 json_schema
kwarg threaded through in c030248d.
Approach mirrors the L3 talent migrations (8c952dc4 sense, 50693752
story, 0e098e7b daily_schedule) but applied via direct generate() rather
than the talent dispatcher: detect_created.md is loaded through
think.prompts.load_prompt, not think.talent.get_talent, so the schema
lives co-located at think/detect_created.schema.json and is passed
explicitly.
Schema uses the provider-intersection subset only (type, enum, pattern,
required, additionalProperties, properties, minLength), with root
additionalProperties: false and required: [day, time, confidence, source,
utc]. The module memoizes the schema at import with a module-level
_SCHEMA constant; a malformed schema fails import loudly.
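What the root-level constraints amount to can be shown with a
hand-rolled check. This is only an illustration: `root_violations` is a
hypothetical helper, not the Draft202012Validator path the module uses,
and it assumes the schema declares no optional properties beyond the
five required ones.

```python
REQUIRED = ("day", "time", "confidence", "source", "utc")


def root_violations(payload: dict) -> list[str]:
    """List root-level problems under required + additionalProperties: false."""
    missing = [f"missing required field: {k}" for k in REQUIRED if k not in payload]
    extra = [f"unexpected field: {k}" for k in sorted(payload) if k not in REQUIRED]
    return missing + extra
```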
Caller wiring, parsing, UTC->local conversion, and the return shape are
unchanged. think/models.py validates advisorily via Draft202012Validator
and logs violations; no provider plumbing or caller edits were needed.
Live provider validation deferred: the worktree has no .env and provider
keys are unavailable. Advisory schema_validation will engage on the next
real run against google (primary) and anthropic (backup), matching the
0e098e7b precedent.
Tests: tests/test_detect_created_schema.py adds (1) Draft202012Validator
schema-validity, (2) accept/reject matrix covering each field's
constraint, and (3) a wiring assertion that detect_created() passes
_SCHEMA to generate() via monkeypatched think.models.generate. Existing
tests/test_importer.py mocks are unaffected (they return plain dicts and
bypass the schema path). make ci green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third structured-outputs L3 migration after talents/sense (8c952dc4)
and talents/story (50693752). Introduces talent/daily_schedule.schema.json —
Draft 2020-12, provider-intersection subset (type, pattern, required,
additionalProperties, properties) — two HH:MM-pattern string fields
{primary, fallback}.
talent/daily_schedule.py is unchanged — post_process's strptime("%H:%M")
remains the authoritative gate; schema enforcement is advisory via the
existing dispatcher plumbing (think/talent.py and think/talents.py) and
surfaces on the finish event's schema_validation.
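Why the hook stays authoritative can be seen by comparing a shape-only
pattern with `strptime`. A sketch, assuming `HHMM_PATTERN` resembles the
schema's pattern (the actual regex is not quoted here); both function
names are hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: it checks digit layout, not clock ranges.
HHMM_PATTERN = r"^\d{2}:\d{2}$"


def passes_pattern(value: str) -> bool:
    """Shape check, comparable to a JSON Schema `pattern` keyword."""
    return re.search(HHMM_PATTERN, value) is not None


def passes_hook(value: str) -> bool:
    """Range-aware check, comparable to post_process's strptime gate."""
    try:
        datetime.strptime(value, "%H:%M")
    except ValueError:
        return False
    return True
```

Under this assumed pattern, a value such as "25:99" has the right shape
but is not a valid time, so only the `strptime` gate rejects it.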
Live validation deferred — GOOGLE_API_KEY not available in this worktree
environment. Following the sense+story precedent, dispatcher pass-through
coverage remains provided by tests/test_generate_full.py
(test_dispatcher_passes_json_schema and
test_finish_event_includes_schema_validation), and new unit tests assert
schema validity and pattern behavior end-to-end.
Tests:
- test_daily_schedule_schema_file_is_valid_draft_2020_12
- test_daily_schedule_loaded_json_schema_matches_on_disk_schema
- test_daily_schedule_pattern_accepts_and_rejects_expected_values
Stats API baseline refresh: one new `schema` field in the daily_schedule
generator block in tests/baselines/api/stats/stats.json.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second structured-outputs L3 migration after talents/sense (8c952dc4).
Introduces talent/story.schema.json — Draft 2020-12, provider-intersection
subset (type, enum, required, additionalProperties, properties, items,
minLength, minimum, maximum, maxItems) — shared by the three story
talents. Schema enforcement is advisory: schema_validation surfaces on
the finish event via existing dispatcher plumbing (think/talent.py and
think/talents.py), but never blocks generation or the story post-hook.
talent/story.py is unchanged — it remains the authoritative source for
normalization (topics lowercase/dedupe/cap, confidence clamp, resolution
enum filtering). The schema mirrors its required-field contract:
- top-level: body, topics, confidence, commitments, closures, decisions
- commitments: 5 required string fields
- closures: 5 required string fields + resolution enum
{sent, done, signed, dropped, deferred}
- decisions: 3 required string fields
- additionalProperties: false at every closed object level
- topics deliberately permissive (no minItems/uniqueItems/per-item
minLength) — the hook does its own normalization
Tests:
- test_story_schema_file_is_valid_draft_2020_12
- test_story_talents_load_shared_schema (all three talents resolve the
shared schema through get_talent)
- test_story_schema_mirrors_hook_requirements (asserts required-field
parity with talent/story.py and enum parity with ALLOWED_RESOLUTIONS)
- test_story_hook_fixtures_validate_against_schema (hook test fixtures
stay schema-clean)
- All 12 existing test_story_hook.py tests remain green.
Stats API baseline refresh: the stats API exposes raw generator
frontmatter, so the new `schema` field appears in three generator blocks
in tests/baselines/api/stats/stats.json — updated accordingly.
Live validation deferred — provider creds not available in this worktree
environment, matching the sense migration's disposition. End-to-end
wiring is verified by the dispatcher pass-through test plus existing
json_schema coverage in tests/test_generate_full.py and
tests/test_talent.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five field-shape corrections to match cpo/specs/in-flight/consumer-surface-profile.md:
- Cadence.interactions_90d renamed to recent_interactions_count_30d; count
now filters the existing 90-day scan to the 30-day window (same anchor as
closed_since in full()). Interval math, last_seen, and quiet_gap_days still
derive from the full 90-day history.
- Cadence.gone_quiet_since now returns int days-since-last-seen (was the
YYYYMMDD string).
- Profile.generated_at is now an int UTC-ms timestamp (was an ISO string).
- Profile.full()'s decisions_involving_them drops the since= filter so
older-than-30-day decisions surface (previously suppressed).
- ProfileBrief reshaped to the spec's 7-field form: drops is_self and
generated_at; adds type and description (via new _ResolvedTarget.
description_for() helper shared with full()).
No backwards-compat shims; every caller updated in place. CLI renderers
and tests updated to match.
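Two of the type changes are simple conversions. A minimal sketch with
hypothetical helper names; how the library anchors "today" and which
clock it reads are assumptions.

```python
from datetime import date, datetime, timezone


def days_since_last_seen(last_seen: str, today: date) -> int:
    """Turn a YYYYMMDD string into an int day count, as gone_quiet_since now does."""
    seen = datetime.strptime(last_seen, "%Y%m%d").date()
    return (today - seen).days


def utc_ms(moment: datetime) -> int:
    """Express an aware datetime as integer milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


gap = days_since_last_seen("20260401", date(2026, 4, 19))
stamp = utc_ms(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc))
```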
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second of three V1 consumer surfaces after Ledger. Profile is an
entity-centric, read-time, no-cache library: cadence math from
participation records composed with per-counterparty Ledger state.
Exposes full/brief/cadence/list_active via library and sol call profile.
Make run_generate and run_agenerate robust to two latent Anthropic SDK
constraints that surfaced during the L3 sense pilot. Both apply only to
the generate path; cogitate is untouched.
Fix A: when thinking_budget is positive and max_output_tokens falls at
or below thinking_budget + 1000, lift max_tokens rather than clamp the
caller's thinking budget. The caller's declared max is a stated output
floor when thinking is active; clamping thinking would silently shrink
a deliberate reasoning budget. A logger.info emits before/after on each
lift. The BadRequestError retry inherits the adjustment via dict copy.
Fix B: route requests through client.messages.stream(...) when
max_tokens trips the SDK's non-streaming guard, either by exceeding
MODEL_NONSTREAMING_TOKENS[model] or the time formula threshold
(60 * 60 * max_tokens / 128_000 > 600 ≈ 21,333 tokens). Downstream
extraction is unchanged since ParsedMessage subclasses Message.
MODEL_NONSTREAMING_TOKENS is imported from anthropic._constants — the
SDK itself imports it via the public messages path at
anthropic/resources/messages/messages.py, so the symbol is stable.
Both fixes compose: thinking_budget=24576 + max_output_tokens=24576 is
lifted to 25577 and then routed through streaming. The retry path
re-evaluates the dispatch decision, so a primary create() call that
raises BadRequestError with post-lift max_tokens above the threshold
will route its retry through streaming.
Live validation with production-scale budgets deferred:
ANTHROPIC_API_KEY is not available in this worktree. Unit tests cover
Fix A (adjust / no-adjust / async), Fix B (create vs stream per model,
sync + async), the Fix A → Fix B interaction, and the tool-use fallback
routing under streaming.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live validation of the L3 sense pilot surfaced a real bug in L1's
Anthropic structured-output fallback path: when the primary
output_config call raises BadRequestError, the fallback to forced
tool_use kept the `thinking` parameter, which Anthropic's API rejects
("Thinking may not be enabled when tool_choice forces tool use"). The
fallback then bubbled a confusing secondary 400 instead of recovering.
Drop `thinking` from retry_kwargs in both sync + async paths. Restore
the temperature value that thinking originally displaced (the primary
path sets thinking xor temperature). Add a regression test asserting
the retry kwargs strip thinking and carry temperature forward.
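The retry adjustment amounts to a small dict transformation. A sketch
only: `build_retry_kwargs` is a hypothetical name, and the real code
also adds the forced tool_use parameters, which are omitted here.

```python
def build_retry_kwargs(primary_kwargs: dict, temperature: float) -> dict:
    """Copy the primary kwargs, drop `thinking`, and restore temperature."""
    retry_kwargs = dict(primary_kwargs)  # shallow copy; the primary call is untouched
    if retry_kwargs.pop("thinking", None) is not None:
        # The primary path sets thinking xor temperature, so put temperature back.
        retry_kwargs["temperature"] = temperature
    return retry_kwargs


primary = {
    "model": "example-model",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
}
retry = build_retry_kwargs(primary, temperature=0.3)
```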
Pre-existing Anthropic constraints surfaced during the same live test
but are out of scope here:
1. max_tokens must be > thinking.budget_tokens (production sense
defaults satisfy this)
2. SDK requires streaming for max_tokens that could take >10 min
(~30k+ for sonnet) — production sense default of 49152 hits this
Both affect any thinking-enabled Anthropic caller, schema or no
schema. Filed as separate VPE follow-up notes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move sense's structured-output schema out of the inline ```json fenced
block in talent/sense.md into talent/sense.schema.json (Draft 2020-12,
additionalProperties:false at every closed level), wired via the
`schema` frontmatter field that landed in cde43afc. The inline ##
Output Schema block becomes a one-line pointer; the field-by-field
prose stays as the model's semantic reference.
First real consumer of the structured-outputs infrastructure landed
in c030248d (Lode 1 — provider json_schema plumbing) and cde43afc
(Lode 2 — talent dispatcher schema wiring). This lode is the L3 pilot;
remaining `output: "json"` talents migrate one at a time.
Schema design:
- 9 required top-level fields; all enums copied verbatim from the
field-by-field prose
- speakers tightened to array<string> per apps/speakers/routes.py and
apps/speakers/bootstrap.py contracts
- Provider-intersection subset only: type, enum, required,
additionalProperties, properties, items, minLength
Test fixture alignment:
- _make_sense_output() helper now returns a schema-valid default
- test_pipeline_smoke SEGMENTS padded to schema-valid shape
- test_sense_contamination_guard payloads padded to schema-valid shape
- Intentionally degraded splitter inputs ({} and null-valued dict)
retained as defensive tests with comments explaining why
Live validation deferred — provider creds not available in this
worktree environment. End-to-end wiring is verified by:
- get_talent("sense")["json_schema"] equals the on-disk schema
(new test_sense_loaded_json_schema_matches_on_disk_schema)
- existing test_generate_full.py:129/:221 cover json_schema forwarding
and schema_validation surfacing on the finish event
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cortex spawns talents via `python -m think.talents` (think/cortex.py:32
TALENT_EXECUTION_MODULE, :241 spawn args). Without an `if __name__ ==
"__main__":` guard, the module loaded, defined `main()`, and exited 0
without ever invoking it — so every talent run since the agents→talents
rename in 0050be8c (2026-04-17) silently failed with "Talent exited
with code 0 without finish event" and no sense.json (or any other
talent output) was written.
Confirmed by surveying /data/solstone/journal/talents/sense/*.jsonl:
50/50 most recent runs error, all runs prior to 2026-04-17T16:53 finish.
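The failure mode is easy to reproduce in miniature. A sketch with a
stand-in `main()`; the real entry point in think/talents.py does far
more than print.

```python
def main() -> int:
    """Stand-in for the talent entry point."""
    print("finish")
    return 0


# Without this guard, `python -m <module>` imports the file, defines main(),
# and exits 0 without ever calling it.
if __name__ == "__main__":
    main()
```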
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/APPS.md gains a Skills app section covering the CLI verb surface
(Lode A), the owner-wide storage under journal/skills/, and the
daily observer (41) + editor (60) talent pair. API baselines are
refreshed: the legacy "skills" entry is gone; skill_observer +
skill_editor appear at the expected priorities in stats, talents-day,
providers, and generators.
Daily priority ladder after this lode:
30 activities_review
40 facet_newsletter
41 skill_observer ← new
50 morning_briefing, todos/daily (pre-existing collision)
55 entities
56 entities_review
57 entity_observer
60 skill_editor ← new
Part of Lode B of the skills-observer-editor refactor.
Deletes talent/skills.{md,py} and tests/test_skills_hook.py. The
new apps/skills/talent pair (skill_observer + skill_editor) plus
apps/skills/call.py now owns the full skills pipeline end to end.
tests/test_journal_skill.py is unrelated (tests the journal SKILL.md
symlinks) and is preserved.
Part of Lode B of the skills-observer-editor refactor.
Rewrites _collect_skills() to read journal/skills/ directly; drops
the per-facet walk and the legacy skill_generated flag. Returns the
new shape (slug, description, category, confidence, status, facets,
first_seen, last_seen), excludes retired patterns and patterns
without a profile markdown, sorts by confidence desc then last_seen.
Threads status through both the Jinja card block and the
refreshSkills() JS builder, appending " (dormant)" to dormant
cards. test_home_skills.py migrates to owner-wide fixtures with
dormant-visible and retired-hidden coverage.
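The filtering and ordering rules can be sketched as follows. Assumptions
are labeled: `collect_visible` and `has_profile` are hypothetical names,
`last_seen` is taken to be a sortable YYYYMMDD string, and the
tie-breaker is assumed to be most-recent-first.

```python
from typing import Callable


def collect_visible(patterns: list[dict], has_profile: Callable[[str], bool]) -> list[dict]:
    """Drop retired or profile-less patterns, then sort for display."""
    visible = [
        p for p in patterns
        if p["status"] != "retired" and has_profile(p["slug"])
    ]
    # Confidence descending, then last_seen descending (assumed direction).
    return sorted(visible, key=lambda p: (p["confidence"], p["last_seen"]), reverse=True)


patterns = [
    {"slug": "triage", "confidence": 0.9, "status": "active", "last_seen": "20260410"},
    {"slug": "review", "confidence": 0.9, "status": "dormant", "last_seen": "20260418"},
    {"slug": "legacy", "confidence": 1.0, "status": "retired", "last_seen": "20260101"},
    {"slug": "draft", "confidence": 0.7, "status": "active", "last_seen": "20260419"},
]
cards = collect_visible(patterns, has_profile=lambda slug: slug != "draft")
```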
Part of Lode B of the skills-observer-editor refactor.
Daily owner-wide talent pair that replaces the old per-activity
talent/skills observer/generator:
- skill_observer.md (cogitate, priority 41): reads recent activities
and promotes/seeds/refreshes patterns via sol call skills.
- skill_editor.md + skill_editor.py (generate, priority 60): writes
one profile per run from the pending queue (edit_requests →
needs_profile → needs_refresh → skip).
- Pre-hook builds $skill_context from metadata + full observation
ledger + last-5 activity records + last-3 narratives + per-span
JSONL reads for observations[-3:] (≈4.3–4.8k tokens worst case).
- Post-hook validates frontmatter (name, display_name, description
1–1024, category, numeric confidence 0–1), atomic writes via
think.skills.save_profile, clears pending flags idempotently, and
fires an agency.md nudge only on first-time creates.
- 19 hook tests + 1 prompt smoke test.
Part of Lode B of the skills-observer-editor refactor.
Lands the storage layout, shared helpers, and `sol call skills` CLI
verbs (list, show, observe, seed, promote, refresh, mark-dormant,
retire, edit-request, rename) as the foundation for the skills
observer/editor refactor. Zero user-visible change — the old
per-activity talent and per-facet data stay dormant on disk; the new
owner-wide journal/skills/{patterns.jsonl,edit_requests.jsonl,*.md}
starts empty. Writes are fcntl-locked, atomic, and idempotent. Lode B
will add the observer/editor talents and cut Pulse over to read the
new storage.
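One common way to get locked, atomic, idempotent writes on POSIX is a
lock file plus write-to-temp-then-rename. This is a sketch of that
general technique, not the think.skills implementation;
`write_atomic_locked` and the `.lock` sibling file are assumptions.

```python
import fcntl
import os
import tempfile
from pathlib import Path


def write_atomic_locked(path: Path, text: str) -> bool:
    """Write text to path under an exclusive lock; return False if unchanged."""
    lock_path = path.with_name(path.name + ".lock")
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # serialize concurrent writers
        try:
            if path.exists() and path.read_text(encoding="utf-8") == text:
                return False  # idempotent: same content, nothing to do
            fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(text)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp, path)  # atomic rename on the same filesystem
            return True
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Readers see either the old file or the new one, never a partial write,
because the rename happens only after the temp file is fully flushed.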
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lazy-load a schema frontmatter field in get_talent(): require a
relative path, reject traversal and symlink escapes, parse the JSON,
validate Draft 2020-12 well-formedness, and attach the result as
config["json_schema"]. Thread that through both
_execute_generate() call sites and propagate advisory
schema_validation onto finish events only when present.
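The path checks can be sketched with pathlib. `load_schema` is a
hypothetical stand-in for the loader, and Draft 2020-12 well-formedness
validation is left out because it needs the third-party jsonschema
package.

```python
import json
from pathlib import Path


def load_schema(talent_dir: Path, relative: str) -> dict:
    """Resolve a frontmatter schema path safely and parse it as JSON."""
    candidate = Path(relative)
    if candidate.is_absolute():
        raise ValueError(f"schema path must be relative: {relative}")
    base = talent_dir.resolve()
    # resolve() follows symlinks and collapses "..", so both traversal and
    # symlink escapes end up outside `base` and are caught below.
    resolved = (base / candidate).resolve()
    if not resolved.is_relative_to(base):
        raise ValueError(f"schema path escapes talent directory: {relative}")
    return json.loads(resolved.read_text(encoding="utf-8"))
```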
Only _execute_generate is wired here. _execute_batch_generate does
not exist, and think/batch.py remains a separate, non-talent-aware
path that is explicitly out of scope for this lode.
Schema loading and validation happen lazily in get_talent(), not in
get_talent_configs(). This keeps discovery, status, and settings
paths tolerant of one broken unused talent instead of failing
broadly at metadata load time.
No real talent migrates here; Lode 3 will migrate talent/sense.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Close three schema/normalization gaps in the storyteller post-hook.
Writes now record story.talent from the context name; topics are
stripped, lowercased, deduped, and capped at 10; confidence is clamped
to [0,1] instead of dropping out-of-range rows. NaN, non-numeric, and
bool confidence values are still rejected. Delete the orphaned design
doc that the refactor was handed off from.
Adds an optional `json_schema: dict | None` parameter to generate(),
agenerate(), and generate_with_result() in think/models.py and threads
it through all four provider modules. Each provider translates to its
native structured-output mechanism (Gemini response_json_schema, OpenAI
Responses text.format.json_schema, Anthropic output_config with
forced-tool-use fallback on BadRequestError, Ollama dict format).
After the call, an advisory jsonschema.validate pass logs violations
and surfaces a schema_validation field on GenerateResult from
generate_with_result; it never raises and never mutates the returned
text. When json_schema is None, every provider's SDK kwargs are
byte-for-byte identical to before. Lode 1 of 3 — talents and callers
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidate the three storytelling talents (conversation/work/event) onto
the activity record instead of a separate facets/*/spans/*.jsonl sink.
Each talent now emits story (body + topics + confidence) plus structured
commitments, closures, and decisions; the new talent/story.py hook merges
these onto the activity record via a dedicated writer in think/activities.py
(merge_story_fields), atomically under one file lock with a single edits[]
entry per merge.
- delete talent/spans.py, think/spans.py, and their tests
- drop the facets/*/spans/*.jsonl formatter entry
- storytellers run at priority 20 so they serialize after participation (10),
relying on existing priority-group dispatch rather than new locking
- format_activities renders story body + topics so FTS coverage moves onto
the activity record formatter
- owner/counterparty fuzzy-resolve to *_entity_id via find_matching_entity;
originals preserved, null on miss
- closures with non-vocab resolution dropped per-entry
- hook returns "" to suppress the generator JSON artifact
Unblocks the Ledger consumer surface without any back-compat shims or
historical backfill.
Retire meetings/decisions/followups/messaging per-span talents plus the
orphaned decisionalizer cogitate. Introduce conversation/work/event
storytelling talents that each emit one structured JSONL row per
(span, talent) into journal/facets/{facet}/spans/{YYYYMMDD}.jsonl via a
shared talent/spans.py post-hook. think/spans.py::format_spans renders
each row as an FTS-indexable markdown chunk.
Dispatch changes in think/thinking.py skip all three talents on
synthetic (cogitate/anticipated) or segmentless activity records, and
skip the work talent for browsing/reading activities below level_avg
0.4. Coding is ungated.
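The gating rules read roughly like the predicate below. This is a
sketch: `should_run` and the record keys (`synthetic`, `segments`,
`activity`, `level_avg`) are hypothetical names, not the fields
think/thinking.py actually reads.

```python
LEVEL_AVG_FLOOR = 0.4
GATED_ACTIVITIES = ("browsing", "reading")


def should_run(talent: str, record: dict) -> bool:
    """Decide whether a storytelling talent runs for an activity record."""
    if record.get("synthetic") or not record.get("segments"):
        return False  # all three talents skip synthetic or segmentless records
    if talent == "work" and record.get("activity") in GATED_ACTIVITIES:
        return record.get("level_avg", 0.0) >= LEVEL_AVG_FLOOR
    return True  # coding and everything else is ungated
```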
Forward-only flip: historical on-disk markdown under
facets/*/activities/{date}/{span_id}/*.md is preserved and still
indexed. Downstream consumers that read the retired filenames
(speakers app, activities_review, morning_briefing) continue to work on
pre-flip-day data and degrade gracefully on new data; a follow-up will
migrate them to read the new spans.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eliminates the startup race where sense/cortex called require_solstone()
before convey's TCP listener was bound, producing false ERROR /
Restarting noise and dropping startup-catchup tasks with exit code 1.
- think/supervisor.py: new wait_for_convey_ready() called between
start_convey_server() and start_sense/cortex. TaskQueue gains ready
flag + _pending buffer + set_ready() drain so startup catchup tasks
submit before convey is up but dispatch after. main() sets
SOL_SUPERVISOR_SPAWNED=1 before any _launch_process so children
inherit it.
- think/utils.py: EXIT_TEMPFAIL hoisted here as the single source of
truth; require_solstone() exits 75 (absorbed quietly by
handle_runner_exits) when SOL_SUPERVISOR_SPAWNED=1, else keeps the
existing exit-1 + friendly-stderr path for external CLIs.
- tests/test_supervisor_startup.py: 10 new deterministic tests.
- AGENTS.md: tiny fix for an unrelated pre-existing CI blocker
surfaced during `make ci`.
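The buffering and exit-code behavior can be sketched in isolation. The
class below is a reduced stand-in for the supervisor's TaskQueue (the
`dispatch` callback and `not_ready_exit_code` helper are hypothetical);
only `ready`, `_pending`, `set_ready()`, EXIT_TEMPFAIL, and
SOL_SUPERVISOR_SPAWNED come from the description above.

```python
from typing import Callable

EXIT_TEMPFAIL = 75


class TaskQueue:
    """Buffer tasks submitted before readiness and drain them on set_ready()."""

    def __init__(self, dispatch: Callable[[str], None]) -> None:
        self.ready = False
        self._pending: list[str] = []
        self._dispatch = dispatch

    def submit(self, task: str) -> None:
        if not self.ready:
            self._pending.append(task)  # held until convey is up
            return
        self._dispatch(task)

    def set_ready(self) -> None:
        self.ready = True
        pending, self._pending = self._pending, []
        for task in pending:
            self._dispatch(task)  # drained in submission order


def not_ready_exit_code(env: dict[str, str]) -> int:
    """Supervised children exit 75 quietly; external CLIs keep exit 1."""
    return EXIT_TEMPFAIL if env.get("SOL_SUPERVISOR_SPAWNED") == "1" else 1
```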
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the "Requires solstone to be installed with the sol CLI on PATH"
infrastructure detail out of the frontmatter description — the
Prerequisites section in the body already covers it. Keep the description
purely task-framed ("Query and search your solstone journal from any
project..."). Add sol call command literals to TRIGGER so sessions that
already know the commands still activate the skill. Add Invoke via Bash
line to the body summary.
This skill is installed user-wide via `npx skills add -g` by
`make install-service` (the dual-install pattern documented in the
synthesis plan) — intentional, not orphaned.
Part of skills audit req_loq3e2lk pass 1+2.
Clarify that tos.txt lives at <journal_root>/apps/support/portal/tos.txt
(resolved via journal app storage) to remove any ambiguity about source
tree vs. runtime location. Reformat TRIGGER into playbook style with
task-intent dash prefix + literal command list. Add Invoke via Bash line
and a Gotchas section (KB-first on create, --product default, diagnostic
leakage, attachment ordering).
Part of skills audit req_loq3e2lk pass 1+2.
The talent/vit/ skill is intentionally narrow — it's the publish step
that runs at the end of a VPE playbook when a feature lands through
hopper. Discovery and consumption commands (skim, follow, learn, remix,
vet) are covered by the user-wide `using-vit` skill; duplicating them
here would create activation-ambiguity between two skills.
Make the scope explicit in description, body, and a dedicated Scope
note. Add a structured TRIGGER (was previously inline "Activates when..."
prose) with literal ship-workflow phrases. Add Invoke via Bash line.
Replace the "other commands" section with a pre-ship-diagnostics
section + pointer to using-vit.
Part of skills audit req_loq3e2lk pass 1+2. The audit suggested promoting
skim/follow/learn to peer sections; rejected here because that would
duplicate using-vit and split activation unpredictably. Scope note in
the synthesis plan to address this.
Post-revamp exemplar still missed a structured TRIGGER line and the
"Invoke via Bash" clause. Adds both. Also replaces the stale
"Occurrence" vocabulary row (legacy term) with "Activity" to match
the reform shipped in references/facets.md and references/storage.md.
Part of skills audit req_loq3e2lk pass 1+2.
apps/health/call.py exposes `sol call health pipeline` (the only
command actually defined by the health app itself) but the skill
never documented it. Add a pipeline-summary section with --day /
--yesterday flags and examples.
Reframe description from "diagnose" infrastructure framing to task
framing ("monitor service uptime, troubleshoot capture gaps"). Make
the tri-scope surface explicit (sol health*, sol talent*,
sol call health*) since health diagnostics cross levels. Add a
Gotchas block (callosum 10s timeout, ms-precision talent log IDs,
today-vs-yesterday pipeline window). Expand TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Two user-facing commands existed in think/tools/routines.py (lines 395-442)
but were absent from the skill: suggest-respond (record accept/decline on
a routine suggestion) and suggest-state (print raw suggestion state).
Add both to the command table and a short "Responding to suggestions"
section. Also add a Gotchas block (IANA timezone requirement, suggestion
idempotency). Reframe description from infrastructure framing
("create, manage, and inspect") to task framing ("set up recurring
routines — daily briefings, weekly reviews..."). Expand TRIGGER with
literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Commands renamed in the write-verb polarity inversion (commit 1c59dad6):
owner detect → detect
owner confirm → confirm-owner (backfill default)
owner reject → reject-owner (14-day cooldown)
Skill still taught the old names; updated examples, headings, and the
conversational curation guidance to match.
Add sections for eight previously-undocumented commands: bootstrap,
resolve-names, attribute-segment, backfill, discover, link-import,
seed-from-imports, owner-ready. Add a Gotchas block led by the
preview-by-default writer polarity, the most common silent failure
in this CLI.
Plus Invoke via Bash line and expanded TRIGGER with literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Nudge scheduling (call.py:120-149, 441, 490) was live but the skill only
documented basic add/done/cancel/move. Add the --nudge/-n flag to the
add section with the four accepted formats, then add list-nudges-due
and dispatch-nudges as full sections. Also add a Gotchas block for the
fuzzy-duplicate silent-rejection behavior and the read/mutate split
between list-nudges-due and dispatch-nudges.
Plus Invoke via Bash and expanded TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Three commands shipped to production without matching skill updates:
- move (3/18, commit 064c50e2) — cross-facet entity moves
- consolidate (4/17, commit 9730cd6d) — roll segment detections into journal entities
- merge (4/17, commit c9c1c8a9) — dedup two journal entities, preview-by-default
Add sections for each plus a Gotchas block calling out merge's --no-commit
default, consolidate's 85% fuzzy threshold, and the detect-vs-observe split.
Add Invoke via Bash line and expand TRIGGER with literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Participation shipped 4/18 (commit 101da52e) but skill was never updated.
Adds the schema (name, role ∈ {attendee, mentioned}, source ∈ {voice,
speaker_label, transcript, screen, other}, confidence, context) and a
worked example. Also adds the --source filter flag on list (anticipated,
user, cogitate).
Plus Invoke via Bash line and literal command names in TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Code renamed in March (f8bd2868, bf1cab94) but skill still taught the
deprecated --audio / --screen flags as primary. Update source-flag table,
source-type meanings, scan description, and example to match current code.
Keep --audio / --screen mentioned as hidden aliases. Also add "Invoke via
Bash" line and literal command names to TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Full revamp of the repo-root AGENTS.md per vpe req_7szh7vce. Promotes
load-bearing invariants from docs/coding-standards.md inline so they
can't be skipped: L1–L9 layer hygiene (with the domain-ownership table
at the top), no-compat-shims, trust get_journal(), SPDX header, fail
loudly. Adds a start-here ordered reading list, a top-level repo map,
an annotated repo mental model, make-command coverage grouped by use
(37 targets with when-to-use notes), a testing quickstart, and an
annotated depth-doc map.
Relocates the misaudienced "Talent CLI Boundaries" runtime restrictions
out of AGENTS.md (coder audience) into talent/journal/references/cli.md
(cogitate audience, loaded via the journal skill).
Replaces the full Layer Hygiene section in docs/coding-standards.md
with a one-line redirect to AGENTS.md §7 — single source of truth for
the invariants, no drift.
Plan: vpe/workspace/solstone-agents-md-revamp-plan.md (extro repo).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tighten contamination guard in talent/sense.md: role: attendee now explicitly requires meeting_detected: true for the same segment; otherwise every Person is role: mentioned.
Add a Python clamp in talent/participation.post_process: when no contributing sense segment has meeting_detected: true, rewrite every role: attendee entry to role: mentioned. Missing/unreadable sense.json collapses to False. One warning per affected record.
Cover with tests/test_sense_contamination_guard.py (prompt-presence + runtime clamp + idempotence + missing-file fallback).
Per-owner identity files (self.md, agency.md, partner.md, awareness.md,
pulse.md, pulse_output.md, identity_pulse.md, history.jsonl, news/*) were
being written into a repo-tracked sol/ directory and committed/pushed to
the public AGPL repo by talent/heartbeat.md. This is an active privacy
leak that violates sol pbc Article IV.
Five coupled moves in one pass:
1. git rm -r sol/ at repo root; /sol/ added to .gitignore.
2. {journal}/sol/ → {journal}/identity/ everywhere (code, tests, fixtures,
docs); ensure_sol_directory → ensure_identity_directory.
3. New write_identity() helper in think/identity.py — single write path
with per-directory fcntl.LOCK_EX, atomic tmpfile + os.replace, 0o600
perms, hash-based history.jsonl audit log (no diffs).
4. talent/heartbeat.md: deleted "Path notes" block; Step 6 is now a no-op
close. Other talents (awareness_tender, pulse, naming) updated to use
identity/ paths and stripped of any commit/push instructions.
5. think/prompts.py: removed SOL_DIR and the repo-root read branch;
template vars are now $identity_* loaded only from {journal}/identity/.
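The single write path in item 3 can be sketched roughly as below. This is
an illustration under assumptions, not the code in think/identity.py: the
signature, the `.lock` file name, and the history record fields are
hypothetical. Only the exclusive per-directory lock, the tmpfile +
os.replace swap, the 0o600 mode, and the hash-only history line come from
the description above.

```python
import fcntl
import hashlib
import json
import os
import tempfile
import time
from pathlib import Path


def write_identity(identity_dir: Path, name: str, content: str) -> None:
    """Hypothetical sketch: locked, atomic, owner-only write plus a hash-only audit line."""
    identity_dir.mkdir(parents=True, exist_ok=True)
    data = content.encode("utf-8")
    lock_path = identity_dir / ".lock"  # assumed lock-file name
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # one writer per directory at a time
        try:
            fd, tmp = tempfile.mkstemp(dir=identity_dir, prefix=f".{name}.")
            try:
                with os.fdopen(fd, "wb") as fh:
                    fh.write(data)
                    fh.flush()
                    os.fsync(fh.fileno())
                os.chmod(tmp, 0o600)
                # Same-directory rename, so readers see the old or the new file, never a partial one.
                os.replace(tmp, identity_dir / name)
            except BaseException:
                if os.path.exists(tmp):
                    os.unlink(tmp)
                raise
            # Audit record stores a content hash only, never the content or a diff.
            record = {
                "ts": int(time.time() * 1000),
                "file": name,
                "sha256": hashlib.sha256(data).hexdigest(),
            }
            history = identity_dir / "history.jsonl"
            with open(history, "a", encoding="utf-8") as hist:
                hist.write(json.dumps(record) + "\n")
            os.chmod(history, 0o600)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Writing the temp file in the target directory keeps os.replace on one
filesystem, which is what makes the swap atomic.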
Acceptance greps all return 0. New tests cover write_identity()
atomicity, lock serialization, history schema, 0o600 mode, and missing-
file first writes. CLI surface (sol call identity ...) unchanged.
No data migration code (Jer is sole user, resets identity on deploy).
No backward-compat shims. Clean break.
Sprint 4 cleanup 1 (req_myu5gwm7 Lode 4xvyyikt). Previously
merged via hopper but blocked by a known refine-stage stuck-lode
false positive; applied manually from the lode worktree.
Path A — talent/sense.md: contamination guard now explicitly ties
role: attendee to meeting_detected: true for the same segment.
When meeting_detected: false, every Person must be role: mentioned
even if they spoke, were quoted, or were referenced in the transcript
(closes podcast / press conference / lecture leak).
Path B — talent/participation.py: post-processor clamps attendee
to mentioned when no contributing sense segment had
meeting_detected: true. Defensive net catching any Path A prompt
leakage. Logs at warning level with count + activity id for
observability.
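A minimal sketch of the Path B clamp, assuming entries and sense segments
arrive as plain dicts. The function name, arguments, and log wording are
hypothetical; the real post-processor in talent/participation.py may be
shaped differently.

```python
import logging

logger = logging.getLogger(__name__)


def clamp_attendees(entries, sense_segments, activity_id="unknown"):
    """Rewrite role 'attendee' to 'mentioned' unless some segment detected a meeting."""
    # A missing or unreadable sense payload is modelled here as None (or any non-dict)
    # and counts as meeting_detected = False.
    meeting = any(
        isinstance(seg, dict) and seg.get("meeting_detected") is True
        for seg in sense_segments
    )
    if meeting:
        return list(entries)
    clamped = 0
    result = []
    for entry in entries:
        if entry.get("role") == "attendee":
            entry = {**entry, "role": "mentioned"}
            clamped += 1
        result.append(entry)
    if clamped:
        # Single warning carrying the count and activity id.
        logger.warning(
            "clamped %d attendee(s) to mentioned for activity %s", clamped, activity_id
        )
    return result
```

Running the clamp on its own output changes nothing, which is the
idempotence property the tests assert.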
Tests: 3 new cases in test_sense_contamination_guard.py covering
clamp positive (podcast with speakers → all mentioned), negative
pass-through (real meeting unchanged), idempotence + logging.
make test: 3254 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-Sprint-4 trivials bundled:
- remove dead occurrence/has_extraction detection in settings routes
plus the now-unreachable Extract toggle UI branch in workspace.html,
and drop the corresponding extract/has_extraction keys from the
generators.json baseline; refresh adjacent docstring/copy that
referenced the removed field
- normalize the schedule generator description to the post-Sprint-3
"anticipated activity records" phrasing in talent/schedule.md and
the three API baselines that mirror it
- fix stale "calendar" in the talent/journal/references/config.md
apps.order example (now "activities")
- replace retired "occurrence" hook value with live "schedule" in
the _format_tags test data
- document the sense.json/sense.md dual-write coupling to
think/cluster.py's per-segment talents/**/*.md glob
The activity-anticipation dispatcher only loaded the current local day's
records, so a pre-alert for an early-morning D+1 activity was silently
missed when local_now was late on day D (and symmetric on the other side).
Scan yesterday/today/tomorrow gated by _ACTIVITY_ANTICIPATION_CROSSDAY_WINDOW_MINUTES
(default 120), and build start_dt from each record's own day.
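One way to read the cross-day gate, sketched under assumptions: a
neighboring day is scanned only when the window around local_now reaches
into it. The helper names, the YYYYMMDD day key, and the HH:MM:SS start
format are illustrative and not taken from the dispatcher.

```python
from datetime import datetime, timedelta

CROSSDAY_WINDOW_MINUTES = 120  # mirrors the stated default


def days_to_scan(local_now, window_minutes=CROSSDAY_WINDOW_MINUTES):
    """Return YYYYMMDD day keys that the window around local_now touches."""
    window = timedelta(minutes=window_minutes)
    days = {
        (local_now - window).date(),
        local_now.date(),
        (local_now + window).date(),
    }
    return [d.strftime("%Y%m%d") for d in sorted(days)]


def start_dt_for(day, start):
    """Build the start datetime from the record's own day, not from today's date."""
    return datetime.strptime(f"{day} {start}", "%Y%m%d %H:%M:%S")
```

At 23:30 on day D this yields D and D+1, so a 00:45 record stored under
D+1 resolves to a start 75 minutes away instead of being skipped.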
Co-Authored-By: Codex <noreply@openai.com>
Migrate entity_observer to an external Draft 2020-12 schema and flip the
observations payload from a dynamic-key dict ({slug: [...]}) to a typed
list of groups ([{entity_id, items: [...]}]). Typed lists validate under
OpenAI strict mode where patternProperties does not (founder decision #4,
2026-04-19 audit).
Clean break: no dict-of-lists fallback in the post-hook; schema + prompt
+ hook flip together. Stats baseline updated to reflect the new schema
field surfacing for this talent (matches the speaker_attribution pattern).
Hook tests rewritten to the new shape; a new schema test file covers
validator + positive/negative payloads.
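To make the shape flip concrete, here is a small sketch of the two
payload forms. Both helpers are hypothetical: the converter only shows
how the shapes relate (the commit ships no such fallback), and the
structural check stands in for the real schema validation.

```python
def to_groups(by_slug):
    """Convert the old {slug: [items]} dict into [{"entity_id": slug, "items": [...]}]."""
    return [{"entity_id": slug, "items": list(items)} for slug, items in by_slug.items()]


def is_valid_groups(payload):
    """Check the typed-list shape: every group has exactly the two fixed keys."""
    return isinstance(payload, list) and all(
        isinstance(group, dict)
        and set(group) == {"entity_id", "items"}
        and isinstance(group["entity_id"], str)
        and isinstance(group["items"], list)
        for group in payload
    )
```

Because every key in the new shape is known ahead of time, a schema can
list the properties and close the object, with no need for
patternProperties.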
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scaffolding chunks 1-4 of the spl-solstone integration (vpe req_xhoetsh6).
Complete fork from github.com/solpbc/spl home/ — no pip dep, no submodule,
no sync. Two projects are fully independent from here.
- think/link/: tunnel service (service.py, relay_client.py, wsgi_bridge.py,
ca.py, auth.py, nonces.py, mux.py, framing.py, tls_adapter.py, paths.py).
pair_server.py dropped — pair runs through convey's existing listener.
CA has no passphrase layer per spec (journal/link/ca/private.pem mode 0600).
Added last_seen_at column + touch_last_seen() on AuthorizedClients.
WSGI bridge pipes tunnel bytes to convey's real Flask app.
- apps/link/: dashboard (workspace.html), Flask routes (/pair-start,
/pair, /unpair, /api/devices, /api/status), Typer CLI (pair/list/
unpair/status). All spec literal-copy strings landed verbatim.
- supervisor: link service launches alongside cortex (--no-link opt-out).
- sol.py: 'sol link' command + GROUPS entry.
- pyproject.toml: pyOpenSSL + websockets deps.
- tests/link/test_framing.py: 17 tests ported from spl-repo (all green).
Privacy invariant: no payload bytes in logs. Rendezvous-only (method,
path, status, byte counts, tunnel_id, stream_id). Callosum tract 'link'
emits enrolled/connecting/connected/disconnect/tunnel_pair/tunnel_close/
last_seen.
Remaining chunks (ca/auth/mux/nonces/wsgi unit tests; in-tree test
client; end-to-end integration test; blindness grep; spl-repo
cross-reference PR) delegated to a continuation hopper lode.
Full solstone suite: 3498 tests passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply founder decision #3 from the 2026-04-19 audit: schema-constrain
speaker_attribution output to a top-level array, load that schema from
prompt frontmatter, and remove post_process tolerance for legacy wrapper
shapes. This keeps the migration a clean break per CLAUDE.md §8. Also
adds focused schema and post-hook coverage and mirrors the new schema
field in the stats API baseline.
Next structured-outputs migration: constrain the schedule talent with an
external Draft 2020-12 schema. This introduces
`talent/schedule.schema.json`, wires it via
`"schema": "schedule.schema.json"` in the frontmatter, and encodes
every field the post-hook reads. `talent/schedule.py` is unchanged:
strict hook validation remains authoritative, and the schema adds
defense-in-depth plus provider-side constraint.
The single semantic tightening is `activity`: it now uses a closed enum
of 10 values (`meeting`, `call`, `deadline`, `appointment`, `event`,
`travel`, `reminder`, `errand`, `celebration`,
`doctor_appointment`) instead of the prompt's prior "not a restricted
enum" disclaimer. The scope's seed list held: a fixture scan across
`tests/fixtures/` and `tests/baselines/` found zero real anticipated
records, so no widening was required.
The inline participation entry deliberately omits `entity_id` (5 fields
instead of the shared fragment's 6) because the schedule hook assigns it
itself via `find_matching_entity`. A drift test in
`tests/test_schedule_schema.py` asserts the inline shape equals the
shared fragment minus `entity_id`, so any future fragment change forces
an explicit update here. The loader still has no cross-file `$ref`
plumbing, so the shape remains inlined rather than referenced.
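The drift guard amounts to a comparison like the one below. This is a
sketch with a hypothetical helper name and toy schemas; the actual
assertion lives in tests/test_schedule_schema.py and may compare more
than properties and required.

```python
def inline_matches_fragment(inline, fragment, omitted="entity_id"):
    """True when the inline object schema equals the fragment minus one property."""
    expected_props = {
        key: value for key, value in fragment["properties"].items() if key != omitted
    }
    expected_required = [key for key in fragment["required"] if key != omitted]
    return (
        inline["properties"] == expected_props
        and sorted(inline["required"]) == sorted(expected_required)
    )
```

Any field added to or changed in the fragment makes the comparison fail
until the inline copy is updated to match.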
A few schema decisions are worth recording. `participation_confidence`
is required but typed `["number", "null"]`, matching the hook's
`.get(...)` and null-tolerant prompt contract. `details` is required but
typed `"string"` with no `minLength`, because empty string is valid per
both the prompt and `str(... or "")` in the hook. `start` and `end`
use `["string", "null"]` plus an `HH:MM:SS` pattern; under Draft
2020-12 that pattern only applies to string instances.
This continues the structured-outputs series after participation
(07bc7b8e), detect_created (b58bd862), daily_schedule (0e098e7b), story
(50693752), and sense (8c952dc4). Live provider validation remains
deferred per precedent in those prior lodes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds read-only meta-surface reporting journal trust state: capture
coverage, synthesis completeness, consumer-signal health. Emits explicit
HealthNotes for missing signals and stale state rather than silently
zeroing.
- think/surfaces/types.py: append 5 frozen dataclasses per LOCKED shape
(CaptureHealth, SynthesisHealth, ConsumerSignalHealth, HealthNote,
HealthReport). Field names/types/order fixed by scope contract.
- think/surfaces/health.py: summary/full/for_range returning HealthReport.
One shared range scan across get_facets() × days feeds capture and
synthesis builders; consumer-signals composes ledger.list() twice.
Silent-facet ladder emits at most one note per facet (highest
severity); stale/missing indexer and missing talent day-index emit
dedicated notes. Two structural info notes (coverage_ratio and
corrections-ledger) always emitted per scope.
- think/tools/health.py: Typer sub-app mounting summary / full /
for-range / pipeline under sol call health. Pipeline subcommand
relocated verbatim from apps/health/call.py (local-time default
preserved). Top-level help disambiguates from sol health (service
liveness).
- think/call.py: re-alphabetize built-in import/mount block; register
health_app once.
- apps/health/call.py + apps/health/tests/test_call.py deleted to free
the sol call health namespace for the new built-in. Web surface
(routes.py, workspace.html, app.json, talent/health/) left intact.
- tests/test_surfaces_health.py: 23 tests covering every LOCKED metric,
note-emission rule, range validation, deterministic sort order, CLI
JSON round-trip, and pipeline subcommand behavior preservation.
Note sort order: (severity_rank, category, message) with
critical=0/warn=1/info=2, applied once in _build_report. Never silently
zero a missing signal. coverage_ratio always None in V1; hours_total
denominator shipped, but the ratio is withheld until Sprint 5+ per
cpo/specs/in-flight/consumer-surface-health.md.
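The note ordering can be illustrated as follows. The `Note` dataclass is
a stand-in with assumed fields; the real HealthNote field set is fixed by
the scope contract and is not reproduced here.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"critical": 0, "warn": 1, "info": 2}


@dataclass(frozen=True)
class Note:  # hypothetical stand-in for HealthNote
    severity: str
    category: str
    message: str


def sort_notes(notes):
    """Order by (severity_rank, category, message) so output is deterministic."""
    return sorted(
        notes, key=lambda n: (SEVERITY_RANK[n.severity], n.category, n.message)
    )
```

Sorting once on the full tuple keeps the report stable regardless of the
order in which the builders emitted their notes.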
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Next L3 structured-outputs migration: constrain the participation talent
with an external schema.
Follows sense (8c952dc4), story (50693752), daily_schedule (0e098e7b),
and detect_created (b58bd862).
Introduces talent/participation.schema.json plus a shared
participation_entry.schema.json fragment. The top-level schema inlines
that fragment shape (Option B) instead of using cross-file $ref because
think/talent.py::_load_talent_schema has no cross-file registry
plumbing; drift is guarded by
tests/test_participation_schema.py::test_participation_schema_items_match_fragment.
No post-hook changes: talent/participation.py is byte-identical, and the
prompt body in talent/participation.md is untouched.
Provider-side validation remains deferred per precedent; advisory local
validation continues through the existing dispatcher path.
Forward note: talent/schedule.md:55-63 is the logical next adoption site
for the shared fragment, but its current participation example does not
include entity_id, which the fragment requires, so that follow-up lode
will need to update the example or introduce a variant.
Add think/detect_created.schema.json (Draft 2020-12) and pass it as
json_schema= to the existing generate(...) call in detect_created().
This is the first non-talent-dispatcher consumer of the L1 json_schema
kwarg threaded through in c030248d.
Approach mirrors the L3 talent migrations (8c952dc4 sense, 50693752
story, 0e098e7b daily_schedule) but applied via direct generate() rather
than the talent dispatcher: detect_created.md is loaded through
think.prompts.load_prompt, not think.talent.get_talent, so the schema
lives co-located at think/detect_created.schema.json and is passed
explicitly.
Schema uses the provider-intersection subset only (type, enum, pattern,
required, additionalProperties, properties, minLength), with root
additionalProperties: false and required: [day, time, confidence, source,
utc]. The module memoizes the schema at import with a module-level
_SCHEMA constant; a malformed schema fails import loudly.
Caller wiring, parsing, UTC->local conversion, and the return shape are
unchanged. think/models.py validates advisorily via Draft202012Validator
and logs violations; no provider plumbing or caller edits were needed.
Live provider validation deferred: the worktree has no .env and provider
keys are unavailable. Advisory schema_validation will engage on the next
real run against google (primary) and anthropic (backup), matching the
0e098e7b precedent.
Tests: tests/test_detect_created_schema.py adds (1) Draft202012Validator
schema-validity, (2) accept/reject matrix covering each field's
constraint, and (3) a wiring assertion that detect_created() passes
_SCHEMA to generate() via monkeypatched think.models.generate. Existing
tests/test_importer.py mocks are unaffected (they return plain dicts and
bypass the schema path). make ci green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third structured-outputs L3 migration after talents/sense (8c952dc4)
and talents/story (50693752). Introduces talent/daily_schedule.schema.json —
Draft 2020-12, provider-intersection subset (type, pattern, required,
additionalProperties, properties) — two HH:MM-pattern string fields
{primary, fallback}.
talent/daily_schedule.py is unchanged — post_process's strptime("%H:%M")
remains the authoritative gate; schema enforcement is advisory via the
existing dispatcher plumbing (think/talent.py and think/talents.py) and
surfaces on the finish event's schema_validation.
Live validation deferred — GOOGLE_API_KEY not available in this worktree
environment. Following the sense+story precedent, dispatcher pass-through
coverage remains provided by tests/test_generate_full.py
(test_dispatcher_passes_json_schema and
test_finish_event_includes_schema_validation), and new unit tests assert
schema validity and pattern behavior end-to-end.
Tests:
- test_daily_schedule_schema_file_is_valid_draft_2020_12
- test_daily_schedule_loaded_json_schema_matches_on_disk_schema
- test_daily_schedule_pattern_accepts_and_rejects_expected_values
Stats API baseline refresh: one new `schema` field in the daily_schedule
generator block in tests/baselines/api/stats/stats.json.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Second structured-outputs L3 migration after talents/sense (8c952dc4).
Introduces talent/story.schema.json — Draft 2020-12, provider-intersection
subset (type, enum, required, additionalProperties, properties, items,
minLength, minimum, maximum, maxItems) — shared by the three story
talents. Schema enforcement is advisory: schema_validation surfaces on
the finish event via existing dispatcher plumbing (think/talent.py and
think/talents.py), but never blocks generation or the story post-hook.
talent/story.py is unchanged — it remains the authoritative source for
normalization (topics lowercase/dedupe/cap, confidence clamp, resolution
enum filtering). The schema mirrors its required-field contract:
- top-level: body, topics, confidence, commitments, closures, decisions
- commitments: 5 required string fields
- closures: 5 required string fields + resolution enum
{sent, done, signed, dropped, deferred}
- decisions: 3 required string fields
- additionalProperties: false at every closed object level
- topics deliberately permissive (no minItems/uniqueItems/per-item
minLength) — the hook does its own normalization
Tests:
- test_story_schema_file_is_valid_draft_2020_12
- test_story_talents_load_shared_schema (all three talents resolve the
shared schema through get_talent)
- test_story_schema_mirrors_hook_requirements (asserts required-field
parity with talent/story.py and enum parity with ALLOWED_RESOLUTIONS)
- test_story_hook_fixtures_validate_against_schema (hook test fixtures
stay schema-clean)
- All 12 existing test_story_hook.py tests remain green.
Stats API baseline refresh: the stats API exposes raw generator
frontmatter, so the new `schema` field appears in three generator blocks
in tests/baselines/api/stats/stats.json — updated accordingly.
Live validation deferred — provider creds not available in this worktree
environment, matching the sense migration's disposition. End-to-end
wiring is verified by the dispatcher pass-through test plus existing
json_schema coverage in tests/test_generate_full.py and
tests/test_talent.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five field-shape corrections to match cpo/specs/in-flight/consumer-surface-profile.md:
- Cadence.interactions_90d renamed to recent_interactions_count_30d; count
now filters the existing 90-day scan to the 30-day window (same anchor as
closed_since in full()). Interval math, last_seen, and quiet_gap_days still
derive from the full 90-day history.
- Cadence.gone_quiet_since now returns int days-since-last-seen (was the
YYYYMMDD string).
- Profile.generated_at is now an int UTC-ms timestamp (was an ISO string).
- Profile.full()'s decisions_involving_them drops the since= filter so
older-than-30-day decisions surface (previously suppressed).
- ProfileBrief reshaped to the spec's 7-field form: drops is_self and
generated_at; adds type and description (via new _ResolvedTarget.
description_for() helper shared with full()).
No backwards-compat shims; every caller updated in place. CLI renderers
and tests updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make run_generate and run_agenerate robust to two latent Anthropic SDK
constraints that surfaced during the L3 sense pilot. Both apply only to
the generate path; cogitate is untouched.
Fix A: when thinking_budget is positive and max_output_tokens falls at
or below thinking_budget + 1000, lift max_tokens rather than clamp the
caller's thinking budget. The caller's declared max is a stated output
floor when thinking is active; clamping thinking would silently shrink
a deliberate reasoning budget. A logger.info emits before/after on each
lift. The BadRequestError retry inherits the adjustment via dict copy.
Fix B: route requests through client.messages.stream(...) when
max_tokens trips the SDK's non-streaming guard, either by exceeding
MODEL_NONSTREAMING_TOKENS[model] or the time formula threshold
(60 * 60 * max_tokens / 128_000 > 600 ≈ 21,333 tokens). Downstream
extraction is unchanged since ParsedMessage subclasses Message.
MODEL_NONSTREAMING_TOKENS is imported from anthropic._constants — the
SDK itself imports it via the public messages path at
anthropic/resources/messages/messages.py, so the symbol is stable.
Both fixes compose: thinking_budget=24576 + max_output_tokens=24576 is
lifted to 25577 and then routed through streaming. The retry path
re-evaluates the dispatch decision, so a primary create() call that
raises BadRequestError with post-lift max_tokens above the threshold
will route its retry through streaming.
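The arithmetic behind both fixes, as a sketch. The function names are
hypothetical, and the lift target of thinking_budget + 1001 is inferred
from the 24576 → 25577 example above rather than read from the code.

```python
THINKING_HEADROOM = 1000  # margin named in Fix A


def lift_max_tokens(max_output_tokens, thinking_budget):
    """Raise max_tokens when it leaves too little room above the thinking budget."""
    if thinking_budget > 0 and max_output_tokens <= thinking_budget + THINKING_HEADROOM:
        return thinking_budget + THINKING_HEADROOM + 1  # assumed lift target
    return max_output_tokens


def needs_streaming(max_tokens, model_limit=None):
    """Apply the two guards from Fix B: per-model cap, then the time estimate."""
    if model_limit is not None and max_tokens > model_limit:
        return True
    expected_seconds = 60 * 60 * max_tokens / 128_000
    return expected_seconds > 600
```

With these definitions, 21,333 tokens stays on the non-streaming path
(about 599.99 estimated seconds) and 21,334 tips over to streaming, which
matches the threshold quoted in Fix B.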
Live validation with production-scale budgets deferred:
ANTHROPIC_API_KEY is not available in this worktree. Unit tests cover
Fix A (adjust / no-adjust / async), Fix B (create vs stream per model,
sync + async), the Fix A → Fix B interaction, and the tool-use fallback
routing under streaming.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live validation of the L3 sense pilot surfaced a real bug in L1's
Anthropic structured-output fallback path: when the primary
output_config call raises BadRequestError, the fallback to forced
tool_use kept the `thinking` parameter, which Anthropic's API rejects
("Thinking may not be enabled when tool_choice forces tool use"). The
fallback then bubbled a confusing secondary 400 instead of recovering.
Drop `thinking` from retry_kwargs in both sync + async paths. Restore
the temperature value that thinking originally displaced (the primary
path sets thinking xor temperature). Add a regression test asserting
the retry kwargs strip thinking and carry temperature forward.
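A sketch of the retry-kwargs adjustment, assuming the primary call's
keyword arguments are held in a dict. The helper name, the tool name, and
the removal of `output_config` are assumptions made for illustration; the
commit itself only states that `thinking` is dropped and temperature is
restored.

```python
def build_retry_kwargs(primary_kwargs, temperature, tool_name="structured_output"):
    """Derive forced-tool-use retry kwargs without mutating the primary kwargs."""
    retry = dict(primary_kwargs)
    retry.pop("thinking", None)  # forced tool use is rejected when thinking is enabled
    retry.pop("output_config", None)  # assumed: the fallback replaces output_config
    retry["temperature"] = temperature  # restore the value thinking had displaced
    retry["tool_choice"] = {"type": "tool", "name": tool_name}
    return retry
```

Copying the dict first means the primary kwargs stay intact for logging
or for any later re-dispatch decision.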
Pre-existing Anthropic constraints surfaced during the same live test
but are out of scope here:
1. max_tokens must be > thinking.budget_tokens (production sense
defaults satisfy this)
2. SDK requires streaming for max_tokens that could take >10 min
(~30k+ for sonnet) — production sense default of 49152 hits this
Both affect any thinking-enabled Anthropic caller, schema or no
schema. Filed as separate VPE follow-up notes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move sense's structured-output schema out of the inline ```json fenced
block in talent/sense.md into talent/sense.schema.json (Draft 2020-12,
additionalProperties:false at every closed level), wired via the
`schema` frontmatter field that landed in cde43afc. The inline ##
Output Schema block becomes a one-line pointer; the field-by-field
prose stays as the model's semantic reference.
First real consumer of the structured-outputs infrastructure landed
in c030248d (Lode 1 — provider json_schema plumbing) and cde43afc
(Lode 2 — talent dispatcher schema wiring). This lode is the L3 pilot;
remaining `output: "json"` talents migrate one at a time.
Schema design:
- 9 required top-level fields; all enums copied verbatim from the
field-by-field prose
- speakers tightened to array<string> per apps/speakers/routes.py and
apps/speakers/bootstrap.py contracts
- Provider-intersection subset only: type, enum, required,
additionalProperties, properties, items, minLength
Test fixture alignment:
- _make_sense_output() helper now returns a schema-valid default
- test_pipeline_smoke SEGMENTS padded to schema-valid shape
- test_sense_contamination_guard payloads padded to schema-valid shape
- Intentionally degraded splitter inputs ({} and null-valued dict)
retained as defensive tests with comments explaining why
Live validation deferred — provider creds not available in this
worktree environment. End-to-end wiring is verified by:
- get_talent("sense")["json_schema"] equals the on-disk schema
(new test_sense_loaded_json_schema_matches_on_disk_schema)
- existing test_generate_full.py:129/:221 cover json_schema forwarding
and schema_validation surfacing on the finish event
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cortex spawns talents via `python -m think.talents` (think/cortex.py:32
TALENT_EXECUTION_MODULE, :241 spawn args). Without an `if __name__ ==
"__main__":` guard, the module loaded, defined `main()`, and exited 0
without ever invoking it — so every talent run since the agents→talents
rename in 0050be8c (2026-04-17) silently failed with "Talent exited
with code 0 without finish event" and no sense.json (or any other
talent output) was written.
Confirmed by surveying /data/solstone/journal/talents/sense/*.jsonl:
the 50 most recent runs all error, while all runs prior to 2026-04-17T16:53 finish.
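The failure mode in miniature, as a sketch: `main()` here is a
placeholder, not the real think.talents entry point.

```python
def main():
    """Placeholder entry point; returns a process-style exit code."""
    print("finish")  # stands in for emitting the finish event
    return 0


if __name__ == "__main__":
    # Without this guard, `python -m <module>` only defines main() and then
    # exits with status 0, so no finish event is ever produced.
    main()
```

Running a module with `python -m` sets `__name__` to `"__main__"`, so the
guard is the only thing that turns a definition into an invocation.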
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/APPS.md gains a Skills app section covering the CLI verb surface
(Lode A), the owner-wide storage under journal/skills/, and the
daily observer (41) + editor (60) talent pair. API baselines are
refreshed: the legacy "skills" entry is gone; skill_observer +
skill_editor appear at the expected priorities in stats, talents-day,
providers, and generators.
Daily priority ladder after this lode:
30 activities_review
40 facet_newsletter
41 skill_observer ← new
50 morning_briefing, todos/daily (pre-existing collision)
55 entities
56 entities_review
57 entity_observer
60 skill_editor ← new
Part of Lode B of the skills-observer-editor refactor.
Deletes talent/skills.{md,py} and tests/test_skills_hook.py. The
new apps/skills/talent pair (skill_observer + skill_editor) plus
apps/skills/call.py now owns the full skills pipeline end to end.
tests/test_journal_skill.py is unrelated (tests the journal SKILL.md
symlinks) and is preserved.
Part of Lode B of the skills-observer-editor refactor.
Rewrites _collect_skills() to read journal/skills/ directly; drops
the per-facet walk and the legacy skill_generated flag. Returns the
new shape (slug, description, category, confidence, status, facets,
first_seen, last_seen), excludes retired patterns and patterns
without a profile markdown, sorts by confidence desc then last_seen.
Threads status through both the Jinja card block and the
refreshSkills() JS builder, appending " (dormant)" to dormant
cards. test_home_skills.py migrates to owner-wide fixtures with
dormant-visible and retired-hidden coverage.
Part of Lode B of the skills-observer-editor refactor.
Daily owner-wide talent pair that replaces the old per-activity
talent/skills observer/generator:
- skill_observer.md (cogitate, priority 41): reads recent activities
and promotes/seeds/refreshes patterns via sol call skills.
- skill_editor.md + skill_editor.py (generate, priority 60): writes
one profile per run from the pending queue (edit_requests →
needs_profile → needs_refresh → skip).
- Pre-hook builds $skill_context from metadata + full observation
ledger + last-5 activity records + last-3 narratives + per-span
JSONL reads for observations[-3:] (≈4.3–4.8k tokens worst case).
- Post-hook validates frontmatter (name, display_name, description
1–1024, category, numeric confidence 0–1), atomic writes via
think.skills.save_profile, clears pending flags idempotently, and
fires an agency.md nudge only on first-time creates.
- 19 hook tests + 1 prompt smoke test.
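The pending-queue order from the skill_editor bullet above can be
sketched like this. The function name and the dict-of-lists input are
assumptions; only the queue precedence comes from the description.

```python
QUEUE_ORDER = ("edit_requests", "needs_profile", "needs_refresh")


def pick_next(pending):
    """Return (queue_name, slug) for the first non-empty queue, or None to skip the run."""
    for queue in QUEUE_ORDER:
        slugs = pending.get(queue) or []
        if slugs:
            return queue, slugs[0]
    return None
```

Returning a single slug per call matches the one-profile-per-run
behavior; a None result corresponds to the skip case.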
Part of Lode B of the skills-observer-editor refactor.
Lands the storage layout, shared helpers, and `sol call skills` CLI
verbs (list, show, observe, seed, promote, refresh, mark-dormant,
retire, edit-request, rename) as the foundation for the skills
observer/editor refactor. Zero user-visible change — the old
per-activity talent and per-facet data stay dormant on disk; the new
owner-wide journal/skills/{patterns.jsonl,edit_requests.jsonl,*.md}
starts empty. Writes are fcntl-locked, atomic, and idempotent. Lode B
will add the observer/editor talents and cut Pulse over to read the
new storage.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lazy-load a schema frontmatter field in get_talent(): require a
relative path, reject traversal and symlink escapes, parse the JSON,
validate Draft 2020-12 well-formedness, and attach the result as
config["json_schema"]. Thread that through both
_execute_generate() call sites and propagate advisory
schema_validation onto finish events only when present.
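A minimal sketch of the path checks, assuming the schema field is
resolved against the talent's own directory. The function name and error
messages are hypothetical, and the Draft 2020-12 well-formedness check is
left out so the sketch stays stdlib-only.

```python
import json
from pathlib import Path


def load_schema(talent_dir, schema_field):
    """Resolve a relative schema path inside talent_dir and parse it as JSON."""
    rel = Path(schema_field)
    if rel.is_absolute():
        raise ValueError(f"schema path must be relative: {schema_field}")
    base = Path(talent_dir).resolve()
    # resolve() collapses ".." and follows symlinks, so both traversal and
    # symlink escapes show up as a target outside the base directory.
    target = (base / rel).resolve()
    if base not in target.parents:
        raise ValueError(f"schema path escapes talent directory: {schema_field}")
    with open(target, encoding="utf-8") as fh:
        schema = json.load(fh)
    if not isinstance(schema, dict):
        raise ValueError("schema root must be a JSON object")
    return schema
```

Checking the resolved target rather than the raw string is what lets one
test cover both `../` traversal and a symlink that points outside the
directory.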
Only _execute_generate is wired here. _execute_batch_generate does
not exist, and think/batch.py remains a separate, non-talent-aware
path that is explicitly out of scope for this lode.
Schema loading and validation happen lazily in get_talent(), not in
get_talent_configs(). This keeps discovery, status, and settings
paths tolerant of one broken unused talent instead of failing
broadly at metadata load time.
No real talent migrates here; Lode 3 will migrate talent/sense.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Close three schema/normalization gaps in the storyteller post-hook.
Writes record story.talent from context name; topics are stripped,
lowercased, deduped, capped at 10; confidence is clamped to [0,1]
instead of dropping out-of-range rows. NaN, non-numeric, and bool
confidence still reject. Delete the orphaned design doc that this
refactor was handed off from.
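The normalization rules read roughly like the sketch below. The helper
names are hypothetical, and skipping non-string or empty topics is an
assumption that the description above does not state.

```python
import math


def normalize_topics(topics, cap=10):
    """Strip, lowercase, and dedupe topics in first-seen order, then cap the list."""
    seen = []
    for topic in topics:
        if not isinstance(topic, str):
            continue  # assumed: non-string topics are ignored
        cleaned = topic.strip().lower()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    return seen[:cap]


def normalize_confidence(value):
    """Return a float clamped to [0, 1], or None when the value must be rejected."""
    # bool is a subclass of int in Python, so it has to be excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    if math.isnan(value):
        return None
    return min(1.0, max(0.0, float(value)))
```

Clamping keeps a row whose confidence is 1.7 or -0.2, while NaN, strings,
and booleans still cause the row to be rejected.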
Adds an optional `json_schema: dict | None` parameter to generate(),
agenerate(), and generate_with_result() in think/models.py and threads
it through all four provider modules. Each provider translates to its
native structured-output mechanism (Gemini response_json_schema, OpenAI
Responses text.format.json_schema, Anthropic output_config with
forced-tool-use fallback on BadRequestError, Ollama dict format).
After the call, an advisory jsonschema.validate pass logs violations
and surfaces a schema_validation field on GenerateResult from
generate_with_result; it never raises and never mutates the returned
text. When json_schema is None, every provider's SDK kwargs are
byte-for-byte identical to before. Lode 1 of 3 — talents and callers
unchanged.
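A rough sketch of an advisory validation pass with the never-raises,
never-mutates contract described above. The function name, the
injectable validate parameter, and the shape of the returned dict are
assumptions for illustration; the real schema_validation field may look
different.

```python
import json
import logging

logger = logging.getLogger(__name__)


def advisory_validate(text, json_schema, validate=None):
    """Return a schema_validation dict, or None when no schema was given.

    Never raises and never touches `text`.
    """
    if json_schema is None:
        return None
    try:
        if validate is None:
            # jsonschema.validate(instance, schema) is the real library call.
            from jsonschema import validate
        validate(instance=json.loads(text), schema=json_schema)
    except Exception as exc:  # advisory: swallow everything, even bad JSON
        logger.warning("schema validation failed: %s", exc)
        return {"valid": False, "error": str(exc)}
    return {"valid": True, "error": None}
```

The caller keeps returning the original text either way; the dict is
only attached as extra metadata.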
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consolidate the three storytelling talents (conversation/work/event) onto
the activity record instead of a separate facets/*/spans/*.jsonl sink.
Each talent now emits story (body + topics + confidence) plus structured
commitments, closures, and decisions; the new talent/story.py hook merges
these onto the activity record via a dedicated writer in think/activities.py
(merge_story_fields), atomically under one file lock with a single edits[]
entry per merge.
- delete talent/spans.py, think/spans.py, and their tests
- drop the facets/*/spans/*.jsonl formatter entry
- storytellers run at priority 20 so they serialize after participation (10),
relying on existing priority-group dispatch rather than new locking
- format_activities renders story body + topics so FTS coverage moves onto
the activity record formatter
- owner/counterparty fuzzy-resolve to *_entity_id via find_matching_entity;
originals preserved, null on miss
- closures with non-vocab resolution dropped per-entry
- hook returns "" to suppress the generator JSON artifact
Unblocks the Ledger consumer surface without any back-compat shims or
historical backfill.
Retire meetings/decisions/followups/messaging per-span talents plus the
orphaned decisionalizer cogitate. Introduce conversation/work/event
storytelling talents that each emit one structured JSONL row per
(span, talent) into journal/facets/{facet}/spans/{YYYYMMDD}.jsonl via a
shared talent/spans.py post-hook. think/spans.py::format_spans renders
each row as an FTS-indexable markdown chunk.
Dispatch changes in think/thinking.py skip all three talents on
synthetic (cogitate/anticipated) or segmentless activity records, and
skip the work talent for browsing/reading activities below level_avg
0.4. Coding is ungated.
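The dispatch gate above could look roughly like this. The record keys
(source, segments, activity, level_avg) and the function name are
hypothetical; only the skip rules and the 0.4 threshold come from this
message.

```python
STORY_TALENTS = ("conversation", "work", "event")
SYNTHETIC_SOURCES = {"cogitate", "anticipated"}
LOW_SIGNAL_ACTIVITIES = {"browsing", "reading"}
WORK_MIN_LEVEL_AVG = 0.4


def talents_to_run(record):
    """Return the storytelling talents that should run for one record."""
    # Synthetic or segmentless records skip all three talents.
    if record.get("source") in SYNTHETIC_SOURCES or not record.get("segments"):
        return []
    talents = list(STORY_TALENTS)
    # Only browsing/reading is gated on level_avg; coding is never gated.
    if (record.get("activity") in LOW_SIGNAL_ACTIVITIES
            and record.get("level_avg", 0.0) < WORK_MIN_LEVEL_AVG):
        talents.remove("work")
    return talents
```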
Forward-only flip: historical on-disk markdown under
facets/*/activities/{date}/{span_id}/*.md is preserved and still
indexed. Downstream consumers that read the retired filenames
(speakers app, activities_review, morning_briefing) continue to work on
pre-flip-day data and degrade gracefully on new data; a follow-up will
migrate them to read the new spans.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eliminates the startup race where sense/cortex called require_solstone()
before convey's TCP listener was bound, producing false ERROR /
Restarting noise and dropping startup-catchup tasks with exit code 1.
- think/supervisor.py: new wait_for_convey_ready() called between
start_convey_server() and start_sense/cortex. TaskQueue gains ready
flag + _pending buffer + set_ready() drain so startup catchup tasks
submit before convey is up but dispatch after. main() sets
SOL_SUPERVISOR_SPAWNED=1 before any _launch_process so children
inherit it.
- think/utils.py: EXIT_TEMPFAIL hoisted here as the single source of
truth; require_solstone() exits 75 (absorbed quietly by
handle_runner_exits) when SOL_SUPERVISOR_SPAWNED=1, else keeps the
existing exit-1 + friendly-stderr path for external CLIs.
- tests/test_supervisor_startup.py: 10 new deterministic tests.
- AGENTS.md: tiny fix for an unrelated pre-existing CI blocker
surfaced during `make ci`.
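A simplified sketch of the buffering and exit-code behaviour listed
above, not the supervisor's actual code. The dispatch callback and the
unavailable_exit_code helper are hypothetical; the ready flag, _pending
buffer, set_ready() drain, exit code 75, and SOL_SUPERVISOR_SPAWNED
come from this message.

```python
import os
import threading

EXIT_TEMPFAIL = 75  # value stated in the commit message


class TaskQueue:
    """Buffers submitted tasks until set_ready() is called."""

    def __init__(self, dispatch):
        self._dispatch = dispatch  # hypothetical callable that runs a task
        self._lock = threading.Lock()
        self._ready = False
        self._pending = []

    def submit(self, task):
        with self._lock:
            if not self._ready:
                self._pending.append(task)  # held until convey is up
                return
        self._dispatch(task)

    def set_ready(self):
        with self._lock:
            self._ready = True
            drained, self._pending = self._pending, []
        # Dispatch outside the lock, in submission order.
        for task in drained:
            self._dispatch(task)


def unavailable_exit_code(environ=os.environ):
    """Pick the exit code used when solstone is not reachable yet."""
    if environ.get("SOL_SUPERVISOR_SPAWNED") == "1":
        return EXIT_TEMPFAIL  # supervisor child: retried quietly
    return 1  # external CLI: existing exit-1 path
```

Usage would be: submit catchup tasks at startup, wait for the listener,
then call set_ready() once to flush them.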
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the "Requires solstone to be installed with the sol CLI on PATH"
infrastructure detail out of the frontmatter description — the
Prerequisites section in the body already covers it. Keep the description
purely task-framed ("Query and search your solstone journal from any
project..."). Add sol call command literals to TRIGGER so sessions that
already know the commands still activate the skill. Add Invoke via Bash
line to the body summary.
This skill is installed user-wide via `npx skills add -g` by
`make install-service` (the dual-install pattern documented in the
synthesis plan) — intentional, not orphaned.
Part of skills audit req_loq3e2lk pass 1+2.
Clarify that tos.txt lives at <journal_root>/apps/support/portal/tos.txt
(resolved via journal app storage) to remove any ambiguity about source
tree vs. runtime location. Reformat TRIGGER into playbook style with
task-intent dash prefix + literal command list. Add Invoke via Bash line
and a Gotchas section (KB-first on create, --product default, diagnostic
leakage, attachment ordering).
Part of skills audit req_loq3e2lk pass 1+2.
The talent/vit/ skill is intentionally narrow — it's the publish step
that runs at the end of a VPE playbook when a feature lands through
hopper. Discovery and consumption commands (skim, follow, learn, remix,
vet) are covered by the user-wide `using-vit` skill; duplicating them
here would create activation-ambiguity between two skills.
Make the scope explicit in description, body, and a dedicated Scope
note. Add a structured TRIGGER (was previously inline "Activates when..."
prose) with literal ship-workflow phrases. Add Invoke via Bash line.
Replace the "other commands" section with a pre-ship-diagnostics
section + pointer to using-vit.
Part of skills audit req_loq3e2lk pass 1+2. The audit suggested promoting
skim/follow/learn to peer sections; rejected here because that would
duplicate using-vit and split activation unpredictably. Scope note in
the synthesis plan to address this.
Post-revamp exemplar still missed a structured TRIGGER line and the
"Invoke via Bash" clause. Adds both. Also replaces the stale
"Occurrence" vocabulary row (legacy term) with "Activity" to match
the reform shipped in references/facets.md and references/storage.md.
Part of skills audit req_loq3e2lk pass 1+2.
apps/health/call.py exposes `sol call health pipeline` (the only
command actually defined by the health app itself) but the skill
never documented it. Add a pipeline-summary section with --day /
--yesterday flags and examples.
Reframe description from "diagnose" infrastructure framing to task
framing ("monitor service uptime, troubleshoot capture gaps"). Make
the tri-scope surface explicit (sol health*, sol talent*,
sol call health*) since health diagnostics cross levels. Add a
Gotchas block (callosum 10s timeout, ms-precision talent log IDs,
today-vs-yesterday pipeline window). Expand TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Two user-facing commands existed in think/tools/routines.py (lines 395-442)
but were absent from the skill: suggest-respond (record accept/decline on
a routine suggestion) and suggest-state (print raw suggestion state).
Add both to the command table and a short "Responding to suggestions"
section. Also add a Gotchas block (IANA timezone requirement, suggestion
idempotency). Reframe description from infrastructure framing
("create, manage, and inspect") to task framing ("set up recurring
routines — daily briefings, weekly reviews..."). Expand TRIGGER with
literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Commands renamed in the write-verb polarity inversion (commit 1c59dad6):
owner detect → detect
owner confirm → confirm-owner (backfill default)
owner reject → reject-owner (14-day cooldown)
Skill still taught the old names; updated examples, headings, and the
conversational curation guidance to match.
Add sections for nine previously-undocumented commands: bootstrap,
resolve-names, attribute-segment, backfill, discover, link-import,
seed-from-imports, owner-ready. Add a Gotchas block led by the
preview-by-default writer polarity — the most common silent-failure
in this CLI.
Plus Invoke via Bash line and expanded TRIGGER with literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Nudge scheduling (call.py:120-149, 441, 490) was live but the skill only
documented basic add/done/cancel/move. Add the --nudge/-n flag to the
add section with the four accepted formats, then add list-nudges-due
and dispatch-nudges as full sections. Also add a Gotchas block for the
fuzzy-duplicate silent-rejection behavior and the read/mutate split
between list-nudges-due and dispatch-nudges.
Plus Invoke via Bash and expanded TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Three commands shipped to production without matching skill updates:
- move (3/18, commit 064c50e2) — cross-facet entity moves
- consolidate (4/17, commit 9730cd6d) — roll segment detections into journal entities
- merge (4/17, commit c9c1c8a9) — dedup two journal entities, preview-by-default
Add sections for each plus a Gotchas block calling out merge's --no-commit
default, consolidate's 85% fuzzy threshold, and the detect-vs-observe split.
Add Invoke via Bash line and expand TRIGGER with literal command names.
Part of skills audit req_loq3e2lk pass 1+2.
Participation shipped 4/18 (commit 101da52e) but skill was never updated.
Adds the schema (name, role ∈ {attendee, mentioned}, source ∈ {voice,
speaker_label, transcript, screen, other}, confidence, context) and a
worked example. Also adds the --source filter flag on list (anticipated,
user, cogitate).
Plus Invoke via Bash line and literal command names in TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Code renamed in March (f8bd2868, bf1cab94) but skill still taught the
deprecated --audio / --screen flags as primary. Update source-flag table,
source-type meanings, scan description, and example to match current code.
Keep --audio / --screen mentioned as hidden aliases. Also add "Invoke via
Bash" line and literal command names to TRIGGER.
Part of skills audit req_loq3e2lk pass 1+2.
Full revamp of the repo-root AGENTS.md per vpe req_7szh7vce. Promotes
load-bearing invariants from docs/coding-standards.md inline so they
can't be skipped: L1–L9 layer hygiene (with the domain-ownership table
at the top), no-compat-shims, trust get_journal(), SPDX header, fail
loudly. Adds a start-here ordered reading list, a top-level repo map,
an annotated repo mental model, make-command coverage grouped by use
(37 targets with when-to-use notes), a testing quickstart, and an
annotated depth-doc map.
Relocates the misaudienced "Talent CLI Boundaries" runtime restrictions
out of AGENTS.md (coder audience) into talent/journal/references/cli.md
(cogitate audience, loaded via the journal skill).
Replaces the full Layer Hygiene section in docs/coding-standards.md
with a one-line redirect to AGENTS.md §7 — single source of truth for
the invariants, no drift.
Plan: vpe/workspace/solstone-agents-md-revamp-plan.md (extro repo).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tighten contamination guard in talent/sense.md: role: attendee now explicitly requires meeting_detected: true for the same segment; otherwise every Person is role: mentioned.
Add a Python clamp in talent/participation.post_process: when no contributing sense segment has meeting_detected: true, rewrite every role: attendee entry to role: mentioned. Missing/unreadable sense.json collapses to False. One warning per affected record.
Cover with tests/test_sense_contamination_guard.py (prompt-presence + runtime clamp + idempotence + missing-file fallback).
Per-owner identity files (self.md, agency.md, partner.md, awareness.md,
pulse.md, pulse_output.md, identity_pulse.md, history.jsonl, news/*) were
being written into a repo-tracked sol/ directory and committed/pushed to
the public AGPL repo by talent/heartbeat.md. This is an active privacy
leak that violates sol pbc Article IV.
Five coupled moves in one pass:
1. git rm -r sol/ at repo root; /sol/ added to .gitignore.
2. {journal}/sol/ → {journal}/identity/ everywhere (code, tests, fixtures,
docs); ensure_sol_directory → ensure_identity_directory.
3. New write_identity() helper in think/identity.py — single write path
with per-directory fcntl.LOCK_EX, atomic tmpfile + os.replace, 0o600
perms, hash-based history.jsonl audit log (no diffs).
4. talent/heartbeat.md: deleted "Path notes" block; Step 6 is now a no-op
close. Other talents (awareness_tender, pulse, naming) updated to use
identity/ paths and stripped of any commit/push instructions.
5. think/prompts.py: removed SOL_DIR and the repo-root read branch;
template vars are now $identity_* loaded only from {journal}/identity/.
Acceptance greps all return 0. New tests cover write_identity()
atomicity, lock serialization, history schema, 0o600 mode, and missing-
file first writes. CLI surface (sol call identity ...) unchanged.
No data migration code (Jer is sole user, resets identity on deploy).
No backward-compat shims. Clean break.
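A minimal sketch of a single locked, atomic write path like the one in
item 3, assuming a POSIX system. The signature, the .lock file name, and
the history entry fields are hypothetical; the real think/identity.py
may differ.

```python
import fcntl
import hashlib
import json
import os
import tempfile
import time
from pathlib import Path


def write_identity(identity_dir: Path, name: str, content: str) -> None:
    """Write one identity file atomically and append a hash-only audit row."""
    identity_dir.mkdir(parents=True, exist_ok=True)
    lock_path = identity_dir / ".lock"  # hypothetical lock file name
    with open(lock_path, "a") as lock_file:
        # Exclusive per-directory lock; released when lock_file closes.
        fcntl.flock(lock_file, fcntl.LOCK_EX)

        data = content.encode("utf-8")
        # Temp file in the same directory so os.replace stays atomic.
        fd, tmp = tempfile.mkstemp(dir=identity_dir, prefix=f".{name}.")
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
                handle.flush()
                os.fsync(handle.fileno())
            os.chmod(tmp, 0o600)
            os.replace(tmp, identity_dir / name)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise

        # Audit log records a content hash only, never a diff.
        entry = {
            "ts": time.time(),
            "file": name,
            "sha256": hashlib.sha256(data).hexdigest(),
        }
        history = identity_dir / "history.jsonl"
        hfd = os.open(history, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        with os.fdopen(hfd, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(entry) + "\n")
```

Readers never see a half-written file because the target name only ever
points at a fully written, fsynced temp file.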
Sprint 4 cleanup 1 (req_myu5gwm7 Lode 4xvyyikt). Previously
merged via hopper but blocked by known refine-stage stuck-lode
false-positive; applied manually from the lode worktree.
Path A — talent/sense.md: contamination guard now explicitly ties
role: attendee to meeting_detected: true for the same segment.
When meeting_detected: false, every Person must be role: mentioned
even if they spoke, were quoted, or were referenced in the transcript
(closes podcast / press conference / lecture leak).
Path B — talent/participation.py: post-processor clamps attendee
to mentioned when no contributing sense segment had
meeting_detected: true. Defensive net catching any Path A prompt
leakage. Logs at warning level with count + activity id for
observability.
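The Path B clamp can be sketched as below. The function name and
argument shapes are hypothetical (a list of participation dicts and a
list of per-segment sense dicts, with None standing in for a missing or
unreadable sense.json); the rewrite rule and warning come from this
message.

```python
import logging

logger = logging.getLogger(__name__)


def clamp_attendees(participation, sense_segments, activity_id=None):
    """Rewrite attendee -> mentioned unless some segment detected a meeting."""
    # A missing/unreadable segment (None) collapses to False.
    meeting = any(
        (seg or {}).get("meeting_detected") is True for seg in sense_segments
    )
    if meeting:
        return participation

    clamped = 0
    for entry in participation:
        if entry.get("role") == "attendee":
            entry["role"] = "mentioned"
            clamped += 1
    if clamped:
        # One warning per affected record, with count and activity id.
        logger.warning(
            "clamped %d attendee(s) to mentioned for activity %s",
            clamped, activity_id,
        )
    return participation
```

Running it a second time changes nothing, since no attendee entries
remain after the first pass.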
Tests: 3 new cases in test_sense_contamination_guard.py covering
clamp positive (podcast with speakers → all mentioned), negative
pass-through (real meeting unchanged), idempotence + logging.
make test: 3254 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-Sprint-4 trivials bundled:
- remove dead occurrence/has_extraction detection in settings routes
plus the now-unreachable Extract toggle UI branch in workspace.html,
and drop the corresponding extract/has_extraction keys from the
generators.json baseline; refresh adjacent docstring/copy that
referenced the removed field
- normalize the schedule generator description to the post-Sprint-3
"anticipated activity records" phrasing in talent/schedule.md and
the three API baselines that mirror it
- fix stale "calendar" in the talent/journal/references/config.md
apps.order example (now "activities")
- replace retired "occurrence" hook value with live "schedule" in
the _format_tags test data
- document the sense.json/sense.md dual-write coupling to
think/cluster.py's per-segment talents/**/*.md glob
The activity-anticipation dispatcher only loaded the current local day's
records, so a pre-alert for an early-morning D+1 activity was silently
missed when local_now was late on day D (and symmetric on the other side).
Scan yesterday/today/tomorrow gated by _ACTIVITY_ANTICIPATION_CROSSDAY_WINDOW_MINUTES
(default 120), and build start_dt from each record's own day.
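One way the cross-day scan could work, as a sketch under assumptions:
adjacent days are scanned only when local_now is within the window of
midnight, start times are "HH:MM" strings, and datetimes are naive
local times. The helper names are hypothetical; only the 120-minute
default and the per-record-day start_dt rule come from this message.

```python
from datetime import datetime, timedelta

CROSSDAY_WINDOW_MINUTES = 120  # mirrors the stated default


def days_to_scan(local_now, window_minutes=CROSSDAY_WINDOW_MINUTES):
    """Return the local dates whose records should be loaded."""
    window = timedelta(minutes=window_minutes)
    midnight = local_now.replace(hour=0, minute=0, second=0, microsecond=0)
    next_midnight = midnight + timedelta(days=1)

    days = [local_now.date()]
    if local_now - midnight < window:
        days.insert(0, (midnight - timedelta(days=1)).date())  # yesterday
    if next_midnight - local_now <= window:
        days.append(next_midnight.date())  # tomorrow
    return days


def start_dt_for(day, start_hhmm):
    """Build start_dt from the record's own day, not from today's date."""
    hour, minute = map(int, start_hhmm.split(":"))
    return datetime(day.year, day.month, day.day, hour, minute)
```

With this shape, a 00:15 activity on day D+1 is 45 minutes away at
23:30 on day D instead of appearing to be in the past.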
Co-Authored-By: Codex <noreply@openai.com>