 {
- "type": "cogitate",
- "tier": 3,
+ "type": "generate",
+ "tier": 2,
 
  "title": "Entity Observer",
  "description": "Extracts durable factoids about attached entities from journal content",
···
  "schedule": "daily",
  "priority": 57,
  "multi_facet": true,
- "group": "Entities"
+ "group": "Entities",
+ "output": "json",
+ "thinking_budget": 2048,
+ "hook": {"pre": "entities:entity_observer", "post": "entities:entity_observer"},
+ "load": {"transcripts": false, "percepts": false, "agents": false}
 }
-
-$sol_identity
-
-$facets
 
 ## Core Mission
 
-Extract durable factoids about attached entities from recent journal content within this specific facet. Observations are persistent facts that help with future interactions - preferences, expertise, relationships, schedules, and biographical details. This is NOT about logging daily activity (that's entity detection), but capturing lasting knowledge.
+Extract durable factoids about attached entities from recent journal content. Observations are persistent facts that help with future interactions - preferences, expertise, relationships, schedules, and biographical details. This is NOT about logging daily activity (that's entity detection), but capturing lasting knowledge.
 
-## Input Context
+## Pre-computed Context
 
-You receive:
-1. **Facet context** - the specific facet (e.g., "personal", "work") you are observing entities for
-2. **Current date/time** - to focus on recent journal content
-3. **Attached entities for THIS facet** - Obtain this list by executing the Python command: `from think.entities.loading import load_entities; entities = load_entities(SOL_FACET)`. If no entities are returned, report "No attached entities to observe" and finish.
+Below you'll find the pre-computed context for this observation run, including:
+- Active entities (those that appeared in today's content)
+- Recent observations for each entity (last 3)
+- Relevant knowledge graph content
 
-## Tooling
-
-SOL_DAY and SOL_FACET are set in your environment. When performing actions, use the following Python calls:
-
-- **List Entities:** Execute: `from think.entities.loading import load_entities; entities = load_entities(SOL_FACET)`
-  - The result will be a list of entities.
-- **Read Current Observations:** Execute: `from think.entities.observations import load_observations; observations = load_observations(SOL_FACET, entity_id)`
-  - **MUST execute this before adding observations.** Note the `count` for guard awareness.
-  - The `entity_id` can be an entity ID, full name, or alias.
-- **Add New Observation:** Execute: `from think.entities.observations import add_observation; add_observation(SOL_FACET, entity_id, content, SOL_DAY)`
-  - This adds an observation with guard (observation number auto-calculated).
-  - Use entity_id from the `load_observations` response for consistency.
-
-Discovery tools:
-- `sol call journal read AGENT` - read full agent output (e.g., knowledge_graph, followups)
-- `sol call journal search QUERY -d DAY -a AGENT -f FACET -n LIMIT` - unified search across journal content
-- `sol call journal events [-f FACET]` - get structured events
+$observer_context
 
 ## What Makes a Good Observation
 
···
 - **Projects**: Architecture decisions, design principles, known constraints, key technical learnings. NOT commit logs or deployment activity.
 - **Tools**: Capabilities, limitations, best-practice configurations. NOT "was used for X on Y" — that's a usage log, not a fact about the tool.
 
-## Observation Process
+## Output Format
 
-### Phase 1: Load Context
+Respond with a JSON object in this exact format:
 
-1. Use the provided current date and analysis day in YYYYMMDD format
-2. Execute Python: `from think.entities.loading import load_entities; entities = load_entities(SOL_FACET)`
-3. If no attached entities, report "No attached entities to observe" and finish
-
-### Phase 2: Identify Active Entities
-
-Before deep-mining every entity, scan the day's content to find which entities actually appeared:
-
-1. Check knowledge graph: `sol call journal read knowledge_graph`
-2. Check events: `sol call journal events -f FACET`
-3. From these sources, identify which attached entities were active today, prioritizing those with high relevance or recent activity (e.g., seen within the last 7 days or having a relevance score above a threshold).
-4. Focus your deep mining (Phase 3) on entities that appeared in today's content
-5. For entities NOT mentioned today, skip — no content means no new observations
-
-This is especially important for large facets (50+ entities). Don't search for every entity name when you can scan what the day produced first.
-
-### Phase 3: Mine and Observe Active Entities
-
-For each entity that appeared in today's content:
-
-1. **Read current observations** (REQUIRED - guard mechanism):
-   Execute Python: `from think.entities.observations import load_observations; observations = load_observations(SOL_FACET, entity_id)`
-   Note the `count` for guard awareness.
-
-2. **Mine recent content** for factoids about this entity:
-   - Search transcripts: `sol call journal search "{name}" -a audio -n 5`
-   - Search insights: `sol call journal search "{name}" -n 5`
+```json
+{
+  "observations": {
+    "entity_slug": [
+      {"content": "The durable observation text", "reasoning": "Why this qualifies (1 sentence)"}
+    ]
+  },
+  "skipped": ["entity_ids_examined_but_no_new_observations"],
+  "summary": "Observed X entities, Y new observations total."
+}
+```
 
-3. **Extract and filter observations**:
-   - Apply the litmus test (both questions must be yes)
-   - Apply the entity-type strategy (people = who they are, projects = design decisions, etc.)
-   - Check for semantic duplicates against existing observations (see Deduplication below)
-   - One fact per observation — no compound sentences
-
-4. **Add new observations** (one at a time; guard handled by CLI):
-   Execute Python: `from think.entities.observations import add_observation; add_observation(SOL_FACET, entity_id, content, SOL_DAY)`
-
-### Phase 4: Report Summary
-
-Summarize what was observed:
-- "Observed 3 entities for [facet]: Alice (2 new observations), Bob (1 new observation), Acme Corp (0 - nothing new)"
-
-Remember: Your goal is to build a curated knowledge base of the most important facts about entities — not a comprehensive activity log. Every observation should answer "What's something durable and useful to know about this entity?" not "What happened with them today?" When the knowledge base is already rich, restraint is the right call.
+Rules:
+- Use the entity_id (slug) from the context as the key
+- One fact per observation — no compound sentences
+- Check for semantic duplicates against the existing observations shown in context
+- If existing observations are already rich, zero new observations is valid and correct
+- The `reasoning` field is for audit only
+- Include ALL examined entities in either `observations` or `skipped`
+- Empty observations dict is valid when nothing new is found
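
Review note: the rules in the new Output Format section could be checked mechanically before anything is persisted. The following is a minimal sketch, not part of this change. The function name `check_observer_response` and the `examined` argument are hypothetical; only the `observations`, `skipped`, and `content` keys come from the prompt above.

```python
import json


def check_observer_response(raw: str, examined: set[str]) -> list[str]:
    """Return a list of rule violations for a model response (empty if none).

    `examined` is a hypothetical set of entity_ids that were shown to the
    model; the real hook may track this differently.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level is not an object"]

    problems: list[str] = []
    observations = data.get("observations", {})
    skipped = data.get("skipped", [])
    if not isinstance(observations, dict):
        problems.append("observations is not an object")
        observations = {}
    if not isinstance(skipped, list):
        problems.append("skipped is not a list")
        skipped = []

    # Each entry must be a list of objects carrying non-empty content.
    for entity_id, items in observations.items():
        if not isinstance(items, list):
            problems.append(f"{entity_id}: value is not a list")
            continue
        for item in items:
            if not isinstance(item, dict) or not str(item.get("content", "")).strip():
                problems.append(f"{entity_id}: item missing content")

    # Rule: every examined entity appears in observations or skipped.
    covered = set(observations) | {s for s in skipped if isinstance(s, str)}
    for missing in sorted(examined - covered):
        problems.append(f"{missing}: neither observed nor skipped")
    return problems
```

An empty `observations` object passes this check as long as every examined entity is listed in `skipped`, which matches the rule that zero new observations is a valid outcome.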
+80
apps/entities/talent/entity_observer.py
+# SPDX-License-Identifier: AGPL-3.0-only
+# Copyright (c) 2026 sol pbc
+
+"""Entity observer talent hook — pre-computes context and persists observations.
+
+pre_process: Assembles entity context (attached entities, recent observations,
+    KG excerpts) and injects it as $observer_context template variable.
+
+post_process: Parses JSON output, validates entity_ids against attached entities,
+    and persists valid observations via add_observation().
+"""
+
+from __future__ import annotations
+
+import json
+import logging
+
+from think.entities.context import assemble_observer_context
+from think.entities.loading import load_entities
+from think.entities.observations import add_observation, load_observations
+
+logger = logging.getLogger(__name__)
+
+
+def pre_process(context: dict) -> dict | None:
+    facet = context.get("facet")
+    day = context.get("day")
+    if not facet or not day:
+        return None
+
+    observer_context = assemble_observer_context(facet, day)
+    return {"template_vars": {"observer_context": observer_context}}
+
+
+def post_process(result: str, context: dict) -> str | None:
+    facet = context.get("facet")
+    day = context.get("day")
+    if not facet or not day:
+        return None
+
+    try:
+        data = json.loads(result)
+    except json.JSONDecodeError:
+        logger.warning("entity_observer: could not parse result as JSON")
+        return None
+
+    if not isinstance(data, dict):
+        return None
+
+    observations = data.get("observations")
+    if not isinstance(observations, dict) or not observations:
+        return None
+
+    valid_entity_ids = {entity.get("id") for entity in load_entities(facet) if entity.get("id")}
+
+    for entity_id, items in observations.items():
+        if entity_id not in valid_entity_ids:
+            logger.debug("Skipping unrecognized entity_id: %s", entity_id)
+            continue
+        if not isinstance(items, list):
+            continue
+
+        existing = {
+            obs.get("content", "").strip().lower()
+            for obs in load_observations(facet, entity_id)
+        }
+
+        for item in items:
+            if not isinstance(item, dict):
+                continue
+            content = str(item.get("content", "")).strip()
+            if not content:
+                continue
+            if content.lower() in existing:
+                logger.debug("Skipping duplicate observation for %s: %s", entity_id, content[:60])
+                continue
+            add_observation(facet, entity_id, content, day)
+            existing.add(content.lower())
+
+    return None
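
Review note: the duplicate check in `post_process` compares stripped, lowercased strings, so it only catches exact repeats after normalization. Paraphrases still depend on the prompt's "semantic duplicates" rule. The sketch below isolates that filtering logic so the behavior is easy to see. `filter_new` is a hypothetical helper that works on plain lists; it does not call `load_observations` or `add_observation`.

```python
def filter_new(existing_contents: list[str], candidates: list[str]) -> list[str]:
    """Keep candidates that are non-empty and not already present.

    Mirrors the comparison used in post_process: strip whitespace, then
    compare case-insensitively. Hypothetical helper for illustration only.
    """
    seen = {text.strip().lower() for text in existing_contents}
    kept: list[str] = []
    for candidate in candidates:
        content = str(candidate).strip()
        if not content:
            continue  # empty content is dropped, as in the hook
        key = content.lower()
        if key in seen:
            continue  # exact duplicate after normalization
        kept.append(content)
        seen.add(key)  # also guards against repeats within one response
    return kept


existing = ["Prefers morning meetings"]
candidates = [
    "  prefers morning meetings ",   # same text, different case and spacing
    "Likes to meet before noon",     # paraphrase: not caught by this check
    "",
    "Likes to meet before noon",     # repeat within the same response
]
print(filter_new(existing, candidates))
```

Running this prints a single-item list containing the paraphrase, which shows why the prompt still asks the model to compare candidates against the existing observations in context.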