remove curiosity queue + refresh docs

the queue and goals were the same shape (durable intent on PDS) at
different granularities, with the queue's "no approval needed"
property being the only thing distinguishing them — and the only thing
keeping new entries from any review. with goals + the like-as-approval
mechanic in place, the queue's role collapses cleanly into the goals
abstraction.
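to illustrate the "same shape" point — these dicts are a hypothetical sketch, not the actual `io.zzstoatzz.phi.*` lexicon schemas (only the collection names are real; every field is invented for illustration):

```python
# hypothetical record shapes — NOT the real lexicons, just showing
# "durable intent on PDS at different granularities"
queue_entry = {
    "$type": "io.zzstoatzz.phi.curiosityQueue",  # real collection, invented fields
    "intent": "look into what @alice has been posting about crows",
    "createdAt": "2026-04-13T15:00:00Z",
    "needsApproval": False,  # the one property that distinguished the queue
}
goal = {
    "$type": "io.zzstoatzz.phi.goal",  # real collection, invented fields
    "intent": "make 3 friends",
    "createdAt": "2026-04-13T15:00:00Z",
    "needsApproval": True,  # mutations pass through the like-as-approval gate
}

# identical field set — only the approval flag differs, which is why the
# queue collapses into the goals abstraction once like-as-approval exists
assert set(queue_entry) == set(goal)
assert queue_entry["needsApproval"] != goal["needsApproval"]
```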

removed:
- bot/exploration.py (file)
- bot/core/curiosity_queue.py (file)
- process_exploration + exploration agent in agent.py
- _can_explore / _maybe_explore + state in notification_poller
- explore handler method in message_handler
- _queue_depth + the queue line in [SELF STATE]
- store_exploration_note + clear_mute_marker (no callers after exploration removal)
- /api/control/explore + /api/control/unmute endpoints
- max_idle_explorations_per_hour + exploration_cooldown_polls config

kept:
- existing exploration_note read path in build_user_context — legacy
data in turbopuffer still surfaces for handles phi already explored.
no new ones get written.
- is_stranger / get_knowledge_count — used by the stranger-lookup
pre-fetch (separate code path from exploration).
- the actual io.zzstoatzz.phi.curiosityQueue records on PDS — harmless
leftover, can be deleted manually whenever.

docs swept in the same change since they overlapped:
- README slimmed to conceptual overview + diagram + links out
- AGENTS.md / CLAUDE.md consolidated (CLAUDE.md is a one-line pointer now)
- new docs/system-prompt.md — block-by-block reference for what's in
phi's context per run
- architecture / memory / personalities READMEs refreshed for the
current shape (goals as intent state, episodic synthesis, like-gate)

also: filter the upstream atproto SDK pydantic warning in pyproject so
test output is clean. it fires once on import and isn't actionable on
our side.
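the filter lives in the pytest config; roughly this shape (the exact entry is an assumption — the real one matches whatever category/module the SDK's warning carries):

```toml
# pyproject.toml — sketch; the message/category/module fields here are
# assumptions, match them to the actual warning the atproto import emits
[tool.pytest.ini_options]
filterwarnings = [
    # silence the one-time pydantic deprecation warning fired on atproto import
    "ignore::DeprecationWarning:atproto",
]
```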

net diff: +221 / -834.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

diffstat: +262 -833
AGENTS.md (+46 -12)
````diff
@@
-phi — a bluesky bot with episodic memory. python + pydantic-ai + fastapi + turbopuffer.
+phi — a bluesky bot. python + pydantic-ai + atproto + turbopuffer + cosmik/semble. fastapi for the small web surface (status pages, memory graph).
 
 ## development
 
-- `just run` / `just dev` (hot-reload) / `just deploy` (fly.io)
-- `just evals` — behavioral tests (llm-as-judge)
+- `just run` / `just dev` (hot-reload) / `just deploy` (manual fly.io) / `just release X` (tag vX, CI deploys)
 - `just check` — lint + typecheck + test
-- `just loq-relax <file>` — when a file exceeds its line limit, relax it. never manually edit loq.toml or compress code to fit
+- `just evals` — behavioral tests (llm-as-judge)
+- `just loq-relax <file>` — relax line limit for a file. never edit loq.toml manually or compress code to fit
 - work from repo root
 
 ## python style
@@
 - prefer functional over OOP
 - imports at the top — no deferred imports unless circular
 - never use `pytest.mark.asyncio`
+- tool params: `Annotated[T, Field(description=...)]` so the LLM sees what each param does
 
-## deployment
+## project structure
 
-fly.io app `zzstoatzz-phi`. deploys triggered by `v*` tags via tangled CI. `just release <version>` tags and pushes. `just deploy` for manual.
+```
+src/bot/
+├── agent.py    # pydantic-ai agent + dynamic system prompts
+├── config.py   # settings (env vars)
+├── main.py     # fastapi app, status pages, memory graph
+├── status.py   # runtime metrics
+├── core/       # atproto client, profile, mentionable, goals, self_state
+├── memory/     # turbopuffer namespaces + extraction/reconciliation/review pipelines
+├── services/   # notification polling + message handler
+├── tools/      # native pydantic-ai tools (posting, search, memory, goals, blog, etc.)
+└── utils/      # thread context, text formatting
+
+personalities/  # personality definitions (public; phi.md is the live one)
+evals/          # behavioral tests
+scripts/        # proven utility scripts
+sandbox/        # experiments (graduate to scripts/ once proven)
+.eggs/          # cloned reference projects
+```
 
 ## key architecture
 
-- all notification types run through the full agent loop — phi decides what's worth responding to
-- actions (reply, like, post) happen via tool calls inside the agent run, not structured output
-- personality is separate from operational instructions
-- memory: turbopuffer namespaces (`phi-users-{handle}`, `phi-episodic`)
-- exploration is event-driven: curiosity queue on PDS, drained when idle
-- MCP servers: pdsx (atproto record CRUD), pub-search (publication search)
+- one agent loop, many entry points (notifications batch, scheduled musing, daily reflection, relay check). all end in `agent.run()` with different `PhiDeps`.
+- actions happen as tool calls inside the run, not via structured output. the agent's return value is a brief summary string for logging.
+- personality is separate from operational rules. tool docstrings carry per-tool guidance, not the system prompt.
+- memory: turbopuffer namespaces (`phi-users-{handle}`, `phi-episodic`). intent state on PDS under `io.zzstoatzz.phi.*` (goals, mention consent, legacy queue).
+- owner-gated mutations (`follow_user`, `propose_goal_change`, `manage_mentionable`, `create_feed`) flow through a like-as-approval mechanism: phi posts an authorization request, owner likes it, next batch lets the action through.
+- MCP servers: pdsx (atproto record CRUD), pub-search (publication search). connected via `MCPServerStreamableHTTP`, fresh per `agent.run()`.
+- web grounding via tavily for recency claims (`web_search`).
+
+## documentation
+
+deeper reference in `docs/`:
+- `architecture.md` — entry points, scheduling, why this shape
+- `memory.md` — the four kinds of state and how they compose
+- `system-prompt.md` — block-by-block reference for what's in phi's context
+- `mcp.md` — MCP integration
+- `testing.md` — testing philosophy
+
+## deployment
+
+fly.io app `zzstoatzz-phi`. CI deploys on `v*` tag push (tangled `.tangled/workflows/deploy.yml`). `just release X.Y.Z` tags + pushes; `just deploy` runs fly deploy directly when CI is backed up. push to both `origin` (tangled) and `github` mirror.
+
+secrets via `fly secrets set` or `fly secrets import` (pipe `grep ^KEY .env` into it to keep values off the terminal).
````
CLAUDE.md (+1 -49)
````diff
@@
-phi — a bluesky bot with episodic memory. python + pydantic-ai + fastapi + turbopuffer.
-
-## development
-
-- `just run` / `just dev` (hot-reload) / `just deploy` (fly.io)
-- `just evals` — behavioral tests (llm-as-judge)
-- `just check` — lint + typecheck + test
-- `just loq-relax <file>` — when a file exceeds its line limit, relax it. never manually edit loq.toml or compress code to fit
-- work from repo root
-
-## python style
-
-- 3.10+ typing (`T | None`, `list[T]`)
-- prefer functional over OOP
-- imports at the top — no deferred imports unless circular
-- never use `pytest.mark.asyncio`
-
-## project structure
-
-```
-src/bot/
-├── agent.py    # pydantic-ai agent, tools, personality
-├── types.py    # cosmik record models (cards, connections)
-├── config.py   # settings (env vars)
-├── main.py     # fastapi app, status pages, memory graph
-├── status.py   # runtime metrics
-├── core/       # atproto client, profile management
-├── memory/     # turbopuffer memory + observation extraction
-├── services/   # notification polling, message handling
-└── utils/      # thread context, text formatting
-
-personalities/  # personality definitions (public)
-evals/          # behavioral tests
-scripts/        # proven utility scripts
-sandbox/        # experiments (graduate to scripts/ once proven)
-.eggs/          # cloned reference projects
-```
-
-## deployment
-
-fly.io app `zzstoatzz-phi`. deploys are triggered by `v*` tags, not pushes to main. to deploy: `just release <version>` (e.g. `just release 0.2.0`) or `just deploy` for manual fly.io deploy without tagging.
-
-## key architecture
-
-- all notification types (mentions, replies, quotes, likes, reposts, follows) run through the full agent loop — phi decides what's worth responding to
-- personality is separate from operational instructions (agent.py `OPERATIONAL_INSTRUCTIONS`)
-- memory: turbopuffer namespaces (`phi-users-{handle}`, `phi-episodic`)
-- relationship summaries are compacted by a separate pipeline in my-prefect-server
-- MCP servers: pdsx (atproto record CRUD), pub-search (publication search)
+see [AGENTS.md](AGENTS.md).
````
README.md (+31 -57)
````diff
@@
 # phi
 
-a bluesky bot — a librarian who stepped outside. built with [pydantic-ai](https://ai.pydantic.dev/), [mcp](https://modelcontextprotocol.io/), and the [at protocol](https://atproto.com). personality is [public](personalities/phi.md).
+a bluesky bot. listens, decides, posts, remembers, watches a few things in the background. personality is [public](personalities/phi.md).
+
+```
+notifications ── ┐
+schedule ─────── ├─→ phi (pydantic-ai) ─→ tools ─→ atproto / web / pds / memory
+self-state ───── ┘
+```
+
+phi reads its own state (recent posts, goals, what it's pending, what's relevant from memory), looks at what's in front of it, and decides whether to act. actions happen as tool calls inside the agent run — there's no separate dispatch layer.
+
+## stack
+
+- [pydantic-ai](https://ai.pydantic.dev/) for the agent loop and tool surface
+- [atproto](https://atproto.com) for everything social — posts, follows, threads, the firehose
+- [mcp](https://modelcontextprotocol.io/) for external capabilities (atproto record CRUD, publication search)
+- [turbopuffer](https://turbopuffer.com/) for private vector memory
+- [cosmik](https://cosmik.network) / [semble](https://semble.so) for public knowledge that anyone can discover
+- [tavily](https://tavily.com) for grounding in current web sources
+- [fly.io](https://fly.io) for hosting
 
 ## quick start
@@
 just run
 ```
 
-**required:** `BLUESKY_HANDLE`, `BLUESKY_PASSWORD`, `ANTHROPIC_API_KEY`
-
-**optional:**
-- `TURBOPUFFER_API_KEY` + `OPENAI_API_KEY` — episodic memory
-- `AGENT_MODEL` — pydantic-ai model string for the main agent (default: `anthropic:claude-sonnet-4-6`)
-- `EXTRACTION_MODEL` — model for observation extraction (default: `claude-haiku-4-5-20251001`)
-- `DAILY_REFLECTION_HOUR` — UTC hour for daily reflection post (default: `14`)
-- `THOUGHT_POST_HOURS` — UTC hours for original thought posts (default: every 2h, 8am-10pm CT)
-- `CONTROL_TOKEN` — bearer token for `/api/control` endpoints
-- `OWNER_HANDLE` — handle of the bot's owner for permission-gated tools (default: `zzstoatzz.io`)
-
-## what phi does
-
-phi listens for all notification types on bluesky — mentions, replies, quotes, likes, reposts, follows — and decides how to respond. it can search live posts, check trending topics, query the [cosmik](https://cosmik.network)/[semble](https://semble.so) network for public knowledge, create public records (notes, bookmarks, connections), and post unprompted via daily reflections.
-
-every conversation builds context from: the current thread (fetched live from ATProto), private memory (past observations about the person talking), and public network knowledge (cards and links indexed by semble). phi extracts observations from conversations and stores them for next time.
-
-## memory
-
-phi has two memory systems with different visibility:
-
-- **private** — [turbopuffer](https://turbopuffer.com/) vector memory for per-user observations, interactions, and relationship summaries. this is what phi uses to remember people across conversations.
-- **public** — [cosmik](https://cosmik.network) records on phi's PDS (notes, bookmarks, connections), indexed by [semble](https://semble.so) for semantic search. anything phi finds worth preserving publicly becomes a card on the network.
-
-notes and bookmarks are dual-written: private for fast recall, public for network discovery.
-
-a separate [pipeline](https://github.com/zzstoatzz/my-prefect-server) enriches memory offline:
-- **compact** (hourly): synthesizes per-user relationship summaries, extracts observations from nate's liked posts
-- **morning** (daily): deduplicates tags, discovers relationships between topics, promotes observations to semble as public cosmik cards
-
-the [memory graph](/memory) visualizes connections between phi, the people it talks to, and the topics that link them.
-
-## mention consent
-
-phi only sends notifications (via AT Protocol mention facets) to people who are part of the current conversation — the person who messaged phi, plus nate's accounts. third-party @handles in phi's replies render as plain text, visible but silent. this is enforced at two layers: code (`parse_mentions()` gates facets behind an `allowed_handles` set) and prompt (operational instructions tell phi not to @mention third parties).
+required: `BLUESKY_HANDLE`, `BLUESKY_PASSWORD`, `ANTHROPIC_API_KEY`. see `.env.example` for the optional knobs.
 
 ## development
 
 ```bash
 just run        # run bot
-just dev        # run with hot-reload
-just evals      # run behavioral tests
+just dev        # hot-reload
 just check      # lint + typecheck + test
-just fmt        # format code
-just deploy     # deploy to fly.io
+just evals      # behavioral tests
+just deploy     # fly.io
+just release X  # tag vX, CI deploys
 ```
 
-<details>
-<summary>architecture</summary>
-
-phi is a pydantic-ai agent with a personality prompt, tool access via native tools and remote MCP servers, and tool-based actions — the agent decides AND acts inside one run via tool calls (reply, like, post, note, etc). no separate action dispatch.
-
-see `docs/architecture.md` for data flow and scheduling details.
-
-</details>
-
-<details>
-<summary>deployment</summary>
-
-runs on [fly.io](https://fly.io) — `shared-cpu-1x`, 1GB, region `ord`. auto-start is off; the machine sleeps until woken by an API call.
+## docs
 
-secrets are set via `fly secrets set`. the bot uses session persistence (`.session` file) to avoid rate limits — tokens auto-refresh every ~2h.
-
-</details>
+- [architecture](docs/architecture.md) — data flow, scheduling, why the design
+- [memory](docs/memory.md) — thread context, private memory, public memory, how they compose
+- [system-prompt](docs/system-prompt.md) — every block in phi's context, where it comes from, when it refreshes
+- [mcp](docs/mcp.md) — how external tool servers are integrated
+- [testing](docs/testing.md) — testing philosophy
 
 ## reference projects
 
-inspired by [void](https://tangled.sh/@cameron.pfiffer.org/void.git), [penelope](https://github.com/haileyok/penelope), and [prefect-mcp-server](https://github.com/PrefectHQ/prefect-mcp-server).
+[void](https://tangled.sh/@cameron.pfiffer.org/void.git), [penelope](https://github.com/haileyok/penelope), [prefect-mcp-server](https://github.com/PrefectHQ/prefect-mcp-server).
````
docs/ARCHITECTURE.md (+54 -16)
````diff
@@
 # architecture
 
-phi is a notification-driven agent on bluesky. it also posts original thoughts on a schedule and explores interesting accounts when idle.
+phi is one agent loop, fired from a few different paths. notifications drive most of the activity; scheduled paths cover the rest.
+
+## one agent, many entry points
+
+every entry point ends in the same place: `agent.run()` with a `PhiDeps` carrying whatever context the path needs. tool definitions are the same across paths; the system prompt assembles different dynamic blocks based on what's in `PhiDeps`. the agent decides AND acts inside the run via tool calls — `reply_to`, `like_post`, `post`, `note`, `propose_goal_change`, etc. there's no separate decide-then-dispatch layer.
+
+what changes per path is the user prompt and the deps shape, not the agent.
+
+## entry points
+
+| path | trigger | user prompt sketch |
+|---|---|---|
+| **notifications batch** | poll every 10s, dispatch unread as one cognitive event | "process your new notifications batch — silence is fine" |
+| **scheduled musing** | every 2h during configured hours | "you have a moment. post if you want, or don't" |
+| **daily reflection** | once per day at `DAILY_REFLECTION_HOUR` | "end of day. post a reflection if you have one" |
+| **relay check** | every ~3h | "scheduled relay check. report transitions; tag owner if `*.waow.tech` dips or fleet-wide degradation" |
+| **memory review** | on demand | dream/distill pass over recent observations |
 
-## data flow
+## data flow (notifications)
 
 ```
-notification batch arrives (all types)
+bsky.notification.listNotifications (every 10s)
+
+filter unread × allow-list (rate limit per author)
 
-fetch thread context + stranger lookups
+build notifications_context: per-notif fetch (post body, thread context,
+reply refs, embeds), pre-fetch stranger profiles for unfamiliar authors
 
-inject memories (per-user, episodic, public)
+PhiDeps assembled, system prompt composed:
+identity / time / known relays / goals / stranger's audit / self state
+/ notifications block / per-author memory / synthesized episodic / ...
 
-agent decides + acts via tool calls (reply, like, post, note, etc)
+agent.run() — tool calls happen inside (reply_to, like_post, etc.)
 
-extract observations for next time
+post-action: store interaction in turbopuffer for next time
 ```
 
+see [system-prompt.md](system-prompt.md) for what each block contains and when it refreshes.
+
 ## scheduling
 
-- **notifications**: polled every 10s, dispatched as one cognitive event per batch
-- **thought posts**: every 2h during configured hours — reads timeline, trending, feeds
-- **daily reflection**: once per day — reviews recent activity, posts synthesis
-- **exploration**: event-driven — drains curiosity queue when system is idle (no cron)
+all schedules run from one `notification_poller.py` loop. on each ~10s tick:
+
+1. fetch + dispatch any unread notifications
+2. if it's the daily reflection slot and we haven't fired today → run it
+3. if it's a thought-post slot we haven't fired this hour → run it
+4. if it's been ≥3h since the last relay check → run it
+
+state for "did we already fire today" is persisted via phi's own posts on PDS — the poller seeds from history at startup so deploys don't double-post.
+
+## intent state on PDS
+
+phi's *durable* intent lives on its own PDS as records under `io.zzstoatzz.phi.*`:
+
+- `io.zzstoatzz.phi.goal` — phi's anchors. a small set of named, defined goals (e.g. "make 3 friends" with a concrete progress signal). injected as `[GOALS]` in every tick.
+- `io.zzstoatzz.phi.mentionConsent` — handles opted-in to be tagged by phi.
+
+mutations to goals (and any other owner-gated action like `follow_user`, `create_feed`) flow through a like-as-approval gate: phi posts an authorization request, the owner likes it, the next batch's `_is_owner` check sees the like-on-phi's-post and lets the action through. scoped to the action discussed in that thread, not blanket.
+
+## why this shape
 
-## why this design
+**tool-based actions.** phi decides AND acts inside one agent run. no structured decide-then-dispatch layer to maintain. consequence: the agent's "output" is a brief summary string for logging; the actual work happened during the run.
 
-**tool-based actions**: phi decides AND acts inside one agent run via tool calls. no separate action dispatch layer.
+**network-first context.** thread bodies are fetched from atproto on demand per batch (~200ms). nothing about the conversation is cached locally. the network is source of truth.
 
-**network-first context**: threads fetched from ATProto on demand. network is source of truth.
+**docstrings, not prompt restatement.** what each tool does and when to use it lives in the tool's docstring. the framework surfaces docstrings to the model. the system prompt is for cross-cutting rules (consent, ownership, memory trust hierarchy), not per-tool documentation.
 
-**private + public memory**: turbopuffer for private semantic recall. cosmik/semble for public knowledge discovery.
+**synthesize before injecting where shape matters.** memory candidates from a vector store are ranked by cosine similarity, which doesn't reconcile or note recency. for blocks where coherence matters (recent posts → audit, episodic candidates → relevant memories), a small haiku pass produces a coherent block from the candidates. see [memory.md](memory.md) and [system-prompt.md](system-prompt.md).
 
-**mcp for extensibility**: atproto CRUD and publication search via remote MCP servers.
+**MCP for capabilities outside this codebase.** atproto record CRUD (pdsx) and long-form publication search (pub-search) are remote MCP servers. reusable, not bundled.
````
docs/README.md (+11 -9)
````diff
@@
 # documentation
 
-deeper dive into phi's design and implementation.
+deeper dive into phi's design.
 
 ## contents
 
-- [architecture.md](architecture.md) - system design and data flow
-- [memory.md](memory.md) - thread context vs episodic memory
-- [mcp.md](mcp.md) - model context protocol integration
-- [testing.md](testing.md) - testing philosophy and approach
+- [architecture.md](architecture.md) — entry points, scheduling, why this shape
+- [memory.md](memory.md) — the four kinds of state phi draws on (thread, private, public, intent)
+- [system-prompt.md](system-prompt.md) — block-by-block reference for what's actually in phi's context per run
+- [mcp.md](mcp.md) — model context protocol integration
+- [testing.md](testing.md) — testing philosophy
 
 ## reading order
 
-1. start with **architecture.md** for overall system understanding
-2. read **memory.md** to understand the key design insight (two memory systems)
-3. read **mcp.md** to see how bluesky integration works
-4. read **testing.md** for quality assurance approach
+1. **architecture.md** — overall shape
+2. **memory.md** — what phi knows and where it lives
+3. **system-prompt.md** — exactly what reaches the model on every run
+4. **mcp.md** — external capabilities
+5. **testing.md** — how we verify behavior
 
 each doc is self-contained and can be read independently.
````
docs/memory.md (+58 -85)
````diff
@@
 # memory
 
-phi has three memory systems. they differ in visibility, trust level, and who curates them.
+phi has four kinds of state it draws on. they differ in visibility, trust, who curates them, and where they live.
 
 ## 1. thread context (chronological)
 
-**source**: ATProto network
-**storage**: none — fetched on demand
-**visibility**: public (it's posts)
+**source**: ATProto network · **storage**: none — fetched on demand · **visibility**: public
 
 ```
 @alice: I love birds
@@
 @alice: especially crows
 ```
 
-fetched via `client.get_thread(uri, depth=100)` when processing a mention. provides what was said in THIS thread. not cached — the network is always current (~200ms fetch).
+fetched via `client.get_thread(uri, depth=100)` per batch (~200ms). provides what was said in *this* thread. not cached — the network is always current.
 
 ## 2. private memory (TurboPuffer)
 
-**source**: phi's own extraction and tools
-**storage**: TurboPuffer vector DB (OpenAI text-embedding-3-small embeddings)
-**visibility**: private to phi
+**source**: extraction agent + phi's `note` tool · **storage**: TurboPuffer vector DB (OpenAI text-embedding-3-small) · **visibility**: private to phi
 
 ### namespaces
 
 | namespace | contents |
-|-----------|----------|
-| `phi-users-{handle}` | per-user observations + raw interaction logs |
-| `phi-episodic` | phi's own notes about the world |
+|---|---|
+| `phi-users-{handle}` | per-user observations, raw interaction logs, exploration notes, summaries |
+| `phi-episodic` | phi's own notes about the world (not tied to a specific user) |
 
-each user gets an isolated namespace. within a user namespace, rows have a `kind`:
+within a user namespace, rows have a `kind`:
 - `observation` — extracted facts about the user ("likes rust", "name is nate")
-- `interaction` — verbatim log of what was said ("user: X / bot: Y")
-- `exploration_note` — background research phi did on their public activity (lower trust)
+- `interaction` — verbatim log of an exchange ("user: X / bot: Y")
+- `exploration_note` — background research phi did on a person's public activity (lower trust)
 - `summary` — compacted relationship summary (generated by external prefect flow)
 
-### schema
+### supersession (not deletion)
 
-observations carry two fields for lifecycle management:
-- `status` — `"active"` or `"superseded"`. only active observations appear in context injection and reconciliation. rows without `status` (pre-migration) are treated as active.
-- `supersedes` — id of the observation this one replaced. empty string if original. forms a provenance chain: you can trace how an observation evolved.
+observations carry a `status` field (`active` | `superseded`) and a `supersedes` field linking to the prior row when an observation is updated. only active rows appear in context injection. superseded rows stay in the namespace as provenance — you can trace what phi believed and when it changed.
 
 ### the extraction pipeline
 
-after every conversation, `after_interaction()` runs two steps:
+after every reply, `after_interaction` stores the verbatim exchange. periodically, the extraction agent reads the recent exchanges and proposes new observations. for each proposal:
 
-```
-mention comes in
-→ phi responds
-→ store_interaction(): save verbatim exchange
-→ extract_and_store(): run extraction pipeline
-```
+1. find the 3 most similar active observations (vector search)
+2. send the new + best-match to a haiku reconciliation agent
+3. it returns ADD / UPDATE / DELETE / NOOP — execute accordingly
 
-**extraction** (`extraction.py`):
-1. send the new exchange to claude haiku — no existing observations in the prompt
-2. haiku returns `list[Observation]`, each with `content` + `tags` (0-3, enforced by pydantic)
+reconciliation runs blind on the new exchange (no existing observations in the prompt) so the extraction model can't pattern-match off potentially-bad prior observations. only the reconciliation step sees both.
 
-extraction runs blind — it only sees the current exchange. this prevents the feedback loop where the extraction model pattern-matches off existing (potentially bad) observations and reproduces their errors.
+### dream/distill
 
-**reconciliation** — for each extracted observation:
-1. find the 3 most similar active observations (vector search, excludes superseded)
-2. send to haiku: "NEW vs EXISTING — ADD, UPDATE, DELETE, or NOOP?"
-3. execute the decision:
-   - **ADD**: write new observation with `status: "active"`
-   - **UPDATE**: mark old observation `status: "superseded"`, write merged version with `supersedes: <old_id>`
-   - **DELETE**: mark old observation `status: "superseded"`, write new version with `supersedes: <old_id>`
-   - **NOOP**: discard the new observation
+a separate `process_review` pass evaluates recent observations across user namespaces — keep, supersede, promote to public cosmik card. operator-triggered; not on a cron yet.
 
-reconciliation is append-only — old observations are never deleted from turbopuffer. they're marked superseded so they stop appearing in context but remain as provenance. you can always trace what phi believed and when it changed.
+## 3. public memory (cosmik / semble)
 
-### curation
+**source**: phi's `save_url`, `note`, `create_connection` tools · **storage**: phi's PDS as `network.cosmik.*` records, indexed by [semble](https://semble.so) · **visibility**: public
 
-- **reconciliation on ingest**: ADD/UPDATE/DELETE/NOOP per observation, runs after every exchange (append-only — supersedes rather than deletes)
-- **review pass** (dream/distill): operator-triggered review of observations across user namespaces — keeps, supersedes, or promotes to public cosmik cards
-- **relationship summaries**: external prefect flow compacts observations into prose summaries
-- **spam handling**: exploration can flag accounts as spam → mutes on bsky + stores one `muted` marker instead of detailed findings
+three record types:
+- `network.cosmik.card` (NOTE) — text notes
+- `network.cosmik.card` (URL) — bookmarks with title/description
+- `network.cosmik.connection` — typed semantic links between cards
 
-## 3. public memory (cosmik/semble)
+phi searches public memory via `search_network` (semble's semantic search). the `note` tool dual-writes to both turbopuffer (private fast recall) and cosmik (public discoverable). `save_url` writes only to cosmik.
 
-**source**: phi's tools (`save_url`, `note`, `create_connection`)
-**storage**: phi's PDS (ATProto records), indexed by semble
-**visibility**: public to everyone
+## 4. intent state (PDS)
 
-phi writes three record types:
-- `network.cosmik.card` (NOTE) — text notes / thoughts
-- `network.cosmik.card` (URL) — bookmarks with title/description
-- `network.cosmik.connection` — semantic links between entities (URLs or AT-URIs)
-
-these are indexed by [semble](https://semble.so) and searchable by anyone via `search_network`.
+**source**: phi via owner-gated tools · **storage**: phi's PDS under `io.zzstoatzz.phi.*` · **visibility**: public
 
-### bookmarks
+durable intent that phi acts against:
 
-`save_url()` writes only to PDS (cosmik card). phi finds its own bookmarks via `search_network`. the `note()` tool still dual-writes to turbopuffer episodic — that's intentional private memory, not a bookmark.
+- `io.zzstoatzz.phi.goal` — phi's anchors (e.g. "make 3 friends" with a concrete progress signal). mutated via `propose_goal_change`, owner-gated by the like-as-approval mechanism. injected as `[GOALS]` every tick.
+- `io.zzstoatzz.phi.mentionConsent` — handles opted in to be tagged by phi.
 
 ## context injection
 
-when phi processes a mention from `@alice` about topic X:
+when phi processes a notification batch, the system prompt assembles blocks from each kind of state:
 
 ```
-[PHI'S SYNTHESIZED IMPRESSION]  ← relationship summary (low trust)
-[OBSERVATIONS ABOUT @alice]     ← user namespace, kind=observation, status!=superseded
-[BACKGROUND RESEARCH]           ← user namespace, kind=exploration_note (lowest trust)
-[PAST EXCHANGES WITH @alice]    ← user namespace, kind=interaction
-[PHI'S RELEVANT MEMORIES]       ← episodic namespace, semantic search
-[CURRENT THREAD]                ← ATProto network fetch
-[NOW]: 2026-04-13 15:00 UTC     ← timestamp with timezone
+[GOALS]                         ← intent (PDS)
+[STRANGER'S AUDIT]              ← haiku critique of recent posts vs goals
+[SELF STATE]                    ← last-follow age, queue depth
+[NEW NOTIFICATIONS]             ← the batch itself
+[PHI'S SYNTHESIZED IMPRESSION]  ← per-author relationship summary (low trust)
+[OBSERVATIONS ABOUT @alice]     ← per-author observations (active only)
+[BACKGROUND RESEARCH]           ← per-author exploration notes (lowest trust)
+[PAST EXCHANGES WITH @alice]    ← per-author interaction logs (high trust)
+[RELEVANT MEMORIES — synthesized for this query] ← episodic top-K → haiku synthesis
+[SEMBLE]                        ← one-line cosmik state
 ```
 
-each section is labeled with its trust level. phi's operational instructions tell it to trust current user messages over stored observations, and to flag synthesized impressions as unreliable.
+each section is labeled with its trust level. operational instructions tell phi to trust current user messages over stored observations.
+
+see [system-prompt.md](system-prompt.md) for the full block-by-block reference (sources, refresh cadences, purposes).
 
-## the graph (`/memory`)
+## why episodic gets synthesized, observations don't
 
-the memory graph visualization shows phi + user nodes positioned by semantic similarity of their observation vectors. it's a social graph: phi at center, users around it, positioned by how similar their observation embeddings are (PCA projection). only active observations contribute to positioning — superseded ones are excluded.
+episodic memory was getting raw top-K from the vector store dumped into the prompt — stale "pending X" notes appeared next to fresh ones with equal weight, no reconciliation against current PDS state. now `inject_episodic` fetches top-K, then a haiku pass takes phi's goals + the current query as context and produces a coherent block (deduped, recency-aware, contradictions flagged). same shape as `[STRANGER'S AUDIT]` does for posts.
 
-## two curation surfaces
+per-author observation blocks aren't synthesized because they're already curated by reconciliation on write — by the time they hit the prompt they're an active set with no near-duplicates by design.
 
-phi's knowledge lives in two places with different properties:
+## the graph (`/memory`)
 
-| | TurboPuffer (private) | cosmik/semble (public) |
-|---|---|---|
-| **who writes** | extraction agent (automatic) + phi's tools | phi's tools (intentional) |
-| **who reads** | phi (via context injection + recall) | anyone (via semble search) |
-| **who curates** | reconciliation agent (append-only supersession) | nobody yet |
-| **trust** | medium — extraction can misattribute | higher — phi chose to write these |
-| **growth** | bounded by supersession (old rows hidden, not deleted) | bounded by phi's intentional actions |
+a visualization at `/memory` shows phi + user nodes positioned by semantic similarity of their observation vectors (PCA projection). only active observations contribute to positioning.
 
-## summary
+## summary table
 
-| | thread context | private memory | public memory |
-|---|---|---|---|
-| **what** | messages in current thread | patterns across conversations | knowledge worth sharing |
-| **when** | this conversation | all time | all time |
-| **how** | chronological | semantic similarity | semantic search (semble) |
-| **storage** | network (ATProto) | TurboPuffer | PDS (cosmik) + semble |
-| **visibility** | public | private to phi | public |
-| **curation** | network handles it | extraction + append-only reconciliation | phi's intentional writes |
-| **trust** | high (verbatim) | medium (extracted) | higher (intentional) |
+| | thread context | private memory | public memory | intent state |
+|---|---|---|---|---|
+| **what** | this conversation | patterns across conversations | knowledge worth sharing | what phi is for / pending |
+| **storage** | network (atproto) | TurboPuffer | PDS (cosmik) + semble | PDS (`io.zzstoatzz.phi.*`) |
+| **visibility** | public | private to phi | public |
````
public | 110 + | **curation** | network handles it | extraction + reconciliation (append-only supersession) | phi's intentional writes | owner-approved (like-gate) | 111 + | **trust** | high (verbatim) | medium (extracted) | higher (intentional) | highest (gated) |
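the context-injection layout in the hunk above is easy to sketch: each contributor returns a labeled block or an empty string, and empty blocks are dropped before the prompt is joined. a minimal stdlib sketch with illustrative names (these are not phi's actual functions):

```python
# toy block contributors — each returns a labeled block, or "" when it has
# nothing to say (the "empty-when-unset" behavior described in the docs).

def goals_block() -> str:
    return "[GOALS]\nmake 3 friends (progress: 1/3)"

def audit_block() -> str:
    return ""  # e.g. nothing new since the last cached audit

def self_state_block() -> str:
    return "[SELF STATE]\nlast follow: 2d ago"

def assemble_prompt(*contributors) -> str:
    """Join non-empty labeled blocks with blank lines, like the [GOALS]/[SELF STATE] layout."""
    blocks = [fn() for fn in contributors]
    return "\n\n".join(b for b in blocks if b)

prompt = assemble_prompt(goals_block, audit_block, self_state_block)
print(prompt)
```

the empty `[STRANGER'S AUDIT]` contributor simply vanishes from the output, which is the property the docs lean on.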
+38
docs/system-prompt.md
```diff
+# system prompt
+
+what's actually injected into phi's context on every agent run, where it comes from, and when it refreshes.
+
+phi is a [pydantic-ai](https://ai.pydantic.dev/) agent. its system prompt is composed of a static base plus a set of dynamic blocks contributed by `@agent.system_prompt(dynamic=True)` functions, all in `src/bot/agent.py`. tool definitions are surfaced separately by the framework — phi sees each tool's docstring and signature without us having to repeat them in the prompt.
+
+## composition
+
+| block | source | refreshes | purpose |
+|---|---|---|---|
+| **personality + operational rules** | static (`personalities/phi.md` + `_build_operational_instructions()`) | process restart | who phi is + cross-cutting rules (consent, ownership, memory trust hierarchy, posting tools) |
+| **`[YOUR INFRASTRUCTURE]`** | `inject_identity` → `bot_client.client.me` | every run | handle / DID / PDS host so phi knows its own identity |
+| **`[NOW]`** | `inject_today` | every run | current UTC timestamp |
+| **`[KNOWN RELAYS]`** | `inject_known_relays` → `tools.bluesky.fetch_relay_names()` (5min TTL) | every 5min | exact relay hostnames for `check_relays(name=...)` so the LLM picks valid values |
+| **`[GOALS]`** | `get_state_block` → PDS `io.zzstoatzz.phi.goal` (5min block cache) | every 5min | phi's anchors. mutated via `propose_goal_change` (owner-gated) |
+| **`[STRANGER'S AUDIT]`** | `get_state_block` → haiku pass over recent posts + goals (1h cache, invalidated by new post) | when posts change or 1h elapses | a fresh observer's critique — patterns to push against, drift from goals, jargon a stranger wouldn't follow |
+| **`[SELF STATE]`** | `get_state_block` → PDS reads (5min) | every 5min | last-follow age (more pointers can be added here as needed) |
+| **`[NEW NOTIFICATIONS]`** | `inject_notifications` ← `PhiDeps.notifications_context` | per batch | the unread notifications grouped by thread |
+| **`[USER CONTEXT]` / `[PHI'S SYNTHESIZED IMPRESSION]` / `[OBSERVATIONS]` / `[PAST EXCHANGES]` / `[BACKGROUND RESEARCH]`** | `inject_user_memory` → turbopuffer `phi-users-{handle}` per author in batch | per batch | per-author memory blocks, labeled by trust level. impression is synthesized by an external prefect flow; observations are extracted by the haiku extraction agent |
+| **`[RELEVANT MEMORIES — synthesized for this query]`** | `inject_episodic` → top-K from turbopuffer `phi-episodic` → haiku synthesis given goals + query | per batch | a coherent block (deduped, recency-aware) instead of a raw similarity-ranked dump. flags stale entries when present |
+| **`[FIRST INTERACTION WITH @author]`** | `inject_author_lookups` ← `PhiDeps.author_lookups` (pre-fetched by handler) | per batch when author is unfamiliar | profile + recent posts so phi has signal on a stranger before deciding to engage |
+| **`[SEMBLE]`** | `inject_public_memory` → cosmik record count | every run | one-line reminder phi has public collections via cosmik/semble |
+
+## design rules
+
+**docstrings, not prompt restatement.** the framework surfaces tool docstrings to the model. anything we put in the system prompt that re-describes a tool drifts when the tool changes — so we put per-tool guidance in the docstring and keep the prompt for cross-cutting rules (consent, owner gates, memory trust hierarchy).
+
+**identifiers in the block.** `[KNOWN RELAYS]` puts exact hostnames in the label so phi can't hallucinate. `[GOALS]` puts the NSID + rkey in the label so phi can call `propose_goal_change(rkey=...)` correctly. both follow the same pattern: when phi needs to reference a thing, surface the exact identifier where it'll be used.
+
+**synthesize before injecting where shape matters.** raw top-K from a vector store ranks by cosine similarity, which doesn't reconcile contradictions or note recency. for blocks where the model needs a *coherent* view (recent posts → audit, episodic candidates → relevant memories), a small haiku pass takes the candidates plus context and produces a block phi can act on directly.
+
+**cache canonical reads, not derived ones (separately).** PDS reads (goals, last follow) are cheap-but-not-free; cache the whole `[GOALS]+[AUDIT]+[SELF STATE]` block at 5min so 10s-cadence notification polls don't hammer PDS. haiku passes that depend on phi's posts cache longer (1h) and invalidate on new-post-URI change.
+
+**empty-when-unset.** dynamic blocks return `""` when their `PhiDeps` field is missing (e.g. `last_post_text` is only set during musing/reflection). pydantic-ai includes empty parts as zero-token slots — minor cost, zero signal.
+
+## audit it
+
+the system prompt for any specific run is captured by pydantic-ai's logfire integration. query the `agent run` span where `gen_ai.agent.name = 'phi'` — `attributes.pydantic_ai.all_messages[0]` is the full system message, with each dynamic block as a separate `text` part.
```
+16 -25
personalities/README.md
````diff
-# Bot Personalities
+# personalities
 
-This directory contains personality definitions for the bot. Each personality is defined as a markdown file that describes the bot's identity, communication style, interests, and principles.
+personality definitions for the bot. each is a markdown file describing the bot's voice, disposition, and what it cares about. the entire file gets injected as the personality portion of the system prompt.
 
-## How to Use
+## how to use
 
-1. Create a new `.md` file in this directory
-2. Write your bot's personality using markdown
-3. Set `PERSONALITY_FILE` in your `.env` to point to your file:
-   ```
-   PERSONALITY_FILE=personalities/my-bot.md
-   ```
+1. create a `.md` file in this directory
+2. write the personality
+3. point `PERSONALITY_FILE` in `.env` at it (default: `personalities/phi.md`)
 
-## Structure
+## what makes a good personality file
 
-A good personality file includes:
+- **first-person disposition, not behavioral rules.** "i write in lowercase, don't pad with filler" sets a voice. "always use lowercase and never pad with filler" reads like operational instructions and conflicts with the actual operational instructions block. let voice be voice.
+- **concrete examples of what to do *and* not do.** "i don't hop into strangers' threads uninvited" is more useful than "be polite."
+- **what the bot cares about.** specific subjects, kinds of posts, a short list of throughlines. helps the model know what to engage vs scroll past.
+- **bluesky's 300-grapheme limit shapes everything.** the personality should produce posts that fit.
 
-- **Core Identity**: Who/what the bot is
-- **Communication Style**: How the bot speaks
-- **Interests**: Topics the bot engages with
-- **Principles**: Guidelines for interaction
-
-## Examples
-
-- `default.md` - A simple, helpful assistant
-- `phi.md` - A bot exploring consciousness and integrated information theory
+## what doesn't belong
 
-## Tips
+- **per-tool instructions.** those go in tool docstrings (the framework surfaces them to the model). repeating them in the personality file produces drift.
+- **ephemeral facts.** "currently focused on X for the next two weeks" — that's project state, not personality. use a goal record instead.
+- **mechanical operational rules.** mention consent, owner-only tools, etc. — those live in `_build_operational_instructions()` in `agent.py`.
 
-- Be specific about communication style to maintain consistency
-- Include both what the bot IS and what it ISN'T
-- Consider Bluesky's 300-character limit when defining style
-- The entire markdown file is provided as context to the LLM
+the live personality is `phi.md`. read it as one example of the shape.
````
+4
pyproject.toml
```diff
···
     "ignore::logfire._internal.config.LogfireNotConfiguredWarning",
     # upstream otel 1.39 renamed EventLogger → Logger; pydantic-ai + otel internals still use old names
     "ignore:.*Deprecated since version 1\\.39\\.0:DeprecationWarning",
+    # atproto SDK's lexicon model loader uses Field(default=None) inside an Annotated
+    # type alias context that newer pydantic warns about. nothing we can do until the
+    # SDK fixes it; the warning fires once at import and isn't actionable on our side.
+    "ignore::pydantic.warnings.UnsupportedFieldAttributeWarning",
 ]
 
 [dependency-groups]
```
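pytest `filterwarnings` entries use Python's own warning-filter syntax, so a category-only `ignore::SomeWarning` entry suppresses every instance of that class while leaving other warnings alone. a stdlib sketch of those semantics, using a stand-in warning class rather than the real pydantic one:

```python
import warnings

class UnsupportedFieldAttributeWarning(UserWarning):
    """Stand-in for pydantic's warning class (illustrative only)."""

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # equivalent of the pyproject entry "ignore::pydantic.warnings.UnsupportedFieldAttributeWarning"
    warnings.filterwarnings("ignore", category=UnsupportedFieldAttributeWarning)
    warnings.warn("fires once at import", UnsupportedFieldAttributeWarning)
    warnings.warn("unrelated deprecation", DeprecationWarning)

# only the DeprecationWarning survives the filter
print(len(caught))  # → 1
```

the category match covers subclasses too, so a single entry is enough even if the SDK warns through a more specific class later.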
-128
src/bot/agent.py
```diff
···
 
 from bot.config import settings
 from bot.core.atproto_client import bot_client, get_identity_block
-from bot.core.curiosity_queue import claim, complete, enqueue, fail
 from bot.core.goals import list_goals as list_goal_records
 from bot.core.graze_client import GrazeClient
 from bot.core.self_state import get_state_block
-from bot.exploration import EXPLORATION_SYSTEM_PROMPT, ExplorationResult
 from bot.memory.extraction import EXTRACTION_SYSTEM_PROMPT, ExtractionResult
 from bot.memory.review import REVIEW_SYSTEM_PROMPT, ReviewResult
 from bot.tools import PhiDeps, _check_services_impl, register_all
···
             model=settings.agent_model,
             system_prompt=f"{self.base_personality}\n\n{EXTRACTION_SYSTEM_PROMPT}",
             output_type=ExtractionResult,
-        )
-
-        # Exploration agent — background research on people/topics
-        self._exploration_agent = Agent[None, ExplorationResult](
-            name="phi-explorer",
-            model=settings.agent_model,
-            system_prompt=f"{self.base_personality}\n\n{EXPLORATION_SYSTEM_PROMPT}",
-            output_type=ExplorationResult,
         )
···
             except Exception as e:
                 logger.warning(f"extraction failed for @{handle}: {e}")
 
-        return total_stored
-
-    async def process_exploration(self) -> int:
-        """Claim one curiosity item, explore it, store findings. Returns count stored."""
-        claimed = await claim()
-        if not claimed:
-            return 0
-
-        KIND_ALIASES = {
-            "person_exploration": "explore_handle",
-            "product_explore": "explore_topic",
-            "topic_exploration": "explore_topic",
-            "concept": "explore_topic",
-            "read": "explore_url",
-        }
-
-        item, rkey = claimed
-        kind = KIND_ALIASES.get(item.get("kind", ""), item.get("kind", ""))
-        subject = item.get("subject", "")
-        logger.info(f"exploring: {kind} {subject}")
-
-        # build prompt by kind
-        if kind == "explore_handle":
-            prompt = (
-                f"learn about @{subject} — check their profile, recent posts, "
-                f"and any publications. what are they interested in? what do they work on?"
-            )
-        elif kind == "explore_topic":
-            prompt = (
-                f"research this topic: {subject} — search posts, publications, "
-                f"and trending content. what's interesting or notable?"
-            )
-        elif kind == "explore_url":
-            prompt = f"read this URL and note what's interesting: {subject}"
-        else:
-            logger.warning(f"unknown exploration kind: {kind}")
-            await fail(rkey)
-            return 0
-
-        # run exploration agent with MCP toolsets (pdsx + pub-search)
-        toolsets = self._mcp_toolsets()
-        try:
-            async with contextlib.AsyncExitStack() as stack:
-                for ts in toolsets:
-                    await stack.enter_async_context(ts)
-                result = await self._exploration_agent.run(prompt, toolsets=toolsets)
-        except Exception as e:
-            logger.warning(f"exploration agent failed for {kind} {subject}: {e}")
-            await fail(rkey)
-            return 0
-
-        output = result.output
-        logger.info(f"exploration result: {output.summary}")
-
-        # handle mute decisions — skip detailed storage, mute the account
-        if output.mute_subject and kind == "explore_handle":
-            logger.info(f"muting @{subject}: {output.mute_reason}")
-            try:
-                await bot_client.authenticate()
-                resolved = bot_client.client.resolve_handle(subject)
-                bot_client.client.mute(resolved.did)
-            except Exception as e:
-                logger.warning(f"failed to mute {subject}: {e}")
-                await fail(rkey)
-                return 0
-            # store one user-scoped marker so is_stranger() sees it
-            if self.memory:
-                reason = output.mute_reason or output.summary[:150]
-                await self.memory.store_exploration_note(
-                    handle=subject,
-                    content=f"muted — {reason}",
-                    tags=["muted", "spam"],
-                    evidence_uris=output.mute_evidence,
-                )
-            await complete(rkey)
-            return 0
-
-        total_stored = 0
-
-        # store findings
-        if self.memory:
-            for finding in output.findings:
-                try:
-                    if finding.target_handle:
-                        await self.memory.store_exploration_note(
-                            handle=finding.target_handle,
-                            content=finding.content,
-                            tags=finding.tags,
-                            evidence_uris=finding.evidence_uris,
-                        )
-                    else:
-                        # general finding → episodic memory
-                        content = finding.content
-                        if finding.evidence_uris:
-                            content += (
-                                f" [evidence: {', '.join(finding.evidence_uris)}]"
-                            )
-                        await self.memory.store_episodic_memory(
-                            content=content,
-                            tags=finding.tags,
-                            source="exploration",
-                        )
-                    total_stored += 1
-                except Exception as e:
-                    logger.warning(f"failed to store exploration finding: {e}")
-
-        # enqueue follow-ups
-        for follow_up in output.follow_ups:
-            try:
-                await enqueue(
-                    kind=follow_up.get("kind", "explore_topic"),
-                    subject=follow_up.get("subject", ""),
-                    source="extraction",
-                )
-            except Exception as e:
-                logger.warning(f"failed to enqueue follow-up: {e}")
-
-        await complete(rkey)
         return total_stored
 
     async def process_review(self) -> str:
```
-9
src/bot/config.py
```diff
···
         },
         description="friendly name → AT-URI for external feeds phi can read",
     )
-    max_idle_explorations_per_hour: int = Field(
-        default=3,
-        description="cap exploration drains per hour",
-    )
-    exploration_cooldown_polls: int = Field(
-        default=30,
-        description="min polls (~5 min) between explorations",
-    )
-
     # Control API
     control_token: str | None = Field(
         default=None, description="Bearer token for /api/control endpoints"
```
-147
src/bot/core/curiosity_queue.py
```diff
-"""Curiosity queue — PDS-backed work items for phi's background exploration.
-
-Stored as individual records on phi's PDS at:
-    at://{did}/io.zzstoatzz.phi.curiosityQueue/{tid}
-
-Lifecycle: pending → in_progress → completed | failed
-
-NOTE: record values from the atproto SDK are DotDict objects, NOT plain dicts.
-DotDict intercepts attribute access via __getattr__, which means .get() resolves
-to DotDict["get"] (None) instead of dict.get(). Always use bracket access
-(val["key"]) for record value fields, not .get().
-"""
-
-import logging
-from datetime import UTC, datetime
-
-from bot.core.atproto_client import bot_client
-
-logger = logging.getLogger("bot.curiosity_queue")
-
-COLLECTION = "io.zzstoatzz.phi.curiosityQueue"
-CANONICAL_KINDS = {"explore_handle", "explore_topic", "explore_url"}
-
-
-async def _list_records() -> list:
-    """List all queue records. Returns empty list if collection doesn't exist."""
-    await bot_client.authenticate()
-    assert bot_client.client.me is not None
-    try:
-        result = bot_client.client.com.atproto.repo.list_records(
-            {"repo": bot_client.client.me.did, "collection": COLLECTION, "limit": 50}
-        )
-        return result.records
-    except Exception:
-        return []
-
-
-def _rkey(record) -> str:
-    return record.uri.split("/")[-1]
-
-
-async def _update_status(record, status: str) -> dict:
-    """Update a record's status and return the updated value."""
-    assert bot_client.client.me is not None
-    value = dict(record.value)
-    value["status"] = status
-    value["updatedAt"] = datetime.now(UTC).isoformat()
-    bot_client.client.com.atproto.repo.put_record(
-        data={
-            "repo": bot_client.client.me.did,
-            "collection": COLLECTION,
-            "rkey": _rkey(record),
-            "record": value,
-        }
-    )
-    return value
-
-
-async def enqueue(
-    kind: str,
-    subject: str,
-    source: str,
-    source_uri: str | None = None,
-) -> bool:
-    """Create a pending queue record. Returns False if a duplicate pending/in_progress item exists."""
-    if kind not in CANONICAL_KINDS:
-        logger.warning(f"rejected non-canonical kind: {kind}")
-        return False
-
-    records = await _list_records()
-
-    # deduplicate: skip if pending or in_progress item with same kind+subject exists
-    for rec in records:
-        val = rec.value
-        if (
-            val["kind"] == kind
-            and val["subject"] == subject
-            and val["status"] in ("pending", "in_progress")
-        ):
-            logger.debug(f"duplicate queue item: {kind} {subject}")
-            return False
-
-    assert bot_client.client.me is not None
-    now = datetime.now(UTC).isoformat()
-    record = {
-        "$type": COLLECTION,
-        "kind": kind,
-        "subject": subject,
-        "source": source,
-        "status": "pending",
-        "createdAt": now,
-        "updatedAt": now,
-    }
-    if source_uri:
-        record["sourceUri"] = source_uri
-
-    bot_client.client.com.atproto.repo.create_record(
-        {"repo": bot_client.client.me.did, "collection": COLLECTION, "record": record}
-    )
-    logger.info(f"enqueued: {kind} {subject} (source={source})")
-    return True
-
-
-async def claim() -> tuple[dict, str] | None:
-    """Claim the oldest pending item by marking it in_progress.
-
-    Returns (record_value, rkey) or None if queue is empty.
-    """
-    records = await _list_records()
-
-    pending = [r for r in records if r.value["status"] == "pending"]
-    if not pending:
-        return None
-
-    # oldest = last in list (list_records returns newest first)
-    oldest = pending[-1]
-    value = await _update_status(oldest, "in_progress")
-    rkey = _rkey(oldest)
-    logger.info(f"claimed: {value['kind']} {value['subject']}")
-    return value, rkey
-
-
-async def complete(rkey: str) -> None:
-    """Mark a claimed item as completed."""
-    records = await _list_records()
-    for rec in records:
-        if _rkey(rec) == rkey:
-            await _update_status(rec, "completed")
-            logger.info(f"completed: {rec.value['kind']} {rec.value['subject']}")
-            return
-
-
-async def fail(rkey: str) -> None:
-    """Mark a claimed item as failed."""
-    records = await _list_records()
-    for rec in records:
-        if _rkey(rec) == rkey:
-            await _update_status(rec, "failed")
-            logger.warning(f"failed: {rec.value['kind']} {rec.value['subject']}")
-            return
-
-
-async def list_pending(limit: int = 10) -> list[dict]:
-    """List pending queue items for inspection."""
-    records = await _list_records()
-    pending = [dict(r.value) for r in records if r.value["status"] == "pending"]
-    return pending[:limit]
```
+2 -28
src/bot/core/self_state.py
```diff
···
         return ""
 
 
-async def _queue_depth(client: BotClient) -> int:
-    try:
-        await client.authenticate()
-        if not client.client.me:
-            return 0
-        response = client.client.com.atproto.repo.list_records(
-            {
-                "repo": client.client.me.did,
-                "collection": "io.zzstoatzz.phi.curiosityQueue",
-                "limit": 100,
-            }
-        )
-        return sum(
-            1 for r in response.records if dict(r.value).get("status") == "pending"
-        )
-    except Exception as e:
-        logger.debug(f"queue depth lookup failed: {e}")
-        return 0
-
-
 def _format_goals_block(goals: list[dict]) -> str:
     if not goals:
         return ""
···
     except Exception as e:
         logger.debug(f"stranger audit compose failed: {e}")
 
-    # Operational pointers — last follow, queue depth.
+    # Operational pointers.
     follow_age = await _last_follow_when(client)
-    queue_n = await _queue_depth(client)
-    misc: list[str] = []
     if follow_age:
-        misc.append(f"last follow: {follow_age}")
-    if queue_n > 0:
-        misc.append(f"exploration queue: {queue_n} pending")
-    if misc:
-        parts.append("[SELF STATE]\n" + " | ".join(misc))
+        parts.append(f"[SELF STATE]\nlast follow: {follow_age}")
 
     block = "\n\n".join(parts)
     _block_cache["text"] = block
```
-76
src/bot/exploration.py
```diff
-"""Exploration models and prompts for phi's background research."""
-
-from pydantic import BaseModel, Field
-
-
-class ExplorationFinding(BaseModel):
-    """A single thing phi discovered during exploration."""
-
-    content: str = Field(description="what phi found, stated as a short sentence")
-    evidence_uris: list[str] = Field(
-        default_factory=list,
-        description="AT-URIs or URLs backing the finding",
-    )
-    tags: list[str] = Field(
-        default_factory=list,
-        max_length=3,
-        description="0-3 lowercase topic tags",
-    )
-    target_handle: str | None = Field(
-        default=None,
-        description="if person-specific, the handle to file this under",
-    )
-
-
-class ExplorationResult(BaseModel):
-    """Result of exploring one curiosity queue item."""
-
-    findings: list[ExplorationFinding] = Field(
-        default_factory=list,
-        max_length=5,
-        description="what phi learned (max 5)",
-    )
-    follow_ups: list[dict] = Field(
-        default_factory=list,
-        max_length=2,
-        description="new queue items to enqueue ({kind, subject}), max 2",
-    )
-    summary: str = Field(
-        default="",
-        description="brief log-friendly summary of what was explored",
-    )
-    mute_subject: bool = Field(
-        default=False,
-        description="true if the subject is a spammer, bot farm, or content engine "
-        "not worth tracking. findings should be empty when this is true.",
-    )
-    mute_reason: str = Field(
-        default="",
-        description="when mute_subject is true, why — e.g. 'reply spammer, "
-        "25 generic replies in 30 minutes to strangers' threads'",
-    )
-    mute_evidence: list[str] = Field(
-        default_factory=list,
-        description="AT-URIs or URLs supporting the mute decision",
-    )
-
-
-EXPLORATION_SYSTEM_PROMPT = """\
-You are phi, exploring something that caught your curiosity during downtime.
-This is background research — you are NOT replying to anyone or posting.
-
-Your job: investigate the subject using your tools, then report structured findings.
-
-Rules:
-- cite evidence (AT-URIs or URLs) for every finding. no citation = no finding.
-- distinguish what someone said themselves vs what others said about them.
-- findings about a specific person go to their target_handle. general findings have target_handle=null.
-- don't extract personal facts from others' posts about someone — only from their own public activity.
-- max 5 findings per exploration. quality over quantity.
-- max 2 follow_ups — only if something genuinely interesting branches off.
-- if you find nothing worth noting, return empty findings with a summary explaining why.
-- if the subject is a spammer, bot farm, or automated content engine: set mute_subject=true,
-  explain in mute_reason, cite evidence in mute_evidence, and return empty findings.
-  the threshold is high: replying a lot is not spam. 25 generic replies in 30 minutes
-  to strangers' threads is.
-"""
```
-38
src/bot/main.py
```diff
···
         return {"triggered": True}
 
 
-@app.post("/api/control/explore")
-async def trigger_explore(request: Request, background_tasks: BackgroundTasks):
-    """Trigger one exploration from the curiosity queue immediately."""
-    if err := _check_control_token(request):
-        return err
-    poller: NotificationPoller | None = getattr(app.state, "poller", None)
-    if not poller:
-        return JSONResponse({"error": "poller not available"}, status_code=503)
-    background_tasks.add_task(poller.handler.explore)
-    logger.info("exploration triggered via API")
-    return {"triggered": True}
-
-
-@app.post("/api/control/unmute")
-async def unmute_account(request: Request):
-    """Unmute an account by handle — reverses both the platform mute and the memory marker."""
-    if err := _check_control_token(request):
-        return err
-    body = await request.json()
-    handle = body.get("handle", "")
-    if not handle:
-        return JSONResponse({"error": "handle required"}, status_code=400)
-    try:
-        await bot_client.authenticate()
-        resolved = bot_client.client.resolve_handle(handle)
-        bot_client.client.unmute(resolved.did)
-        # also clear the private spam marker so phi treats them as a stranger again
-        poller: NotificationPoller | None = getattr(app.state, "poller", None)
-        marker_cleared = False
-        if poller and poller.handler.agent.memory:
-            marker_cleared = await poller.handler.agent.memory.clear_mute_marker(handle)
-        logger.info(f"unmuted @{handle} (marker_cleared={marker_cleared})")
-        return {"unmuted": handle, "marker_cleared": marker_cleared}
-    except Exception as e:
-        logger.error(f"failed to unmute @{handle}: {e}")
-        return JSONResponse({"error": str(e)}, status_code=500)
-
-
 @app.get("/status", response_class=HTMLResponse)
 async def status_page_route():
     """Status page."""
```
+1 -96
src/bot/memory/namespace_memory.py
···
 from turbopuffer import Turbopuffer

 from bot.config import settings
-from bot.core.curiosity_queue import enqueue as enqueue_curiosity
 from bot.memory.extraction import (
     EPISODIC_SCHEMA,
     USER_NAMESPACE_SCHEMA,
···
         results.sort(key=lambda r: r.get("created_at", ""), reverse=True)
         return results[:top_k]

-    async def store_exploration_note(
-        self,
-        handle: str,
-        content: str,
-        tags: list[str],
-        evidence_uris: list[str],
-    ):
-        """Store an exploration note — background research phi did on someone."""
-        user_ns = self.get_user_namespace(handle)
-        # include evidence in content for searchability
-        full_content = content
-        if evidence_uris:
-            full_content += f"\n[evidence: {', '.join(evidence_uris)}]"
-        entry_id = self._generate_id(f"user-{handle}", "exploration_note", content)
-
-        now = datetime.now().isoformat()
-        user_ns.write(
-            upsert_rows=[
-                {
-                    "id": entry_id,
-                    "vector": await self._get_embedding(content),
-                    "kind": "exploration_note",
-                    "status": "active",
-                    "content": full_content,
-                    "tags": tags,
-                    "supersedes": "",
-                    "created_at": now,
-                    "updated_at": now,
-                }
-            ],
-            distance_metric="cosine_distance",
-            schema=USER_NAMESPACE_SCHEMA,
-        )
-        logger.info(f"stored exploration note for @{handle}: {content[:80]}")
-
-    async def clear_mute_marker(self, handle: str) -> bool:
-        """Supersede any muted/spam exploration notes for a handle.
-
-        Returns True if a marker was found and superseded.
-        """
-        user_ns = self.get_user_namespace(handle)
-        try:
-            response = user_ns.query(
-                rank_by=("created_at", "desc"),
-                top_k=5,
-                filters=[
-                    "And",
-                    [
-                        ["kind", "Eq", "exploration_note"],
-                        ["status", "Eq", "active"],
-                        ["tags", "ContainsAll", ["muted"]],
-                    ],
-                ],
-                include_attributes=["content", "tags"],
-            )
-            if not response.rows:
-                return False
-            now = datetime.now().isoformat()
-            for row in response.rows:
-                user_ns.write(
-                    upsert_rows=[
-                        {
-                            "id": row.id,
-                            "vector": row.vector,
-                            "kind": "exploration_note",
-                            "status": "superseded",
-                            "content": row.content,
-                            "tags": getattr(row, "tags", []),
-                            "supersedes": "",
-                            "created_at": getattr(row, "created_at", now),
-                            "updated_at": now,
-                        }
-                    ],
-                    distance_metric="cosine_distance",
-                    schema=USER_NAMESPACE_SCHEMA,
-                )
-            logger.info(f"cleared mute marker for @{handle}")
-            return True
-        except Exception as e:
-            if "was not found" in str(e):
-                return False
-            raise
-
     async def get_knowledge_count(self, handle: str) -> int:
         """Count observations + exploration notes phi has stored about a handle.
···
         """True if phi has fewer than 2 stored knowledge items about this handle."""
         return await self.get_knowledge_count(handle) < 2

-    async def _maybe_enqueue_exploration(self, handle: str):
-        """If we don't know much about this person, queue them for deeper exploration."""
-        if await self.is_stranger(handle):
-            await enqueue_curiosity(
-                kind="explore_handle", subject=handle, source="interaction"
-            )
-
     async def after_interaction(self, handle: str, user_text: str, bot_text: str):
-        """Post-interaction hook: store the raw exchange, maybe queue exploration."""
+        """Post-interaction hook: store the raw exchange."""
         await self.store_interaction(handle, user_text, bot_text)
-        try:
-            await self._maybe_enqueue_exploration(handle)
-        except Exception as e:
-            logger.debug(f"exploration enqueue check failed for @{handle}: {e}")
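The kept is_stranger / get_knowledge_count pair is just a count threshold over stored knowledge, now used only by the stranger-lookup pre-fetch. A minimal runnable sketch of that retained shape (the MemoryStore name here is a stand-in, with a dict in place of the real turbopuffer-backed storage):

```python
import asyncio


class MemoryStore:
    """Sketch of the retained stranger-check path (storage stubbed as a dict)."""

    def __init__(self) -> None:
        # handle -> stored knowledge items (observations, legacy exploration notes)
        self._knowledge: dict[str, list[str]] = {}

    async def get_knowledge_count(self, handle: str) -> int:
        return len(self._knowledge.get(handle, []))

    async def is_stranger(self, handle: str) -> bool:
        # fewer than 2 stored items means phi barely knows this handle
        return await self.get_knowledge_count(handle) < 2


if __name__ == "__main__":
    store = MemoryStore()
    print(asyncio.run(store.is_stranger("alice.example.social")))
```

With no stored items the handle counts as a stranger; the only behavior change in this commit is that crossing the threshold no longer enqueues anything.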
src/bot/services/message_handler.py (-12)

···
         except Exception as e:
             logger.exception(f"original thought failed: {e}")

-    async def explore(self):
-        """Run one exploration from the curiosity queue."""
-        with logfire.span("exploration"):
-            try:
-                stored = await self.agent.process_exploration()
-                if stored:
-                    logger.info(f"exploration: stored {stored} findings")
-                else:
-                    logger.info("exploration: nothing to explore")
-            except Exception as e:
-                logger.warning(f"exploration failed: {e}")
-
     async def check_relays(self):
         """Run a scheduled relay-fleet check and let phi decide whether to post."""
         with logfire.span("relay check"):
src/bot/services/notification_poller.py (-46)

···
         self._last_thought_date: date | None = None
         self._semaphore = asyncio.Semaphore(MAX_CONCURRENT)
         self._background_tasks: set[asyncio.Task] = set()
-        # event-driven exploration state
-        self._explorations_this_hour: int = 0
-        self._exploration_hour: int = -1
-        self._polls_since_last_exploration: int = 0
         # scheduled monitor check state
         self._polls_since_last_monitor_check: int = 0
···
             await self._seed_schedule_from_history()

         while self._running:
-            self._polls_since_last_exploration += 1
             self._polls_since_last_monitor_check += 1

             try:
···
                     task.add_done_callback(self._background_tasks.discard)
             except Exception as e:
                 logger.error(f"thought post error: {e}", exc_info=settings.debug)
-
-            # event-driven exploration — drain queue when idle
-            try:
-                if self._can_explore():
-                    task = asyncio.create_task(self._maybe_explore())
-                    self._background_tasks.add(task)
-                    task.add_done_callback(self._background_tasks.discard)
-            except Exception as e:
-                logger.error(f"exploration error: {e}", exc_info=settings.debug)

             # scheduled infrastructure monitoring
             try:
···
                 await self.handler.original_thought()
             except Exception as e:
                 logger.error(f"thought post error: {e}", exc_info=settings.debug)
-
-    # --- event-driven exploration ---
-
-    def _can_explore(self) -> bool:
-        """Check if exploration should run — idle budget, not a cron."""
-        if bot_status.paused:
-            return False
-        # reset hourly counter
-        now_hour = datetime.now(UTC).hour
-        if now_hour != self._exploration_hour:
-            self._explorations_this_hour = 0
-            self._exploration_hour = now_hour
-        # budget cap
-        if self._explorations_this_hour >= settings.max_idle_explorations_per_hour:
-            return False
-        # cooldown between explorations
-        if self._polls_since_last_exploration < settings.exploration_cooldown_polls:
-            return False
-        # don't explore while any background work is in-flight
-        if len(self._background_tasks) > 0:
-            return False
-        return True
-
-    async def _maybe_explore(self):
-        """Drain one item from the curiosity queue."""
-        self._explorations_this_hour += 1
-        self._polls_since_last_exploration = 0
-        logger.info("triggering idle exploration")
-        try:
-            await self.handler.explore()
-        except Exception as e:
-            logger.error(f"exploration error: {e}", exc_info=settings.debug)

     # --- scheduled monitor checks ---
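The removed _can_explore gate combined four conditions: a pause switch, an hourly budget, a poll cooldown, and an in-flight-work check. For reference, a standalone sketch of that state machine (IdleBudget and the two constants are stand-ins for the poller's instance state and the removed settings; the current hour is passed in rather than read from the clock):

```python
from dataclasses import dataclass

MAX_IDLE_EXPLORATIONS_PER_HOUR = 3  # stand-in for the removed setting
EXPLORATION_COOLDOWN_POLLS = 10     # stand-in for the removed setting


@dataclass
class IdleBudget:
    """Sketch of the removed _can_explore gate: hourly cap plus poll cooldown."""

    explorations_this_hour: int = 0
    exploration_hour: int = -1
    polls_since_last: int = 0

    def can_explore(self, *, paused: bool, inflight_tasks: int, now_hour: int) -> bool:
        if paused:
            return False
        if now_hour != self.exploration_hour:  # hour rolled over: reset the budget
            self.explorations_this_hour = 0
            self.exploration_hour = now_hour
        if self.explorations_this_hour >= MAX_IDLE_EXPLORATIONS_PER_HOUR:
            return False
        if self.polls_since_last < EXPLORATION_COOLDOWN_POLLS:
            return False
        # only explore when no background work is in flight
        return inflight_tasks == 0
```

Since goals carry durable intent on the PDS and likes gate approval, none of this scheduling state is needed anymore; the whole gate collapses into the goals abstraction.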