a digital entity named phi that roams bsky phi.zzstoatzz.io

add feed consumption, following, and owner-gating

- read_timeline, read_feed, follow_user tools in agent.py
- owner-gate create_feed and follow_user (via settings.owner_handle)
- BotClient methods for timeline, feed, follow, get_following
- 6 new evals for feed consumption + permission checks
- fix test_graze_tools_registered (deprecated pydantic-ai API)
- include previously uncommitted graze client + feed creation evals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

+1105 -17
+26 -4
evals/README.md
··· 8 8 9 9 ``` 10 10 evals/ 11 - ├── conftest.py # Test fixtures and evaluator 12 - ├── test_basic_responses.py # Basic response behavior 13 - └── test_memory_integration.py # Episodic memory tests 11 + ├── conftest.py # Test fixtures, evaluator, and ToolCallSpy 12 + ├── test_basic_responses.py # Basic response behavior 13 + ├── test_feed_creation.py # Graze feed tool usage 14 + ├── test_feed_consumption.py # Feed reading, following, and owner-gating 15 + └── test_memory_integration.py # Episodic memory tests 14 16 ``` 15 17 16 18 ## Running Evals ··· 27 29 28 30 # Run only memory tests 29 31 uv run pytest evals/test_memory_integration.py -v 32 + 33 + # Run only feed creation tests 34 + uv run pytest evals/test_feed_creation.py -v 30 35 ``` 31 36 32 37 ## Environment Variables ··· 91 96 - ✅ Conversation storage 92 97 - ✅ User-specific context 93 98 99 + ### Feed Creation (graze) 100 + - ✅ Creates feed from natural language description 101 + - ✅ Manifest uses valid graze DSL operators 102 + - ✅ Handles complex/ambiguous descriptions (e.g. "rust programming, not the game") 103 + - ✅ Lists feeds when asked (calls `list_feeds`, not `create_feed`) 104 + - ✅ No tool calls for informational questions about feeds 105 + 106 + ### Feed Consumption & Following 107 + - ✅ Reads timeline when asked 108 + - ✅ Reads specific custom feed by name (via list_feeds → read_feed) 109 + - ✅ Owner can ask phi to follow users 110 + - ✅ Non-owner follow requests are denied 111 + - ✅ Non-owner feed creation requests are denied 112 + - ✅ Empty timeline suggests following accounts 113 + 94 114 ## Adding New Evals 95 115 96 116 1. 
Create test file: `evals/test_<category>.py` ··· 118 138 - tests skip automatically when required api keys are missing 119 139 - basic response tests require only `ANTHROPIC_API_KEY` and bluesky credentials 120 140 - memory tests require `TURBOPUFFER_API_KEY` and `OPENAI_API_KEY` 121 - - no mocking required - tests work with real mcp server and episodic memory 141 + - feed creation tests require only `ANTHROPIC_API_KEY` (tools are mocked via `ToolCallSpy`) 142 + - feed consumption tests require only `ANTHROPIC_API_KEY` (tools are mocked via `ToolCallSpy`) 143 + - no mocking required for basic/memory tests - they work with real mcp server and episodic memory 122 144 123 145 this ensures phi's behavior can be validated in various environments.
+291 -5
evals/conftest.py
···
 """Eval test configuration."""

 import os
+from collections import defaultdict
 from collections.abc import Awaitable, Callable
 from pathlib import Path

 import pytest
 from pydantic import BaseModel
-from pydantic_ai import Agent
+from pydantic_ai import Agent, RunContext

 from bot.agent import Response
 from bot.config import Settings
 from bot.memory import NamespaceMemory

+# feed tool instructions — extracted from OPERATIONAL_INSTRUCTIONS to avoid
+# the full agent import requiring bluesky creds at module level.
+_FEED_INSTRUCTIONS = """
+you can create and manage bluesky feeds via graze:
+- create_feed: build a custom feed from keyword patterns and hashtag filters. translate natural language descriptions into the graze filter DSL.
+- list_feeds: see your existing graze-powered feeds.
+""".strip()
+
+_FEED_CONSUMPTION_INSTRUCTIONS = """
+feeds — you can create and read bluesky feeds:
+- read_timeline: your "following" feed — what people you follow are posting. anyone can ask you to check this.
+- read_feed: read posts from a specific custom feed by URI. use list_feeds to get URIs.
+- create_feed: build a custom feed from keyword patterns and hashtag filters. OWNER-ONLY (restricted to @zzstoatzz.io).
+- list_feeds: see your existing graze-powered feeds.
+- follow_user: follow a user on bluesky. OWNER-ONLY (restricted to @zzstoatzz.io).
+""".strip()
+
+OWNER_HANDLE = "zzstoatzz.io"
+
+CANNED_TIMELINE_POSTS = (
+    "@alice.bsky.social (12 likes, 2d ago): just shipped a new rust crate for async signal handling\n\n"
+    "@bob.test (3 likes, today): morning coffee thoughts — the fediverse keeps getting more interesting\n\n"
+    "@carol.dev (8 likes, 1d ago): wrote a thread on why I switched from typescript to gleam"
+)
+
+CANNED_EMPTY_TIMELINE = (
+    "your timeline is empty — you're not following anyone yet. "
+    "ask @zzstoatzz.io to have me follow some accounts!"
+)
+

 class EvaluationResult(BaseModel):
     passed: bool
     explanation: str


+class ToolCallSpy:
+    """Captures tool calls for assertion in evals."""
+
+    def __init__(self):
+        self.calls: dict[str, list[dict]] = defaultdict(list)
+
+    def record(self, tool_name: str, **kwargs):
+        self.calls[tool_name].append(kwargs)
+
+    def was_called(self, name: str) -> bool:
+        return len(self.calls[name]) > 0
+
+    def get_calls(self, name: str) -> list[dict]:
+        return self.calls[name]
+
+    def reset(self):
+        self.calls.clear()
+
+
 @pytest.fixture(scope="session")
 def settings():
     return Settings()
···

             self.agent = Agent[dict, Response](
                 name="phi",
-                model="anthropic:claude-3-5-haiku-latest",
+                model="anthropic:claude-haiku-4-5-20251001",
                 system_prompt=personality,
                 output_type=Response,
                 deps_type=dict,
             )

-        async def process_mention(self, mention_text: str, author_handle: str, thread_context: str, thread_uri: str | None = None) -> Response:
+        async def process_mention(
+            self,
+            mention_text: str,
+            author_handle: str,
+            thread_context: str,
+            thread_uri: str | None = None,
+        ) -> Response:
             memory_context = ""
             if self.memory:
                 try:
-                    memory_context = await self.memory.build_user_context(author_handle, query_text=mention_text, include_core=True)
+                    memory_context = await self.memory.build_user_context(
+                        author_handle, query_text=mention_text, include_core=True
+                    )
                 except Exception:
                     pass

···
                 parts.append(memory_context)
             parts.append(f"\nNew message from @{author_handle}: {mention_text}")

-            result = await self.agent.run("\n\n".join(parts), deps={"thread_uri": thread_uri})
+            result = await self.agent.run(
+                "\n\n".join(parts), deps={"thread_uri": thread_uri}
+            )
             return result.output

     return TestAgent()
+
+
+# --- feed agent with mocked graze tools ---
+
+_feed_spy = ToolCallSpy()
+
+CANNED_FEEDS = [
+    {
+        "display_name": "Jazz Vibes",
+        "id": 42,
+        "feed_uri": "at://did:plc:test/app.bsky.feed.generator/jazz-vibes",
+    },
+    {
+        "display_name": "Rust Lang",
+        "id": 99,
+        "feed_uri": "at://did:plc:test/app.bsky.feed.generator/rust-lang",
+    },
+]
+
+
+@pytest.fixture(scope="session")
+def feed_agent(settings):
+    """Test agent with mocked graze feed tools."""
+    if not settings.anthropic_api_key:
+        pytest.skip("Requires ANTHROPIC_API_KEY")
+
+    if settings.anthropic_api_key and not os.environ.get("ANTHROPIC_API_KEY"):
+        os.environ["ANTHROPIC_API_KEY"] = settings.anthropic_api_key
+
+    personality = Path(settings.personality_file).read_text()
+
+    agent = Agent[dict, Response](
+        name="phi",
+        model="anthropic:claude-haiku-4-5-20251001",
+        system_prompt=f"{personality}\n\n{_FEED_INSTRUCTIONS}",
+        output_type=Response,
+        deps_type=dict,
+    )
+
+    @agent.tool
+    async def create_feed(
+        ctx: RunContext[dict],
+        name: str,
+        display_name: str,
+        description: str,
+        filter_manifest: dict,
+    ) -> str:
+        """Create a new bluesky feed powered by graze.
+
+        name: url-safe slug (e.g. "electronic-music"). becomes the feed rkey.
+        display_name: human-readable feed title.
+        description: what the feed shows.
+        filter_manifest: graze filter DSL. operators:
+        - regex_any: ["text", ["pattern1", "pattern2"], case_insensitive: bool, whole_word: bool]
+        - has_any_tag: [["#tag1", "#tag2"]]
+        - and: [...filters], or: [...filters]
+        example: {"filter": {"and": [{"regex_any": ["text", ["jazz", "bebop"], true, false]}, {"has_any_tag": [["#jazz"]]}]}}
+        """
+        _feed_spy.record(
+            "create_feed",
+            name=name,
+            display_name=display_name,
+            description=description,
+            filter_manifest=filter_manifest,
+        )
+        return f"feed created: at://did:plc:test/app.bsky.feed.generator/{name} (algo_id=1)"
+
+    @agent.tool
+    async def list_feeds(ctx: RunContext[dict]) -> str:
+        """List your existing graze-powered feeds."""
+        _feed_spy.record("list_feeds")
+        lines = []
+        for f in CANNED_FEEDS:
+            lines.append(f"- {f['display_name']} (id={f['id']}) {f['feed_uri']}")
+        return "\n".join(lines)
+
+    class FeedTestAgent:
+        def __init__(self):
+            self.agent = agent
+            self.spy = _feed_spy
+
+        async def process_mention(
+            self, mention_text: str, author_handle: str = "test.user"
+        ) -> Response:
+            prompt = f"\nNew message from @{author_handle}: {mention_text}"
+            result = await self.agent.run(prompt, deps={})
+            return result.output
+
+    return FeedTestAgent()
+
+
+_consumer_spy = ToolCallSpy()
+
+
+@pytest.fixture(scope="session")
+def feed_consumer_agent(settings):
+    """Test agent with mocked feed consumption, following, and owner-gated tools."""
+    if not settings.anthropic_api_key:
+        pytest.skip("Requires ANTHROPIC_API_KEY")
+
+    if settings.anthropic_api_key and not os.environ.get("ANTHROPIC_API_KEY"):
+        os.environ["ANTHROPIC_API_KEY"] = settings.anthropic_api_key
+
+    personality = Path(settings.personality_file).read_text()
+
+    agent = Agent[dict, Response](
+        name="phi",
+        model="anthropic:claude-haiku-4-5-20251001",
+        system_prompt=f"{personality}\n\n{_FEED_CONSUMPTION_INSTRUCTIONS}",
+        output_type=Response,
+        deps_type=dict,
+    )
+
+    @agent.tool
+    async def read_timeline(ctx: RunContext[dict], limit: int = 20) -> str:
+        """Read your 'following' timeline — posts from accounts you follow."""
+        _consumer_spy.record("read_timeline", limit=limit)
+        return CANNED_TIMELINE_POSTS
+
+    @agent.tool
+    async def read_feed(ctx: RunContext[dict], feed_uri: str, limit: int = 20) -> str:
+        """Read posts from a specific custom feed by AT-URI. Use list_feeds to find feed URIs first."""
+        _consumer_spy.record("read_feed", feed_uri=feed_uri, limit=limit)
+        return CANNED_TIMELINE_POSTS
+
+    @agent.tool
+    async def follow_user(ctx: RunContext[dict], handle: str) -> str:
+        """Follow a user on bluesky. Only the bot's owner can use this tool."""
+        _consumer_spy.record("follow_user", handle=handle)
+        author = ctx.deps.get("author_handle", "")
+        if author != OWNER_HANDLE:
+            return f"only @{OWNER_HANDLE} can ask me to follow people"
+        return f"now following @{handle} (at://did:plc:test/app.bsky.graph.follow/abc)"
+
+    @agent.tool
+    async def create_feed(
+        ctx: RunContext[dict],
+        name: str,
+        display_name: str,
+        description: str,
+        filter_manifest: dict,
+    ) -> str:
+        """Create a new bluesky feed powered by graze. Only the bot's owner can use this tool."""
+        _consumer_spy.record("create_feed", name=name)
+        author = ctx.deps.get("author_handle", "")
+        if author != OWNER_HANDLE:
+            return f"only @{OWNER_HANDLE} can create feeds"
+        return f"feed created: at://did:plc:test/app.bsky.feed.generator/{name} (algo_id=1)"
+
+    @agent.tool
+    async def list_feeds(ctx: RunContext[dict]) -> str:
+        """List your existing graze-powered feeds."""
+        _consumer_spy.record("list_feeds")
+        lines = []
+        for f in CANNED_FEEDS:
+            lines.append(f"- {f['display_name']} (id={f['id']}) {f['feed_uri']}")
+        return "\n".join(lines)
+
+    class FeedConsumerTestAgent:
+        def __init__(self):
+            self.agent = agent
+            self.spy = _consumer_spy
+
+        async def process_mention(
+            self, mention_text: str, author_handle: str = "test.user"
+        ) -> Response:
+            prompt = f"\nNew message from @{author_handle}: {mention_text}"
+            result = await self.agent.run(prompt, deps={"author_handle": author_handle})
+            return result.output
+
+    return FeedConsumerTestAgent()
+
+
+@pytest.fixture(scope="session")
+def feed_consumer_agent_empty(settings):
+    """Test agent where read_timeline returns the empty-timeline message."""
+    if not settings.anthropic_api_key:
+        pytest.skip("Requires ANTHROPIC_API_KEY")
+
+    if settings.anthropic_api_key and not os.environ.get("ANTHROPIC_API_KEY"):
+        os.environ["ANTHROPIC_API_KEY"] = settings.anthropic_api_key
+
+    personality = Path(settings.personality_file).read_text()
+
+    agent = Agent[dict, Response](
+        name="phi",
+        model="anthropic:claude-haiku-4-5-20251001",
+        system_prompt=f"{personality}\n\n{_FEED_CONSUMPTION_INSTRUCTIONS}",
+        output_type=Response,
+        deps_type=dict,
+    )
+
+    _empty_spy = ToolCallSpy()
+
+    @agent.tool
+    async def read_timeline(ctx: RunContext[dict], limit: int = 20) -> str:
+        """Read your 'following' timeline — posts from accounts you follow."""
+        _empty_spy.record("read_timeline", limit=limit)
+        return CANNED_EMPTY_TIMELINE
+
+    @agent.tool
+    async def list_feeds(ctx: RunContext[dict]) -> str:
+        """List your existing graze-powered feeds."""
+        _empty_spy.record("list_feeds")
+        return "no graze feeds found"
+
+    class EmptyConsumerTestAgent:
+        def __init__(self):
+            self.agent = agent
+            self.spy = _empty_spy
+
+        async def process_mention(
+            self, mention_text: str, author_handle: str = "test.user"
+        ) -> Response:
+            prompt = f"\nNew message from @{author_handle}: {mention_text}"
+            result = await self.agent.run(prompt, deps={"author_handle": author_handle})
+            return result.output
+
+    return EmptyConsumerTestAgent()
+
+
+@pytest.fixture(autouse=True)
+def _reset_feed_spy():
+    """Reset the tool call spies before each test."""
+    _feed_spy.reset()
+    _consumer_spy.reset()


 @pytest.fixture
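Review note: the spy pattern these fixtures rely on can be exercised without an LLM or pytest. The sketch below repeats the `ToolCallSpy` class from this diff; `fake_follow_user` is a hypothetical stand-in for a mocked tool and is not part of the commit.

```python
from collections import defaultdict


class ToolCallSpy:
    """Captures tool calls for assertion in evals (same shape as in conftest.py)."""

    def __init__(self) -> None:
        self.calls: dict[str, list[dict]] = defaultdict(list)

    def record(self, tool_name: str, **kwargs) -> None:
        self.calls[tool_name].append(kwargs)

    def was_called(self, name: str) -> bool:
        return len(self.calls[name]) > 0

    def get_calls(self, name: str) -> list[dict]:
        return self.calls[name]

    def reset(self) -> None:
        self.calls.clear()


spy = ToolCallSpy()


def fake_follow_user(handle: str) -> str:
    # hypothetical stand-in for a mocked agent tool: record the call, return a canned reply
    spy.record("follow_user", handle=handle)
    return f"now following @{handle}"


fake_follow_user("alice.bsky.social")
```

One thing to watch: `_reset_feed_spy` resets only `_feed_spy` and `_consumer_spy`. The spy created inside the session-scoped `feed_consumer_agent_empty` fixture is never reset, so its recorded calls would accumulate if a second test started using that fixture.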
+96
evals/test_feed_consumption.py
···
+"""Evals for feed consumption, following, and owner-gating."""
+
+from conftest import OWNER_HANDLE
+
+
+async def test_reads_timeline_when_asked(feed_consumer_agent):
+    """Agent should call read_timeline when asked about its feed."""
+    response = await feed_consumer_agent.process_mention(
+        "what's on your timeline?", author_handle=OWNER_HANDLE
+    )
+
+    spy = feed_consumer_agent.spy
+    assert spy.was_called("read_timeline"), "read_timeline was not called"
+    assert response.action == "reply", f"expected reply, got {response.action}"
+    assert response.text is not None
+
+
+async def test_reads_specific_feed(feed_consumer_agent):
+    """Agent should use read_feed or list_feeds when asked about a specific feed."""
+    response = await feed_consumer_agent.process_mention(
+        "what's in your jazz vibes feed?", author_handle=OWNER_HANDLE
+    )
+
+    spy = feed_consumer_agent.spy
+    assert spy.was_called("read_feed") or spy.was_called("list_feeds"), (
+        "neither read_feed nor list_feeds was called"
+    )
+    assert response.action == "reply", f"expected reply, got {response.action}"
+
+
+async def test_follow_allowed_for_owner(feed_consumer_agent):
+    """Owner should be able to ask phi to follow someone."""
+    response = await feed_consumer_agent.process_mention(
+        "follow @interesting.person", author_handle=OWNER_HANDLE
+    )
+
+    spy = feed_consumer_agent.spy
+    assert spy.was_called("follow_user"), "follow_user was not called"
+    call = spy.get_calls("follow_user")[0]
+    assert "interesting.person" in call["handle"]
+    assert response.action == "reply", f"expected reply, got {response.action}"
+
+
+async def test_follow_denied_for_non_owner(feed_consumer_agent):
+    """Non-owner should be denied when asking phi to follow someone."""
+    response = await feed_consumer_agent.process_mention(
+        "follow @someone.else", author_handle="random.user"
+    )
+
+    assert response.action == "reply", f"expected reply, got {response.action}"
+    assert response.text is not None
+    # either the tool was called and returned a denial, or the agent knew not to call it
+    spy = feed_consumer_agent.spy
+    if spy.was_called("follow_user"):
+        # tool was called but should have returned denial
+        assert (
+            "only" in response.text.lower()
+            or "can't" in response.text.lower()
+            or "owner" in response.text.lower()
+        ), f"response should indicate denial: {response.text}"
+
+
+async def test_create_feed_denied_for_non_owner(feed_consumer_agent):
+    """Non-owner should be denied when asking phi to create a feed."""
+    response = await feed_consumer_agent.process_mention(
+        "create a feed for cooking recipes", author_handle="random.user"
+    )
+
+    spy = feed_consumer_agent.spy
+    if response.action == "reply":
+        # replied with a denial — check the text mentions restriction
+        assert response.text is not None
+        if spy.was_called("create_feed"):
+            text_lower = response.text.lower()
+            assert (
+                "only" in text_lower or "can't" in text_lower or "owner" in text_lower
+            ), f"response should indicate denial: {response.text}"
+    else:
+        # ignored the request entirely — also acceptable for a non-owner
+        assert response.action == "ignore", f"unexpected action: {response.action}"
+
+
+async def test_empty_timeline_suggests_following(feed_consumer_agent_empty):
+    """Empty timeline should return a message suggesting phi follow accounts."""
+    response = await feed_consumer_agent_empty.process_mention(
+        "what's on your timeline?", author_handle=OWNER_HANDLE
+    )
+
+    spy = feed_consumer_agent_empty.spy
+    assert spy.was_called("read_timeline"), "read_timeline was not called"
+    assert response.action == "reply", f"expected reply, got {response.action}"
+    assert response.text is not None
+    text_lower = response.text.lower()
+    assert "follow" in text_lower or "empty" in text_lower or "no one" in text_lower, (
+        f"response should mention following or empty timeline: {response.text}"
+    )
+127
evals/test_feed_creation.py
···
+"""Evals for graze feed creation — does the agent translate natural language into valid filter manifests?"""
+
+import json
+
+
+def _has_filter_key(manifest: dict) -> bool:
+    """Check that the manifest has a top-level 'filter' key."""
+    return "filter" in manifest
+
+
+KNOWN_OPERATORS = {"regex_any", "has_any_tag", "and", "or"}
+
+
+def _uses_known_operators(obj: dict | list) -> bool:
+    """Recursively check that all operator keys are from the known set."""
+    if isinstance(obj, list):
+        return all(
+            _uses_known_operators(item) for item in obj if isinstance(item, dict)
+        )
+    if isinstance(obj, dict):
+        for key, val in obj.items():
+            if key in KNOWN_OPERATORS:
+                if isinstance(val, dict | list):
+                    if not _uses_known_operators(val):
+                        return False
+            elif key not in ("filter",):
+                return False
+        return True
+    return True
+
+
+async def test_creates_feed_from_description(feed_agent, evaluate_response):
+    """Agent should call create_feed with a jazz-related manifest."""
+    response = await feed_agent.process_mention(
+        "create a feed for posts about jazz music"
+    )
+
+    assert response.action == "reply", f"expected reply, got {response.action}"
+
+    spy = feed_agent.spy
+    assert spy.was_called("create_feed"), "create_feed was not called"
+    assert not spy.was_called("list_feeds"), "list_feeds should not be called"
+
+    call = spy.get_calls("create_feed")[0]
+    manifest = call["filter_manifest"]
+    assert _has_filter_key(manifest), (
+        f"manifest missing 'filter' key: {json.dumps(manifest)}"
+    )
+
+    await evaluate_response(
+        "The filter manifest should contain patterns related to jazz music "
+        "(e.g. 'jazz', 'bebop', 'improvisation', '#jazz'). "
+        "Does it capture the user's intent to find jazz-related posts?",
+        json.dumps(manifest),
+    )
+
+
+async def test_manifest_uses_valid_dsl(feed_agent):
+    """Manifest should only use known graze DSL operators."""
+    await feed_agent.process_mention("make me a feed for machine learning posts")
+
+    spy = feed_agent.spy
+    assert spy.was_called("create_feed"), "create_feed was not called"
+
+    call = spy.get_calls("create_feed")[0]
+    manifest = call["filter_manifest"]
+    assert _has_filter_key(manifest), (
+        f"manifest missing 'filter' key: {json.dumps(manifest)}"
+    )
+    assert _uses_known_operators(manifest), (
+        f"manifest uses unknown operators: {json.dumps(manifest)}"
+    )
+
+
+async def test_complex_description(feed_agent, evaluate_response):
+    """Agent should disambiguate 'rust' (programming language vs game)."""
+    response = await feed_agent.process_mention(
+        "create a feed for rust programming, not the game"
+    )
+
+    assert response.action == "reply", f"expected reply, got {response.action}"
+
+    spy = feed_agent.spy
+    assert spy.was_called("create_feed"), "create_feed was not called"
+
+    call = spy.get_calls("create_feed")[0]
+    manifest = call["filter_manifest"]
+    assert _has_filter_key(manifest), (
+        f"manifest missing 'filter' key: {json.dumps(manifest)}"
+    )
+
+    await evaluate_response(
+        "The filter manifest should make a reasonable attempt to target rust "
+        "programming language content rather than the video game. It passes if "
+        "it includes ANY rust-programming-specific terms (e.g. 'rustlang', "
+        "'cargo', 'crate', '#rustlang', 'systems programming', 'compiler'). "
+        "It does NOT need to be perfect — partial disambiguation is fine.",
+        json.dumps(manifest),
+    )
+
+
+async def test_list_feeds_when_asked(feed_agent):
+    """Asking about existing feeds should call list_feeds, not create_feed."""
+    response = await feed_agent.process_mention("what feeds do you have?")
+
+    spy = feed_agent.spy
+    assert spy.was_called("list_feeds"), "list_feeds was not called"
+    assert not spy.was_called("create_feed"), "create_feed should not be called"
+
+    assert response.action == "reply", f"expected reply, got {response.action}"
+    assert response.text is not None
+    assert "jazz" in response.text.lower() or "rust" in response.text.lower(), (
+        f"response should mention canned feeds: {response.text}"
+    )
+
+
+async def test_no_feed_creation_without_request(feed_agent):
+    """Informational question about feeds should not trigger any feed tools."""
+    await feed_agent.process_mention("what is a bluesky feed?")
+
+    spy = feed_agent.spy
+    assert not spy.was_called("create_feed"), (
+        "create_feed should not be called for an informational question"
+    )
+    assert not spy.was_called("list_feeds"), (
+        "list_feeds should not be called for an informational question"
+    )
+1 -1
loq.toml
···

 [[rules]]
 path = "src/bot/agent.py"
-max_lines = 691
+max_lines = 826

 [[rules]]
 path = "src/bot/memory/namespace_memory.py"
+2 -2
pyproject.toml
···
 asyncio_default_fixture_loop_scope = "function"
 filterwarnings = [
     "ignore::logfire._internal.config.LogfireNotConfiguredWarning",
-    "ignore::DeprecationWarning:abc",
-    "ignore::DeprecationWarning:opentelemetry",
+    # upstream otel 1.39 renamed EventLogger → Logger; pydantic-ai + otel internals still use old names
+    "ignore:.*Deprecated since version 1\\.39\\.0:DeprecationWarning",
 ]

 [dependency-groups]
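Review note: the new `filterwarnings` entry uses the `action:message:category` form, where the message part is a regular expression matched from the start of the warning text (hence the leading `.*`). The sketch below applies an equivalent filter with the standard `warnings` module; `count_unfiltered` is a hypothetical helper and the sample messages are made up for illustration.

```python
import warnings


def count_unfiltered(messages: list[str]) -> int:
    """Emit each message as a DeprecationWarning and count those not ignored."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # same pattern as the pyproject entry, with TOML's "\\." written as a raw-string "\."
        warnings.filterwarnings(
            "ignore",
            message=r".*Deprecated since version 1\.39\.0",
            category=DeprecationWarning,
        )
        for msg in messages:
            warnings.warn(msg, DeprecationWarning)
        return len(caught)


remaining = count_unfiltered(
    [
        "EventLogger: Deprecated since version 1.39.0",  # made-up text, matches the filter
        "some_other_api is deprecated",  # made-up text, does not match
    ]
)
```

Unlike the two removed entries, which silenced every `DeprecationWarning` raised from the `abc` and `opentelemetry` modules, this filter only hides warnings whose text carries the 1.39.0 marker.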
+139 -4
src/bot/agent.py
···

 from bot.config import settings
 from bot.core.atproto_client import bot_client
+from bot.core.graze_client import GrazeClient
 from bot.memory import NamespaceMemory
 from bot.types import (
     CosmikConnection,
···

 logger = logging.getLogger("bot.agent")

-# Operational instructions kept separate from personality — these are
-# system-level guardrails that change when tools/architecture change.
-OPERATIONAL_INSTRUCTIONS = """
+
+def _build_operational_instructions() -> str:
+    """Build operational instructions with the current owner handle interpolated."""
+    return f"""
 indicate your response action via the structured output — do not use atproto tools to post, like, or repost directly.
 when sharing URLs, verify them with check_urls first and always include https://.

···
 - pub-search (MCP): long-form writing — leaflet, whitewind, etc.

 you can also create public records — notes (cosmik cards), bookmarks (URL cards), and connections. these are visible to anyone and indexed by semble.
+
+feeds — you can create and read bluesky feeds:
+- read_timeline: your "following" feed — what people you follow are posting. anyone can ask you to check this.
+- read_feed: read posts from a specific custom feed by URI. use list_feeds to get URIs.
+- create_feed: build a custom feed from keyword patterns and hashtag filters. OWNER-ONLY (restricted to @{settings.owner_handle}).
+- list_feeds: see your existing graze-powered feeds.
+- follow_user: follow a user on bluesky. OWNER-ONLY (restricted to @{settings.owner_handle}).
 """.strip()


···
     thread_uri: str | None = None


+def _is_owner(ctx: RunContext[PhiDeps]) -> bool:
+    """Check if the current message author is the bot's owner."""
+    return ctx.deps.author_handle == settings.owner_handle
+
+
+def _format_feed_posts(feed_posts, limit: int = 20) -> str:
+    """Format feed posts into a readable summary."""
+    today = date.today()
+    lines = []
+    for item in feed_posts[:limit]:
+        post = item.post
+        text = post.record.text if hasattr(post.record, "text") else ""
+        handle = post.author.handle
+        likes = post.like_count or 0
+        age = (
+            _relative_age(post.indexed_at, today)
+            if hasattr(post, "indexed_at") and post.indexed_at
+            else ""
+        )
+        age_str = f", {age}" if age else ""
+        lines.append(f"@{handle} ({likes} likes{age_str}): {text[:200]}")
+    return "\n\n".join(lines)
+
+
 class Response(BaseModel):
     """Agent response indicating what action to take."""

···
         self.agent = Agent[PhiDeps, Response](
             name="phi",
             model="anthropic:claude-sonnet-4-6",
-            system_prompt=f"{self.base_personality}\n\n{OPERATIONAL_INSTRUCTIONS}",
+            system_prompt=f"{self.base_personality}\n\n{_build_operational_instructions()}",
             output_type=Response,
             deps_type=PhiDeps,
         )
···
             async with httpx.AsyncClient(timeout=10) as client:
                 results = await asyncio.gather(*[_check(client, u) for u in urls])
             return "\n".join(results)
+
+        # --- graze feed tools ---
+
+        self.graze_client = GrazeClient(
+            handle=settings.bluesky_handle, password=settings.bluesky_password
+        )
+
+        @self.agent.tool
+        async def create_feed(
+            ctx: RunContext[PhiDeps],
+            name: str,
+            display_name: str,
+            description: str,
+            filter_manifest: dict,
+        ) -> str:
+            """Create a new bluesky feed powered by graze. Only the bot's owner can use this tool.
+
+            name: url-safe slug (e.g. "electronic-music"). becomes the feed rkey.
+            display_name: human-readable feed title.
+            description: what the feed shows.
+            filter_manifest: graze filter DSL. operators:
+            - regex_any: ["text", ["pattern1", "pattern2"], case_insensitive: bool, whole_word: bool]
+            - has_any_tag: [["#tag1", "#tag2"]]
+            - and: [...filters], or: [...filters]
+            example: {"filter": {"and": [{"regex_any": ["text", ["jazz", "bebop"], true, false]}, {"has_any_tag": [["#jazz"]]}]}}
+            """
+            if not _is_owner(ctx):
+                return f"only @{settings.owner_handle} can create feeds"
+            try:
+                result = await self.graze_client.create_feed(
+                    rkey=name,
+                    display_name=display_name,
+                    description=description,
+                    filter_manifest=filter_manifest,
+                )
+                return f"feed created: {result['uri']} (algo_id={result['algo_id']})"
+            except Exception as e:
+                logger.warning(f"create_feed failed: {e}")
+                return f"failed to create feed: {e}"
+
+        @self.agent.tool
+        async def list_feeds(ctx: RunContext[PhiDeps]) -> str:
+            """List your existing graze-powered feeds."""
+            try:
+                feeds = await self.graze_client.list_feeds()
+                if not feeds:
+                    return "no graze feeds found"
+                lines = []
+                for f in feeds:
+                    name = f.get("display_name") or f.get("name") or "unnamed"
+                    algo_id = f.get("id") or f.get("algo_id") or "?"
+                    uri = f.get("feed_uri") or f.get("uri") or ""
+                    lines.append(f"- {name} (id={algo_id}) {uri}")
+                return "\n".join(lines)
+            except Exception as e:
+                logger.warning(f"list_feeds failed: {e}")
+                return f"failed to list feeds: {e}"
+
+        # --- feed consumption + following tools ---
+
+        @self.agent.tool
+        async def read_timeline(ctx: RunContext[PhiDeps], limit: int = 20) -> str:
+            """Read your 'following' timeline — posts from accounts you follow. Use this when someone asks what's on your feed or what people you follow are talking about."""
+            try:
+                response = await bot_client.get_timeline(limit=limit)
+                if not response.feed:
+                    return (
+                        "your timeline is empty — you're not following anyone yet. "
+                        f"ask @{settings.owner_handle} to have me follow some accounts!"
+                    )
+                return _format_feed_posts(response.feed, limit=limit)
+            except Exception as e:
+                return f"failed to read timeline: {e}"
+
+        @self.agent.tool
+        async def read_feed(
+            ctx: RunContext[PhiDeps], feed_uri: str, limit: int = 20
+        ) -> str:
+            """Read posts from a specific custom feed by AT-URI. Use list_feeds to find feed URIs first."""
+            try:
+                response = await bot_client.get_feed(feed_uri, limit=limit)
+                if not response.feed:
+                    return "no posts in this feed yet"
+                return _format_feed_posts(response.feed, limit=limit)
+            except Exception as e:
+                return f"failed to read feed: {e}"
+
+        @self.agent.tool
+        async def follow_user(ctx: RunContext[PhiDeps], handle: str) -> str:
+            """Follow a user on bluesky. Only the bot's owner can use this tool."""
+            if not _is_owner(ctx):
+                return f"only @{settings.owner_handle} can ask me to follow people"
+            try:
+                # check if already following
+                following = await bot_client.get_following()
+                for f in following.follows:
+                    if f.handle == handle:
+                        return f"already following @{handle}"
+                uri = await bot_client.follow_user(handle)
+                return f"now following @{handle} ({uri})"
+            except Exception as e:
+                return f"failed to follow @{handle}: {e}"

         logger.info("phi agent initialized with pdsx + pub-search mcp tools")

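Review note: `follow_user` compares `f.handle == handle` verbatim, so a handle passed by the model as `@interesting.person`, or with different casing, would slip past the already-following check (the owner eval only asserts that `"interesting.person"` is contained in the argument, which suggests a leading `@` is possible). A normalization step is sketched below; `normalize_handle` and `already_following` are hypothetical helpers, not part of this commit, and the sketch assumes handles compare case-insensitively.

```python
from collections.abc import Iterable


def normalize_handle(handle: str) -> str:
    """Strip surrounding whitespace and a leading '@', then lowercase."""
    return handle.strip().removeprefix("@").lower()


def already_following(handle: str, followed_handles: Iterable[str]) -> bool:
    """Check membership after normalizing both sides."""
    target = normalize_handle(handle)
    return any(normalize_handle(h) == target for h in followed_handles)
```

The same normalization could arguably be applied in `_is_owner`, which also relies on an exact string comparison against `settings.owner_handle`.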
+6
src/bot/config.py
···
         default=None, description="Bearer token for /api/control endpoints"
     )

+    # Owner identity (for permission-gated tools)
+    owner_handle: str = Field(
+        default="zzstoatzz.io",
+        description="Handle of the bot's owner (for permission-gated tools)",
+    )
+
     # Debug mode
     debug: bool = Field(default=True, description="Whether to run in debug mode")
+26
src/bot/core/atproto_client.py
···
         )
         return response.feed

+    async def get_timeline(self, limit: int = 25):
+        """Fetch the 'following' timeline feed."""
+        await self.authenticate()
+        return self.client.app.bsky.feed.get_timeline(params={"limit": limit})
+
+    async def get_feed(self, feed_uri: str, limit: int = 25):
+        """Fetch posts from a custom feed by AT-URI."""
+        await self.authenticate()
+        return self.client.app.bsky.feed.get_feed(
+            params={"feed": feed_uri, "limit": limit}
+        )
+
+    async def follow_user(self, handle: str) -> str:
+        """Resolve handle to DID and create a follow record. Returns the record URI."""
+        await self.authenticate()
+        resolved = self.client.resolve_handle(handle)
+        response = self.client.follow(resolved.did)
+        return response.uri
+
+    async def get_following(self, limit: int = 100):
+        """Get accounts the bot is following."""
+        await self.authenticate()
+        return self.client.app.bsky.graph.get_follows(
+            params={"actor": self.client.me.did, "limit": limit}
+        )
+

 bot_client: BotClient = BotClient()
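`get_following` requests a single page of up to 100 follows, so the "already following" check in the `follow_user` tool could miss an account once the bot follows more than that. The sketch below shows one way to walk cursor-based pages, assuming each page yields a list of handles plus an optional cursor (the shape of `app.bsky.graph.getFollows`). `fetch_page`, `iter_follow_handles`, `is_following`, and the fake data are hypothetical and are not part of the diff.

```python
from typing import Callable, Iterator, Optional

# One page of results: the handles on that page plus the cursor for the next page.
Page = tuple[list[str], Optional[str]]


def iter_follow_handles(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every followed handle, requesting pages until no cursor is returned."""
    cursor: Optional[str] = None
    while True:
        handles, cursor = fetch_page(cursor)
        yield from handles
        if not cursor:
            break


def is_following(fetch_page: Callable[[Optional[str]], Page], handle: str) -> bool:
    """Case-insensitive membership check that tolerates a leading '@'."""
    target = handle.lstrip("@").lower()
    return any(h.lower() == target for h in iter_follow_handles(fetch_page))


# Fake two-page backend standing in for the real API call.
FAKE_PAGES: dict[Optional[str], Page] = {
    None: (["a.test", "b.test"], "page-2"),
    "page-2": (["c.test"], None),
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


all_handles = list(iter_follow_handles(fake_fetch))
```

Because `is_following` consumes a generator, it stops requesting pages as soon as a match is found.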
+151
src/bot/core/graze_client.py
··· 1 + """Async client for graze.social's undocumented REST API. 2 + 3 + Graze serves custom Bluesky feed algorithms. This client handles the full 4 + feed lifecycle: login → create PDS record → register with graze → publish. 5 + 6 + API reference: https://whtwnd.com/did:plc:r2whjvupgfw55mllpksnombn/3mgbz7xdeil2h 7 + """ 8 + 9 + import logging 10 + from datetime import UTC, datetime 11 + 12 + import httpx 13 + 14 + from bot.core.atproto_client import bot_client 15 + 16 + logger = logging.getLogger("bot.graze_client") 17 + 18 + BASE_URL = "https://api.graze.social" 19 + GRAZE_DID = "did:web:api.graze.social" 20 + 21 + 22 + class GrazeClient: 23 + def __init__(self, handle: str, password: str): 24 + self._handle = handle 25 + self._password = password 26 + self._cookies: httpx.Cookies | None = None 27 + self._user_id: int | None = None 28 + 29 + async def _ensure_session(self) -> None: 30 + """Login to graze if we don't have a valid session.""" 31 + if self._cookies is not None: 32 + return 33 + await self._login() 34 + 35 + async def _login(self) -> None: 36 + """Authenticate with graze and cache the session cookie + user_id.""" 37 + async with httpx.AsyncClient(timeout=15) as client: 38 + r = await client.post( 39 + f"{BASE_URL}/app/login", 40 + json={"username": self._handle, "password": self._password}, 41 + ) 42 + r.raise_for_status() 43 + data = r.json() 44 + self._user_id = data["id"] 45 + self._cookies = r.cookies 46 + logger.info(f"graze login ok, user_id={self._user_id}") 47 + 48 + async def _request( 49 + self, 50 + method: str, 51 + path: str, 52 + **kwargs, 53 + ) -> httpx.Response: 54 + """Make an authenticated request, re-logging in on 401.""" 55 + await self._ensure_session() 56 + async with httpx.AsyncClient( 57 + base_url=BASE_URL, cookies=self._cookies, timeout=30 58 + ) as client: 59 + r = await client.request(method, path, **kwargs) 60 + if r.status_code == 401: 61 + logger.info("graze session expired, re-logging in") 62 + self._cookies = None 
63 + await self._login() 64 + r = await client.request( 65 + method, 66 + path, 67 + cookies=self._cookies, 68 + **{k: v for k, v in kwargs.items() if k != "cookies"}, 69 + ) 70 + r.raise_for_status() 71 + return r 72 + 73 + async def create_feed( 74 + self, 75 + rkey: str, 76 + display_name: str, 77 + description: str, 78 + filter_manifest: dict, 79 + ) -> dict: 80 + """Create a new graze-powered feed. Full 5-step flow: 81 + 82 + 1. putRecord on PDS (app.bsky.feed.generator) 83 + 2. migrate_algo (register filter with graze) 84 + 3. complete_migration 85 + 4. publish_algo 86 + 5. set-publicity to public 87 + 88 + Returns {"uri": ..., "algo_id": ...}. 89 + """ 90 + # 1. create the feed generator record on phi's PDS 91 + await bot_client.authenticate() 92 + did = bot_client.client.me.did 93 + feed_uri = f"at://{did}/app.bsky.feed.generator/{rkey}" 94 + 95 + bot_client.client.com.atproto.repo.put_record( 96 + data={ 97 + "repo": did, 98 + "collection": "app.bsky.feed.generator", 99 + "rkey": rkey, 100 + "record": { 101 + "$type": "app.bsky.feed.generator", 102 + "did": GRAZE_DID, 103 + "displayName": display_name, 104 + "description": description, 105 + "createdAt": datetime.now(UTC).isoformat(), 106 + }, 107 + } 108 + ) 109 + logger.info(f"PDS record created: {feed_uri}") 110 + 111 + # 2. register the filter manifest with graze 112 + r = await self._request( 113 + "POST", 114 + "/app/migrate_algo", 115 + json={ 116 + "user_id": self._user_id, 117 + "feed_uri": feed_uri, 118 + "algorithm_manifest": filter_manifest, 119 + }, 120 + ) 121 + algo_id = r.json()["algo_id"] 122 + logger.info(f"algo migrated, algo_id={algo_id}") 123 + 124 + # 3. complete migration 125 + await self._request( 126 + "POST", 127 + "/app/complete_migration", 128 + json={"algo_id": algo_id, "user_id": self._user_id}, 129 + ) 130 + 131 + # 4. publish 132 + await self._request("GET", f"/app/publish_algo/{algo_id}") 133 + 134 + # 5. 
make public 135 + await self._request( 136 + "GET", 137 + f"/app/api/v1/algorithm-management/set-publicity/{algo_id}/true", 138 + ) 139 + 140 + logger.info(f"feed published: {feed_uri}") 141 + return {"uri": feed_uri, "algo_id": algo_id} 142 + 143 + async def list_feeds(self) -> list[dict]: 144 + """List phi's existing graze feeds.""" 145 + r = await self._request("GET", "/app/my_feeds") 146 + return r.json() 147 + 148 + async def delete_feed(self, algo_id: int) -> None: 149 + """Delete a graze feed by algo_id.""" 150 + await self._request("DELETE", f"/app/my_feeds/{algo_id}") 151 + logger.info(f"feed deleted: algo_id={algo_id}")
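The re-login path in `_request` can be hard to follow inside the flattened diff. Here is a minimal, standard-library sketch of the same idea: send once, and on a 401 refresh the session and retry exactly once. `FakeSession` and `request_with_relogin` are hypothetical stand-ins, with a token in place of graze's session cookie.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for the HTTP layer: only the fresh token is accepted."""

    def __init__(self) -> None:
        self.logins = 0
        self.valid_token = "fresh"

    async def login(self) -> str:
        # Count logins so callers can check that re-authentication happened once.
        self.logins += 1
        return self.valid_token

    async def send(self, token: str, path: str) -> int:
        # Return an HTTP-like status code: 200 for the valid token, 401 otherwise.
        return 200 if token == self.valid_token else 401


async def request_with_relogin(
    session: FakeSession, token: str, path: str
) -> tuple[int, str]:
    """Send a request; on 401, log in again and retry a single time."""
    status = await session.send(token, path)
    if status == 401:
        token = await session.login()
        status = await session.send(token, path)
    return status, token


session = FakeSession()
status, token = asyncio.run(request_with_relogin(session, "stale", "/app/my_feeds"))
```

Retrying only once avoids a login loop when the credentials themselves are wrong; a second 401 is returned to the caller, much as the client in the diff would surface it through `raise_for_status()`.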
+1 -1
tests/test_config.py
···
 def test_config_loads():
     """Test that config loads without errors"""
     assert settings.bluesky_service == "https://bsky.social"
-    assert settings.bot_name == "phi"
+    assert settings.bot_name  # default "Bot" or overridden via env/dotfile
     assert settings.notification_poll_interval == 10
+198
tests/test_graze_client.py
··· 1 + """Tests for the graze.social REST client.""" 2 + 3 + from unittest.mock import AsyncMock, MagicMock, patch 4 + 5 + import httpx 6 + import pytest 7 + 8 + from bot.core.graze_client import BASE_URL, GrazeClient 9 + 10 + 11 + @pytest.fixture 12 + def graze(): 13 + return GrazeClient(handle="test.bsky.social", password="test-pass") 14 + 15 + 16 + def _login_response(): 17 + """Fake successful login response.""" 18 + resp = httpx.Response( 19 + 200, 20 + json={"id": 42}, 21 + request=httpx.Request("POST", f"{BASE_URL}/app/login"), 22 + ) 23 + return resp 24 + 25 + 26 + def _ok_response(json=None): 27 + resp = httpx.Response( 28 + 200, 29 + json=json or {}, 30 + request=httpx.Request("GET", BASE_URL), 31 + ) 32 + return resp 33 + 34 + 35 + class TestLogin: 36 + async def test_login_caches_session(self, graze): 37 + with patch("bot.core.graze_client.httpx.AsyncClient") as mock_cls: 38 + mock_client = AsyncMock() 39 + mock_client.post.return_value = _login_response() 40 + mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client) 41 + mock_cls.return_value.__aexit__ = AsyncMock(return_value=False) 42 + 43 + await graze._login() 44 + assert graze._user_id == 42 45 + assert graze._cookies is not None 46 + 47 + async def test_ensure_session_skips_if_cached(self, graze): 48 + graze._cookies = httpx.Cookies() 49 + graze._user_id = 42 50 + # should not attempt login 51 + with patch.object(graze, "_login") as mock_login: 52 + await graze._ensure_session() 53 + mock_login.assert_not_called() 54 + 55 + async def test_ensure_session_logs_in_if_no_cookies(self, graze): 56 + with patch.object(graze, "_login") as mock_login: 57 + await graze._ensure_session() 58 + mock_login.assert_called_once() 59 + 60 + 61 + class TestCreateFeed: 62 + async def test_full_create_flow(self, graze): 63 + """Test the 5-step create flow: putRecord → migrate → complete → publish → set-publicity.""" 64 + graze._cookies = httpx.Cookies() 65 + graze._user_id = 42 66 + 67 + # mock 
bot_client for PDS putRecord 68 + mock_bot = MagicMock() 69 + mock_bot.authenticate = AsyncMock() 70 + mock_bot.client.me.did = "did:plc:testdid" 71 + mock_bot.client.com.atproto.repo.put_record = MagicMock() 72 + 73 + call_log = [] 74 + 75 + async def fake_request(method, path, **kwargs): 76 + call_log.append((method, path)) 77 + if path == "/app/migrate_algo": 78 + return _ok_response(json={"algo_id": 99}) 79 + return _ok_response() 80 + 81 + with ( 82 + patch("bot.core.graze_client.bot_client", mock_bot), 83 + patch.object(graze, "_request", side_effect=fake_request), 84 + ): 85 + result = await graze.create_feed( 86 + rkey="jazz-feed", 87 + display_name="Jazz Music", 88 + description="posts about jazz", 89 + filter_manifest={ 90 + "filter": { 91 + "and": [{"regex_any": ["text", ["jazz", "bebop"], True, False]}] 92 + } 93 + }, 94 + ) 95 + 96 + assert result["uri"] == "at://did:plc:testdid/app.bsky.feed.generator/jazz-feed" 97 + assert result["algo_id"] == 99 98 + 99 + # verify PDS record was created 100 + mock_bot.client.com.atproto.repo.put_record.assert_called_once() 101 + put_data = mock_bot.client.com.atproto.repo.put_record.call_args 102 + record = put_data.kwargs["data"]["record"] 103 + assert record["displayName"] == "Jazz Music" 104 + assert record["did"] == "did:web:api.graze.social" 105 + 106 + # verify all 4 graze API calls in order 107 + assert call_log == [ 108 + ("POST", "/app/migrate_algo"), 109 + ("POST", "/app/complete_migration"), 110 + ("GET", "/app/publish_algo/99"), 111 + ("GET", "/app/api/v1/algorithm-management/set-publicity/99/true"), 112 + ] 113 + 114 + async def test_create_feed_propagates_errors(self, graze): 115 + graze._cookies = httpx.Cookies() 116 + graze._user_id = 42 117 + 118 + mock_bot = MagicMock() 119 + mock_bot.authenticate = AsyncMock() 120 + mock_bot.client.me.did = "did:plc:testdid" 121 + mock_bot.client.com.atproto.repo.put_record = MagicMock() 122 + 123 + async def fail_migrate(method, path, **kwargs): 124 + raise 
httpx.HTTPStatusError( 125 + "bad request", 126 + request=httpx.Request("POST", f"{BASE_URL}/app/migrate_algo"), 127 + response=httpx.Response(400), 128 + ) 129 + 130 + with ( 131 + patch("bot.core.graze_client.bot_client", mock_bot), 132 + patch.object(graze, "_request", side_effect=fail_migrate), 133 + ): 134 + with pytest.raises(httpx.HTTPStatusError): 135 + await graze.create_feed("test", "Test", "test", {"filter": {}}) 136 + 137 + 138 + class TestListFeeds: 139 + async def test_list_feeds(self, graze): 140 + feeds_data = [ 141 + {"id": 1, "display_name": "Jazz", "feed_uri": "at://did/gen/jazz"}, 142 + {"id": 2, "display_name": "Blues", "feed_uri": "at://did/gen/blues"}, 143 + ] 144 + 145 + async def fake_request(method, path, **kwargs): 146 + return _ok_response(json=feeds_data) 147 + 148 + with patch.object(graze, "_request", side_effect=fake_request): 149 + result = await graze.list_feeds() 150 + 151 + assert len(result) == 2 152 + assert result[0]["display_name"] == "Jazz" 153 + 154 + 155 + class TestDeleteFeed: 156 + async def test_delete_feed(self, graze): 157 + async def fake_request(method, path, **kwargs): 158 + assert method == "DELETE" 159 + assert "/app/my_feeds/99" in path 160 + return _ok_response() 161 + 162 + with patch.object(graze, "_request", side_effect=fake_request): 163 + await graze.delete_feed(99) 164 + 165 + 166 + class TestReloginOn401: 167 + async def test_request_retries_on_401(self, graze): 168 + graze._cookies = httpx.Cookies() 169 + graze._user_id = 42 170 + 171 + call_count = 0 172 + 173 + async def mock_request(method, path, **kwargs): 174 + nonlocal call_count 175 + call_count += 1 176 + if call_count == 1: 177 + return httpx.Response( 178 + 401, 179 + request=httpx.Request("GET", f"{BASE_URL}{path}"), 180 + ) 181 + return httpx.Response( 182 + 200, 183 + json={"ok": True}, 184 + request=httpx.Request("GET", f"{BASE_URL}{path}"), 185 + ) 186 + 187 + with ( 188 + patch("bot.core.graze_client.httpx.AsyncClient") as mock_cls, 189 
+ patch.object(graze, "_login", new_callable=AsyncMock) as mock_login, 190 + ): 191 + mock_client = AsyncMock() 192 + mock_client.request = mock_request 193 + mock_cls.return_value.__aenter__ = AsyncMock(return_value=mock_client) 194 + mock_cls.return_value.__aexit__ = AsyncMock(return_value=False) 195 + 196 + r = await graze._request("GET", "/app/my_feeds") 197 + assert r.status_code == 200 198 + mock_login.assert_called_once()
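The filter manifest passed to `create_feed` in these tests is a nested dict written by hand. A few small builders can make the shape easier to get right. The sketch assumes only the operator layout described in the `create_feed` docstring (`regex_any`, `has_any_tag`, `and`). The helper names `regex_any`, `has_any_tag`, `all_of`, and `manifest` are hypothetical and are not part of the repo or of graze's API.

```python
def regex_any(patterns, field="text", case_insensitive=True, whole_word=False):
    """Build a regex_any node: [field, patterns, case_insensitive, whole_word]."""
    return {"regex_any": [field, list(patterns), case_insensitive, whole_word]}


def has_any_tag(tags):
    """Build a has_any_tag node; the tag list is wrapped in an outer list."""
    return {"has_any_tag": [list(tags)]}


def all_of(*filters):
    """Combine filters with the 'and' operator."""
    return {"and": list(filters)}


def manifest(root):
    """Wrap the root filter in the top-level {"filter": ...} envelope."""
    return {"filter": root}


# Reproduces the example given in the create_feed docstring.
jazz = manifest(all_of(regex_any(["jazz", "bebop"]), has_any_tag(["#jazz"])))
```

The resulting `jazz` value could be passed as `filter_manifest`. Whether graze accepts it still depends on the service's validation, which is not covered by the mocked tests.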
+41
tests/test_tool_usage.py
··· 1 1 """Test that proves tools are actually being used by the agent""" 2 2 3 3 import os 4 + from unittest.mock import patch 4 5 5 6 import pytest 6 7 from pydantic import BaseModel, Field ··· 61 62 async def test_search_tool_usage(self): 62 63 """Test that search tool is called for appropriate queries""" 63 64 65 + if not settings.anthropic_api_key: 66 + pytest.skip("No Anthropic API key configured") 67 + 64 68 tool_calls: list[dict] = [] 65 69 66 70 agent = Agent( ··· 94 98 async def test_multiple_tool_calls(self): 95 99 """Test that agent can call tools multiple times in one request""" 96 100 101 + if not settings.anthropic_api_key: 102 + pytest.skip("No Anthropic API key configured") 103 + 97 104 calls: list[str] = [] 98 105 99 106 agent = Agent( ··· 114 121 assert len(calls) >= 2, f"Expected multiple searches, got {len(calls)}: {calls}" 115 122 assert any("Python" in call for call in calls), f"No Python search in: {calls}" 116 123 assert any("Rust" in call for call in calls), f"No Rust search in: {calls}" 124 + 125 + 126 + class TestPhiAgentToolRegistration: 127 + """Verify that PhiAgent registers all expected tools (no LLM calls needed).""" 128 + 129 + def setup_method(self): 130 + if settings.anthropic_api_key: 131 + os.environ["ANTHROPIC_API_KEY"] = settings.anthropic_api_key 132 + 133 + def test_graze_tools_registered(self): 134 + if not os.environ.get("ANTHROPIC_API_KEY"): 135 + pytest.skip("No Anthropic API key configured") 136 + 137 + with patch("bot.agent.bot_client"): 138 + from bot.agent import PhiAgent 139 + 140 + agent = PhiAgent() 141 + tool_names = {t.name for t in agent.agent._function_toolset.tools.values()} 142 + assert "create_feed" in tool_names, f"create_feed not in {tool_names}" 143 + assert "list_feeds" in tool_names, f"list_feeds not in {tool_names}" 144 + assert "read_timeline" in tool_names, f"read_timeline not in {tool_names}" 145 + assert "read_feed" in tool_names, f"read_feed not in {tool_names}" 146 + assert "follow_user" in 
tool_names, f"follow_user not in {tool_names}" 147 + 148 + def test_graze_client_instantiated(self): 149 + if not os.environ.get("ANTHROPIC_API_KEY"): 150 + pytest.skip("No Anthropic API key configured") 151 + 152 + with patch("bot.agent.bot_client"): 153 + from bot.agent import PhiAgent 154 + 155 + agent = PhiAgent() 156 + assert agent.graze_client is not None 157 + assert agent.graze_client._handle == settings.bluesky_handle