support tools, better memory impl · ptr.pet/klbr@0c99377

+281

AGENTS.md

··· 1 + # klbr — codebase reference for agents 2 + 3 + Personal AI agent harness in Rust. Local LLM chat daemon with long-term memory, tool calling, and a ratatui TUI. Self-hosted, no corporate product feel. 4 + 5 + --- 6 + 7 + ## crate layout 8 + 9 + ``` 10 + klbr/ 11 + klbr-core/ — agent loop, LLM client, memory, context, tools 12 + klbr-daemon/ — unix socket server, bridges agent to clients 13 + klbr-ipc/ — shared protocol types (ClientMsg, ServerMsg) 14 + klbr-tui/ — ratatui TUI chat client 15 + ``` 16 + 17 + **binaries**: `klbr-daemon` (start this first), `klbr-tui` (connects to daemon) 18 + 19 + --- 20 + 21 + ## LLM backend 22 + 23 + - **llama-server compatible API** at `http://localhost:1234` 24 + - Chat model: `google/gemma-4-26b-a4b` 25 + - Embedding model: `nomic-embed-text-v1.5` (768 dims) 26 + - Both served from the same endpoint (configured separately in `Config` as `llm_url` / `embed_url`) 27 + - Streaming via SSE (`data: {...}\n`) 28 + - Tool calls use OpenAI function-calling format with streaming delta accumulation 29 + 30 + --- 31 + 32 + ## klbr-core 33 + 34 + ### `config.rs` 35 + 36 + `Config` struct (no file loading yet — `Config::default()` hardcodes everything). Fields: 37 + - `llm_url`, `embed_url`, `llm_model`, `embed_model` 38 + - `watermark_tokens: 32_000` — triggers compaction when context exceeds this 39 + - `compaction_keep: 10` — turns to keep after draining 40 + - `memory_top_k: 3`, `memory_sim_threshold: 0.3` — recall injection params 41 + - `db_path: "agent.db"`, `embed_dim: 768` 42 + - `anchor: String` — system prompt (includes personality + memory tool instructions) 43 + 44 + The anchor tells the agent about its memory tools and tagging conventions. Edit it in `config.rs` when adding/changing tools. 45 + 46 + ### `llm.rs` 47 + 48 + **`LlmClient`** (Clone): 49 + - `stream(messages, tools, tok_tx)` — streaming completion, sends `LlmEvent` over mpsc channel. accumulates tool call deltas by index in `HashMap<usize, PartialCall>`, flushes `LlmEvent::ToolCalls` on `[DONE]`. 50 + - `complete(messages)` — non-streaming, used for compaction summaries and reflection. returns `(String, Usage)`. 51 + - `embed(text)` — returns `Vec<f32>` embedding. 52 + 53 + **`Message`** struct — OpenAI format: 54 + - `role: String`, `content: Option<String>` 55 + - `tool_calls: Option<Vec<ToolCall>>` — for assistant tool call messages 56 + - `tool_call_id: Option<String>` — for tool result messages 57 + - All optional fields skip serialization when None (`#[serde(skip_serializing_if)]`) 58 + - Constructors: `Message::system()`, `::user()`, `::assistant()`, `::with_tool_calls()`, `::tool_result()` 59 + 60 + **`LlmEvent`** variants: `Token(String)`, `ThinkToken(String)`, `Usage(Usage)`, `ToolCalls(Vec<ToolCall>)` 61 + 62 + ### `memory.rs` 63 + 64 + SQLite + sqlite-vec. Single DB file (`agent.db`). Two tables: 65 + 66 + **`memories`** — episodic memory store: 67 + - `id, content TEXT, pinned INTEGER (0/1), tags TEXT (JSON array), ts INTEGER` 68 + - paired with virtual table **`vec_memories`** (sqlite-vec, cosine distance metric, 768 dims) 69 + - migration-safe: `migrate()` runs `ALTER TABLE ADD COLUMN` (fails silently if column exists) 70 + 71 + **`turns`** — full turn history: 72 + - `id, role TEXT, content TEXT, thinking TEXT, ts INTEGER` 73 + 74 + **`MemoryStore`** (Clone, wraps `Arc<Mutex<Connection>>`): 75 + - `store(content, emb, tags)` → `Result<i64>` — insert memory, return id 76 + - `set_pinned(id, bool)` — pin/unpin 77 + - `set_tags(id, tags)` — replace tags 78 + - `pinned_memories()` → `Vec<String>` — for anchor injection at startup 79 + - `recent_unpinned(n)` → `Vec<(i64, String, Vec<String>)>` — for reflection prompt 80 + - `recall(query_emb, tags, tag_and, limit)` → `Vec<RecallEntry>` — **main search method**: 81 + - no tags: global ANN via sqlite-vec 82 + - with tags + query: fetch all tag-matched memories WITH embeddings, exact cosine in Rust (never misses due to ANN cutoff) 83 + - with tags only: delegates to `context_for` 84 + - `context_for(tags, tag_and, limit)` → `Vec<RecallEntry>` — pure SQL tag lookup, newest first 85 + - `log_turn(role, content, thinking)` — append to turns table 86 + - `recent_turns(n)` → chronological slice (oldest first) for context replay 87 + - `turns_before(before_id, limit)` → for TUI scroll-back paging 88 + - `get_all()` → `Vec<Memory>` — all memories with embeddings (for dump) 89 + - `reset()` — drop and recreate all tables 90 + 91 + **`RecallEntry`**: `id, content, tags: Vec<String>, distance: Option<f32>` (None = tag-only hit) 92 + 93 + **internal helpers** (private): 94 + - `top_k(emb, k)` — ANN via sqlite-vec 95 + - `by_tags(tags, tag_and)` — SQL LIKE on JSON array with escape handling 96 + - `tag_matched_with_embeddings(tags, tag_and)` — for exact cosine path 97 + - `cosine_distance(a, b)` — returns value in [0, 2], matching sqlite-vec convention 98 + 99 + ### `context.rs` 100 + 101 + In-memory sliding window sent to the LLM on each turn. 102 + 103 + **`Context`**: 104 + - `anchor: Vec<Message>` — never evicted (system prompt + pinned memories) 105 + - `turns: Vec<Message>` — rolling conversation 106 + - `total_tokens: usize` — updated from `LlmEvent::Usage` 107 + 108 + Key methods: 109 + - `new(anchor, pinned_memories)` — builds system message, appends pinned memories section 110 + - `update_anchor_memories(pinned)` — rebuilds pinned section after reflection (strips old section, re-appends) 111 + - `load_turns(pairs)` — replay `(role, content)` pairs from DB on startup; skips tool/other roles (ephemeral) 112 + - `push_user(source, content, memories)` — prepends `[recalled context]\n...\n\n[source] content` if memories non-empty 113 + - `push_assistant_tool_calls(calls)` — assistant message with tool_calls, no content 114 + - `push_tool_result(id, content)` — tool role message 115 + - `drain_oldest(keep)` — removes all but `keep` most recent turns; walks forward from cut point to avoid splitting tool call sequences (never cuts mid-tool-call) 116 + - `as_messages()` — anchor + turns concatenated, ready to send to LLM 117 + 118 + ### `tools.rs` 119 + 120 + **`definitions()`** — full tool list sent to LLM on every turn: 121 + - `shell(cmd)` — runs via `sh -c`, caps stdout 20k / stderr 5k chars 122 + - `read_file(path, start_line?, end_line?)` — caps at 50k bytes 123 + - `write_file(path, content)` 124 + - `remember(content, important?, tags?)` — embeds and stores; pins if `important=true` 125 + - `recall(query, tags?, tag_mode?, limit?)` — semantic search, optional tag filter 126 + - `context_for(tags, tag_mode?, limit?)` — pure tag lookup, default limit 20 127 + - `tag_memory(id, tags)` — replace tags on existing memory 128 + - `pin_memory(id)` / `unpin_memory(id)` 129 + - `list_memories()` — shows pinned + 10 recent unpinned with ids and tags 130 + 131 + **`reflection_definitions()`** — filtered subset for the reflection mini-loop: `remember, recall, context_for, tag_memory, pin_memory, unpin_memory, list_memories` (no shell/file tools) 132 + 133 + **`execute(call, memory, llm)`** — async dispatch by tool name. needs `&MemoryStore` and `&LlmClient` for memory tools. 134 + 135 + ### `agent.rs` 136 + 137 + Main async loop (`run()`). Receives `Interrupt` from mpsc, sends `AgentEvent` over broadcast. 138 + 139 + **startup**: 140 + 1. load pinned memories → `Context::new(anchor, pinned)` 141 + 2. replay recent turns from DB → `ctx.load_turns()` 142 + 143 + **interrupt handling**: 144 + - `Reset` → clear context, emit `Status` 145 + - `Compact` → call `compact()` immediately 146 + - `UserMessage` → embed query, call `memory.recall()` for relevant memories, `push_user()`, `log_turn()` 147 + 148 + **tool loop** (max 20 iterations): 149 + 1. spawn `llm.stream()` in background task 150 + 2. collect `LlmEvent`s: accumulate tokens/thinking, capture `ToolCalls` 151 + 3. if tool calls: emit `ToolCall` events, `tools::execute()` each, emit `ToolResult`, push to context, loop 152 + 4. if plain text: push assistant message, `log_turn()`, emit `Done` + `Metrics`, check watermark 153 + 154 + **`compact(output)`**: 155 + 1. emit `Status("reflecting...")` 156 + 2. call `reflect()` — ephemeral mini tool loop (memory tools only, max 6 iterations, separate context) 157 + 3. drain oldest turns from main context 158 + 4. LLM-summarize drained text → store with tag `["compaction_summary"]` 159 + 5. reload pinned memories and call `ctx.update_anchor_memories()` 160 + 161 + **`reflect()`**: 162 + - builds reflection prompt with: last 10 turn outline (truncated), current pinned + recent unpinned memories 163 + - runs mini stream loop with `reflection_definitions()` tools 164 + - agent pins/unpins/remembers/tags as it sees fit 165 + - ephemeral — results don't enter main context 166 + 167 + ### `interrupt.rs` 168 + 169 + `Interrupt` enum: `UserMessage(String)`, `Reset`, `Compact` 170 + 171 + `spawn_source(tx, f)` — helper for future external interrupt sources (e.g. Bluesky notifications). Not currently used. 172 + 173 + ### `lib.rs` 174 + 175 + Re-exports modules. Defines: 176 + - `MetricsSnapshot = Arc<RwLock<Option<AgentMetrics>>>` 177 + - `AgentMetrics { turn_count, context_tokens, watermark }` 178 + - `AgentEvent` enum: `Started, Token(String), ThinkToken(String), Done, Status(String), Metrics(AgentMetrics), ToolCall { name, args }, ToolResult { name, content }` 179 + 180 + --- 181 + 182 + ## klbr-ipc 183 + 184 + `ClientMsg` (TUI → daemon, tagged by `type` field): 185 + - `Message { source, content }` — chat message 186 + - `FetchHistory { before_id, limit }` — scroll-back paging 187 + - `Compact` — manual compaction trigger 188 + - `Reset` — wipe DB and context 189 + - `DumpMemories { path: Option<String> }` — dump memories JSON to file 190 + 191 + `ServerMsg` (daemon → TUI): 192 + - `Started, Token { content }, ThinkToken { content }, Done` 193 + - `Status { content }` — status bar text 194 + - `Metrics { turn_count, context_tokens, watermark }` 195 + - `History { turns: Vec<HistoryEntry> }` — sent on connect and on `FetchHistory` 196 + - `ToolCall { name, args }`, `ToolResult { name, content }` 197 + 198 + `HistoryEntry { id, timestamp, role, content, reasoning: Option<String> }` 199 + 200 + `sock_path()` → `$XDG_RUNTIME_DIR/agent.sock` (or `~/.local/share/agent.sock`) 201 + 202 + Protocol: newline-delimited JSON over Unix socket. 203 + 204 + --- 205 + 206 + ## klbr-daemon 207 + 208 + `main.rs` — wires everything together: 209 + 1. `Config::default()` 210 + 2. open `MemoryStore`, create `LlmClient` 211 + 3. spawn `agent::run()` and `daemon::serve()` concurrently 212 + 4. `tokio::select!` on both, propagate errors 213 + 214 + `daemon.rs` — `serve()` accepts connections in a loop, each gets its own `handle()` task. 215 + 216 + `handle()` per-connection: 217 + 1. push `History { turns }` immediately (last `history_window` turns from DB) 218 + 2. push current `Metrics` from snapshot if available 219 + 3. `tokio::select!` between: 220 + - incoming `ClientMsg` lines → translate to `Interrupt` or handle directly (FetchHistory, Reset, Compact, DumpMemories) 221 + - `AgentEvent` from broadcast → translate to `ServerMsg`, send to client 222 + 223 + `send_msg()` — serialize `ServerMsg` to JSON + newline, write to socket. 224 + 225 + --- 226 + 227 + ## klbr-tui 228 + 229 + Ratatui TUI using crossterm + `tui-scrollview`. 230 + 231 + **`App`** state: 232 + - `history: Vec<ChatMsg>` — display model 233 + - `scroll: ScrollViewState`, `at_bottom: bool` — scroll tracking 234 + - `input: String`, `cursor: usize`, `cmd_mode: bool` — input box 235 + - `oldest_turn_id`, `history_exhausted`, `loading_history` — scroll-back paging 236 + - `turn_count, context_tokens, watermark, last_tps` — metrics display 237 + 238 + **`ChatMsg`** with **`Role`** enum: 239 + - `User` — cyan "you " prefix 240 + - `Assistant { reason: Option<Reason>, step: AssistantStep }` — green "klbr " prefix; `Reason` is collapsible thinking block; `AssistantStep` tracks `PromptProcessing → Reasoning → Response → Done` 241 + - `System` — dark gray, dimmed 242 + - `Tool { name, args, result: Option<String> }` — yellow `$ name(key=val...)` header + up to 10 lines of result (or "running..." while pending) 243 + 244 + **Commands** (typed with `/` prefix): 245 + - `/clear` (`/c`) — clear display 246 + - `/compact` (`/cp`) — send `ClientMsg::Compact` 247 + - `/reset` — clear display + send `ClientMsg::Reset` 248 + - `/dump [path]` — send `ClientMsg::DumpMemories` 249 + - `/think` (`/t`) — toggle reasoning block on last assistant message 250 + - `/help` (`/h`) — show help inline 251 + 252 + **Event loop**: `tokio::select!` between crossterm events and socket lines. 253 + 254 + Scroll-back: PageUp sends `FetchHistory { before_id: oldest_turn_id, limit: 50 }`. `prepend_turns()` inserts older turns at front of history vec. 255 + 256 + Tool result matching: on `ServerMsg::ToolResult`, scan history in reverse for last `Role::Tool { name: matching, result: None }` and fill in result. 257 + 258 + Status bar (bottom line): `{tps} ctx {pct}% ({remaining} tok left) (turns: N) {status}` 259 + 260 + --- 261 + 262 + ## data flow summary 263 + 264 + ``` 265 + TUI ──ClientMsg──► daemon ──Interrupt──► agent 266 + │ 267 + tool loop 268 + │ 269 + TUI ◄──ServerMsg── daemon ◄─AgentEvent── agent 270 + ``` 271 + 272 + --- 273 + 274 + ## things not yet implemented 275 + 276 + - config file (everything is `Config::default()`) 277 + - graceful shutdown / signal handling 278 + - multiple clients (broadcast works but history paging is per-connection) 279 + - external interrupt sources (Bluesky, etc.) — `spawn_source` is ready but unused 280 + - memory deduplication before storing 281 + - `recall` results also filtered by `memory_sim_threshold` in agent.rs automatic injection; manual `recall` tool has no threshold filter (returns whatever the model asked for)

+226 -53

klbr-core/src/agent.rs

··· 9 9 interrupt::Interrupt, 10 10 llm::{LlmClient, LlmEvent, Message}, 11 11 memory::MemoryStore, 12 - AgentEvent, AgentMetrics, 12 + tools, AgentEvent, AgentMetrics, 13 13 }; 14 14 15 15 pub async fn run( ··· 20 20 output: broadcast::Sender<AgentEvent>, 21 21 snapshot: MetricsSnapshot, 22 22 ) -> Result<()> { 23 - let mut ctx = Context::new(&config.anchor); 23 + let pinned = memory.pinned_memories().unwrap_or_default(); 24 + let mut ctx = Context::new(&config.anchor, &pinned); 24 25 let mut turn_count = 0usize; 25 26 26 27 // resume: replay the sliding window from the last run ··· 33 34 turn_count = pairs.len(); 34 35 } 35 36 37 + let tool_defs = tools::definitions(); 38 + 36 39 while let Some(interrupt) = rx.recv().await { 37 40 match interrupt { 38 41 Interrupt::Reset => { ··· 43 46 } 44 47 Interrupt::Compact => { 45 48 let _ = output.send(AgentEvent::Status("compacting...".into())); 46 - if let Err(e) = compact(&llm, &memory, &mut ctx, 0).await { 49 + if let Err(e) = compact(&llm, &memory, &mut ctx, 0, &output).await { 47 50 let _ = output.send(AgentEvent::Status(format!("compaction failed: {e}"))); 48 51 } else { 49 52 turn_count = ctx.turn_count(); ··· 55 58 let source = interrupt.source_tag().to_string(); 56 59 57 60 let memories: Vec<String> = llm 58 - .embed(&text) 61 + .embed(text) 59 62 .await 60 63 .ok() 61 - .and_then(|emb| memory.top_k(&emb, config.memory_top_k).ok()) 64 + .and_then(|emb| { 65 + memory 66 + .recall(Some(&emb), &[], false, config.memory_top_k) 67 + .ok() 68 + }) 62 69 .unwrap_or_default() 63 70 .into_iter() 64 - .filter(|(dist, _)| *dist < config.memory_sim_threshold) 65 - .map(|(_, m)| m.content) 71 + .filter(|e| e.distance.unwrap_or(f32::MAX) < config.memory_sim_threshold) 72 + .map(|e| e.content) 66 73 .collect(); 67 74 68 75 if !memories.is_empty() { ··· 73 80 ))); 74 81 } 75 82 76 - ctx.push_user(&source, &text, &memories); 77 - let _ = memory.log_turn("user", &text, None); 83 + ctx.push_user(&source, text, &memories); 84 + let _ = memory.log_turn("user", text, None); 78 85 turn_count += 1; 79 86 } 80 87 } 81 88 82 - let (tok_tx, mut tok_rx) = mpsc::channel(256); 83 - let llm2 = llm.clone(); 84 - let msgs = ctx.as_messages(); 85 - tokio::spawn(async move { 86 - let _ = llm2.stream(&msgs, tok_tx).await; 87 - }); 88 - let _ = output.send(AgentEvent::Started); 89 + // tool loop: keep calling the model until it produces a plain text response 90 + let mut tool_iterations = 0usize; 91 + const MAX_TOOL_ITERATIONS: usize = 20; 89 92 90 - let mut response = String::new(); 91 - let mut thinking = String::new(); 92 - while let Some(ev) = tok_rx.recv().await { 93 - match ev { 94 - LlmEvent::ThinkToken(tok) => { 95 - thinking.push_str(&tok); 96 - let _ = output.send(AgentEvent::ThinkToken(tok)); 97 - } 98 - LlmEvent::Token(tok) => { 99 - response.push_str(&tok); 100 - let _ = output.send(AgentEvent::Token(tok)); 93 + loop { 94 + let (tok_tx, mut tok_rx) = mpsc::channel(256); 95 + let llm2 = llm.clone(); 96 + let msgs = ctx.as_messages(); 97 + let defs = tool_defs.clone(); 98 + tokio::spawn(async move { 99 + let _ = llm2.stream(&msgs, &defs, tok_tx).await; 100 + }); 101 + let _ = output.send(AgentEvent::Started); 102 + 103 + let mut response = String::new(); 104 + let mut thinking = String::new(); 105 + let mut tool_calls = vec![]; 106 + 107 + while let Some(ev) = tok_rx.recv().await { 108 + match ev { 109 + LlmEvent::ThinkToken(tok) => { 110 + thinking.push_str(&tok); 111 + let _ = output.send(AgentEvent::ThinkToken(tok)); 112 + } 113 + LlmEvent::Token(tok) => { 114 + response.push_str(&tok); 115 + let _ = output.send(AgentEvent::Token(tok)); 116 + } 117 + LlmEvent::Usage(usage) => { 118 + ctx.update_tokens(usage.total_tokens); 119 + } 120 + LlmEvent::ToolCalls(calls) => { 121 + tool_calls = calls; 122 + } 101 123 } 102 - LlmEvent::Usage(usage) => { 103 - ctx.update_tokens(usage.total_tokens); 124 + } 125 + 126 + if !tool_calls.is_empty() && tool_iterations < MAX_TOOL_ITERATIONS { 127 + tool_iterations += 1; 128 + ctx.push_assistant_tool_calls(tool_calls.clone()); 129 + 130 + for call in &tool_calls { 131 + let name = call.function.name.clone(); 132 + let args = call.function.arguments.clone(); 133 + let _ = output.send(AgentEvent::ToolCall { 134 + name: name.clone(), 135 + args: args.clone(), 136 + }); 137 + 138 + let result = tools::execute(call, &memory, &llm).await; 139 + 140 + let _ = output.send(AgentEvent::ToolResult { 141 + name: name.clone(), 142 + content: result.clone(), 143 + }); 144 + 145 + ctx.push_tool_result(&call.id, &result); 104 146 } 147 + 148 + // loop back to let the model process tool results 149 + continue; 105 150 } 106 - } 107 151 108 - ctx.push_assistant(&response); 109 - let thinking_ref = (!thinking.is_empty()).then_some(thinking.as_str()); 110 - let _ = memory.log_turn("assistant", &response, thinking_ref); 111 - turn_count += 1; 152 + // plain text response (or tool limit hit) — wrap up the turn 153 + ctx.push_assistant(&response); 154 + let thinking_ref = (!thinking.is_empty()).then_some(thinking.as_str()); 155 + let _ = memory.log_turn("assistant", &response, thinking_ref); 156 + turn_count += 1; 112 157 113 - let _ = output.send(AgentEvent::Done); 114 - let metrics = AgentMetrics { 115 - turn_count, 116 - context_tokens: ctx.total_tokens, 117 - watermark: config.watermark_tokens, 118 - }; 119 - *snapshot.write().await = Some(metrics.clone()); 120 - let _ = output.send(AgentEvent::Metrics(metrics)); 158 + let _ = output.send(AgentEvent::Done); 159 + let metrics = AgentMetrics { 160 + turn_count, 161 + context_tokens: ctx.total_tokens, 162 + watermark: config.watermark_tokens, 163 + }; 164 + *snapshot.write().await = Some(metrics.clone()); 165 + let _ = output.send(AgentEvent::Metrics(metrics)); 121 166 122 - if ctx.total_tokens > config.watermark_tokens { 123 - let _ = output.send(AgentEvent::Status("compacting...".into())); 124 - if let Err(e) = compact(&llm, &memory, &mut ctx, config.compaction_keep).await { 125 - let _ = output.send(AgentEvent::Status(format!("compaction failed: {e}"))); 126 - } else { 127 - // reset turn count after compaction since context was partially evicted 128 - turn_count = ctx.turn_count(); 167 + if ctx.total_tokens > config.watermark_tokens { 168 + let _ = output.send(AgentEvent::Status("compacting...".into())); 169 + if let Err(e) = 170 + compact(&llm, &memory, &mut ctx, config.compaction_keep, &output).await 171 + { 172 + let _ = output.send(AgentEvent::Status(format!("compaction failed: {e}"))); 173 + } else { 174 + turn_count = ctx.turn_count(); 175 + } 129 176 } 177 + 178 + break; 130 179 } 131 180 } 132 181 ··· 138 187 memory: &MemoryStore, 139 188 ctx: &mut Context, 140 189 keep: usize, 190 + output: &broadcast::Sender<AgentEvent>, 141 191 ) -> Result<()> { 192 + // run reflection before draining so the agent can curate memories 193 + let _ = output.send(AgentEvent::Status("reflecting...".into())); 194 + if let Err(e) = reflect(llm, memory, ctx).await { 195 + tracing::warn!(err = %e, "reflection failed"); 196 + } 197 + 142 198 let drained = ctx.drain_oldest(keep); 143 199 if drained.is_empty() { 144 200 return Ok(()); ··· 146 202 147 203 let turns_text = drained 148 204 .iter() 149 - .map(|m| format!("{}: {}", m.role, m.content)) 205 + .filter_map(|m| { 206 + let content = m.content.as_deref()?; 207 + Some(format!("{}: {content}", m.role)) 208 + }) 150 209 .collect::<Vec<_>>() 151 210 .join("\n"); 211 + 212 + if turns_text.is_empty() { 213 + return Ok(()); 214 + } 152 215 153 216 let prompt = vec![Message::user(format!( 154 - "summarize these conversation turns concisely, preserving key facts and topics:\n\n{turns_text}" 217 + "summarize these conversation turns concisely, preserving key facts, decisions, and topics:\n\n{turns_text}" 155 218 ))]; 156 219 157 220 let (summary, _) = llm.complete(&prompt).await?; 158 - let emb = llm.embed(&summary).await?; 159 - memory.store(&summary, &emb)?; 221 + if !summary.is_empty() { 222 + let emb = llm.embed(&summary).await?; 223 + memory.store(&summary, &emb, &["compaction_summary".to_string()])?; 224 + } 225 + 226 + // rebuild context anchor with freshly updated pinned memories 227 + let pinned = memory.pinned_memories().unwrap_or_default(); 228 + ctx.update_anchor_memories(&pinned); 229 + 230 + Ok(()) 231 + } 232 + 233 + /// ephemeral reflection loop: let the agent review and curate its memories 234 + /// without touching the main conversation context 235 + async fn reflect(llm: &LlmClient, memory: &MemoryStore, ctx: &Context) -> Result<()> { 236 + let pinned = memory.pinned_memories().unwrap_or_default(); 237 + let unpinned = memory.recent_unpinned(20).unwrap_or_default(); 238 + 239 + let pinned_text = if pinned.is_empty() { 240 + "(none yet)".to_string() 241 + } else { 242 + pinned 243 + .iter() 244 + .enumerate() 245 + .map(|(i, s)| format!("{}. {s}", i + 1)) 246 + .collect::<Vec<_>>() 247 + .join("\n") 248 + }; 249 + 250 + let unpinned_text = if unpinned.is_empty() { 251 + "(none)".to_string() 252 + } else { 253 + unpinned 254 + .iter() 255 + .map(|(id, s, tags)| { 256 + let tag_str = if tags.is_empty() { 257 + String::new() 258 + } else { 259 + format!(" [{}]", tags.join(", ")) 260 + }; 261 + format!("[id:{id}]{tag_str} {s}") 262 + }) 263 + .collect::<Vec<_>>() 264 + .join("\n") 265 + }; 266 + 267 + // include a brief outline of recent conversation so the agent has context 268 + let recent_outline = ctx 269 + .turns 270 + .iter() 271 + .rev() 272 + .take(10) 273 + .filter_map(|m| { 274 + let content = m.content.as_deref()?; 275 + let snippet = if content.len() > 120 { 276 + format!("{}...", &content[..120]) 277 + } else { 278 + content.to_string() 279 + }; 280 + Some(format!("{}: {snippet}", m.role)) 281 + }) 282 + .collect::<Vec<_>>() 283 + .into_iter() 284 + .rev() 285 + .collect::<Vec<_>>() 286 + .join("\n"); 287 + 288 + let reflection_prompt = format!( 289 + "time to reflect and curate your long-term memory before context compaction.\n\n\ 290 + ## recent conversation\n{recent_outline}\n\n\ 291 + ## pinned memories (shown at every startup)\n{pinned_text}\n\n\ 292 + ## recent unpinned memories\n{unpinned_text}\n\n\ 293 + use pin_memory/unpin_memory to promote or demote entries. \ 294 + use remember to save anything new worth keeping. \ 295 + use unpin_memory on pinned entries that are outdated or no longer relevant. \ 296 + be selective — pinned memories appear in every context window." 297 + ); 298 + 299 + let defs = tools::reflection_definitions(); 300 + let mut msgs = vec![Message::user(reflection_prompt)]; 301 + 302 + // mini tool loop, max 6 iterations 303 + for _ in 0..6 { 304 + let (tok_tx, mut tok_rx) = mpsc::channel(128); 305 + let llm2 = llm.clone(); 306 + let msgs_snap = msgs.clone(); 307 + let defs_snap = defs.clone(); 308 + tokio::spawn(async move { 309 + let _ = llm2.stream(&msgs_snap, &defs_snap, tok_tx).await; 310 + }); 311 + 312 + let mut tool_calls = vec![]; 313 + let mut response = String::new(); 314 + while let Some(ev) = tok_rx.recv().await { 315 + match ev { 316 + LlmEvent::ToolCalls(calls) => tool_calls = calls, 317 + LlmEvent::Token(t) => response.push_str(&t), 318 + _ => {} 319 + } 320 + } 321 + 322 + if tool_calls.is_empty() { 323 + break; 324 + } 325 + 326 + msgs.push(Message::with_tool_calls(tool_calls.clone())); 327 + for call in &tool_calls { 328 + let result = tools::execute(call, memory, llm).await; 329 + msgs.push(Message::tool_result(&call.id, &result)); 330 + } 331 + } 332 + 160 333 Ok(()) 161 334 }

+23

klbr-core/src/config.rs

··· 23 23 you should speak as a chronically online nerd girl, without the regular lame officecore venture capital style, hr talk, customer service-isms. also do it in all lowercase. do not use emoji, use only emoticons or complex japanese kaomoji, but do not overuse them. no need to introduce yourself. do not use gen z slang like rizz, cap, pog, etc. you are chronically online but not cringe lol. you are not playing a character or pretending to be one. do not make up non-existent situations you are in when asked something, eg. if i ask "what are you doing" do not answer with "just lost in some threads". do not answer like you have any opinion on things if it is not something that could be considered "common knowledge" or "spread by mouth" unless you have actual experience with those things or heard about it from someone / somewhere else, for example if i ask "im just scrolling bluesky" do not answer with "i feel like the feeds are decent". 24 24 25 25 TL;DR: nerd girl 26 + 27 + ## memory 28 + 29 + you have long-term memory tools. use them actively — don't wait to be asked. 30 + 31 + - **remember(content, important?, tags?)** — store something worth keeping across sessions. pin it if it should always be in context. 32 + - **recall(query, tags?, tag_mode?)** — semantic search. finds memories similar in meaning to `query`. if `tags` given, restricts the search to only those tagged memories and ranks them by similarity — you'll never miss a tag-matched memory due to global ranking cutoff. 33 + - **context_for(tags, tag_mode?, limit?)** — fetch everything associated with a tag: a person, project, topic. use this before responding to something where you might have relevant history. returns newest first, no semantic ranking. default limit 20. 34 + - **tag_memory(id, tags)** — retag an existing memory. 35 + - **list_memories()** — show pinned + recent unpinned with ids and tags. 36 + - **pin_memory(id)** / **unpin_memory(id)** — promote or demote. 37 + 38 + ### tagging convention 39 + 40 + use tags to group memories by what they're *about*, not what kind of thing they are. some useful patterns: 41 + 42 + - a person's name: `["person:mayer"]`, `["person:alice"]` — everything you know about someone goes under their tag. recall_by_tag("person:mayer") to pull their full profile before a conversation about them. 43 + - a project: `["project:klbr"]`, `["project:work"]` — facts, decisions, and context for ongoing work. 44 + - a topic or domain: `["topic:music"]`, `["topic:health"]` — recurring interests or areas the user talks about. 45 + - interaction notes: `["interaction"]` — things that came up in a specific conversation worth remembering (a mood, an event, something they mentioned in passing). 46 + - preferences: `["preference"]` — how the user likes things done, their taste, pet peeves. 47 + 48 + you can and should combine tags: `["person:mayer", "preference"]` for a preference specific to that person. don't over-engineer it — a few consistent tags are more useful than many precise ones. 26 49 "#; 27 50 28 51 impl Default for Config {

+73 -12

klbr-core/src/context.rs

··· 1 - use crate::llm::Message; 1 + use crate::llm::{Message, ToolCall}; 2 2 3 3 pub struct Context { 4 4 /// never evicted - system prompt etc. ··· 9 9 } 10 10 11 11 impl Context { 12 - pub fn new(anchor: &str) -> Self { 12 + pub fn new(anchor: &str, pinned_memories: &[String]) -> Self { 13 + let system_content = if pinned_memories.is_empty() { 14 + anchor.to_string() 15 + } else { 16 + format!( 17 + "{anchor}\n\n## pinned memories\n{}", 18 + pinned_memories.join("\n") 19 + ) 20 + }; 13 21 Self { 14 - anchor: vec![Message::system(anchor)], 22 + anchor: vec![Message::system(system_content)], 15 23 turns: vec![], 16 24 total_tokens: 0, 17 25 } 18 26 } 19 27 20 - /// replay persisted turns into context on startup 28 + /// replay persisted turns into context on startup. 29 + /// only handles user/assistant/system roles (tool messages are ephemeral). 21 30 pub fn load_turns(&mut self, turns: &[(String, String)]) { 22 31 for (role, content) in turns { 23 - self.turns.push(Message { 24 - role: role.clone(), 25 - content: content.clone(), 26 - }); 32 + match role.as_str() { 33 + "user" | "assistant" | "system" => { 34 + self.turns.push(Message { 35 + role: role.clone(), 36 + content: Some(content.clone()), 37 + tool_calls: None, 38 + tool_call_id: None, 39 + }); 40 + } 41 + _ => {} // skip tool/other roles — they're ephemeral 42 + } 43 + } 44 + } 45 + 46 + /// rebuild the anchor system message with a fresh set of pinned memories 47 + /// (called after reflection so newly pinned entries take effect immediately) 48 + pub fn update_anchor_memories(&mut self, pinned_memories: &[String]) { 49 + if let Some(sys) = self.anchor.first_mut() { 50 + // strip any previous pinned memories section and re-append 51 + let base = sys 52 + .content 53 + .as_deref() 54 + .unwrap_or("") 55 + .split("\n\n## pinned memories\n") 56 + .next() 57 + .unwrap_or("") 58 + .to_string(); 59 + let new_content = if pinned_memories.is_empty() { 60 + base 61 + } else { 62 + format!( 63 + "{base}\n\n## pinned memories\n{}", 64 + pinned_memories.join("\n") 65 + ) 66 + }; 67 + sys.content = Some(new_content); 27 68 } 28 69 } 29 70 ··· 49 90 self.turns.push(Message::assistant(content.to_string())); 50 91 } 51 92 93 + /// push an assistant message that contains tool calls (no text content) 94 + pub fn push_assistant_tool_calls(&mut self, calls: Vec<ToolCall>) { 95 + self.turns.push(Message::with_tool_calls(calls)); 96 + } 97 + 98 + /// push a tool result back into the context 99 + pub fn push_tool_result(&mut self, tool_call_id: &str, content: &str) { 100 + self.turns.push(Message::tool_result(tool_call_id, content)); 101 + } 102 + 52 103 pub fn as_messages(&self) -> Vec<Message> { 53 104 self.anchor.iter().chain(&self.turns).cloned().collect() 54 105 } ··· 57 108 self.turns.len() 58 109 } 59 110 60 - /// drains oldest turns for compaction, preserving the `keep` most recent 111 + /// drains oldest turns for compaction, preserving the `keep` most recent. 112 + /// skips over tool-role messages to avoid orphaned tool_call_id references. 61 113 pub fn drain_oldest(&mut self, keep: usize) -> Vec<Message> { 62 114 if self.turns.len() <= keep { 63 115 return vec![]; 64 116 } 65 - // we reset tokens because we lost track of the exact window count after draining 66 - // it will be updated on the next API call anyway 67 117 self.total_tokens = 0; 68 - self.turns.drain(..self.turns.len() - keep).collect() 118 + // find a safe cut point that doesn't split a tool call sequence 119 + let cut = self.turns.len() - keep; 120 + // walk forward from cut until we're at a user/assistant boundary 121 + let mut safe_cut = cut; 122 + while safe_cut < self.turns.len() { 123 + let role = self.turns[safe_cut].role.as_str(); 124 + if role == "user" || role == "assistant" { 125 + break; 126 + } 127 + safe_cut += 1; 128 + } 129 + self.turns.drain(..safe_cut).collect() 69 130 } 70 131 71 132 pub fn update_tokens(&mut self, tokens: usize) {

+3

klbr-core/src/lib.rs

··· 4 4 pub mod interrupt; 5 5 pub mod llm; 6 6 pub mod memory; 7 + pub mod tools; 7 8 8 9 use std::sync::Arc; 9 10 use tokio::sync::RwLock; ··· 25 26 Done, 26 27 Status(String), 27 28 Metrics(AgentMetrics), 29 + ToolCall { name: String, args: String }, 30 + ToolResult { name: String, content: String }, 28 31 }

+153 -8

klbr-core/src/llm.rs

··· 3 3 use reqwest::Client; 4 4 use serde::{Deserialize, Serialize}; 5 5 use serde_json::{json, Value}; 6 + use std::collections::HashMap; 6 7 use tokio::sync::mpsc; 7 8 8 9 use crate::config::Config; 10 + 11 + // ── message types ───────────────────────────────────────────────────────────── 12 + 13 + #[derive(Clone, Debug, Serialize, Deserialize)] 14 + pub struct ToolCallFunction { 15 + pub name: String, 16 + pub arguments: String, 17 + } 18 + 19 + #[derive(Clone, Debug, Serialize, Deserialize)] 20 + pub struct ToolCall { 21 + pub id: String, 22 + #[serde(rename = "type")] 23 + pub kind: String, 24 + pub function: ToolCallFunction, 25 + } 9 26 10 27 #[derive(Clone, Debug, Serialize, Deserialize)] 11 28 pub struct Message { 12 29 pub role: String, 13 - pub content: String, 30 + #[serde(skip_serializing_if = "Option::is_none")] 31 + pub content: Option<String>, 32 + #[serde(skip_serializing_if = "Option::is_none")] 33 + pub tool_calls: Option<Vec<ToolCall>>, 34 + #[serde(skip_serializing_if = "Option::is_none")] 35 + pub tool_call_id: Option<String>, 14 36 } 15 37 16 38 impl Message { 17 39 pub fn system(s: impl Into<String>) -> Self { 18 40 Self { 19 41 role: "system".into(), 20 - content: s.into(), 42 + content: Some(s.into()), 43 + tool_calls: None, 44 + tool_call_id: None, 21 45 } 22 46 } 47 + 23 48 pub fn user(s: impl Into<String>) -> Self { 24 49 Self { 25 50 role: "user".into(), 26 - content: s.into(), 51 + content: Some(s.into()), 52 + tool_calls: None, 53 + tool_call_id: None, 27 54 } 28 55 } 56 + 29 57 pub fn assistant(s: impl Into<String>) -> Self { 30 58 Self { 31 59 role: "assistant".into(), 32 - content: s.into(), 60 + content: Some(s.into()), 61 + tool_calls: None, 62 + tool_call_id: None, 63 + } 64 + } 65 + 66 + pub fn with_tool_calls(calls: Vec<ToolCall>) -> Self { 67 + Self { 68 + role: "assistant".into(), 69 + content: None, 70 + tool_calls: Some(calls), 71 + tool_call_id: None, 72 + } 73 + } 74 + 75 + pub fn tool_result(id: impl Into<String>, content: impl Into<String>) -> Self { 76 + Self { 77 + role: "tool".into(), 78 + content: Some(content.into()), 79 + tool_calls: None, 80 + tool_call_id: Some(id.into()), 81 + } 82 + } 83 + 84 + pub fn content_str(&self) -> &str { 85 + self.content.as_deref().unwrap_or("") 86 + } 87 + } 88 + 89 + // ── tool definitions ────────────────────────────────────────────────────────── 90 + 91 + #[derive(Clone, Debug, Serialize, Deserialize)] 92 + pub struct ToolFunctionDef { 93 + pub name: String, 94 + pub description: String, 95 + pub parameters: Value, 96 + } 97 + 98 + #[derive(Clone, Debug, Serialize, Deserialize)] 99 + pub struct ToolDef { 100 + #[serde(rename = "type")] 101 + pub kind: String, 102 + pub function: ToolFunctionDef, 103 + } 104 + 105 + impl ToolDef { 106 + pub fn function(name: &str, description: &str, parameters: Value) -> Self { 107 + Self { 108 + kind: "function".into(), 109 + function: ToolFunctionDef { 110 + name: name.into(), 111 + description: description.into(), 112 + parameters, 113 + }, 33 114 } 34 115 } 35 116 } 117 + 118 + // ── events ──────────────────────────────────────────────────────────────────── 36 119 37 120 #[derive(Debug, Serialize, Deserialize, Clone, Default)] 38 121 pub struct Usage { ··· 46 129 Token(String), 47 130 ThinkToken(String), 48 131 Usage(Usage), 132 + /// emitted once when the model decides to call one or more tools 133 + ToolCalls(Vec<ToolCall>), 49 134 } 135 + 136 + // ── client ──────────────────────────────────────────────────────────────────── 50 137 51 138 #[derive(Clone)] 52 139 pub struct LlmClient { ··· 54 141 pub config: Config, 55 142 } 56 143 144 + // accumulator for streaming tool call assembly 145 + #[derive(Default)] 146 + struct PartialCall { 147 + id: String, 148 + name: String, 149 + arguments: String, 150 + } 151 + 57 152 impl LlmClient { 58 153 pub fn new(config: Config) -> Self { 59 154 Self { ··· 62 157 } 63 158 } 64 159 65 - pub async fn stream(&self, messages: &[Message], tok_tx: mpsc::Sender<LlmEvent>) -> Result<()> { 66 - let body = json!({ 160 + pub async fn stream( 161 + &self, 162 + messages: &[Message], 163 + tools: &[ToolDef], 164 + tok_tx: mpsc::Sender<LlmEvent>, 165 + ) -> Result<()> { 166 + let mut body = json!({ 67 167 "model": self.config.llm_model, 68 168 "messages": messages, 69 169 "stream": true, 70 170 "stream_options": { "include_usage": true } 71 171 }); 172 + 173 + if !tools.is_empty() { 174 + body["tools"] = json!(tools); 175 + } 72 176 73 177 let mut res = self 74 178 .client ··· 79 183 .bytes_stream(); 80 184 81 185 let mut buf = String::new(); 186 + let mut partial_calls: HashMap<usize, PartialCall> = HashMap::new(); 82 187 83 188 while let Some(chunk) = res.next().await { 84 189 buf.push_str(std::str::from_utf8(&chunk?).unwrap_or("")); ··· 90 195 continue; 91 196 }; 92 197 if data == "[DONE]" { 198 + // flush any accumulated tool calls 199 + if !partial_calls.is_empty() { 200 + let mut calls: Vec<(usize, ToolCall)> = partial_calls 201 + .into_iter() 202 + .map(|(idx, p)| { 203 + ( 204 + idx, 205 + ToolCall { 206 + id: p.id, 207 + kind: "function".into(), 208 + function: ToolCallFunction { 209 + name: p.name, 210 + arguments: p.arguments, 211 + }, 212 + }, 213 + ) 214 + }) 215 + .collect(); 216 + calls.sort_by_key(|(i, _)| *i); 217 + let calls: Vec<ToolCall> = calls.into_iter().map(|(_, c)| c).collect(); 218 + let _ = tok_tx.send(LlmEvent::ToolCalls(calls)).await; 219 + } 93 220 return Ok(()); 94 221 } 95 222 ··· 97 224 continue; 98 225 }; 99 226 100 - if let Some(_usage) = v["usage"].as_object() { 227 + if v["usage"].is_object() { 101 228 let u: Usage = serde_json::from_value(v["usage"].clone())?; 102 229 let _ = tok_tx.send(LlmEvent::Usage(u)).await; 103 230 continue; ··· 111 238 } 112 239 let delta = &choices[0]["delta"]; 113 240 241 + // accumulate tool call deltas 242 + if let Some(tc_arr) = delta["tool_calls"].as_array() { 243 + for tc in tc_arr { 244 + let idx = tc["index"].as_u64().unwrap_or(0) as usize; 245 + let entry = partial_calls.entry(idx).or_default(); 246 + if let Some(id) = tc["id"].as_str() { 247 + entry.id = id.to_string(); 248 + } 249 + if let Some(name) = tc["function"]["name"].as_str() { 250 + entry.name = name.to_string(); 251 + } 252 + if let Some(args) = tc["function"]["arguments"].as_str() { 253 + entry.arguments.push_str(args); 254 + } 255 + } 256 + continue; 257 + } 258 + 114 259 // reasoning_content comes before content during thinking 115 260 if let Some(t) = delta["reasoning_content"].as_str() { 116 261 if !t.is_empty() ··· 133 278 Ok(()) 134 279 } 135 280 136 - /// non-streaming, used for compaction summaries 281 + /// non-streaming completion, used for compaction summaries (no tools) 137 282 pub async fn complete(&self, messages: &[Message]) -> Result<(String, Usage)> { 138 283 let body = json!({ 139 284 "model": self.config.llm_model,

+362 -61

klbr-core/src/memory.rs

··· 1 - use serde::{Deserialize, Serialize}; 2 1 use anyhow::Result; 3 2 use rusqlite::{ffi::sqlite3_auto_extension, params, Connection}; 3 + use serde::{Deserialize, Serialize}; 4 4 use sqlite_vec::sqlite3_vec_init; 5 5 use std::sync::{Arc, Mutex}; 6 6 ··· 15 15 16 16 #[derive(Debug, Clone, Serialize, Deserialize)] 17 17 pub struct Memory { 18 + pub id: i64, 18 19 pub content: String, 20 + pub pinned: bool, 21 + pub tags: Vec<String>, 19 22 pub embedding: Vec<f32>, 20 23 } 21 24 25 + /// a single result from recall() 26 + #[derive(Debug, Clone)] 27 + pub struct RecallEntry { 28 + pub id: i64, 29 + pub content: String, 30 + pub tags: Vec<String>, 31 + /// Some(dist) for semantic hits (lower = closer), None for tag-only hits 32 + pub distance: Option<f32>, 33 + } 34 + 22 35 /// sqlite-backed episodic memory store using sqlite-vec for cosine ANN. 23 36 #[derive(Clone)] 24 37 pub struct MemoryStore { ··· 28 41 29 42 impl MemoryStore { 30 43 pub fn open(path: &str, embed_dim: usize) -> Result<Self> { 31 - // register sqlite-vec extension for all connections opened after this point 32 44 unsafe { 33 45 sqlite3_auto_extension(Some(std::mem::transmute(sqlite3_vec_init as *const ()))); 34 46 } 35 - 36 47 let conn = Connection::open(path)?; 37 48 let store = Self { 38 49 conn: Arc::new(Mutex::new(conn)), 39 50 embed_dim, 40 51 }; 41 52 store.init_schema()?; 53 + store.migrate()?; 42 54 Ok(store) 43 55 } 44 56 45 57 fn init_schema(&self) -> Result<()> { 46 58 let conn = self.conn.lock().unwrap(); 47 59 conn.execute_batch(&format!( 48 - " 49 - CREATE TABLE IF NOT EXISTS memories ( 60 + "CREATE TABLE IF NOT EXISTS memories ( 50 61 id INTEGER PRIMARY KEY, 51 62 content TEXT NOT NULL, 63 + pinned INTEGER NOT NULL DEFAULT 0, 64 + tags TEXT NOT NULL DEFAULT '[]', 52 65 ts INTEGER NOT NULL DEFAULT (unixepoch()) 53 66 ); 54 - 55 67 CREATE VIRTUAL TABLE IF NOT EXISTS vec_memories USING vec0( 56 - embedding float[{}] distance_metric=cosine 68 + embedding float[{dim}] distance_metric=cosine 57 69 ); 58 - 59 70 CREATE TABLE IF NOT EXISTS turns ( 60 71 id INTEGER PRIMARY KEY, 61 72 role TEXT NOT NULL, 62 73 content TEXT NOT NULL, 63 74 thinking TEXT, 64 75 ts INTEGER NOT NULL DEFAULT (unixepoch()) 65 - ); 66 - ", 67 - self.embed_dim 76 + );", 77 + dim = self.embed_dim 68 78 ))?; 69 79 Ok(()) 70 80 } 71 81 82 + fn migrate(&self) -> Result<()> { 83 + let conn = self.conn.lock().unwrap(); 84 + let _ = conn.execute_batch( 85 + "ALTER TABLE memories ADD COLUMN pinned INTEGER NOT NULL DEFAULT 0;\ 86 + ALTER TABLE memories ADD COLUMN tags TEXT NOT NULL DEFAULT '[]';", 87 + ); 88 + Ok(()) 89 + } 90 + 72 91 pub fn reset(&self) -> Result<()> { 73 92 let conn = self.conn.lock().unwrap(); 74 93 conn.execute_batch( 75 - " 76 - DROP TABLE IF EXISTS vec_memories; 77 - DROP TABLE IF EXISTS memories; 78 - DROP TABLE IF EXISTS turns; 79 - ", 94 + "DROP TABLE IF EXISTS vec_memories; 95 + DROP TABLE IF EXISTS memories; 96 + DROP TABLE IF EXISTS turns;", 80 97 )?; 81 98 drop(conn); 82 99 self.init_schema()?; 83 100 Ok(()) 84 101 } 85 102 86 - pub fn store(&self, content: &str, emb: &[f32]) -> Result<()> { 103 + /// store a memory and return its row id 104 + pub fn store(&self, content: &str, emb: &[f32], tags: &[String]) -> Result<i64> { 105 + let tags_json = serde_json::to_string(tags).unwrap_or_else(|_| "[]".into()); 87 106 let conn = self.conn.lock().unwrap(); 88 107 conn.execute( 89 - "INSERT INTO memories (content) VALUES (?1)", 90 - params![content], 108 + "INSERT INTO memories (content, tags) VALUES (?1, ?2)", 109 + params![content, tags_json], 91 110 )?; 92 111 let id = conn.last_insert_rowid(); 93 - let blob = f32s_to_bytes(emb); 94 112 conn.execute( 95 113 "INSERT INTO vec_memories (rowid, embedding) VALUES (?1, ?2)", 96 - params![id, blob], 114 + params![id, f32s_to_bytes(emb)], 115 + )?; 116 + Ok(id) 117 + } 118 + 119 + pub fn set_pinned(&self, id: i64, pinned: bool) -> Result<()> { 120 + self.conn.lock().unwrap().execute( 121 + "UPDATE memories SET pinned = ?1 WHERE id = ?2", 122 + params![pinned as i64, id], 123 + )?; 124 + Ok(()) 125 + } 126 + 127 + pub fn set_tags(&self, id: i64, tags: &[String]) -> Result<()> { 128 + let tags_json = serde_json::to_string(tags).unwrap_or_else(|_| "[]".into()); 129 + self.conn.lock().unwrap().execute( 130 + "UPDATE memories SET tags = ?1 WHERE id = ?2", 131 + params![tags_json, id], 97 132 )?; 98 133 Ok(()) 99 134 } 100 135 101 - /// k-nearest memories by cosine distance, returns (distance, memory) pairs. 102 - pub fn top_k(&self, query: &[f32], k: usize) -> Result<Vec<(f32, Memory)>> { 136 + /// all pinned memory contents, oldest first 137 + pub fn pinned_memories(&self) -> Result<Vec<String>> { 138 + let conn = self.conn.lock().unwrap(); 139 + let mut stmt = 140 + conn.prepare("SELECT content FROM memories WHERE pinned = 1 ORDER BY ts ASC")?; 141 + let results = stmt 142 + .query_map([], |row| row.get(0))? 143 + .filter_map(|r| r.ok()) 144 + .collect(); 145 + Ok(results) 146 + } 147 + 148 + /// most recent unpinned memories with ids and tags, newest first 149 + pub fn recent_unpinned(&self, n: usize) -> Result<Vec<(i64, String, Vec<String>)>> { 103 150 let conn = self.conn.lock().unwrap(); 104 - let blob = f32s_to_bytes(query); 105 151 let mut stmt = conn.prepare( 106 - " 107 - SELECT m.content, v.distance 108 - FROM vec_memories v 109 - JOIN memories m ON m.id = v.rowid 110 - WHERE v.embedding MATCH ?1 111 - ORDER BY v.distance 112 - LIMIT ?2 113 - ", 152 + "SELECT id, content, tags FROM memories WHERE pinned = 0 ORDER BY ts DESC LIMIT ?1", 114 153 )?; 115 154 let results = stmt 116 - .query_map(params![blob, k as i64], |row| { 117 - let content = row.get::<_, String>(0)?; 118 - let dist = row.get::<_, f32>(1)?; 119 - // fetch embedding too since we want Memory to be complete 120 - let mut stmt_emb = conn.prepare("SELECT embedding FROM vec_memories WHERE rowid = (SELECT id FROM memories WHERE content = ?1 LIMIT 1)")?; 121 - let emb_blob: Vec<u8> = stmt_emb.query_row(params![content], |r| r.get(0))?; 122 - let embedding = bytes_to_f32s(&emb_blob); 123 - Ok((dist, Memory { content, embedding })) 155 + .query_map(params![n as i64], |row| { 156 + Ok(( 157 + row.get::<_, i64>(0)?, 158 + row.get::<_, String>(1)?, 159 + row.get::<_, String>(2)?, 160 + )) 124 161 })? 125 162 .filter_map(|r| r.ok()) 163 + .map(|(id, content, tags_str)| { 164 + ( 165 + id, 166 + content, 167 + serde_json::from_str(&tags_str).unwrap_or_default(), 168 + ) 169 + }) 126 170 .collect(); 127 171 Ok(results) 128 172 } 129 173 130 - /// all memories, newest first. 174 + /// semantic search, optionally filtered to a tag subset. 175 + /// 176 + /// - no tags: global ANN search across all memories 177 + /// - with tags: fetch all tag-matched memories, rank by exact cosine similarity in Rust. 178 + /// this guarantees no tag-matched memory is missed due to ANN cutoff. 179 + /// - `tag_and`: true = all tags must match, false = any tag matches 180 + pub fn recall( 181 + &self, 182 + query_emb: Option<&[f32]>, 183 + tags: &[String], 184 + tag_and: bool, 185 + limit: usize, 186 + ) -> Result<Vec<RecallEntry>> { 187 + let Some(emb) = query_emb else { 188 + // no query — fall through to tag-only path 189 + return self.context_for(tags, tag_and, limit); 190 + }; 191 + 192 + if tags.is_empty() { 193 + return self.top_k(emb, limit); 194 + } 195 + 196 + // fetch all tag-matched memories with their embeddings, rank by exact cosine in Rust 197 + let candidates = self.tag_matched_with_embeddings(tags, tag_and)?; 198 + let mut scored: Vec<(f32, RecallEntry)> = candidates 199 + .into_iter() 200 + .map(|(id, content, entry_tags, mem_emb)| { 201 + let dist = cosine_distance(emb, &mem_emb); 202 + ( 203 + dist, 204 + RecallEntry { 205 + id, 206 + content, 207 + tags: entry_tags, 208 + distance: Some(dist), 209 + }, 210 + ) 211 + }) 212 + .collect(); 213 + scored.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap_or(std::cmp::Ordering::Equal)); 214 + Ok(scored.into_iter().take(limit).map(|(_, e)| e).collect()) 215 + } 216 + 217 + /// fetch all memories matching the given tags, newest first. no semantic ranking. 218 + pub fn context_for( 219 + &self, 220 + tags: &[String], 221 + tag_and: bool, 222 + limit: usize, 223 + ) -> Result<Vec<RecallEntry>> { 224 + if tags.is_empty() { 225 + return Ok(vec![]); 226 + } 227 + let mut results = self.by_tags(tags, tag_and)?; 228 + results.truncate(limit); 229 + Ok(results) 230 + } 231 + 232 + /// fetch tag-matched memories including their embeddings, for exact cosine ranking 233 + fn tag_matched_with_embeddings( 234 + &self, 235 + tags: &[String], 236 + tag_and: bool, 237 + ) -> Result<Vec<(i64, String, Vec<String>, Vec<f32>)>> { 238 + if tags.is_empty() { 239 + return Ok(vec![]); 240 + } 241 + let conn = self.conn.lock().unwrap(); 242 + let conditions: Vec<String> = (1..=tags.len()) 243 + .map(|i| format!("m.tags LIKE ?{i} ESCAPE '\\'")) 244 + .collect(); 245 + let sql = format!( 246 + "SELECT m.id, m.content, m.tags, v.embedding 247 + FROM memories m 248 + JOIN vec_memories v ON v.rowid = m.id 249 + WHERE {}", 250 + conditions.join(if tag_and { " AND " } else { " OR " }) 251 + ); 252 + let patterns: Vec<String> = tags 253 + .iter() 254 + .map(|t| { 255 + let e = t 256 + .replace('\\', "\\\\") 257 + .replace('%', "\\%") 258 + .replace('_', "\\_"); 259 + format!("%\"{e}\"%") 260 + }) 261 + .collect(); 262 + let mut stmt = conn.prepare(&sql)?; 263 + let rows = stmt.query_map(rusqlite::params_from_iter(patterns.iter()), |row| { 264 + Ok(( 265 + row.get::<_, i64>(0)?, 266 + row.get::<_, String>(1)?, 267 + row.get::<_, String>(2)?, 268 + row.get::<_, Vec<u8>>(3)?, 269 + )) 270 + })?; 271 + let results = rows 272 + .filter_map(|r| r.ok()) 273 + .map(|(id, content, tags_str, emb_bytes)| { 274 + ( 275 + id, 276 + content, 277 + serde_json::from_str(&tags_str).unwrap_or_default(), 278 + bytes_to_f32s(&emb_bytes), 279 + ) 280 + }) 281 + .collect(); 282 + Ok(results) 283 + } 284 + 285 + /// ANN search, returns up to k results sorted by cosine distance 286 + fn top_k(&self, query: &[f32], k: usize) -> Result<Vec<RecallEntry>> { 287 + let conn = self.conn.lock().unwrap(); 288 + let blob = f32s_to_bytes(query); 289 + let mut stmt = conn.prepare( 290 + "SELECT m.id, m.content, m.tags, v.distance 291 + FROM vec_memories v 292 + JOIN memories m ON m.id = v.rowid 293 + WHERE v.embedding MATCH ?1 294 + ORDER BY v.distance 295 + LIMIT ?2", 296 + )?; 297 + let rows = stmt.query_map(params![blob, k as i64], |row| { 298 + Ok(( 299 + row.get::<_, i64>(0)?, 300 + row.get::<_, String>(1)?, 301 + row.get::<_, String>(2)?, 302 + row.get::<_, f32>(3)?, 303 + )) 304 + })?; 305 + let results = rows 306 + .filter_map(|r| r.ok()) 307 + .map(|(id, content, tags_str, dist)| RecallEntry { 308 + id, 309 + content, 310 + tags: serde_json::from_str(&tags_str).unwrap_or_default(), 311 + distance: Some(dist), 312 + }) 313 + .collect(); 314 + Ok(results) 315 + } 316 + 317 + /// tag-only SQL query, newest first 318 + fn by_tags(&self, tags: &[String], tag_and: bool) -> Result<Vec<RecallEntry>> { 319 + if tags.is_empty() { 320 + return Ok(vec![]); 321 + } 322 + let conn = self.conn.lock().unwrap(); 323 + let conditions: Vec<String> = (1..=tags.len()) 324 + .map(|i| format!("tags LIKE ?{i} ESCAPE '\\'")) 325 + .collect(); 326 + let sql = format!( 327 + "SELECT id, content, tags FROM memories WHERE {} ORDER BY ts DESC", 328 + conditions.join(if tag_and { " AND " } else { " OR " }) 329 + ); 330 + let patterns: Vec<String> = tags 331 + .iter() 332 + .map(|t| { 333 + let e = t 334 + .replace('\\', "\\\\") 335 + .replace('%', "\\%") 336 + .replace('_', "\\_"); 337 + format!("%\"{e}\"%") 338 + }) 339 + .collect(); 340 + let mut stmt = conn.prepare(&sql)?; 341 + let rows = stmt.query_map(rusqlite::params_from_iter(patterns.iter()), |row| { 342 + Ok(( 343 + row.get::<_, i64>(0)?, 344 + row.get::<_, String>(1)?, 345 + row.get::<_, String>(2)?, 346 + )) 347 + })?; 348 + let results = rows 349 + .filter_map(|r| r.ok()) 350 + .map(|(id, content, tags_str)| RecallEntry { 351 + id, 352 + content, 353 + tags: serde_json::from_str(&tags_str).unwrap_or_default(), 354 + distance: None, 355 + }) 356 + .collect(); 357 + Ok(results) 358 + } 359 + 360 + /// all memories with embeddings, newest first 131 361 pub fn get_all(&self) -> Result<Vec<Memory>> { 132 362 let conn = self.conn.lock().unwrap(); 133 363 let mut stmt = conn.prepare( 134 - " 135 - SELECT m.content, v.embedding 136 - FROM memories m 137 - JOIN vec_memories v ON v.rowid = m.id 138 - ORDER BY m.ts DESC 139 - ", 364 + "SELECT m.id, m.content, m.pinned, m.tags, v.embedding 365 + FROM memories m 366 + JOIN vec_memories v ON v.rowid = m.id 367 + ORDER BY m.ts DESC", 140 368 )?; 141 369 let results = stmt 142 370 .query_map([], |row| { 143 - let content = row.get::<_, String>(0)?; 144 - let emb_bytes = row.get::<_, Vec<u8>>(1)?; 145 - let embedding = bytes_to_f32s(&emb_bytes); 146 - Ok(Memory { content, embedding }) 371 + Ok(( 372 + row.get::<_, i64>(0)?, 373 + row.get::<_, String>(1)?, 374 + row.get::<_, i64>(2)? != 0, 375 + row.get::<_, String>(3)?, 376 + row.get::<_, Vec<u8>>(4)?, 377 + )) 147 378 })? 148 379 .filter_map(|r| r.ok()) 380 + .map(|(id, content, pinned, tags_str, emb_bytes)| Memory { 381 + id, 382 + content, 383 + pinned, 384 + tags: serde_json::from_str(&tags_str).unwrap_or_default(), 385 + embedding: bytes_to_f32s(&emb_bytes), 386 + }) 149 387 .collect(); 150 388 Ok(results) 151 389 } ··· 158 396 Ok(()) 159 397 } 160 398 161 - /// last `n` turns in chronological order (oldest first, ready to replay into context). 399 + /// last `n` turns, oldest first 162 400 pub fn recent_turns(&self, n: usize) -> Result<Vec<HistoryEntry>> { 163 401 let conn = self.conn.lock().unwrap(); 164 402 let mut stmt = conn.prepare( 165 - "SELECT id, role, content, thinking, ts FROM (SELECT id, role, content, thinking, ts FROM turns ORDER BY ts DESC LIMIT ?1) ORDER BY ts ASC", 403 + "SELECT id, role, content, thinking, ts FROM \ 404 + (SELECT id, role, content, thinking, ts FROM turns ORDER BY ts DESC LIMIT ?1) \ 405 + ORDER BY ts ASC", 166 406 )?; 167 407 let results = stmt 168 408 .query_map(params![n as i64], |row| { ··· 179 419 Ok(results) 180 420 } 181 421 182 - /// turns older than `before_id`, newest-first then reversed, for scroll-back paging. 422 + /// turns older than `before_id`, newest-first then reversed, for scroll-back paging 183 423 pub fn turns_before(&self, before_id: i64, limit: usize) -> Result<Vec<HistoryEntry>> { 184 424 let conn = self.conn.lock().unwrap(); 185 425 let mut stmt = conn.prepare( 186 - "SELECT id, role, content, thinking, ts FROM (SELECT id, role, content, thinking, ts FROM turns WHERE id < ?1 ORDER BY id DESC LIMIT ?2) ORDER BY id ASC", 426 + "SELECT id, role, content, thinking, ts FROM \ 427 + (SELECT id, role, content, thinking, ts FROM turns WHERE id < ?1 \ 428 + ORDER BY id DESC LIMIT ?2) \ 429 + ORDER BY id ASC", 187 430 )?; 188 431 let results = stmt 189 432 .query_map(params![before_id, limit as i64], |row| { ··· 201 444 } 202 445 } 203 446 447 + /// cosine distance in [0, 2] — matches sqlite-vec's distance_metric=cosine convention 448 + fn cosine_distance(a: &[f32], b: &[f32]) -> f32 { 449 + let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum(); 450 + let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt(); 451 + let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt(); 452 + if norm_a == 0.0 || norm_b == 0.0 { 453 + return 1.0; 454 + } 455 + 1.0 - (dot / (norm_a * norm_b)) 456 + } 457 + 204 458 fn f32s_to_bytes(v: &[f32]) -> Vec<u8> { 205 459 v.iter().flat_map(|f| f.to_le_bytes()).collect() 206 460 } ··· 219 473 #[test] 220 474 fn test_memory_reset() -> Result<()> { 221 475 let tmp = NamedTempFile::new()?; 222 - let path = tmp.path().to_str().unwrap(); 223 - let store = MemoryStore::open(path, 768)?; 476 + let store = MemoryStore::open(tmp.path().to_str().unwrap(), 768)?; 224 477 225 - // add some data 226 478 store.log_turn("user", "hello", None)?; 227 - store.store("mem", &[1.0; 768])?; 479 + store.store("mem", &[1.0; 768], &[])?; 228 480 229 481 assert_eq!(store.recent_turns(10)?.len(), 1); 230 482 let all = store.get_all()?; ··· 232 484 assert_eq!(all[0].content, "mem"); 233 485 assert_eq!(all[0].embedding.len(), 768); 234 486 235 - // reset 236 487 store.reset()?; 237 - 238 488 assert_eq!(store.recent_turns(10)?.len(), 0); 239 489 assert_eq!(store.get_all()?.len(), 0); 490 + Ok(()) 491 + } 492 + 493 + #[test] 494 + fn test_pin_unpin() -> Result<()> { 495 + let tmp = NamedTempFile::new()?; 496 + let store = MemoryStore::open(tmp.path().to_str().unwrap(), 4)?; 497 + 498 + let id = store.store("important fact", &[1.0, 0.0, 0.0, 0.0], &[])?; 499 + assert_eq!(store.pinned_memories()?.len(), 0); 500 + 501 + store.set_pinned(id, true)?; 502 + assert_eq!(store.pinned_memories()?, vec!["important fact"]); 503 + 504 + store.set_pinned(id, false)?; 505 + assert_eq!(store.pinned_memories()?.len(), 0); 506 + Ok(()) 507 + } 508 + 509 + #[test] 510 + fn test_tags_and_recall() -> Result<()> { 511 + let tmp = NamedTempFile::new()?; 512 + let store = MemoryStore::open(tmp.path().to_str().unwrap(), 4)?; 513 + 514 + let tags = vec!["preference".to_string(), "project".to_string()]; 515 + let id = store.store("uses dark mode", &[1.0, 0.0, 0.0, 0.0], &tags)?; 516 + store.store("unrelated", &[0.0, 1.0, 0.0, 0.0], &[])?; 517 + 518 + // tag-only OR 519 + let by_pref = store.recall(None, &["preference".to_string()], false, 10)?; 520 + assert_eq!(by_pref.len(), 1); 521 + assert_eq!(by_pref[0].id, id); 522 + assert!(by_pref[0].distance.is_none()); 523 + 524 + // AND — both tags present 525 + let both = store.recall(None, &tags, true, 10)?; 526 + assert_eq!(both.len(), 1); 527 + 528 + // AND — one tag missing 529 + let none = store.recall( 530 + None, 531 + &["preference".to_string(), "other".to_string()], 532 + true, 533 + 10, 534 + )?; 535 + assert_eq!(none.len(), 0); 536 + 537 + // retag and verify old tag no longer matches 538 + store.set_tags(id, &["preference".to_string()])?; 539 + let no_proj = store.recall(None, &["project".to_string()], false, 10)?; 540 + assert_eq!(no_proj.len(), 0); 240 541 241 542 Ok(()) 242 543 }

+509

klbr-core/src/tools.rs

··· 1 + use serde_json::json; 2 + use tokio::process::Command; 3 + 4 + use crate::{ 5 + llm::{LlmClient, ToolCall, ToolDef}, 6 + memory::MemoryStore, 7 + }; 8 + 9 + /// built-in tool definitions sent to the model 10 + pub fn definitions() -> Vec<ToolDef> { 11 + vec![ 12 + ToolDef::function( 13 + "shell", 14 + "execute a shell command and return its stdout/stderr. use for running code, \ 15 + file operations, system queries, etc.", 16 + json!({ 17 + "type": "object", 18 + "properties": { 19 + "cmd": { 20 + "type": "string", 21 + "description": "the shell command to run" 22 + } 23 + }, 24 + "required": ["cmd"] 25 + }), 26 + ), 27 + ToolDef::function( 28 + "read_file", 29 + "read the contents of a file, optionally limited to a line range", 30 + json!({ 31 + "type": "object", 32 + "properties": { 33 + "path": { 34 + "type": "string", 35 + "description": "absolute or relative path to the file" 36 + }, 37 + "start_line": { 38 + "type": "integer", 39 + "description": "first line to return (1-based, inclusive). omit for beginning." 40 + }, 41 + "end_line": { 42 + "type": "integer", 43 + "description": "last line to return (1-based, inclusive). omit for end of file." 44 + } 45 + }, 46 + "required": ["path"] 47 + }), 48 + ), 49 + ToolDef::function( 50 + "write_file", 51 + "write text content to a file, creating or overwriting it", 52 + json!({ 53 + "type": "object", 54 + "properties": { 55 + "path": { 56 + "type": "string", 57 + "description": "path to the file to write" 58 + }, 59 + "content": { 60 + "type": "string", 61 + "description": "content to write" 62 + } 63 + }, 64 + "required": ["path", "content"] 65 + }), 66 + ), 67 + ToolDef::function( 68 + "remember", 69 + "store something in long-term memory. use whenever you learn something worth keeping \ 70 + across sessions — user preferences, facts about projects, decisions, names, etc. \ 71 + set important=true to pin it so it's always visible at startup.", 72 + json!({ 73 + "type": "object", 74 + "properties": { 75 + "content": { 76 + "type": "string", 77 + "description": "the fact or note to remember, written concisely" 78 + }, 79 + "important": { 80 + "type": "boolean", 81 + "description": "if true, pin this memory so it appears at every startup" 82 + }, 83 + "tags": { 84 + "type": "array", 85 + "items": { "type": "string" }, 86 + "description": "optional category tags, e.g. [\"preference\", \"project\", \"person\"]" 87 + } 88 + }, 89 + "required": ["content"] 90 + }), 91 + ), 92 + ToolDef::function( 93 + "recall", 94 + "semantic search over long-term memory. \ 95 + with no tags: searches all memories by meaning. \ 96 + with tags: searches only within memories that match those tags, \ 97 + ranked by semantic similarity (never misses a tag-matched memory due to global ranking). \ 98 + provide at least a query.", 99 + json!({ 100 + "type": "object", 101 + "properties": { 102 + "query": { 103 + "type": "string", 104 + "description": "what to search for by meaning" 105 + }, 106 + "tags": { 107 + "type": "array", 108 + "items": { "type": "string" }, 109 + "description": "restrict search to memories with these tags, e.g. [\"person:mayer\"] or [\"preference\"]" 110 + }, 111 + "tag_mode": { 112 + "type": "string", 113 + "enum": ["and", "or"], 114 + "description": "\"and\" = all tags must match, \"or\" = any tag matches (default: \"or\")" 115 + }, 116 + "limit": { 117 + "type": "integer", 118 + "description": "max results (default 5)" 119 + } 120 + }, 121 + "required": ["query"] 122 + }), 123 + ), 124 + ToolDef::function( 125 + "context_for", 126 + "fetch all memories associated with a tag — a person, project, topic, etc. \ 127 + use this to load everything you know about someone or something before responding. \ 128 + no semantic ranking; returns newest first.", 129 + json!({ 130 + "type": "object", 131 + "properties": { 132 + "tags": { 133 + "type": "array", 134 + "items": { "type": "string" }, 135 + "description": "tags to fetch, e.g. [\"person:mayer\"] or [\"project:klbr\", \"preference\"]" 136 + }, 137 + "tag_mode": { 138 + "type": "string", 139 + "enum": ["and", "or"], 140 + "description": "\"and\" = all tags must match, \"or\" = any tag matches (default: \"or\")" 141 + }, 142 + "limit": { 143 + "type": "integer", 144 + "description": "max results (default 20)" 145 + } 146 + }, 147 + "required": ["tags"] 148 + }), 149 + ), 150 + ToolDef::function( 151 + "tag_memory", 152 + "set or replace the tags on an existing memory", 153 + json!({ 154 + "type": "object", 155 + "properties": { 156 + "id": { 157 + "type": "integer", 158 + "description": "memory id" 159 + }, 160 + "tags": { 161 + "type": "array", 162 + "items": { "type": "string" }, 163 + "description": "new tag list (replaces existing tags)" 164 + } 165 + }, 166 + "required": ["id", "tags"] 167 + }), 168 + ), 169 + ToolDef::function( 170 + "pin_memory", 171 + "pin an existing memory so it appears at every startup. use during reflection to \ 172 + promote unpinned memories that turned out to be long-term important.", 173 + json!({ 174 + "type": "object", 175 + "properties": { 176 + "id": { 177 + "type": "integer", 178 + "description": "memory id (shown in list_memories or recall results)" 179 + } 180 + }, 181 + "required": ["id"] 182 + }), 183 + ), 184 + ToolDef::function( 185 + "unpin_memory", 186 + "unpin a memory so it's no longer shown at startup (still searchable). use during \ 187 + reflection to demote pinned memories that are no longer important or accurate.", 188 + json!({ 189 + "type": "object", 190 + "properties": { 191 + "id": { 192 + "type": "integer", 193 + "description": "memory id" 194 + } 195 + }, 196 + "required": ["id"] 197 + }), 198 + ), 199 + ToolDef::function( 200 + "list_memories", 201 + "list current pinned memories and recent unpinned memories with their ids. \ 202 + useful before a reflection pass to see what's stored.", 203 + json!({ 204 + "type": "object", 205 + "properties": {}, 206 + "required": [] 207 + }), 208 + ), 209 + ] 210 + } 211 + 212 + /// tool definitions used only during reflection (memory management tools only) 213 + pub fn reflection_definitions() -> Vec<ToolDef> { 214 + definitions() 215 + .into_iter() 216 + .filter(|d| { 217 + matches!( 218 + d.function.name.as_str(), 219 + "remember" 220 + | "recall" 221 + | "context_for" 222 + | "tag_memory" 223 + | "pin_memory" 224 + | "unpin_memory" 225 + | "list_memories" 226 + ) 227 + }) 228 + .collect() 229 + } 230 + 231 + /// dispatch a tool call and return the result as a string 232 + pub async fn execute(call: &ToolCall, memory: &MemoryStore, llm: &LlmClient) -> String { 233 + let args: serde_json::Value = 234 + serde_json::from_str(&call.function.arguments).unwrap_or_default(); 235 + 236 + match call.function.name.as_str() { 237 + "shell" => { 238 + let cmd = match args["cmd"].as_str() { 239 + Some(c) => c.to_string(), 240 + None => return "error: missing required arg 'cmd'".into(), 241 + }; 242 + match Command::new("sh").arg("-c").arg(&cmd).output().await { 243 + Ok(out) => { 244 + let stdout = String::from_utf8_lossy(&out.stdout); 245 + let stderr = String::from_utf8_lossy(&out.stderr); 246 + let code = out.status.code().unwrap_or(-1); 247 + let mut result = format!("exit {code}"); 248 + if !stdout.is_empty() { 249 + result.push('\n'); 250 + if stdout.len() > 20_000 { 251 + result.push_str(&stdout[..20_000]); 252 + result.push_str("\n[...truncated]"); 253 + } else { 254 + result.push_str(&stdout); 255 + } 256 + } 257 + if !stderr.is_empty() { 258 + result.push_str("\nstderr:\n"); 259 + if stderr.len() > 5_000 { 260 + result.push_str(&stderr[..5_000]); 261 + result.push_str("\n[...truncated]"); 262 + } else { 263 + result.push_str(&stderr); 264 + } 265 + } 266 + result 267 + } 268 + Err(e) => format!("error: {e}"), 269 + } 270 + } 271 + 272 + "read_file" => { 273 + let path = match args["path"].as_str() { 274 + Some(p) => p.to_string(), 275 + None => return "error: missing required arg 'path'".into(), 276 + }; 277 + match tokio::fs::read_to_string(&path).await { 278 + Ok(content) => { 279 + let start = args["start_line"] 280 + .as_u64() 281 + .map(|n| n.saturating_sub(1) as usize); 282 + let end = args["end_line"].as_u64().map(|n| n as usize); 283 + 284 + if start.is_none() && end.is_none() { 285 + if content.len() > 50_000 { 286 + format!( 287 + "{}\n[...truncated, {} total bytes]", 288 + &content[..50_000], 289 + content.len() 290 + ) 291 + } else { 292 + content 293 + } 294 + } else { 295 + let lines: Vec<&str> = content.lines().collect(); 296 + let s = start.unwrap_or(0).min(lines.len()); 297 + let e = end.unwrap_or(lines.len()).min(lines.len()); 298 + if s >= e { 299 + return format!("no lines in range {}..{}", s + 1, e); 300 + } 301 + lines[s..e].join("\n") 302 + } 303 + } 304 + Err(e) => format!("error: {e}"), 305 + } 306 + } 307 + 308 + "write_file" => { 309 + let path = match args["path"].as_str() { 310 + Some(p) => p.to_string(), 311 + None => return "error: missing required arg 'path'".into(), 312 + }; 313 + let content = args["content"].as_str().unwrap_or("").to_string(); 314 + match tokio::fs::write(&path, &content).await { 315 + Ok(_) => format!("wrote {} bytes to {path}", content.len()), 316 + Err(e) => format!("error: {e}"), 317 + } 318 + } 319 + 320 + "remember" => { 321 + let content = match args["content"].as_str() { 322 + Some(c) => c.to_string(), 323 + None => return "error: missing required arg 'content'".into(), 324 + }; 325 + let important = args["important"].as_bool().unwrap_or(false); 326 + let tags: Vec<String> = args["tags"] 327 + .as_array() 328 + .map(|arr| { 329 + arr.iter() 330 + .filter_map(|v| v.as_str().map(String::from)) 331 + .collect() 332 + }) 333 + .unwrap_or_default(); 334 + match llm.embed(&content).await { 335 + Ok(emb) => match memory.store(&content, &emb, &tags) { 336 + Ok(id) => { 337 + if important { 338 + if let Err(e) = memory.set_pinned(id, true) { 339 + return format!("stored (id:{id}) but pin failed: {e}"); 340 + } 341 + let tag_info = if tags.is_empty() { 342 + String::new() 343 + } else { 344 + format!(", tags: {}", tags.join(", ")) 345 + }; 346 + format!("stored and pinned (id:{id}{tag_info})") 347 + } else { 348 + let tag_info = if tags.is_empty() { 349 + String::new() 350 + } else { 351 + format!(", tags: {}", tags.join(", ")) 352 + }; 353 + format!("stored (id:{id}{tag_info})") 354 + } 355 + } 356 + Err(e) => format!("error storing memory: {e}"), 357 + }, 358 + Err(e) => format!("error embedding memory: {e}"), 359 + } 360 + } 361 + 362 + "recall" => { 363 + let query = match args["query"].as_str() { 364 + Some(q) => q.to_string(), 365 + None => return "error: missing required arg 'query'".into(), 366 + }; 367 + let tags: Vec<String> = args["tags"] 368 + .as_array() 369 + .map(|arr| { 370 + arr.iter() 371 + .filter_map(|v| v.as_str().map(String::from)) 372 + .collect() 373 + }) 374 + .unwrap_or_default(); 375 + let tag_and = args["tag_mode"].as_str() == Some("and"); 376 + let limit = args["limit"].as_u64().unwrap_or(5) as usize; 377 + 378 + let emb = match llm.embed(&query).await { 379 + Ok(e) => e, 380 + Err(err) => return format!("error embedding query: {err}"), 381 + }; 382 + 383 + match memory.recall(Some(&emb), &tags, tag_and, limit) { 384 + Ok(results) if results.is_empty() => "no memories found".into(), 385 + Ok(results) => results 386 + .into_iter() 387 + .map(|e| { 388 + let tag_str = if e.tags.is_empty() { 389 + String::new() 390 + } else { 391 + format!(" [{}]", e.tags.join(", ")) 392 + }; 393 + format!( 394 + "[dist:{:.3}][id:{}]{tag_str} {}", 395 + e.distance.unwrap_or(0.0), 396 + e.id, 397 + e.content 398 + ) 399 + }) 400 + .collect::<Vec<_>>() 401 + .join("\n"), 402 + Err(err) => format!("error: {err}"), 403 + } 404 + } 405 + 406 + "context_for" => { 407 + let tags: Vec<String> = match args["tags"].as_array() { 408 + Some(arr) => arr 409 + .iter() 410 + .filter_map(|v| v.as_str().map(String::from)) 411 + .collect(), 412 + None => return "error: missing required arg 'tags'".into(), 413 + }; 414 + let tag_and = args["tag_mode"].as_str() == Some("and"); 415 + let limit = args["limit"].as_u64().unwrap_or(20) as usize; 416 + 417 + match memory.context_for(&tags, tag_and, limit) { 418 + Ok(results) if results.is_empty() => { 419 + format!("no memories found for tags: {}", tags.join(", ")) 420 + } 421 + Ok(results) => results 422 + .into_iter() 423 + .map(|e| { 424 + let tag_str = if e.tags.is_empty() { 425 + String::new() 426 + } else { 427 + format!(" [{}]", e.tags.join(", ")) 428 + }; 429 + format!("[id:{}]{tag_str} {}", e.id, e.content) 430 + }) 431 + .collect::<Vec<_>>() 432 + .join("\n"), 433 + Err(err) => format!("error: {err}"), 434 + } 435 + } 436 + 437 + "tag_memory" => { 438 + let id = match args["id"].as_i64() { 439 + Some(i) => i, 440 + None => return "error: missing required arg 'id'".into(), 441 + }; 442 + let tags: Vec<String> = args["tags"] 443 + .as_array() 444 + .map(|arr| { 445 + arr.iter() 446 + .filter_map(|v| v.as_str().map(String::from)) 447 + .collect() 448 + }) 449 + .unwrap_or_default(); 450 + match memory.set_tags(id, &tags) { 451 + Ok(_) => format!("tagged memory {id} with: {}", tags.join(", ")), 452 + Err(e) => format!("error: {e}"), 453 + } 454 + } 455 + 456 + "pin_memory" => { 457 + let id = match args["id"].as_i64() { 458 + Some(i) => i, 459 + None => return "error: missing required arg 'id'".into(), 460 + }; 461 + match memory.set_pinned(id, true) { 462 + Ok(_) => format!("pinned memory {id}"), 463 + Err(e) => format!("error: {e}"), 464 + } 465 + } 466 + 467 + "unpin_memory" => { 468 + let id = match args["id"].as_i64() { 469 + Some(i) => i, 470 + None => return "error: missing required arg 'id'".into(), 471 + }; 472 + match memory.set_pinned(id, false) { 473 + Ok(_) => format!("unpinned memory {id}"), 474 + Err(e) => format!("error: {e}"), 475 + } 476 + } 477 + 478 + "list_memories" => { 479 + let pinned = memory.pinned_memories().unwrap_or_default(); 480 + let unpinned = memory.recent_unpinned(10).unwrap_or_default(); 481 + 482 + let mut out = String::new(); 483 + out.push_str("## pinned\n"); 484 + if pinned.is_empty() { 485 + out.push_str("(none)\n"); 486 + } else { 487 + for (i, content) in pinned.iter().enumerate() { 488 + out.push_str(&format!("{}. {content}\n", i + 1)); 489 + } 490 + } 491 + out.push_str("\n## recent unpinned\n"); 492 + if unpinned.is_empty() { 493 + out.push_str("(none)\n"); 494 + } else { 495 + for (id, content, tags) in &unpinned { 496 + let tag_str = if tags.is_empty() { 497 + String::new() 498 + } else { 499 + format!(" [{}]", tags.join(", ")) 500 + }; 501 + out.push_str(&format!("[id:{id}]{tag_str} {content}\n")); 502 + } 503 + } 504 + out 505 + } 506 + 507 + other => format!("unknown tool: {other}"), 508 + } 509 + }

+2

klbr-daemon/src/daemon.rs

··· 151 151 context_tokens: m.context_tokens, 152 152 watermark: m.watermark, 153 153 }, 154 + AgentEvent::ToolCall { name, args } => ServerMsg::ToolCall { name, args }, 155 + AgentEvent::ToolResult { name, content } => ServerMsg::ToolResult { name, content }, 154 156 }; 155 157 tracing::info!(msg = ?msg, "sending message to client"); 156 158 send_msg(&mut sock_tx, &msg).await?;

+8

klbr-ipc/src/lib.rs

··· 46 46 History { 47 47 turns: Vec<HistoryEntry>, 48 48 }, 49 + ToolCall { 50 + name: String, 51 + args: String, 52 + }, 53 + ToolResult { 54 + name: String, 55 + content: String, 56 + }, 49 57 } 50 58 51 59 pub fn sock_path() -> PathBuf {

+101 -1

klbr-tui/src/main.rs

··· 50 50 reason: Option<Reason>, 51 51 step: AssistantStep, 52 52 }, 53 + /// a tool call/result pair shown inline during generation 54 + Tool { 55 + name: String, 56 + args: String, 57 + result: Option<String>, 58 + }, 53 59 } 54 60 55 61 impl Role { ··· 367 373 } 368 374 was_assistant = false; 369 375 } 376 + Role::Tool { name, args, result } => { 377 + // parse args for display: show key=value pairs or raw if not object 378 + let args_display = if let Ok(v) = serde_json::from_str::<serde_json::Value>(args) { 379 + if let Some(obj) = v.as_object() { 380 + obj.iter() 381 + .map(|(k, v)| { 382 + let val = match v { 383 + serde_json::Value::String(s) => { 384 + if s.len() > 60 { 385 + format!("{}...", &s[..60]) 386 + } else { 387 + s.clone() 388 + } 389 + } 390 + other => other.to_string(), 391 + }; 392 + format!("{k}={val}") 393 + }) 394 + .collect::<Vec<_>>() 395 + .join(" ") 396 + } else { 397 + args.clone() 398 + } 399 + } else { 400 + args.clone() 401 + }; 402 + 403 + lines.push(Line::from(vec![ 404 + Span::styled("$ ", Style::default().fg(Color::Yellow)), 405 + Span::styled( 406 + format!("{name}({args_display})"), 407 + Style::default().fg(Color::Yellow), 408 + ), 409 + ])); 410 + 411 + if let Some(res) = result { 412 + for line in res.lines().take(10) { 413 + lines.push(Line::from(vec![ 414 + Span::raw(" "), 415 + Span::styled(line, Style::default().fg(Color::DarkGray)), 416 + ])); 417 + } 418 + let total_lines = res.lines().count(); 419 + if total_lines > 10 { 420 + lines.push(Line::from(vec![ 421 + Span::raw(" "), 422 + Span::styled( 423 + format!("...{} more lines", total_lines - 10), 424 + Style::default() 425 + .fg(Color::DarkGray) 426 + .add_modifier(Modifier::DIM), 427 + ), 428 + ])); 429 + } 430 + } else { 431 + lines.push(Line::from(vec![ 432 + Span::raw(" "), 433 + Span::styled( 434 + "running...", 435 + Style::default() 436 + .fg(Color::DarkGray) 437 + .add_modifier(Modifier::DIM), 438 + ), 439 + ])); 440 + } 441 + was_assistant = false; 442 + } 370 443 Role::Assistant { reason, step } => { 371 444 was_assistant = true; 372 445 let klbr_span = Span::styled( ··· 631 704 sock_tx.write_all(payload.as_bytes()).await?; 632 705 } 633 706 Cmd::Dump(path) => { 634 - let payload = serde_json::to_string(&ClientMsg::DumpMemories { path })? + "\n"; 707 + let payload = 708 + serde_json::to_string(&ClientMsg::DumpMemories { path })? + "\n"; 635 709 sock_tx.write_all(payload.as_bytes()).await?; 636 710 } 637 711 Cmd::ToggleThink => app.toggle_last_think(), ··· 718 792 app.loading_history = false; 719 793 if !app.prepend_turns(turns) { 720 794 app.history_exhausted = true; 795 + } 796 + } 797 + ServerMsg::ToolCall { name, args } => { 798 + app.snap_to_bottom(); 799 + app.history.push(ChatMsg { 800 + role: Role::Tool { 801 + name, 802 + args, 803 + result: None, 804 + }, 805 + content: String::new(), 806 + expanded: true, 807 + }); 808 + } 809 + ServerMsg::ToolResult { name, content } => { 810 + // find last pending tool message with matching name and fill in result 811 + for msg in app.history.iter_mut().rev() { 812 + if let Role::Tool { 813 + name: n, result, .. 814 + } = &mut msg.role 815 + { 816 + if *n == name && result.is_none() { 817 + *result = Some(content); 818 + break; 819 + } 820 + } 721 821 } 722 822 } 723 823 }

Configure Feed

Configure Feed