# klbr — codebase reference for agents

Personal AI agent harness in Rust. Local LLM chat daemon with long-term memory, tool calling, and a ratatui TUI. Self-hosted, no corporate product feel.

---

## crate layout

```
klbr/
  klbr-core/    — agent loop, LLM client, memory, context, tools
  klbr-daemon/  — WebSocket server, bridges agent to clients
  klbr-ipc/    — shared protocol types (ClientMsg, ServerMsg)
  klbr-tui/    — ratatui TUI chat client
```

**binaries**: `klbr-daemon` (start this first), `klbr-tui` (connects to the daemon)

---

## LLM backend

- **llama-server compatible API** at `http://localhost:1234`
- Chat model: `google/gemma-4-26b-a4b`
- Embedding model: `nomic-embed-text-v1.5` (768 dims)
- Both served from the same endpoint (configured separately in `Config` as `llm_url` / `embed_url`)
- Streaming via SSE (`data: {...}\n`)
- Tool calls use the OpenAI function-calling format with streaming delta accumulation

---

## klbr-core

### `config.rs`

`Config` struct (supports JSON config loading via `Config::load()`). Fields:
- `llm_url`, `embed_url`, `llm_model`, `embed_model`
- `watermark_tokens: 32_000` — triggers compaction when the context exceeds this
- `compaction_keep: 10` — turns to keep after draining
- `memory_top_k: 3`, `memory_sim_threshold: 0.3` — recall injection params
- `history_window` — how many persisted turns to send on connect
- `compaction_llm_url`, `compaction_model` — optional separate LLM for compaction
- `db_path: "agent.db"`, `embed_dim: 768`
- `anchor: String` — system prompt (includes personality + memory tool instructions)

The anchor tells the agent about its memory tools and tagging conventions. Edit it in `config.rs` when adding or changing tools.

### `llm.rs`

**`LlmClient`** (Clone):
- `stream(messages, tools, tok_tx)` — streaming completion; sends `LlmEvent` over an mpsc channel,
  accumulates tool call deltas by index in a `HashMap<usize, PartialCall>`, flushes `LlmEvent::ToolCalls` on `[DONE]`.
- `complete(messages)` — non-streaming; used for compaction summaries and reflection. Returns `(String, Usage)`.
- `embed(text)` — returns a `Vec<f32>` embedding.

**`Message`** struct — OpenAI format:
- `role: String`, `content: Option<String>`
- `tool_calls: Option<Vec<ToolCall>>` — for assistant tool call messages
- `tool_call_id: Option<String>` — for tool result messages
- All optional fields skip serialization when `None` (`#[serde(skip_serializing_if)]`)
- Constructors: `Message::system()`, `::user()`, `::assistant()`, `::with_tool_calls()`, `::tool_result()`

**`LlmEvent`** variants: `Token(String)`, `ThinkToken(String)`, `Usage(Usage)`, `ToolCalls(Vec<ToolCall>)`

### `memory.rs`

SQLite + sqlite-vec. Single DB file (`agent.db`). Two tables:

**`memories`** — episodic memory store:
- `id, content TEXT, pinned INTEGER (0/1), tags TEXT (JSON array), ts INTEGER`
- paired with virtual table **`vec_memories`** (sqlite-vec, cosine distance metric, 768 dims)
- migration-safe: `migrate()` runs `ALTER TABLE ADD COLUMN` (fails silently if the column already exists)

**`turns`** — full turn history:
- `id, role TEXT, content TEXT, thinking TEXT, ts INTEGER`

**`MemoryStore`** (Clone, wraps `Arc<Mutex<Connection>>`):
- `store(content, emb, tags)` → `Result<i64>` — insert a memory, return its id
- `set_pinned(id, bool)` — pin/unpin
- `set_tags(id, tags)` — replace tags
- `pinned_memories()` → `Vec<String>` — for anchor injection at startup
- `recent_unpinned(n)` → `Vec<(i64, String, Vec<String>)>` — for the reflection prompt
- `recall(query_emb, tags, tag_and, limit)` → `Vec<RecallEntry>` — **main search method**:
  - no tags: global ANN via sqlite-vec
  - with tags + query: fetch all tag-matched memories with embeddings, exact cosine in Rust (never misses due to the ANN cutoff)
  - with tags only: delegates to `context_for`
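The exact-cosine path above can be sketched as a plain function. This is a hedged, std-only sketch, not the actual `memory.rs` helper; the doc only states that the private `cosine_distance(a, b)` returns a distance in [0, 2] matching the sqlite-vec convention, so the signature here is an assumption:

```rust
/// Cosine distance in [0, 2], matching the sqlite-vec convention:
/// 0 = same direction, 1 = orthogonal, 2 = opposite.
/// (Sketch only; the real private helper in `memory.rs` may differ.)
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 1.0; // degenerate input: treat as orthogonal
    }
    1.0 - dot / (na * nb)
}
```

Computing this exactly over every tag-matched row is what lets the tag+query path avoid the ANN index's recall cutoff.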
- `context_for(tags, tag_and, limit)` → `Vec<RecallEntry>` — pure SQL tag lookup, newest first
- `log_turn(role, content, thinking)` — append to the turns table
- `recent_turns(n)` → chronological slice (oldest first) for context replay
- `turns_before(before_id, limit)` → for TUI scroll-back paging
- `get_all()` → `Vec<Memory>` — all memories with embeddings (for dump)
- `reset()` — drop and recreate all tables

**`RecallEntry`**: `id, content, tags: Vec<String>, distance: Option<f32>` (`None` = tag-only hit)

**internal helpers** (private):
- `top_k(emb, k)` — ANN via sqlite-vec
- `by_tags(tags, tag_and)` — SQL `LIKE` on the JSON array with escape handling
- `tag_matched_with_embeddings(tags, tag_and)` — for the exact cosine path
- `cosine_distance(a, b)` — returns a value in [0, 2], matching the sqlite-vec convention

### `context.rs`

In-memory sliding window sent to the LLM on each turn.

**`Context`**:
- `anchor: Vec<Message>` — never evicted (system prompt + pinned memories)
- `turns: Vec<Message>` — rolling conversation
- `total_tokens: usize` — updated from `LlmEvent::Usage`

Key methods:
- `new(anchor, pinned_memories)` — builds the system message, appends the pinned-memories section
- `update_anchor(anchor, pinned)` — rebuilds the system message with the pinned section
- `load_turns(pairs)` — replay `(role, content)` pairs from the DB on startup; skips tool/other roles (ephemeral)
- `inject_recalled_memories(memories)` — ephemeral assistant message `[recalled memory]` with `[id:..] [tags:..]` blocks
- `push_input(content)` — user turn only (persisted history stays clean)
- `push_assistant_tool_calls(calls)` — assistant message with tool_calls, no content
- `push_tool_result(id, content)` — tool role message
- `drain_oldest(keep)` — removes all but the `keep` most recent turns; walks forward from the cut point to avoid splitting tool call sequences (never cuts mid-tool-call)
- `as_messages()` — anchor + turns concatenated, ready to send to the LLM

### `tools.rs`

**`definitions()`** — full tool list sent to the LLM on every turn:
- `shell(cmd)` — runs via `sh -c`, caps stdout at 20k / stderr at 5k chars
- `read_file(path, start_line?, end_line?)` — caps at 50k bytes
- `write_file(path, content)`
- `remember(content, important?, tags?)` — embeds and stores; pins if `important=true`
- `recall(query, tags?, tag_mode?, limit?)` — semantic search, optional tag filter
- `context_for(tags, tag_mode?, limit?)` — pure tag lookup, default limit 20
- `edit_memory(id?, tags?, pinned?, special?)` — unified edit tool (including the special anchor memory)
- `list_memories()` — shows pinned + 10 recent unpinned with ids and tags

**`memory_tools()`** — filtered subset for the reflection mini-loop: `remember, recall, context_for, edit_memory, list_memories` (no shell/file tools)

**`execute(call, memory, llm)`** — async dispatch by tool name. Needs `&MemoryStore` and `&LlmClient` for the memory tools.

### `agent.rs`

Main async loop (`run()`). Receives `Interrupt` over mpsc, sends `AgentEvent` over broadcast.

**startup**:
1. load pinned memories → `Context::new(anchor, pinned)`
2. replay recent turns from the DB → `ctx.load_turns()`

**interrupt handling**:
- `Reset` → clear context, emit `Status`
- `Compact` → call `compact()` immediately
- `UserMessage` → embed query, recall, `ctx.inject_recalled_memories()`, `ctx.push_input()`, `log_turn()`

**tool loop** (max 20 iterations):
1. spawn `llm.stream()` in a background task
2. collect `LlmEvent`s: accumulate tokens/thinking, capture `ToolCalls`
3. if tool calls: emit `ToolCall` events, `tools::execute()` each, emit `ToolResult`, push to context, loop
4. if plain text: push the assistant message, `log_turn()`, emit `Done` + `Metrics`, check the watermark

**`compact(output)`**:
1. emit `Status("reflecting...")`
2. call `reflect()` — ephemeral mini tool loop (memory tools only, max 6 iterations, separate context)
3. drain the oldest turns from the main context
4. LLM-summarize the drained text → store with tag `["compaction_summary"]`
5. reload pinned memories and call `ctx.update_anchor_memories()`

**`reflect()`**:
- builds a reflection prompt with: an outline of the last 10 turns (truncated), current pinned + recent unpinned memories
- runs a mini stream loop with the `reflection_definitions()` tools
- the agent pins/unpins/remembers/tags as it sees fit
- ephemeral — results don't enter the main context

### `interrupt.rs`

`Interrupt` enum: `UserMessage(String)`, `Reset`, `Compact`

`spawn_source(tx, f)` — helper for future external interrupt sources (e.g. Bluesky notifications). Not currently used.

### `lib.rs`

Re-exports modules.
Defines:
- `MetricsSnapshot = Arc<RwLock<Option<AgentMetrics>>>`
- `AgentMetrics { turn_count, context_tokens, watermark }`
- `AgentEvent` enum: `Started, Token(String), ThinkToken(String), Done, Status(String), Metrics(AgentMetrics), ToolCall { name, args }, ToolResult { name, content }`

---

## klbr-ipc

`ClientMsg` (TUI → daemon, tagged by `type` field):
- `Message { source, content }` — chat message
- `FetchHistory { before_id, limit }` — scroll-back paging
- `Compact` — manual compaction trigger
- `Reset` — wipe DB and context
- `DumpMemories { path: Option<String> }` — dump memories as JSON to a file

`ServerMsg` (daemon → TUI):
- `Started, Token { content }, ThinkToken { content }, Done`
- `Status { content }` — status bar text
- `Metrics { turn_count, context_tokens, watermark }`
- `History { turns: Vec<HistoryEntry> }` — sent on connect and on `FetchHistory`
- `ToolCall { name, args }`, `ToolResult { name, content }`

`HistoryEntry { id, timestamp, role, content, reasoning: Option<String> }`

`ws_url()` → `ws://127.0.0.1:8765`

Protocol: one JSON `ClientMsg`/`ServerMsg` per WebSocket text frame.

---

## klbr-daemon

`main.rs` — wires everything together:
1. `Config::load()`
2. open `MemoryStore`, create `LlmClient`
3. spawn `agent::run()` and `daemon::serve()` concurrently
4. `tokio::select!` on both, propagate errors

`daemon.rs` — `serve()` accepts connections in a loop; each connection gets its own `handle()` task.

`handle()` per connection:
1. push `History { turns }` immediately (last `history_window` turns from the DB)
2. push current `Metrics` from the snapshot if available
3. `tokio::select!` between:
   - incoming WebSocket `ClientMsg` frames → translate to `Interrupt` or handle directly (`FetchHistory`, `Reset`, `Compact`, `DumpMemories`)
   - `AgentEvent` from broadcast → translate to `ServerMsg`, send to the client

`send_msg()` — serialize `ServerMsg` to JSON, send as a WS text frame.

---

## klbr-tui

Ratatui TUI using crossterm + `tui-scrollview`.

**`App`** state:
- `history: Vec<ChatMsg>` — display model
- `scroll: ScrollViewState`, `at_bottom: bool` — scroll tracking
- `input: String`, `cursor: usize`, `cmd_mode: bool` — input box
- `oldest_turn_id`, `history_exhausted`, `loading_history` — scroll-back paging
- `turn_count, context_tokens, watermark, last_tps` — metrics display

**`ChatMsg`** with **`Role`** enum:
- `User` — cyan "you " prefix
- `Assistant { reason: Option<Reason>, step: AssistantStep }` — green "klbr " prefix; `Reason` is a collapsible thinking block; `AssistantStep` tracks `PromptProcessing → Reasoning → Response → Done`
- `System` — dark gray, dimmed
- `Tool { name, args, result: Option<String> }` — yellow `$ name(key=val...)` header + up to 10 lines of result (or "running..." while pending)

**Commands** (typed with `/` prefix):
- `/clear` (`/c`) — clear display
- `/compact` (`/cp`) — send `ClientMsg::Compact`
- `/reset` — clear display + send `ClientMsg::Reset`
- `/dump [path]` — send `ClientMsg::DumpMemories`
- `/think` (`/t`) — toggle the reasoning block on the last assistant message
- `/help` (`/h`) — show help inline

**Event loop**: `tokio::select!` between crossterm events and socket lines.

Scroll-back: PageUp sends `FetchHistory { before_id: oldest_turn_id, limit: 50 }`. `prepend_turns()` inserts the older turns at the front of the history vec.

Tool result matching: on `ServerMsg::ToolResult`, scan history in reverse for the last `Role::Tool { name: matching, result: None }` and fill in the result.
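The reverse-scan back-fill above can be sketched as follows. This is a minimal sketch with simplified stand-in types: the real `ChatMsg`/`Role` carry more fields and variants, and `fill_tool_result` is a hypothetical name, not necessarily the function in the TUI:

```rust
/// Simplified stand-ins for the TUI's display types (sketch only; the real
/// `Role::Tool` also carries `args`, and other variants carry more state).
#[derive(Debug, PartialEq)]
enum Role {
    Tool { name: String, result: Option<String> },
    Other,
}

/// On a tool result, walk the display history in reverse and fill the most
/// recent pending tool call whose name matches. Returns whether a pending
/// slot was found.
fn fill_tool_result(history: &mut [Role], name: &str, result: &str) -> bool {
    for msg in history.iter_mut().rev() {
        if let Role::Tool { name: n, result: slot } = msg {
            if n.as_str() == name && slot.is_none() {
                *slot = Some(result.to_string());
                return true;
            }
        }
    }
    false
}
```

Scanning in reverse and requiring `result: None` means a repeated tool name fills the newest pending call first and never overwrites an already-completed one.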
Status bar (bottom line): `{tps} ctx {pct}% ({remaining} tok left) (turns: N) {status}`

---

## data flow summary

```
TUI ──ClientMsg──► daemon ──Interrupt──► agent
                                           │
                                       tool loop
                                           │
TUI ◄──ServerMsg── daemon ◄─AgentEvent── agent
```

---

## things not yet implemented

- multiple clients (broadcast works, but history paging is per-connection)
- external interrupt sources (Bluesky, etc.) — `spawn_source` is ready but unused
- auth/encryption for networked transports (the WebSocket is local-only by default)


<!-- headroom:rtk-instructions -->
# RTK (Rust Token Killer) - Token-Optimized Commands

When running shell commands, **always prefix with `rtk`**. This reduces context
usage by 60-90% with zero behavior change. If rtk has no filter for a command,
it passes through unchanged — so it is always safe to use.

## Key Commands
```bash
# Git (59-80% savings)
rtk git status        rtk git diff       rtk git log

# Files & Search (60-75% savings)
rtk ls <path>         rtk read <file>    rtk grep <pattern>
rtk find <pattern>    rtk diff <file>

# Test (90-99% savings) — shows failures only
rtk pytest tests/     rtk cargo test     rtk test <cmd>

# Build & Lint (80-90% savings) — shows errors only
rtk tsc               rtk lint           rtk cargo build
rtk prettier --check  rtk mypy           rtk ruff check

# Analysis (70-90% savings)
rtk err <cmd>         rtk log <file>     rtk json <file>
rtk summary <cmd>     rtk deps           rtk env

# GitHub (26-87% savings)
rtk gh pr view <n>    rtk gh run list    rtk gh issue list

# Infrastructure (85% savings)
rtk docker ps         rtk kubectl get    rtk docker logs <c>

# Package managers (70-90% savings)
rtk pip list          rtk pnpm install   rtk npm run <script>
```

## Rules
- In command chains, prefix each segment: `rtk git add . && rtk git commit -m "msg"`
- For debugging, use the raw command without the rtk prefix
- `rtk proxy <cmd>` runs a command without filtering but tracks usage
<!-- /headroom:rtk-instructions -->