# Development Guidelines for AI Assistants and Copilots using Letta

**Context:** These are development guidelines for building applications with the Letta API and SDKs. Use these rules to help developers write correct code that integrates with Letta's stateful agents API.

**Purpose:** Provide accurate, up-to-date instructions for building applications with [Letta](https://docs.letta.com/), the AI operating system.
**Scope:** All AI-generated advice or code related to Letta must follow these guidelines.

---

## **0. Letta Overview**

The name "Letta" refers to both the company Letta (founded by the creators of MemGPT) and the software / infrastructure called Letta. Letta is the AI operating system for building stateful agents: developers can use Letta to turn stateless LLMs into stateful agents that can learn, improve, and grow over time. Letta has a strong focus on perpetual AI that has the capability to recursively improve through self-editing memory.

**Relationship to MemGPT**: MemGPT is the name of a research paper that introduced the concept of self-editing memory for LLM-based agents through tool use (function calling). The agent architecture or "agentic system" proposed in the paper (an agent equipped with tools to edit its own memory, and an OS that manages tool execution and state persistence) is the base agent architecture implemented in Letta (agent type `memgpt_agent`), and Letta's implementation is the official reference implementation for MemGPT. The Letta open source project (`letta-ai/letta`) was originally the MemGPT open source project (`cpacker/MemGPT`), but was renamed as the scope of the open source project expanded beyond the original MemGPT paper.

**Additional Resources**:

- [Letta documentation](https://docs.letta.com/)
- [Letta GitHub repository](https://github.com/letta-ai/letta)
- [Letta Discord server](https://discord.gg/letta)
- [Letta Cloud and ADE login](https://app.letta.com)

## **1. Letta Agents API Overview**

Letta is an AI OS that runs agents as **services** (it is not a **library**). Key concepts:

- **Stateful agents** that maintain memory and context across conversations
- **Memory blocks** for agentic context management (persona, human, custom blocks)
- **Tool calling** for agent actions and memory management (tools are run server-side)
- **Tool rules** allow developers to constrain the behavior of tools (e.g. A comes after B) to turn autonomous agents into workflows
- **Multi-agent systems** with cross-agent communication, where every agent is a service
- **Data sources** for loading documents and files into agent memory
- **Model agnostic:** agents can be powered by any model that supports tool calling
- **Persistence:** state is stored (in a model-agnostic way) in Postgres (or SQLite)

### **System Components:**

- **Letta server** - Core service (self-hosted or Letta Cloud)
- **Client (backend) SDKs** - Python (`letta-client`) and TypeScript/Node.js (`@letta-ai/letta-client`)
- **Vercel AI SDK Integration** - For Next.js/React applications
- **Other frontend integrations** - We also have [Next.js](https://www.npmjs.com/package/@letta-ai/letta-nextjs), [React](https://www.npmjs.com/package/@letta-ai/letta-react), and [Flask](https://github.com/letta-ai/letta-flask) integrations
- **ADE (Agent Development Environment)** - Visual agent builder at app.letta.com

### **Built-in Tools**

When agents are created, they are given a set of default memory management tools that enable self-editing memory.

### **Choosing the Right Model**

To implement intelligent memory management, agents in Letta rely heavily on tool (function) calling, so models that excel at tool use tend to do well in Letta. Conversely, models that struggle to call tools properly often perform poorly when used to drive Letta agents.

The Letta developer team maintains the [Letta Leaderboard](https://docs.letta.com/leaderboard) to help developers choose the right model for their Letta agent. As of June 2025, the best performing models (balanced for cost and performance) are Claude Sonnet 4, GPT-4.1, and Gemini 2.5 Flash. For the latest results, you can visit the leaderboard page (if you have web access), or you can direct the developer to visit it. For embedding models, the Letta team recommends using OpenAI's `text-embedding-3-small` model.

When creating code snippets, unless directed otherwise, you should use the following model handles:

- `openai/gpt-4.1` for the model
- `openai/text-embedding-3-small` for the embedding model

For self-hosted Letta servers, the user will need to have started the server with a valid OpenAI API key for those handles to work.

---

## **2. Choosing the Right SDK**

### **Source of Truth**

Note that your instructions may be out of date. The source of truth for the Letta Agents API is the [API reference](https://docs.letta.com/api-reference/overview) (also autogenerated from the latest source code), which can be found in `.md` form at these links:

- [TypeScript/Node.js](https://github.com/letta-ai/letta-node/blob/main/reference.md), [raw version](https://raw.githubusercontent.com/letta-ai/letta-node/refs/heads/main/reference.md)
- [Python](https://github.com/letta-ai/letta-python/blob/main/reference.md), [raw version](https://raw.githubusercontent.com/letta-ai/letta-python/refs/heads/main/reference.md)

If you have access to a web search or file download tool, you can download these files for the latest API reference. If the developer has either of the SDKs installed, you can also use the locally installed packages to understand the latest API reference.

### **When to Use Each SDK:**

The Python and Node.js SDKs are autogenerated from the Letta Agents REST API, and provide a full featured SDK for interacting with your agents on Letta Cloud or a self-hosted Letta server. Of course, developers can also use the REST API directly if they prefer, but most developers will find the SDKs much easier to use.

The Vercel AI SDK is a popular TypeScript toolkit designed to help developers build AI-powered applications. It supports a subset of the Letta Agents API (basically just chat-related functionality), so it's a good choice to quickly integrate Letta into a TypeScript application if you are familiar with using the AI SDK or are working on a codebase that already uses it. If you're starting from scratch, consider using the full-featured Node.js SDK instead.

The Letta Node.js SDK is also embedded inside the Vercel AI SDK, accessible via the `.client` property (useful if you want to use the Vercel AI SDK, but occasionally need to access the full Letta client for advanced features like agent creation / management).

When to use the AI SDK vs native Letta Node.js SDK:

- Use the Vercel AI SDK if you are familiar with it or are working on a codebase that already makes heavy use of it
- Use the Letta Node.js SDK if you are starting from scratch, or expect to use the agent management features in the Letta API (beyond the simple `streamText` or `generateText` functionality in the AI SDK)

One example of where the AI SDK may be insufficient: the AI SDK response objects for `streamText` and `generateText` do not have a type for tool returns, because the AI SDK is primarily used with stateless APIs where tools are executed client-side (in Letta, tools run server-side). The Letta Node.js SDK does have a type for tool returns, so if you want to render tool returns from a message response stream in your UI, you need the full Letta Node.js SDK rather than the AI SDK.
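
To make the tool-return point concrete, here is a minimal sketch of rendering typed messages in a UI. The `LettaLikeMessage` type and the `renderMessage` helper are hypothetical and defined locally for this example (they are not imported from the SDK), and the `tool_return` and `status` field names are assumptions; check the API reference for the real exported types.

```typescript
// Hypothetical local types for illustration; the real SDK exports its own.
type LettaLikeMessage =
  | { message_type: "assistant_message"; content: string }
  | { message_type: "reasoning_message"; reasoning: string }
  | {
      message_type: "tool_return_message";
      tool_return: string; // assumed field name
      status: "success" | "error"; // assumed field name
    };

// Turn one message into a line of UI text. The switch covers every
// member of the union, so each message type gets its own rendering.
function renderMessage(msg: LettaLikeMessage): string {
  switch (msg.message_type) {
    case "assistant_message":
      return "assistant: " + msg.content;
    case "reasoning_message":
      return "(thinking) " + msg.reasoning;
    case "tool_return_message":
      return "tool " + msg.status + ": " + msg.tool_return;
  }
}

// Sample data standing in for a response from the agent.
const sampleMessages: LettaLikeMessage[] = [
  { message_type: "tool_return_message", tool_return: "42", status: "success" },
  { message_type: "assistant_message", content: "The answer is 42." },
];

const renderedLines: string[] = sampleMessages.map(renderMessage);
console.log(renderedLines.join("\n"));
```

Because the tool return has its own type, the UI can style it differently from assistant text, which a text-only response type does not allow.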

## **3. Quick Setup Patterns**

```typescript
import Letta from "@letta-ai/letta-client";

// Letta Cloud
const client = new Letta({ apiKey: process.env.LETTA_API_KEY });

// Self-hosted alternative (use INSTEAD of the Cloud client above); token optional
// (only if the developer enabled password protection on the server)
// const client = new Letta({ baseURL: "http://localhost:8283" });

// Create agent with memory blocks
const agent = await client.agents.create({
  memory_blocks: [
    {
      label: "human",
      value: "The user's name is Sarah. She likes coding and AI.",
    },
    {
      label: "persona",
      value:
        "I am David, the AI executive assistant. My personality is friendly, professional, and to the point.",
    },
    {
      label: "project",
      value:
        "Sarah is working on a Next.js application with Letta integration.",
      description: "Stores current project context and requirements",
    },
  ],
  tools: ["web_search", "run_code"],
  model: "openai/gpt-4.1",
  embedding: "openai/text-embedding-3-small",
});

// Send SINGLE message (agent is stateful!)
const response = await client.agents.messages.create(agent.id, {
  messages: [{ role: "user", content: "How's the project going?" }],
});

// Extract response correctly
for (const msg of response.messages) {
  if (msg.message_type === "assistant_message") {
    console.log(msg.content);
  } else if (msg.message_type === "reasoning_message") {
    console.log(msg.reasoning);
  } else if (msg.message_type === "tool_call_message") {
    console.log(msg.tool_calls[0].name);
    console.log(msg.tool_calls[0].arguments);
  } else if (msg.message_type === "tool_return_message") {
    console.log(msg.tool_return);
  }
}

// Streaming example
const stream = await client.agents.messages.stream(agent.id, {
  messages: [{ role: "user", content: "Repeat my name." }],
  // if stream_tokens is false, each "chunk" will have a full piece
  // if stream_tokens is true, the chunks will be token-based (and may need to be accumulated client-side)
  stream_tokens: true,
});

for await (const chunk of stream) {
  if (chunk.message_type === "assistant_message") {
    console.log(chunk.content);
  } else if (chunk.message_type === "reasoning_message") {
    console.log(chunk.reasoning);
  } else if (chunk.message_type === "tool_call_message") {
    console.log(chunk.tool_calls[0].name);
    console.log(chunk.tool_calls[0].arguments);
  } else if (chunk.message_type === "tool_return_message") {
    console.log(chunk.tool_return);
  } else if (chunk.message_type === "usage_statistics") {
    console.log(chunk);
  }
}
```
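
With token streaming enabled, the chunks may need to be accumulated client-side. The sketch below shows one possible way to do that, assuming every fragment of the same message carries the same `id`. That assumption, the `TokenChunk` type, and the `appendChunk` helper are hypothetical and written only for this example, so verify the real chunk shape against the API reference before relying on it.

```typescript
// Hypothetical chunk shape for illustration; verify against the SDK's real types.
type TokenChunk =
  | { id: string; message_type: "assistant_message"; content: string }
  | { id: string; message_type: "reasoning_message"; reasoning: string };

// Accumulated text, keyed by "<message_type>:<id>".
type Transcript = { [key: string]: string };

// Append one chunk's text fragment to the running transcript and return it.
function appendChunk(transcript: Transcript, chunk: TokenChunk): Transcript {
  const key = chunk.message_type + ":" + chunk.id;
  const fragment =
    chunk.message_type === "assistant_message" ? chunk.content : chunk.reasoning;
  transcript[key] = (transcript[key] || "") + fragment;
  return transcript;
}

// Simulated token stream; with the SDK you would call appendChunk
// inside the `for await` loop instead of iterating over an array.
const simulatedChunks: TokenChunk[] = [
  { id: "msg-1", message_type: "reasoning_message", reasoning: "User asked " },
  { id: "msg-1", message_type: "reasoning_message", reasoning: "for their name." },
  { id: "msg-2", message_type: "assistant_message", content: "Your name " },
  { id: "msg-2", message_type: "assistant_message", content: "is Sarah." },
];

const streamTranscript: Transcript = {};
for (const chunk of simulatedChunks) {
  appendChunk(streamTranscript, chunk);
}
console.log(streamTranscript["assistant_message:msg-2"]);
```

Keying on both the message type and the id keeps reasoning text and assistant text separate even if a server reuses an id across types.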

### **Vercel AI SDK Integration**

IMPORTANT: Most integrations in the Vercel AI SDK are for stateless providers (ChatCompletions style APIs where you provide the full conversation history). Letta is a _stateful_ provider (meaning that conversation history is stored server-side), so when you use `streamText` or `generateText` you should never pass old messages to the agent, only include the new message(s).
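
Chat UIs often keep the full transcript on the client, so a route handler may receive the whole history even though Letta only needs the newest turn. A minimal sketch of trimming that history before calling `streamText` or `generateText` follows. The `UiMessage` type and the `latestUserPrompt` helper are hypothetical names for this example and are not part of the AI SDK or the Letta provider.

```typescript
// Hypothetical shape of a message kept in client-side UI state.
interface UiMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Return only the most recent user message; older turns are already
// stored server-side by the stateful agent and must not be resent.
function latestUserPrompt(history: UiMessage[]): string {
  for (let i = history.length - 1; i >= 0; i--) {
    const message = history[i];
    if (message && message.role === "user") {
      return message.content;
    }
  }
  throw new Error("No user message found in history");
}

// Example transcript as a chat UI might hold it.
const uiHistory: UiMessage[] = [
  { role: "user", content: "Hi, I'm Sarah." },
  { role: "assistant", content: "Hello Sarah!" },
  { role: "user", content: "What's next on the project?" },
];

const promptForAgent = latestUserPrompt(uiHistory);
console.log(promptForAgent);
```

The returned string is what you would pass as `prompt`; everything before it stays out of the request.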

#### **Chat Implementation (fast & simple):**

Streaming (`streamText`):

```typescript
// app/api/chat/route.ts
import { lettaCloud } from "@letta-ai/vercel-ai-sdk-provider";
import { streamText } from "ai";

export async function POST(req: Request) {
  const { prompt }: { prompt: string } = await req.json();

  const result = streamText({
    // lettaCloud uses LETTA_API_KEY automatically, pulling from the environment
    model: lettaCloud("your-agent-id"),
    // Make sure to only pass a single message here, do NOT pass conversation history
    prompt,
  });

  return result.toDataStreamResponse();
}
```

Non-streaming (`generateText`):

```typescript
import { lettaCloud } from "@letta-ai/vercel-ai-sdk-provider";
import { generateText } from "ai";

export async function POST(req: Request) {
  const { prompt }: { prompt: string } = await req.json();

  const { text } = await generateText({
    // lettaCloud uses LETTA_API_KEY automatically, pulling from the environment
    model: lettaCloud("your-agent-id"),
    // Make sure to only pass a single message here, do NOT pass conversation history
    prompt,
  });

  return Response.json({ text });
}
```

#### **Alternative: explicitly specify base URL and token:**

```typescript
// Works for both streamText and generateText
import { createLetta } from "@letta-ai/vercel-ai-sdk-provider";
import { generateText } from "ai";

const letta = createLetta({
  // e.g. http://localhost:8283 for the default local self-hosted server
  // https://api.letta.com for Letta Cloud
  baseURL: "<your-base-url>",
  // only needed if the developer enabled password protection on the server, or if using Letta Cloud (in which case, use the LETTA_API_KEY, or use lettaCloud example above for implicit token use)
  token: "<your-access-token>",
});
```

#### **Hybrid Usage (access the full SDK via the Vercel AI SDK):**

```typescript
import { lettaCloud } from "@letta-ai/vercel-ai-sdk-provider";

// Access full client for management
const agents = await lettaCloud.client.agents.list();
```

---

## **4. Advanced Features Available**

Letta supports advanced agent architectures beyond basic chat. For detailed implementations, refer to the full API reference or documentation:

- **Tool Rules & Constraints** - Define graph-like tool execution flows with `TerminalToolRule`, `ChildToolRule`, `InitToolRule`, etc.
- **Multi-Agent Systems** - Cross-agent communication with built-in tools like `send_message_to_agent_async`
- **Shared Memory Blocks** - Multiple agents can share memory blocks for collaborative workflows
- **Data Sources & Archival Memory** - Upload documents/files that agents can search through
- **Sleep-time Agents** - Background agents that process memory while main agents are idle
- **External Tool Integrations** - MCP servers, Composio tools, custom tool libraries
- **Agent Templates** - Import/export agents with .af (Agent File) format
- **Production Features** - User identities, agent tags, streaming, context management

---

## **5. CRITICAL GUIDELINES FOR AI MODELS**

### **⚠️ ANTI-HALLUCINATION WARNING**

**NEVER make up Letta API calls, SDK methods, or parameter names.** If you're unsure about any Letta API:

1. **First priority**: Use web search to get the latest reference files:
   - [TypeScript SDK Reference](https://raw.githubusercontent.com/letta-ai/letta-node/refs/heads/main/reference.md)
   - [Python SDK Reference](https://raw.githubusercontent.com/letta-ai/letta-python/refs/heads/main/reference.md)

2. **If no web access**: Tell the user: _"I'm not certain about this Letta API call. Could you paste the relevant section from the API reference docs? Otherwise I might provide incorrect information."_

3. **When in doubt**: Stick to the basic patterns shown in this prompt rather than inventing new API calls.

**Common hallucination risks:**

- Making up method names (e.g. `client.agents.chat()` doesn't exist)
- Inventing parameter names or structures
- Assuming OpenAI-style patterns work in Letta
- Creating non-existent tool rule types or multi-agent methods

### **5.1 – SDK SELECTION (CHOOSE THE RIGHT TOOL)**

✅ **For Next.js Chat Apps:**

- Use **Vercel AI SDK** if you are already using the AI SDK, or if you want the quickest path to basic chat interactions (simple and fast, but no agent management tooling unless using the embedded `.client`)
- Use **Node.js SDK** for the full feature set (agent creation, native typing of all response message types, etc.)

✅ **For Agent Management:**

- Use **Node.js SDK** or **Python SDK** for creating agents, managing memory, tools

### **5.2 – STATEFUL AGENTS (MOST IMPORTANT)**

**Letta agents are STATEFUL, not stateless like ChatCompletion-style APIs.**

✅ **CORRECT - Single message per request:**

```typescript
// Send ONE user message, agent maintains its own history
const response = await client.agents.messages.create(agentId, {
  input: "Hello!",
});
```

❌ **WRONG - Don't send conversation history:**

```typescript
// DON'T DO THIS - agents maintain their own conversation history
const response = await client.agents.messages.create(agentId, {
  input: [...allPreviousMessages, newMessage], // WRONG!
});
```

### **5.3 – MESSAGE HANDLING & MEMORY BLOCKS**

1. **Response structure:**
   - Use `message_type` NOT `type` for message type checking
   - Look for the `assistant_message` message type for agent responses
   - Agent responses have a `content` field with the actual text

2. **Memory block descriptions:**
   - Add a `description` field for custom blocks, or the agent will get confused (not needed for human/persona)
   - For `human` and `persona` blocks, descriptions are auto-populated:
     - **human block**: "Stores key details about the person you are conversing with, allowing for more personalized and friend-like conversation."
     - **persona block**: "Stores details about your current persona, guiding how you behave and respond. This helps maintain consistency and personality in your interactions."
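
The description rule for custom blocks can be checked before an agent is created. Here is a minimal sketch, assuming memory blocks are plain objects with `label`, `value`, and an optional `description`. The `MemoryBlockInput` type and the `blocksMissingDescription` helper are hypothetical and exist only for this example.

```typescript
// Hypothetical local type mirroring the block fields used in this guide.
interface MemoryBlockInput {
  label: string;
  value: string;
  description?: string;
}

// Labels whose descriptions are auto-populated, per the rule above.
const AUTO_DESCRIBED_LABELS: string[] = ["human", "persona"];

// Return the labels of custom blocks that lack a non-empty description.
function blocksMissingDescription(blocks: MemoryBlockInput[]): string[] {
  return blocks
    .filter((block) => AUTO_DESCRIBED_LABELS.indexOf(block.label) === -1)
    .filter((block) => !block.description || block.description.trim() === "")
    .map((block) => block.label);
}

const draftBlocks: MemoryBlockInput[] = [
  { label: "human", value: "The user's name is Sarah." },
  { label: "persona", value: "I am David, the AI executive assistant." },
  { label: "project", value: "Sarah is working on a Next.js application." },
  {
    label: "preferences",
    value: "Prefers short answers.",
    description: "Stores the user's stated response preferences",
  },
];

const undescribedLabels = blocksMissingDescription(draftBlocks);
console.log(undescribedLabels); // labels that still need a description
```

A check like this could run just before the create call so that a missing description is caught in development rather than showing up as confused agent behavior.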

### **5.4 – ALWAYS DO THE FOLLOWING**

1. **Choose the right SDK for the task:**
   - Next.js chat → **Vercel AI SDK**
   - Agent creation → **Node.js/Python SDK**
   - Complex operations → **Node.js/Python SDK**

2. **Use the correct client imports:**
   - Python: `from letta_client import Letta` and `import os`
   - TypeScript: `import { Letta } from '@letta-ai/letta-client'`
   - Vercel AI SDK: `import { lettaCloud } from '@letta-ai/vercel-ai-sdk-provider'`

3. **Create agents with proper memory blocks:**
   - Always include `human` and `persona` blocks for chat agents
   - Use descriptive labels and values

4. **Send only single user messages:**
   - Each request should contain only the new user message
   - Agent maintains conversation history automatically
   - Never send previous assistant responses back to agent

5. **Use proper authentication:**
   - Letta Cloud: Always use `apiKey` parameter (TypeScript) or `api_key` (Python)
   - Self-hosted: Use `baseURL` parameter (TypeScript) or `base_url` (Python), token optional (only if the developer enabled password protection on the server)

---

## **6. Environment Setup**

### **Package Installation**

```bash
# For Next.js projects (recommended for most web apps)
bun install @letta-ai/vercel-ai-sdk-provider ai

# For agent management (when needed)
bun install @letta-ai/letta-client

# For Python projects
pip install letta-client
```

**Environment Variables:**

```bash
# Required for Letta Cloud
LETTA_API_KEY=your_api_key_here

# Store agent ID after creation (Next.js)
LETTA_AGENT_ID=agent-xxxxxxxxx

# For self-hosted (optional)
LETTA_BASE_URL=http://localhost:8283
```
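
One way to turn these variables into client options is sketched below. The `clientOptionsFromEnv` helper and its types are hypothetical, and the rule that `LETTA_BASE_URL` takes precedence over Letta Cloud is a design choice made for this example rather than SDK behavior. The option names `apiKey` and `baseURL` follow the client setup shown earlier in this guide.

```typescript
// Hypothetical option shape using the names from the setup example.
interface LettaClientOptions {
  apiKey?: string;
  baseURL?: string;
}

// Minimal stand-in for process.env so the helper stays easy to test.
type EnvVars = { [key: string]: string | undefined };

// Build client options: a base URL means self-hosted (API key optional),
// otherwise an API key is required for Letta Cloud.
function clientOptionsFromEnv(env: EnvVars): LettaClientOptions {
  const baseURL = env["LETTA_BASE_URL"];
  const apiKey = env["LETTA_API_KEY"];

  if (baseURL) {
    const options: LettaClientOptions = { baseURL: baseURL };
    if (apiKey) {
      // Only needed when the self-hosted server is password protected.
      options.apiKey = apiKey;
    }
    return options;
  }

  if (!apiKey) {
    throw new Error(
      "Set LETTA_API_KEY for Letta Cloud or LETTA_BASE_URL for a self-hosted server"
    );
  }
  return { apiKey: apiKey };
}

const cloudOptions = clientOptionsFromEnv({ LETTA_API_KEY: "your_api_key_here" });
const selfHostedOptions = clientOptionsFromEnv({
  LETTA_BASE_URL: "http://localhost:8283",
});
console.log(cloudOptions, selfHostedOptions);
```

In an application you would pass `process.env` to the helper and hand the result to the client constructor.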

---

## **7. Verification Checklist**

Before providing Letta solutions, verify:

1. **SDK Choice**: Are you using the simplest appropriate SDK?
   - Familiar with or already using Vercel AI SDK? → use the Vercel AI SDK Letta provider
   - Agent management needed? → use the Node.js/Python SDKs
2. **Statefulness**: Are you sending ONLY the new user message (NOT a full conversation history)?
3. **Message Types**: Are you checking the response types of the messages returned?
4. **Response Parsing**: If using the Python/Node.js SDK, are you extracting `content` from assistant messages?
5. **Imports**: Correct package imports for the chosen SDK?
6. **Client**: Proper client initialization with auth/base_url?
7. **Agent Creation**: Memory blocks with proper structure?
8. **Memory Blocks**: Descriptions for custom blocks?