Add memory to a LangChain agent in 5 minutes

A working walkthrough: give a LangChain agent memory that survives across sessions, using a simple HTTP memory API. No vector database, no embedding pipeline, and no coupling of your memory to a framework that keeps changing.

The short version: keep your memory layer separate from LangChain. Store facts in a plain HTTP key-value API, read them into the agent's prompt before it runs, and write new facts back after. Your memory then survives both new sessions and LangChain's own churn. The full code is below and takes about five minutes.

Why not just use LangChain's built-in memory?

You can, and for simple in-session history it works. But LangChain's memory abstractions have been redesigned repeatedly: ConversationBufferMemory gave way to RunnableWithMessageHistory, which gave way to LangGraph checkpointers. Each migration meant rewriting memory code, because the memory was coupled to the orchestration layer.

There is a durable alternative: keep the facts your agent needs to remember in a store that does not know or care which version of LangChain you are running. When the framework changes, your memory code does not. That is the approach this tutorial takes, and it happens to be the simplest one to get working too.

Before you start, you need:

Python 3.9 or newer
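pip install langchain langchain-openai requests (the agent code in Step 3 imports AgentExecutor from langchain.agents; if your installed LangChain release no longer exports it there, pin an earlier release)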
An OpenAI API key (or any chat model LangChain supports)
An AgentRAM API key, from the free tier below

Step 1: Get your AgentRAM key

// about 30 seconds

Sign up for AgentRAM and copy your API key. New accounts start with 100 free operations and no credit card, which is plenty to complete and test this tutorial. Keep the key handy for the next step.
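
One way to keep the key handy without pasting it into source files is to read it from an environment variable. This is a minimal sketch: the variable name AGENTRAM_API_KEY and the helper load_api_key are assumptions made for this example, not something AgentRAM prescribes.

```python
import os


def load_api_key(env_var: str = "AGENTRAM_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message.

    The default variable name is a hypothetical choice for this tutorial.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the agent.")
    return key
```

In the helper from the next step, API_KEY = load_api_key() could then stand in for the hardcoded string.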

Step 2: Write a tiny memory helper

// about 2 minutes

This is the whole memory layer: two functions over the AgentRAM HTTP API. One stores a fact under a key, the other reads it back. It has no LangChain imports, which is the entire point. It will keep working no matter how LangChain changes.

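# memory.py
from typing import Optional  # Optional[str] works on Python 3.9; "str | None" needs 3.10+

import requests

API_KEY = "agentram_your_key_here"   # paste your key from Step 1
BASE = "https://api.agentram.dev"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}
TIMEOUT = 10  # seconds, so a stalled request cannot hang the agent

def remember(agent_id: str, key: str, value: str) -> None:
    """Store one fact under a key."""
    res = requests.post(f"{BASE}/memory", json={
        "agent_id": agent_id,
        "key": key,
        "value": value,
    }, headers=HEADERS, timeout=TIMEOUT)
    res.raise_for_status()  # surface failed writes instead of dropping them silently

def recall(agent_id: str, key: str) -> Optional[str]:
    """Read one fact back, or None if it was never stored or the request failed."""
    res = requests.get(f"{BASE}/memory",
        params={"agent_id": agent_id, "key": key},
        headers=HEADERS, timeout=TIMEOUT)
    if res.status_code == 200:
        return res.json()["data"]["value"]
    return None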

That is the complete integration surface. Everything else is ordinary LangChain.
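
Because the helper is just two functions, you can swap in a stand-in while developing so that experiments do not spend free-tier operations. Below is a sketch of a dictionary-backed fake with the same remember and recall signatures. The class name FakeMemory is made up for this example and is not part of AgentRAM.

```python
from typing import Dict, Optional, Tuple


class FakeMemory:
    """In-process stand-in for the HTTP helper (hypothetical, for local tests only)."""

    def __init__(self) -> None:
        # Facts are scoped by (agent_id, key), mirroring the two request fields.
        self._facts: Dict[Tuple[str, str], str] = {}

    def remember(self, agent_id: str, key: str, value: str) -> None:
        """Store one fact under a key; in this fake, a later write replaces an earlier one."""
        self._facts[(agent_id, key)] = value

    def recall(self, agent_id: str, key: str) -> Optional[str]:
        """Read one fact back, or None if it was never stored."""
        return self._facts.get((agent_id, key))


fake = FakeMemory()
fake.remember("user-123", "name", "Ada")
print(fake.recall("user-123", "name"))   # Ada
print(fake.recall("user-456", "name"))   # None
```

The fake forgets everything when the process exits, so it is useful for checking agent logic but cannot replace the real store for the cross-session demo in Step 4.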

Step 3: Load memory before the agent runs

// about 1 minute

Build a normal LangChain agent, but before you invoke it, read what you know about this user and put it into the system prompt. Here the agent uses one agent_id per user, so each user gets their own memory.

# agent.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.agents import create_tool_calling_agent, AgentExecutor
from memory import remember, recall

USER_ID = "user-123"          # one memory scope per user
llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = []                     # add your own tools here

# Pull anything we already know about this user
known_name = recall(USER_ID, "name")
known_pref = recall(USER_ID, "reply_style")

memory_note = ""
if known_name:
    memory_note += f" The user's name is {known_name}."
if known_pref:
    memory_note += f" They prefer {known_pref} replies."

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant." + memory_note),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=False)

On the first run memory_note is empty and the agent behaves normally. On later runs it silently starts the conversation already knowing the user, because the facts came from AgentRAM rather than from the context window.
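
One detail to watch when concatenating stored values into a prompt template: ChatPromptTemplate reads curly braces as template variables by default, so a stored value containing { or } could break formatting. Below is a sketch of a helper that builds the same note and doubles any braces, which is how literal braces are written in f-string style templates. The names escape_braces and build_memory_note are assumptions for this example.

```python
from typing import Optional


def escape_braces(text: str) -> str:
    """Double curly braces so an f-string style template keeps them literal."""
    return text.replace("{", "{{").replace("}", "}}")


def build_memory_note(name: Optional[str], reply_style: Optional[str]) -> str:
    """Build the system-prompt suffix from whichever facts are known."""
    note = ""
    if name:
        note += f" The user's name is {escape_braces(name)}."
    if reply_style:
        note += f" They prefer {escape_braces(reply_style)} replies."
    return note


print(repr(build_memory_note(None, None)))   # ''
print(build_memory_note("Ada", "short"))
print(build_memory_note("A{da}", None))      # braces come out doubled
```

With this in place, memory_note = build_memory_note(known_name, known_pref) could replace the two if statements above.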

Step 4: Save memory after the agent runs

// about 1 minute

After the agent responds, write any new facts back. In a real app you would decide what is worth keeping; here we store two simple values so you can see the round trip.

# still agent.py

# First session: teach it something
result = executor.invoke({"input": "Hi, I'm Ada and I like short answers."})
print(result["output"])

# Persist what we learned, so the next run remembers it
remember(USER_ID, "name", "Ada")
remember(USER_ID, "reply_style", "short")

Now stop the program, replace the first-session lines above with a fresh question, and run it again:

result = executor.invoke({"input": "What's my name?"})
print(result["output"])   # -> knows the user is Ada

The second run is a completely new process. No conversation history carried over from the first run, yet the agent knows the user, because Step 3 read the facts back from AgentRAM before the agent started. That is persistent memory: it survived the process ending.
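
The load, run, save cycle from Steps 3 and 4 can also be wrapped in one function. This is a sketch under assumptions: run_turn and its parameters are names invented here, the agent is passed in as a plain callable, and deciding which facts are worth saving is left to the caller.

```python
from typing import Callable, Dict, Optional


def run_turn(
    user_id: str,
    user_input: str,
    recall: Callable[[str, str], Optional[str]],
    remember: Callable[[str, str, str], None],
    run_agent: Callable[[str, str], str],
    new_facts: Optional[Dict[str, str]] = None,
) -> str:
    """Load known facts, run the agent once, then persist any new facts."""
    # 1. Load: read what is already known about this user.
    name = recall(user_id, "name")
    style = recall(user_id, "reply_style")
    note = ""
    if name:
        note += f" The user's name is {name}."
    if style:
        note += f" They prefer {style} replies."

    # 2. Run: the callable receives the memory note and the user's message.
    reply = run_agent(note, user_input)

    # 3. Save: write back whatever the caller decided to keep.
    for key, value in (new_facts or {}).items():
        remember(user_id, key, value)
    return reply


# Demo with a dict standing in for the store and a stub standing in for the agent.
store: Dict[tuple, str] = {}


def demo_recall(agent_id: str, key: str) -> Optional[str]:
    return store.get((agent_id, key))


def demo_remember(agent_id: str, key: str, value: str) -> None:
    store[(agent_id, key)] = value


def demo_agent(note: str, text: str) -> str:
    # Echo the prompt pieces so the effect of the memory note is visible.
    return f"system: You are a helpful assistant.{note} | human: {text}"


first = run_turn("user-123", "Hi, I'm Ada and I like short answers.",
                 demo_recall, demo_remember, demo_agent,
                 new_facts={"name": "Ada", "reply_style": "short"})
second = run_turn("user-123", "What's my name?",
                  demo_recall, demo_remember, demo_agent)
print(first)
print(second)
```

In a real app, run_agent would build the prompt from the note and call executor.invoke, and recall and remember would be the two helpers from Step 2.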

What just happened: LangChain handled the reasoning loop. AgentRAM handled the remembering. Neither one depends on the other, so you can upgrade LangChain, switch to LangGraph, or change models without touching your memory code.

Where to take it next

This pattern extends naturally:

More facts: store anything you can name with a key, such as goals, past decisions, or account details. Reading and writing are one credit each.
Multiple agents, one memory: if you have several agents that should share what they know, give them a shared namespace instead of a per-agent ID. See the docs for shared memory.
Expiring facts: pass a TTL when you store something that should not live forever, like a temporary session detail.
Let the model decide: expose remember as a LangChain tool so the agent itself chooses when to save something.
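
For the last item, a tool needs a function whose arguments the model can fill in, so the user scope is better bound ahead of time than chosen by the model. Below is a sketch of that binding using a closure. The name make_save_fact is made up for this example, and the remember callable passed in is assumed to be the helper from Step 2.

```python
from typing import Callable, List, Tuple


def make_save_fact(
    agent_id: str,
    remember: Callable[[str, str, str], None],
) -> Callable[[str, str], str]:
    """Return a two-argument function that saves facts for one fixed agent_id."""

    def save_fact(key: str, value: str) -> str:
        """Save a durable fact about the user, e.g. key='name', value='Ada'."""
        remember(agent_id, key, value)
        return f"Saved {key}."

    return save_fact


# Demo with a list standing in for the real remember() helper.
calls: List[Tuple[str, str, str]] = []
save_fact = make_save_fact("user-123", lambda a, k, v: calls.append((a, k, v)))
print(save_fact("name", "Ada"))   # Saved name.
print(calls)                      # [('user-123', 'name', 'Ada')]
```

Wrapping save_fact with the tool decorator from langchain_core.tools and adding the result to the tools list should be enough for a tool-calling agent to see it, with the docstring serving as the tool description. Treat that wiring as a starting point to check against your installed LangChain version.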

If you are wondering why this works without embeddings or a vector store, it is because this is structured memory: you always know the key. That is the case for a large share of agent memory, and it is covered in agent memory without a vector database.

Get your free API key

Everything above runs on the free tier. 100 operations, no credit card, memory working in about five minutes.

Read the full API in the docs.