Persistent Memory Infrastructure for AI Agents

Agent Memory API

Store, recall, and forget. Multi-tier memory with semantic search, GraphRAG, and automatic consolidation. One API for AI agents that remember.

npx zerodb-cli init
100% Recall@1 (LongMemEval)
<400ms retrieval latency
3 tiers: working, episodic, semantic

What Is an Agent Memory API?

AI agents lose context when conversations end. They can't remember what you told them yesterday, what your preferences are, or what they learned from previous tasks.

An agent memory API solves this by giving agents persistent storage for facts, preferences, and relationships — with semantic search to recall the right information at the right time.

ZeroMemory goes further with multi-tier memory (working, episodic, semantic), automatic consolidation between tiers, importance-weighted decay, and GraphRAG — hybrid search that combines vector similarity with knowledge graph traversal for multi-hop reasoning.

Core API Primitives

Three operations. That's all your agent needs to remember everything.

Remember

POST /memory/v2/remember

Store a fact, observation, or interaction. Auto-generates embeddings (free), extracts entities, assigns importance scores, and builds knowledge graph edges.

Recall

POST /memory/v2/recall

Search memories by meaning. Blended scoring combines vector similarity, importance weight, and recency. Filter by user, session, or metadata.
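
As a rough illustration of blended scoring, the sketch below folds the three signals named above into a single number and ranks candidates by it. The weights, the exponential recency half-life, and the names `blended_score` and `rank` are assumptions made for illustration; the actual formula used by the API is not published here.

```python
def blended_score(similarity, importance, age_seconds,
                  w_sim=0.6, w_imp=0.25, w_rec=0.15,
                  half_life_seconds=7 * 24 * 3600):
    """Combine similarity, importance, and recency into one score.

    All weights and the half-life are hypothetical defaults.
    `similarity` and `importance` are expected in [0, 1].
    """
    # Recency is 1.0 for a brand-new memory and halves every half-life.
    recency = 0.5 ** (age_seconds / half_life_seconds)
    return w_sim * similarity + w_imp * importance + w_rec * recency


def rank(memories, now):
    """Sort memory dicts by blended score, best first."""
    return sorted(
        memories,
        key=lambda m: blended_score(
            m["similarity"], m["importance"], now - m["created_at"]
        ),
        reverse=True,
    )
```

With these assumed weights, a highly similar but unimportant memory can still outrank a less similar, important one; shifting weight toward importance reverses that.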

Forget

POST /memory/v2/forget

Delete memories by ID, user, session, or time range. GDPR-friendly — remove all memories for a user with one call.
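
The selection logic behind a forget call can be pictured with a small in-memory sketch. It is not the service's implementation: the function name `forget` and the field names `id`, `user_id`, `session_id`, and `created_at` are hypothetical, and the rule that supplied filters are combined with AND is an assumption.

```python
def forget(memories, memory_ids=None, user_id=None, session_id=None,
           before=None, after=None):
    """Return (kept, removed) after applying the supplied filters.

    A memory is removed only if it matches every filter that was given.
    Calling with no filters removes nothing, as a guard against wiping
    the whole store by accident.
    """
    filters = []
    if memory_ids is not None:
        ids = set(memory_ids)
        filters.append(lambda m: m["id"] in ids)
    if user_id is not None:
        filters.append(lambda m: m.get("user_id") == user_id)
    if session_id is not None:
        filters.append(lambda m: m.get("session_id") == session_id)
    if before is not None:
        filters.append(lambda m: m["created_at"] < before)
    if after is not None:
        filters.append(lambda m: m["created_at"] >= after)

    if not filters:
        return list(memories), []

    kept, removed = [], []
    for m in memories:
        (removed if all(f(m) for f in filters) else kept).append(m)
    return kept, removed
```

Passing only `user_id` models the "remove all memories for a user with one call" case.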

Plus: /reflect (agent self-reflection), /profile (user profiles from memories), /relate (entity relationships), /graph/* (16 GraphRAG endpoints)

Why AINative for Agent Memory

Not just store-and-search. A complete cognitive memory system built for production agents.

Multi-Tier Memory

Working memory for active tasks, episodic memory for past interactions, semantic memory for long-term knowledge. Automatic consolidation between tiers.
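
One way to picture consolidation between tiers is the toy model below. The `Memory` class, the thresholds, and the promotion rules are all hypothetical; only the tier names come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    content: str
    tier: str = "working"      # "working", "episodic", or "semantic"
    importance: float = 0.5
    access_count: int = 0


def consolidate(memories, session_ended, min_importance=0.6, min_accesses=3):
    """Promote memories at most one tier per pass, using assumed rules.

    working  -> episodic: once the session that produced them has ended.
    episodic -> semantic: when the memory is both important and often recalled.
    """
    for m in memories:
        if m.tier == "working" and session_ended:
            m.tier = "episodic"
        elif (m.tier == "episodic"
              and m.importance >= min_importance
              and m.access_count >= min_accesses):
            m.tier = "semantic"
    return memories
```

Promoting one tier per pass keeps a fresh working memory from jumping straight to long-term knowledge before it has been recalled at all.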

Semantic Recall

Search by meaning, not keywords. Free embeddings included — no OpenAI key required. Blended scoring: similarity + importance + recency.

GraphRAG Hybrid Search

Combines vector search with multi-hop knowledge graph traversal. Finds connections that flat search misses — people, orgs, concepts, and their relationships.

Auto Consolidation & Decay

Memories strengthen with access, decay with time. Working memory consolidates into long-term storage. Importance scores adapt based on usage patterns.
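
A minimal sketch of "strengthen with access, decay with time", assuming exponential decay with a fixed half-life and a bounded boost on each access. Both functions and their default parameters are illustrative assumptions, not the service's published behavior.

```python
def decayed_importance(importance, idle_seconds,
                       half_life_seconds=30 * 24 * 3600):
    """Halve importance for every half-life the memory goes unaccessed."""
    return importance * 0.5 ** (idle_seconds / half_life_seconds)


def strengthen(importance, boost=0.1):
    """Move importance a fraction `boost` of the way toward 1.0 per access."""
    return importance + boost * (1.0 - importance)
```

Under these assumptions, a memory with importance 0.8 that sits idle for one half-life drops to 0.4, and each recall nudges it back up without ever exceeding 1.0.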

Entity Graphs & Profiles

Auto-extracts entities and relationships from stored memories. Builds user profiles, agent reflections, and knowledge graphs — zero configuration.

MCP + Framework Native

6-tool MCP server for Claude Code, Cursor, and VS Code. LangChain and LlamaIndex integrations. REST API for any stack.

Add Memory in Minutes

REST API, Python SDK, or MCP server — pick your integration path. Free embeddings, no infrastructure to manage.

Quick Setup

npx zerodb-cli init

MCP Server (Memory Tools)

npm i ainative-zerodb-memory-mcp

Python SDK

pip install langchain-zerodb
  • Free embeddings — BAAI/bge models, no OpenAI costs
  • Multi-tier memory with automatic consolidation
  • GraphRAG: vector + knowledge graph hybrid search
  • MCP server for Claude Code, Cursor, VS Code
  • LangChain + LlamaIndex integrations

# Store a memory
curl -X POST https://api.ainative.studio/api/v1/public/memory/v2/remember \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and uses Python for backend work",
    "metadata": { "user_id": "u_123", "source": "onboarding" }
  }'

# Response:
{
  "memory_id": "mem_abc...",
  "importance": 0.72,
  "entities_extracted": ["dark mode", "Python"],
  "tier": "working"
}
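
The same request can be assembled from Python using only the standard library. This is a sketch rather than the official SDK: the helper name `build_remember_request` is hypothetical, and the `Content-Type` header assumes the endpoint expects a JSON body. The URL and payload fields are taken from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.ainative.studio/api/v1/public/memory/v2"


def build_remember_request(api_key, content, metadata=None):
    """Build (but do not send) the POST request for /remember."""
    body = {"content": content}
    if metadata:
        body["metadata"] = metadata
    return urllib.request.Request(
        url=f"{BASE_URL}/remember",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, see note above
        },
        method="POST",
    )


# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(build_remember_request(key, text)) as resp:
#     result = json.load(resp)
```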

Use Cases

From chatbots to autonomous research agents — persistent memory changes what agents can do.

Copilots & Chat Assistants

Remember user preferences, past conversations, and context across sessions. No more "As an AI, I don't have memory of previous conversations."

Example

A support chatbot recalls a customer's past tickets, product purchases, and preferred resolution method — before the customer says a word.

Autonomous Agents

Long-running agents that accumulate knowledge over days and weeks. Working memory for the current task, episodic memory for what happened, semantic memory for what they've learned.

Example

A research agent builds a knowledge graph of papers, authors, and findings across hundreds of sessions — and uses GraphRAG to discover connections.

Multi-Agent Systems

Shared memory across agent swarms. One agent stores a finding, another agent recalls it. User-scoped and project-scoped memory isolation built in.
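
Scope isolation can be illustrated with a toy in-process store keyed by project and user. The `SharedMemory` class and its method names are hypothetical, and the keyword match is a stand-in for the semantic search the real service performs.

```python
from collections import defaultdict


class SharedMemory:
    """Toy store in which each (project_id, user_id) pair is its own scope."""

    def __init__(self):
        self._by_scope = defaultdict(list)

    def remember(self, project_id, user_id, agent, content):
        self._by_scope[(project_id, user_id)].append(
            {"agent": agent, "content": content}
        )

    def recall(self, project_id, user_id, keyword):
        """Return contents in this scope containing `keyword` (case-insensitive)."""
        return [
            m["content"]
            for m in self._by_scope.get((project_id, user_id), [])
            if keyword.lower() in m["content"].lower()
        ]
```

Any agent that knows the scope can read what another agent wrote there, while a query against a different project or user returns nothing.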

Example

A coding agent stores architecture decisions. A review agent recalls them when evaluating PRs. A docs agent uses them to generate documentation.

RAG Pipelines

Go beyond document retrieval. Combine vector search with entity relationships for answers that require multi-hop reasoning across your knowledge base.

Example

Query: "What pricing does Company X use?" GraphRAG traverses Company X → negotiates_with → Competitor Y → uses → pricing strategy Z.
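
The traversal in this example can be sketched as a breadth-first search over (subject, relation, object) triples. The first two edges come from the example; the third edge, the `find_path` function, and the `max_hops` limit are assumptions added for illustration.

```python
from collections import deque

EDGES = [
    ("Company X", "negotiates_with", "Competitor Y"),
    ("Competitor Y", "uses", "pricing strategy Z"),
    ("Company X", "located_in", "Berlin"),  # hypothetical distractor edge
]


def find_path(edges, start, goal, max_hops=3):
    """Return the shortest list of triples from start to goal, or None."""
    adjacency = {}
    for subj, rel, obj in edges:
        adjacency.setdefault(subj, []).append((rel, obj))

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop limit
        for rel, nxt in adjacency.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None
```

Calling `find_path(EDGES, "Company X", "pricing strategy Z")` returns the two-hop path from the example, while `max_hops=1` returns `None` because the answer is two edges away.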

Building from Scratch vs. Using ZeroMemory

Compare agent memory approaches side by side.

Feature | ZeroMemory | Mem0 | Letta | Build Custom
Memory store & recall
Multi-tier memory (working/episodic/semantic)
Automatic consolidation & decay
Semantic search with free embeddings
GraphRAG (vector + graph hybrid)
Knowledge graph auto-population
MCP server (agent tools)
Entity extraction & profiles
Ontology templates
No infrastructure to manage

Frequently Asked Questions

What is an agent memory API?

An agent memory API lets AI agents store, search, and retrieve information across sessions. Instead of losing context when a conversation ends, agents persist facts, preferences, and relationships — and recall them later using semantic search. ZeroMemory provides this with multi-tier memory, automatic consolidation, and GraphRAG hybrid retrieval.

How do AI agents store memory?

Agents call POST /memory/v2/remember with text content and optional metadata. ZeroMemory auto-generates embeddings (free), extracts entities and relationships, assigns importance scores, and stores everything in Postgres with pgvector indexes. No separate embedding API or graph database is needed.

What database should I use for agent memory?

Use a purpose-built memory API like ZeroMemory rather than a raw vector database. A memory API handles embedding, scoring, consolidation, and retrieval in one call. ZeroDB is Postgres-native, so you get relational data, vector search, and knowledge graphs without managing separate infrastructure.

How does GraphRAG improve memory recall?

Standard vector search finds semantically similar memories. GraphRAG adds a second stage — it traverses entity relationships in a knowledge graph to surface structurally connected information. For example, querying about a person finds their team, projects, tools, and collaborators through multi-hop graph traversal, even if those memories don't share similar text.
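
The two stages described here can be sketched in a few lines: rank by cosine similarity, then pull in memories whose entities are linked to the top hits. The data layout, the function names, and the `top_k` and `hops` parameters are assumptions for illustration, not the service's actual retrieval pipeline.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_recall(query_vec, memories, entity_links, top_k=1, hops=1):
    """Stage 1: vector top-k. Stage 2: add memories reachable via the graph.

    memories:     {memory_id: {"vec": [...], "entities": set of names}}
    entity_links: {entity: set of neighbouring entities}
    """
    ranked = sorted(
        memories,
        key=lambda mid: cosine(query_vec, memories[mid]["vec"]),
        reverse=True,
    )
    seeds = ranked[:top_k]

    # Entities mentioned by the seeds, plus neighbours within `hops` edges.
    frontier = set().union(*(memories[mid]["entities"] for mid in seeds))
    reachable = set(frontier)
    for _ in range(hops):
        frontier = {n for e in frontier for n in entity_links.get(e, ())} - reachable
        reachable |= frontier

    # Keep lower-ranked memories only if they mention a reachable entity.
    expanded = [mid for mid in ranked[top_k:] if memories[mid]["entities"] & reachable]
    return seeds + expanded
```

In this sketch a memory about a person's project is returned for a query about the person even when its vector is orthogonal to the query, because the entity graph links the two.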

Is there a free tier?

Yes. ZeroDB Free includes 500K vectors, 2GB storage, full memory API access, and free embeddings. No credit card required. Get started with npx zerodb-cli init.

Does it work with Claude, GPT, and open-source models?

Yes. ZeroMemory is LLM-agnostic. Use the REST API from any stack, the MCP server with Claude Code or Cursor, or the Python SDK with LangChain and LlamaIndex. The memory layer is independent of which model generates or consumes the memories.

Give Your Agents Persistent Memory

One API for memory, semantic search, and GraphRAG. Start free — no credit card, no signup wall.