Your AI's Memory, Explained: Graph-Based Recall That Works Like Your Brain

Chvor Team · 10 min read
memory architecture deep-dive

The Amnesia Problem

Every conversation with a chat AI starts the same way: from nothing. You tell it your name, your preferences, the context of your project, the thing you told it yesterday. It nods along as if hearing it for the first time, because it is. Context windows are finite. Sessions end. The slate gets wiped.

This is the fundamental limitation of conversational AI as it exists today. Not intelligence — memory. Most systems solve this with a vector database bolted onto the side: embed the conversation, store the vectors, retrieve the closest matches next time. It works, in the way that a filing cabinet works. You can find things if you know roughly what you’re looking for. But it doesn’t learn. It doesn’t strengthen connections over time. It doesn’t forget the irrelevant and sharpen the important. It doesn’t behave like memory at all.

Chvor takes a different approach. Its memory system is not a retrieval-augmented appendix. It is a graph-based cognitive architecture inspired by how human memory actually functions — with decay, reinforcement, consolidation, emotional weighting, and spreading activation. This post walks through how it works, layer by layer.

Memory as a Graph, Not a List

In a traditional vector store, each memory is an island. A chunk of text gets embedded into a high-dimensional vector, stored in an index, and retrieved by cosine similarity. There is no relationship between memories. There is no structure. There is no notion that one memory caused another, or that two memories are about the same entity, or that a newer memory contradicts an older one.

Chvor’s memory is a graph. Every memory is a node, and nodes are connected by typed, weighted edges. When you tell your AI assistant that you moved from Berlin to Lisbon, that doesn’t just create a new fact — it creates a supersedes edge linking the new location to the old one, a temporal edge anchoring the move in time, and an entity edge connecting both memories to the node representing you. The system doesn’t just store what it knows. It stores how things relate.

Seven edge types define the topology of the graph: temporal (ordering events in time), causal (linking cause to effect), semantic (connecting thematically related ideas), entity (binding memories to the people, places, and things they reference), contradiction (flagging conflicts between memories), supersedes (marking when new information replaces old), and narrative (threading memories into coherent storylines). Each edge carries a weight, so the system can distinguish between a strong causal link and a loose semantic association.
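The seven edge types and their weights can be sketched as a small data model. This is an illustrative sketch, not Chvor's actual schema; the class and field names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class EdgeType(Enum):
    """The seven edge types described above."""
    TEMPORAL = "temporal"            # ordering events in time
    CAUSAL = "causal"                # linking cause to effect
    SEMANTIC = "semantic"            # thematically related ideas
    ENTITY = "entity"                # binding memories to people, places, things
    CONTRADICTION = "contradiction"  # flagging conflicts between memories
    SUPERSEDES = "supersedes"        # new information replacing old
    NARRATIVE = "narrative"          # threading memories into storylines

@dataclass
class Edge:
    source_id: str   # memory node the edge starts from
    target_id: str   # memory node it points to
    type: EdgeType
    weight: float    # 0.0..1.0; strong causal link vs. loose association

# The Berlin-to-Lisbon move from the example above might produce,
# among others, a supersedes edge from the new location to the old:
move = Edge("mem_lisbon", "mem_berlin", EdgeType.SUPERSEDES, 0.9)
```

Carrying the weight on the edge, rather than inferring it at query time, is what later lets retrieval distinguish a strong causal link from a loose semantic association.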

Three Tiers of Detail

Not every memory needs to be loaded in full every time. Human recall works the same way — you remember the gist of a conversation before the exact words, the shape of an event before the specifics. Chvor mirrors this with a three-tier content structure on every memory node.

L0 (Abstract) is roughly 120 characters. One line. A compressed summary that captures the essence: “Prefers TypeScript over JavaScript for all new projects.” L0 representations are cheap enough to include directly in the prompt context. They give the model a map of what it knows without flooding the context window.

L1 (Overview) is around 1,000 characters. A paragraph that adds context, nuance, and supporting detail: when the preference was stated, what prompted it, how strongly it was expressed. L1 is loaded when a memory is relevant to the current conversation but doesn’t need its full history.

L2 (Detail) is up to 5,000 characters. The full narrative — the original conversation excerpts, the reasoning chain, the surrounding context. L2 is loaded on demand, only when the system needs deep specificity. Most memories never need to surface at this level in a given session.

This tiered approach means Chvor can hold a broad awareness of hundreds of memories in a single prompt while only paying the token cost for deep detail where it matters.
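The three tiers can be modeled as a simple structure with the character budgets from the post. A minimal sketch; the class and field names are assumptions, and the limits are the approximate figures given above.

```python
from dataclasses import dataclass

# Approximate character budgets from the post.
L0_MAX, L1_MAX, L2_MAX = 120, 1_000, 5_000

@dataclass
class TieredContent:
    l0_abstract: str  # one-line gist, cheap enough to always include
    l1_overview: str  # paragraph of context, loaded when relevant
    l2_detail: str    # full narrative, loaded only on demand

    def within_budget(self) -> bool:
        return (len(self.l0_abstract) <= L0_MAX
                and len(self.l1_overview) <= L1_MAX
                and len(self.l2_detail) <= L2_MAX)

mem = TieredContent(
    l0_abstract="Prefers TypeScript over JavaScript for all new projects.",
    l1_overview="Stated during project setup; prompted by a typing bug...",
    l2_detail="Full conversation excerpt, reasoning chain, and context...",
)
```

At roughly 120 characters per L0, a prompt can carry hundreds of abstracts in the token budget that a handful of L2 blocks would consume.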

Six Categories of Memory

Every memory node is classified into one of six categories, each serving a distinct cognitive function:

Profile memories capture who the user is — their name, occupation, location, communication style. These form the stable foundation of identity. “Works as a backend engineer at a climate tech startup in Lisbon.”

Preference memories record what the user likes, dislikes, and how they want things done. “Wants concise responses without unnecessary preamble. Prefers code examples in Rust.”

Entity memories represent the people, organizations, tools, and concepts the user talks about. “Ravi is the user’s cofounder. They’ve been working together since 2023.”

Event memories anchor things in time. “Launched the beta on March 12th. The deployment failed twice before succeeding.” These create the temporal backbone of the graph.

Pattern memories emerge from observation. The system notices recurring behaviors, preferences, or workflows that the user hasn’t explicitly stated. “Typically asks for architectural feedback on Mondays after sprint planning.”

Case memories store problem-solution pairs — past interactions where a specific approach was tried, what worked, and what didn’t. “When the database migration failed, rolling back and re-running with the --force flag resolved the issue.”

These categories aren’t just labels. They influence how memories are weighted in retrieval, how quickly they decay, and how they interact during consolidation.
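One way those category-specific behaviors could look in code: an enum of the six categories, plus per-category decay multipliers. The multiplier values here are pure assumptions for illustration; the post states only that categories influence decay speed, not by how much.

```python
from enum import Enum

class MemoryCategory(Enum):
    PROFILE = "profile"        # who the user is
    PREFERENCE = "preference"  # how they want things done
    ENTITY = "entity"          # people, orgs, tools, concepts
    EVENT = "event"            # time-anchored happenings
    PATTERN = "pattern"        # observed recurring behaviors
    CASE = "case"              # problem-solution pairs

# Hypothetical per-category decay multipliers: stable identity facts
# fade slowly, time-anchored events faster. Values are assumptions.
DECAY_MULTIPLIER = {
    MemoryCategory.PROFILE: 0.2,
    MemoryCategory.PREFERENCE: 0.5,
    MemoryCategory.ENTITY: 0.6,
    MemoryCategory.EVENT: 1.2,
    MemoryCategory.PATTERN: 0.8,
    MemoryCategory.CASE: 0.7,
}
```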

The Metadata That Makes It Work

Raw text isn’t enough. Each memory node carries metadata that governs its behavior in the system:

Strength is a float between 0.0 and 1.0, representing how robust the memory is. New memories are created at 0.8. Every time a memory is accessed or reinforced, its strength increases by 0.15 (capped at 1.0). Strength determines how likely a memory is to surface during retrieval and how resistant it is to decay.

Decay rate controls how quickly a memory fades without reinforcement. This is where Ebbinghaus enters the picture.

Confidence reflects how certain the system is about the memory’s accuracy. A directly stated fact (“I live in Lisbon”) carries higher confidence than an inference drawn from context.

Provenance tracks the origin of the memory across four levels: stated (the user said it explicitly), extracted (pulled from conversation context), inferred (derived by the model from patterns), and consolidated (synthesized during a consolidation cycle from multiple sources). Provenance affects confidence and helps the system know when to verify rather than assume.

Emotional valence captures the emotional charge of the moment when the memory was formed. A frustrated debugging session, an excited product launch, a casual aside — these carry different emotional weight, and that weight influences both storage and retrieval.
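Pulling the metadata together, a memory node's governing fields might look like this. A sketch under stated assumptions: the numbers for strength and the reinforcement bump come from the post, while the class name, the default decay rate, and the valence range are illustrative.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    STATED = "stated"              # the user said it explicitly
    EXTRACTED = "extracted"        # pulled from conversation context
    INFERRED = "inferred"          # derived by the model from patterns
    CONSOLIDATED = "consolidated"  # synthesized from multiple sources

@dataclass
class MemoryMeta:
    strength: float = 0.8           # new memories are created at 0.8
    decay_rate: float = 0.1         # per-day fade; value is an assumption
    confidence: float = 0.9         # certainty about accuracy
    provenance: Provenance = Provenance.STATED
    emotional_valence: float = 0.0  # e.g. -1.0 (negative) .. 1.0 (positive)

    def reinforce(self) -> None:
        """Accessing a memory bumps strength by 0.15, capped at 1.0."""
        self.strength = min(1.0, self.strength + 0.15)
```

A directly stated fact would pair `Provenance.STATED` with high confidence; an inferred pattern would start lower, signaling when to verify rather than assume.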

Ebbinghaus Decay: Forgetting as a Feature

Forgetting is not a bug. It is the mechanism that keeps memory systems from drowning in noise. Chvor implements a decay model inspired by Hermann Ebbinghaus’s forgetting curve, adapted for the rhythms of human-AI interaction.

The core formula: strength(t) = strength_0 * e^(-decay_rate * days_since_access).

A memory is born at 0.8 strength. Without reinforcement, it fades according to its decay rate. But every time the memory is accessed — retrieved in a conversation, referenced, or reinforced — two things happen: the strength bumps by 0.15, and the decay rate itself slows down.

This produces the spaced repetition effect that makes the system genuinely adaptive. A memory accessed once and never again will fade within days. A memory accessed three times over the course of a week will persist for months. A memory accessed repeatedly over longer and longer intervals becomes nearly permanent, with a decay rate approaching zero.
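The decay formula and the reinforcement rule can be sketched together. The exponential form and the 0.15 bump are from the post; the slowdown factor applied to the decay rate on each access is an assumption, since the post says only that decay slows.

```python
import math

def decayed_strength(strength_0: float, decay_rate: float,
                     days_since_access: float) -> float:
    """Ebbinghaus-style forgetting: strength(t) = s0 * e^(-rate * t)."""
    return strength_0 * math.exp(-decay_rate * days_since_access)

def reinforce(strength: float, decay_rate: float,
              bump: float = 0.15, slowdown: float = 0.8) -> tuple[float, float]:
    """On access: bump strength (capped at 1.0) and slow the decay rate.
    The 0.8 slowdown factor is an illustrative assumption."""
    return min(1.0, strength + bump), decay_rate * slowdown

# A fresh memory at 0.8 strength fades noticeably within a week...
week_old = decayed_strength(0.8, 0.1, 7)

# ...while three accesses push strength to the 1.0 cap and compound
# the slowdown, so the decay rate approaches zero with repeated use.
s, rate = 0.8, 0.1
for _ in range(3):
    s, rate = reinforce(s, rate)
```

This is the spaced-repetition dynamic in miniature: each access both restores strength and flattens the curve that erodes it.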

Emotional valence amplifies this. High-emotion moments — marked by sentiment analysis at the time of memory creation — produce memories with higher initial strength and slower baseline decay. The system remembers what mattered, not just what was said.

Five-Stage Retrieval

When a message arrives and the system needs to recall relevant context, retrieval unfolds in five stages:

Stage 1: Vector search. The query is embedded and matched against stored memory vectors using sqlite-vec. This casts a wide net, pulling in semantically similar candidates.

Stage 2: Composite re-ranking. Candidates are scored across five dimensions with explicit weights: semantic similarity (35%), memory strength (25%), recency (15%), category relevance (15%), and emotional resonance (10%). This ensures that retrieval isn’t purely about textual similarity — a strong, recent, emotionally resonant memory of the right category will outrank a merely similar one.
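The composite score is a weighted sum over the five dimensions. The weights are the ones stated above; the assumption here is that each component score is normalized to 0.0..1.0 before weighting.

```python
# The five re-ranking weights from the post.
WEIGHTS = {
    "semantic": 0.35,  # semantic similarity
    "strength": 0.25,  # memory strength
    "recency": 0.15,
    "category": 0.15,  # category relevance
    "emotion": 0.10,   # emotional resonance
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted sum across the five dimensions (each assumed in 0..1)."""
    return sum(WEIGHTS[k] * scores.get(k, 0.0) for k in WEIGHTS)

# A strong, recent, on-category memory outranks a merely similar one:
merely_similar = composite_score(
    {"semantic": 0.95, "strength": 0.3, "recency": 0.2,
     "category": 0.2, "emotion": 0.1})
right_memory = composite_score(
    {"semantic": 0.70, "strength": 0.9, "recency": 0.9,
     "category": 1.0, "emotion": 0.6})
```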

Stage 3: Context-aware weighting. The scores are adjusted based on the current conversation context. If the user is debugging, case memories get a boost. If they’re planning, event and pattern memories rise. The system reads the situation and adjusts what it surfaces.
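A hypothetical shape for that adjustment: a table of per-context boosts applied on top of the composite score. The boost values and context labels are assumptions; the post specifies only which categories rise in which situations.

```python
# Illustrative context boosts; values are assumptions.
CONTEXT_BOOST = {
    "debugging": {"case": 1.3},
    "planning": {"event": 1.2, "pattern": 1.2},
}

def adjust_for_context(score: float, category: str, context: str) -> float:
    """Multiply the composite score by the boost for this context,
    defaulting to 1.0 (no change) for unlisted categories."""
    return score * CONTEXT_BOOST.get(context, {}).get(category, 1.0)
```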

Stage 4: Spreading activation. Once top memories are selected, the graph comes alive. Activation spreads along edges — pulling in causally linked events, related entities, and contradicting facts that the system should flag. This is where the graph topology pays off. A simple vector store stops at retrieval. Spreading activation lets the system think associatively.
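A minimal sketch of spreading activation over a weighted graph: activation flows outward from the selected memories for a few hops, attenuated by edge weight and a damping factor. The hop count, damping, and floor parameters are assumptions.

```python
from collections import defaultdict

def spread_activation(seeds: dict[str, float],
                      edges: list[tuple[str, str, float]],
                      hops: int = 2, damping: float = 0.5,
                      floor: float = 0.05) -> dict[str, float]:
    """Propagate activation from seed memories along weighted edges.
    Each hop multiplies activation by edge weight and a damping factor;
    values below `floor` are dropped."""
    neighbors = defaultdict(list)
    for src, dst, w in edges:
        neighbors[src].append((dst, w))
        neighbors[dst].append((src, w))  # treat edges as bidirectional here

    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        next_frontier = {}
        for node, act in frontier.items():
            for dst, w in neighbors[node]:
                a = act * w * damping
                if a >= floor and a > activation.get(dst, 0.0):
                    activation[dst] = a
                    next_frontier[dst] = a
        frontier = next_frontier
    return activation

# Retrieving the launch memory pulls in the causally linked failure,
# the rollback two hops away, and the cofounder entity:
graph = [
    ("launch", "deploy_fail", 0.9),    # causal
    ("deploy_fail", "rollback", 0.8),  # causal
    ("launch", "ravi", 0.6),           # entity
]
result = spread_activation({"launch": 1.0}, graph)
```

This is the associative step a flat vector store cannot take: "rollback" never matched the query, but the graph surfaces it anyway.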

Stage 5: Predictive preloading. Based on the trajectory of the conversation, the system preloads L1 and L2 content for memories likely to be needed in the next few turns. This reduces latency and lets the AI feel anticipatory rather than reactive.

Consolidation: The Sleep Cycle

Every six hours, Chvor runs a consolidation cycle — the system’s equivalent of sleep. Four operations execute during this window:

Fragment merging combines partial memories about the same entity or event into unified nodes. If three separate conversations mentioned details about a project, those fragments collapse into a single, richer memory.

Insight synthesis identifies patterns across memories that weren’t visible in isolation. The system might notice that the user always asks about performance optimization after deploying new features, generating a new pattern memory.

Narrative weaving connects event memories into coherent timelines, strengthening temporal and causal edges. The graph becomes not just a collection of facts but a story.

Graph pruning removes memories whose strength has decayed below threshold — nodes that were never reinforced, edges that lost relevance. This keeps the graph lean and retrieval fast.
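The pruning step reduces to a threshold filter over decayed strengths. A sketch; the 0.2 threshold is an assumption, as the post names no specific cutoff.

```python
def prune(nodes: dict[str, float], threshold: float = 0.2) -> dict[str, float]:
    """Keep only memories whose decayed strength is at or above threshold.
    The 0.2 default is an illustrative assumption."""
    return {node_id: s for node_id, s in nodes.items() if s >= threshold}

# A never-reinforced aside has decayed below threshold and is dropped:
strengths = {"reinforced_fact": 0.95, "stale_aside": 0.12, "old_event": 0.4}
lean = prune(strengths)
```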

Consolidation is what separates a memory system from a database. Databases store what you put in. Memory systems learn from what they’ve stored.

Cross-Channel Persistence

Chvor is channel-agnostic. The memory graph is tied to the user, not the interface. A preference stated in a Telegram chat is immediately available when the user switches to Discord or the web interface. An entity introduced in one channel doesn’t need to be re-explained in another.

This sounds simple, but it requires the memory system to normalize and reconcile information from different conversational contexts. The provenance metadata and consolidation cycles handle the merging. The graph handles the structure. The result is an AI that knows you — not an AI that knows what you said in one particular chat window.

Storage and Security

All of this runs on SQLite with sqlite-vec for vector operations. No external database dependencies. No cloud vector services. Everything stays on your hardware, under your control.

Memory is encrypted at rest with AES-256-GCM. Each user’s memory graph is isolated. Because Chvor is self-hosted, you own the infrastructure, the data, and the encryption keys. There is no telemetry, no external calls, no third-party access to your memory graph.

Why This Is Different

Most AI memory implementations are, at their core, semantic search over conversation logs. They embed text, store vectors, and retrieve by similarity. It works for simple recall, but it doesn’t learn. It doesn’t decay. It doesn’t consolidate. It doesn’t understand that a memory from last month might be more important than one from yesterday if it was reinforced ten times. It doesn’t know that two facts contradict each other, or that one supersedes another.

Chvor’s graph-based cognitive architecture treats memory as a living system. Memories are born, strengthened, connected, consolidated, and — when they are no longer relevant — allowed to fade. The system develops an understanding of the user that deepens over time, not because it stores more data, but because it processes what it has stored into something structured, weighted, and associative.

This is what it takes to build AI that truly remembers. Not a bigger context window. Not a faster vector index. A system that treats memory the way memory works — as a graph of relationships, shaped by time, reinforced by use, and refined by reflection.