Law 15 · Retrieval & Memory

Memory Is a System, Not a Window

Give the agent a hierarchy, not just a bigger prompt.

The principle

Treat the context window like a computer's RAM: an agent should actively page information between a small in-context working set and large external storage, deciding what to keep, evict, and recall. Cramming everything into one flat window conflates working memory with long-term storage and hits hard limits. Durable agent memory needs explicit tiers and self-managed retrieval.
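A minimal sketch of the paging idea, assuming a whitespace word count as a stand-in for real token counting. The class name `PagedMemory` and its methods are hypothetical and exist only for illustration: entries that no longer fit the in-context budget are paged out to an external store rather than discarded, and can be paged back in by key.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PagedMemory:
    """Small in-context working set backed by a larger external store."""
    budget: int                                     # max tokens kept in context
    working: deque = field(default_factory=deque)   # (key, text) pairs, oldest first
    store: dict = field(default_factory=dict)       # external storage: key -> text

    @staticmethod
    def cost(text: str) -> int:
        # Crude token estimate (whitespace split); a real tokenizer would differ.
        return len(text.split())

    def used(self) -> int:
        return sum(self.cost(text) for _, text in self.working)

    def remember(self, key: str, text: str) -> None:
        """Add an entry to context, paging the oldest entries out if over budget."""
        self.working.append((key, text))
        while self.used() > self.budget and len(self.working) > 1:
            old_key, old_text = self.working.popleft()
            self.store[old_key] = old_text          # page out, do not discard

    def recall(self, key: str) -> bool:
        """Page an entry from the store back into context; False if unknown."""
        if key not in self.store:
            return False
        self.remember(key, self.store.pop(key))
        return True

    def context(self) -> str:
        """Text that would actually be placed in the prompt."""
        return "\n".join(text for _, text in self.working)
```

Eviction here is simply oldest-first. That is the weakest reasonable choice and is used only to keep the sketch short; the point is that eviction moves data between tiers instead of deleting it.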

Why it happens

A flat, ever-growing prompt conflates working memory with long-term storage. It eventually hits the context limit, dilutes attention across irrelevant history, and pays to re-process the same tokens on every turn. That is why durable memory needs explicit tiers, with paging between a small in-context set and large external stores. MemGPT made this concrete: it treats the context window like a computer's RAM and gives the model self-directed functions to page information in and out of a larger external store, so the model itself manages what to keep, evict, and recall. Equally important is the retrieval policy that decides what to surface back into context. Generative-agent systems scored memories by a weighted combination of recency, importance, and relevance to the current situation, demonstrating that good recall is a ranking problem, not just a storage problem. Architecting these tiers and policies, rather than enlarging the window, is what keeps a long-running agent coherent.
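The ranking view of recall can be sketched as a weighted sum of three terms. Everything specific below is an assumption made for illustration: equal default weights, an exponential recency decay with a 24-hour half-life, importance pre-scored in [0, 1], and word overlap standing in for the embedding similarity a real system would likely use. It is not the scoring code of any particular system.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float   # assumed to be pre-scored in [0, 1]
    last_access: float  # timestamp, in hours


def relevance(query: str, text: str) -> float:
    """Jaccard word overlap in [0, 1]; a stand-in for embedding similarity."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def score(m: Memory, query: str, now: float,
          w_recency: float = 1.0, w_importance: float = 1.0,
          w_relevance: float = 1.0, half_life: float = 24.0) -> float:
    """Weighted sum of recency, importance, and relevance."""
    age = max(0.0, now - m.last_access)
    recency = 0.5 ** (age / half_life)  # 1.0 when fresh, halves every half_life hours
    return (w_recency * recency
            + w_importance * m.importance
            + w_relevance * relevance(query, m.text))


def recall(memories: list[Memory], query: str, now: float, k: int = 3) -> list[Memory]:
    """Return the k highest-scoring memories for the current query."""
    return sorted(memories, key=lambda m: score(m, query, now), reverse=True)[:k]
```

With these defaults an old but important and on-topic memory can outrank a fresh but irrelevant one, which is the behavior a recency-only or storage-only design cannot produce.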

Watch for

In practice

Your agent's long-running session keeps degrading: by hour two it is forgetting decisions from hour one. The cause is that you have been appending everything to one ever-growing prompt, so attention spreads thin and costs balloon. A bigger context window only delays the same wall. Build memory in tiers instead: a small working set in context, summarized recallable notes, and an external store the agent reads and writes deliberately, with explicit policies for what gets promoted, summarized, and evicted. Treat the window like RAM, not a filing cabinet.

Apply it

  1. Separate a small in-context working set from a large external store and page entries between them deliberately.
  2. Define explicit policies for what gets promoted, summarized, and evicted rather than appending everything.
  3. Rank what to recall back into context by a blend of recency, importance, and relevance to the current task.
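One way to make the policy in step 2 explicit is to write it as a small function that returns a decision per entry, instead of leaving it implicit in prompt-assembly code. The thresholds, the `Entry` fields, and the truncating `summarize` placeholder below are all hypothetical; a real agent would tune the thresholds and call a model to summarize.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"            # stays in the working context
    SUMMARIZE = "summarize"  # short note kept, full text archived
    EVICT = "evict"          # removed from context, full text archived


@dataclass
class Entry:
    text: str
    importance: float  # assumed pre-scored in [0, 1]
    age_turns: int     # turns since the entry was last used


def decide(entry: Entry, max_age: int = 5, keep_above: float = 0.8,
           summarize_above: float = 0.4) -> Action:
    """Hypothetical policy; thresholds are illustrative, not tuned values."""
    if entry.age_turns <= max_age or entry.importance >= keep_above:
        return Action.KEEP
    if entry.importance >= summarize_above:
        return Action.SUMMARIZE
    return Action.EVICT


def summarize(text: str, max_words: int = 8) -> str:
    """Placeholder summarizer (truncation); a real agent would call a model here."""
    words = text.split()
    return " ".join(words[:max_words]) + (" ..." if len(words) > max_words else "")


def compact(context: list[Entry]):
    """Apply the policy once, returning (working set, summary notes, archive)."""
    working, notes, archive = [], [], []
    for entry in context:
        action = decide(entry)
        if action is Action.KEEP:
            working.append(entry)
        else:
            archive.append(entry.text)  # nothing is lost, only paged out
            if action is Action.SUMMARIZE:
                notes.append(summarize(entry.text))
    return working, notes, archive
```

Because the policy is a plain function, it can be unit-tested and changed without touching the storage layer, and every entry leaving the context has a recorded reason.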

The takeaway

Architect memory in tiers — working context, recallable summaries, external stores — with explicit policies for what gets promoted or evicted, rather than relying on context length.
