Law 01 · Context & Reliability

Law of Context Decay

Agents fail at context, not reasoning.

The principle

Most bad outputs trace to missing, stale, or poisoned context — not a model that can't think. The model is usually smart enough; it was just reasoning over the wrong picture of the world. Garbage context produces confident garbage, and the confidence is exactly what makes it dangerous.

Why it happens

The failure is mechanical, not mystical: a transformer conditions every output token on whatever sits in the window, so a stale or contradictory fact is treated as ground truth with the same weight as a correct one. RLHF-tuned models make this worse because they are trained to be agreeable, and the Anthropic sycophancy study (Sharma et al., 2023) showed five frontier assistants will revise a correct answer toward a user's stated belief, meaning the model actively bends toward whatever framing the context supplies rather than resisting bad input. The model has no independent sense of freshness or provenance, so a 30-day-old cached record reads as current and the reasoning over it is flawless but pointed at the wrong world. This is why swapping in a stronger model rarely helps: a smarter reasoner over the same poisoned context just produces more confident wrong answers.

Watch for

The same question gives different answers depending on which session or document was loaded first.
Outputs confidently reference facts that are real but out of date, or contradict a source you know is in the window.
Bumping to a larger or newer model produces no measurable accuracy gain on the failing cases.

In practice

Your support agent keeps insisting a customer's subscription is active when it was cancelled last week, so the team files a ticket to upgrade to a smarter model. The real culprit: the RAG pipeline pulls a 30-day-old cached account snapshot, and the agent reasons flawlessly over stale data. Before swapping models, log the exact context the agent saw on three bad runs; you will usually find a contradiction or a stale record, not a dumb model. Fix the freshness and the 'reasoning bug' evaporates.

Apply it

On every bad run, dump and read the exact context the model saw before blaming the model.
Stamp each retrieved fact with its source and timestamp, and drop or refresh anything past a freshness threshold.
Detect contradictions in the assembled context and surface them instead of silently concatenating both.

The takeaway

Before you reach for a bigger model, audit what the agent could actually see. Curate the context window deliberately — fresh, relevant, free of contradictions — and most 'reasoning' failures quietly disappear.

Sources and further reading

Read every law in the digital edition Back to all 50 laws

The principle

Why it happens

Watch for

Apply it

Sources and further reading

Related laws