Category · 5 laws
Safety & Security
How agents get attacked, and how to contain the damage.
31
The Lethal Trifecta
Private data, untrusted content, and an exfiltration path — pick at most two.
32
Tokens Don't Wear Badges
The model can't tell your instructions from the attacker's — they're all just tokens.
33
The Confused Deputy
An agent with your privileges will wield them on an attacker's behalf.
34
Quarantine Untrusted Tokens
Let the privileged planner orchestrate, but never let it read the poison.
35
Sandbox the Blast Radius
Assume the agent gets compromised — then contain what it can reach.
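
The "pick at most two" rule from law 31 can be sketched as a check over an agent's tool configuration. This is a minimal illustration, not a prescribed implementation: the `Tool` class, its three boolean flags, and the tool names are hypothetical, and it assumes each tool's capabilities have already been labeled honestly by a human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical capability labels for one agent tool."""
    name: str
    reads_private_data: bool = False
    ingests_untrusted_content: bool = False
    can_send_externally: bool = False


def trifecta_legs(tools):
    """Return the trifecta legs enabled by the combined tool set."""
    legs = set()
    for tool in tools:
        if tool.reads_private_data:
            legs.add("private_data")
        if tool.ingests_untrusted_content:
            legs.add("untrusted_content")
        if tool.can_send_externally:
            legs.add("exfiltration_path")
    return legs


def violates_trifecta(tools):
    """True when all three legs are present in the same agent."""
    return len(trifecta_legs(tools)) == 3


# Hypothetical configuration: an inbox reader plus an outbound HTTP tool.
risky = [
    Tool("read_inbox", reads_private_data=True, ingests_untrusted_content=True),
    Tool("http_post", can_send_externally=True),
]
# Same agent with the outbound tool removed: only two legs remain.
contained = risky[:1]
```

Here `violates_trifecta(risky)` is true and `violates_trifecta(contained)` is false. The check looks at the combined tool set rather than at any single tool, because the three legs are usually spread across tools that each look harmless alone.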