Category · 5 laws
Evaluation & Measurement
Knowing it works without fooling yourself.
26
Vibes Don't Scale
Eyeballing outputs feels like progress until you can't tell whether a change helped.
27
Look at Your Data
The highest-ROI activity in AI is the one teams skip first.
28
The Judge Is Biased
An LLM grader reacts to an answer's length and position, not just its substance.
29
Goodhart's Trap
When your eval becomes the goal, it stops measuring what you cared about.
30
Regress or Repeat
Every fixed bug is a future regression unless it becomes a test.