Skip to content

Evaluations

Write Stupid Evals

Evals should start simple and get progressively more complex. It's important to start simple because what we're aiming to do is to build a habit for writing these simple assertions early. By doing so, we can start taking our vibes and turning them into objective metrics. This allows us to compare different approaches easily and make a data-driven decision into what works and what doesn't.

Don't overthink it and really just use a assert statement at the start.

There's a famous story about a pottery teacher who divided their class into two groups. The first group would be graded solely on quantity - how many pieces they could produce. The second group would be graded on quality - they just needed to produce one perfect piece. When grading time came, something interesting happened: the best pieces all came from the quantity group. While the quality group got stuck theorizing about perfection, the quantity group learned through iterative practice.