What it means to look at your data
People always talk about looking at your data, but what does that actually mean in practice?
In this post, I'll walk you through a short example. After examining failure patterns, we discovered that our query understanding step was aggressively filtering out relevant items. By prompting the model to be more flexible with its filters, I improved the recall of a filtering system I was working on from 0.86 to 1.
There are really two things that make debugging these issues much easier:
- A clear objective metric to optimise for - in this case, I was looking at recall (whether or not the relevant item was present in the top k results)
- An easy way to look at the data - I like using braintrust, but you can use whatever you want.
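To make the metric concrete, here is a minimal sketch of recall as described above: a query counts as a hit if its relevant item appears in the top k results. The function names and the toy data are hypothetical and for illustration only; this is not the code from the system discussed in this post.

```python
from typing import Sequence


def hit_at_k(retrieved_ids: Sequence[str], relevant_id: str, k: int) -> bool:
    """True if the relevant item appears in the first k retrieved results."""
    return relevant_id in retrieved_ids[:k]


def recall_at_k(examples: Sequence[tuple[Sequence[str], str]], k: int) -> float:
    """Fraction of queries whose relevant item shows up in the top k."""
    if not examples:
        raise ValueError("need at least one query to compute recall")
    hits = sum(hit_at_k(retrieved, relevant, k) for retrieved, relevant in examples)
    return hits / len(examples)


def misses_at_k(examples: Sequence[tuple[Sequence[str], str]], k: int) -> list:
    """The failing queries, which are the rows worth reading by hand."""
    return [ex for ex in examples if not hit_at_k(ex[0], ex[1], k)]


# Hypothetical toy data: (retrieved ids in ranked order, the one relevant id).
examples = [
    (["a", "b", "c"], "a"),
    (["d", "e", "f"], "f"),
    (["g", "h", "i"], "z"),  # relevant item never made it into the results
]

print(recall_at_k(examples, k=3))
print(misses_at_k(examples, k=3))
```

The aggregate number tells you whether something is wrong; the list of misses tells you where to start looking.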
Ultimately, debugging these systems is all about asking intelligent questions and systematically hunting for failure modes. By the end of the post, you'll have a better idea of how to think about data debugging as an iterative process.