Ivan rambles on about LLM reliability, evals and UX design
Engineer, Writer and Amateur designer
Currently engineering at 567 labs
ABOUT ME
Hailing from the sunny island of Singapore, I'm a Research Engineer passionate about language models. I maintain open-source libraries like Instructor (3M+ downloads) and actively contribute to projects like Kura.
I've had the privilege of working with clients like Hubspot and Raycast, and recently worked on a RAG course taken by engineers from OpenAI, Anthropic, DeepMind, and Bain.
I'm also a big fan of the outdoors and love hiking, biking, and swimming. When I'm not working, you can find me out on a trail or exploring Singapore's fantastic food scene.
RECENT THOUGHTS
Start simple with evals and build up complexity gradually. The best evaluation isn't the most sophisticated one - it's the one you'll actually use consistently.
Hard-earned lessons from generating millions of synthetic data points: validation matters more than volume, and success depends on careful thought and systematic checks rather than sheer scale.
Three key factors that make the biggest difference in LLM experiments: being clear about what you're varying, investing in infrastructure, and doing sensitivity analysis.