Ivan's Blog
Research Engineer at 567 Labs working on synthetic data generation and evaluations for Large Language Models. I maintain open-source libraries like Instructor and indomee.
Follow my newsletter for the latest updates on blog articles, the resources I liked most, and other random thoughts on taming your LLMs.
Latest Articles
Here are some articles I've written recently that might be of interest:
- Write Stupid Evals: Start simple with evals and build up complexity gradually. The best evaluation isn't the most sophisticated one - it's the one you'll actually use consistently.
- Are your eval improvements just pure chance?: A guide to statistical analysis for LLM evals using bootstrapping and t-tests to validate if improvements are significant or just random noise.
- Synthetic Data is not a Free Lunch: Hard-earned lessons from generating millions of synthetic data points and why validation matters more than volume. Success requires careful thought and systematic validation.
- You're probably not doing experiments right: Three key factors that make the biggest difference in LLM experiments: being clear about what you're varying, investing in infrastructure, and doing sensitivity analysis.