Reversing Chinese Poetry
Creating our first RL Pipeline with Verifiers
48 articles on language models, agents, and software design.
Lessons learnt from going from zero to $100M ARR in 8 months
Building a Byte-Pair Encoding tokenizer from scratch
Getting started with reinforcement learning
Models can't do much without the right context; agentic search provides just that
Switch between models with your own custom router
Migrating Our Coding Agent to React Ink
Implementing a coding agent in around 200 lines of JavaScript code
If you're not working with MCPs, you're missing out on a powerful tool for rapid prototyping and automation.
Turns out I never liked coding all along
Lessons from building a voice-based chatbot for customer service training
Three key things making it tough to build reliable voice agents
Key considerations when building UIs that rely on streaming LLM content
How the Model Context Protocol is really just a precursor to LLM applications as microservices
How to build reliable LLM applications with structured outputs, synthetic data and binary metrics
How do we understand conversations at scale while preserving user privacy?
How do I use Claude on a day-to-day basis?
A road map for 2025 and some thoughts on how to get started
A year of personal and professional growth
Practical guide to setting up text-to-image and video generation using Modal and ComfyUI
A practical guide to writing binary evals for subjective tasks
Why infrastructure, user experience, and data create lasting competitive advantages in AI applications
Learn how to evaluate and improve your LLM systems
How to start shipping more reliable LLM applications to prod
Grokking simple statistical analysis for LLM evals
Lessons learnt from writing 40k lines of documentation for instructor
The easiest thing you can do today to improve your synthetic data quality
Get 10x the results with LLMs in 3 simple steps
How I'm setting up my Mac for ML Open Source Development
Keep it simple and worry about the rest later
RAG isn't dead, it just got more complicated
Hard-earned lessons from generating millions of synthetic data points and why validation matters more than volume
How to get the results that you want from your LLM experiments
Deciding on the right LLM framework for your application
How your request goes from chat completion to validated Pydantic model
Writing your first eval test in under 5 minutes
Lessons from generating a few million tokens of synthetic data with gpt-4o-mini
Some thoughts from the AI Engineering World Fair
Speedrunning everything I learnt in the past year
A quick guide to creating tools for yourself
A few actionable tips for writing better machine learning scripts
Speedrun your way to becoming a good Python developer and don't make the same mistakes I did
Lessons from trying to teach myself about AI over the past 6 months
Using RAG to generate UI Components
A guide to a strong open-source transformer alternative
How to simulate a red-team attack on your own models to improve their robustness
Using LLMs to automatically tag and categorize your favourite eating spots
Implementing an Event-Driven approach for Whisper transcriptions