Writing good scripts for machine learning is an art. I struggled with writing them for a long time because of how different it was from my experience working with full-stack frameworks such as React or FastAPI.
There were four main issues that I struggled with:
My job has a high probability of failing for no apparent reason
My data might not fit into memory for no reason
Running a single job takes days or more
Optimizing hyper-parameters is genuinely difficult
In the past 6 months, I've 10xed the amount of Python code I've written. In this article, I'll show you a few easy, actionable tips for writing better and more maintainable code. I've been lucky enough to have Jason (@jxnlco on Twitter) review a good chunk of my code, and I've found that these few things have made a massive difference in my code quality.
Over the past 6 months, I've been trying to learn more about AI and LLMs. ChatGPT had me hooked when I tried it for the first time. Since then, I've been chatting to more people, shitposting on Twitter and working to learn as much as I can in my spare time.
That amounts to roughly 10-20 hours a week, since I don't have much of a social life, or about 400-500 hours in total since I started exploring this space, so take my experience with a grain of salt. I'm relatively new, and you're probably 2-3 months behind me at most, much less if you do it full time.
I've had some people reach out to me for advice on what to do and I figured I'd write a longer blog post so that I could refer to it myself and consolidate some of my ramblings.
The full code for this is available here for reference.
A while ago, I saw a demo video of Vercel's V0 and was blown away by what it could produce. It could take in user prompts, feedback and iteratively generate new and improved UI code using the popular @shadcn/ui library.
This was soon followed by the open-v0 project by raidendotai. Since I didn't have access to v0 via Vercel, I figured I would clone the project and try to figure out how it worked.
One eventful Friday evening later, I had put together a small prototype that uses context-aware RAG and pydantic to generate valid Next.js code based on a user prompt, which you can see below.
The GIF renders pretty slowly for some reason, so if you want to see the original clip, you can check it out here.
RWKV is an alternative to the transformer architecture. It's open source and has its own paper over here. I found out about it some time back in a paper club and thought I'd write a short article about it with what I had learnt.
Here are some other resources on RWKV which you might find useful:
RWKV by Picocreator: This is a markdown file that one of the contributors, Picocreator, used to give a short presentation on the RWKV architecture.
RWKV in 100 lines: This covers the implementation of RWKV in 100 lines of code. Much of this article is based on the content there; I try to extend it and provide my own intuition for some proofs. I've also attached a Colab notebook if you want to play with the code.
A while ago, a company called Lakera released a challenge called Gandalf on Hacker News which took the LLM community by storm. The premise was simple - get an LLM that they had built to reveal a password. This wasn't an easy task and many people spent days trying to crack it.
Some time after their challenge had been released, they were kind enough to release both the solution AND a rough overview of how the challenge was developed. You can check it out here. Inspired by this, I figured I'd try to reproduce it to some degree on my own in a challenge I called The Chinese Wall, built with Peter Mekhaeil for our company's annual coding competition. We will be releasing the code shortly.
Participants were asked to try to extract a password from an LLM that we provided. We also provided a Discord bot trained on the challenge documentation, which participants could ask questions.
Here's a quick snapshot of it in action:
The model uses OpenAI's GPT-3.5 under the hood with the instructor library for function calls.
As usual, you can find the code for this specific article here.
If you've ever used Google Maps, you've definitely struggled to decide where to go to eat. The UI ... frankly sucks beyond belief for an application that has all the data and compute that it has.