Build powerful applications with limited resources using the magic of transfer learning
Getting started making AI apps with fast.ai
I’m currently taking a course from fast.ai and came across an idea that kind of blew my mind: transfer learning. Back in the early 2010s, I was a researcher looking into machine learning models. At the time, it seemed that creating meaningful applications with machine learning required powerful computers and vast amounts of data. The Economist had just published “The Data Deluge.” Big Data was becoming a hashtag, and soon Data Science would be one too. It felt like a moment.
Neural nets were more of an oddity and hadn’t blown up yet; deep learning would come years later. Check out the search trends for “deep learning”; you can see it really takes off around 2017.
Last fall, I had the opportunity to help organize “Hacking Models”, a generative AI hackathon in SF led by Lenny Bogdonoff. I gave a talk at the start of the event to help spark ideas among attendees. If you’re interested, you can find a summary of my talk in this Twitter thread:
After the event, I wanted to start prototyping with these ideas around generative AI. If I were to get into deep learning, should I use TensorFlow or PyTorch? I asked Lenny, and he recommended getting started with fast.ai. It’s built on top of PyTorch and makes it easier to get up and running building applications.
What is transfer learning?
One of the main points in the first couple of fast.ai lessons is that you don’t need a ton of data or compute to do compelling things with machine learning. Transfer learning is taking a model trained for one problem and reusing it for another. In other words, you take a pre-trained model and fine-tune it for your particular use case. It’s the initial training that takes a bunch of resources; once a pre-trained model is available, transfer learning lets you do a lot with a little.
You can see an application of transfer learning in the image classifier that recognizes birds from the first lesson. The method is to search for a bunch of images of birds and “not birds” (in this example, forests), then take a pre-trained model, in this case resnet18, and use those images and labels to fine-tune it. What’s important to note is that you don’t need thousands of examples of birds and “not birds,” and you don’t need tons of computing time. Fine-tuning took only minutes on a cloud GPU.
Tune the model and make your own classifier
Given that you can train an image classifier from image search results, what would you make? I went through a few ideas and settled on doors: given an image of a door, is it open or closed? I can search for open doors and closed doors to build my two sets of labeled data. There isn’t much to it, and I’m not entirely moved by the idea, but hey, it’s something to learn with.
A note on GPU stuff
In picking a cloud GPU provider with a notebook interface, I tried Kaggle, then Paperspace, then Google Colab Pro. I started with the free Colab, and after a half hour of training realized I needed something better. On Paperspace I got on the paid plan, but all of the GPUs required extra charges, which came to about $3/day at my usage on top of the monthly cost. So I switched to Colab Pro. At first I used the premium GPUs, but they burned through the credits quickly, so I moved on to the regular ones, which work fine.
Build the classifier
Working off the earlier birds example, I created a model for classifying doors. In the second lesson we learned about data augmentation, a technique that lets you get more out of a small data set: you warp, stretch, crop, and otherwise transform the training images slightly to produce a larger set of training data. There’s also an interesting technique of building the classifier before cleaning the data. Because the images were just downloaded from a search, some of them aren’t actually good images of doors; some, for instance, were images of signs. After training, you can list the images where the classifier was least confident to see where to clean the data.
What I found fascinating in this process was coming across low-confidence images where the door was open just a crack. Was it open or closed? I’d be forced to choose when relabeling. I thought it would be obvious whether a door was open, but this made me revisit my assumed definition so I could make consistent choices. At what point is a door closed enough to count as closed?
Deploy the classifier
You can host machine learning apps on Hugging Face Spaces, which includes an integration with Gradio to make app building a bit quicker. I’d never used it before, and this tutorial helped. You can see my classifier here.
Note: while working through the course I hit a bug where the example images didn’t do anything when clicked. They worked in Gradio locally, but not on Hugging Face Spaces. After digging through everything, I pinned Gradio to the previous version, 3.15 instead of 3.16, and it worked.
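If you hit something similar, the version pin goes in the Space’s requirements.txt; for a fastai + Gradio setup like this one, that file might look like:

```
fastai
gradio==3.15
```

Spaces reinstalls dependencies on the next build, so pushing this file is enough to roll the version back.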
Hugging Face Spaces also exposes an API for the uploaded model, giving you more flexibility in your web apps. I took this as an opportunity to work in React. Hey, gotta start somewhere. You can see the web app here.
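To show the shape of that API call without the React scaffolding, here’s a Python sketch. The Space URL and endpoint path are assumptions for illustration — Gradio’s HTTP endpoint has varied across versions, so check the “Use via API” link on the actual Space for the real URL:

```python
import base64
import requests

def to_data_uri(path):
    """Encode a local image as the base64 data URI the Gradio API expects."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return f"data:image/png;base64,{b64}"

def classify_door(image_path, url):
    # url is something like "https://<user>-<space>.hf.space/run/predict"
    # (hypothetical; copy the real endpoint from the Space's API docs)
    resp = requests.post(url, json={"data": [to_data_uri(image_path)]}, timeout=30)
    resp.raise_for_status()
    return resp.json()["data"][0]
```

A front end does the same thing with `fetch`: POST a JSON body whose `data` list holds the encoded image, and read the predicted label out of the response.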
What’s inspiring me about this
Working with transfer learning helped me understand in a more tangible way what makes generative AI so powerful. I can take these models, and focus them on a particular application because of transfer learning.
Now when I go on Hugging Face and see all of the models, I’m thinking, wow, these are building blocks for so many possibilities. I see prompt writing in a new way when I play with ChatGPT: I can give it examples and direct the output, which feels like a lightweight form of fine-tuning. And as other generative AI models are released and become more powerful, I think: what are creative ways to tune them and discover new applications?
Stay tuned: in the coming weeks I’ll share the process of building my first GPT-3 app.
If you’re getting into building generative AI apps, I’d love to talk! I’m whichlight on Twitter. Feel free to send any inspiring resources and projects, over there or in the comments.