It’s a story that you’ve heard before: the biggest and best machine learning models need enormous datasets to get good performance on complicated tasks. When you are just getting started with deep learning, the sheer volume of data that you hear the state-of-the-art models need can be pretty discouraging. However, today I’m going to show you how to stand on the shoulders of giants to supercharge your next ML project on a tough task like object detection. Do you want quality results without having to train on the same mass of data? Enter: transfer learning.
(For those of you who are already familiar with transfer learning, feel free to skip ahead to the problem setup and results.)
“Transfer learning” means taking a model that was trained on a related task and retraining it on a new task of interest. We piggyback off the hundreds or thousands of compute hours already spent training the existing model, then fine-tune its weights on a much smaller dataset so that it is accurate for our use case.
In effect, we leverage the expertise and hard work that leading companies and research groups have already invested in building a good model, then take out the last layer and swap in one that matches our application. Intuitively, since the model was trained on a related task, the weights and internal representations it has already learned should be useful for the new use case. All the normal model training tips and tricks apply; the only difference is that we initialize the model with the pre-trained weights instead of random ones. Transfer learning is a key technique for getting production-ready ML systems to work well when labeled data is limited.
With the explanation out of the way, let’s see it in action!
Let’s say that I’m an ML-savvy rancher trying to improve the automated monitoring of my horses with a state-of-the-art object detection system (a setting I’m sure you’ve all encountered before). Some initial research suggests that there aren’t any models out there that are pre-trained on horse detection. Unlucky! Good thing we have Rapid and transfer learning.
For my data, I found an open source collection of animal images. With Rapid, I uploaded the images and instructed taskers to draw a tight bounding box around any horses present in each image. I then submitted a small calibration batch to make sure the labelers understood the task and to spot check my own data. Once that was done, I was galloping along! Within 2 hours I had 147 images back, labeled with bounding boxes around the horses, all for around $20.
The pre-trained model I’m transferring from is Faster R-CNN with a ResNet-50 backbone.
This model is already trained on the famous COCO (Common Objects in COntext) object detection dataset.
Now that I have a model and data, I just need to train. After slicing my data as I desire in Nucleus, it’s easy to keep my images and files bundled together neatly:
With all the data horseplay out of the way, we just have to load our model and data into the processor and let it do its thing.
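For readers who want to see roughly what "let it do its thing" involves, below is a sketch of one training epoch, again assuming PyTorch and a torchvision-style detection model (one that returns a dictionary of losses when called with images and targets in train mode). The function name, the batch format, and the choice of optimizer are assumptions for illustration; the post does not describe the actual training script.

```python
import torch


def train_one_epoch(model, batches, optimizer, device="cpu"):
    """Run one pass over `batches` and return the mean training loss.

    Each batch is (images, targets): a list of CxHxW float tensors and a
    list of dicts with "boxes" (N x 4, x1 y1 x2 y2) and "labels" (N,).
    """
    model.to(device)
    model.train()
    total, count = 0.0, 0
    for images, targets in batches:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

        loss_dict = model(images, targets)  # named loss terms
        loss = sum(loss_dict.values())      # optimize their sum

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total += loss.item()
        count += 1
    return total / max(count, 1)
```

Calling this function ten times in a loop, with something like `torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)` as the optimizer, would correspond to a 10-epoch run; the learning rate here is a placeholder, not a value taken from the experiment.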
I trained the model for 10 epochs, which took about 15 minutes on a GPU (or around 4-5ish hours on a CPU if you’re not pressed for time). The model achieved an F1 score of .517 (higher is better; when Faster R-CNN was initially released, its F1 score on COCO was .507). Way to go transfer learning! Here are some of the examples from the evaluation set:
Not bad, but we could do better. At this point I could spend more time tuning hyperparameters or training for more epochs. However, I’m too busy running the ranch to spend time doing hyperparameter sweeps or ensembling models.
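A quick aside on the metric: the post does not say exactly how the F1 score was computed, so the following is only one common convention for object detection, not necessarily the one used here. A prediction counts as a true positive if it overlaps a not-yet-matched ground-truth box with IoU at or above a threshold (0.5 is assumed below); unmatched predictions are false positives and unmatched ground-truth boxes are false negatives.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(predictions, ground_truth, iou_threshold=0.5):
    """F1 for one image; `predictions` should be sorted by descending confidence."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for pred in predictions:
        best_j, best_iou = -1, iou_threshold
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            overlap = iou(pred, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(predictions) - tp
    fn = len(ground_truth) - tp
    denom = 2 * tp + fp + fn
    # Convention chosen here: no predictions and no ground truth scores 0.0.
    return 2 * tp / denom if denom else 0.0
```

For example, with two ground-truth horses, one correct prediction, and one stray box, this gives one true positive, one false positive, and one false negative, so F1 = 2 / (2 + 1 + 1) = 0.5.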
Normally a bottleneck for model improvement is a lack of quality labeled data. Thankfully with Rapid, there’s no such bottleneck. I still have unlabeled images in my dataset – let’s see what happens when I change nothing else but add more labeled data!
The newly labeled data brings my total to 351 annotated images, which only cost me around $45 on Rapid. After 10 more epochs with the new data incorporated, the model improves our F1 score to .655.
Much better, and I didn’t even need to do any hyperparameter tuning! Let’s compare the new and old model predictions:
With more data, it seems that the model gets a little less confused with crowds of horses and captures more of the visible edges of the horse.
We’ve seen how to apply transfer learning to a custom use case. Now all I need to do is productionize and deploy my custom horse detection model, and I’ll be sure to take my ranching (is that a word?) to the next level. I hope you’ve been inspired to try out Rapid for your next transfer learning project! You can get started with Scale Rapid today. Want to get a jump start on your own experiments? Access the code from my experiment on Scale’s GitHub.