Scale Rapid is a quick and easy way to get clean, high-quality labeled data for your next ML project. We used it to train a sentiment analysis model in a previous post - and we’re back now to show you how to use Scale Rapid to train an unsupervised model!
In this post, we are going to use CycleGAN (paper) to anime-fy images with Scale Rapid.
CycleGAN is trained in an unsupervised manner on unpaired data: an input x does not need a corresponding 1:1 target y in order to train the network.
Instead of using Scale Rapid to label training data, we used it to clean data. We separated a group of images scraped from the internet into categories. This classification allows us to make the two inputs necessary for CycleGAN - a class A (real-life images) and class B (Japanese anime-style images).
To start, we downloaded 10k images from Google Images across a variety of searches. At a glance, the corpus had all sorts of images.
Class A contains real-life images and Class B contains anime-style images. Image 3 belongs to Class A, and Image 2 to Class B. Images 1 and 4 do not belong in the dataset as they are neither real-life nor anime-style images.
The ask for Scale Rapid is to sort each image into one of three types: a real-world, realistically edited photograph; a Japanese-style anime scene; or neither. The Rapid responses are then used as ground truth to decide which images should and shouldn’t be in the dataset we train CycleGAN on.
Rapid’s categorization endpoint supports image categorization - and it's only 8 cents per image!
Because CycleGAN is trained on unpaired data, we don’t need to match each class A image to a class B image. We can simply separate the two classes from one another and train.
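To make the "no pairing" point concrete, here is a minimal sketch of how batches could be drawn from the two classes independently. The file names and the `sample_unpaired_batch` helper are hypothetical and only illustrate the idea; they are not part of CycleGAN or Scale Rapid.

```python
import random


def sample_unpaired_batch(class_a, class_b, batch_size, rng):
    """Draw one batch from each class independently; no A-to-B pairing is assumed."""
    batch_a = rng.sample(class_a, batch_size)
    batch_b = rng.sample(class_b, batch_size)
    return batch_a, batch_b


# Hypothetical file names standing in for the two cleaned image sets.
# The two sets do not need to be the same size.
real_images = [f"real_{i:04d}.jpg" for i in range(100)]
anime_images = [f"anime_{i:04d}.jpg" for i in range(60)]

rng = random.Random(0)
batch_a, batch_b = sample_unpaired_batch(real_images, anime_images, 4, rng)
print(batch_a)
print(batch_b)
```

Nothing ties the first image of `batch_a` to the first image of `batch_b`, which is why a clean split into two classes is all the dataset needs.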
Using Scale Rapid
Scale Rapid works in two phases. First, the user sets up the project and instructions, and runs small calibration batches to ensure the initial results are as expected. Once quality on those looks good, larger production batches can be uploaded for labeling.
For this application, we set up a categorization project. The taxonomy is just one multiple-choice question:
What type of image is this?
Anime (japanese animation) image
Photorealistic (real-life) image
Neither
Scale Rapid also has a feature called long hints, which lets us show labelers a tooltip with specific things to consider when they select a choice. For the first choice (Anime (japanese animation) image), we clarified with the following long hint:
"Japanese animation refers specifically to anime, as in the instructions. Other cartoon-style or generated images belong in the neither category."
We begin by uploading a calibration batch, and follow up with production batches. The labeling results are easily parseable via JSON.
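As an illustration of that parsing step, here is a small sketch that groups labeled images by class. The payload shape and the field names (`image`, `label`) are assumptions made up for this example and are not Scale's actual response schema; check the real response format before adapting it.

```python
import json
from collections import defaultdict

# Hypothetical payload: the field names are assumptions, not Scale's real schema.
SAMPLE_RESPONSES = """
[
  {"image": "img_001.jpg", "label": "Photorealistic (real-life) image"},
  {"image": "img_002.jpg", "label": "Anime (japanese animation) image"},
  {"image": "img_003.jpg", "label": "Neither"}
]
"""

# Map the taxonomy choices onto the two CycleGAN classes.
LABEL_TO_CLASS = {
    "Photorealistic (real-life) image": "class_a",
    "Anime (japanese animation) image": "class_b",
}


def group_by_class(raw_json):
    """Assign each labeled image to class_a, class_b, or neither."""
    groups = defaultdict(list)
    for task in json.loads(raw_json):
        target = LABEL_TO_CLASS.get(task["label"], "neither")
        groups[target].append(task["image"])
    return dict(groups)


groups = group_by_class(SAMPLE_RESPONSES)
print(groups)
```

Anything that is not explicitly mapped falls into `neither`, so an unexpected label is excluded from training rather than silently added to a class.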
Scale Rapid Results:
An audit of 200 randomly selected labeled images found just one mistake, a 0.5% error rate. A few examples of correctly labeled images are below, organized by class.
With the help of Scale Rapid, raw data from the web was transformed into cleaned and confirmed data and separated into class A, class B, and neither.
CycleGAN produces mappings from class A to class B and vice versa using two different types of losses: adversarial loss and cycle consistency loss.
Adversarial loss incentivizes the mapping to generate images that look similar to the target set. We achieve this by optimizing against an adversary D, a discriminator that tries to distinguish generated images of the target class from actual data from the target class.
Cycle consistency loss incentivizes the two mappings to undo one another, so that a translated image keeps the content of its source. This occurs by minimizing the difference between the original data and that same data after it has been transformed into the target class and then mapped back to its original class.
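A minimal sketch of the adversarial idea, using the classic log-loss (binary cross-entropy) form on plain Python lists. This is only one of several GAN loss formulations and is not necessarily the one used in our training run; the discriminator scores below are made-up numbers.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for D: score real images near 1, generated ones near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_adversarial_loss(d_fake):
    """Non-saturating generator loss: low when D scores generated images near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs: probability that an image is real target-class data.
d_on_real = [0.9, 0.8]
d_on_generated = [0.2, 0.1]

print(discriminator_loss(d_on_real, d_on_generated))
print(generator_adversarial_loss(d_on_generated))
```

The two losses pull in opposite directions: the discriminator is rewarded for telling generated images apart, while the generator is rewarded for fooling it.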
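A minimal sketch of a cycle consistency term, assuming the L1 round-trip form described in the CycleGAN paper: translate an image to the other class, translate it back, and measure how far the result is from the original. The toy "generators" here are hypothetical arithmetic functions standing in for neural networks.

```python
def l1_distance(x, y):
    """Mean absolute difference between two flattened images."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def cycle_consistency_loss(a, b, g_ab, g_ba):
    """Round-trip reconstruction error in both directions: A->B->A and B->A->B."""
    forward = l1_distance(g_ba(g_ab(a)), a)
    backward = l1_distance(g_ab(g_ba(b)), b)
    return forward + backward


# Toy stand-ins for the two generators (hypothetical; real ones are networks).
def to_anime(img):
    return [p + 0.5 for p in img]


def to_real(img):
    return [p - 0.5 for p in img]  # exact inverse of to_anime


def bad_to_real(img):
    return [0.0 for _ in img]  # ignores its input entirely


photo = [0.25, 0.5, 0.75]
anime = [1.0, 1.5, 0.5]

consistent = cycle_consistency_loss(photo, anime, to_anime, to_real)
inconsistent = cycle_consistency_loss(photo, anime, to_anime, bad_to_real)
print(consistent, inconsistent)
```

When the two mappings invert each other the loss is zero; a mapping that discards its input is penalized, which is what keeps the generated anime image tied to the content of the source photo.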
We trained CycleGAN with our Scale Rapid sorted dataset.
We are now able to anime-fy images! A few examples: