Toyota Research Institute

Creating large volumes of training data without sacrificing quality

Overview

Toyota Research Institute’s mission is to build a new approach to mobility and to pioneer the technologies that drive its future.

As part of this mission, one of TRI’s key research areas is automated driving, where it is developing two automated driving modes in parallel: Guardian and Chauffeur.

Guardian monitors a human’s driving task, intervening only when necessary. Chauffeur mode, on the other hand, takes all responsibility for driving, leaving occupants purely as passengers.

“Imagine a future where mobility is truly for all. Regardless of physical faculties, age, or income, you can participate in this world because technology has progressed to enable you to move as you want.”
David Garber, Product Manager, TRI

This dual approach allows TRI to “deliver short term value (Guardian) while working on this long term project (Chauffeur),” says Adrien Gaidon, Machine Learning Lead.

Because Toyota is the largest automobile manufacturer in the world, TRI’s research efforts have the potential to impact millions of car owners worldwide.

The Problem

Large volumes of data, but limited ability to label this data

Led by Adrien Gaidon, TRI’s Machine Learning team found itself with large volumes of data, but limited ability to label this data.

The need for large volumes of annotated data was obvious, but because its commitment to safety is non-negotiable, the TRI team did not want to trade quality for quantity.

To support the research teams, David Garber and Ashmi Wadhwani were tasked with managing the data annotation pipeline, data infrastructure and machine learning infrastructure.

Very quickly, they recognized the need for a labeling provider they could rely on long term to meet the ever-changing needs of TRI’s researchers.

Toyota's car viewed from the top, showing the different LiDAR sensors

The Solution

Scale’s Sensor Fusion Cuboids, Sensor Fusion Segmentation and Semantic Segmentation

TRI looked into various solutions for its data annotation problem and ultimately made Scale its main provider. This close working relationship led to TRI participating in the private beta of Sensor Fusion Segmentation, a challenging annotation type in which every point in a 3D point cloud needs to be painted.

Beyond new annotation types and features, working with Scale also gives the TRI team greater flexibility and the ability to amend workflows. “One of the things we love about Scale is the fact that we can fully label the world. We can label 2D bounding boxes, 3D bounding boxes, but also semantic segmentation, including in 3D, to understand as much as possible, including scenarios we don’t foresee today,” says Gaidon.

“The ability to go to Scale for multi-modal annotation gives TRI an advantage in the automated driving space,” concludes Garber.

Since starting its work with Scale, Garber’s team has been able to support four large annotation pipelines without significantly increasing its size.

“Very quickly, our engineers liked what they saw and we asked Scale to ramp 10X throughput in a matter of weeks. Scale’s been able to support the extra throughput request and allowed us to do great research.”

Wadhwani adds, “Scale has turned around features for us very quickly. We've even had a feature added to help us get a new annotation to the way the ML team needed within 24 hours.”

The Future

Leveraging the latest annotations and features

TRI will continue to leverage Scale's latest annotations and features, such as Sensor Fusion Segmentation.

“One of the things we love about Scale is the fact that we can fully label the world [...] to understand as much as possible, including scenarios we don’t foresee today.”
Adrien Gaidon, Machine Learning Lead, Toyota Research Institute