Overview
Kodiak Is Building Perception and Autonomy Systems that Drive the Future of Freight Transportation.
Kodiak Robotics is an autonomous technology company building self-driving technology for the long-haul trucking industry. Based in Mountain View, CA, Kodiak combines a unique sensor fusion system with a lightweight mapping solution to safely navigate all aspects of highway driving and deliver freight efficiently and on time. Kodiak’s team, which includes several self-driving industry veterans, is redefining the long-haul trucking industry by “building the world’s most efficient, reliable, and respected end-to-end delivery solution.”
The Problem
Large Training Datasets, yet Few Examples of Important Edge Cases.
In typical driving scenarios, trucks don’t encounter pedestrians on the highway. When they do, though, detecting the pedestrian and navigating the unexpected situation safely is a requirement for any production-level autonomous vehicle system. Kodiak’s software stack learns to identify and navigate rare scenarios by training models on examples, but it is often difficult to collect enough real-world examples to handle certain edge cases reliably. For Kodiak, one of those challenging edge cases is pedestrians walking on the highway.
The Solution
Increase Model Robustness by Training on Synthesized Rare Scenarios.
The Kodiak team chose Scale to provide synthetic data that augments Kodiak’s existing ground-truth training data with simulated pedestrians. Scale uses a human-in-the-loop generation process to create diverse and realistic synthetic data: trained taskers can validate the placement and poses of synthetic pedestrians to ensure the result is realistic. Scale delivers the data through the same dashboard and APIs as Kodiak’s existing annotation pipeline, making integration seamless.
The Result
Nucleus Helps Kodiak Identify Where More Synthetic Data Will Improve Accuracy.
In Nucleus, Kodiak plans to keep using Natural Language search and Autotags to find the specific scenes in its dataset that contain edge cases where the model needs to improve. These include, among other scenarios, scenes where construction workers are present and scenes where a vehicle is traveling under a bridge.
For efficiency, the Kodiak team centralized all of their data, including multiple labeling projects and raw, unlabeled data, into a single dataset. This allows the team to quickly iterate on model experiments, query for specific attributes or metadata on the fly, and close the loop toward a more end-to-end data and model management system.
Going forward, the team can review both insights and model metrics in Nucleus to identify scenes with poor IoU (intersection over union) and curate subsets of data where the model is underperforming and where additional synthetic data might help.
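For readers unfamiliar with the metric, IoU is the area of overlap between a predicted box and a ground-truth box divided by the area of their union. The sketch below is a minimal illustration of computing it and flagging low-scoring scenes. It is not Nucleus's API or Kodiak's pipeline; the `Box` type, the `low_iou_scenes` helper, the scene IDs, and the 0.5 threshold are all hypothetical, and it assumes one axis-aligned 2D box pair per scene.

```python
from typing import Dict, List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned 2D box (hypothetical type, for illustration only)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def low_iou_scenes(
    scenes: Dict[str, Tuple[Box, Box]], threshold: float = 0.5
) -> List[str]:
    """Return IDs of scenes whose (ground truth, prediction) IoU is below threshold.

    The 0.5 default is an assumed cutoff, not a value from the case study.
    """
    return [
        scene_id
        for scene_id, (truth, pred) in scenes.items()
        if iou(truth, pred) < threshold
    ]


if __name__ == "__main__":
    # Made-up example scenes: one good prediction, one poor one.
    scenes = {
        "scene_a": (Box(0, 0, 2, 2), Box(0, 0, 2, 2)),
        "scene_b": (Box(0, 0, 2, 2), Box(1, 1, 3, 3)),
    }
    print(low_iou_scenes(scenes))
```

A subset selected this way is the kind of slice that could then be reviewed as a candidate for additional synthetic data.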