We founded Scale to create the infrastructure needed to build AI in any industry, by anyone. We started tackling this complex problem at the root – turning raw data into high-quality training data for models. In this pursuit, we have spent the last four years building ML-augmented annotation products for all data types, expanding our solutions to major industries, and making significant technological strides in how we scale our own use of ML.
But the problem of building effective, accurate, and unbiased ML models remains, and aggregate metrics alone are not enough to solve it. Better ML starts with understanding your data in depth. To improve production ML, you need to understand a model’s qualitative failure modes, fix them by gathering the right data, and curate diverse scenarios.
Before training a model, ML engineers must curate and sample their data, ensuring that they have the right data to solve a specific problem. This process is too often highly manual. For example, to teach a self-driving car how to handle left turns, an ML team has to manually crawl through its driving sequences and isolate examples of left turns for the training dataset. The data also needs to be representative of the ground truth of the problem you’re trying to solve. If you’re building a model to assign a gender to faces, you have to ensure the data represents all genders in order to produce unbiased outputs. Again, that is too often a manual and highly inefficient process.
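To make that pain concrete, here is a minimal sketch of the kind of one-off curation script this forces teams to write; the scene schema and field names are hypothetical.

```python
# Hand-rolled curation over hypothetical scene metadata: the kind of
# one-off filtering script described above.
from collections import Counter

scenes = [
    {"id": "seq_001", "maneuver": "left_turn", "time_of_day": "day"},
    {"id": "seq_002", "maneuver": "straight", "time_of_day": "night"},
    {"id": "seq_003", "maneuver": "left_turn", "time_of_day": "night"},
]

# Isolate left-turn sequences for a task-specific training set.
left_turns = [s for s in scenes if s["maneuver"] == "left_turn"]

# Check whether the slice is representative across conditions.
print(Counter(s["time_of_day"] for s in left_turns))
# Counter({'day': 1, 'night': 1})
```

Every new slice means another script like this, and nothing guards against the slice itself being unbalanced.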
After training, ML teams test and benchmark model performance, ensuring that the test dataset is sufficiently representative of the problem the model will be measured against. For example, a model learning to tell cars apart from pedestrians needs enough examples of both for accurate benchmarking. This requires ML engineers to spend significant amounts of time building one-off UIs to chart and share performance data.
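A basic representativeness check is easy to sketch (the labels below are illustrative), but teams end up re-implementing it, along with the charting around it, for every project:

```python
from collections import Counter

# Hypothetical test-set labels; in practice these come from annotations.
test_labels = ["car", "car", "pedestrian", "car", "pedestrian"]

counts = Counter(test_labels)
total = sum(counts.values())
for cls, n in counts.items():
    print(f"{cls}: {n} ({n / total:.0%})")
# car: 3 (60%)
# pedestrian: 2 (40%)
```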
After deployment, ML teams must debug the model, identifying failure modes and fixing them. Too often, issues in the data surface only after a model has entered deployment – requiring time-consuming debugging. One Scale customer, for example, found that their vehicle recognition algorithm didn’t perform well in certain environments – it turned out that the model was trained on a dataset where vehicles were mostly in the bottom of the image, so the model associated “bottom of the image” with “likelihood of being a car.”
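A positional bias like this one can be caught before deployment with a quick distribution check over the ground-truth boxes. The sketch below assumes a hypothetical (x, y, width, height) box format, with y measured from the top of the image:

```python
def vertical_center_fraction(box, image_height):
    """Return a box's vertical center as a fraction of image height."""
    _, y, _, h = box
    return (y + h / 2) / image_height

# Hypothetical ground-truth vehicle boxes in (x, y, width, height) format.
boxes = [(120, 600, 80, 50), (300, 580, 90, 60), (50, 200, 70, 40)]

fractions = [vertical_center_fraction(b, image_height=720) for b in boxes]
mean = sum(fractions) / len(fractions)
print(f"mean vertical center: {mean:.2f}")  # values near 1.0 = bottom of image
```

If nearly every vehicle sits near the bottom of the frame, the model has little incentive to learn anything else.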
The Scale team has been working to productize the concept that Andrej Karpathy calls "Operation Vacation." Nucleus is a new way – the right way – to develop ML models: it moves teams away from the concept of a single dataset and toward a paradigm of collections of scenarios, and it gives ML engineers the ability to automate time-consuming manual steps in the ML development process.
Scale Nucleus provides advanced tooling for understanding, visualizing, curating, and collaborating on your data – allowing teams to build better ML models via a powerful interface and APIs (a brief sketch of the Python client follows the list below). With Scale Nucleus, you can:
Visualize your dataset, ground truth, and model predictions to improve model performance
Curate interesting slices within your dataset for active learning and identifying key edge cases
Upload and choose data to be annotated for rare event mining and dataset balancing
Search your data based on metadata or ML-produced attributes
Identify edge cases through visual search
Measure key metrics like dataset balance, class correlation, and confusion via a powerful insights tab that shows the overall health of the data
Debug model performance
Share your dataset seamlessly to provide a single source of truth for data within your team
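Here is a simplified sketch of uploading data with searchable metadata through the Nucleus Python client. It is illustrative rather than a full reference; exact names and parameters may differ in the released client, so consult the Nucleus documentation:

```python
import nucleus  # illustrative: the Nucleus Python client

client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
dataset = client.create_dataset("driving-scenes")

# Upload an item with metadata that Nucleus can later slice and search on.
dataset.append([
    nucleus.DatasetItem(
        image_location="s3://your-bucket/scenes/seq_001.jpg",
        reference_id="seq_001",
        metadata={"maneuver": "left_turn", "time_of_day": "night"},
    )
])
```

Once uploaded, the same metadata powers slicing, search, and the insights described above, so curation no longer requires one-off scripts.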
Scale Nucleus is directly integrated with Scale AI’s data labeling pipeline – allowing teams to fix any issues in their data at the source. Nucleus currently supports Image data, with support for 3D Sensor Fusion, Video, Text, and Document data coming soon.
We built Scale Nucleus to allow data scientists and ML engineers to manage data more efficiently and increase the marginal value of their data. Deeper insight into a dataset’s features can have several transformational effects on the development of AI:
For engineers, easier visualization and curation of datasets lowers the barrier to entry for building ML systems.
Spotting hidden failure modes in datasets before deployment (such as a set of driving sequences that doesn’t include any sequences at night) makes it much easier to train high-quality models, and provides a robust way of eliminating issues like bias at the source.
The ease of debugging could significantly improve iteration speed on model fine-tuning post-deployment.
We are excited to take the next step in building the infrastructure to enable efficient, accurate, and unbiased ML development. If you’d like to join us in this journey and try Scale Nucleus, contact us at email@example.com or sign up on our website.