Building In-house Labeling Operations for High-Quality Training Data
Voxel is Changing How Companies Manage Risk and Operations
Voxel is on a mission to leverage AI and computer vision to fundamentally change how companies manage risk and operations. To do this, Voxel enhances its customers' security cameras with real-time AI to detect hazards, risky activities, and operational inefficiencies.
"We enable safety managers to keep more people safe across large warehouses and spaces. Our technology delivers insights that can stop injuries before they happen and ultimately improve the safety culture for hazardous and dynamic environments." - Harishma Dayanidhi, VP Engineering
To develop a robust computer vision system, Voxel needs large amounts of high-quality training data for its models. That data must cover the many situations and edge cases Voxel's system has to identify in an industrial setting, including potential hazards, risky activities, and inefficiencies.
Producing Quality Training Data while Automating the Process
Voxel's computer vision team faced two challenges: 1) how to maintain high-quality training data and 2) how to automate their labeling process for faster throughput – all while retaining their in-house annotation team.
Voxel had already invested the time and effort to assemble an in-house annotation team of subject-matter experts who were well-versed in handling Voxel's specific use case. Voxel saw a strategic advantage in keeping its internal labeling operations. With a team in place, Voxel began looking for a solution that could introduce greater efficiency to its labeling operations.
Until then, the team had been using an open-source solution called Computer Vision Annotation Tool (CVAT). However, the computer vision team at Voxel was ramping up the volume of annotations it needed for model training and was running into significant bottlenecks with CVAT.
From the operations side, Voxel could not efficiently and programmatically collect data and insights on the data labeling process, resulting in significant manual effort from the data operations team. The open-source tool couldn’t effectively link data quality to individual annotators. Thus, if the team produced a batch of low-quality labels, they couldn't determine whether it was the training, the annotators, or something else. This environment made it difficult for Voxel to automate its data labeling process and scale its labeling operations.
On the engineering side, with CVAT, Voxel needed to custom-build data pipelines for new customer projects. Given the complexity of these pipelines, it took multiple engineers four weeks to build the required data infrastructure for each project.
Labeling Operations Require APIs, Admin Features, and Integrated Tools
The confluence of these two factors led Voxel to look for a partner with strong data annotation expertise who understood the challenges and pain points of managing an in-house annotation team.
Scale Studio was selected because of:
- Studio's comprehensive management features with training courses, benchmark tasks, and annotator metrics (e.g., throughput, efficiency, and accuracy)
- Scale's APIs for easy integration of data pipelines and quick set up of labeling projects
- Scale's ecosystem of integrated ML tools, such as Nucleus for dataset curation and management, and Rapid for Scale-managed dataset annotation
- Scale's experience with processing billions of annotations, confirming Scale's platform reliability and tried-and-true infrastructure
- Finally, Scale's credo of "earn customer love" provides the Voxel team with the responsiveness and support necessary to achieve their ambitious goals
While Studio has proven incredibly easy to use for the project management and annotator teams, Voxel has also called out the strong technical knowledge and responsiveness of Scale's customer success and engineering teams in helping it get the most out of the platform. For example, Scale partnered with the Voxel team to ensure that all frames of complex variable-frame-rate (VFR) videos were extracted, maximizing the accuracy of the annotations and thus the accuracy of the model.
“I would definitely say the support we’ve received in working with the Scale team is the best part of the partnership so far… the responsiveness is amazing. If we have a problem, the Scale team always comes up with a thought-out solution.”
VP Engineering, Voxel
Efficient Operations for High-Quality Training Data and Time Savings
After kicking off the project with Scale Studio, Voxel onboarded its 20+ subject-matter experts onto the platform. Studio gave Voxel's data operations managers visibility into their in-house labeling team through annotator metrics such as throughput, efficiency, and accuracy. Studio also streamlined the data labeling process with intuitive tools and standardized workflows. Managers could now forecast labeling capacity and plan for variable labeling demand. Compared with their previous ad hoc, manual approach, Voxel's operations managers saved 20% of their time each week using Studio.
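As a rough illustration of what per-annotator metrics like these involve, the sketch below computes throughput and benchmark accuracy from a task log. It does not use Scale Studio's API or reflect its actual metric definitions; the `TaskRecord` fields and the `annotator_metrics` function are hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    # All fields are hypothetical; a real labeling tool will expose its own schema.
    annotator: str        # annotator identifier
    seconds_spent: float  # time spent labeling this task
    is_benchmark: bool    # True if the task has a known-correct reference label
    is_correct: bool      # whether the label matched the reference (benchmark tasks only)


def annotator_metrics(records):
    """Group task records by annotator and compute simple quality metrics."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.annotator].append(record)

    metrics = {}
    for annotator, rows in grouped.items():
        hours = sum(r.seconds_spent for r in rows) / 3600
        benchmarks = [r for r in rows if r.is_benchmark]
        metrics[annotator] = {
            "tasks": len(rows),
            # Throughput: completed tasks per hour of labeling time.
            "tasks_per_hour": len(rows) / hours if hours > 0 else 0.0,
            # Accuracy is only defined where a reference label exists.
            "benchmark_accuracy": (
                sum(r.is_correct for r in benchmarks) / len(benchmarks)
                if benchmarks
                else None
            ),
        }
    return metrics
```

Because every record carries an annotator identifier, a low-quality batch can be traced to the individual annotators who produced it, which is the link between data quality and annotators that the team lacked before.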
"Our data operations managers were spending 20% of time each week to manually log batches of data, assign datasets to annotators, and estimate project completion times. We couldn't standardize the workflow and accurately provide visibility to other teams. With Scale Studio, we streamlined this process, giving us the clarity we needed." - Harishma Dayanidhi, VP Engineering
Studio also helped Voxel's computer vision engineering team increase its capacity. Scale's APIs made it easy to integrate multiple data pipelines into the team's operations, leaving less manual work for engineers. With Studio, the engineering team cut the time needed to kick off new projects by 50%.
"The ease of using Scale's APIs accelerated our timeline for building data pipelines into our operations. It cut our lead time in half. For each new project, our engineers will now free up at least two weeks of their time and can refocus on other priorities."
VP Engineering, Voxel