Scaling Menu Transcription Tasks with Scale Document

November 16, 2020

When Covid-19 first hit, many companies had to rapidly reorient their

operations – from altering their supply chains to implementing social

distancing and work-from-home orders.

One sector that was particularly hard hit was the restaurant industry. When

in-person dining was suspended, restaurants turned to delivery to survive. But

setting up a delivery service from scratch takes time and money that many

establishments, with their income drying up, did not have. Suddenly, food

delivery platforms became an economic lifeline.

To ensure restaurants could onboard quickly and seamlessly in these critical

early days, one of the leading delivery platforms turned to Scale.

Scaling labeling

Every time a restaurant joins one of the delivery platforms or changes its

menu, the menu’s contents need to be inputted into the platform so users can

select their food. Inputting menus manually is slow, expensive, and leads to

mistakes. With restaurants relying on delivery to reach customers and support

their staff, a leading delivery platform knew it was vital to ensure an

efficient process for its partner restaurants.

This presented Scale with unique latency and quality control challenges,

requiring both an incredibly high level of accuracy and a fast turnaround to

provide operational efficiencies at a critical time for our customer. Our aim

was to use the combination of partial automation and human quality control in

our Scale Document labeling pipeline

to make dramatic efficiency improvements over manual processes. In all, we

nearly halved the time it takes our customer to process menus and reduced

critical error rates to <1% for all items labeled, allowing them to onboard

restaurants smoothly exactly when they needed it.

Building the right data infrastructure

While processing standardized documents such as food menus, government IDs or

loan application forms might be intuitive and simple for humans, it is a

surprisingly nontrivial problem for algorithms, involving much more than

automatic transcription. In particular, the data often contains many

dependencies that need to be captured accurately. For example, ordering a

pepperoni pizza doesn’t just mean choosing the type, but also the size, the

type of base, any substituted or customized toppings (which can also

vary in price depending on the size of the pizza, creating more

interdependencies), and whether to add extras like burgers or sandwiches. It’s

important to capture these dependencies in the menu correctly, otherwise

inputting errors risk causing losses for both the restaurant and the delivery

platform.

Algorithmically, this has to be represented as complex decision trees made

up of categories, items, and lists of options, capturing the relationships

between them. Thankfully, both our document processing pipeline and form

capture tools were already set up to capture these types of nested data

structures automatically without the need for customization. This meant we

were able to quickly pivot our tools for our customer’s requirements.
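
To make the nesting concrete, here is a minimal sketch of how categories, items, and option lists could be modeled, with option prices that depend on the chosen size. It is an illustration only, not Scale's actual schema: the class names, fields, and the `order_price` helper are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Option:
    """A single choice, e.g. a topping (hypothetical structure)."""
    name: str
    # Extra cost keyed by size, so one topping can cost more on a large pizza.
    price_by_size: Dict[str, float]


@dataclass
class OptionList:
    """A group of choices attached to an item, e.g. "Toppings"."""
    name: str
    options: List[Option]
    min_choices: int = 0
    max_choices: int = 1


@dataclass
class Item:
    name: str
    base_price_by_size: Dict[str, float]
    option_lists: List[OptionList] = field(default_factory=list)


@dataclass
class Category:
    name: str
    items: List[Item] = field(default_factory=list)


def order_price(item: Item, size: str, chosen: Dict[str, List[str]]) -> float:
    """Price an order; `chosen` maps option-list name -> selected option names."""
    total = item.base_price_by_size[size]
    for option_list in item.option_lists:
        picked = chosen.get(option_list.name, [])
        if not option_list.min_choices <= len(picked) <= option_list.max_choices:
            raise ValueError(f"invalid number of choices for {option_list.name}")
        by_name = {option.name: option for option in option_list.options}
        total += sum(by_name[name].price_by_size[size] for name in picked)
    return total


# Made-up example data: one category, one item, one option list.
toppings = OptionList(
    name="Toppings",
    options=[
        Option("Extra cheese", {"small": 1.0, "large": 2.0}),
        Option("Mushrooms", {"small": 0.5, "large": 1.0}),
    ],
    max_choices=2,
)
pepperoni = Item("Pepperoni pizza", {"small": 8.0, "large": 12.0}, [toppings])
menu = [Category("Pizzas", [pepperoni])]

large_price = order_price(
    pepperoni, "large", {"Toppings": ["Extra cheese", "Mushrooms"]}
)
```

Because each option carries its own per-size price, a transcription error in a single option list affects every order that uses it, which is why these dependencies need to be captured exactly.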

Judicious Automation and Scaling

Once we had the right tooling in place to pull the raw data and label

complex data structures, we could start processing the data. Given how

many menus delivery platforms were processing at the start of

shelter-in-place, and how important it was for restaurants’ continued

operations to process this data quickly with no drop-off in quality, we

needed to make this process as efficient as possible.

At Scale AI, we believe that a mix of judicious labeling automation with

continued human oversight is the only way to provide data at the scale,

quality, and low cost needed to enable many enterprise applications of AI.

In particular, we aim to automate the lower-skilled parts of the data

labeling pipeline to focus human review on the hard parts – quality control,

edge cases, and the most complex data types. That way, we can help guarantee

quality without sacrificing efficiency.

We use automation to prefill menu taxonomies for expert human labelers to

review and confirm.

We trained our form capture tool to automatically understand document

structure and predict the next transcription.

And when our labelers needed to add inputs manually, we built a “smart

suggestion” feature that provides them with ready prompts for common items

that they have already encountered in their batch of menus.

Together, these features have improved both the efficiency and accuracy of

labeling, allowing us to transcribe more than 3,000 menus in a single

day and return menus quickly enough to onboard restaurants

within 24 hours from start to finish. By drawing on our wide network of

well-trained labelers and augmenting their work with careful automation,

we’re able to handle changes in demand dynamically – essential for providing

a smooth service during sudden peaks in demand.
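
The "smart suggestion" idea described above can be sketched as a simple prefix lookup over item names already transcribed in the current batch, ranked by how often each has appeared. This is a hedged illustration rather than the production feature: `SuggestionIndex`, its methods, and the frequency-based ranking are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List


class SuggestionIndex:
    """Suggests previously transcribed item names that match a typed prefix."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()
        # First-seen spelling for each normalized name, used for display.
        self._display: Dict[str, str] = {}

    def add(self, names: Iterable[str]) -> None:
        """Record item names seen while transcribing the batch."""
        for name in names:
            key = name.strip().lower()
            if key:
                self._counts[key] += 1
                self._display.setdefault(key, name.strip())

    def suggest(self, prefix: str, limit: int = 3) -> List[str]:
        """Return up to `limit` matches, most frequent first, ties alphabetical."""
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        matches = [key for key in self._counts if key.startswith(prefix)]
        matches.sort(key=lambda key: (-self._counts[key], key))
        return [self._display[key] for key in matches[:limit]]


index = SuggestionIndex()
index.add(["Margherita", "Marinara", "margherita", "Garlic Bread", "Margherita"])
suggestions = index.suggest("mar")  # ["Margherita", "Marinara"]
```

Ranking by frequency means the items a labeler is most likely to retype surface first, which is one plausible way such prompts could save keystrokes and reduce spelling errors.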

Custom Benchmarks and Integrations

Once data is flowing smoothly and efficiently through this labeling

pipeline, the next step is to ensure that the data is being labeled to an

incredibly high standard, with almost zero errors. We’ve developed a range

of techniques to guarantee label quality, including the randomized use of

benchmark quality tests, confidence-based consensus, and, for ambiguous

labeling requirements, our own

automated benchmark generation system.
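
The post does not spell out how confidence-based consensus works, so the following is only one plausible interpretation: each labeler's answer is weighted by their accuracy on benchmark tasks, and an answer is accepted only when it holds a large enough share of the total weight. The function name, the 0.7 threshold, and the 0.5 default weight for unknown labelers are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


def weighted_consensus(
    answers: Dict[str, str],
    labeler_accuracy: Dict[str, float],
    threshold: float = 0.7,
) -> Tuple[Optional[str], float]:
    """Return (accepted_answer, confidence); the answer is None if below threshold.

    `answers` maps labeler id -> submitted transcription.
    `labeler_accuracy` maps labeler id -> benchmark accuracy in [0, 1].
    """
    if not answers:
        return None, 0.0
    weight_by_answer: Dict[str, float] = defaultdict(float)
    for labeler, answer in answers.items():
        # Assumed default weight of 0.5 for labelers without benchmark history.
        weight_by_answer[answer] += labeler_accuracy.get(labeler, 0.5)
    total = sum(weight_by_answer.values())
    if total == 0:
        return None, 0.0
    best, best_weight = max(weight_by_answer.items(), key=lambda pair: pair[1])
    confidence = best_weight / total
    return (best if confidence >= threshold else None), confidence


accuracy = {"a": 0.9, "b": 0.9, "c": 0.6}
accepted, confidence = weighted_consensus(
    {"a": "Large 14.99", "b": "Large 14.99", "c": "Large 19.99"}, accuracy
)
```

In a scheme like this, a `None` result would signal that the task needs another review pass rather than being returned to the customer.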

Benchmarking is not always an objective science – the most effective

benchmarks are often task-specific. The high stakes of onboarding

restaurants during a pandemic required particularly high guarantees of

labeling quality. To meet them, we worked closely with our customer to build

custom benchmarks for this workflow, turning the most complex mislabeled

menus into benchmarks that assessed our labelers’ performance.

In surveying mislabeled menus, we noticed two key groups of errors. One was

associated with the cross-referencing of menu sections between many items

and mistaking how they relate to each other, while the other group tended to

be mistakes from larger menus where it may be less obvious when certain

features of the menu are missing. By identifying these key groups and

creating benchmarks out of the associated menus, we can ensure that tasker

performance in such cases exceeds the necessary standard to meet menu

quality needs.
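
As a rough sketch of how benchmark results could be checked against a standard for each error group, the snippet below computes per-tasker accuracy on two benchmark groups and flags any that fall short. The group names, the result format, and the 0.95 minimum are hypothetical values chosen for this example, not figures from the project.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (tasker id, benchmark group, whether the output matched the reference labels)
Result = Tuple[str, str, bool]


def accuracy_by_group(results: Iterable[Result]) -> Dict[str, Dict[str, float]]:
    """Return tasker id -> benchmark group -> fraction of benchmarks passed."""
    passed = defaultdict(lambda: defaultdict(int))
    seen = defaultdict(lambda: defaultdict(int))
    for tasker, group, ok in results:
        seen[tasker][group] += 1
        passed[tasker][group] += int(ok)
    return {
        tasker: {group: passed[tasker][group] / count for group, count in groups.items()}
        for tasker, groups in seen.items()
    }


def below_standard(
    scores: Dict[str, Dict[str, float]], minimum: float = 0.95
) -> List[Tuple[str, str]]:
    """List (tasker, group) pairs whose benchmark accuracy is below `minimum`."""
    return sorted(
        (tasker, group)
        for tasker, groups in scores.items()
        for group, accuracy in groups.items()
        if accuracy < minimum
    )


# Made-up results for two taskers across the two error groups described above.
results = [
    ("t1", "cross_reference", True),
    ("t1", "cross_reference", True),
    ("t1", "large_menu", True),
    ("t2", "cross_reference", True),
    ("t2", "cross_reference", False),
    ("t2", "large_menu", True),
]
scores = accuracy_by_group(results)
flagged = below_standard(scores)  # [("t2", "cross_reference")]
```

Scoring each group separately makes it visible when a tasker handles large menus well but struggles with cross-referenced sections, or the reverse.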


The menu on the left showcases large amounts of cross-referencing and item

optionality, while the menu on the right shows a fairly common amount of

menu feature density.

What’s Next

AI models are rapidly becoming critical infrastructure for a huge range of

businesses – a trend of automation that the Covid-19 pandemic has sharply

accelerated. But gathering effective, accurate, and unbiased data currently

remains such a challenging task that it is preventing real-world AI systems

from reaching their full potential and presenting high complexity barriers

for smaller companies.

We’re tackling a whole host of problems to help accelerate that progress.

Right now, teams at Scale AI are making better use of machine learning to

make labeling orders of magnitude more efficient, building tools to process

increasingly complex data types, such as 3D point clouds in computer vision,

and developing new infrastructure tools to help streamline the management of

data. Our first such management tool,

Scale Nucleus, is already helping our computer vision customers automate time-consuming

manual steps in the ML development process. We’re now working on deploying

it for our customers in natural language.

Ultimately, we want to make it as easy to deploy AI as any other type of

software. If you’re interested in joining us in solving these problems, take

a look at our careers page for our

latest open positions. If you have projects that require high-quality data

labeling, let our team know!
