
Document AI

Extract data from complex documents in seconds. At human-level accuracy. No templates.

  • blend
  • flexport
  • brex
  • sap
  • doma
  • paypal
  • square
  • openai
  • nvidia
  • general-motors

Extract any field from

packing lists
commercial invoices
bills of lading
mortgage applications
insurance cards
arrival notices

at 99%+ accuracy

Why Scale Document AI

Challenges Our Customers Faced Before Document AI

In the course of our mission to make AI infrastructure accessible, we’ve learned from a diverse set of companies. Their number one need was to use Machine Learning to automate document processing. Until now, companies:

  • Encountered OCR’s Limits

    Tried rule-based OCR solutions that require extensive effort from engineering teams to set up templates.

  • Suffered from Limited Quality

    Tried rule-based OCR solutions that require extensive effort from engineering teams to set up templates.

  • Dealt with Chronic Delays

    Realized that low quality and lengthening turnaround times caused downstream delays in delivering their product to their customers, resulting in declining customer satisfaction.

  • Spent Excessively on In-House Engineering

    Tried building in-house capabilities with human processing and engineering effort, wasting millions of dollars on hiring, training, maintaining people and solutions, and quality assurance, without reaching economies of scale.

To address these shortcomings, we built Document AI.

How It Works

Operationalize Machine Learning

Scale Document AI is built on base models that draw on Scale’s expertise in Computer Vision and Natural Language Processing. Document AI fine-tunes these Machine Learning models for your use case using annotated sample documents. The resulting models are ready to process your documents with human-level accuracy, in seconds. No templates needed.


Optional human-in-the-loop QA is available for complex use cases and is also used to improve model performance.
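
One way to picture optional human-in-the-loop QA is as a confidence gate: fields the model is sure about pass straight through, and the rest are queued for a reviewer. The sketch below is illustrative only. The field shape (`name`, `value`, `confidence`), the `routeForReview` helper, the sample values, and the 0.9 threshold are all assumptions and are not taken from the Document AI API.

```javascript
// Hypothetical sketch: field shape and threshold are assumptions,
// not part of the Document AI API.
function routeForReview(fields, threshold = 0.9) {
  const accepted = [];
  const needsReview = [];
  for (const field of fields) {
    // Low-confidence fields are set aside for a human reviewer.
    if (field.confidence >= threshold) {
      accepted.push(field);
    } else {
      needsReview.push(field);
    }
  }
  return { accepted, needsReview };
}

// Made-up sample values for illustration.
const routed = routeForReview([
  { name: 'Description', value: 'Steel pipes', confidence: 0.98 },
  { name: 'M&No', value: 'ABCD-1234', confidence: 0.62 },
]);
console.log(routed.needsReview.map((f) => f.name)); // [ 'M&No' ]
```

Corrections made by reviewers could then be fed back as training examples, which is how this kind of QA can also improve model performance.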

INDUSTRIES

Logistics, Finance, and Healthcare Depend On Us


Bills of Lading, Commercial Invoices, Packing Lists, and more

Reduce delays when clearing customs and delivering goods, minimize operational costs, and get paid on time. Document AI is template-free, fast, and extracts data from your documents at human-level accuracy.

client.createDataExtractionTask({
  callback_url: 'http://www.example.com/callback',
  instruction: 'Extract fields and link relationships.',
  params: {
    attachments: [
      {
        type: 'pdf',
        content: 'bill_of_lading.pdf',
      },
    ],
    // Fields to extract (list abbreviated).
    labels: ['M&No', 'Description' /* ... */],
    boundingboxes: true,
  },
});

Invoices, Mortgage Applications, Tax Documents, IDs, and more

Move to real-time when servicing your customers without sacrificing accuracy. With Document AI, underwriters become more efficient and accounting can issue expenses faster. Optional human-in-the-loop QA is available.

client.createDataExtractionTask({
  callback_url: 'http://www.example.com/callback',
  instruction: 'Instantly classify and extract fields in the invoice.',
  params: {
    attachments: [
      {
        type: 'jpg',
        content: 'invoice.jpg',
      },
    ],
    // Fields to extract (list abbreviated).
    labels: ['LineItem' /* ... */],
    boundingboxes: true,
  },
});

Claims Forms, Medical Records, IDs, Insurance Cards, and more

Move to real-time when servicing your customers without sacrificing accuracy. With Document AI, healthcare organizations make critical decisions faster. Our solution processes complex medical documents with high quality and near-instant turnaround times.

client.createDataExtractionTask({
  callback_url: 'http://www.example.com/callback',
  instruction: 'Identify patient and billing information.',
  params: {
    attachments: [
      {
        type: 'pdf',
        content: 'claims_document.pdf',
      },
    ],
    // Fields to extract (list abbreviated).
    labels: ['ProviderName', 'Diagnosis Code' /* ... */],
    boundingboxes: true,
  },
});
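
The three requests above share one shape: a callback URL, an instruction, a single attachment, and a list of labels. A small helper can build that payload and catch an empty label list before the request is sent. This is a minimal sketch that only mirrors the field names shown in the examples. `buildExtractionTask` is a hypothetical helper, not part of any SDK, and the empty-labels check is an assumption about what a sensible request needs.

```javascript
// Hypothetical helper: mirrors the payload shape used in the examples above.
// It is not part of any SDK; field names are copied from those examples.
function buildExtractionTask({ callbackUrl, instruction, type, content, labels }) {
  if (!Array.isArray(labels) || labels.length === 0) {
    throw new Error('labels must be a non-empty array of field names');
  }
  return {
    callback_url: callbackUrl,
    instruction,
    params: {
      attachments: [{ type, content }],
      labels,
      boundingboxes: true,
    },
  };
}

const task = buildExtractionTask({
  callbackUrl: 'http://www.example.com/callback',
  instruction: 'Extract fields and link relationships.',
  type: 'pdf',
  content: 'bill_of_lading.pdf',
  labels: ['M&No', 'Description'],
});
console.log(JSON.stringify(task, null, 2));
```

Assuming a `client` configured as in the examples, the result could be passed along as `client.createDataExtractionTask(task)`.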

QUALITY ASSURANCE

Accuracy and Fast Turnaround are Guaranteed

Industry-leading quality engine to reliably tackle ever-changing unstructured data

Human-Level Accuracy

Our use of Computer Vision and Natural Language Processing models, with fine-tuning, enables much higher quality data extraction than either hard-coded templates or human annotation. We optionally provide human-in-the-loop QA when needed.

ML Means Continual Improvement

Our models are trained on millions of data points and further refined for each customer use case. As a result, they achieve much higher quality, generalize across challenging document types, and keep improving as we process more data.

Transparency In Metrics

To increase your operational efficiency, you get access to our metrics dashboard to review your pipeline performance, visualization tools to audit your data easily, and our feedback platform to provide instructions.
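
When auditing results, one simple measure is field-level exact-match accuracy: the share of expected fields whose extracted value matches a reviewed reference. The sketch below shows that calculation under stated assumptions. This page does not say how Document AI computes its accuracy figures, and `fieldAccuracy`, the flat object shape, and the sample values are all hypothetical.

```javascript
// Hypothetical audit helper: exact-match, field-level accuracy.
// This is an assumed metric, not the one used by the metrics dashboard.
function fieldAccuracy(extracted, expected) {
  const names = Object.keys(expected);
  if (names.length === 0) {
    throw new RangeError('expected must contain at least one field');
  }
  let correct = 0;
  for (const name of names) {
    // A missing field in `extracted` counts as incorrect.
    if (extracted[name] === expected[name]) {
      correct += 1;
    }
  }
  return correct / names.length;
}

// Made-up sample values for illustration.
const accuracy = fieldAccuracy(
  { ProviderName: 'Acme Clinic', 'Diagnosis Code': 'J10.1' },
  { ProviderName: 'Acme Clinic', 'Diagnosis Code': 'J11.1' }
);
console.log(accuracy); // 0.5
```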


See It In Action

Get to Know Document AI

OFFERINGS

Explore Options

Enterprise

Custom fine-tuned models to fit your specific needs.

Supported Document Types

50+ document types supported, plus support for new document types.

Supported Languages

15+ languages supported.

Taxonomy

Define and customize the fields you need extracted from your documents.

Quality

Up to 99%+. Includes custom quality SLAs in the contract.

Latency

Less than 5 seconds.

Human-in-the-loop QA

Optional human-in-the-loop QA is available.

Pricing

Custom pricing.

Enterprise Document AI requires an annual contract. Talk to our team and schedule a demo.

Self-Serve

Self-serve, models-only document processing.

Supported Document Types

Commercial Invoices, Bills of Lading, Airway Bills, and Accounts Payable Invoices.

Supported Languages

English

Taxonomy

Pre-defined taxonomies are listed at https://scale.com/docs/supported-document-types.

Quality

Tooling to audit results is provided.

Latency

Less than 5 seconds.

Human-in-the-loop QA

Models only.

Pricing

First 100 pages free.
Starting at $0.20/page. Learn more: https://scale.com/docs/pricing
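
As a rough worked example of the self-serve pricing, the sketch below estimates a bill from a page count. It assumes the $0.20 starting rate applies as a flat rate to every page beyond the first 100, which the "starting at" wording does not guarantee. The function name and constants are illustrative, and amounts are kept in cents to avoid floating-point rounding.

```javascript
// Assumed flat-rate model: first 100 pages free, then $0.20 per page.
// Actual pricing may differ; see the pricing page.
const FREE_PAGES = 100;
const PRICE_PER_PAGE_CENTS = 20;

function estimateSelfServeCostCents(pages) {
  if (!Number.isInteger(pages) || pages < 0) {
    throw new RangeError('pages must be a non-negative integer');
  }
  // Only pages beyond the free allowance are billable.
  const billablePages = Math.max(0, pages - FREE_PAGES);
  return billablePages * PRICE_PER_PAGE_CENTS;
}

// 1,000 pages: 900 billable pages at $0.20 each.
console.log((estimateSelfServeCostCents(1000) / 100).toFixed(2)); // 180.00
```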

To try Document AI Go, sign up for the waitlist.

CUSTOMERS

Trusted by World Class Companies

“Scale’s machine learning-based Document AI is very different from traditional OCR models, or template-based learning. No templates, high quality, and low latency every time. We rely on Scale for document processing, because with higher extraction accuracy, almost zero human labor is required afterward to correct it. With lower latency, we can enable products like air freight where document data has to arrive much faster since air shipments take less than two days.”

James Chen

Chief Technology Officer, Flexport

“The combination of the Blend platform with Scale’s Document AI ensures the swift, accurate extraction and validation of data from documents, enabling bankers to make data-driven decisions with confidence.”

Jeff Braddock

Manager of Product Partnerships, Blend

“Unlike OCR that basically just extracts information and then leaves to our engineers all the work of understanding the context, Scale Document actually figures out the context for us, and that requires minimal work on our side to actually build and integrate the whole pipeline.”

Henrique Dubugras

Founder and Co-CEO, Brex

“OpenAI threw a bunch of tasks at Scale AI with difficult characteristics, including tight latency requirements and significant ambiguity in correct answers. In response, Scale worked closely with us to adjust their QA systems to our needs.”

Geoffrey Irving

Member of Technical Staff, OpenAI

“Scale has provided the fuel to put our machine learning systems on overdrive. They make sure the highest quality training data is there in time to meet our aggressive roadmap. Lenders and borrowers will experience faster and more efficient closings sooner as a result.”

Andy Mahdavi

Chief Data Science Officer, Doma

With Scale Document AI, document processing is a breeze.