Create Text Collection Task

This endpoint creates a **textcollection** task. In this task, Scale collects information from the given attachments and/or from the web, following the instructions that you provide. Example use cases include labeling structurally complex data from an attached image or querying sentiment information given a set of links.

The task takes an **attachments** array detailing the attachments to be annotated, and a **fields** parameter describing each piece of information to be captured. The **fields** parameter is an array in which each object has a **field_id**, **type**, and **title**. The **field_id** is the key the annotation will be returned under, and it must be unique within the project. The bulk of the task is defined within this array; building it is similar to building an HTML form.

You can optionally provide additional markdown-enabled or Google Doc-based instructions via the **instruction** parameter. If the request succeeds, Scale immediately returns the generated task object, at which point you can store the **task_id** as a permanent reference to the task. The **attachments** and **fields** parameters are stored in the **params** object of the constructed **task** object.
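The rules stated above for the **fields** array can be checked locally before a request is sent. The following is a minimal sketch, not part of the Scale API: `validateFields` is a hypothetical helper, and it can only detect a **field_id** repeated within one array, not across a whole project.

```javascript
// Hypothetical helper (not part of the Scale API): checks a `fields` array
// against the rules described above before it is sent.
function validateFields(fields) {
  if (!Array.isArray(fields)) {
    return ['fields must be an array'];
  }
  const errors = [];
  const seen = new Set();
  fields.forEach((field, i) => {
    if (field === null || typeof field !== 'object') {
      errors.push(`fields[${i}] must be an object`);
      return;
    }
    // Every field object needs a field_id, a type, and a title.
    for (const key of ['field_id', 'type', 'title']) {
      if (typeof field[key] !== 'string' || field[key].length === 0) {
        errors.push(`fields[${i}] is missing "${key}"`);
      }
    }
    // field_id is the key the annotation is returned under, so it must not repeat.
    if (typeof field.field_id === 'string') {
      if (seen.has(field.field_id)) {
        errors.push(`fields[${i}] repeats field_id "${field.field_id}"`);
      }
      seen.add(field.field_id);
    }
  });
  return errors;
}

const fields = [
  {
    type: 'category',
    field_id: 'category_field',
    title: 'Please select the most relevant option',
    choices: [{label: 'Correct Website', value: 'correct_website'}]
  }
];
console.log(validateFields(fields).length); // prints 0
```

An empty result means only that these local checks passed; the API may apply further validation that this sketch does not cover.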
Body Params
object
name
string
required

const request = require('request');

const options = {
  method: 'POST',
  url: 'https://api.scale.com/v1/task/textcollection',
  headers: {accept: 'application/json', 'content-type': 'application/json'},
  body: {
    project: 'string',
    batch: 'string',
    instruction: '**Instructions:** Please annotate all the things',
    callback_url: 'string',
    fields: [
      {
        type: 'category',
        field_id: 'category_field',
        title: 'Please select the most relevant option',
        description: 'string',
        choices: [{label: 'Correct Website', value: 'correct_website'}]
      }
    ],
    attachments: [{type: 'website', content: 'https://www.scale.com/'}],
    title: 'string',
    description: 'string',
    responses_required: 1,
    priority: 30,
    unique_id: 'string',
    clear_unique_id_on_error: true,
    tags: ['string']
  },
  json: true
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);

  console.log(body);
});

{
  "task_id": "5c3f9c25744e7d005052e319",
  "created_at": "2019-01-16T21:03:33.166Z",
  "callback_url": "http://www.example.com/callback",
  "type": "textcollection",
  "status": "completed",
  "instruction": "Find the URL for the hiring page for the company with attached website.",
  "urgency": "day",
  "attachments": [
    {
      "type": "website",
      "content": "https://www.scale.com/"
    }
  ],
  "fields": [
    {
      "type": "category",
      "field_id": "category_field",
      "title": "Please select the most relevant option",
      "description": "string",
      "choices": [
        {
          "label": "Correct Website",
          "value": "correct_website"
        }
      ]
    }
  ],
  "metadata": {}
}

Create Named Entity Recognition Task

This endpoint creates a new **namedentityrecognition** task. In order to complete this task, our labelers will read the provided text and highlight any text entity mentions that correspond to the specified labels. Unlike most tasks, these tasks do not require an **attachment** or **attachments** field containing a link to the attachment to be annotated. Instead, the text to be annotated is provided directly within the **text** parameter of the request body itself.
Body Params
object
project
string
The name of the project to associate this task with. See the Projects Section for more details.
batch
string
The name of the batch to associate this task with. Note that if a batch is specified, you need not specify the project, as the task will automatically be associated with the batch’s project. For Scale Rapid projects, specifying a batch is required. See the Batches section for more details.
instruction
string
required
A markdown-enabled string or iframe embedded Google Doc explaining how to do the task. You can use markdown to show example images, give structure to your instructions, and more. See our instruction best practices for more details. For Scale Rapid projects, DO NOT set this field unless you specifically want to override the project level instructions.
callback_url
string
The full URL (including the scheme **http://** or **https://**) or email address of the callback that will be used when the task is completed.
text
string
required
The text from which to extract named entities.
attachments
array of objects
An array of TextCollectionAttachment objects to be labeled.
labels
array of objects
required
An array of NamedEntityRecognitionLabel objects containing descriptions for the text span types to label.
relationships
array of objects
An array of NamedEntityRecognitionRelationshipDefinition objects containing descriptions for the relationships between text spans to annotate.
merge_newlines
boolean
If true, removes the ‘\n’ characters from the input and does not display line breaks in the task interface.
allow_overlapping_annotations
boolean
If true, allows annotations to overlap. Otherwise, all annotations must cover disjoint text spans.
unique_id
string
An arbitrary ID that you can assign to a task and then query for later. This ID must be unique across all projects under your account; otherwise the task submission will be rejected. See Avoiding Duplicate Tasks for more details.
clear_unique_id_on_error
boolean
If set to true, the unique ID on the task will be unset if the task errors out after being submitted. This parameter allows workflows where you re-submit the same unique ID to recover from errors automatically.
tags
array of strings
Arbitrary labels that you can assign to a task. At most 5 tags are allowed per task. You can query tasks with specific tags through the task retrieval API.
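Several of the constraints in the parameter list above can be checked before submitting a request. The sketch below is an illustration under stated assumptions: `validateNerPayload` is a hypothetical helper rather than part of the Scale SDK, and the email pattern is deliberately loose. It covers only the required parameters, the five-tag limit, and the **callback_url** format.

```javascript
// Hypothetical pre-flight check for a namedentityrecognition request body.
// It mirrors only the constraints listed in the parameter descriptions above.
function validateNerPayload(payload) {
  const errors = [];

  // instruction and text are marked as required strings.
  for (const key of ['instruction', 'text']) {
    if (typeof payload[key] !== 'string' || payload[key].length === 0) {
      errors.push(`"${key}" is required and must be a non-empty string`);
    }
  }

  // labels is marked as required.
  if (!Array.isArray(payload.labels)) {
    errors.push('"labels" is required and must be an array');
  }

  // At most 5 tags are allowed per task.
  if (payload.tags !== undefined) {
    if (!Array.isArray(payload.tags) || payload.tags.length > 5) {
      errors.push('"tags" must be an array of at most 5 entries');
    }
  }

  // callback_url must be a full http(s) URL or an email address.
  if (payload.callback_url !== undefined) {
    const value = String(payload.callback_url);
    const isUrl = /^https?:\/\/\S+$/.test(value);
    const isEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value); // loose check, an assumption
    if (!isUrl && !isEmail) {
      errors.push('"callback_url" must include http:// or https://, or be an email address');
    }
  }

  return errors;
}

const problems = validateNerPayload({
  instruction: 'Highlight any **text entity mentions** that correspond to the specified labels',
  text: 'Melt butter in a heavy skillet over medium heat.',
  labels: [{name: 'T_INGR', display_name: 'Ingredients'}],
  callback_url: 'http://www.example.com/callback',
  tags: ['tag1']
});
console.log(problems.length); // prints 0
```
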
const sdk = require('api')('@scale-ai/v1.3#20c5u82flglvwk99');

sdk.namedEntityRecognition({
  instruction: 'Highlight any **text entity mentions** that correspond to the specified labels',
  text: 'Melt butter in a heavy skillet over medium heat. Add onion; cook and stir until onion starts to brown, about 5 minutes. Season with salt and pepper.',
  attachments: [
    {
      type: 'text',
      content: '**Please review this context**: I want to buy 1oz hand sanitizer please.'
    }
  ],
  merge_newlines: false,
  allow_overlapping_annotations: false,
  labels: [
    {
      name: 'T_INGR',
      display_name: 'Ingredients',
      children: [{name: 'T_Butter', display_name: 'Butter'}]
    }
  ],
  project: 'projectName',
  batch: 'batchName',
  callback_url: 'http://www.example.com/callback',
  relationships: [
    {
      name: 'R_ADD',
      display_name: 'is added to',
      is_directed: true,
      source_label: 'T_INGR',
      target_label: 'T_INGR'
    }
  ],
  tags: ['tag1']
})
  .then(({ data }) => console.log(data))
  .catch(err => console.error(err));

{
  "task_id": "5c3f9c25744e7d005052e319",
  "created_at": "2019-01-16T21:03:33.166Z",
  "callback_url": "http://www.example.com/callback",
  "instruction": "Label all ingredients and cookware. Please include relevant adjectives, for example **red apple** instead of just **apple**.",
  "type": "namedentityrecognition",
  "status": "pending",
  "params": {
    "text": "Melt butter in a heavy skillet over medium heat. Add onion; cook and stir until onion starts to brown, about 5 minutes. Season with salt and pepper.",
    "labels": [
      {
        "name": "T_INGR",
        "display_name": "Ingredients",
        "children": [
          {
            "name": "T_Butter",
            "display_name": "Butter"
          }
        ]
      }
    ]
  },
  "is_test": false,
  "attachments": [
    {
      "type": "text",
      "content": "**Please review this context**: I want to buy 1oz hand sanitizer please."
    }
  ],
  "metadata": {}
}
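In the request above, labels can nest under **children**, and each relationship refers to labels by name through **source_label** and **target_label**. The sketch below shows one way to check a request for relationship endpoints that do not match any defined label. Both functions are hypothetical helpers, and whether the API itself requires every endpoint to match a defined label name (including nested ones) is an assumption.

```javascript
// Collects every label name, including names nested under `children`.
function collectLabelNames(labels, names = new Set()) {
  for (const label of labels) {
    names.add(label.name);
    if (Array.isArray(label.children)) {
      collectLabelNames(label.children, names);
    }
  }
  return names;
}

// Returns the relationships whose source_label or target_label is not a
// defined label name. Treating this as an error is an assumption.
function findDanglingRelationships(labels, relationships) {
  const names = collectLabelNames(labels);
  return relationships.filter(
    (rel) => !names.has(rel.source_label) || !names.has(rel.target_label)
  );
}

const labels = [
  {
    name: 'T_INGR',
    display_name: 'Ingredients',
    children: [{name: 'T_Butter', display_name: 'Butter'}]
  }
];
const relationships = [
  {
    name: 'R_ADD',
    display_name: 'is added to',
    is_directed: true,
    source_label: 'T_INGR',
    target_label: 'T_INGR'
  }
];

console.log([...collectLabelNames(labels)].join(', ')); // prints T_INGR, T_Butter
console.log(findDanglingRelationships(labels, relationships).length); // prints 0
```
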