Create Text Collection Task

This endpoint creates a textcollection task. In this task, Scale collects information from the given attachments and/or from the web, following the instructions that you provide. Example use cases include labeling structurally complex data from an attached image or querying sentiment information given a set of links.

This task involves an attachments array detailing the attachments to be annotated, and a fields parameter that describes all of the different pieces of information to be captured.

The fields parameter is an array in which each object has a field_id, type, and title. The field_id is the key the annotation will be returned under, and it must be unique within the project. The bulk of the task is defined within this array. It may be helpful to think of building this parameter as similar to building an HTML form.
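
To make the HTML-form analogy concrete, here is a minimal Python sketch that assembles a fields array and checks that every entry carries the three keys named above and that no field_id repeats. The helper names make_field and check_fields are illustrative assumptions, not part of the Scale SDK, and the check only looks at one array locally; the API remains the authority on what is valid.

```python
def make_field(field_id, field_type, title, **extra):
    """Build one entry of the fields array (hypothetical helper)."""
    field = {"field_id": field_id, "type": field_type, "title": title}
    field.update(extra)  # optional keys such as description or choices
    return field


def check_fields(fields):
    """Raise ValueError if an entry lacks a required key or a field_id repeats."""
    seen = set()
    for field in fields:
        missing = {"field_id", "type", "title"} - field.keys()
        if missing:
            raise ValueError(f"field is missing keys: {sorted(missing)}")
        if field["field_id"] in seen:
            raise ValueError(f"duplicate field_id: {field['field_id']}")
        seen.add(field["field_id"])
    return fields


# One form "input": a category question with a single choice,
# mirroring the request example on this page.
fields = check_fields([
    make_field(
        "category_field",
        "category",
        "Please select the most relevant option",
        choices=[{"label": "Correct Website", "value": "correct_website"}],
    ),
])
```

Each call to make_field plays the role of one form input, and the resulting list can be passed as the fields value of the request body.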

You can optionally provide additional markdown-enabled or Google Doc-based instructions via the instruction parameter.

If successful, Scale will immediately return the generated task object, at which point you can store the task_id to have a permanent reference to the task.
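
As a small illustration of keeping that reference, the Python sketch below parses a response body and records the task_id under an identifier of your own. The response is a trimmed copy of the sample on this page; the in-memory dict task_index and the key "order-1234" are placeholders for whatever persistent store and record key you actually use.

```python
import json

# Trimmed copy of the sample response shown on this page.
response_body = """
{
  "task_id": "5c3f9c25744e7d005052e319",
  "type": "textcollection",
  "callback_url": "http://www.example.com/callback"
}
"""

task = json.loads(response_body)

# Placeholder store: maps your own record key to the Scale task_id.
task_index = {}
task_index["order-1234"] = task["task_id"]
```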

The attachments and fields parameters will be stored in the params object of the constructed task object.
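
A minimal Python sketch of reading those two values back from a returned task follows. It looks in params first, as described above, and falls back to the top level of the task object, because the sample response on this page shows attachments and fields there. The helper name task_param is an illustrative assumption, not an SDK function.

```python
def task_param(task, key, default=None):
    """Return task['params'][key] if present, else task[key], else default."""
    params = task.get("params") or {}
    if key in params:
        return params[key]
    return task.get(key, default)


# Two shapes of task object: values nested under params, and values at the top level.
nested = {"task_id": "abc", "params": {"fields": [{"field_id": "category_field"}]}}
flat = {
    "task_id": "abc",
    "attachments": [{"type": "website", "content": "https://www.scale.com/"}],
}

nested_fields = task_param(nested, "fields", [])
flat_attachments = task_param(flat, "attachments", [])
```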

Body Params

name string required

___

Request

POST /v1/task/textcollection
const request = require('request');

const options = {
  method: 'POST',
  url: 'https://api.scale.com/v1/task/textcollection',
  headers: {accept: 'application/json', 'content-type': 'application/json'},
  body: {
    project: 'string',
    batch: 'string',
    instruction: '**Instructions:** Please annotate all the things',
    callback_url: 'string',
    fields: [
      {
        type: 'category',
        field_id: 'category_field',
        title: 'Please select the most relevant option',
        description: 'string',
        choices: [{label: 'Correct Website', value: 'correct_website'}]
      }
    ],
    attachments: [{type: 'website', content: 'https://www.scale.com/'}],
    title: 'string',
    description: 'string',
    responses_required: 1,
    priority: 30,
    unique_id: 'string',
    clear_unique_id_on_error: true,
    tags: ['string']
  },
  json: true
};

request(options, function (error, response, body) {
  if (error) throw new Error(error);

  console.log(body);
});
POST /v1/task/textcollection
import scaleapi
from scaleapi.tasks import TaskType
from scaleapi.exceptions import ScaleDuplicateResource

# Replace the placeholder with your own API key.
client = scaleapi.ScaleClient('YOUR_SCALE_API_KEY')

# dict() takes keyword arguments, so keys are written as name=value.
payload = dict(
    instruction='**Instructions:** Find the URL for the hiring page for the company with attached website.',
    attachments=[{'type': 'website', 'content': 'https://www.scale.com/'}],
    responses_required=1,
    metadata={'newKey': 'New Value'},
    priority=30,
    project='projectName',
    batch='batchName',
    callback_url='http://www.example.com/callback',
    fields=[
        {
            'type': 'category',
            'field_id': 'category_field',
            'title': 'Please select the most relevant option',
            'description': 'field description',
            'choices': [{'label': 'Correct Website', 'value': 'correct_website'}]
        }
    ],
    title='taskTitle',
    description='taskDescription',
    tags=['tag1']
)

try:
    client.create_task(TaskType.TextCollection, **payload)
except ScaleDuplicateResource as err:
    print(err.message)  # If unique_id is already used for a different task

Response

{
  "task_id": "5c3f9c25744e7d005052e319",
  "created_at": "2019-01-16T21:03:33.166Z",
  "callback_url": "http://www.example.com/callback",
  "type": "textcollection",
  "status": "completed",
  "instruction": "Find the URL for the hiring page for the company with attached website.",
  "urgency": "day",
  "attachments": [
    {
      "type": "website",
      "content": "https://www.scale.com/"
    }
  ],
  "fields": [
    {
      "type": "category",
      "field_id": "category_field",
      "title": "Please select the most relevant option",
      "description": "string",
      "choices": [
        {
          "label": "Correct Website",
          "value": "correct_website"
        }
      ]
    }
  ],
  "metadata": {}
}

Create Named Entity Recognition Task

This endpoint creates a new namedentityrecognition task. To complete this task, our labelers will read the provided text and highlight any entity mentions that correspond to the specified labels.

Unlike most tasks, these tasks do not require an attachment or attachments field containing a link to the attachment to be annotated. Instead, the text to be annotated is provided directly in the text parameter of the request body.

Body Params

project string

The name of the project to associate this task with. See the Projects Section for more details.


batch string

The name of the batch to associate this task with. Note that if a batch is specified, you need not specify the project, as the task will automatically be associated with the batch's project. For Scale Rapid projects, specifying a batch is required. See the Batches section for more details.


instruction string required

A markdown-enabled string or iframe-embedded Google Doc explaining how to do the task. You can use markdown to show example images, give structure to your instructions, and more. See our instruction best practices for more details. For Scale Rapid projects, DO NOT set this field unless you specifically want to override the project-level instructions.


callback_url string

The full URL (including the scheme http:// or https://) or email address of the callback that will be used when the task is completed.


text string required

The text from which to extract named entities.


attachments array of objects

An array of TextCollectionAttachment objects to be labeled.


labels array of objects required

An array of NamedEntityRecognitionLabel objects containing descriptions for the text span types to label.


relationships array of objects

An array of NamedEntityRecognitionRelationshipDefinition objects containing descriptions for the relationships between text spans to annotate.


merge_newlines boolean

If true, removes the '\n' characters from the input and does not display line breaks in the task interface.


allow_overlapping_annotations boolean

If true, allows annotations to overlap. Otherwise, all annotations must cover disjoint text spans.


unique_id string

An arbitrary ID that you can assign to a task and then query for later. This ID must be unique across all projects under your account; otherwise, the task submission will be rejected. See Avoiding Duplicate Tasks for more details.


clear_unique_id_on_error boolean

If true, the unique ID on a task is unset when the task errors out after being submitted. This parameter allows workflows in which you re-submit the same unique ID to recover from errors automatically.


tags array of strings

Arbitrary labels that you can assign to a task. At most 5 tags are allowed per task. You can query tasks with specific tags through the task retrieval API.
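
Some of the constraints listed above can be checked on the client before a request is sent. The Python sketch below covers only what this section states: the three parameters marked required, the callback_url format, and the five-tag limit. The function names, the error messages, and the deliberately loose email pattern are illustrative assumptions; the API's own validation may be stricter or differ, and Scale Rapid projects, which are told above not to set instruction, would need that check relaxed.

```python
import re
from urllib.parse import urlparse

REQUIRED = ("instruction", "text", "labels")  # parameters marked "required" above
MAX_TAGS = 5


def is_valid_callback(value):
    """True for an http(s) URL with a host, or a string shaped like an email address."""
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return True
    # Loose email shape check; an assumption, not the API's rule.
    return re.fullmatch(r"[^@\s]+@[^@\s]+", value) is not None


def payload_problems(payload):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = [
        f"missing required parameter: {name}"
        for name in REQUIRED
        if not payload.get(name)
    ]
    callback = payload.get("callback_url")
    if callback is not None and not is_valid_callback(callback):
        problems.append("callback_url must be an http(s) URL or an email address")
    if len(payload.get("tags", [])) > MAX_TAGS:
        problems.append(f"at most {MAX_TAGS} tags are allowed per task")
    return problems
```

Calling payload_problems(payload) before submitting and refusing to send when the list is non-empty keeps obviously malformed requests from reaching the endpoint.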


Request

POST /v1/task/namedentityrecognition
const sdk = require('api')('@scale-ai/v1.3#20c5u82flglvwk99');

sdk.namedEntityRecognition({
  instruction: 'Highlight any **text entity mentions** that correspond to the specified labels',
  text: 'Melt butter in a heavy skillet over medium heat. Add onion; cook and stir until onion starts to brown, about 5 minutes. Season with salt and pepper.',
  attachments: [
    {
      type: 'text',
      content: '**Please review this context**: I want to buy 1oz hand sanitizer please.'
    }
  ],
  merge_newlines: false,
  allow_overlapping_annotations: false,
  labels: [
    {
      name: 'T_INGR',
      display_name: 'Ingredients',
      children: [{name: 'T_Butter', display_name: 'Butter'}]
    }
  ],
  project: 'projectName',
  batch: 'batchName',
  callback_url: 'http://www.example.com/callback',
  relationships: [
    {
      name: 'R_ADD',
      display_name: 'is added to',
      is_directed: true,
      source_label: 'T_INGR',
      target_label: 'T_INGR'
    }
  ],
  tags: ['tag1']
})
  .then(({ data }) => console.log(data))
  .catch(err => console.error(err));
POST /v1/task/namedentityrecognition
import scaleapi
from scaleapi.tasks import TaskType
from scaleapi.exceptions import ScaleDuplicateResource

# Replace the placeholder with your own API key.
client = scaleapi.ScaleClient('YOUR_SCALE_API_KEY')

# dict() takes keyword arguments; Python booleans are True/False.
payload = dict(
    instruction='Highlight any **text entity mentions** that correspond to the specified labels',
    text='Melt butter in a heavy skillet over medium heat. Add onion; cook and stir until onion starts to brown, about 5 minutes. Season with salt and pepper.',
    attachments=[
        {
            'type': 'text',
            'content': '**Please review this context**: I want to buy 1oz hand sanitizer please.'
        }
    ],
    merge_newlines=False,
    allow_overlapping_annotations=False,
    labels=[
        {
            'name': 'T_INGR',
            'display_name': 'Ingredients',
            'children': [{'name': 'T_Butter', 'display_name': 'Butter'}]
        }
    ],
    project='projectName',
    batch='batchName',
    callback_url='http://www.example.com/callback',
    relationships=[
        {
            'name': 'R_ADD',
            'display_name': 'is added to',
            'is_directed': True,
            'source_label': 'T_INGR',
            'target_label': 'T_INGR'
        }
    ],
    tags=['tag1']
)

try:
    client.create_task(TaskType.NamedEntityRecognition, **payload)
except ScaleDuplicateResource as err:
    print(err.message)  # If unique_id is already used for a different task

Response

{
  "task_id": "5c3f9c25744e7d005052e319",
  "created_at": "2019-01-16T21:03:33.166Z",
  "callback_url": "http://www.example.com/callback",
  "instruction": "Label all ingredients and cookware. Please include relevant adjectives, for example **red apple** instead of just **apple**.",
  "type": "namedentityrecognition",
  "status": "pending",
  "params": {
    "text": "Melt butter in a heavy skillet over medium heat. Add onion; cook and stir until onion starts to brown, about 5 minutes. Season with salt and pepper.",
    "labels": [
      {
        "name": "T_INGR",
        "display_name": "Ingredients",
        "children": [
          {
            "name": "T_Butter",
            "display_name": "Butter"
          }
        ]
      }
    ]
  },
  "is_test": false,
  "attachments": [
    {
      "type": "text",
      "content": "**Please review this context**: I want to buy 1oz hand sanitizer please."
    }
  ],
  "metadata": {}
}
Updated 6 months ago