Project Overview

Projects are a way of organizing similar tasks, so that one can share parameters among tasks, or control which workers are allowed to work on certain classes of tasks. The parameters associated with a project will be inherited by tasks created under that project. See Update Parameters (Project) for more information. A common use-case is to define and update instructions at the project level and let tasks inherit from the current version of the project instructions. To use a project for a given task, add a project parameter when creating the task that references the project name; e.g., to associate a task with project my_project, add project: my_project to the task creation request. Note that projects created using your test API key can only be used for creating test tasks; likewise with the live API key and live tasks.
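
As a minimal sketch of the request body described above, the following Python builds a task payload that references a project by name. The helper build_task_payload and the attachment URL are hypothetical and exist only for illustration; only the project field comes from the text above.

```python
import json


def build_task_payload(project, attachment, **extra_params):
    """Return a task creation body that references a project by name.

    Hypothetical helper: only the "project" key is taken from the docs;
    "attachment" and any extra parameters stand in for per-task fields.
    """
    payload = {"project": project, "attachment": attachment}
    payload.update(extra_params)
    return payload


# Associate a task with the project named my_project.
payload = build_task_payload("my_project", "https://example.com/image.jpg")
print(json.dumps(payload, indent=2))
```

The payload would then be sent to the task creation endpoint for the relevant task type, using either the test or the live API key to match the project.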

Using Project-level parameters

Scale uses Projects to organize Tasks, the unit of work. Each type of Task (Image Annotation, Text Collection, Lidar Annotation) has many customizable input parameters that can be used when creating tasks to guide exactly how the tasks are formed and what the response will look like. Task definitions can be stored at the Task level, at the Project level, or split between the two in a hybrid approach.

Task vs. Project Level Parameters

Task Only

All Task parameters (definitions) are set when making the API call to create a Task. The Project has no parameters and is used only to collect and organize similar sets of tasks. Following the Task-specific endpoints and parameters documented in our docs results in a "Task Only" approach.

Project Only

In a Project only approach, all task definitions are stored in a versioned set of project parameters. When creating a task, you only need to specify the attachment and other per-task fields, and the task will inherit all the parameters and task definitions from the project version.

Hybrid Approach

In the Hybrid approach, some task-level parameters are specified on the Project level, while others are defined on a per-task basis. This could be useful to take advantage of the project versioning while allowing flexibility for certain parameters that could change on a per-task basis.

Project-level Parameter Features

Versioning
Scale maintains a queryable list of all past project versions, so you can quickly see what was in a given past version. All Tasks have a project_param_version field indicating which version of the project parameters was used in that task. When creating tasks, you can specify which version of project parameters should be used.

Separation of Concerns
As your pipeline gets more sophisticated, we recommend separating taxonomy management from the task submission pipeline. Task submission should be fairly "dumb": it does not need to know the details of how a task should be labeled, only that it needs to add a new attachment to a project for labeling and that the project will dictate how the task is labeled.

The project params can then be updated independently of the task submission process.

Default Values
Project-level Parameters are automatically passed down and combined with the task parameters when creating tasks within that project. In this way, you can set default values that you want to ensure are specified regardless of how tasks are submitted to the project. A common use-case might be the instructions - but you can specify any default you'd like.

How do I start using Project-level Parameters?

At the Project Level:
We will be leveraging the Update Parameters endpoint to set and create project parameter versions.

When Retrieving or Listing projects, information about the project parameters will be available in the paramHistory field.

At the Task Level:

Tasks will automatically use the latest version of project parameters. The parameters specified at the project level are automatically merged with the task-level parameters. Parameters specified at the task level override any default parameters that would be inherited from the project, with the exception of the instruction field, where the task-level and project-level instructions are concatenated.
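
The merge rule above can be sketched in a few lines of Python. This is an illustration of the described behavior, not Scale's implementation: the function name merge_params is hypothetical, and the concatenation order (project first, then task) and the blank-line separator are assumptions the text does not specify.

```python
def merge_params(project_params, task_params, separator="\n\n"):
    """Combine project-level defaults with task-level parameters.

    Task-level values win, except for "instruction", which is concatenated.
    Assumption: project instruction comes first, joined by a blank line.
    """
    merged = dict(project_params)
    # Every task-level parameter except the instruction overrides the default.
    merged.update({k: v for k, v in task_params.items() if k != "instruction"})
    # The instruction field is concatenated rather than overridden.
    parts = [project_params.get("instruction"), task_params.get("instruction")]
    instruction = separator.join(p for p in parts if p)
    if instruction:
        merged["instruction"] = instruction
    return merged


project_defaults = {
    "attachment_type": "image",
    "instruction": "Draw boxes around the vehicles in the image.",
}
task_level = {
    "attachment_type": "text",
    "attachment": "car",
    "instruction": "Would you say this item is big or small?",
}
merged = merge_params(project_defaults, task_level)
print(merged["attachment_type"])
print(merged["instruction"])
```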

You can use a previous version of the project parameters when creating tasks by specifying a project_param_version.
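
For illustration, a task body that pins a specific parameter version might be assembled as below. The helper name task_payload, the attachment URL, and the version number 2 are hypothetical; the text above only establishes that project_param_version can be specified at task creation.

```python
def task_payload(project, attachment, project_param_version=None):
    """Build a task body, optionally pinning a project parameter version.

    Hypothetical helper. Leaving the version out relies on the documented
    default of using the latest project parameters.
    """
    payload = {"project": project, "attachment": attachment}
    if project_param_version is not None:
        payload["project_param_version"] = project_param_version
    return payload


latest = task_payload("my_project", "https://example.com/image.jpg")
pinned = task_payload("my_project", "https://example.com/image.jpg", 2)
print(latest)
print(pinned)
```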

Examples

Check out some of our code examples of using this workflow to find out what it all looks like.


Using Project Ontologies

Scale uses Ontologies to maintain complex taxonomies and provide detailed labeling guidance to labelers. We provide a tool to allow both you and the labelers to view how the ontology has changed over time and gain insights on each choice. Ontologies can be used as a replacement or add-on to the instructions of the project.

Ontology Features

Versioning
Scale maintains a queryable list of all past ontology versions. You can quickly see previous ontology versions and how they've changed over time.

Descriptions
To provide more context on label choices, you can provide descriptions on each choice within your ontology.

How do I start using Ontologies?

At the Project Level:
We will be leveraging the Update Ontology endpoint to set and create project ontology versions.

When Retrieving or Listing projects, information about the project ontologies will be available in the ontologyHistory field.

At the Task Level:

Tasks will automatically use the latest version of the project ontology when providing guidance to labelers. By default, the ontology is currently not linked to the project params or their object choices.

| Parameter | Type | Description |
| --- | --- | --- |
| choice* | string | The name of the choice. |
| display | string | The displayed name of the choice. |
| description | string | The description of the choice and any labeling guidance which can be provided. |
| subchoices | Array<OntologyChoice \| string> | Sub-choices to be shown under this parent label. Array can be a mix of OntologyChoice objects or strings. |

Ontology Example

ontology = [
  "Road",
  {
    "choice": "Vehicle",
    "description": "a means of carrying or transporting material",
    "subchoices": ["Car", "Truck", "Train", "Motorcycle"]
  },
  {
    "choice": "Pedestrian",
    "subchoices": [
      "Animal",
      {"choice": "Ped_HeightOverMeter", "display": "Adult"},
      {"choice": "Ped_HeightUnderMeter", "display": "Child"}
    ]
  }
]
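
Because subchoices can mix plain strings with OntologyChoice objects, client code that reads an ontology usually has to normalize both forms. The sketch below walks the example ontology and yields every choice with its displayed name. The function flatten_choices, the slash-separated path format, and the fallback from display to the choice name are assumptions made for illustration.

```python
def flatten_choices(choices, parent=""):
    """Yield (path, display_name) pairs for every choice, depth first."""
    for entry in choices:
        # A bare string is shorthand for an object with only a "choice" key.
        if isinstance(entry, str):
            entry = {"choice": entry}
        name = entry["choice"]
        path = f"{parent}/{name}" if parent else name
        # Assumption: when "display" is absent, show the choice name itself.
        yield path, entry.get("display", name)
        yield from flatten_choices(entry.get("subchoices", []), path)


ontology = [
    "Road",
    {
        "choice": "Vehicle",
        "description": "a means of carrying or transporting material",
        "subchoices": ["Car", "Truck", "Train", "Motorcycle"],
    },
    {
        "choice": "Pedestrian",
        "subchoices": [
            "Animal",
            {"choice": "Ped_HeightOverMeter", "display": "Adult"},
            {"choice": "Ped_HeightUnderMeter", "display": "Child"},
        ],
    },
]

flat = list(flatten_choices(ontology))
for path, display in flat:
    print(f"{path} -> {display}")
```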

Batches Object Overview

For high-volume projects, batches can optionally be used to further divide work inside a project. Batches can correspond to specific datasets you use internally, or can be used to note, for example, which tasks were part of a weekly submission.

Scale Enterprise Workflow:

  1. Batch is created (status = in_progress)

  2. Tasks are added to the batch by specifying the batch field on the task to be the name of your batch

  3. When the final task is completed, the batch is completed (status = completed) and the callback is fired with a payload like the example below.

Scale Rapid Workflow:

If you are using Scale Rapid, batches take on another purpose, where they are deeply linked to the task delivery model.

There are two types of batches: production batches and calibration batches. Calibration batches help ensure your project is ready for humans to start working on your production data (with "normal" batches).

  1. Batch is created (status = staging)

  2. Tasks are added to the batch by specifying the batch field on the task to be the name of your batch

  3. Batches need to be finalized (status = in_progress)

  4. Once a batch is finalized, tasks will be submitted to our taskers to begin labeling. No tasks may be added to a batch once it has been finalized.

  5. When the final task is completed, the batch is completed (status = completed) and the callback is fired with a payload like the example below.

Scale Studio Workflow:

  1. Batch is created (status = staging)

  2. Tasks are added to the batch by specifying the batch field on the task to be the name of your batch

  3. Batches need to be finalized (status = in_progress)

  4. Once a batch is finalized, tasks will be submitted to your team workers to begin labeling. No tasks may be added to a batch once it has been finalized.

  5. When the final task is completed, the batch is completed (status = completed) and the callback is fired with a payload like the example below.
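
The Rapid and Studio workflows above amount to a small state machine: staging, then in_progress, then completed. The toy model below is a client-side sketch of those rules only; the Batch class and its method names are hypothetical and do not correspond to any Scale SDK. In the Enterprise workflow the batch would instead start in in_progress and skip the finalize step.

```python
class Batch:
    """Toy model of the Rapid/Studio batch lifecycle described above."""

    def __init__(self, project, name):
        self.project = project
        self.name = name
        self.status = "staging"  # step 1: batch is created
        self.tasks = []
        self.completed = set()

    def add_task(self, task_id):
        # Steps 2 and 4: tasks may only be added before finalization.
        if self.status != "staging":
            raise ValueError("no tasks may be added once a batch is finalized")
        self.tasks.append(task_id)

    def finalize(self):
        # Step 3: finalizing moves the batch to in_progress.
        if self.status != "staging":
            raise ValueError(f"cannot finalize a batch in status {self.status!r}")
        self.status = "in_progress"

    def complete_task(self, task_id):
        # Step 5: the batch completes when its final task completes.
        if self.status != "in_progress":
            raise ValueError("tasks are only worked on after finalization")
        if task_id not in self.tasks:
            raise KeyError(task_id)
        self.completed.add(task_id)
        if self.completed == set(self.tasks):
            self.status = "completed"


batch = Batch("kitten_labeling", "kitten_labeling_2020-07")
batch.add_task("task_1")
batch.add_task("task_2")
batch.finalize()
batch.complete_task("task_1")
batch.complete_task("task_2")
print(batch.status)
```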

Example Batch Object

{
  "project": "kitten_labeling",
  "name": "kitten_labeling_2020-07",
  "status": "staging",
  "callback": "https://example.com/callback",
  "created_at": "2020-07-01T09:09:10.108Z"
}

Tasks Object Overview

A task represents an individual unit of work to be done by a Tasker. There's a 1:1 mapping between a task and the data to be labeled. For example, there'd be 1 task for each image, video, or lidar sequence needing to be labeled.

You specify how the labeling should be done for a given task by passing a set of task parameters in the API call to the endpoint you'd like to use.

Tasks have a type such as "Image Annotation", "Video Annotation", "Lidar Segmentation", or "Document Transcription".

Learn more about our key concepts and workflows in our "Scale 101" guides: https://scale.com/docs/key-concepts

Example Task Object

{
  "task_id": "576ba74eec471ff9b01557cc",
  "created_at": "2016-06-23T09:09:34.752Z",
  "updated_at": "2016-06-23T09:10:02.798Z",
  "completed_at": "2016-06-23T09:10:02.798Z",  
  "type": "categorization",
  "status": "completed",
  "instruction": "Would you say this item is big or small?",
  "params": {
    "attachment_type": "text",
    "attachment": "car",
    "categories": [
      "big",
      "small"
    ]
  },
  "callback_url": "http://www.example.com/callback",
  "callback_completed": true,
  "response": {
    "category": "big"
  },
  "metadata": {},
  "audits": [
    {
      "audited_by": "[email protected]",
      "audited_at": "2016-06-24T15:32:03.585Z",
      "audit_time_secs": 120,
      "audit_result": "accepted",
      "audit_source": "customer"
    },
    {
      "audited_by": "[email protected]",
      "audited_at": "2016-06-23T10:01:02.352Z",
      "audit_time_secs": 511,
      "audit_result": "fixed",
      "audit_source": "scale"
    }
  ],
  "tags": ["experiment_1", "owner:david"],
  "unique_id": "product_experiment_dg3d9x83"
}

| Property | Type | Description |
| --- | --- | --- |
| task_id | string | The task_id is the unique identifier for the task. |
| type | string | The type of the task, for example, imageannotation or lidarsegmentation. |
| instruction | string | A markdown-enabled string explaining the instructions for the task. You can use markdown to show example images, give structure to your instructions, and more. HTML tags are unsupported. |
| params | object | An object with the parameters of the task based on the type. For imageannotation type tasks, for example, this will include attachment and geometries. |
| response | object | An object corresponding to the response once the task is completed. Each task type has its response format documented. |
| status | string | The status of the task, one of pending, completed, canceled or error. |
| created_at | timestamp | A string of the UTC timestamp for when the task was created. |
| updated_at | timestamp | A string of the UTC timestamp for when the task was last updated. If a task is completed, this timestamp is usually the same as the completed_at timestamp. However, sometimes a task can be redone, and updated after its completion. |
| completed_at | timestamp | A string of the UTC timestamp for when the task was completed. This will only be filled in after it is completed. |
| callback_completed | boolean | A boolean stating whether or not the callback succeeded. If the callback returns with a 2xx status code, the value will be true. If the callback fails to return a 2xx status code through all retries, then the value will be false. |
| customer_review_status | string | The status of the QA'd task, one of pending, accepted, fixed or rejected. |
| times_redone | number | A counter for the number of times the task has been redone. It will be present if the task has been redone at least once. |
| metadata | object, default | A set of key/value pairs that you can attach to a task object. It can be useful for storing additional information about the task in a structured format. See Task Metadata for more information. |
| project_param_version | number \| null | Corresponds to the project version a task was created with, or null if no project version exists. |
| audits | array, default | An array of audit records. |
| unique_id | string, optional | An arbitrary ID that you can assign to a task and then query for later. This ID must be unique across all projects under your account, otherwise the task submission will be rejected. See Avoiding Duplicate Tasks for more details. |
| tags | array, optional | Arbitrary labels that you can assign to a task. At most 5 tags are allowed per task. You can query tasks with specific tags through the task retrieval API. |
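
A couple of the properties above lend themselves to small client-side helpers. The sketch below parses the UTC timestamp strings and checks the five-tag limit before submission. The function names are hypothetical, and the timestamp format string is an assumption inferred from the Example Task Object rather than a documented guarantee.

```python
from datetime import datetime, timezone

MAX_TAGS = 5  # "At most 5 tags are allowed per task."


def parse_timestamp(value):
    """Parse a timestamp such as 2016-06-23T09:09:34.752Z as aware UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def turnaround_seconds(task):
    """Seconds from creation to completion, or None if not yet completed."""
    if not task.get("completed_at"):
        return None
    delta = parse_timestamp(task["completed_at"]) - parse_timestamp(task["created_at"])
    return delta.total_seconds()


def validate_tags(tags):
    """Raise ValueError when a task would carry more than MAX_TAGS tags."""
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed per task")
    return list(tags)


# Fields copied from the Example Task Object.
task = {
    "created_at": "2016-06-23T09:09:34.752Z",
    "completed_at": "2016-06-23T09:10:02.798Z",
    "tags": ["experiment_1", "owner:david"],
}
print(turnaround_seconds(task))
print(validate_tags(task["tags"]))
```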


Annotation Object Overview

Annotation Attributes

In many cases, it is useful to have more human-judged metadata on top of each annotation for a given task, for example, measuring the occlusion-level of all vehicles in an image.

To achieve this, we support annotation_attributes, an object representing additional attributes that you'd like to capture per image annotation.

You may use annotation_attributes to define categorical attributes, numerical attributes, angle attributes, text attributes and more for each annotation.

You define the type of the attribute using the type property of the attribute. If no type is specified, it will default to a category type attribute. See the Annotation Attribute Types below for more details.

The format for annotation_attributes is an object whose key-value pairs all specify attributes of each annotation that you want to capture. The schema differs slightly based on the type of attribute.

| Attribute Type | Description |
| --- | --- |
| Categorical | Multiple choice attribute, with an optional ability to enable selecting multiple options simultaneously. |
| Numerical | Integer input |
| Angle | Input to select a value between 0 and 360 with a visual interface supporting angles. |
| Text | Input for free text |
| X/Y Offset | Input to select width or height within a box annotation |
| Linked | Input to link one annotation to another annotation |


Request Format

To create a task with attributes, simply add the annotation_attributes parameter to your task creation request using the format described above.

Example task payload for an "annotation" task

{
  "callback_url": "http://www.example.com/callback",
  "instruction": "Draw boxes around the vehicles in the image.",
  "attachment_type": "image",
  "attachment": "http://i.imgur.com/v4cBreD.jpg",
  "geometries": {
    "box": {
      "objects_to_annotate": ["car", "pedestrian"]
    }
  },
  "annotation_attributes": {
    "parked": {
      "description": "Is the car currently parked?",
      "choices": [
        "Yes",
        "No"
      ]
    },
    "heading": {
      "description": "Which direction is the car heading",
      "choices": [
        "left",
        "right",
        "back",
        "front"
      ],
      "conditions": {
        "label_condition": {
          "label": "car"
        },
        "attribute_conditions": [
          {
            "parked": "No"
          }
        ]
      }
    }
  }
}
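
When attributes are generated programmatically, a small builder can keep the structure consistent and guarantee that the result serializes to valid JSON. The sketch below rebuilds the annotation_attributes object from the payload above. The helper categorical_attribute is hypothetical; it only emits the keys that appear in the example (description, choices, conditions, label_condition, attribute_conditions).

```python
import json


def categorical_attribute(description, choices, label=None, attribute_conditions=None):
    """Build one categorical attribute, with optional display conditions."""
    attribute = {"description": description, "choices": list(choices)}
    conditions = {}
    if label is not None:
        conditions["label_condition"] = {"label": label}
    if attribute_conditions:
        conditions["attribute_conditions"] = list(attribute_conditions)
    # Omit the "conditions" key entirely when nothing was requested.
    if conditions:
        attribute["conditions"] = conditions
    return attribute


annotation_attributes = {
    "parked": categorical_attribute("Is the car currently parked?", ["Yes", "No"]),
    "heading": categorical_attribute(
        "Which direction is the car heading",
        ["left", "right", "back", "front"],
        label="car",
        attribute_conditions=[{"parked": "No"}],
    ),
}
print(json.dumps(annotation_attributes, indent=2))
```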