Scale
Project Overview
Projects are a way of organizing similar tasks, so that one can share parameters among tasks, or control what workers are allowed to work on certain classes of tasks. The parameters associated with a project will be inherited by tasks created under that project. See Update Parameters (Project) for more information. A common use-case is to define and update instructions at the project level and let tasks inherit from the current version of the project instructions. To use a project for a given task, add a
Using Project-level parameters
Scale uses Projects to organize Tasks, the unit of work. Each type of Task (Image Annotation, Text Collection, Lidar Annotation) has many customizable input parameters that can be used when creating tasks to control exactly how the tasks are formed and what the response will look like. Task definitions can be stored at the Task level, at the Project level, or split between the two in a hybrid approach.
Task vs. Project Level Parameters
Task Only
All Task parameters (definitions) are set when making the API call to create a Task. The Project has no parameters and is used only to collect and organize similar sets of tasks. Using the Task-specific endpoints and parameters documented in our docs results in a "Task Only" approach.
Project Only
In a Project only approach, all task definitions are stored in a versioned set of project parameters. When creating a task, you only need to specify the attachment and other per-task fields, and the task will inherit all the parameters and task definitions from the project version.
Hybrid Approach
In the Hybrid approach, some task-level parameters are specified on the Project level, while others are defined on a per-task basis. This could be useful to take advantage of the project versioning while allowing flexibility for certain parameters that could change on a per-task basis.
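As a rough illustration of the three approaches, the payloads below show where the task definitions could live in each case. This is a sketch only: the "project" key, the project name, and the helper task_level_definitions are assumptions made for illustration, not confirmed request formats.

```python
# Hypothetical payloads; the "project" key and the project name are assumptions.
PROJECT = "kitten_labeling"

# Task Only: every definition travels with the task itself.
task_only = {
    "project": PROJECT,
    "instruction": "Draw boxes around the vehicles in the image.",
    "attachment_type": "image",
    "attachment": "http://i.imgur.com/v4cBreD.jpg",
    "geometries": {"box": {"objects_to_annotate": ["car", "pedestrian"]}},
}

# Project Only: the task carries only per-task fields such as the attachment.
project_only = {
    "project": PROJECT,
    "attachment_type": "image",
    "attachment": "http://i.imgur.com/v4cBreD.jpg",
}

# Hybrid: most definitions come from the project, one is set per task.
hybrid = {
    **project_only,
    "geometries": {"box": {"objects_to_annotate": ["car"]}},
}


def task_level_definitions(payload):
    """Return the keys that define how to label, ignoring per-task fields."""
    per_task_fields = {"project", "attachment", "attachment_type"}
    return sorted(set(payload) - per_task_fields)


for name, payload in [("task_only", task_only),
                      ("project_only", project_only),
                      ("hybrid", hybrid)]:
    print(name, task_level_definitions(payload))
```

The fewer definition keys a payload carries, the more of the labeling behavior is controlled by the versioned project parameters.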
Project-level Parameter Features
Versioning
Scale maintains a queryable list of all past project versions, so you can quickly see what a given past project version contained. All Tasks will have a
Separation of Concerns
As your pipeline gets more sophisticated, we recommend separating taxonomy management from the task submission pipeline. Task submission should be fairly "dumb": it doesn't need to know the details of how a task should be labeled, only that it needs to add a new attachment to a project for labeling, and that the project will dictate how the task is labeled.
The project params can then be updated independently of the task submission process.
Default Values
Project-level Parameters are automatically passed down and combined with the task parameters when creating tasks within that project. In this way, you can set default values that you want to ensure are specified regardless of how tasks are submitted to the project. A common use-case might be the instructions - but you can specify any default you'd like.
How do I start using Project-level Parameters?
At the Project Level:
We will be leveraging the Update Parameters endpoint to set and create project parameter versions.
When Retrieving or Listing projects, information about the project parameters will be available in the
At the Task Level:
Tasks automatically use the latest version of the project parameters. Parameters specified at the project level are merged with the task-level parameters. Where both are set, the task-level value overrides the default inherited from the project, with the exception of the instruction field, where the task-level and project-level instructions are concatenated.
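The merge rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual server-side logic: the function name merge_params, the shallow (top-level) merge, the order of concatenation (project first, then task), and the blank-line separator are all assumptions.

```python
def merge_params(project_params, task_params, separator="\n\n"):
    """Combine project defaults with task-level parameters.

    Assumptions: the merge is shallow, task values win on conflicts, and
    instructions are joined project-first with a blank line between them.
    """
    merged = {**project_params, **task_params}

    project_instruction = project_params.get("instruction")
    task_instruction = task_params.get("instruction")
    if project_instruction and task_instruction:
        # The instruction field is the exception: concatenate, don't override.
        merged["instruction"] = project_instruction + separator + task_instruction
    return merged


project_params = {
    "instruction": "Label every vehicle.",
    "attachment_type": "image",
}
task_params = {
    "instruction": "Ignore parked cars.",
    "attachment": "http://i.imgur.com/v4cBreD.jpg",
}
merged = merge_params(project_params, task_params)
print(merged["instruction"])
```

Here attachment_type is inherited from the project, attachment comes from the task, and the two instructions are combined into one string.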
You are able to specify previous versions of the project to use when creating tasks by specifying a
Examples
Check out some of our code examples of using this workflow to find out what it all looks like.
Using Project Ontologies
Scale uses Ontologies to maintain complex taxonomies and provide detailed labeling guidance to labelers. We provide a tool to allow both you and the labelers to view how the ontology has changed over time and gain insights on each choice. Ontologies can be used as a replacement or add-on to the instructions of the project.
Ontology Features
Versioning
Scale maintains a queryable list of all past ontology versions. You can quickly see previous ontology versions and how they've changed over time.
Descriptions
To provide more context on label choices, you can provide descriptions on each choice within your ontology.
How do I start using Ontologies?
At the Project Level:
We will be leveraging the Update Ontology endpoint to set and create project ontology versions.
When Retrieving or Listing projects, information about the project ontologies will be available in the
At the Task Level:
Tasks will automatically use the latest version of project ontology when providing guidance to labelers. The ontology is currently not related to the project params and its object choices by default.
Parameter | Type | Description |
---|---|---|
choice* | string | The name of the choice. |
display | string | The displayed name of the choice. |
description | string | The description of the choice and any labeling guidance which can be provided. |
subchoices | Array<OntologyChoice \| string> | Sub-choices to be shown under this parent label. Array can be a mix of OntologyChoice objects or strings. |
Ontology Example
ontology = [
    "Road",
    {
        "choice": "Vehicle",
        "description": "a means of carrying or transporting material",
        "subchoices": ["Car", "Truck", "Train", "Motorcycle"]
    },
    {
        "choice": "Pedestrian",
        "subchoices": [
            "Animal",
            {"choice": "Ped_HeightOverMeter", "display": "Adult"},
            {"choice": "Ped_HeightUnderMeter", "display": "Child"}
        ]
    }
]
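Because an ontology can mix plain strings and OntologyChoice objects at any depth, code that consumes it usually needs a small recursive walk. The sketch below flattens the example ontology into (choice, display, depth) tuples. The helper name iter_choices is hypothetical, and falling back to the choice name when display is omitted is an assumption.

```python
ontology = [
    "Road",
    {
        "choice": "Vehicle",
        "description": "a means of carrying or transporting material",
        "subchoices": ["Car", "Truck", "Train", "Motorcycle"],
    },
    {
        "choice": "Pedestrian",
        "subchoices": [
            "Animal",
            {"choice": "Ped_HeightOverMeter", "display": "Adult"},
            {"choice": "Ped_HeightUnderMeter", "display": "Child"},
        ],
    },
]


def iter_choices(entries, depth=0):
    """Yield (choice, display, depth) for every entry, depth-first."""
    for entry in entries:
        if isinstance(entry, str):
            # A bare string is shorthand for a choice with no extra fields.
            yield entry, entry, depth
        else:
            name = entry["choice"]
            # Assumption: display falls back to the choice name when omitted.
            yield name, entry.get("display", name), depth
            yield from iter_choices(entry.get("subchoices", []), depth + 1)


for choice, display, depth in iter_choices(ontology):
    print("  " * depth + display)
```

The printed outline shows each label as a labeler might see it, with sub-choices indented under their parent.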
Batches Object Overview
For high-volume projects, batches can optionally be used to further divide work inside a project. Batches can tie to specific datasets you use internally, or can be used to note which tasks were part of a weekly submission for example.
Batch is created (status = in_progress)
Tasks are added to the batch by specifying the batch field on the task to be the name of your batch
When the final task is completed, the batch is completed (status = completed) and the callback is fired.
If you are using Scale Rapid, batches take on another purpose, where they are deeply linked to the task delivery model.
There are two types of batches: production batches and calibration batches. Calibration batches help ensure your project is ready for humans to start working on your production data (which is submitted in "normal" production batches).
Batch is created (status = staging)
Tasks are added to the batch by specifying the batch field on the task to be the name of your batch
Batches need to be finalized (status = in_progress)
Once a batch is finalized, tasks will be submitted to our taskers to begin labeling. No tasks may be added to a batch once it has been finalized.
When the final task is completed, the batch is completed (status = completed) and the callback is fired.
Batch is created (status = staging)
Tasks are added to the batch by specifying the batch field on the task to be the name of your batch
Batches need to be finalized (status = in_progress)
Once a batch is finalized, tasks will be submitted to your team workers to begin labeling. No tasks may be added to a batch once it has been finalized.
When the final task is completed, the batch is completed (status = completed) and the callback is fired.
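The staged lifecycle above (staging, then in_progress, then completed) behaves like a small state machine. The class below is a toy in-memory model of those transitions, written only to make the rules concrete. It is not a client for the real API, and the class, method, and callback names are all hypothetical.

```python
class Batch:
    """Toy model of the batch lifecycle: staging -> in_progress -> completed."""

    def __init__(self, project, name, on_complete=None):
        self.project = project
        self.name = name
        self.status = "staging"
        self.pending = set()
        self.on_complete = on_complete  # stand-in for the batch callback

    def add_task(self, task_id):
        # No tasks may be added once the batch has been finalized.
        if self.status != "staging":
            raise RuntimeError("cannot add tasks to a finalized batch")
        self.pending.add(task_id)

    def finalize(self):
        if self.status != "staging":
            raise RuntimeError("batch is already finalized")
        self.status = "in_progress"

    def complete_task(self, task_id):
        if self.status != "in_progress":
            raise RuntimeError("batch is not in progress")
        self.pending.discard(task_id)
        if not self.pending:
            # The final task finished, so the batch completes and the callback fires.
            self.status = "completed"
            if self.on_complete:
                self.on_complete({"name": self.name, "status": self.status})


events = []
batch = Batch("kitten_labeling", "kitten_labeling_2020-07", on_complete=events.append)
batch.add_task("task-1")
batch.add_task("task-2")
batch.finalize()
batch.complete_task("task-1")
batch.complete_task("task-2")
print(batch.status, events)
```

Raising an error on a late add_task call mirrors the rule that a finalized batch is closed to new tasks.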
Example Batch Object
{
  "project": "kitten_labeling",
  "name": "kitten_labeling_2020-07",
  "status": "staging",
  "callback": "https://example.com/callback",
  "created_at": "2020-07-01T09:09:10.108Z"
}
Tasks Object Overview
A task represents an individual unit of work to be done by a Tasker. There's a 1:1 mapping between a task and the data to be labeled. For example, there'd be 1 task for each image, video, or lidar sequence needing to be labeled.
You specify how the labeling should be done for a given task by passing a set of task parameters in the API call to the endpoint you want to use.
Tasks have a type such as "Image Annotation", "Video Annotation", "Lidar Segmentation", or "Document Transcription".
Learn more about our key concepts and workflows in our "Scale 101" guides: https://scale.com/docs/key-concepts
Example Task Object
{
  "task_id": "576ba74eec471ff9b01557cc",
  "created_at": "2016-06-23T09:09:34.752Z",
  "updated_at": "2016-06-23T09:10:02.798Z",
  "completed_at": "2016-06-23T09:10:02.798Z",
  "type": "categorization",
  "status": "completed",
  "instruction": "Would you say this item is big or small?",
  "params": {
    "attachment_type": "text",
    "attachment": "car",
    "categories": [
      "big",
      "small"
    ]
  },
  "callback_url": "http://www.example.com/callback",
  "callback_completed": true,
  "response": {
    "category": "big"
  },
  "metadata": {},
  "audits": [
    {
      "audited_by": "[email protected]",
      "audited_at": "2016-06-24T15:32:03.585Z",
      "audit_time_secs": 120,
      "audit_result": "accepted",
      "audit_source": "customer"
    },
    {
      "audited_by": "[email protected]",
      "audited_at": "2016-06-23T10:01:02.352Z",
      "audit_time_secs": 511,
      "audit_result": "fixed",
      "audit_source": "scale"
    }
  ],
  "tags": ["experiment_1", "owner:david"],
  "unique_id": "product_experiment_dg3d9x83"
}
Property | Type | Description |
---|---|---|
task_id | string | The task_id is the unique identifier for the task. |
type | string | The type of the task, for example, imageannotation or lidarsegmentation |
instruction | string | A markdown-enabled string explaining the instructions for the task. You can use markdown to show example images, give structure to your instructions, and more. HTML tags are unsupported. |
params | object | An object with the parameters of the task based on the type. For |
response | object | An object corresponding to the response once the task is completed. Each task type has its response format documented. |
status | string | The status of the task, one of pending, completed, canceled or error. |
created_at | timestamp | A string of the UTC timestamp for when the task was created. |
updated_at | timestamp | A string of the UTC timestamp for when the task was last updated. If a task is completed, this timestamp is usually the same as the completed_at timestamp. However, sometimes a task can be redone, and updated after its completion. |
completed_at | timestamp | A string of the UTC timestamp for when the task was completed. This will only be filled in after it is completed. |
callback_completed | boolean | A boolean stating whether or not the callback succeeded. If the callback returns with a 2xx status code, the value will be true. If the callback fails to return a 2xx status code through all retries, then the value will be false. |
customer_review_status | string | The status of the QA'd task, one of pending, accepted, fixed or rejected. |
times_redone | number | A counter for the number of times the task has been redone. It will be present if the task has been redone at least once. |
metadata | object, default | A set of key/value pairs that you can attach to a task object. It can be useful for storing additional information about the task in a structured format. See Task Metadata for more information. |
project_param_version | number \| null | Corresponds to the project version a task was created with, or null if no project version exists. |
audits | array, default | An array of audit records. |
unique_id | string, optional | An arbitrary ID that you can assign to a task and then query for later. This ID must be unique across all projects under your account, otherwise the task submission will be rejected. See Avoiding Duplicate Tasks for more details. |
tags | array, optional | Arbitrary labels that you can assign to a task. At most 5 tags are allowed per task. You can query tasks with specific tags through the task retrieval API. |
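To show how the properties in the table fit together, the sketch below reads a task object shaped like the example and derives a few summary values from it. It assumes every timestamp uses the millisecond UTC format seen in the example. The helper names parse_timestamp and summarize_task are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: timestamps always look like "2016-06-23T09:09:34.752Z".
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse a UTC timestamp string into a timezone-aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def summarize_task(task):
    """Derive turnaround time and audit results from a task object."""
    created = parse_timestamp(task["created_at"])
    completed_raw = task.get("completed_at")  # only filled in after completion
    turnaround = None
    if completed_raw:
        turnaround = (parse_timestamp(completed_raw) - created).total_seconds()
    return {
        "task_id": task["task_id"],
        "status": task["status"],
        "turnaround_secs": turnaround,
        # updated_at usually equals completed_at unless the task was redone.
        "updated_after_completion": bool(completed_raw)
        and task["updated_at"] != completed_raw,
        "audit_results": [a["audit_result"] for a in task.get("audits", [])],
    }


example_task = {
    "task_id": "576ba74eec471ff9b01557cc",
    "created_at": "2016-06-23T09:09:34.752Z",
    "updated_at": "2016-06-23T09:10:02.798Z",
    "completed_at": "2016-06-23T09:10:02.798Z",
    "status": "completed",
    "audits": [{"audit_result": "accepted"}, {"audit_result": "fixed"}],
}
summary = summarize_task(example_task)
print(summary)
```

A task that is still pending has no completed_at value, so its turnaround is reported as None rather than raising an error.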
Annotation Object Overview
Annotation Attributes
In many cases, it is useful to have more human-judged metadata on top of each annotation for a given task, for example, measuring the occlusion-level of all vehicles in an image.
To achieve this, we support annotation_attributes, an object representing additional attributes that you'd like to capture per image annotation.
You may use annotation_attributes to define categorical attributes, numerical attributes, angle attributes, text attributes and more for each annotation.
You define the type of an attribute using its type property. If no type is specified, it defaults to a category type attribute. See the Annotation Attribute Types below for more details.
The format for annotation_attributes is an object whose key-value pairs all specify attributes of each annotation that you want to capture. The schema differs slightly based on the type of attribute.
Attribute Type | Description |
---|---|
Categorical | Multiple choice attribute, with an optional ability to enable selecting multiple options simultaneously. |
Numerical | Integer input |
Angle | Input to select a value between 0 and 360 with a visual interface supporting angles. |
Text | Input for free text |
X/Y Offset | Input to select width or height within a box annotation |
Linked | Input to link one annotation to another annotation |
Request Format
To create a task with attributes, simply add the annotation_attributes parameter to your task creation request using the format described above.
Example task payload for an "annotation" task
{
  "callback_url": "http://www.example.com/callback",
  "instruction": "Draw boxes around the vehicles in the image.",
  "attachment_type": "image",
  "attachment": "http://i.imgur.com/v4cBreD.jpg",
  "geometries": {
    "box": {
      "objects_to_annotate": ["car", "pedestrian"]
    }
  },
  "annotation_attributes": {
    "parked": {
      "description": "Is the car currently parked?",
      "choices": [
        "Yes",
        "No"
      ]
    },
    "heading": {
      "description": "Which direction is the car heading",
      "choices": [
        "left",
        "right",
        "back",
        "front"
      ],
      "conditions": {
        "label_condition": {
          "label": "car"
        },
        "attribute_conditions": [
          {
            "parked": "No"
          }
        ]
      }
    }
  }
}
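The conditions block in the payload above makes the heading attribute depend on the annotation's label and on the value of parked. The exact evaluation rules are not spelled out here, so the sketch below shows one plausible reading and should be treated as an assumption: the label must match label_condition, and at least one entry in attribute_conditions must match the values already chosen. The function name attribute_applies is hypothetical.

```python
def attribute_applies(attribute_spec, label, current_values):
    """Decide whether an attribute should be asked for a given annotation.

    Assumed semantics: the label must match label_condition (if present),
    and any one entry of attribute_conditions must match in full.
    """
    conditions = attribute_spec.get("conditions")
    if not conditions:
        return True  # unconditional attribute

    label_condition = conditions.get("label_condition")
    if label_condition and label_condition.get("label") != label:
        return False

    attribute_conditions = conditions.get("attribute_conditions", [])
    if not attribute_conditions:
        return True
    return any(
        all(current_values.get(key) == value for key, value in cond.items())
        for cond in attribute_conditions
    )


annotation_attributes = {
    "parked": {
        "description": "Is the car currently parked?",
        "choices": ["Yes", "No"],
    },
    "heading": {
        "description": "Which direction is the car heading",
        "choices": ["left", "right", "back", "front"],
        "conditions": {
            "label_condition": {"label": "car"},
            "attribute_conditions": [{"parked": "No"}],
        },
    },
}

heading = annotation_attributes["heading"]
print(attribute_applies(heading, "car", {"parked": "No"}))
print(attribute_applies(heading, "car", {"parked": "Yes"}))
```

Under this reading, heading is only requested for cars that are not parked, while parked is requested for every annotation.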