Scale
TextCollectionAttachment
An array of TextCollectionAttachment objects to be labeled.
Video Support
The video attachment should have
HTML Support in TextCollection Attachments:
When creating a TextCollection task, customers can pass Markdown as the string content. Markdown also allows HTML tags within the Markdown syntax.
To keep the TextCollection platform secure, we sanitize all HTML tags passed within the Markdown using the HTML-sanitize JavaScript package. This package removes every tag except the allowed HTML tags listed in the table below.
Because only these tags pass through, the content displayed to the tasker is secure and adheres to our standards. Any HTML tag that is not on the allowed list is removed from the string during sanitization.
Parameter | Type | Description |
---|---|---|
type* | string | One of |
content* | string | Content or link to relevant file. |
forms | array | Array of |
HTML tags allowed:
Category | Tags |
---|---|
Content sectioning | 'address', 'article', 'aside', 'footer', 'header', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hgroup', 'main', 'nav', 'section' |
Text content | 'blockquote', 'dd', 'div', 'dl', 'dt', 'figcaption', 'figure', 'hr', 'li', 'main', 'ol', 'p', 'pre', 'ul' |
Inline text semantics | 'a', 'abbr', 'b', 'bdi', 'bdo', 'br', 'cite', 'code', 'data', 'dfn', 'em', 'i', 'kbd', 'mark', 'q', 'rb', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'time', 'u', 'var' |
Table content | 'caption', 'col', 'colgroup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr' |
Additional Tags | 'img', 'iframe' |
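As a rough illustration of allowlist-based sanitization, the sketch below drops every tag that is not on a small allowlist while keeping the text inside it. It is not the sanitizer Scale runs: it uses Python's standard html.parser instead of the JavaScript package, ALLOWED_TAGS is only a subset of the table above, the names AllowlistSanitizer and strip_disallowed_tags are hypothetical, and all attributes are discarded for simplicity.

```python
import html
from html.parser import HTMLParser

# Assumption: a small subset of the allowed tags listed above, for illustration only.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "code", "pre", "ul", "ol", "li", "br"}


class AllowlistSanitizer(HTMLParser):
    """Re-emits allowed tags (without attributes) and drops all other tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, without a closing tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so decoded entities cannot turn back into markup.
        self.parts.append(html.escape(data, quote=False))


def strip_disallowed_tags(markup: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(strip_disallowed_tags("<p>Hi <script>alert(1)</script><b>there</b></p>"))
```

Running a check like this locally before creating a task can show which tags are likely to be stripped, but the platform's own sanitizer remains the source of truth.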
UnitField
Conditional Fields
Sometimes a field should only be presented if specific choices are selected for other fields. In these cases, you can specify the conditions — the dependent questions and corresponding sets of choices.
Each object in the conditions array maps keys to values as follows:
Key: the field_id of the dependent field.
Value: an object specifying the desired choices for the dependent field.
For example conditions, see the example below.
Conditions currently only work with dependent fields of type CategoryField. The syntax is accepted on other field types, but it may raise errors or cause undefined behavior.
Parameters
type (string, required)
One of text, boolean, number, datetime, category, select, or time_range.
field_id (string, required)
A unique identifier for the field, which should not change among tasks within a project.
title (string, required)
Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project. Must not be an empty string.
description (string)
A brief description about what the response should be. This may change among tasks within a project.
hint (string)
Longer explanation of why the field exists and how it should be used. Renders as a tooltip.
required (boolean)
Determines whether or not a response for this field is required. The default is false.
min_responses_required (integer)
The minimum number of separate annotations allowed for this field. Must be larger than 0. The default is 1.
max_responses_required (integer)
The maximum number of separate annotations allowed for this field. Must be larger than or equal to min_responses_required, with an upper bound of 100. The default is 1.
conditions (array of objects)
A set of conditions which must be satisfied for this field to be shown. Default is undefined.
Additional Fields (object)
See the TextField, BooleanField, NumberField, DatetimeField, and CategoryField sections.
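The constraints in the parameter list above can be checked before a task is submitted. The following is a minimal sketch under that reading of the rules; validate_unit_field is a hypothetical helper, not part of any Scale SDK, and it only covers the constraints stated above.

```python
def validate_unit_field(field: dict) -> list[str]:
    """Return a list of problems found in the shared UnitField parameters."""
    errors = []

    # type, field_id and title are the required string parameters.
    for key in ("type", "field_id", "title"):
        if not isinstance(field.get(key), str):
            errors.append(f"{key} is required and must be a string")
    if field.get("title") == "":
        errors.append("title must not be an empty string")

    # Both response counts default to 1 when omitted.
    min_required = field.get("min_responses_required", 1)
    max_required = field.get("max_responses_required", 1)
    if min_required <= 0:
        errors.append("min_responses_required must be larger than 0")
    if max_required < min_required:
        errors.append("max_responses_required must be >= min_responses_required")
    if max_required > 100:
        errors.append("max_responses_required must be at most 100")

    return errors


field = {"type": "text", "field_id": "summary", "title": "Summary",
         "min_responses_required": 1, "max_responses_required": 3}
print(validate_unit_field(field))
```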
Example
// Example of UnitField with conditions
{
  type: "category",
  field_id: "occlusion",
  title: "Is there occlusion in the image?",
  choices: [
    {label: 'None', value: '0'},
    {label: 'A little', value: '1'},
    {label: 'A lot', value: '2'},
  ],
  conditions: [{}],
},
{
  type: "category",
  field_id: "occlusion_detail",
  title: "What is the cause of the occlusion?",
  choices: [
    {label: 'Rain', value: 'rain'},
    {label: 'Shadow', value: 'shadow'},
  ],
  conditions: [{
    occlusion: ['1', '2'], // show if 1 or 2 are selected
    // equivalently {not: [[], ['0']]}
    // equivalently [{not: []}, {not: ['0']}]
    // equivalently [['1'],['2']]
  }],
},
{
  type: "text",
  field_id: "a_lot_of_shadow",
  title: "Please describe why there is so much shadow.",
  conditions: [{
    // show if 2 and shadow are selected in their respective fields
    occlusion: ['2'],
    occlusion_detail: ['shadow'],
  }],
},
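To make the condition semantics in the example above concrete, here is a small sketch of how the simplest form (a flat list of choice values per dependent field) could be evaluated. It rests on assumptions: within one condition object every listed field must match, and a field is shown when any condition object in the array is satisfied. The `not` and nested-list forms from the comments are not handled, and is_visible and condition_met are hypothetical names.

```python
def condition_met(condition: dict, selected: dict) -> bool:
    """True if every dependent field has at least one of its listed choices selected."""
    return all(
        any(choice in selected.get(field_id, []) for choice in choices)
        for field_id, choices in condition.items()
    )


def is_visible(field: dict, selected: dict) -> bool:
    """Decide whether a field should be shown given the currently selected choices."""
    conditions = field.get("conditions")
    if not conditions:
        return True  # no conditions: always shown
    # Assumption: one satisfied condition object is enough to show the field.
    return any(condition_met(condition, selected) for condition in conditions)


shadow_field = {
    "type": "text",
    "field_id": "a_lot_of_shadow",
    "conditions": [{"occlusion": ["2"], "occlusion_detail": ["shadow"]}],
}
print(is_visible(shadow_field, {"occlusion": ["2"], "occlusion_detail": ["shadow"]}))
```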
TextField
Subclass of UnitField and returns a
Parameters
max_characters (integer)
The maximum number of characters allowed in the field.
show_word_counter (boolean)
Set to true to display a word count in the text field.
show_markdown_preview (boolean)
Set to true to enable a Markdown preview for the text field.
max_tokens (integer)
The maximum number of words allowed in a text response. For example, `max_tokens = 1000` limits a response to 1000 words.
min_tokens (integer)
The minimum number of words required in a text response. For example, `min_tokens = 100` requires a response of at least 100 words.
disable_pasting (boolean)
Set to true to disable copying and pasting into the text field.
Example
{
"type": "text",
"field_id": "summary",
"title": "Summary",
"min_responses_required": 1,
"max_responses_required": 3,
"max_characters": 500,
"required": true
}
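The length limits on a TextField can be pre-checked against a candidate response. The sketch below assumes that a "token" means a whitespace-separated word, as the parameter descriptions suggest; check_text_response is a hypothetical helper and may not match how the platform counts words.

```python
def check_text_response(field: dict, text: str) -> list[str]:
    """Compare a response against max_characters, min_tokens and max_tokens."""
    problems = []
    word_count = len(text.split())  # assumption: words are whitespace-separated

    max_characters = field.get("max_characters")
    if max_characters is not None and len(text) > max_characters:
        problems.append("too many characters")

    min_tokens = field.get("min_tokens")
    if min_tokens is not None and word_count < min_tokens:
        problems.append("too few words")

    max_tokens = field.get("max_tokens")
    if max_tokens is not None and word_count > max_tokens:
        problems.append("too many words")

    return problems


field = {"type": "text", "field_id": "summary", "title": "Summary", "max_characters": 500}
print(check_text_response(field, "A short summary."))
```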
BooleanField
Subclass of UnitField and returns a
Example
{
"type": "boolean",
"field_id": "availability",
"title": "Item Availability",
"description": "Choose true if available."
}
NumberField
Subclass of UnitField and returns a
Parameters
use_slider (boolean)
Set to true to use a slider instead of a textbox.
min (float)
Sets the minimum value of the slider.
max (float)
Sets the maximum value of the slider.
step (float)
Sets the step value of the slider.
prefix (string)
A string label for the lowest numerical value response.
suffix (string)
A string label for the greatest numerical value.
mid_label (string)
A string label for the middle numerical value.
Example
{
"type": "number",
"field_id": "item_price",
"title": "Item Price",
"description": "Leave empty if not applicable.",
"required": false,
"use_slider": true,
"min": 0,
"max": 100
}
DatetimeField
Subclass of UnitField and returns a
Definition: DatetimeSpec
An enum that consists of
Definition: DatetimeAnnotation
An interface that contains optional number fields including
Parameters
include (array of objects, required)
An array of DatetimeSpec elements. Must contain at least one element.
Example
{
"type": "datetime",
"field_id": "release_date",
"title": "Date of Product Release",
"description": "Leave empty if not applicable.",
"include": ["year", "month", "day"],
"defaults": {
"year": 2021,
"month": 4,
"day": 13
}
}
CategoryField
Subclass of UnitField and returns an array of selected
Parameters
choices (array of objects, required)
An array of CategoryChoice elements to define the relevant choice.
min_choices (integer)
Minimum number of choices to select.
max_choices (integer)
Maximum number of choices to select. If this value is greater than 1, the form renders checkboxes. Otherwise, it renders radio buttons.
CategoryChoice
label (string, required)
The label of the choice field. This description may change among tasks within a project.
CategoryChoiceValue (array of objects)
The value of the choice field. Must be a string, number, or boolean.
hint (string)
The tooltip text shown for this choice.
subchoices (array of objects)
An array of CategoryChoice elements to define the relevant subchoices.
Example
{
  "type": "category",
  "field_id": "genre",
  "title": "Select all genres that apply.",
  "choices": [
    {
      "label": "Hip-Hop/Rap",
      "value": "hip-hop-rap",
      "hint": "It consists of a stylized rhythmic music that commonly accompanies rapping, a rhythmic and rhyming speech that is chanted.",
      "subchoices": [
        { "label": "Dirty South", "value": "dirty-south" },
        { "label": "Industrial Hip Hop", "value": "industrial-hip-hop" },
        { "label": "Nerdcore", "value": "nerdcore" },
        { "label": "Rap", "value": "rap" }
      ]
    },
    {
      "label": "R&B/Soul",
      "value": "rb-soul",
      "subchoices": [
        { "label": "Disco", "value": "disco" },
        { "label": "Funk", "value": "funk" },
        { "label": "Motown", "value": "motown" }
      ]
    }
  ],
  "min_choices": 1,
  "max_choices": 5
}
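Because choices can nest through subchoices, it can help to flatten the tree when checking a selection against min_choices and max_choices. The sketch below is illustrative only: iter_choice_values and validate_selection are hypothetical helpers, and it assumes that both parent choices and subchoices are selectable, which the text above does not confirm.

```python
def iter_choice_values(choices):
    """Yield the value of every choice and, recursively, of every subchoice."""
    for choice in choices:
        yield choice["value"]
        yield from iter_choice_values(choice.get("subchoices", []))


def validate_selection(field: dict, selected: list) -> list[str]:
    """Check selected values against the field's choices and choice-count bounds."""
    problems = []
    valid_values = set(iter_choice_values(field["choices"]))
    for value in selected:
        if value not in valid_values:
            problems.append(f"unknown choice: {value}")

    # Bounds are only enforced when the field defines them.
    min_choices = field.get("min_choices")
    max_choices = field.get("max_choices")
    if min_choices is not None and len(selected) < min_choices:
        problems.append(f"select at least {min_choices}")
    if max_choices is not None and len(selected) > max_choices:
        problems.append(f"select at most {max_choices}")
    return problems


genre_field = {
    "type": "category",
    "field_id": "genre",
    "choices": [
        {"label": "Hip-Hop/Rap", "value": "hip-hop-rap",
         "subchoices": [{"label": "Rap", "value": "rap"}]},
        {"label": "R&B/Soul", "value": "rb-soul",
         "subchoices": [{"label": "Funk", "value": "funk"}]},
    ],
    "min_choices": 1,
    "max_choices": 5,
}
print(validate_selection(genre_field, ["rap", "funk"]))
```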
TimerangeField
Subclass of UnitField.
Parameters
default_seconds (array of integers, required)
Must have length 2, and each value must be in the range [0, 24 * 60 * 60].
increment_seconds (integer)
Must be between 1 and 60 * 60.
default_from_field (string)
Must be a valid field_id.
Example
{
"type": "time_range",
"field_id": "hours",
"title": "Store Hours",
"defaults_seconds": [
28800,
72000
],
"increment_seconds": 300,
"max_responses_required": 2,
"min_responses_required": 0
}
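Time ranges are expressed in seconds since midnight, so 28800 and 72000 in the example above correspond to 08:00 and 20:00. The sketch below checks the documented bounds and converts seconds to a readable clock time. validate_time_range and format_seconds are hypothetical helpers, and the bounds are treated as inclusive, which is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def validate_time_range(default_seconds, increment_seconds=None) -> list[str]:
    """Check the documented constraints for a time range field."""
    errors = []
    if len(default_seconds) != 2:
        errors.append("default_seconds must have length 2")
    if any(not 0 <= value <= SECONDS_PER_DAY for value in default_seconds):
        errors.append("default_seconds values must be in [0, 86400]")
    if increment_seconds is not None and not 1 <= increment_seconds <= 60 * 60:
        errors.append("increment_seconds must be between 1 and 3600")
    return errors


def format_seconds(seconds: int) -> str:
    """Render seconds since midnight as HH:MM."""
    hours, remainder = divmod(seconds, 3600)
    return f"{hours:02d}:{remainder // 60:02d}"


print(validate_time_range([28800, 72000], 300))
print(format_seconds(28800), "to", format_seconds(72000))
```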
SelectField
Subclass of UnitField.
Parameters
choices (array of objects)
An array of selectable options. choices is not required if choices_from_field is present.
choices_from_field (string)
Must be a valid field_id.
Example
{
  "type": "select",
  "field_id": "sentiment",
  "title": "Sentiment",
  "description": "Choose a sentiment that best describes this text",
  "required": true,
  "choices_from_field": "Options"
}
RankingField
Returns a
Parameters
title (string)
A brief description about what the response should be. This may change among tasks within a project.
hint (string)
An array of child UnitField and FieldSet objects. Must contain at least 2 elements.
first_label (string)
Determines whether or not all.
last_label (string)
num_items_to_rank (integer)
The number of options required to rank (can be less than the number of attachments).
required (boolean)
Determines whether or not all num_items_to_rank fields should be filled.
Example
{
"type": "ranking_order",
"field_id": "relevance_ranking",
"title": "Rank titles based on their relevance to the article",
"hint": "From the most relevant to the least one",
"first_label": "Best",
"last_label": "Worst",
"num_items_to_rank": 3
}
FormField
Returns a
Parameters
type (string, required)
For FormField objects, this should be set to form.
field_id (string, required)
A unique identifier for the field, which should not change among tasks within a project.
title (string, required)
Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project.
description (string)
A brief description about what the response should be. This may change among tasks within a project.
fields (array of objects, required)
An array of child UnitField and FieldSet objects. Any FieldSet objects here must have inline set to true.
📘Note
FormField objects can only be located on the top level of the fields task parameter. If one FormField object is used, all the other top-level objects must also be FormField objects.
Example
{
  "type": "form",
  "field_id": "form_query",
  "title": "Query Intention",
  "fields": [
    {
      "type": "text",
      "field_id": "query_intention",
      "title": "Query Intention",
      "hint": "Please investigate the search links."
    }
  ]
}
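The note above implies an all-or-nothing rule for the top-level fields array: either every top-level object is a form, or none is. A minimal check, assuming a hypothetical helper named top_level_forms_consistent:

```python
def top_level_forms_consistent(fields: list) -> bool:
    """True if the top-level fields are either all FormField objects or none are."""
    form_count = sum(1 for field in fields if field.get("type") == "form")
    return form_count in (0, len(fields))


fields = [
    {"type": "form", "field_id": "form_query", "title": "Query Intention", "fields": []},
    {"type": "text", "field_id": "notes", "title": "Notes"},
]
print(top_level_forms_consistent(fields))  # mixes a form with a non-form field
```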
Text Collection Response Format
The
Each annotation will be of the type defined by its field above. If
📘Note
See the Callback section for more details about callbacks.
Example
{
"response": {
"annotations": {
"category_name": "Soup", //TextField
"category_items": [ //FieldSet with max_responses_required greater than one
{
"item_name": "Tom Yum Chicken Soup", //TextField
"item_price": "11.79" //NumberField
},
{
"item_name": "Tom Yum Beef Soup", //TextField
"item_price": "11.79" //NumberField
}
],
"category_metadata": { //FieldSet
"gluten_friendly": true, //BooleanField
"labels": [ //TextField with max_responses_required greater than one
"Free Range",
"All Natural"
]
}
}
},
"task_id": "5774cc78b01249ab09f089dd",
"task": {
// populated task for convenience
}
}
Text Collection Hypothesis
When creating a
In order to add pre-labels in a task using hypothesis, you’ll need to provide these in the
Verify the task response field schema for the desired task type.
Review your project taxonomy (label names, attribute conditions, annotation types, etc.).
Generate pre-labels that are formatted to match the aforementioned schema and taxonomy.
Create a task, including a hypothesis field that contains the pre-labels at the same top level as other task fields such as project and instructions.
The hypothesis format will largely mirror Scale’s task response format. In this particular task type,
The only difference between
You can find these two fields in your task taxonomy
Note: For Text types fields the response format differs from the other types. For this particular field type,
task_payload_with_hypothesis
{
...
"batch": "regular_batch_name",
"hypothesis": {
"annotations": {
"(EXAMPLE) Multiple Choice Question": {
"type": "category",
"field_id": "(EXAMPLE) Multiple Choice Question",
"response": [
[
"B"
]
]
}
}
},
...
}
task_taxonomy
{
"fields": [
{
"type": "category",
"field_id": "(EXAMPLE) Multiple Choice Question",
"title": "Which option best fits this task?",
"choices": [
{
"label": "A",
"value": "A"
},
{
"label": "B",
"value": "B"
},
{
"label": "C",
"value": "C"
}
],
"min_choices": 1,
"max_choices": 1,
"description": "Select one of the following. "
}
]
}
task_payload_with_hypothesis_text_field
{
...
"hypothesis": {
"annotations": {
"Product Description": {
"type": "text",
"field_id": "(EXAMPLE) Text Input Field",
"response": [
"Dolore in dolor occaecat deserunt ex in qui non amet est."
]
}
}
}
...
}
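Judging only from the payload examples above, a category pre-label wraps its values in a nested list while a text pre-label uses a flat list of strings. The sketch below builds a hypothesis object from a taxonomy under that reading. build_hypothesis and hypothesis_entry are hypothetical helpers, keying annotations by field_id follows the first example and is an assumption, and other field types may need a different shape.

```python
import json


def hypothesis_entry(field: dict, values: list) -> dict:
    """Build one annotation entry in the shape shown in the payload examples."""
    if field["type"] == "text":
        response = list(values)    # text fields: flat list of strings
    else:
        response = [list(values)]  # category fields: list of selected-value lists
    return {"type": field["type"], "field_id": field["field_id"], "response": response}


def build_hypothesis(fields: list, prelabels: dict) -> dict:
    """Create the hypothesis object for every field that has a pre-label."""
    annotations = {}
    for field in fields:
        field_id = field["field_id"]
        if field_id in prelabels:
            annotations[field_id] = hypothesis_entry(field, prelabels[field_id])
    return {"annotations": annotations}


taxonomy = [{"type": "category", "field_id": "(EXAMPLE) Multiple Choice Question"}]
hypothesis = build_hypothesis(taxonomy, {"(EXAMPLE) Multiple Choice Question": ["B"]})
print(json.dumps({"hypothesis": hypothesis}, indent=2))
```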
NamedEntityRecognitionLabel
Parameters
name (string, required)
A unique identifier for this label.
display_name (string)
An alias for this label to display to taskers.
description (string)
A description of what this label should represent. Displayed to taskers to improve quality.
children (array of objects)
An array of NamedEntityRecognitionLabel objects to group underneath this label. Specifying this field causes this label itself to no longer be used for labeling text spans.
attributes (object, optional)
Parameters
type (string)
Only 'select' for now.
options (array of objects)
List of select option objects.
display_name (string)
Optional display name.
description (string)
Optional description.
Parameters
value (string)
The value that will show up in the response if this option is selected.
display_name (string)
Optional display name if different from the value.
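Since a label that specifies children is no longer used for labeling text spans itself, the labels a tasker can actually apply are the leaves of the label tree. A short sketch of collecting them, with usable_label_names as a hypothetical helper:

```python
def usable_label_names(labels: list) -> list[str]:
    """Collect the names of labels that can be applied to text spans (tree leaves)."""
    names = []
    for label in labels:
        children = label.get("children")
        if children:
            # A label with children only groups them and is not itself usable.
            names.extend(usable_label_names(children))
        else:
            names.append(label["name"])
    return names


labels = [
    {"name": "person"},
    {"name": "event", "children": [{"name": "conference"}, {"name": "meetup"}]},
]
print(usable_label_names(labels))
```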
NamedEntityRecognitionRelationshipDefinition
A relationship can either be named or unnamed. A named relationship is useful if you need to distinguish between multiple types of relationship that could exist between the same two text spans. For instance, if you're annotating a description of someone's family history, you might want to distinguish a "child of" relationship from a "sibling of" relationship.
A task can only specify one type of relationship. Either all the relationships in a task must be named, or all must be unnamed.
Parameters
name (string)
A unique identifier for this type of relationship. Required for named relationships; disallowed for unnamed relationships.
display_name (string)
A description for this relationship to display to taskers. Should be able to be used to construct a short phrase describing the relationship. For example, a relationship between two text spans "A" and "B" with display_name "is parent of" would be rendered to taskers as "A is parent of B". Required for named relationships; disallowed for unnamed relationships.
is_directed (boolean)
A field indicating whether the directionality of this relationship matters. For example, an "is parent of" relationship would likely be directed, whereas an "is sibling of" relationship would likely not be directed. Optional for named relationships; disallowed for unnamed relationships.
source_label (string)
A string referencing the name field of a NamedEntityRecognitionLabel object. If set, mandates that the source text span of this field must be labeled with the corresponding NamedEntityRecognitionLabel, or one of its children. Optional for both named and unnamed relationships.
target_label (string)
A string referencing the name field of a NamedEntityRecognitionLabel object. If set, mandates that the target text span of this field must be labeled with the corresponding NamedEntityRecognitionLabel, or one of its children. Optional for both named and unnamed relationships.
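The named and unnamed rules above can be checked mechanically. The sketch below treats a definition as named when it carries a name key, which is an assumption about how the two kinds are told apart; validate_relationship_definitions is a hypothetical helper.

```python
def validate_relationship_definitions(definitions: list) -> list[str]:
    """Check that definitions are consistently named or unnamed and well formed."""
    errors = []
    named_flags = ["name" in definition for definition in definitions]
    if any(named_flags) and not all(named_flags):
        errors.append("all relationships must be named, or all must be unnamed")

    for index, definition in enumerate(definitions):
        if "name" in definition:
            # Named relationships also require a display_name.
            if "display_name" not in definition:
                errors.append(f"definition {index}: display_name is required")
        else:
            # Unnamed relationships may not carry display_name or is_directed.
            for key in ("display_name", "is_directed"):
                if key in definition:
                    errors.append(f"definition {index}: {key} is disallowed")
    return errors


definitions = [
    {"name": "child_of", "display_name": "is child of", "is_directed": True},
    {"name": "sibling_of", "display_name": "is sibling of", "is_directed": False},
]
print(validate_relationship_definitions(definitions))
```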
Named Entity Recognition Callback Format
The
The structure of a response object for named entity recognition consists of two arrays: one for entity annotations and another for relationships between these entities.
The format for an individual entity annotation within the named entity recognition response, detailing the unique identifier, position, and content of the recognized text span, as well as its label and any optional attributes.
NamedEntityRecognitionRelationship
In tasks with undirected relationships, the
Example
{
  "annotations": [
    {
      "id": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "start": 10,
      "end": 17,
      "text": "Alex Wang",
      "label": "person"
    },
    {
      "id": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "start": 22,
      "end": 31,
      "text": "Transform",
      "label": "conference"
    }
  ],
  "relationships": [
    {
      "id": "ade8e9e9-ef9c-4fc7-9517-62d79a15c1cb",
      "source_ref": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "target_ref": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "name": "speaker_at"
    }
  ]
}
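When consuming a response like the one above, a quick sanity check is that every relationship points at annotation ids that exist in the same response. The sketch below assumes the response has already been parsed into a dict; dangling_relationships is a hypothetical helper, and the ids are shortened for readability.

```python
def dangling_relationships(response: dict) -> list[str]:
    """Return ids of relationships whose source_ref or target_ref is unknown."""
    annotation_ids = {annotation["id"] for annotation in response["annotations"]}
    return [
        relationship["id"]
        for relationship in response["relationships"]
        if relationship["source_ref"] not in annotation_ids
        or relationship["target_ref"] not in annotation_ids
    ]


response = {
    "annotations": [
        {"id": "a1", "start": 10, "end": 17, "text": "Alex Wang", "label": "person"},
        {"id": "a2", "start": 22, "end": 31, "text": "Transform", "label": "conference"},
    ],
    "relationships": [
        {"id": "r1", "source_ref": "a1", "target_ref": "a2", "name": "speaker_at"},
    ],
}
print(dangling_relationships(response))
```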
NamedEntityRecognitionResponse
Field | Type | Description |
---|---|---|
annotations | object array | List of |
relationships | object array | List of |
NamedEntityRecognitionAnnotation
Field | Type | Description |
---|---|---|
id | string | Unique identifier. |
start | number | Start index of the text span. |
end | number | End index of the text span. |
text | string | Text of the text span. |
label | string | References the |
attributes (optional) | object | The keys of the object reference keys of the |
NamedEntityRecognitionRelationship
Field | Type | Description |
---|---|---|
id | string | Unique identifier. |
source_ref | string | References the |
target_ref | string | References the |
name (optional) | string | References the |