Project Setup Glossary

Refer to our glossaries to understand the full extent of Rapid's product offering. When you set up your project, we offer a variety of options and settings to cater to your data needs.

Select a template or custom project

Template Project

To search for a pre-built template, you can type keywords into the search bar at the top of the page describing what you’re looking for. You can also use the Data Type and Use Case filters to find a template that best fits your labeling needs.

If you are not seeing a template that fits your needs, you can either create a custom project or request a template by uploading sample data.

Custom Project

You can also create a custom project. Based on the data type you upload, you can select from a variety of use cases.

For each use case, we also have various project settings that you can configure.

Upload your data

There are six options for uploading data:

  • Upload from computer

  • Upload from CSV

  • Upload from a previous project

  • Upload from AWS S3

  • Upload from Google Cloud Storage

  • Upload from Microsoft Azure

Option 1: Upload from Computer

Select files from your computer to upload. For faster uploading, consider the other options that allow for asynchronous uploading. Please upload no more than 1,000 files at a time.

Option 2: Upload from CSV

To upload from a local CSV file, you need to include either a column named "attachment_url" or a column named "text". The "attachment_url" column should contain publicly accessible remote URLs for your data.

Each "attachment_url" is used to fetch the corresponding website or file. For websites, the URL is displayed as a link for taskers to open.

The values in the "text" column can be used for raw text or markdown for text-based projects (API task type: textcollection). We recommend also adding some step-by-step instructions as part of the markdown. You can use our Task Interface Customization tool to help format and generate this column.

We also support iframes as input in the "text" column; for example, you can embed an iframe to a native app for taskers to interact with. You can additionally provide an optional "metadata" column to store extra data as JSON. If there are more than 200 assets in the upload, the upload will run in the background.
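If you prefer to generate the upload CSV programmatically, here is a minimal sketch using Python's standard library. The column names ("attachment_url", "text", and "metadata") are the ones described above; the file name, URLs, and metadata values are hypothetical placeholders.

Python

import csv
import json

# Each row is one asset; "attachment_url" must be a publicly accessible URL.
# For text-based projects, use a "text" column (raw text or markdown) instead.
rows = [
    {"attachment_url": "https://example.com/assets/image_001.jpg",
     "metadata": json.dumps({"batch": "2024-01", "camera": "front"})},
    {"attachment_url": "https://example.com/assets/image_002.jpg",
     "metadata": json.dumps({"batch": "2024-01", "camera": "rear"})},
]

with open("rapid_upload.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["attachment_url", "metadata"])
    writer.writeheader()    # header row: attachment_url,metadata
    writer.writerows(rows)  # one row per asset

Serializing the optional "metadata" column with json.dumps ensures it contains valid JSON.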

CSV file with metadata
Download Sample CSV

CSV file without metadata
Download Sample CSV

CSV file for website attachments
Download Sample CSV

CSV file with markdown
Download Sample CSV

CSV file with multiple attachments
Download Sample CSV

CSV format for multiple text attachments

CSV format for multiple attachment URLs

Option 3: Upload from a Previous Project

Import data from a previous Rapid project.

Option 4: Upload from AWS S3

Provide an S3 bucket and an optional prefix (folder path) and we will import the data directly. Note that you need to give us permission to do so. Check this document for instructions on setting up the permissions. You will need to grant permissions for the 'GetObject' and 'ListBucket' actions. Additionally, AWS uploads are capped at 5,000 files.

We have also included instruction videos for setting up IAM delegated access.

Creating a role and its corresponding policy. You can see which values to input for the role on the Scale integrations page. You can also specify which resources to use with more granularity.

Assigning the read-only policy to the Scale integration role.


Be sure to update your account in the integration settings afterwards!

This is how a sample policy may look.

JSON

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "scales3access",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::YOUR_BUCKET_NAME/*",
                "arn:aws:s3:::YOUR_BUCKET_NAME"
            ]
        }
    ]
}
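Before connecting the integration, it can be useful to confirm that the bucket actually allows both actions for the credentials you plan to delegate. Below is a minimal local sanity check using boto3; the profile name, bucket, and prefix are hypothetical placeholders, and this script is not part of the Scale setup itself.

Python

import boto3

# Assumed local AWS profile used only for this check.
session = boto3.Session(profile_name="scale-delegation-test")
s3 = session.client("s3")

bucket = "YOUR_BUCKET_NAME"
prefix = "training-data/"  # optional folder path, as in the upload form

# ListBucket: enumerate a few keys under the prefix.
resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=5)
keys = [obj["Key"] for obj in resp.get("Contents", [])]
print("Sample keys:", keys)

# GetObject: HeadObject requires s3:GetObject, so this confirms object-level access.
if keys:
    head = s3.head_object(Bucket=bucket, Key=keys[0])
    print("First object size:", head["ContentLength"])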

Option 5: Upload from Google Cloud Storage

Provide a Google Cloud Storage bucket and an optional file prefix / delimiter and we will import the data directly. Note that you need to give us permission to do so. Check this document for instructions on setting up the permissions. You will need to set up permissions for Storage Legacy Bucket Reader or Storage Bucket Data Reader. Please ensure the bucket does not contain both files and folders.

We have also included instruction videos for setting up service account impersonation.

Adding a service account. Replace {uuid} with the value given to you on the Scale integrations page.

Adding bucket permissions for the service account.


Be sure to update your account in the integration settings afterwards!
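As with S3, you may want to sanity-check read access before importing. Below is a minimal sketch using the google-cloud-storage client with whatever credentials you are testing locally; the bucket name and prefix are hypothetical placeholders.

Python

from google.cloud import storage

client = storage.Client()  # uses your locally configured credentials
bucket_name = "your-gcs-bucket"
prefix = "training-data/"  # optional prefix, as in the upload form

# Listing requires bucket-level read access.
blobs = list(client.list_blobs(bucket_name, prefix=prefix, max_results=5))
print("Sample objects:", [b.name for b in blobs])

# Downloading one object confirms object-level read access.
if blobs:
    data = blobs[0].download_as_bytes()
    print("First object size:", len(data))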

Option 6: Upload from Microsoft Azure

Provide an Azure account name, container, and an optional prefix and we will import the data. Check this document for instructions on setting up the permissions. You may need to grant the Storage Blob Data Reader role.
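You can run a similar read-only check against the container before importing. The sketch below uses azure-identity and azure-storage-blob; the account name, container, and prefix are hypothetical placeholders, and the credential must belong to an identity that has been granted Storage Blob Data Reader.

Python

from itertools import islice

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

account = "youraccountname"   # placeholder storage account name
container = "your-container"  # placeholder container name
prefix = "training-data/"     # optional prefix, as in the upload form

service = BlobServiceClient(
    account_url=f"https://{account}.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)
container_client = service.get_container_client(container)

# Listing a few blobs confirms read access to the container.
blobs = list(islice(container_client.list_blobs(name_starts_with=prefix), 5))
print("Sample blobs:", [b.name for b in blobs])

# Downloading one blob confirms object-level read access.
if blobs:
    data = container_client.download_blob(blobs[0].name).readall()
    print("First blob size:", len(data))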

Build your taxonomy

After selecting a use case or template, you can build your taxonomy from the available set of labels.

If you are using our visual editor, you will be able to add labels and attributes from the dropdown menu. You can remove labels by clicking on the trash icon next to each label. If you would rather define your task with JSON according to our API docs, you can copy and paste the JSON directly into the JSON editor. Note that you can switch between the visual and JSON editors as you are building your taxonomy.


A common customer workflow is to design the task interface in the visual editor and then use the resulting JSON when creating tasks via the API.
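For the API side of this workflow, the sketch below shows the general shape of creating a task with an HTTP request, using the textcollection task type mentioned earlier. The payload fields, project name, and batch name are illustrative placeholders; consult the API docs for the exact schema your project and task type expect.

Python

import requests

API_KEY = "live_..."  # your Scale API key (placeholder)

# Illustrative payload only; field names and values depend on your project setup.
payload = {
    "project": "my-rapid-project",
    "batch": "calibration-batch-1",
    "attachments": [
        {"type": "text", "content": "Text for the labeler to review."}
    ],
}

resp = requests.post(
    "https://api.scale.com/v1/task/textcollection",  # task type from this guide
    json=payload,
    auth=(API_KEY, ""),  # HTTP Basic auth with the API key as the username
)
resp.raise_for_status()
print(resp.json())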


As you are building your taxonomy, you can also see what your task looks like by clicking on the Task Preview. In this preview, you will also be able to replicate the labeler’s experience when they interact with your taxonomy.


Depending on the use case, you will also be able to configure label-specific settings. For instance, when building an Object Detection taxonomy, you can configure the minimum height and width of boxes in the “Label Settings” of each box label.



General Taxonomy Recommendations

  • Try to lock down your taxonomy as early as possible. Updating your taxonomy within a project after you have already launched a production batch can be complicated since you will need to update your instructions, labeled examples, and quality tasks. To learn more, you can refer to this page.

  • Make sure to keep your taxonomy as easy to understand as possible. For instance, label names should be straightforward and make logical sense. A box drawn around a dog should have a label name of “dog” and not “puppy.”

  • Make sure to keep your taxonomy as simple as possible. It is harder to achieve quality if you have a complicated annotation use case that involves over 20 different classes of labels or a large tree selection with over 20 different options. Reducing the size and complexity of your taxonomy will also reduce labeler cognitive overload. If you are unable to reduce the size of your taxonomy, we highly recommend using our taxonomy chunking pipeline.

Write your instructions

High quality instructions unlock high quality labels! We recommend investing time into writing clear and comprehensive instructions in order to ensure labelers are aligned with your labeling goals.

After creating your taxonomy, we auto-generate an instructions outline that you can populate. We highly recommend that you fill in every section that is generated:

  • Summary: Provide labelers an introduction to your task. You can also use this section to give any other context that would be useful for the labeler when annotating your data. For example, what type of scenery will they see? How many frames will the labeler see in each task? What types of objects will they need to look for? What is unusual about this task?

  • Workflow: In this section, you should write a step-by-step guide on how to complete a task. Some ideas to consider: What should the labeler notice first about the task? What sort of deductive reasoning may the labeler encounter? Will a single annotation affect any other annotations?

  • Rules: You can use the Rules section to describe annotation rules that apply to multiple labels or attributes. For each rule, you will be prompted to fill out a Description explaining the rule, and you can apply the rule to different labels. You will also be prompted to add well-labeled and poorly labeled examples.

  • Label / Attribute / Field: You can use the label-/attribute-/field-specific sections to describe rules that uniquely apply to the label/attribute/field. Again, you will be prompted to fill out a Description and to add well-labeled and poorly labeled examples.

Adding examples to the Rule, Label, and Attribute sections: You can add various example blocks to separate different groups of examples from each other under the same annotation rule. Each example block includes a description for the block of examples and two columns for well-labeled examples and poorly labeled examples. You can use the side-by-side nature of the examples to help highlight differences between correct and incorrect labeling.

Each column can contain multiple examples. When you click on “Add another example,” you can either select an asset that you have uploaded already or upload a new asset. Then, you can label the task and mark it as either a well-labeled or poorly labeled example.

You can also create instructions examples while you’re auditing. Read here for more details.

Writing instructions is very important! We highly recommend that you dedicate time to writing good instructions in order to achieve high quality on the platform. For more tips, read here.

Launch a batch

The final step is to launch a batch of tasks for labeling. There are three types of batches:

Self-label batch

To test your taxonomy setup or experience labeling on the Rapid platform, you can create a self-label batch, in which the data is labeled by you or your team members.

Calibration batch

It is important to be able to iterate on your taxonomy setup and instructions. To do this, you can create a calibration batch, a smaller set of tasks that you send to the Scale workforce for labeling. Within only a few hours, you will generally receive labeler-written feedback on your instructions along with the fully delivered batch of tasks, which you can use to create quality tasks. We pride ourselves on our quick turnaround time, which facilitates quick experimentation and iteration.

Note that calibration batch tasks go through fewer quality controls, and low calibration batch quality does not necessarily reflect production batch quality. These quality controls include instruction-reading checks (e.g., labelers must spend a certain amount of time on the instructions before continuing) and management of a special pool of trusted labelers. The purpose of launching calibration batches is to iterate on and improve your taxonomy and instructions while building your suite of quality tasks.

You can use your Calibration Score to gauge how well labelers are able to understand your instructions and label your data. We generally recommend achieving a Calibration Score of at least 80% before proceeding to production. You can learn more about your Calibration Score and how to improve it here.

To read more about calibration batches and the ideal calibration workflow, you can go here.

Production batch

After launching a few calibration batches, iterating on your taxonomy and instructions, and building your quality task suite, you will be ready to scale to production volumes. You can launch production batches, which are larger sets of data, to the Scale workforce for labeling.

When labelers first onboard onto your project, they must read through your instructions and complete your training tasks to check their understanding of your instructions. Then, before they touch your production data, we serve them a few of your evaluation tasks that they must perform well on in order to proceed. If they do not perform well, we screen them off your project. Note that labelers do not know that we are checking their knowledge on the backend. Those who pass this diagnostic can then continue labeling on your project.

As they are labeling your production data, we periodically check their performance by serving evaluation tasks. Those who do not perform well may be demoted from reviewers to attempters or may be screened off your project. Similarly, those who do well may be promoted from attempters to reviewers. To read more about training and evaluation tasks, which we refer to together as quality tasks, you can refer here.
