nuTonomy & Scale

nuScenes by Aptiv

Large-scale open-source dataset for autonomous driving.

Overview

Support for computer vision and autonomous driving research

nuScenes is an initiative intended to support research to further advance the mobility industry.

With this goal in mind, the dataset includes 1000 scenes collected in Boston and Singapore and is the largest multi-sensor dataset for autonomous vehicles.

  • Full sensor suite: 1x LiDAR, 5x RADAR, 6x camera, IMU, GPS
  • 1000 scenes of 20s each
  • 1,440,000 camera images
  • 400,000 LiDAR sweeps
  • Two diverse cities: Boston and Singapore
  • Left versus right hand traffic
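
The image and sweep totals above follow from the per-scene figures. A minimal sketch of that arithmetic, assuming every scene lasts exactly 20s and every sensor runs at its nominal rate (12Hz cameras and 20Hz LiDAR, as listed in the Car Setup section); the function names are illustrative only:

```python
# Sanity check of the dataset totals from the per-scene figures.
NUM_SCENES = 1000
SCENE_SECONDS = 20
NUM_CAMERAS = 6
CAMERA_HZ = 12  # nominal camera capture frequency
LIDAR_HZ = 20   # nominal LiDAR capture frequency


def total_camera_images():
    """Images from all cameras over all scenes."""
    return NUM_SCENES * SCENE_SECONDS * CAMERA_HZ * NUM_CAMERAS


def total_lidar_sweeps():
    """Sweeps of the single LiDAR over all scenes."""
    return NUM_SCENES * SCENE_SECONDS * LIDAR_HZ


print(total_camera_images())  # 1440000
print(total_lidar_sweeps())   # 400000
```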
Data Collection

Careful scene planning by Aptiv

For the dataset, Aptiv managed the collection of data, carefully choosing to capture challenging scenarios and a diversity of locations, times and weather conditions.

Each of the 1000 scenes in the dataset was manually selected from data collected in Boston's Seaport and Singapore's One North, Queenstown and Holland Village districts.

Car Setup

Vehicle, Sensor and Camera Details

Two identical cars with identical sensor layouts were used to drive in Boston and Singapore.

Refer to the image below for the placement of the sensors:

    6x Cameras
    • 12Hz capture frequency
    • 1/1.8'' CMOS sensor of 1600x1200 resolution
    • Bayer8 format for 1 byte per pixel encoding
    • 1600x900 ROI is cropped from the original resolution to reduce processing and transmission bandwidth
    • Auto exposure with exposure time limited to the maximum of 20 ms
    • Images are unpacked to BGR format and compressed to JPEG
    • See camera orientation and overlap in the figure below.
    1x Spinning LiDAR
    • 20Hz capture frequency
    • 32 channels
    • 360° Horizontal FOV, +10° to -30° Vertical FOV
    • 80m-100m range, usable returns up to 70 meters, ±2 cm accuracy
    • Up to ~1.39 Million Points per Second
    5x Long range RADAR sensor
    • 77GHz
    • 13Hz capture frequency
    • Independently measures distance and velocity in one cycle using Frequency Modulated Continuous Wave
    • Up to 250m distance
    • Velocity accuracy of ±0.1 km/h
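
To make the bandwidth and point-rate figures in the sensor list concrete, here is a rough back-of-the-envelope sketch. It only uses the numbers quoted in the list and assumes raw Bayer8 frames before JPEG compression and the LiDAR running at its peak point rate; the helper name is hypothetical:

```python
CAMERA_HZ = 12
BYTES_PER_PIXEL = 1  # Bayer8: 1 byte per pixel


def raw_rate_bytes_per_s(width, height, hz=CAMERA_HZ, bpp=BYTES_PER_PIXEL):
    """Uncompressed data rate of one camera in bytes per second."""
    return width * height * bpp * hz


full_rate = raw_rate_bytes_per_s(1600, 1200)  # full sensor resolution
roi_rate = raw_rate_bytes_per_s(1600, 900)    # cropped ROI
roi_saving = 1 - roi_rate / full_rate         # fraction saved by cropping

LIDAR_HZ = 20
LIDAR_CHANNELS = 32
LIDAR_POINTS_PER_S = 1_390_000  # "up to ~1.39 million", so an upper bound

points_per_sweep = LIDAR_POINTS_PER_S / LIDAR_HZ
points_per_channel_per_sweep = points_per_sweep / LIDAR_CHANNELS

print(full_rate, roi_rate, roi_saving)  # 23040000 17280000 0.25
print(points_per_sweep)                 # 69500.0
print(points_per_channel_per_sweep)     # 2171.875
```

Under these assumptions, cropping to the 1600x900 ROI cuts the raw per-camera data rate by a quarter, and one LiDAR sweep carries at most about 69,500 points.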

Flythrough of the nuScenes Teaser

Sensor Calibration

Data alignment between sensors and cameras.

To achieve a high quality multi-sensor dataset, it is essential to calibrate the extrinsics and intrinsics of every sensor.

We express extrinsic coordinates relative to the ego frame, i.e. the midpoint of the rear vehicle axle.

The most relevant steps are described below:

  • LiDAR extrinsics
  • Camera extrinsics
  • IMU extrinsics
  • Camera intrinsic calibration
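
Once the extrinsics of a sensor are known, a point measured in the sensor frame can be mapped into the ego frame (and back) with a rigid transform. The following is a minimal sketch of that idea, not the calibration procedure itself. It assumes the extrinsics are given as a 3x3 rotation matrix and a translation vector; the yaw-only rotation, the mounting values and all function names are hypothetical and chosen only for illustration:

```python
import math


def yaw_rotation(yaw_rad):
    """3x3 rotation matrix for a rotation about the vertical (z) axis."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def sensor_to_ego(point, rotation, translation):
    """p_ego = R * p_sensor + t"""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def ego_to_sensor(point, rotation, translation):
    """Inverse transform: p_sensor = R^T * (p_ego - t)"""
    d = [point[i] - translation[i] for i in range(3)]
    return tuple(sum(rotation[j][i] * d[j] for j in range(3)) for i in range(3))


# Hypothetical sensor mounted 1.0 m ahead of and 1.8 m above the rear axle
# midpoint, rotated 90 degrees to the left.
R = yaw_rotation(math.pi / 2)
t = (1.0, 0.0, 1.8)

p_ego = sensor_to_ego((10.0, 0.0, 0.0), R, t)
print([round(v, 6) for v in p_ego])  # [1.0, 10.0, 1.8]
```

Applying `ego_to_sensor` to the result recovers the original point, which is a quick way to check that a rotation and translation pair is being used consistently.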

Sensor Synchronization

To achieve cross-modality data alignment between the LiDAR and the cameras, the exposure of each camera is triggered when the top LiDAR sweeps across the center of the camera’s FOV. This method was selected as it generally yields good data alignment. Note that the cameras run at 12Hz while the LiDAR runs at 20Hz.

The 12 camera exposures per second are spread as evenly as possible across the 20 LiDAR scans, so not all LiDAR scans have a corresponding camera frame.

Reducing the frame rate of the cameras to 12Hz helps to reduce the compute, bandwidth and storage requirements of the perception system.
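
The synchronization scheme can be illustrated with a small sketch. The exact spacing rule used for the dataset is not stated here, so the floor-based spacing below is only one plausible way to spread 12 exposures over 20 scans. The trigger offset additionally assumes a constant LiDAR rotation speed and a camera azimuth measured in the direction of rotation from the start of the sweep; both function names are hypothetical:

```python
LIDAR_HZ = 20
CAMERA_HZ = 12


def scans_with_camera_frame(lidar_hz=LIDAR_HZ, camera_hz=CAMERA_HZ):
    """Indices of the LiDAR scans (within one second) that get a camera frame,
    assuming exposures are spread evenly and snapped down to a scan index."""
    return [(k * lidar_hz) // camera_hz for k in range(camera_hz)]


def trigger_offset_ms(camera_azimuth_deg, lidar_hz=LIDAR_HZ):
    """Time after the start of a LiDAR rotation at which the beam crosses
    the center of a camera's FOV, assuming constant rotation speed."""
    period_ms = 1000.0 / lidar_hz  # 50 ms per rotation at 20Hz
    return (camera_azimuth_deg % 360.0) / 360.0 * period_ms


paired = scans_with_camera_frame()
unpaired = [i for i in range(LIDAR_HZ) if i not in paired]

print(paired)                 # [0, 1, 3, 5, 6, 8, 10, 11, 13, 15, 16, 18]
print(unpaired)               # [2, 4, 7, 9, 12, 14, 17, 19]
print(trigger_offset_ms(90))  # 12.5
```

Under this spacing, 8 of every 20 LiDAR scans have no matching camera frame, which is the situation the text describes.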

Data Annotation

Complex Label Taxonomy

As Aptiv's partner in developing nuScenes, Scale contributed two things. The first is the data annotation, which included deciding on the taxonomy, developing instructions for labelers and managing QA.

The second is Scale's web-based visualizer for exploring the dataset's LiDAR and camera data. The visualizer allows point cloud data to be easily embedded into any webpage and shared.


Get Started with nuScenes

Ready to get started with nuScenes? This tutorial will give you an overview of the dataset without the need to download it. Please note that this page is a rendered version of a Jupyter Notebook.