Open Datasets

PandaSet

A public large-scale dataset for autonomous driving provided by Hesai & Scale. It enables researchers to study challenging urban driving situations using the full sensor suite of a real self-driving car.

The full PandaSet will be available for download soon.

  • Scene #1
  • Scene #2
  • Scene #3
  • Scene #4
  • Scene #5 (coming soon)
  • Scene #6 (coming soon)
Overview

Sophisticated LiDAR technology meets high quality data annotation

PandaSet aims to promote and advance research and development in autonomous driving and machine learning.

Combining Hesai’s best-in-class LiDAR sensors with Scale’s high-quality data annotation, PandaSet is the first public dataset to feature solid-state LiDAR (PandarGT) and point cloud segmentation (Sensor Fusion Segmentation).

It features:

  • 60,000 camera images
  • 20,000 LiDAR sweeps
  • 125 scenes of 8s each
  • 28 annotation classes
  • 37 semantic segmentation labels
  • Full sensor suite: 1x mechanical LiDAR, 1x solid-state LiDAR, 6x cameras, On-board GPS/IMU
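
To make these numbers concrete, here is a minimal browsing sketch assuming the companion pandaset-devkit’s DataSet/Sequence interface; the data path, scene ID "002", and camera name "front_camera" are illustrative and worth verifying against the devkit’s documentation.

```python
# Minimal sketch of browsing PandaSet, assuming the pandaset-devkit's
# DataSet/Sequence interface; path, scene ID and camera name are illustrative.
from pandaset import DataSet

dataset = DataSet("/data/pandaset")   # root directory, one folder per scene
sequence = dataset["002"]             # 125 scenes, keyed by zero-padded IDs
sequence.load()                       # loads LiDAR, cameras, GPS/IMU, annotations

# 8 s at 10 Hz gives ~80 sweeps per LiDAR per scene
# (125 scenes x 80 sweeps x 2 LiDARs = 20,000 sweeps).
points = sequence.lidar[0]                   # frame 0 as a pandas DataFrame
image = sequence.camera["front_camera"][0]   # frame 0 from one of the 6 cameras
print(len(points), image.size)
```
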
Data Collection

Complex Driving Scenarios in Urban Environments

For PandaSet we carefully planned routes and selected scenes that would showcase complex urban driving scenarios, including steep hills, construction, dense traffic and pedestrians, and a variety of times of day and lighting conditions in the morning, afternoon, dusk and evening.

PandaSet scenes were selected from two routes in Silicon Valley.

Car Setup

Vehicle, Sensor and Camera Details

A single vehicle carrying the full sensor suite was used to collect data on the two Silicon Valley routes described above.

The placement and specifications of the sensors are detailed below:

5x Wide Angle Cameras
  • 10 Hz capture frequency
  • 1/2.7” CMOS sensor of 1920x1080 resolution
  • Images are unpacked to YUV 4:4:4 format and compressed to JPEG

1x Long Focus Camera
  • 10 Hz capture frequency
  • 1/2.7” CMOS sensor of 1920x1080 resolution
  • Images are unpacked to YUV 4:4:4 format and compressed to JPEG

1x Pandar64 (Mechanical Spinning LiDAR)
  • 64 channels
  • 200m range @ 10% reflectivity
  • 360° horizontal FOV; 40° vertical FOV (-25° to +15°)
  • 0.2° horizontal angular resolution (10 Hz); 0.167° vertical angular resolution (finest)
  • 10 Hz capture frequency

1x PandarGT (Solid-State LiDAR)
  • Equivalent to 150 channels at 10 Hz
  • 300m range @ 10% reflectivity
  • 60° horizontal FOV; 20° vertical FOV (-10° to +10° with ±5° offset, configurable)
  • 0.1° horizontal angular resolution; 0.07° vertical angular resolution (finest) at 10 Hz
  • 10 Hz capture frequency
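
As a quick sanity check on these specifications, the Pandar64’s channel count and horizontal angular resolution bound the number of points per sweep. The arithmetic below is a back-of-the-envelope sketch; real sweeps contain fewer points, since beams with no return produce no point.

```python
# Upper-bound point budget for the Pandar64 from the specs above,
# assuming every azimuth step yields a return on all 64 channels.
channels = 64
horizontal_fov_deg = 360.0
horizontal_res_deg = 0.2                           # at 10 Hz
columns = horizontal_fov_deg / horizontal_res_deg  # 1800 columns per sweep
points_per_sweep = int(channels * columns)         # 115,200 points
print(f"{points_per_sweep:,} points/sweep, "
      f"{points_per_sweep * 10:,} points/s at 10 Hz")
```
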

PandarGT Road Test


Sensor Calibration

Data alignment between sensors and cameras.

To achieve a high quality multi-sensor dataset, it is essential to calibrate the extrinsics and intrinsics of every sensor.

We express extrinsic coordinates relative to the ego frame, i.e., the midpoint of the rear vehicle axle.

The most relevant steps are described below; a projection sketch using the resulting calibration follows the list:

  • LiDAR extrinsics
  • Camera extrinsics
  • IMU extrinsics
  • Camera intrinsic calibration
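
With the extrinsics and intrinsics calibrated, LiDAR points can be projected into a camera image to verify sensor alignment. The sketch below shows the standard pinhole projection chain; the 4x4 extrinsic and the 3x3 intrinsic matrix K are illustrative placeholders, not PandaSet’s published calibration values.

```python
# Sketch of the standard pinhole projection chain used to check
# LiDAR-to-camera alignment; the matrices below are placeholders.
import numpy as np

def project_to_image(points_ego: np.ndarray,
                     T_cam_from_ego: np.ndarray,
                     K: np.ndarray) -> np.ndarray:
    """Project Nx3 ego-frame points to Nx2 pixel coordinates."""
    # Homogeneous coordinates: append a column of ones.
    ones = np.ones((points_ego.shape[0], 1))
    pts_h = np.hstack([points_ego, ones])          # N x 4
    # Ego frame -> camera frame via the camera extrinsic.
    pts_cam = (T_cam_from_ego @ pts_h.T).T[:, :3]  # N x 3
    # Pinhole projection: divide by depth (keep only points with z > 0
    # in real use), then apply the intrinsic matrix.
    uv = (K @ (pts_cam / pts_cam[:, 2:3]).T).T     # N x 3
    return uv[:, :2]

# Toy example: identity extrinsic, a plausible 1920x1080 intrinsic.
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
pixels = project_to_image(np.array([[2.0, 0.5, 10.0]]), np.eye(4), K)
print(pixels)  # a point 10 m ahead of the camera lands near image center
```
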
Data Annotation

Complex Label Taxonomy

Scale’s data annotation platform combines human work and review with smart tools, statistical confidence checks and machine learning checks to ensure the quality of annotations.

The resulting accuracy is consistently higher than what a human or a synthetic labeling approach can achieve independently, as measured against seven rigorous quality areas for each annotation.
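
For reference, here is a sketch of reading the resulting annotations for one frame, assuming the pandaset-devkit’s cuboids and semseg accessors (one pandas DataFrame per frame); the column names follow the devkit but should be verified against its documentation.

```python
# Sketch of reading annotations, assuming the pandaset-devkit's accessors;
# the path, scene ID and column names are assumptions to verify.
from pandaset import DataSet

sequence = DataSet("/data/pandaset")["002"]
sequence.load()

cuboids = sequence.cuboids[0]          # 3D boxes for LiDAR frame 0
for _, box in cuboids.head(3).iterrows():
    print(box["label"],                # one of the 28 annotation classes
          (box["position.x"], box["position.y"], box["position.z"]),
          box["yaw"])

# sequence.semseg[0] maps each LiDAR point in frame 0 to one of the
# 37 semantic segmentation labels.
```
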


Get Started with PandaSet

The full PandaSet will be available for download soon.