The dataset comprises the following information, captured and synchronized at 10 Hz:
Raw (unsynced+unrectified) and processed (synced+rectified) grayscale stereo sequences (0.5 megapixels, stored in PNG format)
Raw (unsynced+unrectified) and processed (synced+rectified) color stereo sequences (0.5 megapixels, stored in PNG format)
3D Velodyne point clouds (100k points per frame, stored as binary float matrix)
3D GPS/IMU data (location, speed, acceleration, meta information, stored as text file)
Calibration (Camera, Camera-to-GPS/IMU, Camera-to-Velodyne, stored as text file)
3D object tracklet labels (cars, trucks, trams, pedestrians, cyclists, stored as XML file)
Here, "unsynced+unrectified" refers to the raw input frames, where images are distorted and the frame indices do not correspond across sensors, while "synced+rectified" refers to the processed data, where images have been rectified and undistorted and the frame numbers correspond across all sensor streams. Timestamp files are provided for both settings. Most users need only the "synced+rectified" version of the files.
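As an illustration of the "binary float matrix" storage used for the Velodyne point clouds, the following is a minimal sketch of reading one frame with NumPy. It assumes each point is stored as four consecutive little-endian 32-bit floats (x, y, z, reflectance), which is not stated in the list above; the function name `load_velodyne_points` and the `num_columns` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def load_velodyne_points(path, num_columns=4):
    """Read one point-cloud file into an (N, num_columns) float32 array.

    Assumption: the file is a flat sequence of little-endian 32-bit floats,
    with num_columns values per point (x, y, z, reflectance by default).
    """
    raw = np.fromfile(path, dtype="<f4")
    if raw.size % num_columns != 0:
        # The file length does not match the assumed per-point layout.
        raise ValueError(
            f"{path}: {raw.size} floats is not a multiple of {num_columns}"
        )
    return raw.reshape(-1, num_columns)
```

Calling `load_velodyne_points(path)[:, :3]` would then give only the 3D coordinates. If the layout assumption does not hold for a given file, the size check raises an error rather than silently returning a misaligned matrix.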