Joint Attention in Autonomous Driving (JAAD) Dataset

Figure: Timeline of events as the pedestrian is crossing the street, recovered from the behavioral data in the dataset.

ABOUT

JAAD is a new dataset for studying joint attention in the context of autonomous driving. The focus is on pedestrian and driver behaviors at the point of crossing and the factors that influence them. To this end, the JAAD dataset provides an annotated collection of short video clips representing scenes typical of everyday urban driving in various weather conditions.

DATA

The JAAD dataset contains 346 high-resolution video clips (most are 5-10 seconds long) extracted from approximately 240 hours of driving videos filmed in several locations in North America and Eastern Europe.

EXPLORE

Explore the JAAD dataset here.

STATS

Our dataset contains 88K frames with 2793 unique pedestrians labeled with over 390K bounding boxes. Occlusion tags are provided for each bounding box. ~55K (13%) of bounding boxes are tagged with partial occlusion and ~49K (12%) with heavy occlusion.
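
As a rough illustration of how occlusion shares like those above could be recomputed from per-box tags, here is a minimal Python sketch. The tag names (`none`, `partial`, `heavy`), the `occlusion_shares` helper, and the toy input are assumptions made for this example; they are not the dataset's actual encoding or data.

```python
from collections import Counter


def occlusion_shares(tags):
    """Return the fraction of bounding boxes carrying each occlusion tag.

    `tags` is an iterable with one hypothetical label per bounding box.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Toy input for illustration only (not real JAAD data).
toy_tags = ["none"] * 6 + ["partial"] * 3 + ["heavy"] * 1
shares = occlusion_shares(toy_tags)

for tag, share in sorted(shares.items()):
    print(f"{tag}: {share:.0%}")
```

Applied to the real annotations, the same counting would need to be adapted to however the occlusion tags are actually stored.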

Behavioral data and attributes are provided for 868 pedestrians.

Note that the statistics provided here may differ from the numbers in published papers, since the dataset has been updated several times after the paper submission deadlines.

ANNOTATIONS

There are two types of annotations provided for the dataset: textual and bounding boxes.

Textual annotations describe the behaviors of those pedestrians and cars that interact with the driver or require the driver's attention. For each video there are several tags (weather, location, whether it is a designated crossing, time of day, age and gender of the pedestrians, etc.), 3 types of subjects (driver, car, pedestrian), and timestamped behavior descriptions from a fixed list (e.g. stopped, moving fast, walking, looking, signalling, etc.).
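
The structure described above (per-video tags, three subject types, and timestamped behaviors from a fixed list) can be pictured with a small Python sketch. All class names, field names, subject ids, and values here are hypothetical and chosen for illustration; they do not reflect the actual layout of the JAAD annotation files.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three subject types named in the text.
SUBJECT_TYPES = ("driver", "car", "pedestrian")


@dataclass
class BehaviorEvent:
    subject: str    # hypothetical subject id, e.g. "pedestrian1"
    behavior: str   # label from the fixed list, e.g. "looking"
    start: float    # seconds from the start of the clip
    end: float      # seconds from the start of the clip


@dataclass
class VideoAnnotation:
    tags: Dict[str, str] = field(default_factory=dict)
    events: List[BehaviorEvent] = field(default_factory=list)

    def timeline(self, subject: str) -> List[BehaviorEvent]:
        """Return the events of one subject, ordered by start time."""
        own = [e for e in self.events if e.subject == subject]
        return sorted(own, key=lambda e: e.start)


# Toy annotation for illustration only (not real JAAD data).
video = VideoAnnotation(
    tags={"weather": "clear", "designated_crossing": "yes"},
    events=[
        BehaviorEvent("pedestrian1", "walking", 2.0, 5.0),
        BehaviorEvent("driver", "moving fast", 0.0, 1.5),
        BehaviorEvent("pedestrian1", "looking", 0.5, 1.8),
    ],
)

for event in video.timeline("pedestrian1"):
    print(f"{event.start:>4.1f}s - {event.end:>4.1f}s  {event.behavior}")
```

Sorting one subject's events by start time is one simple way to recover a per-pedestrian timeline of the kind shown in the figure at the top of the page.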

Behavioral annotations are created using BORIS, an event-logging program for video observations.

Bounding boxes are provided for all pedestrians and some cars. They are stored in .vbb format, and reading them requires Piotr Dollar's Computer Vision Matlab Toolbox.

In addition, we provide a list of attributes for each pedestrian (e.g. age, gender, direction of motion, etc.) and a list of visible traffic scene elements (e.g. stop sign, traffic signal, etc.) for each video.

Follow the links below to download the JAAD dataset and find more information about the available annotations.

CITATION

If you use the JAAD dataset in your research, please consider citing our papers:
I. Kotseruba, A. Rasouli, J. K. Tsotsos. "Joint Attention in Autonomous Driving (JAAD)." arXiv preprint arXiv:1609.04741 (2016).
A. Rasouli, I. Kotseruba, J. K. Tsotsos. "Agreeing to Cross: How Drivers and Pedestrians Communicate." In Proceedings of the IEEE Intelligent Vehicles Symposium (IV) (2017).
A. Rasouli, I. Kotseruba, J. K. Tsotsos. "Are They Going to Cross? A Benchmark Dataset and Baseline for Pedestrian Crosswalk Behavior." ICCVW (2017).
A. Rasouli, I. Kotseruba, J. K. Tsotsos. "Understanding Pedestrian Behavior in Complex Traffic Scenes." IEEE Transactions on Intelligent Vehicles (2017).

DOWNLOAD

UPDATES

  • 06/03/2018 MAJOR UPDATE: Updated all bounding boxes and occlusion tags, added bounding boxes for 100 new pedestrians and behavioral tags for 20 more pedestrians. Moved all annotation data to GitHub.
  • 18/02/2018 Posted a link to the IEEE Transactions on Intelligent Vehicles 2017 paper.
  • 01/11/2017 Posted a link to the ICCV 2017 paper.
  • 20/10/2017 MAJOR UPDATE: Added bounding boxes (with occlusion tags) for all pedestrians visible in the scene and text tags for scene elements.
  • 04/08/2017 Posted a link to the IV 2017 paper.
  • 10/02/2017 MAJOR UPDATE: Updated behavioral data and bounding boxes, converted bounding boxes to .vbb format and added occlusion information, renamed video files sequentially for better usability.
  • 01/01/2017 Updated behavioral data, removed redundant video, added extra tags for each subject featured in the video.
  • 29/11/2016 Added behavioral data in yaml format, added extra tags for the videos, fixed incorrect tags and bounding boxes.