The TUD Crossing dataset by Micha Andriluka, Stefan Roth and Bernt Schiele consists of 201 images with 1008 highly overlapping pedestrians showing significant variation in clothing and articulation. The original annotation by Andriluka et al. contains 1008 tight bounding boxes for pedestrians with at least 50% visibility and ignores many overlapping pedestrians. A second annotation by Barinova et al. covers 1018 pedestrians but still misses many. Evaluation on this dataset typically uses three scales; however, since there are only minor scale changes, a good single scale may also be sufficient. A more complete annotation was done by Riemenschneider et al., consisting of 1216 pedestrian instances with densely segmented overlapping pedestrians.

* 1008 instances, 50% overlap [Andriluka]
* 1018 instances, 3 scales [Barinova]
* 1218 segmented instances, 90% overlap [Riemenschneider]
* 21 frames [Horbert]
* 5 frames [cvpr2015]

The dataset is typically used for tracking evaluation; more recently, single-frame detectors have also been evaluated on it because of the challenge posed by the overlaps. The protocol entails training on the TUD Pedestrian training dataset, following its specifications. The evaluation criterion is the PASCAL IoU at 50%.

References:

* Andriluka, M., Roth, S., Schiele, B.: People-Tracking-by-Detection and People-Detection-by-Tracking. CVPR (2008)
* Riemenschneider, H., Sternig, S., Donoser, M., Roth, P., Bischof, H.: Hough Regions for Joining Instance Detection and Segmentation. ECCV (2012)

Other references on much smaller annotations (21 frames):

* Horbert, E., Rematas, K., Leibe, B.: Level-Set Person Segmentation and Tracking with Multi-Region Appearance Models and Top-Down Shape Information. ICCV (2011)
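
The PASCAL IoU 50% criterion named in the evaluation protocol above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) in continuous coordinates and counts a detection as correct when its intersection over union with a ground-truth box exceeds 0.5. The function names `iou` and `is_match` are hypothetical and are not part of any official TUD Crossing evaluation toolkit.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumption: continuous coordinates with x2 > x1 and y2 > y1
    (no +1 pixel-inclusive convention).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(detection, ground_truth, threshold=0.5):
    """True if the detection overlaps the ground truth by more than the threshold."""
    return iou(detection, ground_truth) > threshold
```

For example, two boxes of equal size shifted by half their width share one third of their union and would therefore not count as a match, which hints at why heavily overlapping pedestrians make this dataset hard for single-frame detectors.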

Related datasets