The PETS 2009 dataset consists of three parts, each showing multi-view sequences of pedestrians walking in an outdoor environment. The parts address person counting and density estimation, people tracking, and flow analysis with event recognition, respectively. Calibration data for the eight cameras is available on the dataset website. The dataset also provides training data covering background, city center, and regular flow of people. The person counting sequence, which contains 875 images, is the one most commonly used for pedestrian detection. For density estimation, the images are divided into three regions of interest: the full image, the upper left street, and the lower right street. The typically selected sequence consists of 188 frames containing 4307 people (about 23 per frame on average); it is highly crowded and thus very challenging. It was used by Sternig et al. (AVSS) and Stalder et al. [18]. These works assume ground plane information and incorporate additional 3D tracking.

