Abstract
Most pedestrian detection approaches that achieve high accuracy and precision and are suitable for real-time applications are based on histograms of gradient orientations. Multiscale detection is usually achieved either by resizing the image several times and recomputing the image features, or by using multiple classifiers for different scales. In this paper we present a pedestrian detection approach that uses the same classifier for all pedestrian scales, based on image features computed at a single scale. We go beyond low-level, pixel-wise gradient orientation bins and use higher-level visual words organized into Word Channels. Boosting is used to learn classification features from the integral Word Channels. The proposed approach is evaluated on multiple datasets and achieves outstanding results on the INRIA and Caltech-USA benchmarks. Using a GPU implementation, we achieve a classification rate of over 10 million bounding boxes per second and a rate of 16 FPS for multiscale detection on a 640×480 image.