Abstract. Occlusions present a great challenge for pedestrian detection
in practical applications. In this paper, we propose a novel approach to
simultaneous pedestrian detection and occlusion estimation by regressing
two bounding boxes that localize the full body and the visible part of
a pedestrian, respectively. For this purpose, we train a deep convolutional
neural network (CNN) consisting of two branches, one for full body estimation and the other for visible part estimation. The two branches are
treated differently during training so that they learn to produce
complementary outputs, which can then be fused to improve detection
performance. The full body estimation branch is trained to regress full
body regions for positive pedestrian proposals, while the visible part estimation branch is trained to regress visible part regions for both positive
and negative pedestrian proposals. The visible part region of a negative
pedestrian proposal is forced to shrink to its center. In addition, we introduce a new criterion for selecting positive training examples, which
contributes substantially to the detection of heavily occluded pedestrians. We validate
the effectiveness of the proposed bi-box regression approach on the Caltech and CityPersons datasets. Experimental results show that our approach achieves promising performance for detecting both non-occluded
and occluded pedestrians, especially heavily occluded ones.
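
The different treatment of the two branches during training can be illustrated with a minimal sketch of how regression targets might be assigned to a proposal. This is an assumption-laden illustration, not the implementation used in this work: the (x1, y1, x2, y2) box convention, the function names full_body_target and visible_part_target, and the choice of a zero-size box at the proposal center as the "shrunk" target for negatives are all hypothetical.

```python
from typing import Optional, Tuple

# Assumed box convention (not specified in the text): (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def full_body_target(full_box: Optional[Box], is_positive: bool) -> Optional[Box]:
    """Full body branch: only positive proposals receive a regression target."""
    if not is_positive:
        return None  # negatives do not contribute to full-body regression
    if full_box is None:
        raise ValueError("a positive proposal needs a full-body annotation")
    return full_box


def visible_part_target(
    proposal: Box, visible_box: Optional[Box], is_positive: bool
) -> Box:
    """Visible part branch: every proposal receives a regression target."""
    if is_positive:
        if visible_box is None:
            raise ValueError("a positive proposal needs a visible-part annotation")
        return visible_box
    # Negative proposal: the target collapses to the proposal's center.
    # Modeling this as a zero-size box is an assumption made for illustration.
    x1, y1, x2, y2 = proposal
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    return (cx, cy, cx, cy)
```

Under these assumptions, a negative proposal (10, 20, 30, 60) yields no full-body target and the visible-part target (20.0, 40.0, 20.0, 40.0), whereas a positive proposal simply passes its annotated boxes through to both branches.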