Abstract
This paper presents a method of learning reconfigurable hierarchical And-Or models that integrate context and occlusion for car detection. The And-Or model represents the regularities of car-to-car context and occlusion patterns at three levels: (i) layouts of N spatially coupled cars, (ii) single cars with different viewpoint-occlusion configurations, and (iii) a small number of parts. Learning consists of two stages. In the first stage, we learn the structure of the And-Or model from three components: (a) N-car contextual patterns mined from the layouts of annotated single-car bounding boxes, (b) occlusion configurations mined from the overlap statistics between single cars, and (c) visible parts learned from 3D CAD simulation of cars or by heuristically mining latent car parts. The And-Or model is organized as a directed acyclic graph, which permits inference by dynamic programming. In the second stage, we jointly train the model parameters (for appearance, deformation and bias) using Weak-Label Structural SVM. In experiments, we test our model on four car datasets: the KITTI dataset [11], the street parking dataset [19], the PASCAL VOC2007 car dataset [7], and a self-collected parking lot dataset. We compare with state-of-the-art variants of deformable part-based models and other methods. Our model consistently obtains significant improvements on all four datasets.
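To illustrate why a directed acyclic And-Or graph admits dynamic programming, the following is a minimal sketch, not the implementation used in this paper. It assumes a toy graph in which terminal nodes carry fixed scores, And-nodes sum the scores of their children, and Or-nodes select their best-scoring child. The node names, the scores, and the function `best_parse` are hypothetical, and the sketch ignores part placement, deformation and bias terms.

```python
# Toy And-Or graph (hypothetical): node -> (kind, payload).
# For "terminal" nodes the payload is a fixed score; otherwise it is a list of children.
TOY_GRAPH = {
    "root": ("or", ["pair", "single"]),             # choose a 2-car layout or a single car
    "pair": ("and", ["front_car", "occluded_car"]),  # composition of two coupled cars
    "single": ("and", ["front_car"]),
    "front_car": ("terminal", 1.5),
    "occluded_car": ("terminal", 0.5),
}


def best_parse(graph, root):
    """Return (score, parse) for the best parse rooted at `root`.

    Each node is scored once and cached, so shared sub-graphs
    (here `front_car`) are not recomputed.
    """
    memo = {}

    def visit(node):
        if node in memo:
            return memo[node]
        kind, payload = graph[node]
        if kind == "terminal":
            result = (payload, [node])
        elif kind == "and":
            # And-node: sum over all children, keep every child's parse.
            total, parse = 0.0, [node]
            for child in payload:
                child_score, child_parse = visit(child)
                total += child_score
                parse = parse + child_parse
            result = (total, parse)
        elif kind == "or":
            # Or-node: keep only the best-scoring alternative.
            child_score, child_parse = max(
                (visit(child) for child in payload), key=lambda sp: sp[0]
            )
            result = (child_score, [node] + child_parse)
        else:
            raise ValueError(f"unknown node kind: {kind}")
        memo[node] = result
        return result

    return visit(root)


score, parse = best_parse(TOY_GRAPH, "root")
print(score, parse)
```

In this toy graph the two-car branch wins because its summed score exceeds that of the single-car branch; lowering the score of `occluded_car` enough makes the Or-node at the root switch to the single-car branch.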