Abstract
Multiview object detection methods achieve robustness in adverse imaging conditions by exploiting projective consistency across views. In this paper, we present an algorithm that achieves performance comparable to multiview methods from a single camera by employing geometric primitives as proxies for the true 3D shape of objects, such as pedestrians or vehicles. Our key insight is that for a calibrated camera, geometric primitives produce predetermined location-specific patterns in occupancy maps. We use these to define spatially-varying kernel functions of projected shape. This leads to an analytical formation model of occupancy maps as the convolution of locations and projected shape kernels. We estimate object locations by deconvolving the occupancy map using an efficient template similarity scheme. The number of objects and their positions are determined using the mean shift algorithm. The approach is highly parallel because the occupancy probability of a particular geometric primitive at each ground location is an independent computation. The algorithm extends to multiple cameras without requiring significant bandwidth. We demonstrate comparable performance to multiview methods and show robust, real-time object detection on full-resolution HD video in a variety of challenging imaging conditions.
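To make the pipeline concrete, the following is a minimal sketch of the three stages named in the abstract: a location-specific shape kernel, a per-location template score, and mean shift over the high-scoring locations. It is an illustration under strong assumptions, not the implementation evaluated in the paper. The kernel is a hypothetical axis-aligned rectangle whose size grows linearly with the image row (standing in for a primitive projected through a calibrated camera), the score is the fraction of that rectangle covered by foreground (standing in for the template similarity scheme), and the names `kernel_size`, `score_locations`, `mean_shift_modes`, and all thresholds and bandwidths are chosen for this example only.

```python
import numpy as np

def kernel_size(row, n_rows, base_h=4, base_w=2):
    # Hypothetical stand-in for a projected primitive: a rectangle that
    # grows with the image row (lower rows are assumed closer to the camera).
    scale = 1 + row / n_rows
    return max(1, round(base_h * scale)), max(1, round(base_w * scale))

def score_locations(occ, base_h=4, base_w=2):
    # Score every candidate foot location (r, c) independently: the fraction
    # of the location-specific rectangle that is covered by foreground.
    n_rows, n_cols = occ.shape
    scores = np.zeros_like(occ, dtype=float)
    for r in range(n_rows):
        h, w = kernel_size(r, n_rows, base_h, base_w)
        for c in range(n_cols):
            top, left = r - h + 1, c - w // 2
            if top < 0 or left < 0 or left + w > n_cols:
                continue  # rectangle would leave the image
            scores[r, c] = occ[top:r + 1, left:left + w].mean()
    return scores

def mean_shift_modes(points, weights, bandwidth=3.0, iters=50, tol=1e-3):
    # Weighted Gaussian mean shift; every point is used as a starting mode.
    points = points.astype(float)
    modes = points.copy()
    for _ in range(iters):
        d2 = ((modes[:, None, :] - points[None, :, :]) ** 2).sum(-1)
        k = weights[None, :] * np.exp(-d2 / (2 * bandwidth ** 2))
        new = (k @ points) / k.sum(axis=1, keepdims=True)
        shift = np.abs(new - modes).max()
        modes = new
        if shift < tol:
            break
    # Merge modes that converged to within one bandwidth of each other.
    merged = []
    for m in modes:
        if not any(np.linalg.norm(m - q) < bandwidth for q in merged):
            merged.append(m)
    return np.array(merged)

# Usage: a synthetic 40x40 occupancy map with two objects drawn using the
# same location-specific rectangles, standing at (row, col) foot positions.
true_feet = [(20, 10), (30, 30)]
occ = np.zeros((40, 40))
for r, c in true_feet:
    h, w = kernel_size(r, 40)
    occ[r - h + 1:r + 1, c - w // 2:c - w // 2 + w] = 1.0

scores = score_locations(occ)
candidates = np.argwhere(scores >= 0.8)            # high-scoring locations
weights = scores[candidates[:, 0], candidates[:, 1]]
modes = mean_shift_modes(candidates, weights)      # one row per detection
print(len(modes), np.round(modes, 1))
```

Because each entry of `scores` depends only on the occupancy map and the kernel for that location, the double loop in `score_locations` could be distributed across threads or GPU cores without coordination, which is the property the abstract refers to as highly parallel. The number of rows in `modes` is the estimated object count, so the count does not need to be specified in advance.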