Abstract
We present a Bayesian approach for simultaneously estimating the number of people in a crowd and their spatial locations by sampling from a posterior distribution over crowd configurations. Although this framework can be naturally extended from single-view to multiview detection, we show that the naive extension leads to an inefficient sampler that is easily trapped in local modes. We therefore develop a set of novel proposals that leverage multiview geometry to generate global moves that jump more efficiently between modes of the posterior distribution. We also develop a statistical model of crowd configurations that can handle dependencies among people without requiring discretization of their spatial locations. We quantitatively evaluate our algorithm on a publicly available benchmark dataset with different crowd densities and environmental conditions, and show that our approach outperforms other state-of-the-art methods for detecting and counting people in crowds.