Abstract
Object detection has seen a surge of interest in recent years, which has led to increasingly effective techniques. These techniques, however, still mostly perform detection based on local evidence in the input image. While some progress has been made towards exploiting scene context, the resulting methods typically only consider a single image at a time. Intuitively, however, the information contained jointly in multiple images should help overcome phenomena such as occlusion and poor resolution. In this paper, we address the co-detection problem, which aims to leverage this collective power to achieve object detection simultaneously in all the images of a set. To this end, we formulate object co-detection as inference in a fully-connected CRF whose edges model the similarity between object candidates. We then learn a similarity function that allows us to efficiently perform inference in this fully-connected graph, even in the presence of many object candidates. This is in contrast with existing co-detection techniques that rely on exhaustive or greedy search, and thus do not scale well. Our experiments demonstrate the benefits of our approach on several co-detection datasets.