Abstract
Learning a new object class from cluttered training images is very challenging when the location of object instances is unknown. Previous works generally require objects to cover a large portion of the images. We present a novel approach that can cope with extensive clutter as well as large scale and appearance variations between object instances. To make this possible, we propose a conditional random field that starts from generic knowledge and then progressively adapts to the new class. Our approach simultaneously localizes object instances while learning an appearance model specific to the class. We demonstrate this on the challenging Pascal VOC 2007 dataset. Furthermore, our method makes it possible to train any state-of-the-art object detector in a weakly supervised fashion, although such a detector would normally require object location annotations.