Abstract
Localizing objects in cluttered backgrounds is a challenging task in weakly supervised localization. Because objects vary widely in cluttered images, they are easily confused with the background. However, backgrounds contain useful latent information, e.g., the sky for aeroplanes. If this latent information can be learned, object-background ambiguity can be reduced and the background suppressed. In this paper, we propose latent category learning (LCL), an unsupervised learning problem given only image-level class labels. First, inspired by latent semantic discovery, we use standard probabilistic Latent Semantic Analysis (pLSA) to learn the latent categories, which can represent objects, object parts, or backgrounds. Second, to determine which category contains the target object, we propose a category selection method that evaluates each category's discrimination. We evaluate the method on the PASCAL VOC 2007 database and the ILSVRC 2013 detection challenge. On VOC 2007, the proposed method yields an annotation accuracy of 48%, which outperforms previous results by 10%. More importantly, we achieve a detection average precision of 30.9%, which improves on previous results by 8% and is competitive with the supervised deformable part model (DPM) 5.0 baseline of 33.7%. On ILSVRC 2013 detection, the method yields a precision of 6.0%, which is also competitive with DPM 5.0.