Abstract
Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-the-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) using the location information to pool foreground and background features separately to form the image-level representation. Step (1) is particularly challenging in a typical classification setting where precise object location annotations are not available during training. To address this challenge, we propose a framework that learns object detectors using only image-level class labels, or so-called weak labels. We validate our approach on the challenging PASCAL07 dataset. Our learned detectors are comparable in accuracy with state-of-the-art weakly supervised detection methods. More importantly, the resulting OCP approach significantly outperforms SPM-based pooling in image classification.
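As a rough illustration of step (2), the sketch below pools coded local features inside and outside an object box separately and concatenates the two results. It is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the abstract fixes neither the pooling operator nor the box format, so element-wise max pooling and an axis-aligned box `(x_min, y_min, x_max, y_max)` are assumed here, and the names `max_pool` and `object_centric_pool` are hypothetical. The box itself would come from step (1), which is not shown.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def max_pool(codes: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Element-wise max over coded descriptors; all zeros if the set is empty."""
    if not codes:
        return [0.0] * dim
    return [max(code[i] for code in codes) for i in range(dim)]


def object_centric_pool(
    points: Sequence[Point],
    codes: Sequence[Sequence[float]],
    box: Box,
) -> List[float]:
    """Pool foreground and background features separately, then concatenate.

    points[i] is the image location of the local feature whose code is codes[i].
    Max pooling is an assumption made for illustration only.
    """
    dim = len(codes[0]) if codes else 0
    x_min, y_min, x_max, y_max = box
    foreground, background = [], []
    for (x, y), code in zip(points, codes):
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        (foreground if inside else background).append(code)
    # Image-level representation: [foreground part | background part].
    return max_pool(foreground, dim) + max_pool(background, dim)


# Toy input: four local features with 3-dimensional codes.
points = [(1, 1), (2, 2), (8, 8), (9, 1)]
codes = [
    [0.2, 0.0, 0.5],
    [0.7, 0.1, 0.0],
    [0.0, 0.9, 0.3],
    [0.4, 0.0, 0.6],
]
box = (0, 0, 4, 4)  # hypothetical inferred object location

representation = object_centric_pool(points, codes, box)
print(representation)  # [0.7, 0.1, 0.5, 0.4, 0.9, 0.6]
```

In this toy input the first two features fall inside the box and the last two outside, so the first three entries of the output summarize the foreground and the last three the background. The resulting vector is twice the length of a single code, and it depends on how well the box was localized in step (1).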