Abstract
Although Faster R-CNN and its variants have shown
promising performance in object detection, they exploit only simple first-order representations of object proposals for final classification and regression. Recent classification
methods demonstrate that integrating high-order statistics into deep convolutional neural networks can
achieve impressive improvements, but they aim to model whole images and discard location information, so
they cannot be directly applied to object detection. In this
paper, we make an attempt to exploit high-order statistics in
object detection, aiming at generating more discriminative
representations for proposals to enhance the performance
of detectors. To this end, we propose a novel Multi-scale
Location-aware Kernel Representation (MLKP) to capture
high-order statistics of deep features in proposals. Our MLKP can be efficiently computed on a modified multi-scale
feature map using a low-dimensional polynomial kernel approximation. Moreover, unlike existing orderless
global representations based on high-order statistics, our
proposed MLKP retains and is sensitive to location information, so
it can be flexibly applied to object detection. Integrated into the Faster R-CNN framework, the proposed MLKP
achieves performance competitive with state-of-the-art
methods, and improves Faster R-CNN by 4.9% (mAP), 4.7%
(mAP) and 5.0% (AP at IoU=[0.5:0.05:0.95]) on PASCAL
VOC 2007, VOC 2012 and MS COCO benchmarks, respectively. Code is available at: https://github.com/Hwang64/MLKP
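
To make the idea of a location-aware high-order representation concrete, the following is a minimal sketch and not the released implementation. It assumes that a second-order polynomial kernel term can be approximated by a rank-1 factorization, i.e. the product of two linear projections applied independently at each spatial location. The helper names (`second_order_term`, `explicit_second_order`, `location_aware_map`), the toy dimensions, and the random weights are all hypothetical and chosen only for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def second_order_term(x, w1, w2):
    """Rank-1 second-order term: product of two linear projections of x.

    Costs O(C) per location instead of forming the C x C outer product.
    """
    return dot(w1, x) * dot(w2, x)


def explicit_second_order(x, w1, w2):
    """Same quantity computed through the explicit outer product x x^T."""
    c = len(x)
    return sum(w1[i] * w2[j] * x[i] * x[j] for i in range(c) for j in range(c))


def location_aware_map(feature_map, w1, w2):
    """Apply the second-order term at every location, keeping the H x W layout.

    Nothing is pooled over space, so the output is not an orderless
    global descriptor.
    """
    return [[second_order_term(x, w1, w2) for x in row] for row in feature_map]


# Toy feature map: H x W locations, each holding a C-dimensional feature (assumed sizes).
random.seed(0)
C, H, W = 4, 2, 3
fmap = [[[random.uniform(-1, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
w1 = [random.uniform(-1, 1) for _ in range(C)]
w2 = [random.uniform(-1, 1) for _ in range(C)]

out = location_aware_map(fmap, w1, w2)
```

Because the projections are applied per location, the output keeps the spatial grid of the input, which is the property that an orderless global pooling of second-order statistics would lose.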