Abstract. Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and Inception, benefiting from their
powerful feature representations but suffering from high computational
costs. Conversely, some detectors based on lightweight models achieve real-time
processing, but their accuracy is often criticized. In this paper, we
explore an alternative way to build a fast and accurate detector: strengthening lightweight features with a hand-crafted mechanism. Inspired by
the structure of Receptive Fields (RFs) in human visual systems, we
propose a novel RF Block (RFB) module, which takes the relationship
between the size and eccentricity of RFs into account, to enhance feature discriminability and robustness. We further assemble RFB on
top of SSD, constructing the RFB Net detector. To evaluate its effectiveness, we conduct experiments on two major benchmarks, and the
results show that RFB Net reaches the performance of advanced
very deep detectors while maintaining real-time speed. Code is available
at https://github.com/ruinmessi/RFBNet.