Abstract. While most steps in modern object detection methods
are learnable, the region feature extraction step remains largely handcrafted, as exemplified by RoI pooling methods. This work proposes a general
viewpoint that unifies existing region feature extraction methods and
a novel method that is end-to-end learnable. The proposed method removes most heuristic choices and outperforms its RoI pooling counterparts. It is a further step towards fully learnable object detection.