Abstract
The development of object detection in the era of deep
learning, from R-CNN [11], Fast/Faster R-CNN [10, 31] to
the recent Mask R-CNN [14] and RetinaNet [24], has mainly come
from novel networks, new frameworks, or new loss designs. However, mini-batch size, a key factor for the training of deep
neural networks, has not been well studied for object detection. In this paper, we propose a Large Mini-Batch Object
Detector (MegDet) to enable training with a large mini-batch size of up to 256, so that we can effectively utilize up to
128 GPUs to significantly shorten the training time.
Technically, we suggest a warmup learning rate policy and
Cross-GPU Batch Normalization, which together allow us
to successfully train a large mini-batch detector in much
less time (e.g., from 33 hours to 4 hours), and achieve even
better accuracy. MegDet is the backbone of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where
we won 1st place in the Detection task.