Abstract
We demonstrate that many detection methods are designed to identify only a sufficiently accurate bounding box,
rather than the best available one. To address this issue
we propose a simple and fast modification to the existing
methods called Fitness NMS. This method is tested with the
DeNet model and obtains a significantly improved MAP at
greater localization accuracies without a loss in evaluation
rate, and can be used in conjunction with Soft NMS for additional improvements. Next we derive a novel bounding
box regression loss based on a set of IoU upper bounds that
better matches the goal of IoU maximization while still providing good convergence properties. We then investigate RoI clustering schemes for improving
evaluation rates for the DeNet wide model variants and provide an analysis of localization performance at various input image dimensions. We obtain a MAP of 33.6%@79Hz
and 41.8%@5Hz on MSCOCO using a Titan X (Maxwell).
Source code available from: https://github.com/lachlants/denet