Learning Efficient Single-stage Pedestrian
Detectors by Asymptotic Localization Fitting
Abstract. Though two-stage detectors based on Faster R-CNN have significantly boosted pedestrian detection accuracy, they are still slow
for practical applications. One solution is to simplify this workflow
into a single-stage detector. However, current single-stage detectors (e.g.
SSD) have not presented competitive accuracy on common pedestrian
detection benchmarks. This paper works towards a successful pedestrian
detector that enjoys the speed of SSD while maintaining the accuracy of
Faster R-CNN. Specifically, a structurally simple but effective module
called Asymptotic Localization Fitting (ALF) is proposed, which stacks
a series of predictors to directly evolve the default anchor boxes of
SSD step by step, progressively improving the detection results. As a result, during
training the later predictors enjoy more and better-quality positive
samples, while harder negatives can be mined with increasing
IoU thresholds. On top of this, an efficient single-stage pedestrian
detection architecture (denoted as ALFNet) is designed, achieving state-of-the-art performance on CityPersons and Caltech, two of the largest
pedestrian detection benchmarks, and hence yielding a pedestrian
detector that is attractive in both accuracy and speed. Code is available at
https://github.com/VideoObjectSearch/ALFNet
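
The stepwise idea behind ALF can be illustrated with a toy sketch in plain Python. It is not the authors' implementation: the function names (`iou`, `toy_refine`, `run_stages`), the IoU thresholds (0.5, 0.7) and the step size are assumptions chosen for illustration, and the learned predictor is replaced by a hypothetical stand-in that simply moves each box part of the way toward the ground truth. The sketch only shows how anchors refined by an earlier stage can pass a stricter IoU threshold at a later stage.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def toy_refine(box, gt, step=0.5):
    """Hypothetical stand-in for a learned predictor: move each box
    coordinate a fraction `step` of the way toward the ground truth."""
    return tuple(b + step * (g - b) for b, g in zip(box, gt))


def run_stages(anchors, gt, thresholds=(0.5, 0.7), step=0.5):
    """Stack several refinement stages. Each stage labels its input boxes
    with its own (increasing) IoU threshold, then passes the refined
    boxes on to the next stage. Threshold values are assumed."""
    boxes = list(anchors)
    stats = []
    for thr in thresholds:
        ious = [iou(b, gt) for b in boxes]
        stats.append({
            "threshold": thr,
            "num_pos": sum(1 for v in ious if v >= thr),
            "ious": ious,
        })
        boxes = [toy_refine(b, gt, step) for b in boxes]
    return boxes, stats


if __name__ == "__main__":
    gt = (10, 10, 30, 50)                      # one pedestrian box
    anchors = [(12, 14, 32, 54),               # close to the pedestrian
               (20, 20, 40, 60),               # partial overlap
               (100, 100, 120, 140)]           # background
    refined, stats = run_stages(anchors, gt)
    for s in stats:
        print(s["threshold"], s["num_pos"], [round(v, 3) for v in s["ious"]])
```

In this toy setup no default anchor reaches an IoU of 0.7 with the pedestrian, yet after one refinement step the closest anchor does, so the second stage can be trained with the stricter threshold without losing its positive sample.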