Abstract
This paper proposes a deep learning architecture based
on Residual Networks that dynamically adjusts the number
of executed layers for different regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications
to a wide range of computer vision problems such as image
classification, object detection and image segmentation. We
present experimental results showing that this model improves the computational efficiency of Residual Networks on
the challenging ImageNet classification and COCO object
detection datasets. Additionally, we evaluate the computation time maps on the visual saliency dataset cat2000 and
find that they correlate surprisingly well with human eye
fixation positions.
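
To make the notion of a per-region layer count concrete, the following is a minimal sketch and not the model itself. The abstract does not state the halting rule, so the sketch assumes one: each layer emits a halting score in [0, 1] at every spatial position, and a position stops executing layers once its cumulative score reaches 1 - eps. The names `layers_executed`, `halting_scores` and `eps` are hypothetical and chosen for illustration only.

```python
def layers_executed(halting_scores, eps=0.01):
    """Count how many layers run at each spatial position.

    halting_scores[layer][y][x] is an assumed per-layer, per-position
    score in [0, 1]. A position halts at the first layer where its
    cumulative score reaches 1 - eps (assumed rule); positions that
    never reach the threshold run all layers.
    """
    num_layers = len(halting_scores)
    height = len(halting_scores[0])
    width = len(halting_scores[0][0])

    # Default: every position executes all layers.
    counts = [[num_layers] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            total = 0.0
            for layer in range(num_layers):
                total += halting_scores[layer][y][x]
                if total >= 1.0 - eps:
                    counts[y][x] = layer + 1  # layers are counted from 1
                    break
    return counts


# Three layers over a 1x3 image: the first position halts immediately,
# the second never reaches the threshold, the third halts after two layers.
scores = [
    [[1.0, 0.2, 0.5]],
    [[0.0, 0.3, 0.5]],
    [[0.0, 0.1, 0.0]],
]
computation_map = layers_executed(scores)
```

Under these assumptions the procedure is deterministic: the same scores always yield the same map of layer counts, and that map plays the role of a computation time map over the image.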