Abstract
To reduce the significant redundancy in deep Convolutional Neural Networks (CNNs), most existing methods
prune neurons by only considering the statistics of an individual layer or two consecutive layers (e.g., prune one layer
to minimize the reconstruction error of the next layer), ignoring the effect of error propagation in deep networks. In
contrast, we argue that for a pruned network to retain its
predictive power, it is essential to prune neurons in the entire neural network jointly based on a unified goal: minimizing the reconstruction error of important responses in
the “final response layer” (FRL), which is the second-to-last layer before classification. Specifically, we apply feature ranking techniques to measure the importance of each
neuron in the FRL, formulate network pruning as a binary
integer optimization problem, and derive a closed-form solution to it for pruning neurons in earlier layers. Based on
our theoretical analysis, we propose the Neuron Importance
Score Propagation (NISP) algorithm to propagate the importance scores of final responses to every neuron in the
network. The CNN is pruned by removing neurons with least
importance, and it is then fine-tuned to recover its predictive
power. NISP is evaluated on several datasets with multiple
CNN models and demonstrated to achieve significant acceleration and compression with negligible accuracy loss.
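As a rough illustration of the score-propagation idea summarized above, the sketch below assumes fully connected layers and the rule s_k = |W_{k+1}|^T s_{k+1}, i.e., a neuron is important if it feeds into important neurons of the next layer with large weights. The function name `propagate_importance`, the toy layer sizes, and the hand-picked FRL scores are illustrative assumptions, not the paper's implementation; in the paper the FRL scores come from a feature-ranking method and the pruned network is subsequently fine-tuned.

```python
import numpy as np

def propagate_importance(frl_scores, weights):
    """Propagate neuron-importance scores from the final response layer (FRL)
    back to every earlier layer, one layer at a time.

    frl_scores : 1-D array of importance scores for FRL neurons
                 (stand-in for the output of a feature-ranking method).
    weights    : list of weight matrices; weights[k] maps layer k responses
                 to layer k+1 responses (shape d_{k+1} x d_k), with the last
                 matrix producing the FRL.

    Assumed propagation rule for fully connected layers:
        s_k = |W_{k+1}|^T @ s_{k+1}
    """
    scores = [None] * len(weights) + [np.asarray(frl_scores, dtype=float)]
    for k in range(len(weights) - 1, -1, -1):
        scores[k] = np.abs(weights[k]).T @ scores[k + 1]
    return scores

# Toy example: an 8 -> 6 -> 4 network, where the 4-unit layer is the FRL.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((6, 8)), rng.standard_normal((4, 6))]
frl_scores = np.array([0.9, 0.1, 0.5, 0.3])   # hypothetical FRL ranking
layer_scores = propagate_importance(frl_scores, weights)

# Prune each earlier layer by keeping the top 50% highest-scoring neurons;
# the pruned network would then be fine-tuned to recover accuracy.
keep_masks = []
for s in layer_scores[:-1]:
    n_keep = max(1, int(0.5 * s.size))
    keep_masks.append(s >= np.sort(s)[::-1][n_keep - 1])
```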