Abstract
Every bit matters in the hardware design of quantized
neural networks. However, extremely-low-bit representations usually cause a large accuracy drop. Thus, how to train
extremely-low-bit neural networks with high accuracy is of
central importance. Most existing network quantization approaches learn transformations (low-bit weights) as well as
encodings (low-bit activations) simultaneously. This tight
coupling makes the optimization problem difficult, and thus
prevents the network from learning optimal representations.
In this paper, we propose a simple yet effective Two-Step
Quantization (TSQ) framework, by decomposing the network quantization problem into two steps: code learning
and transformation function learning based on the learned
codes. For the first step, we propose the sparse quantization
method for code learning. The second step can be formulated as a non-linear least-squares regression problem with
low-bit constraints, which can be solved efficiently in an iterative manner. Extensive experiments on CIFAR-10 and
ILSVRC-12 datasets demonstrate that the proposed TSQ
is effective and outperforms the state-of-the-art by a large
margin. In particular, for 2-bit activation and ternary weight quantization of AlexNet, the accuracy of our TSQ drops by only about 0.5 points compared with the full-precision counterpart, outperforming the current state-of-the-art by more than 5 points.
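To make the second step more concrete, the following is a minimal sketch, not the paper's exact algorithm: it treats the transformation learning for a single output unit as a linear least-squares fit of a ternary weight vector and a scale to pre-learned codes, solved iteratively by alternating coordinate descent on the ternary weights and a closed-form scale update. The names (X, Z, t, alpha), the warm start, and the omission of the activation non-linearity are simplifying assumptions for illustration.

```python
import numpy as np

def fit_ternary_unit(X, Z, n_iter=10):
    """Fit t in {-1, 0, +1}^d and a scale alpha so that alpha * (t @ X) ~= Z.

    X: (d, n) full-precision inputs to the layer
    Z: (n,)   learned low-bit target codes for one output unit
    Illustrative simplification: the non-linear activation quantization is dropped.
    """
    d, n = X.shape
    t = np.zeros(d)
    # Warm-start the scale from the magnitude of the unconstrained solution (assumption).
    w_fp, *_ = np.linalg.lstsq(X.T, Z, rcond=None)
    alpha = max(float(np.mean(np.abs(w_fp))), 1e-8)

    r = Z - alpha * (t @ X)  # current residual
    for _ in range(n_iter):
        # Coordinate descent: choose the best value in {-1, 0, +1} for each weight,
        # with all other weights and the scale held fixed.
        for i in range(d):
            xi = X[i]
            r_wo = r + alpha * t[i] * xi  # residual with weight i removed
            candidates = (-1.0, 0.0, 1.0)
            costs = [np.sum((r_wo - alpha * c * xi) ** 2) for c in candidates]
            t[i] = candidates[int(np.argmin(costs))]
            r = r_wo - alpha * t[i] * xi
        # With the ternary pattern fixed, the scale has a closed-form 1-D
        # least-squares solution.
        y = t @ X
        if y @ y > 0:
            alpha = float((y @ Z) / (y @ y))
            r = Z - alpha * y
    return t, alpha

# Example usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 500))
Z = rng.standard_normal(500)
t, alpha = fit_ternary_unit(X, Z)
print(t[:8], alpha)
```

Each alternating step can only decrease the squared residual, so the iteration converges to a local minimum of the constrained least-squares objective; the actual TSQ formulation additionally includes the non-linear code quantization inside the loss.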