ADDITIVE POWERS-OF-TWO QUANTIZATION: A NON-UNIFORM DISCRETIZATION FOR NEURAL NETWORKS

2019-12-30

Abstract

We propose Additive Powers-of-Two (APoT) quantization, an efficient non-uniform quantization scheme that attends to the bell-shaped and long-tailed distribution of weights in neural networks. By constraining all quantization levels to be a sum of several powers-of-two terms, APoT quantization enjoys high computational efficiency and matches the weight distribution well. A simple reparameterization of the clipping function is applied to produce a better-defined gradient for updating the optimal clipping threshold. Moreover, weight normalization is presented to make the input distribution of the weights more stable and consistent. Experimental results show that our proposed method outperforms state-of-the-art methods and is even competitive with full-precision models, demonstrating the effectiveness of the proposed APoT quantization. For example, our 5-bit quantized ResNet-50 achieves 76.8% top-1 accuracy on ImageNet without bells and whistles; meanwhile, our model reduces fixed-point computation overhead by 31% compared with its uniformly quantized counterpart.
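
To make the level construction concrete, below is a minimal, illustrative Python sketch of the idea: candidate quantization levels are enumerated as sums of a fixed number of powers-of-two terms (each term may also be zero), scaled by a clipping threshold, and each weight is projected onto the nearest level. The function names (`apot_levels`, `apot_quantize`), the exponent set, and the number of additive terms are assumptions for illustration only and do not reproduce the exact per-term exponent grouping defined in the paper.

```python
import numpy as np


def apot_levels(num_terms=2, exponents=(0, -1, -2, -3), alpha=1.0):
    """Enumerate non-uniform quantization levels as sums of `num_terms`
    terms, where each term is either zero or a power of two drawn from
    `exponents`. NOTE: this exponent grouping is illustrative; the paper
    defines specific exponent sets per term for each bit-width."""
    candidates = [0.0] + [2.0 ** e for e in exponents]
    levels = set()

    def build(acc, depth):
        # Recursively accumulate sums of `num_terms` candidate terms.
        if depth == num_terms:
            levels.add(acc)
            return
        for c in candidates:
            build(acc + c, depth + 1)

    build(0.0, 0)
    levels = np.array(sorted(levels))
    # Rescale so the largest level equals the clipping threshold alpha.
    return alpha * levels / levels.max()


def apot_quantize(w, levels):
    """Project each weight onto the nearest level, preserving its sign."""
    w = np.asarray(w, dtype=np.float64)
    idx = np.abs(np.abs(w)[..., None] - levels).argmin(axis=-1)
    return np.sign(w) * levels[idx]


if __name__ == "__main__":
    lv = apot_levels(num_terms=2)
    w = np.random.randn(6) * 0.3
    print("levels :", lv)
    print("w      :", w)
    print("APoT(w):", apot_quantize(w, lv))
```

Because every level is a sum of powers of two, multiplying a quantized weight by an activation can be decomposed into shifts and additions, which is the source of the fixed-point efficiency claimed in the abstract.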

