
MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

2020-02-21

Abstract

The tremendous number of parameters makes deep neural networks impractical to deploy in edge-device-based real-world applications, due to limited computational power and storage space. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights (r's) into discrete values (q's) in a supervised training manner. However, the quantization function is non-differentiable, which leads to either infinite or zero gradients (g_r) w.r.t. r. To address this problem, most training-based quantization methods approximate g_r using the gradient w.r.t. q (g_q) with clipping, via the Straight-Through Estimator (STE), or manually design the gradient computation. However, these methods only heuristically make training-based quantization applicable, without further analysis of how the approximated gradients assist training of a quantized network. In this paper, we propose to learn g_r with a neural network. Specifically, a meta network is trained that takes g_q and r as inputs and outputs g_r for subsequent weight updates. The meta network is updated jointly with the original quantized network. Our proposed method alleviates the problem of non-differentiability and can be trained in an end-to-end manner. Extensive experiments on CIFAR-10/100 and ImageNet with various deep networks demonstrate the advantages of our method in terms of a faster convergence rate and better performance. Code is released at: https://github.com/csyhhu/MetaQuant
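The core idea in the abstract — replacing a hand-designed gradient approximation (e.g. STE) with a small learned network that maps (g_q, r) to g_r — can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the meta-network architecture, the sign quantizer, and all sizes here are hypothetical, and the meta network's own training step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(r):
    # A simple binarization quantizer for illustration: q = sign(r).
    # Its derivative is zero almost everywhere, which is exactly the
    # non-differentiability problem the abstract describes.
    return np.sign(r)

# Hypothetical tiny meta network: a one-hidden-layer MLP applied
# per weight, mapping the pair (g_q, r) to the refined gradient g_r.
W1 = rng.normal(scale=0.1, size=(2, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))

def meta_grad(g_q, r):
    # Stack each weight's (g_q, r) pair into a row of shape (n, 2).
    x = np.stack([g_q.ravel(), r.ravel()], axis=1)
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    return (h @ W2).reshape(r.shape)     # g_r, same shape as r

# Toy usage: one weight-update step on the full-precision weights r.
r = rng.normal(size=(4, 4))
g_q = rng.normal(size=(4, 4))            # gradient w.r.t. quantized weights q
g_r = meta_grad(g_q, r)                  # learned gradient "penetrates" the quantizer
r = r - 0.01 * g_r                       # SGD step on r; inference uses quantize(r)
```

In the actual method, the meta network's parameters (here W1, W2) would be updated together with the quantized network during training, so the gradient mapping itself is learned rather than fixed.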
