AND THE BIT GOES DOWN: REVISITING THE QUANTIZATION OF NEURAL NETWORKS

2019-12-30

Abstract

In this paper, we address the problem of reducing the memory footprint of convolutional network architectures. We introduce a vector quantization method that aims at preserving the quality of the reconstruction of the network outputs rather than its weights. The principle of our approach is that it minimizes the loss reconstruction error for in-domain inputs. Our method only requires a set of unlabelled data at quantization time and allows for efficient inference on CPU by using byte-aligned codebooks to store the compressed weights. We validate our approach by quantizing a high-performing ResNet-50 model to a memory size of 5 MB (20× compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification and by compressing a Mask R-CNN with a 26× factor.
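The key idea is to pick the quantized weights so that the layer's *outputs* on in-domain data are well reconstructed, not the weights themselves. Below is a minimal NumPy sketch of that objective for a single fully connected layer, assuming a weight matrix `W` (in_features × out_features) and a batch of unlabelled in-domain activations `X`. The helper name `quantize_layer` is hypothetical, and whole columns are quantized here for brevity, whereas the paper splits columns into subvectors; this is an illustration of the objective, not the paper's exact procedure.

```python
import numpy as np

def quantize_layer(W, X, k=256, n_iter=20, seed=0):
    """Hypothetical sketch: activation-aware vector quantization of one layer.

    W : (c_in, c_out) weight matrix of a fully connected layer
    X : (n, c_in)     unlabelled in-domain activations feeding the layer
    k : codebook size; 256 centroids give 1-byte (byte-aligned) codes

    Minimizes || X W - X W_hat ||_F^2, i.e. the reconstruction error of the
    layer outputs, instead of the plain weight error || W - W_hat ||_F^2.
    """
    rng = np.random.default_rng(seed)
    c_in, c_out = W.shape
    G = X.T @ X                                              # (c_in, c_in) Gram matrix of inputs
    codebook = W[:, rng.choice(c_out, k, replace=False)].T   # (k, c_in), init from columns of W

    for _ in range(n_iter):
        # E-step: assign each column w_j to the centroid c minimizing
        # (w_j - c)^T G (w_j - c) = || X (w_j - c) ||^2.
        WG = W.T @ G                                         # (c_out, c_in)
        wGw = np.einsum('jc,jc->j', WG, W.T)                 # (c_out,)
        CG = codebook @ G                                    # (k, c_in)
        cGc = np.einsum('kc,kc->k', CG, codebook)            # (k,)
        dist = wGw[:, None] - 2.0 * (WG @ codebook.T) + cGc[None, :]
        assign = dist.argmin(axis=1)                         # (c_out,)

        # M-step: with one Gram matrix shared by all columns, the mean of the
        # assigned columns minimizes the summed output error (the paper's
        # subvector variant needs a weighted least-squares update instead).
        for c in range(k):
            members = W[:, assign == c]
            if members.shape[1] > 0:
                codebook[c] = members.mean(axis=1)

    W_hat = codebook[assign].T                               # reconstructed (c_in, c_out)
    codes = assign.astype(np.uint8)                          # 1 byte per column when k <= 256
    return W_hat, codes, codebook
```

Storing the layer then costs `c_out` one-byte codes plus a `k × c_in` codebook, which is what the abstract means by byte-aligned codebooks enabling efficient CPU inference.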

