
PRECISION GATING: IMPROVING NEURAL NETWORK EFFICIENCY WITH DYNAMIC DUAL-PRECISION ACTIVATIONS

2019-12-30

Abstract

We propose precision gating (PG), an end-to-end trainable dual-precision quantization technique for deep neural networks. PG computes most features at low precision and only a small proportion of important features at higher precision. Precision gating is lightweight and widely applicable to many neural network architectures. Experimental results show that precision gating can greatly reduce the average bitwidth of computations in both CNNs and LSTMs with negligible accuracy loss. Compared to state-of-the-art counterparts, PG achieves the same or better accuracy with 2.4× less compute on ImageNet. Compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with a 2.8× reduction in computational cost for an LSTM on the Penn Tree Bank dataset. Precision gating has the potential to greatly reduce the execution costs of DNNs on both commodity and dedicated hardware accelerators. We implement the sampled dense-dense matrix multiplication kernel used by PG on CPU, which achieves up to 8.3× wall-clock speedup over the dense baseline.
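
To make the dual-precision mechanism concrete, the following is a minimal NumPy sketch of precision gating applied to a single linear layer. The uniform quantizer, the bitwidths `lo_bits`/`hi_bits`, and the gating threshold `delta` are illustrative placeholders rather than the paper's learned components; the sketch only shows the control flow: compute every output feature at low precision, then recompute the few outputs that cross the gating threshold at higher precision.

```python
import numpy as np


def quantize(x, bits):
    """Uniform quantization of non-negative activations to `bits` bits.

    Illustrative stand-in for the quantizer trained end to end in the paper.
    """
    scale = (2 ** bits - 1) / (x.max() + 1e-8)
    return np.round(x * scale) / scale


def precision_gated_linear(x, W, lo_bits=4, hi_bits=8, delta=0.5):
    """Dual-precision sketch of one linear layer with precision gating.

    1. Every output feature is first computed from low-precision activations.
    2. Only outputs whose low-precision value exceeds the gating threshold
       `delta` are recomputed from high-precision activations.
    `delta`, `lo_bits`, and `hi_bits` are hypothetical hyperparameters; in the
    paper the gating threshold is learned end to end.
    """
    x_relu = np.maximum(x, 0.0)            # assume non-negative (post-ReLU) activations
    x_lo = quantize(x_relu, lo_bits)
    y = x_lo @ W                           # cheap low-precision pass over all features

    rows, cols = np.nonzero(y > delta)     # "important" output features to refine
    if rows.size:
        x_hi = quantize(x_relu, hi_bits)
        # Recompute only the gated entries; this sparse update is where a
        # sampled dense-dense matrix multiplication kernel can be used.
        y[rows, cols] = np.einsum("ij,ij->i", x_hi[rows], W[:, cols].T)
    return y


# Tiny usage example with random data.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 16))
W = rng.standard_normal((16, 8)) * 0.1
print(precision_gated_linear(x, W))
```

The average bitwidth stays close to `lo_bits` because the high-precision pass only touches the small set of gated outputs, which is the source of the compute reduction reported in the abstract.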
