SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks

Abstract

We propose a fast approximation method for the softmax function with a very large vocabulary using singular value decomposition (SVD). SVD-softmax targets fast and accurate probability estimation of the topmost probable words during inference of neural network language models. The proposed method transforms the output weight matrix using SVD. For most words, the approximate probability can then be estimated from only a small part of the transformed matrix, using a few large singular values and the corresponding elements. We applied the technique to language modeling and neural machine translation and present a guideline for a good approximation. The algorithm requires only about 20% of the arithmetic operations for an 800K-word vocabulary and achieves more than a three-fold speedup on a GPU.
