Accelerating Sparse Matrix Operations in Neural Networks on Graphics Processing Units

资源分类

2019-09-19 |

106 |

58 |

Abstract Graphics Processing Units (GPUs) are commonly used to train and evaluate neural networks effiffifficiently. While previous work in deep learning has focused on accelerating operations on dense matrices/tensors on GPUs, efforts have concentrated on operations involving sparse data structures. Operations using sparse structures are common in natural language models at the input and output layers, because these models operate on sequences over discrete alphabets. We present two new GPU algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector) and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). Our methods achieve speedups over state-of-theart parallel GPU baselines of up to 7× and 50×, respectively. We also illustrate how our methods scale on difffferent GPU architectures

上一篇：Zero-Shot Semantic Parsing for Instructions

下一篇：Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

用户评价

全部评价

还没有评论，说两句吧！

热门资源

A Mathematical Mo...

Direct democracy, where each voter casts one vo...
Learning to Predi...

Much of model-based reinforcement learning invo...
The Variational S...

Unlike traditional images which do not offer in...
Hierarchical Task...

We extend hierarchical task network planning wi...
Shape-based Autom...

We present an algorithm for automatic detection...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com