资源论文KALEIDOSCOPE :A NE FFICIENT, LEARNABLE REPRE -SENTATION FOR ALL STRUCTURED LINEAR MAPS

KALEIDOSCOPE :A NE FFICIENT, LEARNABLE REPRE -SENTATION FOR ALL STRUCTURED LINEAR MAPS

2020-01-02 | |  49 |   43 |   0

Abstract

Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps. However, choosing which of the myriad structured transformations to use (and its associated parameterization) is a laborious task that requires trading off speed, space, and accuracy. We consider a different approach: we introduce a family of matrices called kaleidoscope matrices (K-matrices) that provably capture any structured matrix with near-optimal space (parameter) and time (arithmetic operation) complexity. We empirically validate that K-matrices can be automatically learned within end-to-end pipelines to replace hand-crafted procedures, in order to improve model quality. For example, replacing channel shuffles in ShuffleNet improves classification accuracy on ImageNet by up to 5%. Learnable K-matrices can also simplify hand-engineered pipelines—we replace filter bank feature computation in speech data preprocessing with a kaleidoscope layer, resulting in only 0.4% loss in accuracy on the TIMIT speech recognition task. K-matrices can also capture latent structure in models: for a challenging permuted image classification task, a K-matrix based representation of permutations is able to learn the right latent structure and improves accuracy of a downstream convolutional model by over 9%. We provide a practically efficient implementation of our approach, and use K-matrices in a Transformer network to attain 36% faster end-to-end inference speed on a language translation task.

上一篇:AC ONSTRUCTIVE PREDICTION OF THEG ENERALIZATION ERROR ACROSS SCALES

下一篇:OVERLEARNING REVEALS SENSITIVE ATTRIBUTES

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...