Lightweight and Efficient Neural Natural Language Processing
with Quaternion Networks
Abstract
Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory-efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation in Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but also a significant (75%) reduction in parameter size, owing to the fewer degrees of freedom of the Hamilton product. We propose Quaternion variants of standard models, giving rise to new architectures such as the Quaternion attention model and the Quaternion Transformer. Extensive experiments on a battery of NLP tasks demonstrate the utility of the proposed Quaternion-inspired models, enabling up to a 75% reduction in parameter size without significant loss in performance.