ON IDENTIFIABILITY IN TRANSFORMERS

资源分类

2020-01-02 |

56 |

36 |

Abstract

In this work we delve deep in the Transformer architecture by investigating two of its core components: self-attention and contextual embeddings. In particular, we study the identifiability of attention weights and token embeddings, and the aggregation of context into hidden tokens. We show that attention weights are not unique and propose effective attention as a complementary tool for improving explanatory interpretations based on attention. Furthermore, we are the first to show that input tokens retain to a large degree their identity across the model. We also study how identity information propagates and find that it is mainly encoded in the angle of the embeddings and gradually decreases with depth. Finally, we demonstrate strong mixing of input information in the generation of contextual embeddings by means of a novel quantification method based on gradient attribution. Overall, we show that self-attention distributions are not directly interpretable and present tools to further investigate Transformer models.

上一篇：N-BEATS: NEURAL BASIS EXPANSION ANALYSIS FORINTERPRETABLE TIME SERIES FORECASTING

下一篇：Fantastic Generalization Measuresand Where to Find Them

用户评价

全部评价

还没有评论，说两句吧！

热门资源

Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
The Variational S...

Unlike traditional images which do not offer in...
Learning to learn...

The move from hand-designed features to learned...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com