Abstract
Typical techniques for sequence classification are designed for well-segmented sequences that have been
edited to remove noisy or irrelevant parts. Such
methods therefore cannot be easily applied to the noisy sequences expected in real-world applications. In this paper, we present
the Temporal Attention-Gated Model (TAGM), which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences.
Specifically, we extend the concept of an attention model to
measure the relevance of each observation (time step) of
a sequence. We then use a novel gated recurrent network
to learn the hidden representation for the final prediction.
An important advantage of our approach is interpretability,
since the temporal attention weights provide a meaningful
value for the salience of each time step in the sequence.
We demonstrate the merits of our TAGM approach, both for
prediction accuracy and interpretability, on three different
tasks: spoken digit recognition, text-based sentiment analysis, and visual event recognition.
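
To make the idea of attention-gated recurrence concrete, the following minimal Python sketch shows one plausible form of the mechanism described above. It is an illustration under stated assumptions, not the model's actual formulation: the per-step linear scorer in `attention_weights`, the update rule in `gated_pass`, and all function and parameter names are hypothetical. Each time step receives a scalar relevance weight in (0, 1), and that weight decides how much the hidden state is overwritten by the new observation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attention_weights(xs, w, b):
    """Hypothetical relevance scorer: a_t = sigmoid(w . x_t + b) per time step.

    Assumption: a simple linear scorer stands in for whatever attention
    module the model actually uses.
    """
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in xs]


def gated_pass(xs, attn, W, U):
    """Run an attention-gated recurrence over a sequence.

    Assumed update: h_t = (1 - a_t) * h_{t-1} + a_t * tanh(W x_t + U h_{t-1}).
    A step with a_t near 0 leaves the hidden state almost unchanged, so
    noisy or irrelevant observations are effectively skipped.
    """
    hidden_size = len(W)
    h = [0.0] * hidden_size
    for x, a in zip(xs, attn):
        candidate = [
            math.tanh(
                sum(W[i][j] * x[j] for j in range(len(x)))
                + sum(U[i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
        h = [(1.0 - a) * h[i] + a * candidate[i] for i in range(hidden_size)]
    return h


# Toy usage: a 1-D sequence whose second step is treated as irrelevant.
xs = [[1.0], [5.0]]
final_hidden = gated_pass(xs, attn=[1.0, 0.0], W=[[1.0]], U=[[0.0]])
```

In this toy run the second observation has weight 0, so the final hidden state equals the state after the first step. The same weights can be read directly as a salience value for each time step, which is the source of the interpretability claimed above.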