GCDT: A Global Context Enhanced Deep Transition Architecture
for Sequence Labeling
Abstract
Current state-of-the-art systems for sequence labeling tasks are typically based on
the family of Recurrent Neural Networks
(RNNs). However, the shallow connections
between consecutive hidden states of RNNs
and insufficient modeling of global information restrict the potential performance of those
models. In this paper, we address these
issues by proposing GCDT, a Global Context enhanced Deep Transition architecture for sequence labeling. We deepen the
state transition path at each position in a sentence, and further assign every token a
global representation learned from the entire
sentence. Experiments on two standard sequence labeling tasks show that, given only
training data and the ubiquitous word embeddings (GloVe), our GCDT achieves 91.96 F1
on the CoNLL03 NER task and 95.43 F1 on
the CoNLL2000 Chunking task, outperforming the best reported results under the same
settings. Furthermore, by leveraging BERT as
an additional resource, we establish new state-of-the-art results with 93.47 F1 on NER and
97.30 F1 on Chunking.
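
The idea of giving every token a global representation learned from the entire sentence can be illustrated with a minimal sketch. The abstract does not say how the sentence-level vector is computed or how it is attached to each token, so the mean pooling and concatenation used here are assumptions made purely for illustration, and the names `mean_pool` and `add_global_context` are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average equal-length vectors element-wise (assumed aggregation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def add_global_context(token_vectors: List[Vector]) -> List[Vector]:
    """Append one sentence-level vector to every token vector.

    Concatenation is an assumption; the source only states that each
    token receives a global representation of the whole sentence.
    """
    global_vector = mean_pool(token_vectors)
    return [token + global_vector for token in token_vectors]


# Toy sentence of three tokens with 2-dimensional representations.
sentence = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
enhanced = add_global_context(sentence)
```

In this toy setting each token keeps its own two features and gains the same two sentence-level features, so a downstream labeler sees local and sentence-wide information at every position.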