Abstract
Recent neural network models have achieved state-of-the-art performance on the task of named entity recognition (NER). However, previous neural models typically treat input sentences as linear sequences of words and ignore rich structural information, such as the coreference relations among non-adjacent words, phrases, or entities. In this paper, we propose a novel approach to learning coreference-aware word representations for the NER task at the document level. In particular, we enrich the well-known neural architecture "CNN-BiLSTM-CRF" with a coreference layer on top of the BiLSTM layer to incorporate coreferential relations. Furthermore, we introduce coreference regularization, which encourages coreferential entities to share similar representations and consistent predictions within the same coreference cluster. Our proposed model achieves new state-of-the-art performance on two NER benchmarks: CoNLL-2003 and OntoNotes v5.0. More importantly, we demonstrate that our framework does not rely on gold coreference knowledge and still works well even when the coreferential relations are generated by a third-party toolkit.
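The abstract does not give the exact form of the coreference regularization; a minimal sketch of one plausible formulation, assuming mention representations $\mathbf{h}_i$ produced by the coreference layer and a set of coreference clusters $\mathcal{C}$, is a penalty that pulls representations within each cluster together:
\[
\mathcal{L}_{\text{coref}} \;=\; \sum_{c \in \mathcal{C}} \frac{1}{|c|^2} \sum_{i \in c} \sum_{j \in c} \big\lVert \mathbf{h}_i - \mathbf{h}_j \big\rVert_2^2 ,
\]
which would be combined with the standard CRF tagging loss as $\mathcal{L} = \mathcal{L}_{\text{CRF}} + \lambda\, \mathcal{L}_{\text{coref}}$, where $\lambda$ is a hypothetical weighting hyperparameter not specified in the abstract.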