LSTMEmbed: Learning Word and Sense Representations from a Large
Semantically Annotated Corpus with Long Short-Term Memories
Abstract
While word embeddings are now a de facto
standard representation of words in most NLP
tasks, attention has recently been shifting
towards vector representations which capture
the different meanings, i.e., senses, of words.
In this paper we explore the capabilities of a
bidirectional LSTM model to learn representations of word senses from semantically annotated corpora. We show that using
an architecture that is aware of word order, such as an LSTM, enables us to create better representations. We assess our proposed
model on various standard benchmarks for
evaluating semantic representations, reaching
state-of-the-art performance on the SemEval-2014
word-to-sense similarity task. We release the code and the resulting word and sense
embeddings at http://lcl.uniroma1.it/LSTMEmbed.