Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations
Abstract
Recently, emotion detection in conversations has become a hot research topic in the Natural Language Processing community. In this paper, we focus on emotion detection in multi-speaker conversations rather than the two-speaker conversations addressed in existing studies. Unlike non-conversational text, emotion detection in conversational text poses a specific challenge: modeling the context-sensitive dependence. In addition, emotion detection in multi-speaker conversations poses a further challenge: modeling the speaker-sensitive dependence. To address these two challenges, we propose a conversational graph-based convolutional neural network. On the one hand, our approach represents each utterance and each speaker as a node. On the other hand, the context-sensitive dependence is represented by an undirected edge between two utterance nodes from the same conversation, and the speaker-sensitive dependence is represented by an undirected edge between an utterance node and its speaker node. In this way, the entire conversational corpus can be represented as a large heterogeneous graph, and the emotion detection task can be recast as a classification problem over the utterance nodes in the graph. Experimental results on a multi-modal, multi-speaker conversation corpus demonstrate the effectiveness of the proposed approach.
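The graph construction described above can be sketched in a few lines of plain Python. This is a minimal illustrative sketch, not the authors' implementation: the function and variable names are assumptions, and the context-sensitive edges here connect every pair of utterances within a conversation, one reasonable reading of "two utterance nodes from the same conversation".

```python
from itertools import combinations

def build_conversation_graph(conversations):
    """Build a heterogeneous graph over utterances and speakers.

    conversations: dict mapping a conversation id to an ordered list of
    (speaker, utterance_text) turns. Returns (nodes, edges), where each
    edge is an unordered (frozenset) pair of nodes, matching the paper's
    undirected edges. Hypothetical sketch; names are not from the paper.
    """
    nodes, edges = set(), set()
    for conv_id, turns in conversations.items():
        utt_nodes = []
        for i, (speaker, _text) in enumerate(turns):
            u = ("utt", conv_id, i)   # one node per utterance
            s = ("spk", speaker)      # one node per speaker (shared across turns)
            nodes.update([u, s])
            # speaker-sensitive dependence: utterance <-> its speaker
            edges.add(frozenset([u, s]))
            utt_nodes.append(u)
        # context-sensitive dependence: utterance pairs in the same conversation
        for a, b in combinations(utt_nodes, 2):
            edges.add(frozenset([a, b]))
    return nodes, edges

conv = {"c1": [("A", "hi"), ("B", "hello"), ("A", "how are you?")]}
nodes, edges = build_conversation_graph(conv)
# 3 utterance nodes + 2 speaker nodes; 3 speaker edges + 3 context edges
```

A graph convolutional network would then classify the utterance nodes of this heterogeneous graph, with the utterance features (and, in the multi-modal setting, the audio/visual features) attached to the `("utt", ...)` nodes.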