Cross-Modal Commentator: Automatic Machine Commenting Based on
Cross-Modal Information
Abstract
Automatic commenting of online articles can
provide additional opinions and facts to the
reader, which improves user experience and
engagement on social media platforms. Previous work focuses on automatic commenting
based solely on textual content. However, in
real-world scenarios, online articles usually contain
content in multiple modalities. For instance, graphic
news contains plenty of images in addition to
text. Non-textual content is also vital because it is not only more attractive to the
reader but may also provide critical information. To address this, we propose a new task:
cross-modal automatic commenting (CMAC),
which aims to generate comments by integrating
content from multiple modalities. We construct a large-scale dataset for this task and explore several
representative methods. Going a step further,
we present an effective co-attention model to
capture the dependency between textual and
visual information. Evaluation results show
that our proposed model achieves better
performance than competitive baselines.
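
To make the idea of co-attention between textual and visual information concrete, the following is a minimal generic sketch, not the model proposed in this paper. It assumes that text tokens and image regions have already been encoded as feature vectors of the same dimension, and it uses a plain dot product as the affinity score. The function names (`co_attention`, `weighted_sum`) and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Combine vectors using the given attention weights."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]


def co_attention(text_feats, image_feats):
    """Generic co-attention (assumed form, dot-product affinity).

    Returns one visual context vector per text token and one textual
    context vector per image region.
    """
    # affinity[i][j]: similarity between text token i and image region j
    affinity = [[dot(t, v) for v in image_feats] for t in text_feats]
    # Each text token attends over all image regions (rows of affinity).
    text_to_image = [softmax(row) for row in affinity]
    # Each image region attends over all text tokens (columns of affinity).
    image_to_text = [softmax(list(col)) for col in zip(*affinity)]
    text_ctx = [weighted_sum(w, image_feats) for w in text_to_image]
    image_ctx = [weighted_sum(w, text_feats) for w in image_to_text]
    return text_ctx, image_ctx


# Toy usage: 2 text tokens and 3 image regions, each a 2-dimensional feature.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_ctx, image_ctx = co_attention(text_feats, image_feats)
```

In this sketch each modality is summarized from the point of view of the other, so a downstream comment decoder could condition on both sets of context vectors; how the actual model combines them is not specified in this abstract.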