Abstract
tags to mark keywords or topics. With the rapid growth of social networks, the task of automatically recommending hashtags has received considerable attention in recent years. Previous work has focused only on textual information. However, many microblog posts contain not only text but also accompanying images. These images can provide information that is absent from the text and could help improve the accuracy of hashtag recommendation. Motivated by the successful use of the attention mechanism, we propose a co-attention network that incorporates both textual and visual information to recommend hashtags for multimodal tweets. Experimental results on data collected from Twitter demonstrate that the proposed method achieves better performance than state-of-the-art methods that use textual information only.