Abstract
In this paper, we propose a Multi-task Learning Approach for Image Captioning (MLAIC), motivated by the fact that humans have no difficulty performing such tasks because they draw on capabilities from multiple domains. Specifically, MLAIC consists of three key components: (i) a multi-object classification model that learns rich category-aware image representations using a CNN image encoder; (ii) a syntax generation model that learns a better syntax-aware LSTM-based decoder; (iii) an image captioning model that generates textual image descriptions, sharing its CNN encoder with the object classification task and its LSTM decoder with the syntax generation task. In particular, the image captioning model benefits from the additional object categorization and syntax knowledge. Experimental results on the MS-COCO dataset demonstrate that our model achieves impressive results compared with other strong competitors.
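The parameter-sharing scheme described above can be sketched as follows. This is a minimal illustration under our own assumptions (all class and function names are hypothetical, not from the paper's code): the captioning task reuses the encoder owned by the classification task and the decoder owned by the syntax task, so gradients from all three tasks would update the shared modules.

```python
# Hypothetical sketch of MLAIC-style module sharing (toy stand-ins, no
# real CNN/LSTM): three tasks, two shared modules.

class SharedEncoder:
    """Stand-in for the CNN image encoder shared with classification."""
    def encode(self, image):
        return [sum(image)]  # toy "feature vector"

class SharedDecoder:
    """Stand-in for the syntax-aware LSTM decoder shared with syntax generation."""
    def decode(self, features):
        return ["token"] * len(features)  # toy "caption"

encoder = SharedEncoder()
decoder = SharedDecoder()

def object_classification(image):
    # Task (i): uses the shared encoder only.
    return encoder.encode(image)

def syntax_generation(features):
    # Task (ii): uses the shared decoder only.
    return decoder.decode(features)

def image_captioning(image):
    # Task (iii): composes both shared modules.
    return decoder.decode(encoder.encode(image))
```

The point of the sketch is purely structural: because `encoder` and `decoder` are single shared objects, training any one task would also refine the representations used by the others.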