
Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects

2019-12-06
Abstract

Image captioning often requires a large set of training image-sentence pairs. In practice, however, acquiring sufficient training pairs is always expensive, making recent captioning models limited in their ability to describe objects outside of the training corpora (i.e., novel objects). In this paper, we present Long Short-Term Memory with Copying Mechanism (LSTM-C), a new architecture that incorporates copying into the Convolutional Neural Network (CNN) plus Recurrent Neural Network (RNN) image captioning framework for describing novel objects in captions. Specifically, freely available object recognition datasets are leveraged to develop classifiers for novel objects. Our LSTM-C then integrates the standard word-by-word sentence generation by a decoder RNN with a copying mechanism that may instead select words from the detected novel objects at proper places in the output sentence. Extensive experiments are conducted on both the MSCOCO image captioning and ImageNet datasets, demonstrating the ability of the proposed LSTM-C architecture to describe novel objects. Furthermore, superior results are reported when compared to state-of-the-art deep models.
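The abstract describes the decoder as mixing standard word generation with a copying mechanism that can select detected novel-object words. The snippet below is a minimal sketch of one such decoding step, based only on this description; the module name, layer layout, and gating scheme are illustrative assumptions rather than the paper's actual implementation.

```python
# Hypothetical sketch of one copying-augmented decoding step (PyTorch).
# Assumptions: word embeddings and LSTM hidden states share one size, and
# object_word_ids maps each detected object word to its vocabulary index.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyingDecoderStep(nn.Module):
    def __init__(self, hidden_size, vocab_size, num_objects):
        super().__init__()
        self.lstm_cell = nn.LSTMCell(hidden_size, hidden_size)
        self.generate = nn.Linear(hidden_size, vocab_size)      # standard generation logits
        self.copy_score = nn.Linear(hidden_size, num_objects)   # scores over detected object words
        self.gate = nn.Linear(hidden_size, 1)                    # generate-vs-copy mixing weight

    def forward(self, word_emb, state, object_word_ids):
        h, c = self.lstm_cell(word_emb, state)
        p_gen = F.softmax(self.generate(h), dim=-1)              # (batch, vocab)
        p_copy = F.softmax(self.copy_score(h), dim=-1)           # (batch, num_objects)
        g = torch.sigmoid(self.gate(h))                          # (batch, 1)
        # Scatter copy probabilities onto their vocabulary positions, then mix.
        p_copy_full = torch.zeros_like(p_gen).scatter_add(1, object_word_ids, p_copy)
        p_word = g * p_gen + (1.0 - g) * p_copy_full
        return p_word, (h, c)
```

At each step, the mixed distribution `p_word` can place probability mass on a novel-object word even if the generation branch alone would never produce it, which is the intuition the abstract attributes to LSTM-C.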

