
Natural Language Object Retrieval

2019-12-27

Abstract

In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object. Natural language object retrieval differs from the text-based image retrieval task as it involves spatial information about objects within the scene and global scene context. To address this issue, we propose a novel Spatial Context Recurrent ConvNet (SCRC) model as a scoring function on candidate boxes for object retrieval, integrating spatial configurations and global scene-level contextual information into the network. Our model processes query text, local image descriptors, spatial configurations and global context features through a recurrent network, outputs the probability of the query text conditioned on each candidate box as a score for the box, and can transfer visual-linguistic knowledge from the image captioning domain to our task. Experimental results demonstrate that our method effectively utilizes both local and global information, outperforming previous baseline methods significantly on different datasets and scenarios, and can exploit large-scale vision and language datasets for knowledge transfer.
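The scoring scheme described in the abstract (rank each candidate box by the probability of the query text conditioned on that box) can be illustrated with a minimal sketch. This is not the SCRC model: `word_prob` is a hypothetical stand-in for the recurrent network, `spatial_feature` is an assumed normalized-corner encoding of the box, and local image descriptors and global context features are omitted entirely.

```python
import math
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels
# Hypothetical model interface: p(next word | previous words, box features)
WordProb = Callable[[Sequence[str], str, List[float]], float]


def spatial_feature(box: Box, img_w: float, img_h: float) -> List[float]:
    """Assumed spatial configuration: box corners scaled to [0, 1]."""
    xmin, ymin, xmax, ymax = box
    return [xmin / img_w, ymin / img_h, xmax / img_w, ymax / img_h]


def score_box(query: Sequence[str], box: Box, img_w: float, img_h: float,
              word_prob: WordProb) -> float:
    """log p(query | box), factorized word by word as a sequence model would."""
    feat = spatial_feature(box, img_w, img_h)
    return sum(math.log(word_prob(query[:t], query[t], feat))
               for t in range(len(query)))


def retrieve(query: Sequence[str], boxes: Sequence[Box], img_w: float,
             img_h: float, word_prob: WordProb) -> int:
    """Return the index of the candidate box with the highest query score."""
    scores = [score_box(query, b, img_w, img_h, word_prob) for b in boxes]
    return max(range(len(boxes)), key=scores.__getitem__)


def toy_word_prob(prefix: Sequence[str], word: str, feat: List[float]) -> float:
    """Toy stand-in for a trained network: only 'left'/'right' depend on the box."""
    x_center = (feat[0] + feat[2]) / 2
    if word == "left":
        return 0.9 if x_center < 0.5 else 0.1
    if word == "right":
        return 0.9 if x_center >= 0.5 else 0.1
    return 0.5


boxes = [(0, 0, 40, 40), (60, 0, 100, 40)]  # one box on each side of the image
left_idx = retrieve(["chair", "on", "the", "left"], boxes, 100, 100, toy_word_prob)
right_idx = retrieve(["chair", "on", "the", "right"], boxes, 100, 100, toy_word_prob)
```

With the toy probability function, only the words "left" and "right" distinguish the two boxes, which mirrors in miniature why spatial information matters for queries that refer to object position.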
