Multi-grained Attention with Object-level Grounding
for Visual Question Answering
Abstract
Attention mechanisms are widely used in Visual Question Answering (VQA) to search for visual clues related to the question. Most approaches train attention models from a coarse-grained association between sentences and images, which tends to fail on small objects or uncommon concepts. To address this problem, this paper proposes a multi-grained attention method. It learns explicit word-object correspondence through two types of word-level attention that complement the sentence-image association. Evaluated on the VQA benchmark, the multi-grained attention model achieves competitive performance with state-of-the-art models. Moreover, the visualized attention maps demonstrate that the addition of object-level grounding leads to a better understanding of the images and locates the attended objects more precisely.
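To make the idea of word-object attention concrete, the following is a minimal sketch (not the paper's actual model): each question word attends over detected object features via scaled dot-product scores followed by a softmax. The function names, the use of plain dot-product scoring, and the absence of any learned projection matrices are all simplifying assumptions for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def word_object_attention(word_vecs, object_vecs):
    """Illustrative word-level attention over object features.

    word_vecs:   list of d-dim word embeddings (one per question word)
    object_vecs: list of d-dim object region features (one per detected object)
    Returns, for each word, an attention distribution over the objects.
    """
    d = len(object_vecs[0])
    attn_maps = []
    for w in word_vecs:
        # Scaled dot-product score between this word and every object.
        scores = [sum(wi * oi for wi, oi in zip(w, o)) / math.sqrt(d)
                  for o in object_vecs]
        attn_maps.append(softmax(scores))
    return attn_maps
```

In this toy setup, a word embedding that is close to one object's feature vector receives most of that word's attention mass, which is the explicit word-object correspondence the abstract refers to; a real model would additionally learn projections and combine these maps with the sentence-image attention.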