资源论文Answer-Type Prediction for Visual Question Answering

Answer-Type Prediction for Visual Question Answering

2019-12-26 | |  48 |   46 |   0

Abstract

Recently, algorithms for object recognition and relatedtasks have become sufficiently proficient that new visiontasks can now be pursued. In this paper, we build a sys-tem capable of answering open-ended text-based questionsabout images, which is known as Visual Question Answer-ing (VQA). Our approach’s key insight is that we can pre-dict the form of the answer from the question. We formu-late our solution in a Bayesian framework. When our ap-proach is combined with a discriminative model, the com-bined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.

上一篇:Harnessing Object and Scene Semantics for Large-Scale Video Understanding

下一篇:Object Detection from Video Tubelets with Convolutional Neural Networks

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...