资源论文Yin and Yang: Balancing and Answering Binary Visual Questions

Yin and Yang: Balancing and Answering Binary Visual Questions

2019-12-26 | |  57 |   38 |   0

Abstract

scenes play two roles (1) They allow us to focus on the highlevel semantics of the VQA task as opposed to the low-level recognition problems, and perhaps more importantly, (2) They provide us the modality to balance the dataset such that language priors are controlled, and the role of vision is essential. In particular, we collect fine-grained pairs of scenes for every question, such that the answer to the question is “yes” for one scene, and “no” for the other for the exact same question. Indeed, language priors alone do not perform better than chance on our balanced dataset. Moreover, our proposed approach matches the performance of a state-of-the-art VQA approach on the unbalanced dataset,and outperforms it on the balanced dataset.

上一篇:Feature Space Optimization for Semantic Video Segmentation

下一篇:Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks

用户评价
全部评价

热门资源

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Rating-Boosted La...

    The performance of a recommendation system reli...

  • Hierarchical Task...

    We extend hierarchical task network planning wi...