资源算法 vqa-counting

vqa-counting

2020-01-13 | |  31 |   0 |   0

Counting component for VQA

Overview of what the method accomplishes

This is the official implementation of our ICLR 2018 paper Learning to Count Objects in Natural Images for Visual Question Answering in PyTorch. In this paper, we introduce a counting component that allows VQA models to count objects from an attention map, achieving state-of-the-art results on the number category of VQA v2.

The core module is fully contained in counting.py. If you want to use the counting component, that is the only file that you need.

Check out the README's in the vqa-v2 directory for VQA v2 and toy directory for our toy dataset for more specific information on how to train and evaluate on these datasets.

Single-model results on VQA v2 test-std split

As of time of writing, our accuracy on number questions is state-of-the art for single and ensemble models. The accuracy on the overall category is, as far as we know, the second best among single models (see MFH), though our approach is complementary to theirs.

Yes/NoNumberOtherAll
83.5651.3959.1168.41

UPDATE: With this year's VQA Challenge, our number results are no longer SotA. However, Bilinear Attention Networks [code] use this counting component with their improved attention model and get 54.04% on the number category, which is the new SotA on the number category. This validates our claim that a better attention model should lead to further improvements in counting through our counting module.

BibTeX entry

@InProceedings{zhang2018vqacount,
  author    = {Yan Zhang and Jonathon Hare and Adam Pr"ugel-Bennett},
  title     = {Learning to Count Objects in Natural Images for Visual Question Answering},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  eprint    = {1802.05766},
  url       = {https://openreview.net/forum?id=B12Js_yRb},
}


上一篇:VQA-tensorflow

下一篇:convolution-visualizer

用户评价
全部评价

热门资源

  • Keras-ResNeXt

    Keras ResNeXt Implementation of ResNeXt models...

  • seetafaceJNI

    项目介绍 基于中科院seetaface2进行封装的JAVA...

  • spark-corenlp

    This package wraps Stanford CoreNLP annotators ...

  • capsnet-with-caps...

    CapsNet with capsule-wise convolution Project ...

  • inferno-boilerplate

    This is a very basic boilerplate example for pe...