资源论文Fine-grained Image Classification via Combining Vision and Language

Fine-grained Image Classification via Combining Vision and Language

2019-12-10 | |  93 |   34 |   0

Abstract

Fine-grained image classifification is a challenging task due to the large intra-class variance and small inter-class variance, aiming at recognizing hundreds of sub-categories belonging to the same basic-level category. Most existing fifine-grained image classifification methods generally learn part detection models to obtain the semantic parts for better classifification accuracy. Despite achieving promising results, these methods mainly have two limitations: (1) not all the parts which obtained through the part detection models are benefificial and indispensable for classififi- cation, and (2) fifine-grained image classifification requires more detailed visual descriptions which could not be provided by the part locations or attribute annotations. For addressing the above two limitations, this paper proposes the two-stream model combining vision and language (CVL) for learning latent semantic representations. The vision stream learns deep representations from the original visual information via deep convolutional neural network. The language stream utilizes the natural language descriptions which could point out the discriminative parts or characteristics for each image, and provides a flflexible and compact way of encoding the salient visual aspects for distinguishing sub-categories. Since the two streams are complementary, combining the two streams can further achieves better classifification accuracy. Comparing with 12 state-ofthe-art methods on the widely used CUB-200-2011 dataset for fifine-grained image classifification, the experimental results demonstrate our CVL approach achieves the best performance

上一篇:Convolutional Random Walk Networks for Semantic Image Segmentation

下一篇:Fine-tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally

用户评价
全部评价

热门资源

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Rating-Boosted La...

    The performance of a recommendation system reli...

  • Hierarchical Task...

    We extend hierarchical task network planning wi...