TVSum: Summarizing Web Videos Using Titles

2019-12-25

Abstract

Video summarization is a challenging problem in part because knowing which part of a video is important requires prior knowledge about its main topic. We present TVSum, an unsupervised video summarization framework that uses title-based image search results to find visually important shots. We observe that a video title is often carefully chosen to be maximally descriptive of its main topic, and hence images related to the title can serve as a proxy for important visual concepts of the main topic. However, because titles are free-formed, unconstrained, and often written ambiguously, images searched using the title can contain noise (images irrelevant to video content) and variance (images of different topics). To deal with this challenge, we developed a novel co-archetypal analysis technique that learns canonical visual concepts shared between video and images, but not in either alone, by finding a joint-factorial representation of two data sets. We introduce a new benchmark dataset, TVSum50, that contains 50 videos and their shot-level importance scores annotated via crowdsourcing. Experimental results on two datasets, SumMe and TVSum50, suggest our approach produces superior quality summaries compared to several recently proposed approaches.
