Abstract. Supervised learning techniques have made substantial progress
on video summarization. State-of-the-art approaches mostly regard the
predicted summary and the human summary as two sequences (sets),
and minimize discriminative losses that measure element-wise discrepancy. Such training objectives do not explicitly model how well the predicted summary preserves semantic information in the video. Moreover,
those methods often demand a large number of human-generated summaries. In this paper, we propose a novel sequence-to-sequence learning
model to address these deficiencies. The key idea is to complement the
discriminative losses with another loss that measures whether the predicted
summary preserves the same information as the original video. To this
end, we propose to augment standard sequence learning models with an
additional “retrospective encoder” that embeds the predicted summary
into an abstract semantic space. The embedding is then compared to
the embedding of the original video in the same space. The intuition is
that both embeddings ought to be close to each other for a video and its
corresponding summary. Thus our approach adds to the discriminative
loss a metric learning loss that minimizes the distance between such pairs
while maximizing the distances between unmatched ones. One important
advantage is that the metric learning loss readily allows learning from
videos without human-generated summaries. Extensive experimental results show that our model outperforms existing ones by a large margin
in both supervised and semi-supervised settings.
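
The following is a minimal sketch of how such a metric learning term could be implemented, assuming PyTorch. The function name, the margin value, and the use of other videos in the batch as unmatched pairs are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch (not the authors' released code) of a metric learning
# loss that pulls a video embedding toward the embedding of its own
# predicted summary and pushes it away from summaries of other videos.
import torch
import torch.nn.functional as F


def retrospective_metric_loss(video_emb, summary_emb, margin=1.0):
    """video_emb:   (batch, dim) embeddings of the original videos
    summary_emb: (batch, dim) embeddings of the predicted summaries
    margin:      hinge margin for unmatched (video, summary) pairs (assumed)
    """
    # Squared Euclidean distances between every video and every summary.
    dist = torch.cdist(video_emb, summary_emb, p=2) ** 2     # (batch, batch)

    batch = video_emb.size(0)
    pos = dist.diag()                                         # matched pairs
    mask = ~torch.eye(batch, dtype=torch.bool, device=dist.device)
    neg = dist[mask].view(batch, batch - 1)                   # unmatched pairs

    # Minimize distances of matched pairs; hinge-penalize unmatched pairs
    # that fall inside the margin.
    return pos.mean() + F.relu(margin - neg).mean()


# Usage (hypothetical weighting): combine with the discriminative loss.
# total_loss = discriminative_loss + lambda_metric * retrospective_metric_loss(v, s)
```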