Abstract
We address the problem of composing a story out of multiple short video clips taken by a person during an activity or experience. Inspired by plot analysis of written stories, our method generates a sequence of video clips ordered in such a way that it reflects plot dynamics and content coherency. That is, given a set of multiple video clips, our method composes a video which we call a video-story. We define metrics on scene dynamics and coherency using dense optical flow features and a patch matching algorithm. Using these metrics, we define an objective function for the video-story. To efficiently search for the best video-story, we introduce a novel Branch-and-Bound algorithm which guarantees the global optimum. We collect a dataset consisting of 23 video sets from the web, comprising a total of 236 individual video clips. With the acquired dataset, we perform extensive user studies involving 30 human subjects, which quantitatively and qualitatively verify the effectiveness of our approach.
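To make the search concrete, the sketch below shows a generic Branch-and-Bound search over clip orderings. It is not the paper's algorithm or objective: the transition scores `score[i][j]`, the sum-of-transitions objective, and the max-edge bound are all illustrative assumptions. It only demonstrates why an admissible (optimistic) bound lets the search prune branches while still guaranteeing the global optimum.

```python
def best_story(score):
    """Branch-and-Bound search over clip orderings.

    Hypothetical setup: `score[i][j]` is an assumed pairwise transition
    score between clips i and j, and the objective is the sum of
    transition scores along the ordering. Because the bound below is
    admissible (never underestimates what a branch can still achieve),
    pruning never discards the true optimum.
    """
    n = len(score)
    # The best any single transition can ever contribute.
    max_edge = max(score[i][j] for i in range(n) for j in range(n) if i != j)
    best = [float("-inf"), None]  # [best value, best ordering]

    def expand(order, value):
        k = len(order)
        if k == n:  # complete ordering: update the incumbent if better
            if value > best[0]:
                best[:] = [value, list(order)]
            return
        # Admissible upper bound: assume every remaining transition
        # scores max_edge. If even that cannot beat the incumbent, prune.
        remaining_edges = n - 1 - max(k - 1, 0)
        if value + remaining_edges * max_edge <= best[0]:
            return
        for j in range(n):
            if j not in order:
                step = score[order[-1]][j] if order else 0.0
                expand(order + [j], value + step)

    expand([], 0.0)
    return best[1], best[0]
```

For example, with `score = [[0, 3, 1], [2, 0, 5], [4, 1, 0]]`, the search returns the ordering `[1, 2, 0]` with objective value 9, matching an exhaustive check over all permutations.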