Abstract
We propose a robust algorithm to generate video segment proposals. The proposals generated by our method can start from any frame in the video and are robust to complete occlusions. Our method does not assume specific motion models and even has a limited capability to generalize across videos. We build on our previous least squares tracking framework, where image segment proposals are generated and tracked using learned appearance models. The innovation in our new method lies in the use of two efficient moves, the merge move and free addition, to start segments from any frame and track them through complete occlusions without much additional computation. Segment size interpolation is used to detect occlusions effectively. We propose a new metric for evaluating video segment proposals on the challenging VSB-100 benchmark and present state-of-the-art results. Preliminary results are also shown for the potential use of our framework to track segments across different videos.