Abstract
This paper addresses the segmentation of videos with arbitrary motion, including dynamic textures, using novel motion features and a supervised learning approach. Dynamic textures are commonplace in natural scenes, and exhibit complex patterns of appearance and motion (e.g. water, smoke, swaying foliage). These are difficult for existing segmentation algorithms, often violate the brightness constancy assumption needed for optical flow, and have complex segment characteristics beyond uniform appearance or motion. Our solution uses custom spatiotemporal filters that capture texture and motion cues, along with a novel metric-learning framework that optimizes this representation for specific objects and scenes. This is used within a hierarchical, graph-based segmentation setting, yielding state-of-the-art results for dynamic texture segmentation. We also demonstrate the applicability of our approach to general object and motion segmentation, showing significant improvements over unsupervised segmentation and results comparable to the best task-specific approaches.