Abstract
Functional object recognition in video is an emerging problem in visual surveillance and video understanding. By functional objects, we mean objects with a specific purpose, such as a postman or a delivery truck, which are defined more by their actions and behaviors than by their appearance. In this work, we present an approach for content-based learning and recognition of the function of moving objects given video-derived tracks. In particular, we show that the semantic behaviors of movers can be captured in a location-independent manner by attributing them with features that encode their relations and actions with respect to scene contexts. By scene contexts, we mean local scene regions with different functionalities, such as doorways and parking spots, with which moving objects often interact. Based on these representations, functional models are learned from examples, and novel instances are subsequently identified in unseen data. Furthermore, recognition in the presence of track fragmentation caused by imperfect tracking is addressed by a boosting-based track linking classifier. Our experimental results highlight both promising and practical aspects of our approach.