Abstract
Vast amounts of video data are available on the web and are being generated daily by surveillance cameras and other sources. Being able to efficiently analyse and process this data is essential for a number of different applications. We want to efficiently detect activities in these videos, or to extract and store the essential information they contain for future use and for easy search and access. Cohn et al. (2012) proposed a comprehensive representation of spatial features that can be efficiently extracted from video and used for these purposes. In this paper, we present a modified version of this approach that is equally efficient and allows us to extract spatial information with much higher accuracy than previously possible. We present efficient algorithms both for extracting and storing spatial information from video and for processing this information in order to obtain useful spatial features. We evaluate our approach and demonstrate that the extracted spatial information is considerably more accurate than that obtained from existing approaches.