Abstract
Action analysis in images and video has been attracting increasing attention in computer vision. Recognizing specific actions in video clips has been the main focus. In this paper we move in a new, more general direction and ask the critical, fundamental questions: what is action, how is action different from motion, and where is the action in a given image or video? We study the philosophical and visual characteristics of action, which lead us to define actionness as intentional bodily movement of biological agents (people, animals). To solve the general problem, we propose the lattice conditional ordinal random field model, which incorporates local evidence as well as neighboring order agreement. We implement the new model in the continuous domain and apply it to scoring actionness in both image and video datasets. Our experiments demonstrate not only that our new model can outperform the popular ranking SVM but also that action is indeed distinct from motion.