Abstract
This paper presents a learning-based approach to tracking articulated human body motion from a single camera. To address the problem of pose ambiguity, a one-to-many mapping from image features to state space is learned using a set of relevance vector machines, extended to handle multivariate outputs. The image features are Hausdorff matching scores obtained by matching different shape templates to the image, where the multivariate relevance vector machines (MVRVM) select a sparse set of these templates. We demonstrate that these Hausdorff features reduce the estimation error in clutter compared to shape-context histograms. The method is applied to the pose estimation problem from a single input frame, and is embedded within a probabilistic tracking framework to include temporal information. We apply the algorithm to 3D hand tracking and full human body tracking.