Abstract
We show that, from the output of a simple 3D human pose tracker, one can infer physical attributes (e.g., gender and weight) and aspects of mental state (e.g., happiness or sadness). This task is useful for man-machine communication, and it provides a natural benchmark for evaluating the performance of 3D pose tracking methods (vs. conventional Euclidean joint error metrics). Based on an extensive corpus of motion capture data, with physical and perceptual ground truth, we analyze the inference of subtle biologically inspired attributes from cyclic gait data. We show that inference is also possible with partial observations of the body, and with motions as short as a single gait cycle. Learning models from small amounts of noisy video pose data is, however, prone to overfitting. To mitigate this, we formulate learning in terms of domain adaptation, in which mocap data is used to regularize models for inference from video-based data.