Abstract
Reconstructing an arbitrary configuration of 3D points from their projection in an image is an ill-posed problem. When the points hold semantic meaning, such as anatomical landmarks on a body, human observers can often infer a plausible 3D configuration, drawing on extensive visual memory. We present an activity-independent method to recover the 3D configuration of a human figure from 2D locations of anatomical landmarks in a single image, leveraging a large motion capture corpus as a proxy for visual memory. Our method solves for anthropometrically regular body pose and explicitly estimates the camera via a matching pursuit algorithm operating on the image projections. Anthropometric regularity (i.e., that limbs obey known proportions) is a highly informative prior, but directly applying such constraints is intractable. Instead, we enforce a necessary condition on the sum of squared limb lengths that can be solved for in closed form to discourage implausible configurations in 3D. We evaluate performance on a wide variety of human poses captured from different viewpoints and show generalization to novel 3D configurations and robustness to missing data.
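As an illustration of why a constraint on the sum of squared limb lengths is necessary but not sufficient for anthropometric regularity, the following minimal sketch checks that condition on a toy skeleton. It is not the method's solver: the four-joint chain, the reference lengths, and the names `LIMBS`, `REF_LENGTHS`, `limb_lengths`, and `sum_sq_residual` are all hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical toy skeleton: 4 joints connected in a chain by 3 limbs.
LIMBS = [(0, 1), (1, 2), (2, 3)]
# Hypothetical reference limb lengths (arbitrary units).
REF_LENGTHS = np.array([1.0, 2.0, 2.0])


def limb_lengths(pose, limbs=LIMBS):
    """Return the Euclidean length of every limb in a (J, 3) pose array."""
    pose = np.asarray(pose, dtype=float)
    return np.array([np.linalg.norm(pose[i] - pose[j]) for i, j in limbs])


def sum_sq_residual(pose, ref_lengths=REF_LENGTHS, limbs=LIMBS):
    """Sum of squared limb lengths minus the reference sum of squares."""
    lengths = limb_lengths(pose, limbs)
    return float(np.sum(lengths ** 2) - np.sum(np.asarray(ref_lengths) ** 2))


# Every limb matches its reference length, so the residual is zero.
regular_pose = np.array(
    [[0, 0, 0], [1, 0, 0], [1, 2, 0], [1, 2, 2]], dtype=float
)

# Limb lengths are (2, 2, 1): individual limbs are wrong, yet the sum of
# squares is unchanged, so the condition alone cannot reject this pose.
irregular_pose = np.array(
    [[0, 0, 0], [2, 0, 0], [2, 2, 0], [2, 2, 1]], dtype=float
)

# Uniformly stretched pose: the sum of squares grows, so it is rejected.
stretched_pose = 2.0 * regular_pose

if __name__ == "__main__":
    for name, pose in [
        ("regular", regular_pose),
        ("irregular", irregular_pose),
        ("stretched", stretched_pose),
    ]:
        print(name, limb_lengths(pose), sum_sq_residual(pose))
```

In this toy setting the regular pose and the permuted-length pose both give a zero residual, while the stretched pose does not. This reflects the trade-off stated above: the single scalar condition is weaker than per-limb constraints, but it still discourages grossly implausible configurations.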