Abstract
We propose a new learning-based method for estimating 2D human pose from a single image, using Dual-Source Deep Convolutional Neural Networks (DS-CNN). Recently, many methods have been developed to estimate human pose by using pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective. In this paper, we propose to integrate both the local (body) part appearance and the holistic view of each local part for more accurate human pose estimation. Specifically, the proposed DS-CNN takes a set of image patches (category-independent object proposals for training and multi-scale sliding windows for testing) as the input and then learns the appearance of each local part by considering its holistic view in the full body. Using DS-CNN, we achieve both joint detection, which determines whether an image patch contains a body joint, and joint localization, which finds the exact location of the joint in the image patch. Finally, we develop an algorithm to combine these joint detection/localization results from all the image patches for estimating the human pose. The experimental results show the effectiveness of the proposed method in comparison with state-of-the-art human-pose estimation methods based on pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective.
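
The test-time flow described above (multi-scale sliding windows, per-patch joint detection and localization, then combination across patches) can be illustrated with a minimal sketch. The abstract does not specify the combination algorithm, so the score-weighted averaging used here is only an assumed stand-in, and all names (`sliding_windows`, `combine_patch_predictions`, `det_scores`, `local_xy`) are hypothetical. The per-patch scores and locations would come from a trained network, which is not modeled here.

```python
import numpy as np

def sliding_windows(height, width, sizes, stride_frac=0.5):
    """Yield square windows (x, y, size) at several scales.

    The stride is a fraction of the window size (an assumed choice).
    """
    for size in sizes:
        stride = max(1, int(size * stride_frac))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size

def combine_patch_predictions(patches, det_scores, local_xy):
    """Fuse per-patch outputs into one image-space location per joint.

    patches:    (N, 3) rows of (x, y, size) for N image patches
    det_scores: (N, J) score that patch n contains joint j (detection)
    local_xy:   (N, J, 2) joint position inside the patch, normalized
                to [0, 1] (localization)
    Returns a (J, 2) array of score-weighted average joint locations.
    This weighting is an illustrative assumption, not the paper's rule.
    """
    patches = np.asarray(patches, dtype=float)
    det_scores = np.asarray(det_scores, dtype=float)
    local_xy = np.asarray(local_xy, dtype=float)

    origin = patches[:, None, :2]          # (N, 1, 2) patch top-left corner
    size = patches[:, None, 2:3]           # (N, 1, 1) patch side length
    global_xy = origin + local_xy * size   # (N, J, 2) image coordinates

    # Normalize scores per joint; guard against an all-zero column.
    total = det_scores.sum(axis=0, keepdims=True)
    weights = det_scores / np.maximum(total, 1e-12)
    return (weights[..., None] * global_xy).sum(axis=0)
```

In this sketch, patches with a low detection score for a joint contribute little to that joint's final location, which reflects the idea of using detection to decide which localization results to trust.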