Abstract
In this paper, we focus on heterogeneous feature learning for RGB-D activity recognition. Considering that features from different channels may share similar hidden structures, we propose a joint learning model that simultaneously explores the shared and feature-specific components as an instance of heterogeneous multi-task learning. Within a unified framework, the proposed model is capable of: 1) jointly mining a set of subspaces with the same dimensionality to enable multi-task classifier learning, and 2) meanwhile quantifying the shared and feature-specific components of the features in those subspaces. To efficiently train the joint model, we propose a three-step iterative optimization algorithm, followed by two inference models. Extensive results on three activity datasets demonstrate the efficacy of the proposed method. In addition, a novel RGB-D activity dataset focusing on human-object interaction has been collected for evaluating the proposed method; it will be made available to the community for RGB-D activity benchmarking and analysis.