Abstract
Zero-shot learning (ZSL) aims to recognize objects of
unseen classes with available training data from another
set of seen classes. Existing solutions focus on knowledge transfer via an intermediate semantic
embedding (e.g., attributes) shared between seen and unseen classes. In this paper, we propose a novel projection
framework based on matrix tri-factorization with manifold
regularizations. Specifically, we learn the semantic embedding projection by decomposing the visual feature matrix
under the guidance of semantic embedding and class label
matrices. By additionally introducing manifold regularizations on visual data and semantic embeddings, the learned
projection can effectively capture the geometrical manifold
structure residing in both visual and semantic spaces. To
avoid the projection domain shift problem, we devise an effective prediction scheme by exploiting the test-time manifold structure. Extensive experiments on four benchmark
datasets show that our approach significantly outperforms
the state of the art, yielding an average improvement ratio
of 7.4% and 31.9% on the recognition and retrieval tasks,
respectively.
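
As a rough illustration of what a manifold regularization term can look like, the following minimal sketch builds a k-nearest-neighbor graph Laplacian over a set of feature vectors and evaluates the standard smoothness penalty tr(F^T L F). It is not the formulation used in this paper: the function names (knn_graph_laplacian, manifold_penalty), the binary edge weights, the unnormalized Laplacian, and the toy data are all assumptions made for the example.

```python
import numpy as np


def knn_graph_laplacian(X, k=1):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    X is an (n, d) array with one sample per row. Binary edge weights
    are an assumption of this sketch, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all rows.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    # Indices of the k nearest neighbors of each row.
    nn = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    D = np.diag(W.sum(axis=1))
    return D - W


def manifold_penalty(F, L):
    """Smoothness penalty tr(F^T L F) = 0.5 * sum_ij W_ij ||F_i - F_j||^2."""
    F = np.asarray(F, dtype=float)
    return float(np.trace(F.T @ L @ F))


# Toy data: two well-separated clusters of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
L = knn_graph_laplacian(X, k=1)

# An embedding that is constant within each cluster is smooth on the graph.
F_smooth = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]])
# An embedding that alternates between neighbors is not.
F_rough = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]])

smooth_penalty = manifold_penalty(F_smooth, L)
rough_penalty = manifold_penalty(F_rough, L)
```

In this toy setting the penalty is zero for the cluster-consistent embedding and positive for the alternating one, which is the behavior a manifold regularizer rewards: samples that are neighbors in the input space receive similar representations.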