Abstract
Image and video annotation are challenging but important tasks for understanding digital multimedia content in computer vision. By nature, annotation is a multi-label multi-class classification problem, because every image is usually associated with more than one semantic keyword. As a result, label assignments are no longer confined to class membership indications, as in traditional single-label multi-class classification; they also convey important characteristic information for assessing object similarity from a knowledge perspective. Therefore, besides implicitly using label assignments to formulate label correlations, as many existing multi-label classification algorithms do, we propose a novel Multi-Label Feature Transform (MLFT) approach that also uses them explicitly as part of the data features. Through two transformations, on attributes and on label assignments respectively, MLFT uses a kernel to implicitly construct a label-augmented feature vector that integrates the attributes and labels of a data set in a balanced manner, so that data discriminability is enhanced by exploiting information from both the data and label perspectives. Promising experimental results on four standard multi-label data sets from image annotation and other applications demonstrate the effectiveness of our approach.
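
To illustrate what "implicitly constructing a label-augmented feature vector" through a kernel can mean, the following minimal sketch uses the standard fact that a sum of two kernels equals the inner product of the concatenated feature vectors. It is not the MLFT formulation itself: the linear kernels, the weight `beta`, the helper names `augmented_kernel` and `augment`, and the toy data are all assumptions made for illustration only.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def augmented_kernel(x1, y1, x2, y2, beta=1.0):
    """Linear attribute kernel plus a weighted linear label kernel.

    beta is a hypothetical balancing weight between attributes and labels.
    """
    return dot(x1, x2) + beta * dot(y1, y2)

def augment(x, y, beta=1.0):
    """Explicit label-augmented vector: attributes followed by scaled labels.

    Its inner product reproduces augmented_kernel, so the kernel never
    needs to build this vector explicitly.
    """
    scale = math.sqrt(beta)
    return list(x) + [scale * v for v in y]

# Toy data (made up): two images, 3 attributes and 4 binary labels each.
x_a, y_a = [0.2, 1.0, 0.5], [1, 0, 1, 0]
x_b, y_b = [0.1, 0.8, 0.4], [1, 0, 0, 1]

beta = 0.5
k_implicit = augmented_kernel(x_a, y_a, x_b, y_b, beta)
k_explicit = dot(augment(x_a, y_a, beta), augment(x_b, y_b, beta))

# Both routes give the same similarity value, up to floating-point rounding.
assert math.isclose(k_implicit, k_explicit)
```

In this sketch, two images that share labels receive a higher similarity than their attributes alone would give, which is one simple way label assignments can act as features.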