Abstract
Transfer learning addresses the problem that labeled training data are insufficient to produce a high-performance model. Typically, given a target learning task, most transfer learning approaches require the designer to select one or more auxiliary tasks as sources. However, how to automatically select the right source data to enable effective knowledge transfer is still an unsolved problem, which limits the applicability of transfer learning. In this paper, we take one step ahead and propose a novel transfer learning framework, known as source-selection-free transfer learning (SSFTL), to free users from the need to select source domains. Instead of asking users for source and target data pairs, as traditional transfer learning does, SSFTL turns to online information sources such as the World Wide Web or Wikipedia for help. The source data for transfer learning may be hidden somewhere within these large online information sources, but the users do not know where. Based on the online information sources, we train a large number of classifiers. Then, given a target task, SSFTL builds a bridge between the labels of the potential source candidates and the target domain data via some large online social media with tag clouds, which serve as a label translator. An added advantage of SSFTL is that, unlike many previous transfer learning approaches, which are difficult to scale up to the Web scale, SSFTL is highly scalable and can offload much of the training work to an offline stage. We demonstrate the effectiveness and efficiency of SSFTL through extensive experiments on several real-world text classification datasets.