Abstract Despite deep neural networks have demonstrated strong power in face photo-sketch synthesis task, their performance, however, are still limited by the lack of training data (photo-sketch pairs). Knowledge Transfer (KT), which aims at training a smaller and fast student network with the information learned from a larger and accurate teacher network, has attracted much attention recently due to its superior performance in the acceleration and compression of deep neural networks. This work has brought us great inspiration that we can train a relatively small student network on limited training data by transferring knowledge from a larger teacher model trained on enough training data for other tasks. Therefore, we propose a novel knowledge transfer framework to synthesize face photos from face sketches or synthesize face sketches from face photos. Particularly, we utilize two teacher networks trained on large amount of data in related task to learn knowledge of face photos and knowledge of face sketches separately and transfer them to two student networks simultaneously. The two student networks, one for photo → sketch task and the other for sketch → photo task, can mimic and transform two kind of knowledge and transfer their knowledge mutually. With the proposed method, we can train a model which has superior performance using a small set of photosketch pairs. We validate the effectiveness of our method across several datasets. Quantitative and qualitative evaluations illustrate that our model outperforms other state-of-the-art methods in generating face sketches (or photos) with high visual quality and recognition ability