Abstract
Neural style transfer has drawn broad attention in recent years. However, most existing methods aim to explicitly model the transformation between different styles, and
the learned model is thus not generalizable to new styles.
We instead attempt to separate the representations of style
and content, and propose a generalized style transfer network consisting of a style encoder, a content encoder, a mixer, and
a decoder. The style encoder and content encoder are used
to extract the style and content factors from the style reference images and content reference images, respectively.
The mixer employs a bilinear model to integrate these
two factors and feeds the result into a decoder to generate
images with the target style and content. To separate the style
features and content features, we leverage the conditional
dependence of styles and contents given an image. During
training, the encoder networks learn to extract styles and
contents from two sets of reference images of limited size,
one with shared style and the other with shared content.
This learning framework allows simultaneous style transfer among multiple styles and can be regarded as a special
‘multi-task’ learning scenario. The encoders are expected
to capture the underlying features of different styles and
contents, features that generalize to new styles and contents.
For validation, we apply the proposed method to the
Chinese typeface transfer problem. Extensive experimental
results on character generation demonstrate the effectiveness and robustness of our method.
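
As an illustration of the bilinear mixing step described above, the following is a minimal sketch in plain Python. It assumes that the style and content factors are flat vectors and that the mixer weights form a three-way tensor. The function names (bilinear_mix, random_weights), the dimensions, and the random initialization are hypothetical choices made for illustration and are not taken from the paper.

```python
import random


def bilinear_mix(style, content, weights):
    """Combine a style vector and a content vector with a bilinear map.

    out[k] = sum_i sum_j style[i] * weights[k][i][j] * content[j]

    The tensor layout (output, style, content) is an assumption of this sketch.
    """
    return [
        sum(
            s_i * w_kij * c_j
            for s_i, w_row in zip(style, w_k)
            for w_kij, c_j in zip(w_row, content)
        )
        for w_k in weights
    ]


def random_weights(out_dim, style_dim, content_dim, seed=0):
    """Hypothetical initializer: uniform weights in [-1, 1]."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(content_dim)]
         for _ in range(style_dim)]
        for _ in range(out_dim)
    ]


# Toy factors standing in for the encoder outputs.
style = [1.0, 0.5]
content = [2.0, -1.0, 0.0]
W = random_weights(out_dim=4, style_dim=2, content_dim=3)

# The mixed vector is what a decoder would consume to generate an image.
mixed = bilinear_mix(style, content, W)
```

Because the map is linear in each argument separately, scaling the style factor scales the mixed vector by the same amount while the content factor is left unchanged, which is the sense in which the two factors interact multiplicatively rather than additively.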