Abstract
Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of the foreground and background need to be adjusted to make them compatible. Previous approaches to harmonizing composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable, especially when the contents of the two layers differ vastly. In this work, we propose an end-to-end deep convolutional neural network for image harmonization, which can capture both the context and the semantic information of composite images during harmonization. We also introduce an efficient way to collect large-scale, high-quality training data that facilitates the training process. Experiments on the synthesized dataset and on real composite images show that the proposed network outperforms previous state-of-the-art methods.