Abstract
Time-lapse videos are often visually appealing but are difficult and costly to create. In this
paper, we present an end-to-end solution to synthesize a
time-lapse video from a single outdoor image using deep
neural networks. Our key idea is to train a conditional
generative adversarial network based on existing datasets of
time-lapse videos and image sequences. We propose a multi-frame joint conditional generation framework to effectively
learn the correlation between the illumination change of
an outdoor scene and the time of day. We further
present a multi-domain training scheme for robust training
of our generative models from two datasets with different
distributions and missing timestamp labels. Compared
to alternative time-lapse video synthesis algorithms, our
method uses the timestamp as the control variable and does
not require a reference video to guide the synthesis of the
final output. We conduct ablation studies to validate our
algorithm and compare it with state-of-the-art techniques both
qualitatively and quantitatively.
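As a concrete illustration of using the timestamp as the control variable, below is a minimal sketch of a time-conditioned generator, assuming PyTorch; the cyclic time-of-day encoding, module names, and layer sizes are illustrative assumptions for demonstration, not the architecture described in this paper.

import torch
import torch.nn as nn

def encode_time_of_day(t_hours: torch.Tensor) -> torch.Tensor:
    # Map hour-of-day in [0, 24) to a cyclic 2-D embedding so that
    # 23:59 and 00:00 land close together in the conditioning space.
    theta = 2.0 * torch.pi * t_hours / 24.0
    return torch.stack([torch.sin(theta), torch.cos(theta)], dim=-1)

class TimeConditionedGenerator(nn.Module):
    # Simplified encoder-decoder stand-in for a conditional GAN generator:
    # it maps an input image plus a target timestamp to a relit frame.
    def __init__(self, channels: int = 3, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Project the 2-D time embedding so it can be broadcast
        # over the spatial feature map.
        self.time_proj = nn.Linear(2, hidden)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(hidden, channels, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, t_hours: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(image)
        t_emb = self.time_proj(encode_time_of_day(t_hours))  # (B, hidden)
        feat = feat + t_emb[:, :, None, None]                # broadcast add
        return self.decoder(feat)

# Synthesize frames for several target times from a single input image.
g = TimeConditionedGenerator()
img = torch.randn(1, 3, 64, 64)
for hour in (6.0, 12.0, 18.0):
    frame = g(img, torch.tensor([hour]))
    print(hour, frame.shape)  # torch.Size([1, 3, 64, 64])

Varying only t_hours while holding the input image fixed yields a frame per target time, which is the sense in which the timestamp acts as the control variable for synthesis.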