Real-Time Monocular Depth Estimation using Synthetic Data
with Domain Adaptation via Image Style Transfer
Abstract
Monocular depth estimation via learning-based approaches has shown great promise in recent years. However, most monocular depth estimators must either rely on large quantities of ground-truth depth data, which are extremely expensive and difficult to obtain, or predict disparity as an intermediary step using a secondary supervisory signal, leading to blurring and other artefacts. Training a depth estimation model on pixel-perfect synthetic data can resolve most of these issues but introduces the problem of domain bias: the inability to apply a model trained on synthetic data to real-world scenarios. With advances in
image style transfer and its connections with domain adaptation (Maximum Mean Discrepancy), we take advantage of style transfer and adversarial training to predict pixel-perfect depth from a single real-world color image, based on training over a large corpus of synthetic environment data. Experimental results indicate the efficacy of our approach compared to contemporary state-of-the-art techniques.