Abstract
Efficient estimation of depth from pairs of stereo images
is one of the core problems in computer vision. We efficiently
solve the specialized problem of stereo matching under active illumination using a new learning-based algorithm. This
type of ‘active’ stereo, i.e., stereo matching where scene texture is augmented by an active light projector, is proving compelling for designing depth cameras, largely due to improved
robustness when compared to time-of-flight or traditional
structured light techniques. Our algorithm uses an unsupervised greedy optimization scheme that learns features
that are discriminative for estimating correspondences in
infrared images. The proposed method optimizes a series of
sparse hyperplanes that are used at test time to remap all the
image patches into a compact binary representation in O(1) time.
The proposed algorithm is cast in a PatchMatch Stereo-like
framework, producing depth maps at 500 Hz. In contrast to
standard structured light methods, our approach generalizes to different scenes, does not require tedious per-camera
calibration procedures, and is not adversely affected by interference from overlapping sensors. Extensive evaluations
show we surpass the quality and overcome the limitations of
current depth sensing technologies.
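
To make the binary remapping concrete, the following is a minimal sketch of how a set of sparse hyperplanes could turn an image patch into a compact binary code that is then compared by Hamming distance. It is an illustration under stated assumptions, not the method as implemented: the hyperplanes here are random rather than learned, and the function names, the 5x5 patch size, the 32-bit code length, and the 4 nonzero weights per hyperplane are all hypothetical choices. One reading of the constant-time claim is that each bit touches only a fixed number of pixels, so the cost per patch does not grow with the patch size.

```python
import random


def make_sparse_hyperplanes(num_bits, patch_size, nonzeros, seed=0):
    """Build hypothetical sparse hyperplanes (random stand-ins for learned ones).

    Each hyperplane is (pixel_indices, weights, threshold) and reads only
    `nonzeros` pixels of the patch.
    """
    rng = random.Random(seed)
    planes = []
    for _ in range(num_bits):
        idx = rng.sample(range(patch_size), nonzeros)
        weights = [rng.uniform(-1.0, 1.0) for _ in idx]
        planes.append((idx, weights, 0.0))
    return planes


def binary_code(patch, planes):
    """Map a flattened patch to an integer whose bits are hyperplane signs."""
    code = 0
    for bit, (idx, weights, threshold) in enumerate(planes):
        # Sparse dot product: only `nonzeros` multiplications per bit.
        response = sum(w * patch[i] for i, w in zip(idx, weights))
        if response > threshold:
            code |= 1 << bit
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


# Assumed setup: 5x5 patches, 32-bit codes, 4 nonzero weights per hyperplane.
planes = make_sparse_hyperplanes(num_bits=32, patch_size=25, nonzeros=4)
rng = random.Random(1)
left_patch = [rng.random() for _ in range(25)]
right_patch = [rng.random() for _ in range(25)]

left_code = binary_code(left_patch, planes)
right_code = binary_code(right_patch, planes)
cost = hamming(left_code, right_code)  # matching cost between the two patches
```

In a PatchMatch-style search, a cost of this kind would be evaluated for each candidate disparity, with lower Hamming distance indicating a better correspondence.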