Abstract
Recently, there has been a paradigm shift in stereo matching, with learning-based methods achieving the best results
on all popular benchmarks. The success of these methods
is due to the availability of training data with ground truth;
training learning-based systems on these datasets has allowed them to surpass the accuracy of conventional approaches based on heuristics and assumptions. Many of
these assumptions, however, have been validated extensively
and hold for the majority of possible inputs. In this paper,
we generate a matching volume leveraging both data with
ground truth and conventional wisdom. We accomplish this
by coalescing diverse evidence from a bidirectional matching
process via random forest classifiers. We show that the resulting matching volume estimation method achieves accuracy similar to that of
purely data-driven alternatives on benchmarks
and that it generalizes much better to unseen data. In fact, the
results we submitted to the KITTI and ETH3D benchmarks
were generated using a classifier trained on the Middlebury
2014 dataset.