CarFusion: Combining Point Tracking and Part Detection for
Dynamic 3D Reconstruction of Vehicles
Abstract
Despite significant research in the area, reconstruction
of multiple dynamic rigid objects (e.g., vehicles) observed
from wide-baseline, uncalibrated and unsynchronized cameras remains hard. On one hand, feature tracking works
well within each view, but the tracks are hard to correspond across multiple cameras because of limited overlap in fields of view or
occlusions. On the other hand, advances in deep learning
have resulted in strong detectors that work across different
viewpoints but are still not precise enough for triangulation-based reconstruction. In this work, we develop a framework to fuse both the single-view feature tracks and multiview detected part locations to significantly improve the detection, localization and reconstruction of moving vehicles,
even in the presence of strong occlusions. We demonstrate
our framework at a busy traffic intersection by reconstructing over 62 vehicles passing within a 3-minute window.
We evaluate the different components within our framework
and compare to alternate approaches such as reconstruction using tracking-by-detection