Amodal Detection of 3D Objects:
Inferring 3D Bounding Boxes from 2D Ones in RGB-Depth Images
Abstract
This paper addresses the problem of amodal 3D object detection. The task is not only to localize objects in the 3D world, but also to estimate their physical sizes and poses, even if only parts of them are visible in the RGB-D image. Recent approaches have attempted to harness the point cloud from the depth channel to exploit 3D features directly in 3D space, and have demonstrated superiority over traditional 2.5D representation approaches.
We revisit the amodal 3D detection problem by sticking
to the 2.5D representation framework, and directly relate
2.5D visual appearance to 3D objects. We propose a novel
3D object detection system that simultaneously predicts objects’ 3D locations, physical sizes, and orientations in indoor scenes. Experiments on the NYUV2 dataset show that our algorithm significantly outperforms the state of the art and indicate that the 2.5D representation is capable of encoding features for 3D amodal object detection. All source code and data are available at https://github.com/phoenixnn/Amodal3Det.