Abstract
We present a network architecture for processing point
clouds that directly operates on a collection of points represented as a sparse set of samples in a high-dimensional
lattice. Na¨?vely applying convolutions on this lattice scales
poorly, both in terms of memory and computational cost, as
the size of the lattice increases. Instead, our network uses
sparse bilateral convolutional layers as building blocks.
These layers maintain efficiency by using indexing structures to apply convolutions only on occupied parts of the lattice, and allow flexible specifications of the lattice structure
enabling hierarchical and spatially-aware feature learning,
as well as joint 2D-3D reasoning. Both point-based and
image-based representations can be easily incorporated in
a network with such layers and the resulting model can be
trained in an end-to-end manner. We present results on 3D
segmentation tasks where our approach outperforms existing state-of-the-art techniques