Representing and Learning High Dimensional Data with the Optimal Transport
Map from a Probabilistic Viewpoint
Abstract
In this paper, we propose a generative model in the space
of diffeomorphic deformation maps. More precisely, we utilize the Kantorovich-Wasserstein metric and accompanying
geometry to represent an image as a deformation from templates. Moreover, we incorporate a probabilistic viewpoint
by assuming that each image is locally generated from a reference image. We capture the local structure by modelling
the tangent planes at reference images.
Once basis vectors for each tangent plane are learned
via probabilistic PCA, we can sample a local coordinate
that can be inverted back to image space exactly. With experiments using four different datasets, we show that the generative tangent plane model in the optimal transport (OT)
manifold can be learned with small numbers of images and
can be used to create infinitely many ‘unseen’ images. In
addition, Bayesian classification based on the
probabilistic modeling of the tangent planes shows improved
accuracy over classification performed in the image space. Taken together, the
results of our experiments support our claim that certain
datasets can be better represented with the Kantorovich-Wasserstein metric. We envision that the proposed method
could be a practical solution to learning and representing
data that is generated from templates in situations where
only limited numbers of data points are available.
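
The tangent-plane step described above can be illustrated with a minimal sketch of probabilistic PCA. This is not the authors' implementation: it assumes that each image has already been embedded as a vector in the tangent space at a reference image (the OT map computation and the inversion back to image space are not shown), and it uses the standard closed-form maximum-likelihood PPCA solution. The function names `fit_ppca` and `sample_ppca` and the random stand-in data are hypothetical.

```python
import numpy as np


def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit.

    X: (n, d) array of tangent-space vectors (assumed precomputed).
    q: latent dimension, assumed to satisfy q < min(n, d).
    Returns the mean, the (d, q) basis W, and the noise variance sigma2.
    """
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    # The SVD of the centered data yields the eigenvectors (rows of Vt)
    # and eigenvalues (s**2 / n) of the sample covariance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = np.zeros(d)
    eigvals[: len(s)] = s**2 / n
    # Noise variance: average of the d - q discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    # Basis: leading eigenvectors scaled by sqrt(lambda_i - sigma2).
    W = Vt[:q].T * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2


def sample_ppca(mu, W, sigma2, n_samples, rng):
    """Draw new tangent-space vectors x = mu + W z + noise, z ~ N(0, I)."""
    d, q = W.shape
    z = rng.standard_normal((n_samples, q))  # sampled local coordinates
    noise = np.sqrt(sigma2) * rng.standard_normal((n_samples, d))
    return mu + z @ W.T + noise


# Stand-in data: 50 random vectors of dimension 8 (not real OT embeddings).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
mu, W, sigma2 = fit_ppca(X, q=3)
samples = sample_ppca(mu, W, sigma2, n_samples=5, rng=rng)
```

In the setting of the paper, each row of `samples` would correspond to a new point on the tangent plane, which would then be mapped back to an image through the inverse of the transport embedding.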