R2P2: A ReparameteRized Pushforward Policy
for Diverse, Precise Generative Path Forecasting
Abstract. We propose a method to forecast a vehicle’s ego-motion as
a distribution over spatiotemporal paths, conditioned on features (e.g.,
from LIDAR and images) embedded in an overhead map. The method
learns a policy inducing a distribution over simulated trajectories that is
both “diverse” (produces most paths likely under the data) and “precise”
(mostly produces paths likely under the data). This balance is achieved
through minimization of a symmetrized cross-entropy between the
distribution and demonstration data. By viewing the simulated-outcome
distribution as the pushforward of a simple distribution under a simulation
operator, we obtain expressions for the cross-entropy metrics that can
be efficiently evaluated and differentiated, enabling stochastic-gradient
optimization. We propose concrete policy architectures for this model,
discuss our evaluation metrics relative to previously-used metrics, and
demonstrate the superiority of our method relative to state-of-the-art
methods on both the KITTI dataset and a similar but novel, larger
real-world dataset explicitly designed for the vehicle forecasting domain.
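
To make the pushforward view concrete, the following is a minimal one-dimensional sketch under stated assumptions, not the policy architecture proposed in this paper. Base noise z is pushed through a one-step simulator to produce a path, the map is inverted to recover z from any path, and the change-of-variables formula gives an exact path log-density. The names `drift`, `SIGMA`, `log_p_tilde`, and `beta` are hypothetical: `drift` stands in for a learned policy mean, `SIGMA` for a learned noise scale, `log_p_tilde` for some density fit to the demonstration data, and `beta` for a weight on the second term.

```python
import math
import random

SIGMA = 0.5  # fixed noise scale (assumption; a learned policy would predict it)


def drift(prev):
    # Hypothetical policy mean: pull the state gently toward the origin.
    return -0.1 * prev


def simulate(z, x0=0.0):
    """Push base noise z_1..z_T forward through the one-step policy."""
    path, prev = [], x0
    for z_t in z:
        prev = prev + drift(prev) + SIGMA * z_t
        path.append(prev)
    return path


def invert(path, x0=0.0):
    """Recover the base noise that generates a given path."""
    z, prev = [], x0
    for x_t in path:
        z.append((x_t - prev - drift(prev)) / SIGMA)
        prev = x_t
    return z


def log_q(path, x0=0.0):
    """Log-density of a path under the pushforward, by change of variables."""
    z = invert(path, x0)
    log_base = sum(-0.5 * z_t * z_t - 0.5 * math.log(2 * math.pi) for z_t in z)
    # The Jacobian dx/dz is triangular with SIGMA on the diagonal.
    log_det = len(path) * math.log(SIGMA)
    return log_base - log_det


def log_p_tilde(path):
    # Hypothetical stand-in for a density fit to the data: i.i.d. N(0, 1) per step.
    return sum(-0.5 * x * x - 0.5 * math.log(2 * math.pi) for x in path)


def symmetrized_cross_entropy(demos, n_samples=256, beta=1.0, seed=0):
    """Monte Carlo estimate of H(p, q) + beta * H(q, p_tilde)."""
    rng = random.Random(seed)
    horizon = len(demos[0])
    # Forward term: demonstrations scored under the model (rewards diversity).
    forward = -sum(log_q(d) for d in demos) / len(demos)
    # Reverse term: model samples scored under p_tilde (rewards precision).
    samples = [
        simulate([rng.gauss(0.0, 1.0) for _ in range(horizon)])
        for _ in range(n_samples)
    ]
    reverse = -sum(log_p_tilde(s) for s in samples) / n_samples
    return forward + beta * reverse
```

Because each sample is a deterministic function of its base noise, both terms are differentiable in the simulator's parameters when the same construction is written in an automatic-differentiation framework. This is what makes stochastic-gradient optimization of the objective possible.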