Abstract
We address the task of inferring the future actions of people from noisy visual input. We denote this task activity forecasting. To achieve accurate activity forecasting, our approach models the effect of the physical environment on the choice of human actions. This is accomplished by the use of state-of-the-art semantic scene understanding combined with ideas from optimal control theory. Our unified model also integrates several other key elements of activity analysis, namely, destination forecasting, sequence smoothing and transfer learning. As proof-of-concept, we focus on the domain of trajectory-based activity analysis from visual input. Experimental results demonstrate that our model accurately predicts distributions over future actions of individuals. We show how the same techniques can improve the results of tracking algorithms by leveraging information about likely goals and trajectories.