Abstract
In group activity recognition, the temporal dynamics of the whole activity can be inferred based on the dynamics of the individual people representing the activity. To make use of this observation, we present a 2-stage deep temporal model for the group activity recognition problem, built on LSTM (long short-term memory) models. In our model, one LSTM is designed to represent the action dynamics of individual people in a sequence, and another LSTM is designed to aggregate person-level information for whole-activity understanding. We evaluate our model on two datasets: the Collective Activity Dataset and a new volleyball dataset. Experimental results demonstrate that our proposed model improves group activity recognition performance compared to baseline methods.
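The two-stage structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `LSTMCell` class, the `two_stage_forward` function, the random untrained weights, the feature sizes, and the element-wise max used to combine people at each frame are all assumptions made for illustration. The abstract states only that a second LSTM aggregates person-level information.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Hypothetical minimal LSTM cell with random, untrained weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # Four gates (input, forget, candidate, output); each weight row
        # covers the input, the previous hidden state, and a bias term.
        width = input_size + hidden_size + 1
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(width)]
            for _ in range(4 * hidden_size)
        ]

    def step(self, x, h, c):
        z = x + h + [1.0]  # concatenate input, hidden state, bias input
        pre = [sum(w * v for w, v in zip(row, z)) for row in self.weights]
        n = self.hidden_size
        i = [sigmoid(v) for v in pre[0:n]]
        f = [sigmoid(v) for v in pre[n:2 * n]]
        g = [math.tanh(v) for v in pre[2 * n:3 * n]]
        o = [sigmoid(v) for v in pre[3 * n:4 * n]]
        c_new = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
        h_new = [o[k] * math.tanh(c_new[k]) for k in range(n)]
        return h_new, c_new

    def run(self, sequence):
        """Return the hidden state at every time step of `sequence`."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs = []
        for x in sequence:
            h, c = self.step(list(x), h, c)
            outputs.append(h)
        return outputs


def two_stage_forward(person_sequences, person_lstm, group_lstm):
    """person_sequences[p][t] is the feature vector of person p at frame t."""
    # Stage 1: the same person-level LSTM runs over each person's sequence.
    person_states = [person_lstm.run(seq) for seq in person_sequences]
    num_frames = len(person_sequences[0])
    # Combine people at each frame (element-wise max is an assumption here).
    pooled = [
        [max(vals) for vals in zip(*(states[t] for states in person_states))]
        for t in range(num_frames)
    ]
    # Stage 2: the group-level LSTM summarises the pooled sequence; its
    # final hidden state stands in for the whole-activity representation.
    return group_lstm.run(pooled)[-1]


# Toy input: 6 people, 5 frames, 4 made-up features per person per frame.
rng = random.Random(0)
person_lstm = LSTMCell(input_size=4, hidden_size=3, rng=rng)
group_lstm = LSTMCell(input_size=3, hidden_size=2, rng=rng)
people = [[[rng.random() for _ in range(4)] for _ in range(5)]
          for _ in range(6)]
group_repr = two_stage_forward(people, person_lstm, group_lstm)
```

In a trained system, `group_repr` would be passed to a classifier over activity labels. With the max pooling assumed here, the result does not depend on the order in which people are listed.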