Abstract
The predominant strategy for facial expression analysis and temporal analysis of facial events is the following: a generic facial landmark tracker, usually trained on thousands of carefully annotated examples, is applied to track the landmark points, and analysis is then performed using mostly the shape and, more rarely, the facial texture. This paper challenges the above framework by showing that it is feasible to perform joint landmark localization (i.e., spatial alignment) and temporal analysis of behavioural sequences using only a simple face detector and a simple shape model. To do so, we propose a new component analysis technique, which we call Autoregressive Component Analysis (ARCA), and we show how the parameters of a motion model can be jointly retrieved. The method does not require any sophisticated landmark tracking methodology and simply employs pixel intensities for the texture representation.