Abstract
Statistical shape modelling (SSM) is a popular technique in computer vision applications, in which the shape variation of a given structure is modelled by principal component analysis (PCA) on a set of training samples. The issue of sample size sufficiency is not generally considered. In this paper, we propose a framework to investigate the sources of SSM inaccuracy. Based on this framework, we propose a procedure to determine sample size sufficiency by testing whether the training data stabilises the SSM. In addition, the number of principal modes to retain (the PCA dimension) is usually chosen using rules that aim to cover a percentage of the total variance or to limit the residual to a threshold. However, an ideal rule should retain modes that correspond to real structural variation and discard those that are dominated by noise. We show that these commonly used rules are not reliable, and we propose a new rule that uses bootstrap stability analysis of mode directions to determine the PCA dimension. For validation, we use synthetic 3D face datasets generated from a known number of structural modes with added noise. A four-way ANOVA of model reconstruction accuracy against sample size, shape vector dimension, PCA dimension, and noise level shows that there is no universal sample size guideline for SSM, nor is there a simple relationship to the shape vector dimension (p-value = 0.2932). Validation of our rule for retaining structural modes showed that it detected the correct number of modes to retain where the conventional methods failed. The methods were also tested on real 2D (22 points) and 3D (500 points) face data, retaining 24 and 70 modes, with sample sufficiency reached at approximately 50 and 150 samples respectively. We provide a foundation for appropriate selection of PCA dimension and determination of sample size sufficiency in statistical shape modelling.
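To make the idea of bootstrap stability analysis of mode directions concrete, the following is a minimal sketch and not the procedure defined in this paper. It assumes that stability is measured as the mean absolute cosine between each mode of the full training set and its best-matching mode in a bootstrap resample; the function names, the matching scheme, and the synthetic data parameters are all hypothetical choices made for illustration.

```python
import numpy as np


def pca_modes(X, k):
    """Return the first k principal directions of X (rows are samples) as unit row vectors."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]


def bootstrap_mode_stability(X, k, n_boot=100, seed=0):
    """Mean absolute cosine between each reference mode and its best bootstrap match.

    Assumption: matching by maximum absolute cosine among the first k modes
    is an illustrative choice, not a rule taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    ref = pca_modes(X, k)
    sims = np.empty((n_boot, k))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample training shapes with replacement
        modes = pca_modes(X[idx], k)
        # Absolute value removes the sign ambiguity of PCA directions.
        sims[b] = np.abs(ref @ modes.T).max(axis=1)
    return sims.mean(axis=0)


def make_synthetic_shapes(n=150, dim=40, mode_std=(12.0, 6.0, 3.0), noise_std=0.3, seed=1):
    """Hypothetical test data: a few orthonormal structural modes plus isotropic noise."""
    rng = np.random.default_rng(seed)
    basis, _ = np.linalg.qr(rng.standard_normal((dim, len(mode_std))))
    coeffs = rng.standard_normal((n, len(mode_std))) * np.asarray(mode_std)
    return coeffs @ basis.T + noise_std * rng.standard_normal((n, dim))


X = make_synthetic_shapes()
stability = bootstrap_mode_stability(X, k=6)
print(np.round(stability, 3))
```

In this toy setting the three structural modes should score close to 1, while the remaining modes, which are fitted to noise, should score noticeably lower. Where to place the cut-off between stable and unstable modes is left open here.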