Variance of average surprisal: a better predictor for quality of grammar from unsupervised PCFG induction
Abstract
In unsupervised grammar induction, data likelihood is known to be only weakly correlated with parsing accuracy, especially at
convergence after multiple runs. In order
to find a better indicator of the quality of induced grammars, this paper correlates several linguistically- and psycholinguistically-motivated predictors with parsing accuracy on
a large multilingual grammar induction evaluation data set. Results show that variance
of average surprisal (VAS) better correlates
with parsing accuracy than data likelihood,
and that using VAS instead of data likelihood
for model selection provides a significant accuracy boost. Further evidence shows VAS
to be a better candidate than data likelihood
for predicting word order typology classification. Analyses show that VAS seems to separate content words from function words in natural language grammars, and to better arrange
words with different frequencies into separate
classes that are more consistent with linguistic
theory.