An automated framework for fast cognate detection and Bayesian
phylogenetic inference in computational historical linguistics
Abstract
We present a fully automated workflow for
phylogenetic reconstruction on large datasets,
consisting of two novel methods, one for fast
detection of cognates and one for fast Bayesian
phylogenetic inference. Our results show that
the methods take only a few minutes to
process language families that have so far
required large amounts of time and computational
power. Moreover, the cognates and
the trees inferred by the methods are quite
close to gold-standard cognate judgments
and to expert language family trees, respectively. Given
its speed and ease of application, our framework
is particularly useful for the exploration
of very large datasets in historical linguistics.