Abstract
Probabilistic finite automata (PFAs) are common statistical language model in natural language and speech processing. A typical task
for PFAs is to compute the probability of all
strings that match a query pattern. An important special case of this problem is computing
the probability of a string appearing as a pre-
fix, suffix, or infix. These problems find use in
many natural language processing tasks such
word prediction and text error correction.
Recently, we gave the first incremental algorithm to efficiently compute the infix probabilities of each prefix of a string (Cognetta et al.,
2018). We develop an asymptotic improvement of that algorithm and solve the open
problem of computing the infix probabilities
of PFAs from streaming data, which is crucial
when processing queries online and is the ultimate goal of the incremental approach