Abstract
In combinatorial statistics, we are interested in a
statistical test of combinatorial correlation, i.e., existence a subset from an underlying combinatorial
structure such that the observation is large on the
subset. The combinatorial scan statistics has been
proposed for such a statistical test; however, it is
not commonly used in practice because of its high
computational cost. In this study, we restrict our attention to the case that the number of data points is
moderately small (e.g., 50), the outcome is binary,
and the underlying combinatorial structure is represented by a zero-suppressed binary decision diagram (ZDD), and consider the problem of computing the p-value of the combinatorial scan statistics
exactly. First, we prove that this problem is a #Phard problem. Then, we propose a practical algorithm that solves the problem. Here, the algorithm
constructs a binary decision diagram (BDD) for a
set of realizations of the random variables by a dynamic programming on the ZDD, and computes the
p-value by a dynamic programming on the BDD.
We conducted experiments to evaluate the performance of the proposed algorithm using real-world
datasets