Abstract
We consider the problem of computing a restricted nonnegative matrix factorization (NMF) of an m × n matrix X. Specifically, we seek a factorization X ≈ BC, where the k columns of B are a subset of those from X and C is a nonnegative k × n matrix. Equivalently, given the matrix X, consider the problem of finding a small subset, S, of the columns of X such that the conic hull of S ε-approximates the conic hull of the columns of X, i.e., the distance of every column of X to the conic hull of the columns of S should be at most an ε-fraction of the angular diameter of X. If k is the size of the smallest ε-approximation, then we produce an sized -approximation, yielding the first provable, polynomial time ε-approximation for this class of NMF problems, where, desirably, the approximation is also independent of n and m. Furthermore, we prove an approximate conic Carathéodory theorem, a general sparsity result, which shows that any column of X can be ε-approximated with an sparse combination from S. Our results are facilitated by a reduction to the problem of approximating convex hulls, and we prove that both the convex and conic hull variants are d-SUM-hard, resolving an open problem. Finally, we provide experimental results for the convex and conic algorithms on a variety of feature selection tasks.
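To make the problem statement concrete, the following minimal sketch measures how far each column of X lies from the conic hull of a chosen column subset S, and recovers the nonnegative coefficient matrix C of the restricted factorization. It is an illustration only, not the algorithm of this paper: the function names (`conic_hull_distance`, `restricted_nmf`) are hypothetical, the nonnegative least-squares step uses plain projected gradient descent as a stand-in solver, and the normalization by the angular diameter is omitted because its precise definition is not given in the abstract.

```python
import numpy as np


def conic_hull_distance(B, x, iters=2000):
    """Approximate the distance from x to cone(B) = {B c : c >= 0}.

    Assumption: projected gradient descent is used here only as a simple
    stand-in for a nonnegative least-squares solver.
    """
    c = np.zeros(B.shape[1])
    # Lipschitz constant of the gradient of 0.5 * ||B c - x||^2.
    lipschitz = np.linalg.norm(B, 2) ** 2
    if lipschitz == 0.0:
        return float(np.linalg.norm(x)), c
    for _ in range(iters):
        grad = B.T @ (B @ c - x)
        # Gradient step followed by projection onto the nonnegative orthant.
        c = np.maximum(c - grad / lipschitz, 0.0)
    return float(np.linalg.norm(B @ c - x)), c


def restricted_nmf(X, subset):
    """Build B from the chosen columns of X and fit a nonnegative C.

    Returns (B, C, dists), where dists[j] is the distance of column j of X
    to the conic hull of the selected columns.
    """
    B = X[:, subset]
    fits = [conic_hull_distance(B, X[:, j]) for j in range(X.shape[1])]
    C = np.column_stack([coeffs for _, coeffs in fits])
    dists = np.array([dist for dist, _ in fits])
    return B, C, dists


# Hypothetical toy data: columns (1,0), (0,1), (1,1), (2,1).
X = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 1.0]])

# Columns 0 and 1 span the whole nonnegative quadrant as a cone.
B, C, dists = restricted_nmf(X, [0, 1])
print(np.round(dists, 6))

# Column 0 alone only covers the nonnegative horizontal axis.
_, _, dists_small = restricted_nmf(X, [0])
print(np.round(dists_small, 6))
```

A subset S would count as an ε-approximation in the sense above when every entry of `dists` is at most ε times the angular diameter of X. The sketch only checks a given subset; choosing a small S is the hard part that the paper addresses.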