资源论文Near-Optimal Entrywise Sampling for Data Matrices

Near-Optimal Entrywise Sampling for Data Matrices

2020-01-16 | |  58 |   32 |   0

Abstract

We consider the problem of selecting non-zero entries of a matrix A in order to produce a sparse sketch of it, B, that minimizes 图片.png For large 图片.png matrices, such that 图片.png (for example, representing n observations over m attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information regarding A. Second, they allow sketching of matrices whose non-zeros are presented to the algorithm in arbitrary order as a stream, with O 1 computation per non-zero. Third, the resulting sketch matrices are not only sparse, but their non-zero entries are highly compressible. Lastly, and most importantly, under mild assumptions, our distributions are provably competitive with the optimal offline distribution. Note that the probabilities in the optimal offline distribution may be complex functions of all the entries in the matrix. Therefore, regardless of computational complexity, the optimal distribution might be impossible to compute in the streaming model.

上一篇:Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models

下一篇:Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...