A Single-Pass Algorithm for Efficiently Recovering Sparse Cluster Centers of High-dimensional Data

2020-03-04

Abstract

Learning a statistical model for high-dimensional data is an important topic in machine learning. Although this problem has been well studied in the supervised setting, little is known about its unsupervised counterpart. In this work, we focus on the problem of clustering high-dimensional data with sparse centers. In particular, we address the following open question in unsupervised learning: “Is it possible to reliably cluster high-dimensional data when the number of samples is smaller than the data dimensionality?” We develop an efficient clustering algorithm that estimates sparse cluster centers with a single pass over the data. Our theoretical analysis shows that the proposed algorithm accurately recovers the cluster centers with only O(s log d) samples (data points), provided all the cluster centers are s-sparse vectors in a d-dimensional space. Experimental results verify both the effectiveness and efficiency of the proposed clustering algorithm compared to state-of-the-art algorithms on several benchmark datasets.

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).
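To make the setting in the abstract concrete, the following is a minimal sketch of one way to estimate s-sparse centers in a single pass. It is not the paper's algorithm, which the abstract does not spell out. It assumes an online k-means style running-mean update followed by hard thresholding to the s largest-magnitude coordinates. The function names `hard_threshold` and `single_pass_sparse_centers`, the initialization scheme, and the requirement s <= d are all illustrative assumptions.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero the rest (assumes 0 <= s <= len(v))."""
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out


def single_pass_sparse_centers(X, init_centers, s):
    """Illustrative single-pass estimate of s-sparse cluster centers.

    Each row of X is visited exactly once. The point is assigned to the
    nearest current sparse center, that cluster's dense running mean is
    updated, and the sparse center is refreshed by hard thresholding.
    This is an assumed stand-in, not the method from the paper.
    """
    means = np.array(init_centers, dtype=float)       # dense running means
    centers = np.array([hard_threshold(m, s) for m in means])
    counts = np.ones(len(means), dtype=int)           # initial guess counts as one observation
    for x in X:
        k = int(np.argmin(((centers - x) ** 2).sum(axis=1)))
        counts[k] += 1
        means[k] += (x - means[k]) / counts[k]        # incremental mean update
        centers[k] = hard_threshold(means[k], s)      # enforce s-sparsity
    return centers
```

In this sketch the memory footprint is independent of the number of samples, since only the running means and counts are stored. The O(s log d) sample bound quoted in the abstract is a property of the authors' algorithm and analysis, and nothing here should be read as reproducing it.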
