Fast and Accurate k-means For Large Datasets

2020-01-08

Abstract

Clustering is a popular problem with many applications. We consider the k-means problem in the situation where the data is too large to be stored in main memory and must be accessed sequentially, such as from a disk, and where we must use as little memory as possible. Our algorithm is based on recent theoretical results, with significant improvements to make it practical. Our approach greatly simplifies a recently developed algorithm, both in design and in analysis, and eliminates large constant factors in the approximation guarantee, the memory requirements, and the running time. We then incorporate approximate nearest neighbor search to compute k-means in $o(nk)$ time (where n is the number of data points; note that computing the cost, given a solution, takes $\Theta(nk)$ time). We show that our algorithm compares favorably to existing algorithms both theoretically and experimentally, thus providing state-of-the-art performance in both theory and practice.
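The remark about the cost of evaluating a solution can be made concrete with a minimal sketch. This is not the paper's algorithm. It only illustrates why scoring a given set of k centers against n points by brute force needs n times k distance evaluations. The function name `kmeans_cost` and the toy data are assumptions introduced here for illustration.

```python
def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center.

    Hypothetical helper for illustration: the outer loop runs n times and the
    inner min() scans all k centers, so the work is proportional to n * k.
    """
    total = 0.0
    for p in points:  # n iterations
        nearest = min(  # k squared-distance evaluations per point
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centers
        )
        total += nearest
    return total


# Toy data (assumed): two well-separated pairs of points, one center per pair.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers = [(0.5, 0.0), (10.5, 0.0)]
cost = kmeans_cost(points, centers)
print(cost)
```

Each point here lies 0.5 from its nearest center, so every point contributes 0.25 to the total. Replacing the exhaustive scan over centers with an approximate nearest neighbor structure is the kind of change that would let the per-point work drop below k, at the price of an approximate answer.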
