资源论文Clustering High Dimensional Dynamic Data Streams

Clustering High Dimensional Dynamic Data Streams

2020-03-10 | |  62 |   49 |   0

Abstract

We present data streaming algorithms for the kmedian problem in high-dimensional dynamic geometric data streams, i.e. streams allowing both insertions and deletions of points from a discrete Euclidean space 图片.png Our algorithms use 图片.png poly(d log图片.png) space/time and maintain with high probability a small weighted set of points (a coreset) such that for every set of k centers the cost of the coreset (1 + ε)approximates the cost of the streamed point set. We also provide algorithms that guarantee only positive weights in the coreset with additional logarithmic factors in the space and time complexities. We can use this positively-weighted coreset to compute a (1 + ε)-approximation for the k-median problem by any efficient offline kmedian algorithm. All previous algorithms for computing a (1 + ε)-approximation for the kmedian problem over dynamic data streams required space and time exponential in d. Our algorithms can be generalized to metric spaces of bounded doubling dimension.

上一篇:High Dimensional Bayesian Optimization with Elastic Gaussian Process

下一篇:Efficient softmax approximation for GPUs

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...