资源论文Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under Lp Distances

Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under Lp Distances

2020-03-19 | |  97 |   64 |   0

Abstract

We present first massively parallel (MPC) algorithms and hardness of approximation results for computing Single-Linkage Clustering of n input d-dimensional vectors under Hamming, 图片.png and 图片.png distances. All our algorithms run in O(log n) rounds of MPC for any fixed d and achieve (1 +ε )-approximation for all distances (except Hamming for which we show an exact algorithm). We also show constant-factor inapproximability results for o(log n)-round algorithms under standard MPC hardness assumptions (for sufficiently large dimension depending on the distance used). Efficiency of implementation of our algorithms in Apache Spark is demonstrated through experiments on the largest available vector datasets from the UCI machine learning repository exhibiting speedups of several orders of magnitude.

上一篇:Learning Implicit Generative Models with the Method of Learned Moments

下一篇:Practical Contextual Bandits with Regression Oracles

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...