资源论文Subquadratic High-Dimensional Hierarchical Clustering

Subquadratic High-Dimensional Hierarchical Clustering

2020-02-19 | |  48 |   38 |   0

Abstract

We consider the widely-used average-linkage, single-linkage, and Ward’s methods for computing hierarchical clusterings of high-dimensional Euclidean inputs. It is easy to show that there is no efficient implementation of these algorithms in high dimensional Euclidean space since it implicitly requires to solve the closest pair problem, a notoriously difficult problem. However, how fast can these algorithms be implemented if we allow approximation? More precisely: these algorithms successively merge the clusters that are at closest average (for average-linkage), minimum distance (for single-linkage), or inducing the least sum-of-square error (for Ward’s). We ask whether one could obtain a significant running-time improvement if the algorithm can merge 图片.png-approximate closest clusters (namely, clusters that are at distance (average, minimum, or sum-of-square error) at most 图片.png times the distance of the closest clusters). We show that one can indeed take advantage of the relaxation and compute the approximate hierarchical clustering tree using 图片.png-approximate nearest neighbor queries. This leads to an algorithm running in time 图片.png for d-dimensional Euclidean space. We then provide experiments showing that these algorithms perform as well as the non-approximate version for classic classification tasks while achieving a significant speed-up.

上一篇:Distributionally Robust Optimization and Generalization in Kernel Methods

下一篇:On the (In)fidelity and Sensitivity of Explanations

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...