Abstract
By mapping data from a lower-dimensional space into a higher- or even infinite-dimensional space, kernel k-means can organize data into groups even when the clusters are not linearly separable. However, kernel k-means incurs a heavy computational cost due to the representer theorem: it must keep an extremely large kernel matrix in memory when using popular kernels such as the Gaussian and spatial pyramid matching kernels, which largely limits its use on large-scale data. In addition, existing kernel clustering methods can be overfitted by outliers. In this paper, we introduce Euler clustering, which not only retains the benefit of nonlinear modeling with a kernel function but also significantly alleviates the large-scale computational problem in kernel-based clustering. This is realized by incorporating the Euler kernel. The Euler kernel relies on a nonlinear and robust cosine metric that is less sensitive to outliers. More importantly, it intrinsically induces an empirical map that maps data onto a complex space of the same dimension. Euler clustering exploits these advantages to measure the similarity between data in a robust way without increasing the dimensionality of the data, and thus solves the large-scale problem in kernel k-means. We evaluate Euler clustering and show its superiority over related methods on five publicly available datasets.
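To make the idea of a same-dimensional complex map concrete, the following is a minimal sketch, not the authors' algorithm. The abstract does not give the form of the map, so the sketch assumes the elementwise form z = exp(i·α·π·x)/√2 on features scaled to [0, 1], under which the squared distance between two mapped points equals the sum over features of 1 − cos(απ(x_a − x_b)), a cosine-based dissimilarity. The function names `euler_map` and `euler_kmeans`, the default `alpha=1.9`, and the use of plain Lloyd iterations in the complex space are all illustrative assumptions.

```python
import numpy as np

def euler_map(X, alpha=1.9):
    """Map real features elementwise onto the complex plane.

    Assumed form: z = exp(i * alpha * pi * x) / sqrt(2), with x in [0, 1].
    The output has the same shape as X, so dimensionality does not grow.
    """
    return np.exp(1j * alpha * np.pi * np.asarray(X, dtype=float)) / np.sqrt(2)

def euler_kmeans(X, k, alpha=1.9, n_iter=50, seed=0):
    """Plain Lloyd iterations on the mapped data (a simplification).

    Only the n x d mapped data and k x d centers are stored;
    no n x n kernel matrix is ever formed.
    """
    rng = np.random.default_rng(seed)
    Z = euler_map(X, alpha)
    # Initialize centers from k distinct mapped data points.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.full(len(Z), -1)
    for _ in range(n_iter):
        # Squared Euclidean distance in complex space, shape (n, k).
        dist = (np.abs(Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = Z[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels, centers
```

Calling `euler_kmeans(X, k=2)` on an `n × d` array returns one label per row and a `k × d` complex array of centers. Under these assumptions the memory cost scales with `n·d` rather than `n²`, which is the scaling advantage the abstract describes.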