Abstract
The extension of conventional clustering to hypergraph clustering, which involves higher-order similarities instead of pairwise similarities, is gaining increasing attention in computer vision. This is because many grouping problems require an affinity measure that involves a subset of data of size greater than two, i.e., a hyperedge. Almost all previous works, however, have considered only the smallest possible hyperedge size, owing to a lack of study into the potential benefits of large hyperedges and of effective algorithms to generate them. In this paper, we show that large hyperedges are better from both theoretical and empirical standpoints. We then propose a novel guided sampling strategy for large hyperedges, based on the concept of random cluster models. Our method can generate pure large hyperedges that significantly improve grouping accuracy without exponential increases in sampling costs. In the important applications of face clustering and motion segmentation, our method demonstrates substantially better accuracy and efficiency.