资源论文Coresets for Vector Summarization with Applications to Network Graphs

Coresets for Vector Summarization with Applications to Network Graphs

2020-03-09 | |  56 |   50 |   0

Abstract

We provide a deterministic data summarization algorithm that approximates the mean 图片.png = 图片.png of a set P of n vectors in 图片.png by a weighted mean 图片.png of a subset of O(1/ε) vectors, i.e., independent of both n and d. We prove that the squared Euclidean distance between 图片.png and 图片.png is at most ε multiplied by the variance of P . We use this algorithm to maintain an approximated sum of vectors from an unbounded stream, using memory that is independent of d, and logarithmic in the n vectors seen so far. Our main application is to extract and represent in a compact way friend groups and activity summaries of users from underlying data exchanges. For example, in the case of mobile networks, we can use GPS traces to identify meetings; in the case of social networks, we can use information exchange to identify friend groups. Our algorithm provably identifies the Heavy Hitter entries in a proximity (adjacency) matrix. The Heavy Hitters can be used to extract and represent in a compact way friend groups and activity summaries of users from underlying data exchanges. We evaluate the algorithm on several large data sets.

上一篇:Learning to Learn without Gradient Descent by Gradient Descent

下一篇:Equivariance Through Parameter-Sharing

用户评价
全部评价

热门资源

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...