
Distributed Exploration in Multi-Armed Bandits

2020-01-16

Abstract

We study exploration in Multi-Armed Bandits in a setting where k players collaborate in order to identify an ε-optimal arm. Our motivation comes from recent employment of bandit algorithms in computationally intensive, large-scale applications. Our results demonstrate a non-trivial tradeoff between the number of arm pulls required by each of the players, and the amount of communication between them. In particular, our main result shows that by allowing the k players to communicate only once, they are able to learn √k times faster than a single player. That is, distributing learning to k players gives rise to a factor √k parallel speedup. We complement this result with a lower bound showing this is in general the best possible. On the other extreme, we present an algorithm that achieves the ideal factor k speed-up in learning performance, with communication only logarithmic in 1/ε.
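To make the setting concrete, below is a minimal toy sketch in Python of k players trying to identify a good arm with a single round of communication. The budget split (each player spends roughly a 1/√k fraction of a single player's pull budget), the uniform per-arm sampling, and the pick-the-highest-reported-estimate aggregation are illustrative assumptions for this sketch only, not the algorithm analyzed in the paper; `pull`, `local_explore`, and `one_round_protocol` are hypothetical helper names.

```python
import math
import random

def pull(mu):
    """One Bernoulli reward from an arm with mean mu (toy environment)."""
    return 1.0 if random.random() < mu else 0.0

def local_explore(arm_means, budget):
    """A single player's local phase: spread its pull budget uniformly
    over all arms and report (empirically best arm, its estimate)."""
    n = len(arm_means)
    per_arm = max(1, budget // n)
    estimates = [sum(pull(mu) for _ in range(per_arm)) / per_arm
                 for mu in arm_means]
    best = max(range(n), key=estimates.__getitem__)
    return best, estimates[best]

def one_round_protocol(arm_means, k, single_player_budget):
    """k players explore independently, each with about a 1/sqrt(k)
    fraction of the single-player budget, then communicate exactly once
    by sending their (arm, estimate) pair; the arm with the highest
    reported estimate is returned."""
    per_player_budget = int(single_player_budget / math.sqrt(k))
    reports = [local_explore(arm_means, per_player_budget) for _ in range(k)]
    return max(reports, key=lambda r: r[1])[0]

if __name__ == "__main__":
    random.seed(0)
    arms = [0.5, 0.55, 0.6, 0.45, 0.7]  # arm index 4 has the highest mean
    print(one_round_protocol(arms, k=4, single_player_budget=20_000))
```

The single round of communication here carries only one (arm, estimate) pair per player, which is the regime the paper's √k-speedup result concerns; exchanging raw samples every round would instead recover the ideal factor-k speedup at a much higher communication cost.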


