Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs

2020-02-17

Abstract 

In linear stochastic bandits, it is commonly assumed that payoffs are perturbed by sub-Gaussian noise. In this paper, under a weaker assumption on the noise, we study the problem of linear stochastic bandits with heavy-tailed payoffs (LinBET), where the payoff distributions have finite moments of order 1 + ϵ for some ϵ ∈ (0, 1]. We rigorously analyze the regret lower bound of LinBET, Ω(dT^{1/(1+ϵ)}), implying that finite moments of order 2 (i.e., finite variances) yield a bound of Ω(d√T), with T being the total number of rounds to play bandits. The provided lower bound also indicates that the state-of-the-art algorithms for LinBET are far from optimal. By adopting median of means with a well-designed allocation of decisions, and truncation based on historical information, we develop two novel bandit algorithms whose regret upper bounds match the lower bound up to polylogarithmic factors. To the best of our knowledge, we are the first to solve LinBET optimally in the sense of the polynomial order on T. Our proposed algorithms are evaluated on synthetic datasets and outperform the state-of-the-art results.
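The two robust estimators named in the abstract, median of means and truncation, can be sketched in a few lines. The snippet below is a minimal illustrative Python sketch of mean estimation under heavy-tailed noise, not the paper's bandit algorithms (which integrate these estimators with linear-bandit machinery); the function names, the Pareto payoff model, the group count `k`, and the threshold `b` are all arbitrary choices for illustration.

```python
import numpy as np

def median_of_means(samples, k):
    """Split the samples into k groups, average each group,
    and return the median of the k group means."""
    groups = np.array_split(np.asarray(samples, dtype=float), k)
    return float(np.median([g.mean() for g in groups]))

def truncated_mean(samples, b):
    """Empirical mean with truncation at threshold b:
    observations with |x| > b are zeroed out before averaging."""
    x = np.asarray(samples, dtype=float)
    return float(np.where(np.abs(x) <= b, x, 0.0).mean())

# Illustrative heavy-tailed payoffs: Pareto with tail index 1.5 has a
# finite mean but infinite variance, i.e., finite moments of order
# 1 + ϵ only for ϵ < 0.5.
rng = np.random.default_rng(0)
payoffs = rng.pareto(1.5, size=3000) + 1.0
print(median_of_means(payoffs, k=30))
print(truncated_mean(payoffs, b=50.0))
```

Both estimators concentrate around the true mean far better than the plain empirical average when variances may be infinite, which is why they serve as the building blocks of the two proposed algorithms.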

Previous: Online Learning with an Unknown Fairness Metric

Next: COLA: Decentralized Linear Learning
