资源论文Polynomial Cost of Adaptation for X -Armed Bandits

Polynomial Cost of Adaptation for X -Armed Bandits

2020-02-23 | |  185 |   47 |   0

Abstract

In the context of stochastic continuum-armed bandits, we present an algorithm that adapts to the unknown smoothness of the objective function. We exhibit and compute a polynomial cost of adaptation to the Holder regularity for regret minimization. To do this, we first reconsider the recent lower bound of Locatelli and Carpentier [21], and define and characterize admissible rate functions. Our new algorithm matches any of these minimal rate functions. We provide a finite-time analysis and a thorough discussion about asymptotic optimality.

上一篇:Learning Hawkes Processes from a Handful of Events

下一篇:Reducing the variance in online optimization by transporting past gradients

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...