Bounded Regret for Finite-Armed Structured Bandits

2020-01-19

Abstract

We study a new type of K-armed bandit problem where the expected return of one arm may depend on the returns of other arms. We present a new algorithm for this general class of problems and show that under certain circumstances it is possible to achieve finite expected cumulative regret. We also give problem-dependent lower bounds on the cumulative regret, showing that, at least in special cases, the new algorithm is nearly optimal.

