资源论文An optimal algorithm for the Thresholding Bandit Problem

An optimal algorithm for the Thresholding Bandit Problem

2020-03-06 | |  65 |   37 |   0

Abstract

We study a specific combinatorial pure exploration stochastic bandit problem where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and for a fixed time horizon. We propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. To the best of our knowledge, this is the first non-trivial pure exploration setti with fixed budget for which optimal strategies are constructed.

上一篇:Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing

下一篇:Matrix Eigen-decomposition via Doubly Stochastic Riemannian Optimization

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Joint Pose and Ex...

    Facial expression recognition (FER) is a challe...