A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

2020-02-10

Abstract 

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest. 
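The core primitive the abstract describes is an anytime upper confidence bound for the mean of a bounded (e.g. Bernoulli) reward, defined implicitly through a KL divergence constraint as in KL-UCB/KL-LUCB. Below is a minimal sketch of that idea: given an empirical mean and a confidence budget, find the largest mean consistent with it by bisection. The function name `kl_ucb` and the scalar `beta` (which the paper's LIL-type bound would supply as a function of the sample count and confidence level) are illustrative assumptions, not the paper's exact construction.

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb(p_hat, n, beta, tol=1e-6):
    """Largest q >= p_hat with n * KL(p_hat, q) <= beta, found by bisection.

    p_hat: empirical mean of n bounded-in-[0,1] rewards
    beta:  confidence budget (hypothetical placeholder for the paper's
           anytime LIL-type threshold)
    """
    lo, hi = p_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if n * kl_bernoulli(p_hat, mid) <= beta:
            lo = mid  # mid is still statistically plausible; move up
        else:
            hi = mid  # mid is ruled out; move down
    return lo
```

In an LUCB-style best-arm identification loop, one would compute this upper bound for every arm (and the symmetric lower bound for the current leader) and stop once the leader's lower bound exceeds all challengers' upper bounds.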

