资源论文Bandits Dueling on Partially Ordered Sets

Bandits Dueling on Partially Ordered Sets

2020-02-10 | |  112 |   57 |   0

Abstract 

We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms —the Pareto front— of any poset even when pairs of comparable arms cannot be a priori distinguished from pairs of incomparable arms, with a set of minimal assumptions. This means that UnchainedBandits does not require information about comparability and can be used with limited knowledge of the poset. To achieve this, the algorithm relies on the concept of decoys, which stems from social psychology. We also provide theoretical guarantees on both the regret incurred and the number of comparison required by UnchainedBandits, and we report compelling empirical results.

上一篇:A Decomposition of Forecast Error in Prediction Markets

下一篇:Scalable Demand-Aware Recommendation

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Rating-Boosted La...

    The performance of a recommendation system reli...