Constant Regret, Generalized Mixability, and Mirror Descent

2020-02-14

Abstract 

We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and for the right choice of loss function and “mixing” algorithm, it is possible for the learner to achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret can be achieved for mixable losses using the aggregating algorithm. The Generalized Aggregating Algorithm (GAA) is a name for a family of algorithms parameterized by convex functions on simplices (entropies), which reduce to the aggregating algorithm when using the Shannon entropy S. For a given entropy Φ, losses for which a constant regret is possible using the GAA are called Φ-mixable. Which losses are Φ-mixable was previously left as an open question. We fully characterize Φ-mixability and answer other open questions posed by [6]. We show that the Shannon entropy S is fundamental in nature when it comes to mixability; any Φ-mixable loss is necessarily S-mixable, and the lowest worst-case regret of the GAA is achieved using the Shannon entropy. Finally, by leveraging the connection between the mirror descent algorithm and the update step of the GAA, we suggest a new adaptive generalized aggregating algorithm and analyze its performance in terms of the regret bound.
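To illustrate how a constant regret arises in this setting, below is a minimal sketch (not code from the paper) of Vovk's aggregating algorithm for the binary log loss, which is 1-mixable; the function names and the toy experts are illustrative assumptions, and NumPy is assumed.

```python
import numpy as np

# Minimal sketch of Vovk's aggregating algorithm for the binary log loss
# (1-mixable). Names and toy data are illustrative, not from the paper.

def log_loss(p, y):
    """Log loss of a predicted probability p for a binary outcome y in {0, 1}."""
    return -np.log(p if y == 1 else 1.0 - p)

def aggregating_algorithm(expert_preds, outcomes, eta=1.0):
    """Run the AA over T rounds; expert_preds has shape (T, n_experts)."""
    T, n = expert_preds.shape
    log_w = np.zeros(n)                      # uniform prior over experts (log space)
    total_loss = 0.0
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # For the log loss with eta = 1, the weighted mean of the expert
        # probabilities is a valid substitution function.
        pred = float(w @ expert_preds[t])
        total_loss += log_loss(pred, outcomes[t])
        # Exponential-weights update: w_i <- w_i * exp(-eta * loss_i)
        log_w -= eta * np.array([log_loss(p, outcomes[t]) for p in expert_preds[t]])
    return total_loss, log_w

# Toy usage: two experts, one well calibrated, one uninformative.
rng = np.random.default_rng(0)
outcomes = rng.integers(0, 2, size=50)
expert_preds = np.column_stack([
    np.where(outcomes == 1, 0.8, 0.2),       # a good expert
    np.full(50, 0.5),                        # an uninformative expert
])
loss, final_log_w = aggregating_algorithm(expert_preds, outcomes)
print(loss, np.exp(final_log_w - final_log_w.max()))
```

With η = 1 and the log loss, the weighted mean matches the aggregating pseudo-prediction exactly, so the learner's cumulative loss exceeds that of the best expert by at most ln n, a bound independent of the number of rounds T.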

