Multiclass Classification with Bandit Feedback using Adaptive Regularization

2020-02-27

Abstract

We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit of right-or-wrong, rather than the true label. Our algorithm is based on the second-order Perceptron, and uses upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partially adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model, which is also chosen adversarially. We show a regret of $O(\sqrt{T}\log T)$, which improves over the current best bounds of $O(T^{2/3})$ in the fully adversarial setting. We evaluate our algorithm on nine real-world text classification problems, obtaining state-of-the-art results, even compared with non-bandit online algorithms, especially when label noise is introduced.
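
To make the bandit protocol concrete, the following is a minimal sketch of a second-order learner that picks the class with the highest upper-confidence score and then updates from a single right-or-wrong bit. It is not the algorithm analyzed in the paper: the class name `BanditUCBClassifier`, the width parameter `alpha`, the ridge-style update, and the toy stream in `run_demo` are all assumptions made for illustration.

```python
import math


class BanditUCBClassifier:
    """Simplified second-order learner with upper-confidence exploration.

    Illustrative only: the update rule and the confidence width are
    assumptions, not the exact algorithm from the paper.
    """

    def __init__(self, n_classes, dim, alpha=1.0):
        self.k, self.d, self.alpha = n_classes, dim, alpha
        # Per-class inverse correlation matrix (identity at start) and vector b.
        self.a_inv = [[[float(i == j) for j in range(dim)] for i in range(dim)]
                      for _ in range(n_classes)]
        self.b = [[0.0] * dim for _ in range(n_classes)]

    @staticmethod
    def _matvec(m, x):
        return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

    def predict(self, x):
        best, best_score = 0, -math.inf
        for c in range(self.k):
            ax = self._matvec(self.a_inv[c], x)         # A_c^{-1} x
            w = self._matvec(self.a_inv[c], self.b[c])  # w_c = A_c^{-1} b_c
            margin = sum(wi * xi for wi, xi in zip(w, x))
            quad = sum(ai * xi for ai, xi in zip(ax, x))
            score = margin + self.alpha * math.sqrt(max(quad, 0.0))
            if score > best_score:  # ties go to the lowest class index
                best, best_score = c, score
        return best

    def update(self, x, predicted, correct):
        # Update only the predicted class, the one the feedback bit refers to.
        a_inv = self.a_inv[predicted]
        ax = self._matvec(a_inv, x)
        denom = 1.0 + sum(ai * xi for ai, xi in zip(ax, x))
        # Sherman-Morrison rank-one update for (A + x x^T)^{-1}.
        for i in range(self.d):
            for j in range(self.d):
                a_inv[i][j] -= ax[i] * ax[j] / denom
        target = 1.0 if correct else -1.0
        for i in range(self.d):
            self.b[predicted][i] += target * x[i]


def run_demo(rounds=30, n_classes=3):
    """Toy noise-free stream: instance e_y always has label y."""
    learner = BanditUCBClassifier(n_classes, n_classes)
    mistakes = 0
    for t in range(rounds):
        y = t % n_classes
        x = [float(i == y) for i in range(n_classes)]
        y_hat = learner.predict(x)
        correct = (y_hat == y)   # the only feedback the learner sees
        mistakes += not correct
        learner.update(x, y_hat, correct)
    return mistakes


if __name__ == "__main__":
    print("mistakes:", run_demo())
```

On this toy stream the learner never sees the true label. A wrong guess lowers that class's score for the instance, so each wrong class is tried at most once per instance before the correct one wins. Real data with label noise would not behave this cleanly.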
