Abstract
We present a new multiclass algorithm in the bandit framework, where after making a prediction, the learning algorithm receives only partial feedback, i.e., a single bit of right-or-wrong, rather than the true label. Our algorithm is based on the second-order Perceptron, and uses upper-confidence bounds to trade off exploration and exploitation. We analyze this algorithm in a partial adversarial setting, where instances are chosen adversarially, while the labels are chosen according to a linear probabilistic model, which is also chosen adversarially. We show a regret of $O(\sqrt{T\log T})$, which improves over the current best bounds of $O(T^{2/3})$ in the fully adversarial setting. We evaluate our algorithm on nine real-world text classification problems, obtaining state-of-the-art results, even compared with non-bandit online algorithms, especially when label noise is introduced.
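To make the protocol concrete, the following is a minimal Python sketch of the setting the abstract describes: a learner keeps per-class second-order (ridge-regression-style) statistics, scores each class with a linear estimate plus an upper-confidence bonus, predicts the highest-scoring class, and then observes only a right-or-wrong bit. The class name, the exploration parameter `alpha`, and the use of a $\pm 1$ regression target on the feedback bit are illustrative assumptions for this sketch, not the paper's exact update rule.

```python
import numpy as np

class SecondOrderUCBBandit:
    """Sketch of a bandit multiclass learner: second-order weights + UCB exploration."""

    def __init__(self, n_classes, n_features, alpha=1.0):
        self.alpha = alpha
        # Per-class inverse correlation matrix (initialized to the identity) and bias vector.
        self.A_inv = np.stack([np.eye(n_features) for _ in range(n_classes)])
        self.b = np.zeros((n_classes, n_features))

    def predict(self, x):
        # Score each class by its linear estimate plus an upper-confidence bonus.
        scores = np.empty(len(self.b))
        for k in range(len(self.b)):
            w_k = self.A_inv[k] @ self.b[k]
            bonus = self.alpha * np.sqrt(x @ self.A_inv[k] @ x)
            scores[k] = w_k @ x + bonus
        return int(np.argmax(scores))

    def update(self, x, predicted, correct):
        # Bandit feedback: only the predicted class's statistics are updated, using
        # +1 as the regression target if the bit says "right" and -1 if "wrong"
        # (an assumption of this sketch).
        y = 1.0 if correct else -1.0
        A_inv = self.A_inv[predicted]
        # Sherman-Morrison rank-one update of the inverse correlation matrix.
        Ax = A_inv @ x
        self.A_inv[predicted] = A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)
        self.b[predicted] += y * x


# Toy usage: labels follow a linear model, as in the abstract's partial adversarial setting.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(5, 20))
learner = SecondOrderUCBBandit(n_classes=5, n_features=20)
mistakes = 0
for t in range(2000):
    x = rng.normal(size=20)
    x /= np.linalg.norm(x)
    y_true = int(np.argmax(W_true @ x))
    y_hat = learner.predict(x)
    learner.update(x, y_hat, y_hat == y_true)  # only a right-or-wrong bit is revealed
    mistakes += (y_hat != y_true)
print("mistake rate:", mistakes / 2000)
```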