No-Regret Algorithms for Heavy-Tailed Linear Bandits

资源分类

2020-03-06 |

92 |

36 |

Abstract

We analyze the problem of linear bandits under heavy tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. This assumption however is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median of means estimator. We show that, when the noise admits only a 1 + ε moment, these algorithms are still able to achieve regret in 图片.png and respectively. In particular, they guar antee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms achieve a better performance than the current state of the art for bounded noise when the L1 bound on the noise is large yet the 1+ε moment of the noise is small.

上一篇：A Superlinearly-Convergent Proximal Newton-Type Method for the Optimization of Finite Sums

下一篇：Stochastic Optimization for Multiview Representation Learning using Partial Least Squares

用户评价

全部评价

还没有评论，说两句吧！

热门资源

Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
The Variational S...

Unlike traditional images which do not offer in...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...
Rating-Boosted La...

The performance of a recommendation system reli...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com