资源论文Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback?

Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback?

2020-02-19 | |  118 |   86 |   0

Abstract

We propose computationally efficient algorithms for online linear optimization with bandit feedback, in which a player chooses an action vector from a given (possibly infinite) set A ⊆ 图片.png, and then suffers a loss that can be expressed as a linear function in action vectors. Although existing algorithms achieve an optimal regret bound of 图片.png(√T) for T rounds (ignoring factors of poly(d,logT)), computationally efficient ways of implementing them have not yet been specified, in particular when|A|is not bounded by a polynomial size in d. A standard way to pursue computational efficiency is to assume that we have an efficient algorithm referred to as oracle that solves (offline) linear optimization problems over A. Under this assumption, the computational efficiency of a bandit algorithm can then be measured in terms of oracle complexity, i.e., the number of oracle calls. Our contribution is to propose algorithms that offer optimal regret bounds of 图片.png(√T) as well as low oracle complexity for both non-stochastic settings and stochastic settings. Our algorithm for non-stochastic settings has an oracle complexity of  图片.png(T) and is the first algorithm that achieves both a regret bound of 图片.png(√T) and an oracle complexity of 图片.png(poly(T)), given only linear optimization oracles. Our algorithm for stochastic settings calls the oracle only O(poly(d,logT)) times, whichissmallerthanthecurrentbestoraclecomplexityof O(T) if T issufficiently large.

上一篇:Deep Signature Transforms

下一篇:Cross Attention Network for Few-shot Classification

用户评价
全部评价

热门资源

  • Deep Cross-media ...

    Cross-media retrieval is a research hotspot in ...

  • Regularizing RNNs...

    Recently, caption generation with an encoder-de...

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Joint Pose and Ex...

    Facial expression recognition (FER) is a challe...

  • Visual Reinforcem...

    For an autonomous agent to fulfill a wide range...