The Lingering of Gradients: How to Reuse Gradients over Time

2020-02-17

Abstract

Classically, the time complexity of a first-order method is estimated by its number of gradient computations. In this paper, we study a more refined complexity by taking into account the "lingering" of gradients: once a gradient is computed at x_k, the additional time to compute gradients at x_{k+1}, x_{k+2}, ... may be reduced. We show how this improves the running time of gradient descent and SVRG. For instance, if the "additional time" scales linearly with respect to the traveled distance, then the "convergence rate" of gradient descent can be improved from 1/T to exp(-T^{1/3}). On the empirical side, we solve a hypothetical revenue management problem on the Yahoo! Front Page Today Module application with 4.6m users to 10^{-6} error (or 10^{-12} dual error) using 6 passes of the dataset.
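The reuse model described above can be sketched with a toy finite-sum example: each cached component gradient is kept as long as the iterate stays within an assumed per-component "lingering radius," and is recomputed only once the iterate drifts outside it. The quadratic objective, the radii `delta`, and the step size below are all invented for illustration; this is a minimal sketch of the lingering idea, not the paper's actual algorithm.

```python
import numpy as np

# Toy finite-sum objective f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.
# Each cached component gradient grad f_i(x_old) is reused at the current
# point x while ||x - x_old|| <= delta_i (a hypothetical lingering radius);
# only stale components are recomputed. All problem data is made up.
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)
delta = 0.05 + 0.1 * rng.random(n)   # assumed per-component radii

def grad_i(i, x):
    return (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
cache_pt = np.full((n, d), np.inf)   # point where each gradient was cached
cache_g = np.zeros((n, d))
recomputed = 0

for step in range(200):
    g = np.zeros(d)
    for i in range(n):
        # Recompute only if x drifted outside component i's radius.
        if np.linalg.norm(x - cache_pt[i]) > delta[i]:
            cache_g[i] = grad_i(i, x)
            cache_pt[i] = x.copy()
            recomputed += 1
        g += cache_g[i]
    x -= 0.01 * (g / n)

full_cost = 200 * n                  # cost if every gradient were fresh
print(f"gradient recomputations: {recomputed} / {full_cost}")
```

Under this accounting, the "time" spent is the number of recomputations rather than the nominal 200 × n gradient evaluations, which is the refined complexity measure the abstract refers to.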

