资源论文Fast Probabilistic Optimization from Noisy Gradients

Fast Probabilistic Optimization from Noisy Gradients

2020-03-02 | |  53 |   44 |   0

Abstract

Stochastic gradient descent remains popular in large-scale machine learning, on account of its very low computational cost and robustness to noise. However, gradient descent is only linearly efficient and not transformation invariant. Scaling by a local measure can substantially improve its performance. One natural choice of such a scale is the Hessian of the objective function: Were it available, it would turn linearly efficient gradient descent into the quadratically efficient Newton-Raphson optimization. Existing covariant methods, though, are either super-linearly expensive or do not address noise. Generalising recent results, this paper constructs a nonparametric Bayesian quasi-Newton algorithm that learns gradient and Hessian from noisy evaluations of the gradient. Importantly, the resulting algorithm, like stochastic gradient descent, has cost linear in the number of input dimensions.

上一篇:Estimation of Causal Peer Influence Effects

下一篇:Learning Policies for Contextual Submodular Prediction

用户评价
全部评价

热门资源

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Rating-Boosted La...

    The performance of a recommendation system reli...

  • Hierarchical Task...

    We extend hierarchical task network planning wi...