
The Implicit Bias of AdaGrad on Separable Data

2020-02-21

Abstract

We study the implicit bias of AdaGrad on separable linear classification problems. We show that AdaGrad converges to a direction that can be characterized as the solution of a quadratic optimization problem with the same feasible set as the hard SVM problem. We also discuss how different choices of AdaGrad's hyperparameters affect this direction. This provides a deeper understanding of why adaptive methods do not seem to generalize as well as gradient descent in practice.
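The contrast the abstract draws can be observed numerically: on separable data with logistic loss, gradient descent is known to converge in direction to the hard-margin SVM solution (Soudry et al., 2018), while this paper shows AdaGrad selects a generally different direction, characterized by a quadratic program over the same feasible set. The sketch below is not from the paper; the dataset, step size, and iteration budget are arbitrary illustrative choices. It runs diagonal AdaGrad and plain gradient descent on the same toy problem and prints the normalized final directions.

```python
import numpy as np

# Toy separable data with labels folded in (z_i = y_i * x_i), so the
# logistic loss is L(w) = mean_i log(1 + exp(-z_i @ w)).
Z = np.array([[2.0, 0.5],
              [1.0, 3.0],
              [0.5, 1.5]])  # every row has positive margin for suitable w

def grad(w):
    margins = Z @ w
    # sigmoid(-m) computed stably via logaddexp to avoid overflow
    s = np.exp(-np.logaddexp(0.0, margins))
    return -(Z * s[:, None]).mean(axis=0)

def run(adaptive, steps=100_000, lr=0.1, eps=1e-8):
    w = np.zeros(2)
    G = np.zeros(2)  # AdaGrad accumulator of squared gradients
    for _ in range(steps):
        g = grad(w)
        if adaptive:
            G += g * g
            w -= lr * g / (np.sqrt(G) + eps)  # diagonal AdaGrad update
        else:
            w -= lr * g                       # plain gradient descent
    return w / np.linalg.norm(w)

print("GD direction:     ", run(adaptive=False))
print("AdaGrad direction:", run(adaptive=True))
```

Since both methods drive the loss to zero, only the direction of the diverging iterate is meaningful, which is why the sketch normalizes the final weights; the gap between the two printed directions is the implicit-bias difference the paper analyzes.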

