Data-Dependent Stability of Stochastic Gradient Descent


Abstract

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD, which depend on worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has a crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of the stochastic gradient.
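The pre-screening strategy mentioned in the abstract can be illustrated with a minimal sketch: evaluate several candidate starting points and launch SGD from the one with the lowest empirical risk. The squared loss, the function names, and the random candidate generation below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def empirical_risk(w, X, y):
    # Illustrative squared-loss empirical risk; the paper's bounds
    # cover general smooth losses.
    return 0.5 * np.mean((X @ w - y) ** 2)

def prescreen_init(candidates, X, y):
    """Data-driven pre-screening: pick the candidate initialization
    with the lowest empirical risk before running SGD."""
    risks = [empirical_risk(w0, X, y) for w0 in candidates]
    return candidates[int(np.argmin(risks))]

def sgd(w0, X, y, lr=0.01, epochs=5, seed=0):
    # Plain SGD on the squared loss, started from the screened point.
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))
    w_true = rng.normal(size=10)
    y = X @ w_true + 0.1 * rng.normal(size=200)

    # Hypothetical candidate starting points at different scales.
    candidates = [rng.normal(scale=s, size=10) for s in (0.1, 1.0, 3.0)]
    w0 = prescreen_init(candidates, X, y)
    w_hat = sgd(w0, X, y)
    print("screened init risk:", empirical_risk(w0, X, y))
    print("final risk:", empirical_risk(w_hat, X, y))
```

In the convex regime the screening criterion above matches the quantity the bound depends on (risk at initialization); in the non-convex regime one would instead screen by an estimate of local curvature around each candidate.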

