资源论文Differential Equations for Modeling Asynchronous Algorithms

Differential Equations for Modeling Asynchronous Algorithms

2019-11-05 | |  64 |   39 |   0
Abstract Asynchronous stochastic gradient descent (ASGD) is a popular parallel optimization algorithm in machine learning. Most theoretical analysis on ASGD take a discrete view and prove upper bounds for their convergence rates. However, the discrete view has its intrinsic limitations: there is no characterization of the optimization path and the proof techniques are induction-based and thus usually complicated. Inspired by the recent successful adoptions of stochastic differential equations (SDE) to the theoretical analysis of SGD, in this paper, we study the continuous approximation of ASGD by using stochastic differential delay equations (SDDE). We introduce the approximation method and study the approximation error. Then we conduct theoretical analysis on the convergence rates of ASGD algorithm based on the continuous approximation. There are two methods: moment estimation and energy function minimization can be used to analyze the convergence rates. Moment estimation depends on the specific form of the loss function, while energy function minimization only leverages the convex property of the loss function, and does not depend on its specific form. In addition to the convergence analysis, the continuous view also helps us derive better convergence rates. All of this clearly shows the advantage of taking the continuous view in gradient descent algorithms.

上一篇:MIXGAN: Learning Concepts from Different Domains for Mixture Generation

下一篇:Preventing Disparate Treatment in Sequential Decision Making Hoda Heidari1 and Andreas Krause1

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...