Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines

资源分类

2020-03-16 |

67 |

45 |

Abstract

Asynchronous parallel stochastic gradient optimization has been playing a pivotal role to solve large-scale machine learning problems in big data applications. Zeroth-order (derivative-free) methods estimate the gradient only by two function evaluations, thus have been applied to solve the problems where the explicit gradient calculations are computationally expensive or infeasible. Recently, the first asynchronous parallel stochastic zeroth-order algorithm (AsySZO) was proposed. However, its convergence rate is O( 图片.png ) for the smooth, possibly non-convex learning problems, which is significantly slower than O( ) the bes convergence rate of (asynchronous) stochastic gradient algorithm. To fill this gap, in this paper, first point out the fundamental reason leading to the slow convergence rate of AsySZO, and then propose a new asynchronous stochastic zerothorder algorithm (AsySZO+). We provide a faster 1 convergence rate O( bT ) (b is the mini-batch si for AsySZO+ by the rigorous theoretical analysis, which is a significant improvement over O( 图片.png ). The experimental results on the application of ensemble learning confirm that our AsySZO+ has a faster convergence rate than the existing (asynchronous) stochastic zeroth-order algorithms.

上一篇：The Uncertainty Bellman Equation and Exploration

下一篇：Training Neural Machines with Trace-Based Supervision

用户评价

全部评价

还没有评论，说两句吧！

热门资源

The Variational S...

Unlike traditional images which do not offer in...
Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
Learning to learn...

The move from hand-designed features to learned...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com