How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD?

登录免费注册

论文
算法
数据集
经验分享
技术动态
行业动态

论文
学习
研究领域

算法
学习
研究领域

数据集
自动驾驶
图片

经验分享
学习
研究领域

技术动态
计算机视觉
自然语言处理

行业动态
教育
语音识别

》资源》论文》How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD?

How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD?

2020-02-14 |

|

81 |

48 |

0

0

How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD?
论文

Abstract

Stochastic gradient descent (SGD) gives an optimal convergence rate when minimizing convex stochastic objectives f (x). However, in terms of making the gradients small, the original SGD does not give an optimal rate, even when f (x) is convex. If f (x) is convex, to find a point with gradient norm , we design an algorithm SGD3 with a near-optimal rate , improving the best known rate . If f (x) is nonconvex, to find its -approximate local minimum, we design an algorithm SGD5 with rate , where previously SGD variants only achieve [6, 14, 30]. This is no slower than the best known stochastic version of Newton’s method in all parameter regimes [27].

上一篇：Watch Your Step: Learning Node Embeddings via Graph Attention

下一篇：Bayesian Distributed Stochastic Gradient Descent

用户评价

登录
注册

全部评价

还没有评论，说两句吧！

热门资源

The Variational S...

Unlike traditional images which do not offer in...
Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
Learning to learn...

The move from hand-designed features to learned...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com

关于我们
智享云简介联系我们隐私声明
服务与支持
使用帮助联系我们
快速链接
启迪智享官网
咨询电话：010-82353090

工作日早9:00-晚6:00

© 2009-2019 tusaishared.com.cn 版权所有京ICP备19018324号