SAdam: A Variant of Adam for Strongly Convex Functions

2020-01-02

Abstract

The Adam algorithm has become extremely popular for large-scale machine learning. Under the convexity condition, it has been proved to enjoy a data-dependent O(√T) regret bound, where T is the time horizon. However, whether strong convexity can be utilized to further improve the performance remains an open problem. In this paper, we give an affirmative answer by developing a variant of Adam (referred to as SAdam) which achieves a data-dependent O(log T) regret bound for strongly convex functions. The essential idea is to maintain a faster decaying yet still controlled step size for exploiting strong convexity. In addition, under a special configuration of hyperparameters, our SAdam reduces to SC-RMSprop, a recently proposed variant of RMSprop for strongly convex functions, for which we provide the first data-dependent logarithmic regret bound. Empirical results on optimizing strongly convex functions and training deep networks demonstrate the effectiveness of our method.
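
To make the abstract's key idea concrete, the sketch below renders an SAdam-style update in NumPy, based only on the description above: the per-round step size decays as α/t rather than Adam's α/√t, the second-moment term enters the denominator linearly (no square root), and a small offset δ/t keeps that denominator away from zero so the step size stays under control. This is a minimal illustration under those assumptions, not the paper's exact algorithm; the names (sadam_sketch, grad_fn, delta) and default values are hypothetical, and details such as the decay schedules and any projection step may differ in the paper.

```python
import numpy as np

def sadam_sketch(grad_fn, x0, T, alpha=0.1, beta1=0.9, beta2=0.99, delta=1e-2):
    """Hypothetical SAdam-style update, sketched from the abstract alone."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)  # first-moment (momentum) estimate
    v = np.zeros_like(x)  # second-moment estimate
    for t in range(1, T + 1):
        g = grad_fn(x)
        beta1_t = beta1 / t                # decaying momentum weight (assumption)
        m = beta1_t * m + (1.0 - beta1_t) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        v_hat = v + delta / t              # offset keeps the denominator positive
        x = x - (alpha / t) * m / v_hat    # alpha/t decay: faster than Adam's alpha/sqrt(t)
    return x

# Usage: minimize the strongly convex quadratic f(x) = 0.5 * ||x||^2.
x_star = sadam_sketch(grad_fn=lambda x: x, x0=np.ones(3), T=500)
print(x_star)  # should end up close to the minimizer at the origin
```

The α/t schedule is what one would expect for strongly convex online optimization (it is the same decay that yields logarithmic regret for online gradient descent), while the δ/t offset illustrates the "controlled" part: it prevents the effective step from blowing up when the second-moment estimate is small.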
