BOUNDS ON OVER-PARAMETERIZATION FOR GUARANTEED EXISTENCE OF DESCENT PATHS IN SHALLOW RELU NETWORKS



Abstract

We study the landscape of the squared loss of neural networks with one hidden layer and ReLU activation functions. Let m and d be the widths of the hidden and input layers, respectively. We show that there exist poor local minima with positive curvature for some training sets of size n ≥ m + 2d − 2. By positive curvature of a local minimum, we mean that within a small neighborhood the loss function is strictly increasing in all directions. Consequently, for such training sets, there are initializations of weights from which there is no descent path to global optima. It is known that for n ≤ m, there always exist descent paths to global optima from all initial weights. In this perspective, our results provide a somewhat sharp characterization of the over-parameterization required for “existence of descent paths” in the loss landscape.
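To make the setting concrete, the following is a minimal sketch (not taken from the paper) of the loss under study: the squared loss of a one-hidden-layer ReLU network with input width d, hidden width m, and n training pairs. All names below (shallow_relu_loss, W, v, X, y) are illustrative assumptions, not the authors' notation.

    import numpy as np

    def shallow_relu_loss(W, v, X, y):
        """Squared loss of a one-hidden-layer ReLU network.

        W : (m, d) first-layer weights (m hidden units, d inputs)
        v : (m,)   second-layer weights
        X : (n, d) training inputs
        y : (n,)   training targets
        """
        hidden = np.maximum(X @ W.T, 0.0)  # ReLU activations, shape (n, m)
        preds = hidden @ v                 # network outputs, shape (n,)
        return np.sum((preds - y) ** 2)

    # Tiny example: n = 4 samples, d = 2 inputs, m = 3 hidden units.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((4, 2))
    y = rng.standard_normal(4)
    W = rng.standard_normal((3, 2))
    v = rng.standard_normal(3)
    print(shallow_relu_loss(W, v, X, y))

In this notation, the abstract's two regimes compare the sample count n against the hidden width m: for n ≤ m descent paths to global optima are guaranteed from every initialization, while for n ≥ m + 2d − 2 there exist training sets whose landscape contains poor local minima with positive curvature.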

