
On the Quality of the Initial Basin in Overspecified Neural Networks


Abstract

Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years across a variety of difficult machine learning applications. However, a theoretical explanation for this success remains a major open problem, since training neural networks involves optimizing a highly non-convex objective function and is known to be computationally hard in the worst case. In this work, we study the geometric structure of the associated non-convex objective function, in the context of ReLU networks and starting from a random initialization of the network parameters. We identify conditions under which the objective becomes more favorable to optimization, in the sense of (i) a high probability of initializing at a point from which there is a monotonically decreasing path to a global minimum; and (ii) a high probability of initializing at a basin (suitably defined) with a small minimal objective value. A common theme in our results is that such properties are more likely to hold for larger ("overspecified") networks, which accords with some recent empirical and theoretical observations.
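To make the setting concrete, below is a minimal sketch (not taken from the paper) of the kind of objective being studied: a one-hidden-layer ReLU network with randomly initialized parameters, where "overspecification" simply means using more hidden units k than the target requires. All names, dimensions, and the initialization scale are illustrative assumptions, shown only to fix notation for the non-convex squared loss as a function of the parameters.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def init_params(d, k, rng):
    # Random (Gaussian) initialization of the network parameters.
    W = rng.standard_normal((k, d)) / np.sqrt(d)  # hidden-layer weights
    v = rng.standard_normal(k) / np.sqrt(k)       # output-layer weights
    return W, v

def predict(W, v, X):
    # One-hidden-layer ReLU network N(x) = v^T relu(W x), applied row-wise to X.
    return relu(X @ W.T) @ v

def squared_loss(W, v, X, y):
    # The non-convex training objective, viewed as a function of (W, v).
    return 0.5 * np.mean((predict(W, v, X) - y) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 10, 200
    X = rng.standard_normal((n, d))
    # Illustrative target: a small ReLU network with 3 hidden units.
    W_true, v_true = init_params(d, 3, rng)
    y = predict(W_true, v_true, X)

    # Compare the objective at random initialization for a narrow network
    # versus an overspecified one (k much larger than needed).
    for k in (3, 100):
        W0, v0 = init_params(d, k, rng)
        print(f"k={k:4d}  initial loss = {squared_loss(W0, v0, X, y):.4f}")
```

The paper's results concern the landscape of this kind of objective around such random initializations, e.g. whether a monotonically decreasing path to a global minimum exists, rather than any particular training algorithm.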

