Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Abstract

Deep learning models are often successfully trained using gradient descent, despite the worst-case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap, and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
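The architecture described in the abstract is simple enough to spell out concretely. Below is a minimal NumPy sketch, not code from the paper: it assumes the common reading of the setting, in which the input is split into k non-overlapping patches, a single shared filter w is applied to each patch followed by a ReLU, and the hidden units are averaged by a fixed output layer. The function names, the teacher filter `w_star`, and the Monte-Carlo loss estimate are illustrative choices, not the authors' implementation.

```python
import numpy as np

def no_overlap_convnet(x, w, k):
    """One-hidden-layer ConvNet with non-overlapping patches:
    split x into k disjoint patches, apply the shared filter w and a ReLU
    to each patch, then average the hidden units (fixed output weights)."""
    patches = x.reshape(k, -1)               # k disjoint patches of equal size
    hidden = np.maximum(patches @ w, 0.0)    # shared filter + ReLU
    return hidden.mean()                     # averaging output layer

def empirical_loss(w, w_star, k, d, n=10_000, seed=0):
    """Monte-Carlo estimate of the squared loss E_x[(f(x; w) - f(x; w*))^2]
    under standard Gaussian inputs; w_star is a hypothetical teacher filter."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k * d))
    preds = np.apply_along_axis(no_overlap_convnet, 1, X, w, k)
    targets = np.apply_along_axis(no_overlap_convnet, 1, X, w_star, k)
    return np.mean((preds - targets) ** 2)
```

Gradient descent on the filter w for this kind of loss is the procedure whose convergence the paper analyzes; the stated guarantee concerns the population loss under the Gaussian input distribution, whereas the snippet above only estimates it from samples.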

Previous: Failures of Gradient-Based Deep Learning

Next: Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
