Abstract
We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let opt < 1 be the population loss of the best-fitting ReLU (a formalization appears after the list below). We prove:
• Finding a ReLU with square-loss opt + ε is as hard as the problem of learning sparse parities with noise, widely thought to be computationally intractable. This is the first hardness result for learning a ReLU with respect to Gaussian marginals, and our results imply, unconditionally, that gradient descent cannot converge to the global minimum in polynomial time.
• There exists an efficient approximation algorithm for finding the best-fitting ReLU that achieves error O(opt^{2/3}). The algorithm uses a novel reduction to noisy halfspace learning with respect to 0/1 loss. Prior work due to Soltanolkotabi [18] showed that gradient descent can find the best-fitting ReLU with respect to Gaussian marginals if the training set is exactly labeled by a ReLU.
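For concreteness, here is one standard formalization of the quantity opt defined above; the notation is ours, and the paper may use a different normalization.

```latex
% opt is the population square-loss of the best-fitting ReLU, where the
% marginal distribution on x is the spherical Gaussian and labels are arbitrary.
\[
  \mathrm{opt} \;=\; \min_{\mathbf{w} \in \mathbb{R}^n}\;
  \mathbb{E}_{(\mathbf{x},\, y) \sim \mathcal{D}}
  \Bigl[ \bigl( \mathrm{ReLU}(\mathbf{w} \cdot \mathbf{x}) - y \bigr)^{2} \Bigr],
  \qquad \mathrm{ReLU}(z) = \max(z, 0),
\]
where the marginal of $\mathcal{D}$ on $\mathbf{x}$ is $\mathcal{N}(0, I_n)$
and the labels $y$ may be arbitrary.
```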
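The following minimal sketch (our construction, not the paper's algorithm) runs plain gradient descent on the empirical square loss in the realizable setting of [18], where the labels are exactly a ReLU; the dimensions, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 5000                      # dimension and sample count (illustrative)
X = rng.standard_normal((m, n))      # rows x ~ N(0, I_n): spherical Gaussian marginals
w_star = rng.standard_normal(n)
y = np.maximum(X @ w_star, 0.0)      # realizable labels: exactly a ReLU, so opt = 0

w = 0.1 * rng.standard_normal(n)     # small random initialization
lr = 0.05
for _ in range(500):
    z = X @ w
    residual = np.maximum(z, 0.0) - y
    # Gradient of the empirical square loss; ReLU'(z) taken as the indicator z > 0.
    grad = 2.0 * ((residual * (z > 0))[:, None] * X).mean(axis=0)
    w -= lr * grad

# Final empirical square loss; in the realizable case this approaches 0.
print(np.mean((np.maximum(X @ w, 0.0) - y) ** 2))
```

With arbitrary labels, by contrast, the first bullet above implies that no such polynomial-time convergence guarantee for gradient descent is possible.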