SGD Learns the Conjugate Kernel Class of the Network

2020-02-10

Abstract 

We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network, as defined in Daniely et al. [2016]. The result holds for log-depth networks from a rich family of architectures. To the best of our knowledge, it is the first polynomial-time guarantee for the standard neural network learning algorithm for networks of depth more than two. As corollaries, it follows that for neural networks of any depth between 2 and log(n), SGD is guaranteed to learn, in polynomial time, constant degree polynomials with polynomially bounded coefficients. Likewise, it follows that SGD on large enough networks can learn any continuous function (not in polynomial time), complementing classical expressivity results.
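The corollary above says that plain SGD on a standard network can fit constant-degree polynomials with polynomially bounded coefficients. Below is a minimal, hypothetical sketch of that setting in Python/NumPy: vanilla SGD on a one-hidden-layer ReLU network fitting a degree-2 polynomial target. The architecture, width, learning rate, and initialization are illustrative assumptions, not the paper's construction or its proof.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: a degree-2 polynomial target f*(x) = (w . x)^2
# learned by plain SGD on a one-hidden-layer ReLU network with fresh
# samples at each step (not the paper's exact architecture or analysis).
d, width, lr, steps = 10, 512, 1e-3, 20000
w_star = rng.normal(size=d) / np.sqrt(d)

# Network: x -> a . relu(W x)
W = rng.normal(size=(width, d)) / np.sqrt(d)
a = rng.normal(size=width) / np.sqrt(width)

def relu(z):
    return np.maximum(z, 0.0)

for t in range(steps):
    x = rng.normal(size=d)
    y = (w_star @ x) ** 2            # bounded-degree polynomial target
    h = W @ x
    pred = a @ relu(h)
    err = pred - y                   # residual of the squared loss
    # Standard backprop gradients; one SGD step on both layers
    grad_a = err * relu(h)
    grad_W = err * np.outer(a * (h > 0), x)
    a -= lr * grad_a
    W -= lr * grad_W

# Estimate test error on fresh samples
X = rng.normal(size=(1000, d))
Y = (X @ w_star) ** 2
P = relu(X @ W.T) @ a
print("test MSE:", np.mean((P - Y) ** 2))
```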

