
Understanding Generalization and Optimization Performance of Deep CNNs

2020-03-16

Abstract

This work aims to provide an understanding of the remarkable success of deep convolutional neural networks (CNNs) by theoretically analyzing their generalization performance and establishing optimization guarantees for gradient descent based training algorithms. Specifically, for a CNN model consisting of $l$ convolutional layers and one fully connected layer, we prove that its generalization error is bounded by $\mathcal{O}(\sqrt{\theta\tilde{\varrho}/n})$, where $\theta$ denotes the freedom degree of the network parameters and $\tilde{\varrho} = \mathcal{O}\big(\log\big(\prod_{i=1}^{l} b_i (k_i - s_i + 1)/p\big) + \log(b_{l+1})\big)$ encapsulates architecture parameters including the kernel size $k_i$, stride $s_i$, pooling size $p$ and parameter magnitude $b_i$. To our best knowledge, this is the first generalization bound that only depends on $\mathcal{O}\big(\log\big(\prod_{i=1}^{l+1} b_i\big)\big)$, tighter than existing ones that all involve an exponential term like $\mathcal{O}\big(\prod_{i=1}^{l+1} b_i\big)$. Besides, we prove that for an arbitrary gradient descent algorithm, the computed approximate stationary point obtained by minimizing the empirical risk is also an approximate stationary point of the population risk. This well explains why gradient descent training algorithms usually perform sufficiently well in practice. Furthermore, we prove the one-to-one correspondence and convergence guarantees for the non-degenerate stationary points between the empirical and population risks. This implies that the computed local minimum of the empirical risk is also close to a local minimum of the population risk, thus ensuring the good generalization performance of CNNs.
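The stated bound can be made concrete with a small numerical sketch. The function below is an illustration only, not the paper's method: it evaluates $\sqrt{\theta\tilde{\varrho}/n}$ for a hypothetical toy architecture, ignoring the absolute constants hidden inside the $\mathcal{O}(\cdot)$ notation, with $\tilde{\varrho}$ taken as $\log\big(\prod_{i=1}^{l} b_i (k_i - s_i + 1)/p\big) + \log(b_{l+1})$.

```python
import math

def cnn_generalization_bound(theta, n, kernels, strides, magnitudes, pool, b_fc):
    """Illustrative evaluation of sqrt(theta * rho / n), where
    rho = log(prod_i b_i * (k_i - s_i + 1) / pool) + log(b_fc).
    Constants hidden by the O(.) notation are ignored."""
    prod = 1.0
    for k, s, b in zip(kernels, strides, magnitudes):
        prod *= b * (k - s + 1) / pool
    rho = math.log(prod) + math.log(b_fc)
    return math.sqrt(theta * rho / n)

# Hypothetical 2-conv-layer network: 3x3 kernels, stride 1, pooling size 2,
# parameter-magnitude bound b_i = 2 for every layer (including the FC layer).
bound_small_n = cnn_generalization_bound(1e4, 1e4, [3, 3], [1, 1], [2, 2], 2, 2)
bound_large_n = cnn_generalization_bound(1e4, 1e6, [3, 3], [1, 1], [2, 2], 2, 2)
```

Note how the bound decays as $1/\sqrt{n}$: multiplying the sample size by 100 shrinks the bound by a factor of 10, while the architecture parameters enter only logarithmically.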

