
Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks


Abstract

Modern convolutional networks, incorporating rectifiers and max-pooling, are neither smooth nor convex; standard guarantees therefore do not apply. Nevertheless, methods from convex optimization such as gradient descent and Adam are widely used as building blocks for deep learning algorithms. This paper provides the first convergence guarantee applicable to modern convnets, which furthermore matches a lower bound for convex nonsmooth functions. The key technical tool is the neural Taylor approximation – a straightforward application of Taylor expansions to neural networks – and the associated Taylor loss. Experiments on a range of optimizers, layers, and tasks provide evidence that the analysis accurately captures the dynamics of neural optimization. The second half of the paper applies the Taylor approximation to isolate the main difficulty in training rectifier nets – that gradients are shattered – and investigates the hypothesis that, by exploring the space of activation configurations more thoroughly, adaptive optimizers such as RMSProp and Adam are able to converge to better solutions.
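To make the key tool concrete, the sketch below (written in JAX; not code from the paper) linearizes a small two-layer rectifier network in its parameters around the current iterate and evaluates the loss on that linearization, one reading of the "Taylor loss". The toy network, the step size, and names such as `taylor_loss` are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def net(params, x):
    # A small rectifier network: Linear -> ReLU -> Linear.
    (W1, b1), (W2, b2) = params
    h = jnp.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

def loss(params, x, y):
    return jnp.mean((net(params, x) - y) ** 2)

key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
params = [
    (0.5 * jax.random.normal(k1, (4, 8)), jnp.zeros(8)),
    (0.5 * jax.random.normal(k2, (8, 1)), jnp.zeros(1)),
]
x = jax.random.normal(k3, (16, 4))
y = jax.random.normal(k4, (16, 1))

# Candidate parameter step: one plain gradient-descent step of size lr.
lr = 0.1
grads = jax.grad(loss)(params, x, y)
delta = jax.tree_util.tree_map(lambda g: -lr * g, grads)

# Neural Taylor approximation: first-order expansion of the network output in
# its parameters, f_{theta + delta}(x) ~= f_theta(x) + J_theta f(x) . delta.
out, lin_change = jax.jvp(lambda p: net(p, x), (params,), (delta,))
taylor_output = out + lin_change

# Taylor loss: the original (convex) loss evaluated on the linearized output.
taylor_loss = jnp.mean((taylor_output - y) ** 2)

# Compare with the exact loss after actually taking the step.
new_params = jax.tree_util.tree_map(lambda p, d: p + d, params, delta)
print(float(taylor_loss), float(loss(new_params, x, y)))
```

Because the linearized output is affine in the parameter step, the Taylor loss inherits convexity from the outer loss, which is what lets tools from convex nonsmooth optimization be brought to bear on the rectifier network's training dynamics.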

